nep-cmp New Economics Papers
on Computational Economics
Issue of 2026–01–26
33 papers chosen by
Stan Miles, Thompson Rivers University


  1. Resisting Manipulative Bots in Memecoin Copy Trading: A Multi-Agent Approach with Chain-of-Thought Reasoning By Yichen Luo; Yebo Feng; Jiahua Xu; Yang Liu
  2. Reinforcement Learning Based Computationally Efficient Conditional Choice Simulation Estimation of Dynamic Discrete Choice Models By Ahmed Khwaja; Sonal Srivastava
  3. Can Machine Learning Improve the Design of Set-Aside Auctions? By Schmidt, Lorenz; Ritter, Matthias; Mußhoff, Oliver; Odening, Martin
  4. On the use of case estimate and transactional payment data in neural networks for individual loss reserving By Benjamin Avanzi; Matthew Lambrianidis; Greg Taylor; Bernard Wong
  5. Predicting Mexico-to-US Migration with Machine Learning for Counterfactual Analysis By Chawla, Parth; Taylor, J. Edward
  6. AI-Trader: Benchmarking Autonomous Agents in Real-Time Financial Markets By Tianyu Fan; Yuhao Yang; Yangqin Jiang; Yifei Zhang; Yuxuan Chen; Chao Huang
  7. Instruction Finetuning LLaMA-3-8B Model Using LoRA for Financial Named Entity Recognition By Zhiming Lian
  8. Replication Study on “Machine Learning from a ‘Universe’ of Signals: The Role of Feature Engineering” (Li et al., 2025) By Cen, Huang; Wanying, Liao; He, Leng; Sheetal, Abhishek
  9. When Algorithms Rate Performance: Do Large Language Models Replicate Human Evaluation Biases? By Rainer Michael Rilke; Dirk Sliwka
  10. The Agentic Regulator: Risks for AI in Finance and a Proposed Agent-based Framework for Governance By Eren Kurshan; Tucker Balch; David Byrd
  11. Improving Financial Forecasting with a Synergistic LLM-Transformer Architecture: A Hybrid Approach to Stock Price Prediction By Sayed Akif Hussain; Chen Qiu-shi; Syed Amer Hussain; Syed Atif Hussain; Asma Komal; Muhammad Imran Khalid
  12. Empirical Mode Decomposition and Graph Transformation of the MSCI World Index: A Multiscale Topological Analysis for Graph Neural Network Modeling By Agustín M. de los Riscos; Julio E. Sandubete; Diego Carmona-Fernández; León Beleña
  13. LLM Collusion By Shengyu Cao; Ming Hu
  14. Inflation Attitudes of Large Language Models By Nikoleta Anesti; Edward Hill; Andreas Joseph
  15. Explainable Prediction of Economic Time Series Using IMFs and Neural Networks By Pablo Hidalgo; Julio E. Sandubete; Agustín García-García
  16. Spatial-Dynamic Adoption of AI Weeding Robots: Insights from A Choice Experiment and an Agent-Based Model By Essakkat, Kaouter; Wu, Linghui; Atallah, Shady S.; Khanna, Madhu
  17. Transfer Learning (Il)liquidity By Andrea Conti; Giacomo Morelli
  18. Limits To (Machine) Learning By Zhimin Chen; Bryan T. Kelly; Semyon Malamud
  19. How AI Agents Follow the Herd of AI? Network Effects, History, and Machine Optimism By Yu Liu; Wenwen Li; Yifan Dou; Guangnan Ye
  20. Reasoning Models Ace the CFA Exams By Jaisal Patel; Yunzhe Chen; Kaiwen He; Keyi Wang; David Li; Kairong Xiao; Xiao-Yang Liu
  21. Predicting St. Louis Housing Prices with Machine Learning on Market and Assessor Data By Adler, Brian; Brown, Anne
  22. DeepSVM: Learning Stochastic Volatility Models with Physics-Informed Deep Operator Networks By Kieran A. Malandain; Selim Kalici; Hakob Chakhoyan
  23. Stochastic Volatility Modelling with LSTM Networks: A Hybrid Approach for S&P 500 Index Volatility Forecasting By Anna Perekhodko; Robert Ślepaczuk
  24. Automatic debiased machine learning and sensitivity analysis for sample selection models By Jakob Bjelac; Victor Chernozhukov; Phil-Adrian Klotz; Jannis Kueck; Theresa M. A. Schmitz
  25. Uni-FinLLM: A Unified Multimodal Large Language Model with Modular Task Heads for Micro-Level Stock Prediction and Macro-Level Systemic Risk Assessment By Gongao Zhang; Haijiang Zeng; Lu Jiang
  26. Reinforcement Learning for Option Hedging: Static Implied-Volatility Fit versus Shortfall-Aware Performance By Ziheng Chen; Minxuan Hu; Jiayu Yi; Wenxi Sun
  27. DeePM: Regime-Robust Deep Learning for Systematic Macro Portfolio Management By Kieran Wood; Stephen J. Roberts; Stefan Zohren
  28. Ill-Conditioned Orthogonal Scores in Double Machine Learning By Gabriel Saco
  29. XGBoost Forecasting of NEPSE Index Log Returns with Walk Forward Validation By Sahaj Raj Malla; Shreeyash Kayastha; Rumi Suwal; Harish Chandra Bhandari; Rajendra Adhikari
  30. Unveiling Hedge Funds: Topic Modeling and Sentiment Correlation with Fund Performance By Chang Liu
  31. Emerging Trends in Tax Fraud Detection Using Artificial Intelligence-Based Technologies By James Alm; Rida Belahouaoui
  32. Double Machine Learning of Continuous Treatment Effects with General Instrumental Variables By Shuyuan Chen; Peng Zhang; Yifan Cui
  33. MIRAGE Model Documentation Version 2.0 By Antoine Bouët; Lionel Fontagné; Christophe Gouel; Houssein Guimbard; Cristina Mitaritonna

  1. By: Yichen Luo; Yebo Feng; Jiahua Xu; Yang Liu
    Abstract: The launch of the $Trump coin ignited a wave of meme coin investment. Copy trading, a strategy-agnostic approach that eliminates the need for deep trading knowledge, has quickly gained widespread popularity in the meme coin market. However, copy trading is no guarantee of profitability, owing to the prevalence of manipulative bots, the uncertainty of the followed wallets' future performance, and the lag in trade execution. Recently, large language models (LLMs) have shown promise in financial applications by effectively understanding multi-modal data and producing explainable decisions. However, a single LLM struggles with complex, multi-faceted tasks such as asset allocation. These challenges are even more pronounced in cryptocurrency markets, where LLMs often lack sufficient domain-specific knowledge in their training data. To address these challenges, we propose an explainable multi-agent system for meme coin copy trading. Inspired by the structure of an asset management team, our system decomposes the complex task into subtasks and coordinates specialized agents to solve them collaboratively. Employing few-shot chain-of-thought (CoT) prompting, each agent acquires professional meme coin trading knowledge, interprets multi-modal data, and generates explainable decisions. Using a dataset of 1,000 meme coin projects' transaction data, our empirical evaluation shows that the proposed multi-agent system outperforms both traditional machine learning models and single LLMs, achieving 73% and 70% precision in identifying high-quality meme coin projects and key opinion leader (KOL) wallets, respectively. The selected KOLs collectively generated a total profit of $500,000 across these projects.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.08641
  2. By: Ahmed Khwaja; Sonal Srivastava
    Abstract: Dynamic discrete choice (DDC) models have found widespread application in marketing. However, estimating these becomes challenging in "big data" settings with high-dimensional state-action spaces. To address this challenge, this paper develops a Reinforcement Learning (RL)-based two-step ("computationally light") Conditional Choice Simulation (CCS) estimation approach that combines the scalability of machine learning with the transparency, explainability, and interpretability of structural models, which is particularly valuable for counterfactual policy analysis. The method is premised on three insights: (1) the CCS ("forward simulation") approach is a special case of RL algorithms, (2) starting from an initial state-action pair, CCS updates the corresponding value function only after each simulation path has terminated, whereas RL algorithms may update for all the state-action pairs visited along a simulated path, and (3) RL focuses on inferring an agent's optimal policy with known reward functions, whereas DDC models focus on estimating the reward functions presupposing optimal policies. The procedure's computational efficiency over CCS estimation is demonstrated using Monte Carlo simulations with a canonical machine replacement and a consumer food purchase model. Framing CCS estimation of DDC models as an RL problem increases their applicability and scalability to high-dimensional marketing problems while retaining both interpretability and tractability.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.02069
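The paper's second insight, that CCS credits each simulated path only to its initial state-action pair whereas RL-style updates reuse every state visited along the path, can be illustrated with a toy discounted-reward sketch. The function names and the simple additive reward structure below are illustrative assumptions, not the authors' estimator:

```python
def ccs_value(path_rewards, beta):
    """CCS ('forward simulation') style: discount the rewards along one
    simulated path and credit the total only to the path's initial
    state-action pair."""
    return sum(beta ** t * r for t, r in enumerate(path_rewards))

def rl_values(path_rewards, beta):
    """RL (Monte Carlo) style: from the same single path, recover a
    value estimate for *every* state visited along it by accumulating
    discounted returns backwards, reusing the tail of the simulation."""
    values = []
    g = 0.0
    for r in reversed(path_rewards):
        g = r + beta * g
        values.append(g)
    return list(reversed(values))

rewards = [1.0, 1.0, 1.0]
beta = 0.9
print(ccs_value(rewards, beta))   # value estimate for the initial state only
print(rl_values(rewards, beta))   # value estimates for all three visited states
```

The two agree on the initial state, but the RL-style pass extracts two additional value estimates from the same simulated path, which is the source of the computational savings the paper exploits.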
  3. By: Schmidt, Lorenz; Ritter, Matthias; Mußhoff, Oliver; Odening, Martin
    Keywords: Agricultural Finance, Farm Management
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ags:aaea25:360670
  4. By: Benjamin Avanzi; Matthew Lambrianidis; Greg Taylor; Bernard Wong
    Abstract: The use of neural networks trained on individual claims data has become increasingly popular in the actuarial reserving literature. We consider how best to input historical payment data into neural network models. Case estimates are also available in the form of a time series, and we extend our analysis to assessing their predictive power. In this paper, we compare a feed-forward neural network trained on summarised transactions to a recurrent neural network equipped to analyse a claim's entire payment history and/or case estimate development history. We draw conclusions from training and comparing the performance of the models on multiple comparable, highly complex datasets simulated from SPLICE (Avanzi, Taylor and Wang, 2023). We find evidence that case estimates improve predictions significantly, but that equipping the neural network with memory leads only to meagre improvements. Although the case estimation process and quality will vary significantly between insurers, we provide a standardised methodology for assessing their value.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.05274
  5. By: Chawla, Parth; Taylor, J. Edward
    Keywords: Research and Development/Tech Change/Emerging Technologies
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ags:aaea25:361223
  6. By: Tianyu Fan; Yuhao Yang; Yangqin Jiang; Yifei Zhang; Yuxuan Chen; Chao Huang
    Abstract: Large Language Models (LLMs) have demonstrated remarkable potential as autonomous agents, approaching human-expert performance through advanced reasoning and tool orchestration. However, decision-making in fully dynamic and live environments remains highly challenging, requiring real-time information integration and adaptive responses. While existing efforts have explored live evaluation mechanisms in structured tasks, a critical gap remains in systematic benchmarking for real-world applications, particularly in finance where stringent requirements exist for live strategic responsiveness. To address this gap, we introduce AI-Trader, the first fully-automated, live, and data-uncontaminated evaluation benchmark for LLM agents in financial decision-making. AI-Trader spans three major financial markets: U.S. stocks, A-shares, and cryptocurrencies, with multiple trading granularities to simulate live financial environments. Our benchmark implements a revolutionary fully autonomous minimal information paradigm where agents receive only essential context and must independently search, verify, and synthesize live market information without human intervention. We evaluate six mainstream LLMs across three markets and multiple trading frequencies. Our analysis reveals striking findings: general intelligence does not automatically translate to effective trading capability, with most agents exhibiting poor returns and weak risk management. We demonstrate that risk control capability determines cross-market robustness, and that AI trading strategies achieve excess returns more readily in highly liquid markets than policy-driven environments. These findings expose critical limitations in current autonomous agents and provide clear directions for future improvements. The code and evaluation data are open-sourced to foster community research: https://github.com/HKUDS/AI-Trader.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.10971
  7. By: Zhiming Lian
    Abstract: Financial named-entity recognition (NER) is one of the key approaches for translating unformatted reports and news into structured knowledge graphs. However, freely available, easy-to-use large language models (LLMs) often misclassify organisations as people, or disregard an actual monetary amount entirely. This paper takes Meta's Llama 3 8B and applies it to financial NER by combining instruction fine-tuning with Low-Rank Adaptation (LoRA). Each annotated sentence is converted into an instruction-input-output triple, enabling the model to learn task descriptions while fine-tuning small low-rank matrices instead of updating all weights. Using a corpus of 1,693 sentences, our method obtains a micro-F1 score of 0.894, evaluated against Qwen3-8B, Baichuan2-7B, T5, and BERT-Base baselines. We present dataset statistics, describe training hyperparameters, and visualize entity density, learning curves, and evaluation metrics. Our results show that instruction tuning combined with parameter-efficient fine-tuning enables state-of-the-art performance on domain-sensitive NER.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.10043
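The triple-construction step described in the abstract can be sketched as follows. The prompt wording, label set, and output format here are illustrative assumptions, not the paper's exact specification:

```python
def to_instruction_triple(sentence, entities):
    """Convert one annotated sentence into an instruction-input-output
    record of the kind used for instruction fine-tuning.

    `entities` is a list of (span_text, label) pairs; the instruction
    text and the 'span -> label' output format are hypothetical.
    """
    instruction = ("Extract all financial named entities (ORG, PER, MONEY) "
                   "from the sentence and label each one.")
    output = "; ".join(f"{text} -> {label}" for text, label in entities)
    return {"instruction": instruction, "input": sentence, "output": output}

record = to_instruction_triple(
    "Apple paid $3 billion to Beats founder Jimmy Iovine.",
    [("Apple", "ORG"), ("$3 billion", "MONEY"), ("Jimmy Iovine", "PER")],
)
print(record["output"])  # Apple -> ORG; $3 billion -> MONEY; Jimmy Iovine -> PER
```

Records in this shape can then be fed to a parameter-efficient trainer (e.g. a LoRA setup) so that only small low-rank adapter matrices are updated rather than all model weights.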
  8. By: Cen, Huang; Wanying, Liao; He, Leng; Sheetal, Abhishek (The Hong Kong Polytechnic University)
    Abstract: This paper replicates and extends the study of Li et al. (2025) to investigate the role of feature engineering in machine learning (ML)-based cross-sectional stock return prediction. We construct a 3-tier feature system with 78 effective features, including basic financial ratios, financial change features, and growth quality features, using CRSP and Compustat data. Through a recursive rolling window approach from 1969 to 2018, we compare the performance of boosted regression trees (BRT), neural networks (NN), and the newly added extreme gradient boosting (XGBoost) models. The results show that XGBoost produces the highest predictive accuracy, since it efficiently captures statistical correlations among features, while it underperforms in terms of investment return due to its sensitivity to limited feature quality and the gap between statistical fitting and economic profitability. In contrast, the BRT model generates the most robust strategy performance, since it is more tolerant of noisy features in an incomplete information environment. Compared with Li et al. (2025), our strategy exhibits a lower Sharpe ratio and an insignificant risk-adjusted alpha, mainly due to the smaller number of features and the different sample period. This paper confirms the core conclusion of the original paper that feature engineering, rather than model complexity, is crucial for ML investment strategies, and offers empirical knowledge regarding real-time portfolio construction.
    Date: 2026–01–05
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:3fh8x_v2
  9. By: Rainer Michael Rilke (WHU - Otto Beisheim School of Management); Dirk Sliwka (University of Cologne)
    Abstract: A large body of research across management, psychology, accounting, and economics shows that subjective performance evaluations are systematically biased: ratings cluster near the midpoint of scales and are often excessively lenient. As organizations increasingly adopt large language models (LLMs) for evaluative tasks, little is known about how these systems perform when assessing human performance. We document that, in the absence of clear objective standards and when individuals are rated independently, LLMs reproduce the familiar patterns of human raters. However, LLMs generate greater dispersion and accuracy when evaluating multiple individuals simultaneously. With noisy but objective performance signals, LLMs provide substantially more accurate evaluations than human raters, as they (i) are less subject to biases arising from concern for the evaluated employee and (ii) make fewer mistakes in information processing, closely approximating rational Bayesian benchmarks.
    Keywords: Performance Evaluation, Large Language Models, Signal Objectivity, Algorithmic Judgment, Gen-AI
    JEL: J24 J28 M12 M53
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:ajk:ajkdps:384
  10. By: Eren Kurshan; Tucker Balch; David Byrd
    Abstract: Generative and agentic artificial intelligence is entering financial markets faster than existing governance can adapt. Current model-risk frameworks assume static, well-specified algorithms and one-time validations; large language models and multi-agent trading systems violate those assumptions by learning continuously, exchanging latent signals, and exhibiting emergent behavior. Drawing on complex adaptive systems theory, we model these technologies as decentralized ensembles whose risks propagate along multiple time-scales. We then propose a modular governance architecture. The framework decomposes oversight into four layers of "regulatory blocks": (i) self-regulation modules embedded beside each model, (ii) firm-level governance blocks that aggregate local telemetry and enforce policy, (iii) regulator-hosted agents that monitor sector-wide indicators for collusive or destabilizing patterns, and (iv) independent audit blocks that supply third-party assurance. Eight design strategies enable the blocks to evolve as fast as the models they police. A case study on emergent spoofing in multi-agent trading shows how the layered controls quarantine harmful behavior in real time while preserving innovation. The architecture remains compatible with today's model-risk rules yet closes critical observability and control gaps, providing a practical path toward resilient, adaptive AI governance in financial systems.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.11933
  11. By: Sayed Akif Hussain; Chen Qiu-shi; Syed Amer Hussain; Syed Atif Hussain; Asma Komal; Muhammad Imran Khalid
    Abstract: This study proposes a novel hybrid deep learning framework that integrates a Large Language Model (LLM) with a Transformer architecture for stock price forecasting. The research addresses a critical theoretical gap in existing approaches that empirically combine textual and numerical data without a formal understanding of their interaction mechanisms. We conceptualise a prompt-based LLM as a mathematically defined signal generator, capable of extracting directional market sentiment and an associated confidence score from financial news. These signals are then dynamically fused with structured historical price features through a noise-robust gating mechanism, enabling the Transformer to adaptively weigh semantic and quantitative information. Empirical evaluations demonstrate that the proposed Hybrid LLM-Transformer model significantly outperforms a Vanilla Transformer baseline, reducing the Root Mean Squared Error (RMSE) by 5.28% (p = 0.003). Moreover, ablation and robustness analyses confirm the model's stability under noisy conditions and its capacity to maintain interpretability through confidence-weighted attention. The findings provide both theoretical and empirical support for a paradigm shift from empirical observation to formalised modelling of LLM-Transformer interactions, paving the way toward explainable, noise-resilient, and semantically enriched financial forecasting systems.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.02878
  12. By: Agustín M. de los Riscos; Julio E. Sandubete; Diego Carmona-Fernández; León Beleña
    Abstract: This study applies Empirical Mode Decomposition (EMD) to the MSCI World index and converts the resulting intrinsic mode functions (IMFs) into graph representations to enable modeling with graph neural networks (GNNs). Using CEEMDAN, we extract nine IMFs spanning high-frequency fluctuations to long-term trends. Each IMF is transformed into a graph using four time-series-to-graph methods: natural visibility, horizontal visibility, recurrence, and transition graphs. Topological analysis shows clear scale-dependent structure: high-frequency IMFs yield dense, highly connected small-world graphs, whereas low-frequency IMFs produce sparser networks with longer characteristic path lengths. Visibility-based methods are more sensitive to amplitude variability and typically generate higher clustering, while recurrence graphs better preserve temporal dependencies. These results provide guidance for designing GNN architectures tailored to the structural properties of decomposed components, supporting more effective predictive modeling of financial time series.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.12526
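Of the four time-series-to-graph methods mentioned, the horizontal visibility graph is the simplest to state: two time points are linked if every intermediate value lies strictly below both endpoints. A minimal sketch (the function name and the O(n^2) scan are illustrative, not the authors' implementation):

```python
def horizontal_visibility_graph(series):
    """Build a horizontal visibility graph from a time series.

    Nodes are time indices; i and j (i < j) are linked iff every value
    strictly between them is below min(series[i], series[j]).
    Returns the edge set as a list of (i, j) pairs.
    """
    n = len(series)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if all(series[k] < min(series[i], series[j]) for k in range(i + 1, j)):
                edges.append((i, j))
    return edges

# Toy IMF-like segment: adjacent points always see each other, and the
# two peaks (indices 1 and 3) see over the valley between them.
print(horizontal_visibility_graph([1.0, 3.0, 2.0, 4.0]))  # [(0, 1), (1, 2), (1, 3), (2, 3)]
```

High-frequency IMFs, with many alternating peaks, produce many such "over the valley" links, which is one intuition for the denser, small-world graphs the abstract reports for those components.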
  13. By: Shengyu Cao; Ming Hu
    Abstract: We study how delegating pricing to large language models (LLMs) can facilitate collusion in a duopoly when both sellers rely on the same pre-trained model. The LLM is characterized by (i) a propensity parameter capturing its internal bias toward high-price recommendations and (ii) an output-fidelity parameter measuring how tightly outputs track that bias; the propensity evolves through retraining. We show that configuring LLMs for robustness and reproducibility can induce collusion via a phase transition: there exists a critical output-fidelity threshold that pins down long-run behavior. Below it, competitive pricing is the unique long-run outcome. Above it, the system is bistable, with competitive and collusive pricing both locally stable and the realized outcome determined by the model's initial preference. The collusive regime resembles tacit collusion: prices are elevated on average, yet occasional low-price recommendations provide plausible deniability. With perfect fidelity, full collusion emerges from any interior initial condition. For finite training batches of size $b$, infrequent retraining (driven by computational costs) further amplifies collusion: conditional on starting in the collusive basin, the probability of collusion approaches one as $b$ grows, since larger batches dampen stochastic fluctuations that might otherwise tip the system toward competition. The indeterminacy region shrinks at rate $O(1/\sqrt{b})$.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.01279
  14. By: Nikoleta Anesti; Edward Hill; Andreas Joseph
    Abstract: This paper investigates the ability of Large Language Models (LLMs), specifically GPT-3.5-turbo (GPT), to form inflation perceptions and expectations based on macroeconomic price signals. We compare the LLM's output to household survey data and official statistics, mimicking the information set and demographic characteristics of the Bank of England's Inflation Attitudes Survey (IAS). Our quasi-experimental design exploits the timing of GPT's training cut-off in September 2021 which means it has no knowledge of the subsequent UK inflation surge. We find that GPT tracks aggregate survey projections and official statistics at short horizons. At a disaggregated level, GPT replicates key empirical regularities of households' inflation perceptions, particularly for income, housing tenure, and social class. A novel Shapley value decomposition of LLM outputs suited for the synthetic survey setting provides well-defined insights into the drivers of model outputs linked to prompt content. We find that GPT demonstrates a heightened sensitivity to food inflation information similar to that of human respondents. However, we also find that it lacks a consistent model of consumer price inflation. More generally, our approach could be used to evaluate the behaviour of LLMs for use in the social sciences, to compare different models, or to assist in survey design.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.14306
  15. By: Pablo Hidalgo; Julio E. Sandubete; Agustín García-García
    Abstract: This study investigates the contribution of Intrinsic Mode Functions (IMFs) derived from economic time series to the predictive performance of neural network models, specifically Multilayer Perceptrons (MLP) and Long Short-Term Memory (LSTM) networks. To enhance interpretability, DeepSHAP is applied, which estimates the marginal contribution of each IMF while keeping the rest of the series intact. Results show that the last IMFs, representing long-term trends, are generally the most influential according to DeepSHAP, whereas high-frequency IMFs contribute less and may even introduce noise, as evidenced by improved metrics upon their removal. Differences between MLP and LSTM highlight the effect of model architecture on feature relevance distribution, with LSTM allocating importance more evenly across IMFs.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.12499
  16. By: Essakkat, Kaouter; Wu, Linghui; Atallah, Shady S.; Khanna, Madhu
    Keywords: Productivity Analysis, Research and Development/Tech Change/Emerging Technologies
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ags:aaea25:361092
  17. By: Andrea Conti; Giacomo Morelli
    Abstract: The estimation of the Risk Neutral Density (RND) implicit in option prices is challenging, especially in illiquid markets. We introduce the Deep Log-Sum-Exp Neural Network, an architecture that leverages Deep and Transfer learning to address RND estimation in the presence of irregular and illiquid strikes. We prove key statistical properties of the model and the consistency of the estimator. We illustrate the benefits of transfer learning to improve the estimation of the RND in severe illiquidity conditions through Monte Carlo simulations, and we test it empirically on SPX data, comparing it with popular estimation methods. Overall, our framework shows recovery of the RND in conditions of extreme illiquidity with as few as three option quotes.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.11731
  18. By: Zhimin Chen (Nanyang Business School, Nanyang Technological University); Bryan T. Kelly (Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)); Semyon Malamud (Ecole Polytechnique Federale de Lausanne; Centre for Economic Policy Research (CEPR); Swiss Finance Institute)
    Abstract: Machine learning (ML) methods are highly flexible, but their ability to approximate the true data-generating process is fundamentally constrained by finite samples. We characterize a universal lower bound, the Limits-to-Learning Gap (LLG), quantifying the unavoidable discrepancy between a model's empirical fit and the population benchmark. Recovering the true population R^2, therefore, requires correcting observed predictive performance by this bound. Using a broad set of variables, including excess returns, yields, credit spreads, and valuation ratios, we find that the implied LLGs are large. This indicates that standard ML approaches can substantially understate true predictability in financial data. We also derive LLG-based refinements to the classic Hansen and Jagannathan (1991) bounds, analyze implications for parameter learning in general-equilibrium settings, and show that the LLG provides a natural mechanism for generating excess volatility.
    Keywords: machine learning, asset pricing, predictability, big data, limits to learning, excess volatility, stochastic discount factor, kernel methods
    JEL: C13 C32 C55 C58 G12 G17
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:chf:rpseri:rp25106
  19. By: Yu Liu; Wenwen Li; Yifan Dou; Guangnan Ye
    Abstract: Understanding decision-making in multi-AI-agent frameworks is crucial for analyzing strategic interactions in network-effect-driven contexts. This study investigates how AI agents navigate network-effect games, where individual payoffs depend on peer participation, a context underexplored in multi-agent systems despite its real-world prevalence. We introduce a novel workflow design using large language model (LLM)-based agents in repeated decision-making scenarios, systematically manipulating price trajectories (fixed, ascending, descending, random) and network-effect strength. Our key findings include: First, without historical data, agents fail to infer equilibrium. Second, ordered historical sequences (e.g., escalating prices) enable partial convergence under weak network effects, but strong effects trigger persistent "AI optimism": agents overestimate participation despite contradictory evidence. Third, randomized history disrupts convergence entirely, demonstrating that temporal coherence in data shapes LLMs' reasoning, unlike humans'. These results highlight a paradigm shift: in AI-mediated systems, equilibrium outcomes depend not just on incentives but on how history is curated, something impossible for humans.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.11943
  20. By: Jaisal Patel; Yunzhe Chen; Kaiwen He; Keyi Wang; David Li; Kairong Xiao; Xiao-Yang Liu
    Abstract: Previous research has reported that large language models (LLMs) demonstrate poor performance on the Chartered Financial Analyst (CFA) exams. However, recent reasoning models have achieved strong results on graduate-level academic and professional examinations across various disciplines. In this paper, we evaluate state-of-the-art reasoning models on a set of mock CFA exams consisting of 980 questions across three Level I exams, two Level II exams, and three Level III exams. Using the same pass/fail criteria from prior studies, we find that most models clear all three levels. The models that pass, ordered by overall performance, are Gemini 3.0 Pro, Gemini 2.5 Pro, GPT-5, Grok 4, Claude Opus 4.1, and DeepSeek-V3.1. Specifically, Gemini 3.0 Pro achieves a record score of 97.6% on Level I. Performance is also strong on Level II, led by GPT-5 at 94.3%. On Level III, Gemini 2.5 Pro attains the highest score with 86.4% on multiple-choice questions while Gemini 3.0 Pro achieves 92.0% on constructed-response questions.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.08270
  21. By: Adler, Brian; Brown, Anne
    Abstract: Housing markets are more complex than a simple supply-demand relationship. Prices are set by complex market and spatial neighborhood dynamics. Cities like St. Louis, MO have experienced dramatic population decline marked by extreme vacancy and abandonment. Amidst its population decline, St. Louis simultaneously exhibits neighborhoods with sharp housing shortages and competition alongside others with entrenched vacancy and disinvestment, mere blocks away from one another. We use supervised machine learning models to predict housing prices with a diverse feature set that incorporates spatial aspects of vacancy alongside traditional housing amenities in St. Louis. Our results show how proximity to vacancy may impact a home's value even more than its number of bedrooms. We expect these findings may prompt policymakers to combat vacancy more urgently to maintain neighborhood market stability.
    Date: 2026–01–06
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:s9v4u_v1
  22. By: Kieran A. Malandain; Selim Kalici; Hakob Chakhoyan
    Abstract: Real-time calibration of stochastic volatility models (SVMs) is computationally bottlenecked by the need to repeatedly solve coupled partial differential equations (PDEs). In this work, we propose DeepSVM, a physics-informed Deep Operator Network (PI-DeepONet) designed to learn the solution operator of the Heston model across its entire parameter space. Unlike standard data-driven deep learning (DL) approaches, DeepSVM requires no labelled training data. Rather, we employ a hard-constrained ansatz that enforces terminal payoffs and static no-arbitrage conditions by design. Furthermore, we use Residual-based Adaptive Refinement (RAR) to stabilize training in difficult regions subject to high gradients. Overall, DeepSVM achieves a final training loss of $10^{-5}$ and predicts highly accurate option prices across a range of typical market dynamics. While pricing accuracy is high, we find that the model's derivatives (Greeks) exhibit noise in the at-the-money (ATM) regime, highlighting the specific need for higher-order regularization in physics-informed operator learning.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07162
  23. By: Anna Perekhodko; Robert \'Slepaczuk
    Abstract: Accurate volatility forecasting is essential in banking, investment, and risk management, because expectations about future market movements directly influence current decisions. This study proposes a hybrid modelling framework that integrates a Stochastic Volatility (SV) model with a Long Short-Term Memory (LSTM) neural network. The SV model improves statistical precision and captures latent volatility dynamics, especially in response to unforeseen events, while the LSTM network enhances the model's ability to detect complex nonlinear patterns in financial time series. The forecasting is conducted using daily data from the S&P 500 index, covering the period from January 1, 1998 to December 31, 2024. A rolling-window approach is employed to train the model and generate one-step-ahead volatility forecasts. The performance of the hybrid SV-LSTM model is evaluated through both statistical testing and investment simulations. The results show that the hybrid approach outperforms both the standalone SV and LSTM models and contributes to the development of volatility modelling techniques, providing a foundation for improving risk assessment and strategic investment planning in the context of the S&P 500.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.12250
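As a point of reference for the rolling-window, one-step-ahead volatility forecasts described above, here is a minimal EWMA (RiskMetrics-style) variance recursion. It is a simple textbook baseline, not the paper's SV-LSTM hybrid; the decay parameter lam=0.94 is a conventional choice, not taken from the paper.

```python
import numpy as np

def ewma_variance(returns, lam=0.94):
    """One-step-ahead conditional variance forecasts via EWMA.

    h[t] is the forecast of var(r[t]) formed with data through t-1:
        h[t] = lam * h[t-1] + (1 - lam) * r[t-1]**2
    """
    h = np.empty_like(returns, dtype=float)
    h[0] = np.var(returns)  # initialize at the unconditional variance
    for t in range(1, len(returns)):
        h[t] = lam * h[t - 1] + (1 - lam) * returns[t - 1] ** 2
    return h
```

Any candidate model (SV, LSTM, or a hybrid) would be evaluated against this sort of recursion over the same rolling windows, comparing forecasts of h[t] against a realized proxy such as squared returns.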
  24. By: Jakob Bjelac; Victor Chernozhukov; Phil-Adrian Klotz; Jannis Kueck; Theresa M. A. Schmitz
    Abstract: In this paper, we extend the Riesz representation framework to causal inference under sample selection, where both treatment assignment and outcome observability are non-random. Formulating the problem in terms of a Riesz representer enables stable estimation and a transparent decomposition of omitted variable bias into three interpretable components: a data-identified scale factor, outcome confounding strength, and selection confounding strength. For estimation, we employ the ForestRiesz estimator, which accounts for selective outcome observability while avoiding the instability associated with direct propensity score inversion. We assess finite-sample performance through a simulation study and show that conventional double machine learning approaches can be highly sensitive to tuning parameters due to their reliance on inverse probability weighting, whereas the ForestRiesz estimator delivers more stable performance by leveraging automatic debiased machine learning. In an empirical application to the gender wage gap in the U.S., we find that our ForestRiesz approach yields larger treatment effect estimates than a standard double machine learning approach, suggesting that ignoring sample selection leads to an underestimation of the gender wage gap. Sensitivity analysis indicates that implausibly strong unobserved confounding would be required to overturn our results. Overall, our approach provides a unified, robust, and computationally attractive framework for causal inference under sample selection.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.08643
  25. By: Gongao Zhang; Haijiang Zeng; Lu Jiang
    Abstract: Financial institutions and regulators require systems that integrate heterogeneous data to assess risks from stock fluctuations to systemic vulnerabilities. Existing approaches often treat these tasks in isolation, failing to capture cross-scale dependencies. We propose Uni-FinLLM, a unified multimodal large language model that uses a shared Transformer backbone and modular task heads to jointly process financial text, numerical time series, fundamentals, and visual data. Through cross-modal attention and multi-task optimization, it learns a coherent representation for micro-, meso-, and macro-level predictions. Evaluated on stock forecasting, credit-risk assessment, and systemic-risk detection, Uni-FinLLM significantly outperforms baselines. It raises stock directional accuracy to 67.4% (from 61.7%), credit-risk accuracy to 84.1% (from 79.6%), and macro early-warning accuracy to 82.3%. Results validate that a unified multimodal LLM can jointly model asset behavior and systemic vulnerabilities, offering a scalable decision-support engine for finance.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.02677
  26. By: Ziheng Chen; Minxuan Hu; Jiayu Yi; Wenxi Sun
    Abstract: We extend the Q-learner in Black-Scholes (QLBS) framework by incorporating risk aversion and trading costs (Adaptive-QLBS), and propose a novel Replication Learning of Option Pricing (RLOP) approach. Both methods are fully compatible with standard reinforcement learning algorithms and operate under market frictions. Using SPY and XOP option data, we evaluate performance along static and dynamic dimensions. Adaptive-QLBS achieves higher static pricing accuracy in implied volatility space, while RLOP delivers superior dynamic hedging performance by reducing shortfall probability. These results highlight the importance of evaluating option pricing models beyond static fit, emphasizing realized hedging outcomes.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.01709
  27. By: Kieran Wood; Stephen J. Roberts; Stefan Zohren
    Abstract: We propose DeePM (Deep Portfolio Manager), a structured deep-learning macro portfolio manager trained end-to-end to maximize a robust, risk-adjusted utility. DeePM addresses three fundamental challenges in financial learning: (1) it resolves the asynchronous "ragged filtration" problem via a Directed Delay (Causal Sieve) mechanism that prioritizes causal impulse-response learning over information freshness; (2) it combats low signal-to-noise ratios via a Macroeconomic Graph Prior, regularizing cross-asset dependence according to economic first principles; and (3) it optimizes a distributionally robust objective where a smooth worst-window penalty serves as a differentiable proxy for Entropic Value-at-Risk (EVaR) - a window-robust utility encouraging strong performance in the most adverse historical subperiods. In large-scale backtests from 2010-2025 on 50 diversified futures with highly realistic transaction costs, DeePM attains net risk-adjusted returns that are roughly twice those of classical trend-following strategies and passive benchmarks, solely using daily closing prices. Furthermore, DeePM improves upon the state-of-the-art Momentum Transformer architecture by roughly fifty percent. The model demonstrates structural resilience across the 2010s "CTA (Commodity Trading Advisor) Winter" and the post-2020 volatility regime shift, maintaining consistent performance through the pandemic, inflation shocks, and the subsequent higher-for-longer environment. Ablation studies confirm that strictly lagged cross-sectional attention, graph prior, principled treatment of transaction costs, and robust minimax optimization are the primary drivers of this generalization capability.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.05975
  28. By: Gabriel Saco
    Abstract: Double Machine Learning is often justified by nuisance-rate conditions, yet finite-sample reliability also depends on the conditioning of the orthogonal-score Jacobian. This conditioning is typically assumed rather than tracked. When residualized treatment variance is small, the Jacobian is ill-conditioned and small systematic nuisance errors can be amplified, so nominal confidence intervals may look precise yet systematically under-cover. Our main result is an exact identity for the cross-fitted partially linear regression (PLR) DML estimator, with no Taylor approximation. From this identity, we derive a stochastic-order bound that separates oracle noise from a conditioning-amplified nuisance remainder and yields a sufficient condition for root-n inference. We further connect the amplification factor to semiparametric efficiency geometry via the Riesz representer and use a triangular-array framework to characterize regimes as residual treatment variation weakens. These results motivate an out-of-fold diagnostic that summarizes the implied amplification scale. We do not propose universal thresholds. Instead, we recommend reporting the diagnostic alongside cross-learner sensitivity summaries as a fragility assessment, illustrated in simulation and an empirical example.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07083
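The amplification mechanism summarized above is visible directly in the standard PLR-DML score: the point estimate divides by the mean squared treatment residual, so weak residual treatment variation inflates any systematic nuisance error. A minimal numerical sketch of this generic estimator (not the paper's exact diagnostic), assuming out-of-fold nuisance predictions g_hat for E[Y|X] and m_hat for E[D|X] are already available:

```python
import numpy as np

def plr_dml(y, d, g_hat, m_hat):
    """PLR-DML point estimate with a simple conditioning diagnostic.

    The orthogonal-score Jacobian in theta is J = mean(v**2), where
    v = d - m_hat. A small J (weak residual treatment variation) amplifies
    nuisance error in theta_hat by a factor of roughly 1/J.
    """
    u = y - g_hat               # outcome residuals
    v = d - m_hat               # treatment residuals
    J = np.mean(v ** 2)         # Jacobian / conditioning diagnostic
    theta = np.mean(v * u) / J
    # plug-in standard error from the influence function
    se = np.sqrt(np.mean((v * (u - theta * v)) ** 2) / len(y)) / J
    return theta, se, J
```

Reporting J alongside the estimate is one way to operationalize an amplification-scale diagnostic; here the nuisances are treated as given rather than cross-fitted, which the paper's exact identity handles more carefully.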
  29. By: Sahaj Raj Malla; Shreeyash Kayastha; Rumi Suwal; Harish Chandra Bhandari; Rajendra Adhikari
    Abstract: This study develops a robust machine learning framework for one-step-ahead forecasting of daily log-returns in the Nepal Stock Exchange (NEPSE) Index using the XGBoost regressor. A comprehensive feature set is engineered, including lagged log-returns (up to 30 days) and established technical indicators such as short- and medium-term rolling volatility measures and the 14-period Relative Strength Index. Hyperparameter optimization is performed using Optuna with time-series cross-validation on the initial training segment. Out-of-sample performance is rigorously assessed via walk-forward validation under both expanding and fixed-length rolling window schemes across multiple lag configurations, simulating real-world deployment and avoiding lookahead bias. Predictive accuracy is evaluated using root mean squared error, mean absolute error, coefficient of determination (R-squared), and directional accuracy on both log-returns and reconstructed closing prices. Empirical results show that the optimal configuration, an expanding window with 20 lags, outperforms tuned ARIMA and Ridge regression benchmarks, achieving the lowest log-return RMSE (0.013450) and MAE (0.009814) alongside a directional accuracy of 65.15%. While the R-squared remains modest, consistent with the noisy nature of financial returns, primary emphasis is placed on relative error reduction and directional prediction. Feature importance analysis and visual inspection further enhance interpretability. These findings demonstrate the effectiveness of gradient boosting ensembles in modeling nonlinear dynamics in volatile emerging market time series and establish a reproducible benchmark for NEPSE Index forecasting.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.08896
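The expanding-window walk-forward scheme described above can be sketched generically. The following uses ordinary least squares on lagged returns purely as a stand-in for the tuned XGBoost regressor, on synthetic data; it illustrates the no-look-ahead evaluation loop, not the paper's full feature set or tuning.

```python
import numpy as np

def make_lagged(r, n_lags):
    """Design matrix of lagged returns: row t holds r[t..t+n_lags-1], target r[t+n_lags]."""
    X = np.column_stack([r[i:len(r) - n_lags + i] for i in range(n_lags)])
    return X, r[n_lags:]

def walk_forward(r, n_lags=5, min_train=100):
    """Expanding-window walk-forward: refit on all data up to t, then predict t."""
    X, y = make_lagged(r, n_lags)
    preds = np.empty(len(y) - min_train)
    for t in range(min_train, len(y)):
        A = np.column_stack([np.ones(t), X[:t]])        # fit only on the past
        beta, *_ = np.linalg.lstsq(A, y[:t], rcond=None)
        preds[t - min_train] = np.r_[1.0, X[t]] @ beta  # one-step-ahead forecast
    actual = y[min_train:]
    rmse = np.sqrt(np.mean((preds - actual) ** 2))
    dir_acc = np.mean(np.sign(preds) == np.sign(actual))
    return rmse, dir_acc
```

Because each forecast uses only data strictly before its target date, the resulting RMSE and directional accuracy are free of look-ahead bias, which is the property the paper's evaluation protocol is designed to guarantee.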
  30. By: Chang Liu
    Abstract: The hedge fund industry presents significant challenges for investors due to its opacity and limited disclosure requirements. This pioneering study introduces two major innovations in financial text analysis. First, we apply topic modeling to hedge fund documents, an unexplored domain for automated text analysis, using a unique dataset of over 35,000 documents from 1,125 hedge fund managers. We compared three state-of-the-art methods: Latent Dirichlet Allocation (LDA), Top2Vec, and BERTopic. Our findings reveal that LDA with 20 topics produces the most interpretable results for human users and demonstrates higher robustness in topic assignments when the number of topics varies, while Top2Vec shows superior classification performance. Second, we establish a novel quantitative framework linking document sentiment to fund performance, transforming qualitative information traditionally requiring expert interpretation into systematic investment signals. In sentiment analysis, contrary to expectations, the general-purpose DistilBERT outperforms the finance-specific FinBERT in generating sentiment scores, demonstrating superior adaptability to the diverse linguistic patterns found in hedge fund documents, which extend beyond specialized financial news text. Furthermore, sentiment scores derived using DistilBERT in combination with Top2Vec show stronger correlations with subsequent fund performance than other model combinations. These results demonstrate that automated topic modeling and sentiment analysis can effectively process hedge fund documents, providing investors with new data-driven decision support tools.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.06620
  31. By: James Alm (Tulane University); Rida Belahouaoui (Cadi Ayyad University)
    Abstract: This study examines the role of artificial intelligence (AI) tools in enhancing tax fraud detection within the ambit of the OECD Tax Administration 3.0, focusing on how these technologies streamline the detection process through a new "Adaptive AI Tax Oversight" (AATO) framework. Through a textometric systematic review covering the period from 2014 to 2024, we examine the integration of AI in tax fraud detection. The methodology emphasizes the evaluation of AI's predictive, analytical, and procedural benefits in identifying and combating tax fraud. The research underscores AI's significant impact on increasing detection accuracy, predictive capabilities, and operational efficiency in tax administrations. Key findings reveal the ways in which the development and application of the AATO framework improve the tax fraud detection process, and the implications offer a roadmap for global tax authorities to utilize AI in bolstering detection efforts, potentially lowering compliance expenses and improving regulatory frameworks.
    Keywords: Artificial intelligence, tax fraud, AATO framework, blockchain, neural networks, data mining
    JEL: C45 H26
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:tul:wpaper:2511
  32. By: Shuyuan Chen; Peng Zhang; Yifan Cui
    Abstract: Estimating causal effects of continuous treatments is a common problem in practice, for example, in studying dose-response functions. Classical analyses typically assume that all confounders are fully observed, whereas in real-world applications, unmeasured confounding often persists. In this article, we propose a novel framework for local identification of dose-response functions using instrumental variables, thereby mitigating bias induced by unobserved confounders. We introduce the concept of a uniform regular weighting function and consider covering the treatment space with a finite collection of open sets. On each of these sets, such a weighting function exists, allowing us to identify the dose-response function locally within the corresponding region. For estimation, we develop an augmented inverse probability weighting score for continuous treatments under a debiased machine learning framework with instrumental variables. We further establish the asymptotic properties when the dose-response function is estimated via kernel regression or empirical risk minimization. Finally, we conduct both simulation and empirical studies to assess the finite-sample performance of the proposed methods.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.01471
  33. By: Antoine Bouët; Lionel Fontagné; Christophe Gouel; Houssein Guimbard; Cristina Mitaritonna
    Abstract: MIRAGE is a multi-region, multi-sector computable general equilibrium (CGE) model, initially devoted to trade policy analysis and more recently applied to long-term growth and environmental issues. It incorporates energy, carbon pricing, imperfect competition, and rigid investment allocation, in a sequential dynamic setup where installed capital is assumed to be immobile. The model provides trade analysis with detailed treatment of trade costs and Armington specifications, drawing upon a detailed measure of trade barriers through the MAcMap-HS6 database. Production features nested CES functions with capital-energy bundles under both perfect and imperfect competition frameworks, while final demand follows a LES-CES utility function. The sequential dynamic framework enables long-term simulations by combining total factor productivity calibration with macroeconomic projections from the MaGE model. The most recent version offers significant improvements in electricity sector modeling with renewable energy representation, base-load and peak-load distinctions, and detailed greenhouse gas (GHG) emissions accounting with carbon market mechanisms. This documentation provides complete technical specifications, calibration procedures, and implementation guidelines for researchers and policymakers using MIRAGE for economic policy analysis.
    Keywords: Computable General Equilibrium; Trade Policy; Environmental Policy
    JEL: C68 F1 Q54 Q56 Q40
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:cii:cepidt:2026-01

This nep-cmp issue is ©2026 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.