nep-cmp New Economics Papers
on Computational Economics
Issue of 2026–01–19
seventeen papers chosen by
Stan Miles, Thompson Rivers University


  1. Revisiting exchange rate predictability: Does machine learning help? By Uluc Aysun; Melanie Guldi
  2. Structured Event Representation and Stock Return Predictability By Gang Li; Dandan Qiao; Mingxuan Zheng
  3. Learning the Macroeconomic Language By Siddhartha Chib; Fei Tan
  4. Structural Reinforcement Learning for Heterogeneous Agent Macroeconomics By Yucheng Yang; Chiyuan Wang; Andreas Schaab; Benjamin Moll
  5. Computing XVA for American basket derivatives by machine learning techniques By Ludovic Goudenège; Andrea Molent; Antonino Zanette
  6. Redefining Regions in Space and Time: A Deep Learning Method for Spatio-Temporal Clustering By Quintana Pablo; Herrera-Gomez Marcos
  7. Inferring Latent Market Forces: Evaluating LLM Detection of Gamma Exposure Patterns via Obfuscation Testing By Christopher Regan; Ying Xie
  8. A Novel Deep Learning Framework for Economic Video Analysis and Tactical Insight Extraction By Zare, Hassan; Mousavi, Ebrahim
  9. Multimodal LLMs for Historical Dataset Construction from Archival Image Scans: German Patents (1877-1918) By Niclas Griesshaber; Jochen Streb
  10. Deep Hedging with Reinforcement Learning: A Practical Framework for Option Risk Management By Travon Lucius; Christian Koch Jr; Jacob Starling; Julia Zhu; Miguel Urena; Carrie Hu
  11. Scaling Laws for Economic Productivity: Experimental Evidence in LLM-Assisted Consulting, Data Analyst, and Management Tasks By Ali Merali
  12. Generative AI for Analysts By Jian Xue; Qian Zhang; Wu Zhu
  13. Inefficient forecast narratives: A BERT-based approach By Foltas, Alexander
  14. Explainable Artificial Intelligence for Economic Time Series: A Comprehensive Review and a Systematic Taxonomy of Methods and Concepts By Agustín García-García; Pablo Hidalgo; Julio E. Sandubete
  15. Will AI Trade? A Computational Inversion of the No-Trade Theorem By Hanyu Li; Xiaotie Deng
  16. Branch-Price-and-Cut for the Vehicle Routing Problem With Simultaneous Delivery and Pickup, Time Windows, and Load-Dependent Cost By Carolin Hasse; Stefan Irnich
  17. LLM Personas as a Substitute for Field Experiments in Method Benchmarking By Enoch Hyunwook Kang

  1. By: Uluc Aysun (University of Central Florida, Orlando, FL); Melanie Guldi (University of Central Florida, Orlando, FL)
    Abstract: We revisit the exchange-rate predictability puzzle by asking whether standard, widely used machine-learning (ML) algorithms convincingly improve exchange rate forecasting once evaluation is disciplined and implementation is made robust. Using monthly data from January 1986 to February 2025, we study the US dollar/British pound exchange rate as the baseline case (in both levels and monthly percent changes). We compare five ML methods -- random forests, neural networks, LASSO, gradient boosting, and linear support-vector classification -- against canonical benchmarks (random walk and ARIMA) in a rolling one-step-ahead out-of-sample forecasting design. To mitigate sensitivity to stochastic estimation, we average forecasts across multiple random seeds and assess performance using RMSE and Diebold-Mariano tests. We find that ML does not improve level forecasts and typically underperforms ARIMA. For exchange-rate changes, ML methods consistently outperform the random-walk benchmark, but only neural networks -- under a specific design -- reliably beat ARIMA. A theory-based UIP/PPP filtering approach improves accuracy for both ML and univariate methods, yet does not change the overall ranking. Extensive robustness checks across windows, currencies, frequencies, and tuning choices confirm that ML’s advantages are limited and fragile relative to conventional univariate benchmarks.
    Keywords: Machine learning, exchange rates, forecasting, theoretical filtering, random walk, ARIMA.
    JEL: C53 F31 F37 G17
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:cfl:wpaper:2026-01ua
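A minimal sketch of the rolling one-step-ahead evaluation described above: a random forest forecast, averaged over several random seeds, is scored by RMSE against a driftless random-walk benchmark. The synthetic series, lag set, window length, and hyperparameters are illustrative assumptions rather than the authors' specification, and the Diebold-Mariano test is left out.

```python
# Rolling one-step-ahead out-of-sample comparison: a random forest (averaged over
# several random seeds) versus a driftless random-walk benchmark, scored by RMSE.
# Synthetic data and ad-hoc hyperparameters; not the paper's specification.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
y = pd.Series(rng.normal(0.0, 1.0, 470))        # stand-in for monthly percent changes
n_lags, window, seeds = 12, 240, (0, 1, 2)

X = pd.concat({f"lag{k}": y.shift(k) for k in range(1, n_lags + 1)}, axis=1).dropna()
y_t = y.loc[X.index]

ml_fc, rw_fc, actual = [], [], []
for t in range(window, len(X)):
    X_tr, y_tr = X.iloc[t - window:t], y_t.iloc[t - window:t]
    x_next = X.iloc[[t]]
    # Average the ML forecast across seeds to damp stochastic estimation noise.
    preds = [RandomForestRegressor(n_estimators=100, random_state=s)
             .fit(X_tr, y_tr).predict(x_next)[0] for s in seeds]
    ml_fc.append(float(np.mean(preds)))
    rw_fc.append(0.0)                            # random walk: no predictable change
    actual.append(y_t.iloc[t])

def rmse(fc):
    return float(np.sqrt(np.mean((np.array(actual) - np.array(fc)) ** 2)))

print(f"ML RMSE: {rmse(ml_fc):.4f}   RW RMSE: {rmse(rw_fc):.4f}")
```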
  2. By: Gang Li; Dandan Qiao; Mingxuan Zheng
    Abstract: We find that event features extracted by large language models (LLMs) are effective for text-based stock return prediction. Using a pre-trained LLM to extract event features from news articles, we propose a novel deep learning model based on structured event representation (SER) and attention mechanisms to predict stock returns in the cross-section. Our SER-based model delivers superior out-of-sample performance in forecasting stock returns compared with other existing text-driven models and offers highly interpretable feature structures for examining the mechanisms underlying stock return predictability. We further provide various implications based on SER and highlight the crucial benefit of structured model inputs in stock return predictability.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.19484
  3. By: Siddhartha Chib; Fei Tan
    Abstract: We show how state-of-the-art large language models (LLMs), seemingly inapplicable to the small samples typical of macroeconomics, can be trained to learn the language of the macroeconomy. We estimate a large-scale dynamic stochastic general equilibrium (DSGE) model on an initial segment of the data and obtain a posterior distribution over structural parameters. We sample from this posterior to generate millions of theory-consistent synthetic panels that, when mixed with actual macroeconomic data, form the training corpus for a time-series transformer with attention. The trained model is then used to forecast out-of-sample through 2025. The results show that this hybrid forecaster, which combines the theoretical coherence of DSGE models with the representational power of modern LLMs, successfully learns the macroeconomic language.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.21031
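A toy sketch of the data-augmentation step described above: draw structural parameters from a posterior, simulate theory-consistent panels, and pool them with the observed sample to form a training corpus. The Gaussian "posterior" and AR(1) dynamics below are placeholders for the estimated DSGE model, and the transformer itself is not shown.

```python
# Toy version of the synthetic-panel step: sample parameters from a posterior,
# simulate panels under the model, and pool them with observed data for training.
# The normal "posterior" and AR(1) dynamics are placeholders for the estimated DSGE.
import numpy as np

rng = np.random.default_rng(0)

def simulate_panel(rho, sigma, n_periods=200, n_vars=3):
    """Simulate one synthetic macro panel under AR(1) dynamics (a model stand-in)."""
    x = np.zeros((n_periods, n_vars))
    for t in range(1, n_periods):
        x[t] = rho * x[t - 1] + rng.normal(0.0, sigma, n_vars)
    return x

# Stand-in posterior draws for (persistence, shock volatility).
posterior_draws = np.column_stack([rng.normal(0.90, 0.03, 1000).clip(0.0, 0.99),
                                   np.abs(rng.normal(0.010, 0.003, 1000))])

synthetic = [simulate_panel(rho, sig) for rho, sig in posterior_draws]
actual = rng.normal(0.0, 0.01, (500, 3))        # placeholder for observed macro data

# Training corpus: theory-consistent synthetic panels mixed with the actual sample.
corpus = synthetic + [actual]
print(len(corpus), corpus[0].shape)
```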
  4. By: Yucheng Yang; Chiyuan Wang; Andreas Schaab; Benjamin Moll
    Abstract: We present a new approach to formulating and solving heterogeneous agent models with aggregate risk. We replace the cross-sectional distribution with low-dimensional prices as state variables and let agents learn equilibrium price dynamics directly from simulated paths. To do so, we introduce a structural reinforcement learning (SRL) method which treats prices via simulation while exploiting agents' structural knowledge of their own individual dynamics. Our SRL method yields a general and highly efficient global solution method for heterogeneous agent models that sidesteps the Master equation and handles problems traditional methods struggle with, in particular nontrivial market-clearing conditions. We illustrate the approach in the Krusell-Smith model, the Huggett model with aggregate shocks, and a HANK model with a forward-looking Phillips curve, all of which we solve globally within minutes.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.18892
  5. By: Ludovic Goudenège (Université Paris-Saclay); Andrea Molent (Università degli Studi di Udine - University of Udine [Italie]); Antonino Zanette (MATHRISK - Mathematical Risk Handling - UPEM - Université Paris-Est Marne-la-Vallée - Centre Inria de Paris - Inria - Institut National de Recherche en Informatique et en Automatique - ENPC - École nationale des ponts et chaussées - IP Paris - Institut Polytechnique de Paris)
    Abstract: Total value adjustment (XVA) is the change in value to be added to the price of a derivative to account for the bilateral default risk and the funding costs. In this paper, we compute such a premium for American basket derivatives whose payoff depends on multiple underlyings. In particular, in our model, those underlyings are assumed to follow the multidimensional Black-Scholes stochastic model. In order to determine the XVA, we follow the approach introduced by (Burgard and Kjaer in SSRN Electronic J 7:1–19, 2010) and afterward applied by (Arregui et al. in Appl Math Comput 308:31–53, 2017), (Arregui et al. in Int J Comput Math 96:2157–2176, 2019) for the one-dimensional American derivatives. The evaluation of the XVA for basket derivatives is particularly challenging as the presence of several underlyings leads to a high-dimensional control problem. We tackle such an obstacle by resorting to Gaussian Process Regression, a machine learning technique that allows one to address the curse of dimensionality effectively. Moreover, the use of numerical techniques, such as control variates, turns out to be a powerful tool to improve the accuracy of the proposed methods. The paper includes the results of several numerical experiments that confirm the effectiveness of the proposed methodologies.
    Keywords: Control variates, Basket option, Gaussian process regression, XVA, American options, Transaction costs, Greeks, Hedging
    Date: 2025–08–08
    URL: https://d.repec.org/n?u=RePEc:hal:journl:hal-05421581
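A minimal sketch of the regression step that makes the high-dimensional problem tractable: discounted continuation values for an American basket option are regressed on the simulated underlyings with Gaussian Process Regression. The payoff, the one-step simulation, and all parameters are simplified stand-ins, and the XVA terms of the Burgard-Kjaer approach and the control variates are not reproduced.

```python
# One backward-induction step for an American basket put: regress discounted
# continuation values on the simulated underlyings with Gaussian Process Regression.
# Simplified stand-in for the paper's method; no XVA terms or control variates here.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n_paths, dim, K, r, dt = 500, 5, 100.0, 0.03, 1 / 50

# Simulated underlyings at t and t+dt under multidimensional Black-Scholes (i.i.d. here).
S_t = 100.0 * np.exp(rng.normal(0.0, 0.2, (n_paths, dim)))
S_t1 = S_t * np.exp((r - 0.02) * dt + 0.2 * np.sqrt(dt) * rng.normal(size=(n_paths, dim)))

def payoff(S):
    return np.maximum(K - S.mean(axis=1), 0.0)            # arithmetic basket put

value_t1 = payoff(S_t1)                                    # placeholder for V(t + dt)

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=50.0) + WhiteKernel(),
                               normalize_y=True)
gpr.fit(S_t, np.exp(-r * dt) * value_t1)                   # continuation value at t
continuation = gpr.predict(S_t)

value_t = np.maximum(payoff(S_t), continuation)            # exercise vs. continue
print(value_t.mean())
```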
  6. By: Quintana Pablo; Herrera-Gomez Marcos
    Abstract: Identifying regions that are both spatially contiguous and internally homogeneous remains a core challenge in spatial analysis and regional economics, especially with the increasing complexity of modern datasets. These limitations are particularly problematic when working with socioeconomic data that evolve over time. This paper presents a novel methodology for spatio-temporal regionalization—Spatial Deep Embedded Clustering (SDEC)—which integrates deep learning with spatially constrained clustering to effectively process time series data. The approach uses autoencoders to capture hidden temporal patterns and reduce dimensionality before clustering, ensuring that both spatial contiguity and temporal coherence are maintained. Through Monte Carlo simulations, we show that SDEC significantly outperforms traditional methods in capturing complex temporal patterns while preserving spatial structure. Using empirical examples, we demonstrate that the proposed framework provides a robust, scalable, and data-driven tool for researchers and policymakers working in public health, urban planning, and regional economic analysis.
    JEL: C1 C4 C45 C63
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:aep:anales:4831
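A minimal sketch of the regionalization idea described above, with PCA standing in for the paper's autoencoder and a k-nearest-neighbour graph standing in for true spatial contiguity; this is not the SDEC method itself, only an illustration of compressing each unit's time series and clustering under a spatial constraint.

```python
# Spatio-temporal regionalization in the spirit described above: compress each
# region's time series to a low-dimensional embedding, then cluster under a spatial
# contiguity constraint. PCA stands in for the paper's autoencoder; a k-nearest-
# neighbour graph on coordinates stands in for contiguity. Not SDEC itself.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
n_regions, n_periods = 300, 120
coords = rng.uniform(0, 10, (n_regions, 2))               # region centroids (synthetic)
series = np.cumsum(rng.normal(0, 1, (n_regions, n_periods)), axis=1)

embedding = PCA(n_components=8).fit_transform(series)     # compress temporal patterns
contiguity = kneighbors_graph(coords, n_neighbors=6, include_self=False)

labels = AgglomerativeClustering(n_clusters=10,
                                 connectivity=contiguity,
                                 linkage="ward").fit_predict(embedding)
print(np.bincount(labels))                                 # region sizes
```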
  7. By: Christopher Regan; Ying Xie
    Abstract: We introduce obfuscation testing, a novel methodology for validating whether large language models detect structural market patterns through causal reasoning rather than temporal association. Testing three dealer hedging constraint patterns (gamma positioning, stock pinning, 0DTE hedging) on 242 trading days (95.6% coverage) of S&P 500 options data, we find LLMs achieve 71.5% detection rate using unbiased prompts that provide only raw gamma exposure values without regime labels or temporal context. The WHO-WHOM-WHAT causal framework forces models to identify the economic actors (dealers), affected parties (directional traders), and structural mechanisms (forced hedging) underlying observed market dynamics. Critically, detection accuracy (91.2%) remains stable even as economic profitability varies quarterly, demonstrating that models identify structural constraints rather than profitable patterns. When prompted with regime labels, detection increases to 100%, but the 71.5% unbiased rate validates genuine pattern recognition. Our findings suggest LLMs possess emergent capabilities for detecting complex financial mechanisms through pure structural reasoning, with implications for systematic strategy development, risk management, and our understanding of how transformer architectures process financial market dynamics.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.17923
  8. By: Zare, Hassan; Mousavi, Ebrahim
    Abstract: This paper presents a novel deep learning framework for video analysis focused on automated key object detection and tactical action recognition within economic activity contexts. The proposed system integrates enhanced motion estimation for robust tracking of functional objects and state-of-the-art 3D pose estimation to extract participant postures relevant to economic decision-making behavior. A deep semantic tactical ontology is employed to model the complex relationships between individuals, objects, and their actions, enabling interpretable and rule-based tactical insight extraction for economic interaction patterns beyond conventional classification. Evaluations conducted on benchmark datasets demonstrate high accuracy, approximately 91% in object detection and 96% in action recognition, highlighting the framework’s applicability to dynamic economic environments involving multi-agent interactions. Comparative analysis against baseline methods shows the effectiveness of the framework in handling complex scenarios with occlusions and rapidly changing economic behaviors. Future work will focus on enhancing preprocessing techniques, automating ontology rule learning, and extending the approach to a wider range of economically oriented domains. This research contributes to advancing intelligent analytics by bridging deep learning with semantic reasoning, fostering improved real-time tactical feedback and decision support in economic environments.
    Keywords: Economic Video Analysis; Tactical Action Recognition; Deep Learning; Semantic Ontology; 3D Pose Estimation
    JEL: C0 C01 L0 L00 P0 R1 R11 R13
    Date: 2025–03–08
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:127062
  9. By: Niclas Griesshaber; Jochen Streb
    Abstract: We leverage multimodal large language models (LLMs) to construct a dataset of 306,070 German patents (1877-1918) from 9,562 archival image scans using our LLM-based pipeline powered by Gemini-2.5-Pro and Gemini-2.5-Flash-Lite. Our benchmarking exercise provides tentative evidence that multimodal LLMs can create higher quality datasets than our research assistants, while also being more than 795 times faster and 205 times cheaper in constructing the patent dataset from our image corpus. About 20 to 50 patent entries are embedded on each page, arranged in a double-column format and printed in Gothic and Roman fonts. The font and layout complexity of our primary source material suggests to us that multimodal LLMs are a paradigm shift in how datasets are constructed in economic history. We open-source our benchmarking and patent datasets as well as our LLM-based data pipeline, which can be easily adapted to other image corpora using LLM-assisted coding tools, lowering the barriers for less technical researchers. Finally, we explain the economics of deploying LLMs for historical dataset construction and conclude by speculating on the potential implications for the field of economic history.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.19675
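A minimal sketch of the kind of multimodal extraction call such a pipeline is built around, using the google-generativeai Python package. The model identifier, prompt, output schema, and file name are assumptions for illustration, not the authors' released pipeline, and the call assumes the model returns valid JSON.

```python
# Minimal multimodal extraction call of the kind such a pipeline is built around:
# send one archival page image plus an instruction and request structured JSON back.
# Model id, prompt, schema, and file name are illustrative assumptions only.
import json
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")                    # placeholder credential
model = genai.GenerativeModel("gemini-2.5-flash-lite")     # model id as an assumption

prompt = (
    "The image is a double-column page of German patent notices in Gothic and Roman "
    "fonts. Return a JSON list with one object per patent entry, using the keys "
    "'patent_number', 'holder', 'location', 'title', and 'date'."
)
page = Image.open("scan_0001.png")                         # hypothetical scan file

response = model.generate_content([prompt, page])
entries = json.loads(response.text)                        # assumes valid JSON is returned
print(len(entries), "entries extracted from the page")
```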
  10. By: Travon Lucius; Christian Koch Jr; Jacob Starling; Julia Zhu; Miguel Urena; Carrie Hu
    Abstract: We present a reinforcement-learning (RL) framework for dynamic hedging of equity index option exposures under realistic transaction costs and position limits. We hedge a normalized option-implied equity exposure (one unit of underlying delta, offset via SPY) by trading the underlying index ETF, using the option surface and macro variables only as state information and not as a direct pricing engine. Building on the "deep hedging" paradigm of Buehler et al. (2019), we design a leak-free environment, a cost-aware reward function, and a lightweight stochastic actor-critic agent trained on daily end-of-day panel data constructed from SPX/SPY implied volatility term structure, skew, realized volatility, and macro rate context. On a fixed train/validation/test split, the learned policy improves risk-adjusted performance versus no-hedge, momentum, and volatility-targeting baselines (higher point-estimate Sharpe); only the GAE policy's test-sample Sharpe is statistically distinguishable from zero, although confidence intervals overlap with a long-SPY benchmark so we stop short of claiming formal dominance. Turnover remains controlled and the policy is robust to doubled transaction costs. The modular codebase, comprising a data pipeline, simulator, and training scripts, is engineered for extensibility to multi-asset overlays, alternative objectives (e.g., drawdown or CVaR), and intraday data. From a portfolio management perspective, the learned overlay is designed to sit on top of an existing SPX or SPY allocation, improving the portfolio's mean-variance trade-off with controlled turnover and drawdowns. We discuss practical implications for portfolio overlays and outline avenues for future work.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.12420
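A minimal sketch of a cost-aware hedging environment of the kind described above: the agent picks a target position in the underlying ETF, pays proportional transaction costs, and receives the hedged P&L as reward. The price dynamics, cost level, and exposure are placeholders, not the paper's environment or its actor-critic agent.

```python
# Minimal cost-aware hedging environment: one unit of option-implied delta exposure
# is offset by trading the underlying ETF; the reward is hedged P&L net of
# proportional transaction costs. Dynamics and parameters are placeholders only.
import numpy as np

class HedgingEnv:
    def __init__(self, cost_bps=2.0, max_pos=2.0, seed=0):
        self.cost = cost_bps / 1e4
        self.max_pos = max_pos
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.price, self.position, self.exposure = 100.0, 0.0, 1.0   # 1 unit of delta to hedge
        return self._state()

    def _state(self):
        return np.array([self.price / 100.0, self.position, self.exposure])

    def step(self, action):
        """action: target ETF position in [-max_pos, max_pos]."""
        target = float(np.clip(action, -self.max_pos, self.max_pos))
        trade = target - self.position
        ret = self.rng.normal(0.0, 0.01)                   # placeholder daily return
        pnl = (self.exposure + target) * self.price * ret  # hedged P&L for the day
        cost = self.cost * abs(trade) * self.price         # proportional trading cost
        self.position = target
        self.price *= 1.0 + ret
        return self._state(), pnl - cost, False, {}

env = HedgingEnv()
state = env.reset()
state, reward, done, _ = env.step(action=-1.0)             # fully offset the exposure
print(round(reward, 4))
```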
  11. By: Ali Merali
    Abstract: This paper derives "Scaling Laws for Economic Impacts" -- empirical relationships between the training compute of Large Language Models (LLMs) and professional productivity. In a preregistered experiment, over 500 consultants, data analysts, and managers completed professional tasks using one of 13 LLMs. We find that each year of AI model progress reduced task time by 8%, with 56% of gains driven by increased compute and 44% by algorithmic progress. However, productivity gains were significantly larger for non-agentic analytical tasks compared to agentic workflows requiring tool use. These findings suggest continued model scaling could boost U.S. productivity by approximately 20% over the next decade.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.21316
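A minimal sketch of the kind of scaling-law regression behind such estimates: log task time is regressed on log training compute and a release-year term, mirroring the split between compute-driven and algorithmic progress. The data below are synthetic placeholders, not the experiment's measurements.

```python
# Illustrative scaling-law regression: log task completion time on log training
# compute and model release year, echoing the compute-vs-algorithmic decomposition.
# All data below are synthetic placeholders, not the experiment's measurements.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
log_compute = rng.normal(24.0, 1.5, n)                     # log10 training FLOP (stand-in)
release_year = rng.integers(2022, 2026, n)
# Synthetic "truth": time falls with compute and with year (algorithmic progress).
log_time = 5.0 - 0.04 * log_compute - 0.05 * (release_year - 2022) + rng.normal(0, 0.3, n)

X = sm.add_constant(np.column_stack([log_compute, release_year - 2022]))
fit = sm.OLS(log_time, X).fit()
print(fit.params)   # slopes w.r.t. compute and residual yearly (algorithmic) progress
```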
  12. By: Jian Xue; Qian Zhang; Wu Zhu
    Abstract: We study how generative artificial intelligence (AI) transforms the work of financial analysts. Using the 2023 launch of FactSet's AI platform as a natural experiment, we find that adoption produces markedly richer and more comprehensive reports -- featuring 40% more distinct information sources, 34% broader topical coverage, and 25% greater use of advanced analytical methods -- while also improving timeliness. However, forecast errors rise by 59% as AI-assisted reports convey a more balanced mix of positive and negative information that is harder to synthesize, particularly for analysts facing heavier cognitive demands. Placebo tests using other data vendors confirm that these effects are unique to FactSet's AI integration. Overall, our findings reveal both the productivity gains and cognitive limits of generative AI in financial information production.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.19705
  13. By: Foltas, Alexander
    Abstract: This paper contributes to previous research on the efficient integration of forecasters' narratives into business cycle forecasts. Using a Bidirectional Encoder Representations from Transformers (BERT) model, I quantify 19,300 paragraphs from German business cycle reports (1998-2021) and use them to predict the direction of consumption forecast errors. By testing the model on an evaluation sample, I find a highly significant correlation of modest strength between predicted and actual sign of the forecast error. The correlation coefficient is substantially higher for the 12.8% of paragraphs with a predicted class probability of 85% or higher. By qualitatively reviewing 150 such high-probability paragraphs, I find recurring narratives correlated with consumption forecast errors. Underestimations of consumption growth often mention rising employment, increasing wages and transfer payments, low inflation, decreasing taxes, crisis-related fiscal support, and reduced relevance of marginal employment. Conversely, overestimated consumption forecasts present opposing narratives. Forecasters appear to particularly underestimate these factors when they disproportionately affect low-income households.
    Keywords: Macroeconomic forecasting, Evaluating forecasts, Business cycles, Consumption forecasting, Natural language processing, Language Modeling, Machine learning, Judgmental forecasting
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:zbw:hwwiwp:334496
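A minimal sketch of the inference step described above, using the Hugging Face transformers pipeline to classify report paragraphs and keep only predictions with class probability of at least 0.85. The checkpoint name is a hypothetical fine-tuned model and the example paragraphs are invented; neither comes from the paper.

```python
# Inference sketch: classify German report paragraphs into the sign of the implied
# consumption forecast error and keep only high-confidence predictions (>= 0.85),
# mirroring the filtering described above. The checkpoint name is a hypothetical
# fine-tuned model, not the paper's released weights.
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="my-org/bert-forecast-error-sign")   # hypothetical checkpoint

paragraphs = [
    "Die Beschäftigung steigt weiter, und die Löhne legen spürbar zu.",
    "Die Inflation dämpft die real verfügbaren Einkommen der Haushalte.",
]

predictions = classifier(paragraphs)
high_conf = [(p, pred) for p, pred in zip(paragraphs, predictions) if pred["score"] >= 0.85]
for text, pred in high_conf:
    print(pred["label"], round(pred["score"], 2), text[:60])
```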
  14. By: Agustín García-García; Pablo Hidalgo; Julio E. Sandubete
    Abstract: Explainable Artificial Intelligence (XAI) is increasingly required in computational economics, where machine-learning forecasters can outperform classical econometric models but remain difficult to audit and use for policy. This survey reviews and organizes the growing literature on XAI for economic time series, where autocorrelation, non-stationarity, seasonality, mixed frequencies, and regime shifts can make standard explanation techniques unreliable or economically implausible. We propose a taxonomy that classifies methods by (i) explanation mechanism: propagation-based approaches (e.g., Integrated Gradients, Layer-wise Relevance Propagation), perturbation and game-theoretic attribution (e.g., permutation importance, LIME, SHAP), and function-based global tools (e.g., Accumulated Local Effects); (ii) time-series compatibility, including preservation of temporal dependence, stability over time, and respect for data-generating constraints. We synthesize time-series-specific adaptations such as vector- and window-based formulations (e.g., Vector SHAP, WindowSHAP) that reduce lag fragmentation and computational cost while improving interpretability. We also connect explainability to causal inference and policy analysis through interventional attributions (Causal Shapley values) and constrained counterfactual reasoning. Finally, we discuss intrinsically interpretable architectures (notably attention-based transformers) and provide guidance for decision-grade applications such as nowcasting, stress testing, and regime monitoring, emphasizing attribution uncertainty and explanation dynamics as indicators of structural change.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.12506
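A minimal sketch of one perturbation-based method covered by the review: permutation importance of lagged features for a tree-based forecaster, evaluated on a chronological rather than shuffled split in line with the time-series caveats the survey stresses. The series and model are placeholders.

```python
# Perturbation-based attribution on a time-series forecaster: permutation importance
# of lagged features under a chronological (not shuffled) train/test split, one of
# the method families covered by the survey. Data and model are placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
y = pd.Series(np.cumsum(rng.normal(0, 1, 600)))            # synthetic macro series
X = pd.concat({f"lag{k}": y.shift(k) for k in (1, 2, 3, 6, 12)}, axis=1).dropna()
y_t = y.loc[X.index]

split = int(0.8 * len(X))                                  # chronological split
model = GradientBoostingRegressor().fit(X.iloc[:split], y_t.iloc[:split])

result = permutation_importance(model, X.iloc[split:], y_t.iloc[split:],
                                n_repeats=20, random_state=0)
for name, mean_imp in sorted(zip(X.columns, result.importances_mean),
                             key=lambda kv: -kv[1]):
    print(f"{name}: {mean_imp:.3f}")
```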
  15. By: Hanyu Li; Xiaotie Deng
    Abstract: Classic no-trade theorems attribute trade to heterogeneous beliefs. We re-examine this conclusion for AI agents, asking if trade can arise from computational limitations under common beliefs. We model agents' bounded computational rationality within an unfolding game framework, where computational power determines the complexity of an agent's strategy. Our central finding inverts the classic paradigm: a stable no-trade outcome (Nash equilibrium) is reached only when "almost rational" agents have slightly different computational power. Paradoxically, when agents possess identical power, they may fail to converge to equilibrium, resulting in persistent strategic adjustments that constitute a form of trade. This instability is exacerbated if agents can strategically under-utilize their computational resources, which eliminates any chance of equilibrium in Matching Pennies scenarios. Our results suggest that the inherent computational limitations of AI agents can lead to situations where equilibrium is not reached, creating a livelier and less predictable trading environment than traditional models would predict.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.17952
  16. By: Carolin Hasse (Johannes-Gutenberg University, Germany); Stefan Irnich (Johannes-Gutenberg University, Germany)
    Abstract: The vehicle routing problem with load-dependent cost is an extension of the classical capacitated vehicle routing problem in which the cost of traveling along an arc depends on the load carried by the vehicle. For the benefit of generalization, this work considers the vehicle routing problem with simultaneous delivery and pickup, time windows, and load-dependent cost (VRPSDPTW-LDC). We utilize both continuous and discontinuous monotonically non-decreasing load-dependent cost functions. These cost structures are justified by real-life applications: First and foremost, transportation cost rises with load due to increasing fuel cost. In addition, cost functions may also show discontinuities due to toll-by-weight schemes, weight-restricted passages, and lift axles that may be raised when the vehicle is empty or lightly loaded, therefore decreasing tire wear. We employ a fully equipped branch-price-and-cut algorithm to solve the VRPSDPTW-LDC. A major complication in its development is the consistent handling of the load-dependent cost in the column-generation subproblem when solved by bidirectional labeling algorithms. Indeed, in the VRPSDPTW-LDC, the precise load on board is not known when a partial path is constructed. We provide a unifying description of the associated resource extension function for forward and backward labeling. In several computational experiments, we analyze algorithmic components of the branch-price-and-cut algorithm and give managerial insights on the impact of the cost structure on key metrics such as total cost, the number of routes, and the average load carried in an optimal solution.
    Keywords: routing, load-dependent cost, simultaneous delivery and pickup, branch-price-and-cut, dynamic-programming labeling algorithm
    Date: 2025–11–06
    URL: https://d.repec.org/n?u=RePEc:jgu:wpaper:2509
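A minimal sketch of a forward resource extension function with a load-dependent arc cost, the ingredient highlighted above. For illustration it assumes the initial delivery load is known, which is precisely the complication the authors have to handle in pricing; resources, feasibility checks, and the cost function are simplified, and the surrounding branch-price-and-cut machinery is omitted.

```python
# Forward label extension with a load-dependent arc cost: a label carries the node,
# time, remaining delivery load, collected pickup load, and accumulated cost, and the
# arc cost grows with the load on board. Feasibility checks are simplified and the
# surrounding branch-price-and-cut machinery is omitted.
from dataclasses import dataclass

@dataclass(frozen=True)
class Label:
    node: int
    time: float
    delivery_load: float   # still to be delivered (assumed known here for illustration)
    pickup_load: float     # already collected
    cost: float

def arc_cost(dist: float, load: float, base: float = 1.0, per_unit: float = 0.02) -> float:
    """Monotonically non-decreasing load-dependent cost, e.g. fuel use rising with weight."""
    return dist * (base + per_unit * load)

def extend(label: Label, j: int, dist: float, travel: float, delivery_j: float,
           pickup_j: float, tw_open: float, tw_close: float, capacity: float):
    """Extend a forward label along arc (label.node, j); return None if infeasible."""
    load_on_board = label.delivery_load + label.pickup_load
    arrival = max(label.time + travel, tw_open)
    if arrival > tw_close:
        return None
    new_delivery = label.delivery_load - delivery_j
    new_pickup = label.pickup_load + pickup_j
    if new_delivery < 0 or new_delivery + new_pickup > capacity:
        return None
    return Label(node=j, time=arrival, delivery_load=new_delivery,
                 pickup_load=new_pickup, cost=label.cost + arc_cost(dist, load_on_board))

start = Label(node=0, time=0.0, delivery_load=10.0, pickup_load=0.0, cost=0.0)
print(extend(start, j=1, dist=5.0, travel=0.5, delivery_j=4.0, pickup_j=2.0,
             tw_open=0.0, tw_close=3.0, capacity=12.0))
```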
  17. By: Enoch Hyunwook Kang
    Abstract: Field experiments (A/B tests) are often the most credible benchmark for methods in societal systems, but their cost and latency create a major bottleneck for iterative method development. LLM-based persona simulation offers a cheap synthetic alternative, yet it is unclear whether replacing humans with personas preserves the benchmark interface that adaptive methods optimize against. We prove an if-and-only-if characterization: when (i) methods observe only the aggregate outcome (aggregate-only observation) and (ii) evaluation depends only on the submitted artifact and not on the algorithm's identity or provenance (algorithm-blind evaluation), swapping humans for personas is just a panel change from the method's point of view, indistinguishable from changing the evaluation population (e.g., New York to Jakarta). Furthermore, we move from validity to usefulness: we define an information-theoretic discriminability of the induced aggregate channel and show that making persona benchmarking as decision-relevant as a field experiment is fundamentally a sample-size question, yielding explicit bounds on the number of independent persona evaluations required to reliably distinguish meaningfully different methods at a chosen resolution.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.21080
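A back-of-the-envelope sketch of the sample-size question raised above: how many independent persona evaluations are needed per method to resolve a given gap in mean outcomes, using a standard two-sample normal power calculation. This is only the familiar power-analysis analogue, not the paper's information-theoretic bound; the effect sizes and noise level are assumptions.

```python
# Back-of-the-envelope analogue of the sample-size question raised above: persona
# evaluations needed per method to distinguish a mean-outcome gap of `delta` with
# noise `sigma`, via a standard two-sample normal power calculation. This is the
# familiar power-analysis analogue, not the paper's information-theoretic bound.
from math import ceil
from scipy.stats import norm

def evals_per_method(delta: float, sigma: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Two-sample comparison of means with common std sigma and effect size delta."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return ceil(2 * (z * sigma / delta) ** 2)

# Example: outcomes on a 0-1 scale with sd 0.25, resolving a 0.02 gap vs. a 0.05 gap.
for delta in (0.02, 0.05):
    print(delta, evals_per_method(delta=delta, sigma=0.25))
```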

This nep-cmp issue is ©2026 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.