nep-cmp New Economics Papers
on Computational Economics
Issue of 2026-02-02
23 papers chosen by
Stan Miles, Thompson Rivers University


  1. Artificial Intelligence–Based Forecasting of Oil Prices: Evidence from Neural Network Models By Ficura, Milan; Ibragimov, Rustam; Janda, Karel
  2. Stochastic Deep Learning: A Probabilistic Framework for Modeling Uncertainty in Structured Temporal Data By James Rice
  3. Variational Quantum Circuit-Based Reinforcement Learning for Dynamic Portfolio Optimization By Vincent Gurgul; Ying Chen; Stefan Lessmann
  4. Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting? By Alexander Eliseev; Sergei Seleznev
  5. In-Season US Corn Acreage Forecasting Using Machine Learning By Ac-Pangan, Walter; Hendricks, Nathan P.
  6. Integrating LSTM Networks with Neural Levy Processes for Financial Forecasting By Mohammed Alruqimi; Luca Di Persio
  7. Generating Alpha: A Hybrid AI-Driven Trading System Integrating Technical Analysis, Machine Learning and Financial Sentiment for Regime-Adaptive Equity Strategies By Varun Narayan Kannan Pillai; Akshay Ajith; Sumesh K J
  8. Teaching Economics to the Machines By Hui Chen; Yuhan Cheng; Yanchu Liu; Ke Tang
  9. PriceSeer: Evaluating Large Language Models in Real-Time Stock Prediction By Bohan Liang; Zijian Chen; Qi Jia; Kaiwei Zhang; Kaiyuan Ji; Guangtao Zhai
  10. Look-Ahead-Bench: a Standardized Benchmark of Look-ahead Bias in Point-in-Time LLMs for Finance By Mostapha Benhenda
  11. Trade-R1: Bridging Verifiable Rewards to Stochastic Environments via Process-Level Reasoning Verification By Rui Sun; Yifan Sun; Sheng Xu; Li Zhao; Jing Li; Daxin Jiang; Cheng Hua; Zuo Bai
  12. Incorporating Cognitive Biases into Reinforcement Learning for Financial Decision-Making By Liu He
  13. Can Large Language Models Improve Venture Capital Exit Timing After IPO? By Mohammadhossien Rashidi
  14. Bayesian Robust Financial Trading with Adversarial Synthetic Market Data By Haochong Xia; Simin Li; Ruixiao Xu; Zhixia Zhang; Hongxiang Wang; Zhiqian Liu; Teng Yao Long; Molei Qin; Chuqiao Zong; Bo An
  15. The Limits of Complexity: Why Feature Engineering Beats Deep Learning in Investor Flow Prediction By Sungwoo Kang
  16. Forecasting the U.S. Treasury Yield Curve: A Distributionally Robust Machine Learning Approach By Jinjun Liu; Ming-Yen Cheng
  17. Forecasting Equity Correlations with Hybrid Transformer Graph Neural Network By Jack Fanshawe; Rumi Masih; Alexander Cameron
  18. Enhancing Portfolio Optimization with Deep Learning Insights By Brandon Luo; Jim Skufca
  19. Riesz Representer Fitting under Bregman Divergence: A Unified Framework for Debiased Machine Learning By Masahiro Kato
  20. Nonlinear Regression Modeling via Machine Learning Techniques with Applications in Business and Economics By Sunil K Sapra
  21. Manipulation in Prediction Markets: An Agent-based Modeling Experiment By Bridget Smart; Ebba Mark; Anne Bastian; Josefina Waugh
  22. Superharddata: Liability-Grounded Information as Training Substrate for Aligned Artificial Intelligence By Beier, Gregory Caldwell
  23. LLM-Generated Counterfactual Stress Scenarios for Portfolio Risk Simulation via Hybrid Prompt-RAG Pipeline By Masoud Soleimani

  1. By: Ficura, Milan; Ibragimov, Rustam; Janda, Karel
    Abstract: This working paper investigates the application of modern artificial intelligence techniques to financial time-series forecasting, with a specific focus on crude oil futures markets. Building on advances in deep learning and natural language processing, the study evaluates the predictive performance and economic relevance of several neural network architectures, including univariate and multivariate LSTM, CNN, and N-HiTS models. In addition to statistical accuracy, the models are assessed through trading-based performance metrics and factor regressions to examine the presence of economically and statistically significant returns. The paper contributes to the growing literature on AI-driven asset price forecasting by demonstrating that multivariate deep learning models incorporating additional market information and sentiment measures can improve both forecast precision and trading performance in commodity markets.
    Keywords: Artificial intelligence, Deep learning, Oil futures, Time-series forecasting
    JEL: C45 Q47 G13 G17
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:zbw:esprep:335571
  2. By: James Rice
    Abstract: I propose a novel framework that integrates stochastic differential equations (SDEs) with deep generative models to improve uncertainty quantification in machine learning applications involving structured and temporal data. This approach, termed Stochastic Latent Differential Inference (SLDI), embeds an Itô SDE in the latent space of a variational autoencoder, allowing for flexible, continuous-time modeling of uncertainty while preserving a principled mathematical foundation. The drift and diffusion terms of the SDE are parameterized by neural networks, enabling data-driven inference and generalizing classical time series models to handle irregular sampling and complex dynamic structure. A central theoretical contribution is the co-parameterization of the adjoint state with a dedicated neural network, forming a coupled forward-backward system that captures not only latent evolution but also gradient dynamics. I introduce a pathwise-regularized adjoint loss and analyze variance-reduced gradient flows through the lens of stochastic calculus, offering new tools for improving training stability in deep latent SDEs. My paper unifies and extends variational inference, continuous-time generative modeling, and control-theoretic optimization, providing a rigorous foundation for future developments in stochastic probabilistic machine learning.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.05227
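The core object here, a latent SDE with neural-network drift and diffusion, can be sketched numerically. The toy below uses untrained random-weight networks and a plain Euler-Maruyama step; it illustrates only the building block, not the paper's SLDI framework (which adds a VAE, adjoint co-parameterization, and a pathwise-regularized loss):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy neural drift and diffusion: one shared hidden layer with random
# (untrained) weights, standing in for learned networks.
W1, b1 = rng.normal(size=(8, 2)), np.zeros(8)
W2, b2 = rng.normal(size=(2, 8)) * 0.1, np.zeros(2)
W3, b3 = rng.normal(size=(2, 8)) * 0.1, np.zeros(2)

def drift(z):
    h = np.tanh(W1 @ z + b1)
    return W2 @ h + b2

def diffusion(z):
    h = np.tanh(W1 @ z + b1)
    return 0.1 * (1.0 + np.tanh(W3 @ h + b3))  # positive, state-dependent

def euler_maruyama(z0, dt, n_steps, rng):
    """Simulate one latent path of dz = f(z) dt + g(z) dW."""
    z = np.array(z0, dtype=float)
    path = [z.copy()]
    for _ in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt), size=z.shape)
        z = z + drift(z) * dt + diffusion(z) * dw
        path.append(z.copy())
    return np.stack(path)

path = euler_maruyama([0.0, 0.0], dt=0.01, n_steps=100, rng=rng)
print(path.shape)  # (101, 2)
```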
  3. By: Vincent Gurgul; Ying Chen; Stefan Lessmann
    Abstract: This paper presents a Quantum Reinforcement Learning (QRL) solution to the dynamic portfolio optimization problem based on Variational Quantum Circuits. The implemented QRL approaches are quantum analogues of the classical neural-network-based Deep Deterministic Policy Gradient and Deep Q-Network algorithms. Through an empirical evaluation on real-world financial data, we show that our quantum agents achieve risk-adjusted performance comparable to, and in some cases exceeding, that of classical Deep RL models with several orders of magnitude more parameters. However, while quantum circuit execution is inherently fast at the hardware level, practical deployment on cloud-based quantum systems introduces substantial latency, making end-to-end runtime currently dominated by infrastructural overhead and limiting practical applicability. Taken together, our results suggest that QRL is theoretically competitive with state-of-the-art classical reinforcement learning and may become practically advantageous as deployment overheads diminish. This positions QRL as a promising paradigm for dynamic decision-making in complex, high-dimensional, and non-stationary environments such as financial markets. The complete codebase is released as open source at: https://github.com/VincentGurgul/qrl-dpo-public
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.18811
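For readers unfamiliar with variational quantum circuits, the basic building block (parameterized single-qubit rotations plus an entangling gate, read out as a Pauli-Z expectation) can be simulated classically in a few lines. This 2-qubit toy is only a sketch of the circuit family, not the paper's quantum DDPG/DQN agents:

```python
import numpy as np

def ry(theta):
    """Single-qubit Y-rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)

def vqc_expectation(params):
    """Run a 2-qubit circuit (RY layer, CNOT, RY layer) and return <Z>
    on qubit 0, used here as a stand-in for a Q-value readout."""
    state = np.zeros(4); state[0] = 1.0            # start in |00>
    state = np.kron(ry(params[0]), ry(params[1])) @ state
    state = CNOT @ state                            # entangling gate
    state = np.kron(ry(params[2]), ry(params[3])) @ state
    obs = np.kron(Z, I2)                            # Pauli-Z on qubit 0
    return float(state @ obs @ state)

# With all angles zero the circuit leaves |00> untouched, so <Z> = 1.
print(round(vqc_expectation(np.zeros(4)), 6))  # 1.0
```

A classical optimizer would tune `params` against a reward signal; on hardware, only the expectation values are read out.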
  4. By: Alexander Eliseev; Sergei Seleznev
    Abstract: Large language models (LLMs) are a type of machine learning tool that economists have started to apply in their empirical research. One such application is macroeconomic forecasting with backtesting of LLMs, even though they are trained on the same data that is used to estimate their forecasting performance. Can these in-sample accuracy results be extrapolated to the model's out-of-sample performance? To answer this question, we developed a family of prompt sensitivity tests and two members of this family, which we call the fake date tests. These tests aim to detect two types of biases in LLMs' in-sample forecasts: lookahead bias and context bias. According to the empirical results, none of the modern LLMs tested in this study passed our first test, signaling the presence of lookahead bias in their in-sample forecasts.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.07992
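The logic of a fake date test can be illustrated with a small harness: feed the same context to a forecaster twice, once labeled with the true date and once with a counterfactual one, and flag any sensitivity to the label alone. The `forecast_fn` interface and the toy forecasters below are hypothetical stand-ins for LLM calls, not the paper's actual tests:

```python
from datetime import date

def fake_date_test(forecast_fn, context, true_date, fake_date, tol=1e-6):
    """Compare forecasts for identical data labeled with the true date
    vs. a counterfactual ('fake') date. A large gap suggests the model
    conditions on the date label itself, a symptom of lookahead bias.
    `forecast_fn` is a hypothetical interface, not from the paper."""
    f_true = forecast_fn(context, true_date)
    f_fake = forecast_fn(context, fake_date)
    return abs(f_true - f_fake) <= tol  # True = test passed

# Toy forecasters standing in for LLM calls:
def unbiased(context, d):      # ignores the date label entirely
    return sum(context) / len(context)

def biased(context, d):        # "remembers" what happened after 2022
    base = sum(context) / len(context)
    return base + (0.5 if d > date(2022, 1, 1) else 0.0)

ctx = [2.1, 2.3, 2.0]
print(fake_date_test(unbiased, ctx, date(2023, 6, 1), date(2019, 6, 1)))  # True
print(fake_date_test(biased, ctx, date(2023, 6, 1), date(2019, 6, 1)))   # False
```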
  5. By: Ac-Pangan, Walter; Hendricks, Nathan P.
    Keywords: Marketing
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ags:aaea25:360869
  6. By: Mohammed Alruqimi; Luca Di Persio
    Abstract: This paper investigates an optimal integration of deep learning with financial models for robust asset price forecasting. Specifically, we developed a hybrid framework combining a Long Short-Term Memory (LSTM) network with the Merton-Lévy jump-diffusion model. To optimise this framework, we employed the Grey Wolf Optimizer (GWO) for the LSTM hyperparameter tuning, and we explored three calibration methods for the Merton-Lévy model parameters: Artificial Neural Networks (ANNs), the Marine Predators Algorithm (MPA), and the PyTorch-based TorchSDE library. To evaluate the predictive performance of our hybrid model, we compared it against several benchmark models, including a standard LSTM and an LSTM combined with the Fractional Heston model. This evaluation used three real-world financial datasets: Brent oil prices, the STOXX 600 index, and the IT40 index. Performance was assessed using standard metrics, including Mean Squared Error (MSE), Mean Absolute Error (MAE), Mean Squared Percentage Error (MSPE), and the coefficient of determination (R²). Our experimental results demonstrate that the hybrid model, combining a GWO-optimized LSTM network with the Merton-Lévy jump-diffusion model calibrated using an ANN, outperformed the base LSTM model and all other models developed in this study.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07860
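The Merton jump-diffusion component the hybrid model builds on can be simulated directly. A minimal Monte Carlo sketch (parameters are illustrative, and multiple jumps within a step are approximated by scaling a single draw, which is adequate for small time steps):

```python
import numpy as np

def merton_paths(s0, mu, sigma, lam, m_j, s_j, t, n_steps, n_paths, seed=0):
    """Simulate Merton jump-diffusion log-prices:
    d ln S = (mu - sigma^2/2) dt + sigma dW + J dN,
    with N ~ Poisson(lam dt) and jump sizes J ~ Normal(m_j, s_j^2)."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    log_s = np.full(n_paths, np.log(s0))
    paths = [np.exp(log_s)]
    for _ in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt), size=n_paths)
        n_jumps = rng.poisson(lam * dt, size=n_paths)
        # Approximation: one jump-size draw scaled by the Poisson count.
        jumps = rng.normal(m_j, s_j, size=n_paths) * n_jumps
        log_s = log_s + (mu - 0.5 * sigma**2) * dt + sigma * dw + jumps
        paths.append(np.exp(log_s))
    return np.stack(paths)

paths = merton_paths(s0=100, mu=0.05, sigma=0.2, lam=1.0,
                     m_j=-0.05, s_j=0.1, t=1.0, n_steps=252, n_paths=500)
print(paths.shape)  # (253, 500)
```

In a hybrid setup like the paper's, simulated paths (or calibrated parameters) would feed the LSTM as additional structure rather than being used alone.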
  7. By: Varun Narayan Kannan Pillai; Akshay Ajith; Sumesh K J
    Abstract: The intricate behavior patterns of financial markets are influenced by fundamental, technical, and psychological factors. Times of high volatility and regime shifts cause many traditional strategies, such as trend-following or mean-reversion, to fail. This paper proposes a hybrid AI-based trading strategy that combines (1) trend-following and directional momentum capture via EMA and MACD, (2) detection of price normalization through mean-reversion using RSI and Bollinger Bands, (3) market psychological interpretation through sentiment analysis using FinBERT, (4) signal generation through machine learning using XGBoost, and (5) dynamic exposure adjustment with market regime filtering based on volatility and return environments. The system achieved a final portfolio value of $235,492.83, yielding a return of 135.49% on the initial investment over a period of 24 months. The hybrid model outperformed major benchmark indexes such as the S&P 500 and NASDAQ-100 over the same period, showing strong flexibility and lower downside risk with superior profits, validating the use of multi-modal AI in algorithmic trading.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.19504
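Components (1) and (2) rest on standard technical indicators. The sketch below computes MACD, RSI, and Bollinger Bands with textbook formulas and conventional window lengths; the paper's exact parameterization is not specified here:

```python
import numpy as np
import pandas as pd

def indicators(close, rsi_n=14, bb_n=20):
    """Textbook formulas for the indicators the strategy combines;
    window defaults are common conventions, not taken from the paper."""
    out = pd.DataFrame(index=close.index)
    # MACD: fast EMA minus slow EMA, plus a signal line.
    ema12 = close.ewm(span=12, adjust=False).mean()
    ema26 = close.ewm(span=26, adjust=False).mean()
    out["macd"] = ema12 - ema26
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()
    # RSI: relative strength of average gains vs. average losses.
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(rsi_n).mean()
    loss = (-delta.clip(upper=0)).rolling(rsi_n).mean()
    out["rsi"] = 100 - 100 / (1 + gain / loss)
    # Bollinger Bands: rolling mean +/- 2 rolling standard deviations.
    mid = close.rolling(bb_n).mean()
    sd = close.rolling(bb_n).std()
    out["bb_upper"], out["bb_lower"] = mid + 2 * sd, mid - 2 * sd
    return out

rng = np.random.default_rng(1)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 200))))
feats = indicators(close)
print(feats.shape[1])  # 5 indicator columns
```

In a pipeline like the paper's, these columns would be joined with sentiment scores and fed to the XGBoost signal model.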
  8. By: Hui Chen; Yuhan Cheng; Yanchu Liu; Ke Tang
    Abstract: Structural economic models, while parsimonious and interpretable, often exhibit poor data fit and limited forecasting performance. Machine learning models, by contrast, offer substantial flexibility but are prone to overfitting and weak out-of-distribution generalization. We propose a theory-guided transfer learning framework that integrates structural restrictions from economic theory into machine learning models. The approach pre-trains a neural network on synthetic data generated by a structural model and then fine-tunes it using empirical data, allowing potentially misspecified economic restrictions to inform and regularize learning on empirical data. Applied to option pricing, our model substantially outperforms both structural and purely data-driven benchmarks, with especially large gains in small samples, under unstable market conditions, and when model misspecification is limited. Beyond performance, the framework provides diagnostics for improving structural models and introduces a new model-comparison metric based on data-model complementarity.
    JEL: C45 C52 G13
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:34713
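The pre-train-then-fine-tune recipe can be sketched with a small regression network: pre-train on abundant synthetic option prices from a structural model (Black-Scholes here, purely for illustration; the paper's structural models and data differ), then continue training on a small "empirical" sample:

```python
import numpy as np
from scipy.stats import norm
from sklearn.neural_network import MLPRegressor

def bs_call(s, k, t, r, sigma):
    """Black-Scholes call price, the structural 'teacher' in this toy."""
    d1 = (np.log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * np.sqrt(t))
    d2 = d1 - sigma * np.sqrt(t)
    return s * norm.cdf(d1) - k * np.exp(-r * t) * norm.cdf(d2)

rng = np.random.default_rng(0)

def sample(n):
    s = rng.uniform(80, 120, n); k = rng.uniform(80, 120, n)
    t = rng.uniform(0.1, 2.0, n); sigma = rng.uniform(0.1, 0.5, n)
    return np.column_stack([s, k, t, sigma]), bs_call(s, k, t, 0.02, sigma)

# 1) Pre-train on abundant synthetic data from the structural model.
x_syn, y_syn = sample(2000)
net = MLPRegressor(hidden_layer_sizes=(32, 32), warm_start=True,
                   max_iter=200, random_state=0)
net.fit(x_syn, y_syn)

# 2) Fine-tune on a small "empirical" sample where the structural model
#    is misspecified (a constant premium stands in for real frictions).
x_emp, y_emp = sample(200)
net.fit(x_emp, y_emp + 0.5)  # warm_start=True resumes from pre-trained weights
pred = net.predict(x_emp)
print(pred.shape)  # (200,)
```

The structural pre-training acts as a regularizer: the fine-tuned network starts from economically sensible pricing behavior instead of random weights.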
  9. By: Bohan Liang; Zijian Chen; Qi Jia; Kaiwei Zhang; Kaiyuan Ji; Guangtao Zhai
    Abstract: Stock prediction, a subject closely related to people's investment activities in fully dynamic and live environments, has been widely studied. Current large language models (LLMs) have shown remarkable potential in various domains, exhibiting expert-level performance through advanced reasoning and contextual understanding. In this paper, we introduce PriceSeer, a live, dynamic, and data-uncontaminated benchmark specifically designed for LLMs performing stock prediction tasks. Specifically, PriceSeer includes 110 U.S. stocks from 11 industrial sectors, with each containing 249 historical data points. Our benchmark implements both internal and external information expansion, where LLMs receive extra financial indicators, news, and fake news to perform stock price prediction. We evaluate six cutting-edge LLMs under different prediction horizons, demonstrating their potential in generating investment strategies after obtaining accurate price predictions for different sectors. Additionally, we provide analyses of LLMs' suboptimal performance in long-term predictions, including the vulnerability to fake news and specific industries. The code and evaluation data will be open-sourced at https://github.com/BobLiang2113/PriceSeer.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.06088
  10. By: Mostapha Benhenda (LAGA)
    Abstract: We introduce Look-Ahead-Bench, a standardized benchmark measuring look-ahead bias in Point-in-Time (PiT) Large Language Models (LLMs) within realistic and practical financial workflows. Unlike most existing approaches that primarily test inner lookahead knowledge via Q&A, our benchmark evaluates model behavior in practical scenarios. To distinguish genuine predictive capability from memorization-based performance, we analyze performance decay across temporally distinct market regimes, incorporating several quantitative baselines to establish performance thresholds. We evaluate prominent open-source LLMs -- Llama 3.1 (8B and 70B) and DeepSeek 3.2 -- against a family of Point-in-Time LLMs (Pitinf-Small, Pitinf-Medium, and frontier-level model Pitinf-Large) from PiT-Inference. Results reveal significant lookahead bias in standard LLMs, as measured with alpha decay, unlike Pitinf models, which demonstrate improved generalization and reasoning abilities as they scale in size. This work establishes a foundation for the standardized evaluation of temporal bias in financial LLMs and provides a practical framework for identifying models suitable for real-world deployment. Code is available on GitHub: https://github.com/benstaf/lookaheadbench
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.13770
  11. By: Rui Sun; Yifan Sun; Sheng Xu; Li Zhao; Jing Li; Daxin Jiang; Cheng Hua; Zuo Bai
    Abstract: Reinforcement Learning (RL) has enabled Large Language Models (LLMs) to achieve remarkable reasoning in domains like mathematics and coding, where verifiable rewards provide clear signals. However, extending this paradigm to financial decision-making is challenged by the market's stochastic nature: rewards are verifiable but inherently noisy, causing standard RL to degenerate into reward hacking. To address this, we propose Trade-R1, a model training framework that bridges verifiable rewards to stochastic environments via process-level reasoning verification. Our key innovation is a verification method that transforms the problem of evaluating reasoning over lengthy financial documents into a structured Retrieval-Augmented Generation (RAG) task. We construct a triangular consistency metric, assessing pairwise alignment between retrieved evidence, reasoning chains, and decisions to serve as a validity filter for noisy market returns. We explore two reward integration strategies: Fixed-effect Semantic Reward (FSR) for stable alignment signals, and Dynamic-effect Semantic Reward (DSR) for coupled magnitude optimization. Experiments on asset selection across different countries demonstrate that our paradigm reduces reward hacking, with DSR achieving superior cross-market generalization while maintaining the highest reasoning consistency.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.03948
  12. By: Liu He
    Abstract: Financial markets are influenced by human behavior that deviates from rationality due to cognitive biases. Traditional reinforcement learning (RL) models for financial decision-making assume rational agents, potentially overlooking the impact of psychological factors. This study integrates cognitive biases into RL frameworks for financial trading, hypothesizing that such models can exhibit human-like trading behavior and achieve better risk-adjusted returns than standard RL agents. We introduce biases, such as overconfidence and loss aversion, into reward structures and decision-making processes and evaluate their performance in simulated and real-world trading environments. Despite its inconclusive or negative results, this study provides insights into the challenges of incorporating human-like biases into RL, offering valuable lessons for developing robust financial AI systems.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.08247
  13. By: Mohammadhossien Rashidi
    Abstract: Exit timing after an IPO is one of the most consequential decisions for venture capital (VC) investors, yet existing research focuses mainly on describing when VCs exit rather than evaluating whether those choices are economically optimal. Meanwhile, large language models (LLMs) have shown promise in synthesizing complex financial data and textual information but have not been applied to post-IPO exit decisions. This study introduces a framework that uses LLMs to estimate the optimal time for VC exit by analyzing monthly post-IPO information (financial performance, filings, news, and market signals) and recommending whether to sell or continue holding. We compare these LLM-generated recommendations with the actual exit dates observed for VCs and compute the return differences between the two strategies. By quantifying gains or losses associated with following the LLM, this study provides evidence on whether AI-driven guidance can improve exit timing and complements traditional hazard and real-options models in venture capital research.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.00810
  14. By: Haochong Xia; Simin Li; Ruixiao Xu; Zhixia Zhang; Hongxiang Wang; Zhiqian Liu; Teng Yao Long; Molei Qin; Chuqiao Zong; Bo An
    Abstract: Algorithmic trading relies on machine learning models to make trading decisions. Despite strong in-sample performance, these models often degrade when confronted with evolving real-world market regimes, which can shift dramatically due to macroeconomic changes (e.g., monetary policy updates or unanticipated fluctuations in participant behavior). We identify two challenges that perpetuate this mismatch: (1) insufficient robustness of existing policies against uncertainties in high-level market fluctuations, and (2) the absence of a realistic and diverse simulation environment for training, leading to policy overfitting. To address these issues, we propose a Bayesian Robust Framework that systematically integrates a macro-conditioned generative model with robust policy learning. On the data side, to generate realistic and diverse data, we propose a macro-conditioned GAN-based generator that leverages macroeconomic indicators as primary control variables, synthesizing data with faithful temporal, cross-instrument, and macro correlations. On the policy side, to learn a robust policy against market fluctuations, we cast the trading process as a two-player zero-sum Bayesian Markov game, wherein an adversarial agent simulates shifting regimes by perturbing macroeconomic indicators in the macro-conditioned generator, while the trading agent, guided by a quantile belief network, maintains and updates its belief over hidden market states. The trading agent seeks a Robust Perfect Bayesian Equilibrium via Bayesian neural fictitious self-play, stabilizing learning under adversarial market perturbations. Extensive experiments on 9 financial instruments demonstrate that our framework outperforms 9 state-of-the-art baselines. In extreme events like the COVID-19 pandemic, our method shows improved profitability and risk management, offering a reliable solution for trading under uncertain and shifting market dynamics.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.17008
  15. By: Sungwoo Kang
    Abstract: The application of machine learning to financial prediction has accelerated dramatically, yet the conditions under which complex models outperform simple alternatives remain poorly understood. This paper investigates whether advanced signal processing and deep learning techniques can extract predictive value from investor order flows beyond what simple feature engineering achieves. Using a comprehensive dataset of 2.79 million observations spanning 2,439 Korean equities from 2020-2024, we apply three methodologies: Independent Component Analysis (ICA) to recover latent market drivers, Wavelet Coherence analysis to characterize multi-scale correlation structure, and Long Short-Term Memory (LSTM) networks with attention mechanisms for non-linear prediction. Our results reveal a striking finding: a parsimonious linear model using market capitalization-normalized flows ("Matched Filter" preprocessing) achieves a Sharpe ratio of 1.30 and a cumulative return of 272.6%, while the full ICA-Wavelet-LSTM pipeline generates a Sharpe ratio of only 0.07 with a cumulative return of -5.1%. The raw LSTM model collapsed to predicting the unconditional mean, achieving a hit rate of 47.5%, worse than random. We conclude that in low signal-to-noise financial environments, domain-specific feature engineering yields substantially higher marginal returns than algorithmic complexity. These findings establish important boundary conditions for the application of deep learning to financial prediction.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.07131
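The paper's winning baseline, a simple rule on market-capitalization-normalized flows, is easy to mimic on synthetic data. Everything below (the flow process, the planted effect size, the long-short rule) is an illustrative toy, not the Korean-equity dataset or the authors' exact filter:

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_stocks = 500, 50

# Synthetic net flows and market caps (illustration only).
mcap = rng.uniform(1e8, 1e10, n_stocks)
flows = rng.normal(0, 1, (n_days, n_stocks)) * np.sqrt(mcap)

# "Matched filter" preprocessing: scale each stock's net flow by its
# market capitalization so signals are comparable across size buckets.
signal = flows / mcap
z = (signal - signal.mean()) / signal.std()

# Plant a weak next-day linear effect so the toy backtest has signal to find.
returns = rng.normal(0, 0.02, (n_days, n_stocks))
returns[1:] += 0.01 * z[:-1]

# Simple long-short rule on the previous day's normalized flow.
weights = np.sign(z - np.median(z, axis=1, keepdims=True))
pnl = (weights[:-1] * returns[1:]).mean(axis=1)
sharpe = pnl.mean() / pnl.std() * np.sqrt(252)
print(f"annualized Sharpe: {sharpe:.2f}")
```

The point of the exercise mirrors the paper's: with a planted low signal-to-noise effect, the normalized linear rule already extracts it, leaving little for a deep model to add.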
  16. By: Jinjun Liu; Ming-Yen Cheng
    Abstract: We study U.S. Treasury yield curve forecasting under distributional uncertainty and recast forecasting as an operations research and managerial decision problem. Rather than minimizing average forecast error, the forecaster selects a decision rule that minimizes worst-case expected loss over an ambiguity set of forecast error distributions. To this end, we propose a distributionally robust ensemble forecasting framework that integrates parametric factor models with high-dimensional nonparametric machine learning models through adaptive forecast combinations. The framework consists of three machine learning components. First, a rolling-window Factor-Augmented Dynamic Nelson-Siegel model captures level, slope, and curvature dynamics using principal components extracted from economic indicators. Second, Random Forest models capture nonlinear interactions among macro-financial drivers and lagged Treasury yields. Third, distributionally robust forecast combination schemes aggregate heterogeneous forecasts under moment uncertainty, penalizing downside tail risk via expected shortfall and stabilizing second-moment estimation through ridge-regularized covariance matrices. The severity of the worst-case criterion is adjustable, allowing the forecaster to regulate the trade-off between robustness and statistical efficiency. Using monthly data, we evaluate out-of-sample forecasts across maturities and horizons from one to twelve months ahead. Adaptive combinations deliver superior performance at short horizons, while Random Forest forecasts dominate at longer horizons. Extensions to global sovereign bond yields confirm the stability and generalizability of the proposed framework.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.04608
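One ingredient, minimum-variance forecast combination with a ridge-regularized error covariance, has a simple closed form. The sketch below shows only that piece; the paper's full robust criterion (moment ambiguity sets, expected-shortfall penalties) is more elaborate:

```python
import numpy as np

def combine_forecasts(errors, ridge=0.1):
    """Minimum-variance forecast-combination weights with a
    ridge-regularized error covariance, one stabilization device the
    paper describes. Closed form: w proportional to Sigma^{-1} 1,
    normalized to sum to one."""
    k = errors.shape[1]
    cov = np.cov(errors, rowvar=False)
    # Shrink toward a scaled identity to stabilize inversion.
    cov_ridge = cov + ridge * np.trace(cov) / k * np.eye(k)
    w = np.linalg.solve(cov_ridge, np.ones(k))
    return w / w.sum()

rng = np.random.default_rng(0)
# Three candidate models with low / medium / high historical error variance.
errors = rng.normal(0, [0.5, 1.0, 2.0], size=(120, 3))
w = combine_forecasts(errors)
print(np.round(w, 3))
```

As expected, the lowest-variance model receives the largest weight; the ridge term keeps the weights well-behaved when the sample covariance is noisy.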
  17. By: Jack Fanshawe; Rumi Masih; Alexander Cameron
    Abstract: This paper studies forward-looking stock-stock correlation forecasting for S&P 500 constituents and evaluates whether learned correlation forecasts can improve graph-based clustering used in basket trading strategies. We cast 10-day ahead correlation prediction in Fisher-z space and train a Temporal-Heterogeneous Graph Neural Network (THGNN) to predict residual deviations from a rolling historical baseline. The architecture combines a Transformer-based temporal encoder, which captures non-stationary, complex, temporal dependencies, with an edge-aware graph attention network that propagates cross-asset information over the equity network. Inputs span daily returns, technicals, sector structure, previous correlations, and macro signals, enabling regime-aware forecasts and attention-based feature and neighbor importance to provide interpretability. Out-of-sample results from 2019-2024 show that the proposed model meaningfully reduces correlation forecasting error relative to rolling-window estimates. When integrated into a graph-based clustering framework, forward-looking correlations produce adaptable and economically meaningful baskets, particularly during periods of market stress. These findings suggest that improvements in correlation forecasts translate into meaningful gains in portfolio construction tasks.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.04602
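Casting correlation forecasting in Fisher-z space with a rolling-baseline residual target can be shown in a few lines (the window length and data are illustrative; the THGNN model itself is not reproduced here):

```python
import numpy as np

def fisher_z(r):
    """Map correlations in (-1, 1) to the real line; the variance of the
    transformed estimate is approximately constant."""
    return np.arctanh(r)

def residual_target(realized_corr, window=60):
    """Build the paper-style target: future correlation (in z-space)
    minus a rolling historical baseline. Window length is illustrative."""
    z = fisher_z(realized_corr)
    baseline = np.array([z[t - window:t].mean()
                         for t in range(window, len(z))])
    return z[window:] - baseline

rng = np.random.default_rng(0)
# Synthetic pairwise correlation series drifting inside (-1, 1).
corr = np.clip(0.3 + 0.2 * np.sin(np.arange(300) / 30)
               + rng.normal(0, 0.05, 300), -0.99, 0.99)
resid = residual_target(corr)
print(resid.shape)  # (240,)
```

A model predicts `resid`; adding the rolling baseline back and applying `np.tanh` (the inverse of `arctanh`) recovers a correlation forecast guaranteed to lie in (-1, 1).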
  18. By: Brandon Luo; Jim Skufca
    Abstract: Our work focuses on deep learning (DL) portfolio optimization, tackling challenges in long-only, multi-asset strategies across market cycles. We propose training models with limited regime data using pre-training techniques and leveraging transformer architectures for state variable inclusion. Evaluating our approach against traditional methods shows promising results, demonstrating our models' resilience in volatile markets. These findings emphasize the evolving landscape of DL-driven portfolio optimization, stressing the need for adaptive strategies to navigate dynamic market conditions and improve predictive accuracy.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.07942
  19. By: Masahiro Kato
    Abstract: Estimating the Riesz representer is a central problem in debiased machine learning for causal and structural parameter estimation. Various methods for Riesz representer estimation have been proposed, including Riesz regression and covariate balancing. This study unifies these methods within a single framework. Our framework fits a Riesz representer model to the true Riesz representer under a Bregman divergence, which includes the squared loss and the Kullback--Leibler (KL) divergence as special cases. We show that the squared loss corresponds to Riesz regression, and the KL divergence corresponds to tailored loss minimization, where the dual solutions correspond to stable balancing weights and entropy balancing weights, respectively, under specific model specifications. We refer to our method as generalized Riesz regression, and we refer to the associated duality as automatic covariate balancing. Our framework also generalizes density ratio fitting under a Bregman divergence to Riesz representer estimation, and it includes various applications beyond density ratio estimation. We also provide a convergence analysis for both cases where the model class is a reproducing kernel Hilbert space (RKHS) and where it is a neural network.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.07752
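To fix ideas, the Bregman-divergence fitting problem and its squared-loss specialization to Riesz regression can be written as follows (notation follows the standard debiased-ML literature; here m(W; α) denotes the moment functional whose Riesz representer is α₀):

```latex
% Bregman divergence generated by a convex function g:
D_g(a, b) = g(a) - g(b) - g'(b)\,(a - b).

% Fitting a model \alpha to the true representer \alpha_0:
\min_{\alpha} \; \mathbb{E}\bigl[ D_g\bigl(\alpha_0(X), \alpha(X)\bigr) \bigr].

% For g(a) = a^2 the divergence is the squared loss (\alpha_0 - \alpha)^2.
% Expanding, and using the Riesz representation
% \mathbb{E}[m(W; \alpha)] = \mathbb{E}[\alpha_0(X)\,\alpha(X)],
% the term depending only on \alpha_0 is constant in \alpha, leaving the
% feasible objective of Riesz regression:
\min_{\alpha} \; \mathbb{E}\bigl[ \alpha(X)^2 - 2\, m(W; \alpha) \bigr].
```

The KL-divergence case follows the same pattern with a different g, which is how the framework recovers tailored loss minimization and balancing-weight duals.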
  20. By: Sunil K Sapra (California State University, Los Angeles, CA, USA)
    Abstract: The paper demonstrates applications of machine learning techniques to economic data. The techniques include nonlinear regression, generalized additive models (GAM), regression trees, bagging, random forest, boosting, and multivariate adaptive regression splines (MARS). Their relative model fitting and forecasting performance is studied. Common algorithms for implementing these techniques and their relative merits and shortcomings are discussed. Performance comparisons among these techniques are carried out via their application to the current population survey (CPS) data on wages and Boston housing data. Overfitting and post-selection inference issues associated with these techniques are also investigated. Our results suggest that the recently developed adaptive machine learning techniques of random forests, boosting, GAM and MARS outperform the nonlinear regression model with Gaussian errors and can be scaled to larger data sets by fitting a rich class of functions almost automatically.
    Keywords: Generalized Additive Models, Multivariate Adaptive Regression Splines, Random Forests, Regression Trees, Semi-parametric Regression
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:smo:raiswp:0594
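A minimal version of such a performance comparison, on synthetic nonlinear data rather than the CPS or Boston housing datasets, looks like this:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic nonlinear regression problem (a stand-in for the wage data).
x = rng.uniform(0, 1, (1000, 3))
y = np.sin(2 * np.pi * x[:, 0]) + x[:, 1] ** 2 + rng.normal(0, 0.1, 1000)

x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)
models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "boosting": GradientBoostingRegressor(random_state=0),
}
# Out-of-sample R^2 for each method on held-out data.
scores = {name: m.fit(x_tr, y_tr).score(x_te, y_te)
          for name, m in models.items()}
for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

On data with strong nonlinearity, the tree ensembles capture the sine term that a linear fit cannot, mirroring the paper's qualitative conclusion.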
  21. By: Bridget Smart; Ebba Mark; Anne Bastian; Josefina Waugh
    Abstract: Prediction markets mobilize financial incentives to forecast binary event outcomes through the aggregation of dispersed beliefs and heterogeneous information. Their growing popularity and demonstrated predictive accuracy in political elections have raised speculation and concern regarding their susceptibility to manipulation and the potential consequences for democratic processes. Using agent-based simulations combined with an analytic characterization of price dynamics, we study how high-budget agents can introduce price distortions in prediction markets. We explore the persistence and stability of these distortions in the presence of herding or stubborn agents, and analyze how agent expertise affects market-price variance. First, we propose an agent-based model of a prediction market in which bettors with heterogeneous expertise, noisy private information, variable learning rates and budgets observe the evolution of public opinion on a binary election outcome to inform their betting strategies in the market. The model exhibits stability across a broad parameter space, with complex agent behaviors and price interactions producing self-regulatory price discovery. Second, using this simulation framework, we investigate the conditions under which a highly resourced minority, or "whale" agent, with a biased valuation can distort the market price, and for how long. We find that biased whales can temporarily shift prices, with the magnitude and duration of distortion increasing when non-whale bettors exhibit herding behavior and slow learning. Our theoretical analysis corroborates these results, showing that whales can shift prices proportionally to their share of market capital, with distortion duration depending on non-whale learning rates and herding intensity.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.20452
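The headline mechanism, price distortion proportional to the whale's share of market capital, shows up even in a much cruder market model than the paper's. In the toy below, the price is simply the capital-weighted average belief; all parameters are illustrative:

```python
import numpy as np

def simulate_market(n_agents=100, n_steps=200, whale_share=0.3,
                    whale_belief=0.9, truth=0.5, lr=0.05, seed=0):
    """Stylized prediction market: the price is the capital-weighted
    average belief. Regular bettors nudge beliefs toward noisy private
    signals of the truth, while a fixed-belief 'whale' holds
    whale_share of total capital."""
    rng = np.random.default_rng(seed)
    beliefs = rng.uniform(0.3, 0.7, n_agents)
    capital = np.full(n_agents, (1 - whale_share) / n_agents)
    prices = []
    for _ in range(n_steps):
        price = (capital * beliefs).sum() + whale_share * whale_belief
        prices.append(price)
        signals = truth + rng.normal(0, 0.1, n_agents)
        beliefs += lr * (signals - beliefs)   # slow learning toward truth
    return np.array(prices)

prices = simulate_market()
distortion = prices[-1] - 0.5
print(f"long-run price: {prices[-1]:.3f}, distortion: {distortion:.3f}")
```

With the defaults, the long-run distortion comes out near whale_share * (whale_belief - truth) = 0.3 * 0.4 = 0.12, a crude analogue of the proportionality result; the paper's model additionally endogenizes budgets, herding, and learning rates.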
  22. By: Beier, Gregory Caldwell (Susarb LLC, a Public Benefit LLC)
    Abstract: Large Language Models exhibit remarkable linguistic fluency but remain fundamentally ungrounded in consequence. Hallucination is inherent to their architecture: probabilistic next-token prediction will always produce some confabulation. But the problem is severely exacerbated when training data carries no cost of error. A Reddit comment about nuclear physics and a peer-reviewed safety audit are treated as probabilistically adjacent tokens. This paper introduces Superharddata (SHD): a classification of information defined not by its format or source, but by the liability pressure under which it was generated. Borrowing from materials science, where superhard materials exhibit Vickers hardness exceeding 40 gigapascals, we define Superharddata by three epistemic properties: high Bulk Modulus (resistance to systemic distortion under scrutiny), high Fracture Toughness (maintenance of alignment during adversarial stress), and Creep Resistance (immunity to semantic drift over time). We argue that existing approaches to AI grounding (World Models, Embodied Cognition, and Blockchain Oracles) address only physical intuition, motor learning, or transactional verification, leaving the vast domain of institutional and human consequence unaddressed. To fill this gap, we present a General Theory of Incentive-Compatible Semantics, cataloging seven "Truth Bridges" across physical and institutional domains where reality enforces signal integrity through non-negotiable loss functions. We propose that fiduciary duty, anti-fraud liability, and survival pressure constitute a naturally occurring "loss function" that can ground AI reasoning in physical and social reality. Finally, we outline a Structured Disclosure Protocol for generating machine-readable Superharddata and a Federated Public AI Architecture for incorporating these signals into training and inference.
This work supersedes and integrates two prior priority filings (DOI: 10.5281/zenodo.18111763 and DOI: 10.5281/zenodo.18112796) by the author, establishing a unified theoretical framework for liability-grounded artificial intelligence, and is also available at SSRN: https://ssrn.com/abstract=6004354 or http://dx.doi.org/10.2139/ssrn.6004354. This paper presents concepts adapted from a forthcoming book series by Gregory Caldwell Beier.
    Date: 2026–01–02
    URL: https://d.repec.org/n?u=RePEc:osf:lawarc:yhc6t_v1
  23. By: Masoud Soleimani
    Abstract: We develop a transparent and fully auditable LLM-based pipeline for macro-financial stress testing, combining structured prompting with optional retrieval of country fundamentals and news. The system generates machine-readable macroeconomic scenarios for the G7, which cover GDP growth, inflation, and policy rates, and are translated into portfolio losses through a factor-based mapping that enables Value-at-Risk and Expected Shortfall assessment relative to classical econometric baselines. Across models, countries, and retrieval settings, the LLMs produce coherent and country-specific stress narratives, yielding stable tail-risk amplification with limited sensitivity to retrieval choices. Comprehensive plausibility checks, scenario diagnostics, and ANOVA-based variance decomposition show that risk variation is driven primarily by portfolio composition and prompt design rather than by the retrieval mechanism. The pipeline incorporates snapshotting, deterministic modes, and hash-verified artifacts to ensure reproducibility and auditability. Overall, the results demonstrate that LLM-generated macro scenarios, when paired with transparent structure and rigorous validation, can provide a scalable and interpretable complement to traditional stress-testing frameworks.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07867
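The final stage, mapping scenario shocks to losses through factor loadings and reading off Value-at-Risk and Expected Shortfall, can be sketched as follows. The shock distribution, betas, and weights below are hypothetical; the paper's pipeline generates scenarios via LLM prompting rather than random sampling:

```python
import numpy as np

def portfolio_losses(scenarios, betas, weights):
    """Map macro scenario shocks (n_scen x n_factors) to portfolio
    losses through asset factor loadings, a stylized version of the
    paper's factor-based mapping. Positive numbers are losses."""
    asset_returns = scenarios @ betas.T          # n_scen x n_assets
    return -(asset_returns @ weights)

def var_es(losses, alpha=0.95):
    """Historical-simulation Value-at-Risk and Expected Shortfall."""
    var = np.quantile(losses, alpha)
    es = losses[losses >= var].mean()            # mean loss beyond VaR
    return var, es

rng = np.random.default_rng(0)
# Hypothetical shocks to (GDP growth, inflation, policy rate), in pct pts.
scenarios = rng.normal([-1.0, 2.0, 1.0], [1.0, 1.0, 0.5], size=(10_000, 3))
betas = np.array([[0.8, -0.3, -0.5],     # equities
                  [0.1, -0.6, -0.9],     # bonds
                  [0.2,  0.7, -0.1]])    # commodities
weights = np.array([0.5, 0.3, 0.2])
var, es = var_es(portfolio_losses(scenarios / 100, betas, weights))
print(f"95% VaR: {var:.4f}, ES: {es:.4f}")
```

In the paper's setup, the LLM supplies the scenario draws (stress narratives converted to machine-readable shocks), and this deterministic mapping keeps the risk numbers auditable.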

This nep-cmp issue is ©2026 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.