nep-cmp 2025-10-13 papers

on Computational Economics

Issue of 2025–10–13
forty-five papers chosen by
Stan Miles, Thompson Rivers University

From Classical Rationality to Contextual Reasoning: Quantum Logic as a New Frontier for Human-Centric AI in Finance By Fabio Bagarello; Francesco Gargano; Polina Khrennikova
Can Machine Learning Algorithms Outperform Traditional Models for Option Pricing? By Georgy Milyushkov
Functional effects models: Accounting for preference heterogeneity in panel data with machine learning By Nicolas Salvad\'e; Tim Hillel
How human is the machine? Evidence from 66, 000 Conversations with Large Language Models By Antonios Stamatogiannakis; Arsham Ghodsinia; Sepehr Etminanrad; Dilney Gon\c{c}alves; David Santos
An Artificial Intelligence Value at Risk Approach: Metrics and Models By Luis Enriquez Alvarez
Minimizing the Value-at-Risk of Loan Portfolio via Deep Neural Networks By Albert Di Wang; Ye Du
Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis By Evan Heus; Rick Bookstaber; Dhruv Sharma
Diffusion-Augmented Reinforcement Learning for Robust Portfolio Optimization under Stress Scenarios By Himanshu Choudhary; Arishi Orra; Manoj Thakur
Parsing the pulse: decomposing macroeconomic sentiment with LLMs By Byeungchun Kwon; Taejin Park; Phurichai Rungcharoenkitkul; Frank Smets
Reimagining Agent-based Modeling with Large Language Model Agents via Shachi By So Kuroki; Yingtao Tian; Kou Misaki; Takashi Ikegami; Takuya Akiba; Yujin Tang
When Machines Meet Each Other: Network Effects and the Strategic Role of History in Multi-Agent AI By Yu Liu; Wenwen Li; Yifan Dou; Guangnan Ye
Financial Stability Implications of Generative AI: Taming the Animal Spirits By Anne Lundgaard Hansen; Seung Jung Lee
Mamba Outpaces Reformer in Stock Prediction with Sentiments from Top Ten LLMs By Lokesh Antony Kadiyala; Amir Mirzaeinia
One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning By Mengyu Wang; Sotirios Sabanis; Miguel de Carvalho; Shay B. Cohen; Tiejun Ma
Multi-Agent Analysis of Off-Exchange Public Information for Cryptocurrency Market Trend Prediction By Kairan Hong; Jinling Gan; Qiushi Tian; Yanglinxuan Guo; Rui Guo; Runnan Li
AlphaSAGE: Structure-Aware Alpha Mining via GFlowNets for Robust Exploration By Binqi Chen; Hongjun Ding; Ning Shen; Jinsheng Huang; Taian Guo; Luchen Liu; Ming Zhang
Inducing State Anxiety in LLM Agents Reproduces Human-Like Biases in Consumer Decision-Making By Ziv Ben-Zion; Zohar Elyoseph; Tobias Spiller; Teddy Lazebnik
LEMs: A Primer On Large Execution Models By Remi Genet; Hugo Inzirillo
FinReflectKG - EvalBench: Benchmarking Financial KG with Multi-Dimensional Evaluation By Fabrizio Dimino; Abhinav Arun; Bhaskarjit Sarmah; Stefano Pasquali
Can language models boost the power of randomized experiments without statistical bias? By Xinrui Ruan; Xinwei Ma; Yingfei Wang; Waverly Wei; Jingshen Wang
Enhanced fill probability estimates in institutional algorithmic bond trading using statistical learning algorithms with quantum computers By Axel Ciceri; Austin Cottrell; Joshua Freeland; Daniel Fry; Hirotoshi Hirai; Philip Intallura; Hwajung Kang; Chee-Kong Lee; Abhijit Mitra; Kentaro Ohno; Das Pemmaraju; Manuel Proissl; Brian Quanz; Del Rajan; Noriaki Shimada; Kavitha Yograj
Predicting Credit Spreads and Ratings with Machine Learning: The Role of Non-Financial Data By Yanran Wu; Xinlei Zhang; Quanyi Xu; Qianxin Yang; Chao Zhang
Uncovering Representation Bias for Investment Decisions in Open-Source Large Language Models By Fabrizio Dimino; Krati Saxena; Bhaskarjit Sarmah; Stefano Pasquali
Analysis of the Impact of an Execution Algorithm with an Order Book Imbalance Strategy on a Financial Market Using an Agent-based Simulation By Shuto Endo; Takanobu Mizuta; Isao Yagi
An Adaptive Multi Agent Bitcoin Trading System By Aadi Singhi
What Can Satellite Imagery and Machine Learning Measure? By Jonathan Proctor; Tamma Carleton; Trinetta Chong; Taryn Fransen; Simon Greenhill; Jessica Katz; Hikari Murayama; Luke Sherman; Jeanette Tseng; Hannah Druckenmiller; Solomon Hsiang
Predictive economics: Rethinking economic methodology with machine learning By Miguel Alves Pereira
The New Quant: A Survey of Large Language Models in Financial Prediction and Trading By Weilong Fu
Private and public school efficiency gaps in Latin America-A combined DEA and machine learning approach based on PISA 2022 By Marcos Delprato
Increase Alpha: Performance and Risk of an AI-Driven Trading Framework By Sid Ghatak; Arman Khaledian; Navid Parvini; Nariman Khaledian
Asymptotic expansions as control variates for deep solvers to fully-coupled forward-backward stochastic differential equations "Forthcoming in PLOS ONE" By Makoto Naito; Taiga Saito; Akihiko Takahashi; Kohta Takehara
Joint Stochastic Optimal Control and Stopping in Aquaculture: Finite-Difference and PINN-Based Approaches By Kevin Kamm
The AI Productivity Index (APEX) By Bertie Vidgen; Abby Fennelly; Evan Pinnix; Chirag Mahapatra; Zach Richards; Austin Bridges; Calix Huang; Ben Hunsberger; Fez Zafar; Brendan Foody; Dominic Barton; Cass R. Sunstein; Eric Topol; Osvald Nitski
Machines are more productive than humans until they aren't, and vice versa By Riccardo Zanardelli
Artificial Intelligence in Port Logistics: A Bibliometric Analysis of Technological Integration and Research Dynamics By Abdelhafid Khazzar; Yassine Sekaki; Yasser Lachhab; Said El-marzouki
Why Virtual Mileage Can Threaten Vehicle-to-Grid By Pierre Dumont; Lorenzo Nicoletti; Marc Petit; Damien-Pierre Sainflou
Incorporating Non-Unitary Income Elasticities, Choke Prices and Choke Incomes into Applied General-Equilibrium Models By James R. Markusen
Progressiver Einkommensteuertarif und Ehegattenbesteuerung - Simulationsanalyse alternativer Besteuerungskonzepte By Neugebauer, Claudia; Mattern, Marcel
Robust Identification in Repeated Games: An Empirical Approach to Algorithmic Competition By Antonio Cozzolino; Cristina Gualdani; Ivan Gufler; Niccolò Lomys; Lorenzo Magnolfi
Inverse Portfolio Optimization with Synthetic Investor Data: Recovering Risk Preferences under Uncertainty By Jinho Cha; Long Pham; Thi Le Hoa Vo; Jaeyoung Cho; Jaejin Lee
Cost estimation in the context of Manufacturing-as-a-Service By Farah Abdoune; Rasmus Andersen; Ann-Louise Andersen; Catherine da Cunha
Fiscal Drag in Theory and in Practice: a European Perspective By Esteban García-Miralles; Maximilian Freier; Sara Riscado; Chrysa Leventi; Alberto Mazzon; Glenn Abela; Laura Lehtonen; Laura Boyd; Baiba BrusbÄ rde; Marion Cochard; David Cornille; Emanuele Dicarlo; Ian Debattista; Mar Delgado-Téllez; Mathias Dolls; Ludmila Fadejeva; Maria Flevotomou; Florian Henne; Alena Harrer-Bachleitner; Viktor Jaszberenyi-Kiraly; Max Lay; Mauro Mastrogiacomo; Tara McIndoe-Calder; Mathias Moser; Martin Nevicky; Andreas Peichl; Myroslav Pidkuyko; Mojca Roter; Frédérique Savignac; Andreja Strojan Kastelec; Vaidotas Tuzikas; Nikos Ventouris; Lara Wemans
Modeling ROI in Chronic Disease Management, A Simulation-Based Framework Integrating Patient Adherence and Policy Timing By Jinho Cha; Eunchan D. Cha; Emily Yoo; Hyoshin Song
Traffic jams and driver behavior archetypes By Shawn Berry
Smart Contract-Enabled Procurement under Bounded Demand Variability: A Truncated Normal Approach By Jinho Cha; Youngchul Kim; Junyeol Ryu; Sangjun Park; Jeongho Kang; Hyeyoung Hwang

From Classical Rationality to Contextual Reasoning: Quantum Logic as a New Frontier for Human-Centric AI in Finance

By:	Fabio Bagarello; Francesco Gargano; Polina Khrennikova
Abstract:	We consider state of the art applications of artificial intelligence (AI) in modelling human financial expectations and explore the potential of quantum logic to drive future advancements in this field. This analysis highlights the application of machine learning techniques, including reinforcement learning and deep neural networks, in financial statement analysis, algorithmic trading, portfolio management, and robo-advisory services. We further discuss the emergence and progress of quantum machine learning (QML) and advocate for broader exploration of the advantages provided by quantum-inspired neural networks.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.05475

Can Machine Learning Algorithms Outperform Traditional Models for Option Pricing?

By:	Georgy Milyushkov
Abstract:	This study investigates the application of machine learning techniques, specifically Neural Networks, Random Forests, and CatBoost for option pricing, in comparison to traditional models such as Black-Scholes and Heston Model. Using both synthetically generated data and real market option data, each model is evaluated in predicting the option price. The results show that machine learning models can capture complex, non-linear relationships in option prices and, in several cases, outperform both Black-Scholes and Heston models. These findings highlight the potential of data-driven methods to improve pricing accuracy and better reflect market dynamics.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.01446

Functional effects models: Accounting for preference heterogeneity in panel data with machine learning

By:	Nicolas Salvad\'e; Tim Hillel
Abstract:	In this paper, we present a general specification for Functional Effects Models, which use Machine Learning (ML) methodologies to learn individual-specific preference parameters from socio-demographic characteristics, therefore accounting for inter-individual heterogeneity in panel choice data. We identify three specific advantages of the Functional Effects Model over traditional fixed, and random/mixed effects models: (i) by mapping individual-specific effects as a function of socio-demographic variables, we can account for these effects when forecasting choices of previously unobserved individuals (ii) the (approximate) maximum-likelihood estimation of functional effects avoids the incidental parameters problem of the fixed effects model, even when the number of observed choices per individual is small; and (iii) we do not rely on the strong distributional assumptions of the random effects model, which may not match reality. We learn functional intercept and functional slopes with powerful non-linear machine learning regressors for tabular data, namely gradient boosting decision trees and deep neural networks. We validate our proposed methodology on a synthetic experiment and three real-world panel case studies, demonstrating that the Functional Effects Model: (i) can identify the true values of individual-specific effects when the data generation process is known; (ii) outperforms both state-of-the-art ML choice modelling techniques that omit individual heterogeneity in terms of predictive performance, as well as traditional static panel choice models in terms of learning inter-individual heterogeneity. The results indicate that the FI-RUMBoost model, which combines the individual-specific constants of the Functional Effects Model with the complex, non-linear utilities of RUMBoost, performs marginally best on large-scale revealed preference panel data.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.18047

How human is the machine? Evidence from 66, 000 Conversations with Large Language Models

By:	Antonios Stamatogiannakis; Arsham Ghodsinia; Sepehr Etminanrad; Dilney Gon\c{c}alves; David Santos
Abstract:	When Artificial Intelligence (AI) is used to replace consumers (e.g., synthetic data), it is often assumed that AI emulates established consumers, and more generally human behaviors. Ten experiments with Large Language Models (LLMs) investigate if this is true in the domain of well-documented biases and heuristics. Across studies we observe four distinct types of deviations from human-like behavior. First, in some cases, LLMs reduce or correct biases observed in humans. Second, in other cases, LLMs amplify these same biases. Third, and perhaps most intriguingly, LLMs sometimes exhibit biases opposite to those found in humans. Fourth, LLMs' responses to the same (or similar) prompts tend to be inconsistent (a) within the same model after a time delay, (b) across models, and (c) among independent research studies. Such inconsistencies can be uncharacteristic of humans and suggest that, at least at one point, LLMs' responses differed from humans. Overall, unhuman-like responses are problematic when LLMs are used to mimic or predict consumer behavior. These findings complement research on synthetic consumer data by showing that sources of bias are not necessarily human-centric. They also contribute to the debate about the tasks for which consumers, and more generally humans, can be replaced by AI.
Date:	2025–08
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.07321

An Artificial Intelligence Value at Risk Approach: Metrics and Models

By:	Luis Enriquez Alvarez
Abstract:	Artificial intelligence risks are multidimensional in nature, as the same risk scenarios may have legal, operational, and financial risk dimensions. With the emergence of new AI regulations, the state of the art of artificial intelligence risk management seems to be highly immature due to upcoming AI regulations. Despite the appearance of several methodologies and generic criteria, it is rare to find guidelines with real implementation value, considering that the most important issue is customizing artificial intelligence risk metrics and risk models for specific AI risk scenarios. Furthermore, the financial departments, legal departments and Government Risk Compliance teams seem to remain unaware of many technical aspects of AI systems, in which data scientists and AI engineers emerge as the most appropriate implementers. It is crucial to decompose the problem of artificial intelligence risk in several dimensions: data protection, fairness, accuracy, robustness, and information security. Consequently, the main task is developing adequate metrics and risk models that manage to reduce uncertainty for decision-making in order to take informed decisions concerning the risk management of AI systems. The purpose of this paper is to orientate AI stakeholders about the depths of AI risk management. Although it is not extremely technical, it requires a basic knowledge of risk management, quantifying uncertainty, the FAIR model, machine learning, large language models and AI context engineering. The examples presented pretend to be very basic and understandable, providing simple ideas that can be developed regarding specific AI customized environments. There are many issues to solve in AI risk management, and this paper will present a holistic overview of the inter-dependencies of AI risks, and how to model them together, within risk scenarios.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.18394

Minimizing the Value-at-Risk of Loan Portfolio via Deep Neural Networks

By:	Albert Di Wang; Ye Du
Abstract:	Risk management is a prominent issue in peer-to-peer lending. An investor may naturally reduce his risk exposure by diversifying instead of putting all his money on one loan. In that case, an investor may want to minimize the Value-at-Risk (VaR) or Conditional Value-at-Risk (CVaR) of his loan portfolio. We propose a low degree of freedom deep neural network model, DeNN, as well as a high degree of freedom model, DSNN, to tackle the problem. In particular, our models predict not only the default probability of a loan but also the time when it will default. The experiments demonstrate that both models can significantly reduce the portfolio VaRs at different confidence levels, compared to benchmarks. More interestingly, the low degree of freedom model, DeNN, outperforms DSNN in most scenarios.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.07444

Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis

By:	Evan Heus; Rick Bookstaber; Dhruv Sharma
Abstract:	Large Language Models (LLMs) struggle with the complex, multi-modal, and network-native data underlying financial risk. Standard Retrieval-Augmented Generation (RAG) oversimplifies relationships, while specialist models are costly and static. We address this gap with an LLM-centric agent framework for supply chain risk analysis. Our core contribution is to exploit the inherent duality between networks and knowledge graphs (KG). We treat the supply chain network as a KG, allowing us to use structural network science principles for retrieval. A graph traverser, guided by network centrality scores, efficiently extracts the most economically salient risk paths. An agentic architecture orchestrates this graph retrieval alongside data from numerical factor tables and news streams. Crucially, it employs novel ``context shells'' -- descriptive templates that embed raw figures in natural language -- to make quantitative data fully intelligible to the LLM. This lightweight approach enables the model to generate concise, explainable, and context-rich risk narratives in real-time without costly fine-tuning or a dedicated graph database.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.01115

Diffusion-Augmented Reinforcement Learning for Robust Portfolio Optimization under Stress Scenarios

By:	Himanshu Choudhary; Arishi Orra; Manoj Thakur
Abstract:	In the ever-changing and intricate landscape of financial markets, portfolio optimisation remains a formidable challenge for investors and asset managers. Conventional methods often struggle to capture the complex dynamics of market behaviour and align with diverse investor preferences. To address this, we propose an innovative framework, termed Diffusion-Augmented Reinforcement Learning (DARL), which synergistically integrates Denoising Diffusion Probabilistic Models (DDPMs) with Deep Reinforcement Learning (DRL) for portfolio management. By leveraging DDPMs to generate synthetic market crash scenarios conditioned on varying stress intensities, our approach significantly enhances the robustness of training data. Empirical evaluations demonstrate that DARL outperforms traditional baselines, delivering superior risk-adjusted returns and resilience against unforeseen crises, such as the 2025 Tariff Crisis. This work offers a robust and practical methodology to bolster stress resilience in DRL-driven financial applications.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.07099

Parsing the pulse: decomposing macroeconomic sentiment with LLMs

By:	Byeungchun Kwon; Taejin Park; Phurichai Rungcharoenkitkul; Frank Smets
Abstract:	Macroeconomic indicators provide quantitative signals that must be pieced together and interpreted by economists. We propose a reversed approach of parsing press narratives directly using Large Language Models (LLM) to recover growth and inflation sentiment indices. A key advantage of this LLM-based approach is the ability to decompose aggregate sentiment into its drivers, readily enabling an interpretation of macroeconomic dynamics. Our sentiment indices track hard-data counterparts closely, providing an accurate, near real-time picture of the macroeconomy. Their components–demand, supply, and deeper structural forces–are intuitive and consistent with prior model-based studies. Incorporating sentiment indices improves the forecasting performance of simple statistical models, pointing to information unspanned by traditional data.
Keywords:	macroeconomic sentiment, growth, inflation, monetary policy, fiscal policy, LLMs, machine learning
JEL:	E30 E44 E60 C55 C82
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:bis:biswps:1294

Reimagining Agent-based Modeling with Large Language Model Agents via Shachi

By:	So Kuroki; Yingtao Tian; Kou Misaki; Takashi Ikegami; Takuya Akiba; Yujin Tang
Abstract:	The study of emergent behaviors in large language model (LLM)-driven multi-agent systems is a critical research challenge, yet progress is limited by a lack of principled methodologies for controlled experimentation. To address this, we introduce Shachi, a formal methodology and modular framework that decomposes an agent's policy into core cognitive components: Configuration for intrinsic traits, Memory for contextual persistence, and Tools for expanded capabilities, all orchestrated by an LLM reasoning engine. This principled architecture moves beyond brittle, ad-hoc agent designs and enables the systematic analysis of how specific architectural choices influence collective behavior. We validate our methodology on a comprehensive 10-task benchmark and demonstrate its power through novel scientific inquiries. Critically, we establish the external validity of our approach by modeling a real-world U.S. tariff shock, showing that agent behaviors align with observed market reactions only when their cognitive architecture is appropriately configured with memory and tools. Our work provides a rigorous, open-source foundation for building and evaluating LLM agents, aimed at fostering more cumulative and scientifically grounded research.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.21862

When Machines Meet Each Other: Network Effects and the Strategic Role of History in Multi-Agent AI

By:	Yu Liu; Wenwen Li; Yifan Dou; Guangnan Ye
Abstract:	As artificial intelligence (AI) enters the agentic era, large language models (LLMs) are increasingly deployed as autonomous agents that interact with one another rather than operate in isolation. This shift raises a fundamental question: how do machine agents behave in interdependent environments where outcomes depend not only on their own choices but also on the coordinated expectations of peers? To address this question, we study LLM agents in a canonical network-effect game, where economic theory predicts convergence to a fulfilled expectation equilibrium (FEE). We design an experimental framework in which 50 heterogeneous GPT-5-based agents repeatedly interact under systematically varied network-effect strengths, price trajectories, and decision-history lengths. The results reveal that LLM agents systematically diverge from FEE: they underestimate participation at low prices, overestimate at high prices, and sustain persistent dispersion. Crucially, the way history is structured emerges as a design lever. Simple monotonic histories-where past outcomes follow a steady upward or downward trend-help stabilize coordination, whereas nonmonotonic histories amplify divergence and path dependence. Regression analyses at the individual level further show that price is the dominant driver of deviation, history moderates this effect, and network effects amplify contextual distortions. Together, these findings advance machine behavior research by providing the first systematic evidence on multi-agent AI systems under network effects and offer guidance for configuring such systems in practice.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.06903

Financial Stability Implications of Generative AI: Taming the Animal Spirits

By:	Anne Lundgaard Hansen; Seung Jung Lee
Abstract:	This paper investigates the impact of the adoption of generative AI on financial stability. We conduct laboratory-style experiments using large language models to replicate classic studies on herd behavior in trading decisions. Our results show that AI agents make more rational decisions than humans, relying predominantly on private information over market trends. Increased reliance on AI-powered trading advice could therefore potentially lead to fewer asset price bubbles arising from animal spirits that trade by following the herd. However, exploring variations in the experimental settings reveals that AI agents can be induced to herd optimally when explicitly guided to make profit-maximizing decisions. While optimal herding improves market discipline, this behavior still carries potential implications for financial stability. In other experimental variations, we show that AI agents are not purely algorithmic, but have inherited some elements of human conditioning and bias.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.01451

Mamba Outpaces Reformer in Stock Prediction with Sentiments from Top Ten LLMs

By:	Lokesh Antony Kadiyala; Amir Mirzaeinia
Abstract:	The stock market is extremely difficult to predict in the short term due to high market volatility, changes caused by news, and the non-linear nature of the financial time series. This research proposes a novel framework for improving minute-level prediction accuracy using semantic sentiment scores from top ten different large language models (LLMs) combined with minute interval intraday stock price data. We systematically constructed a time-aligned dataset of AAPL news articles and 1-minute Apple Inc. (AAPL) stock prices for the dates of April 4 to May 2, 2025. The sentiment analysis was achieved using the DeepSeek-V3, GPT variants, LLaMA, Claude, Gemini, Qwen, and Mistral models through their APIs. Each article obtained sentiment scores from all ten LLMs, which were scaled to a [0, 1] range and combined with prices and technical indicators like RSI, ROC, and Bollinger Band Width. Two state-of-the-art such as Reformer and Mamba were trained separately on the dataset using the sentiment scores produced by each LLM as input. Hyper parameters were optimized by means of Optuna and were evaluated through a 3-day evaluation period. Reformer had mean squared error (MSE) or the evaluation metrics, and it should be noted that Mamba performed not only faster but also better than Reformer for every LLM across the 10 LLMs tested. Mamba performed best with LLaMA 3.3--70B, with the lowest error of 0.137. While Reformer could capture broader trends within the data, the model appeared to over smooth sudden changes by the LLMs. This study highlights the potential of integrating LLM-based semantic analysis paired with efficient temporal modeling to enhance real-time financial forecasting.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.01203

One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning

By:	Mengyu Wang; Sotirios Sabanis; Miguel de Carvalho; Shay B. Cohen; Tiejun Ma
Abstract:	Domain-specific quantitative reasoning remains a major challenge for large language models (LLMs), especially in fields requiring expert knowledge and complex question answering (QA). In this work, we propose Expert Question Decomposition (EQD), an approach designed to balance the use of domain knowledge with computational efficiency. EQD is built on a two-step fine-tuning framework and guided by a reward function that measures the effectiveness of generated sub-questions in improving QA outcomes. It requires only a few thousand training examples and a single A100 GPU for fine-tuning, with inference time comparable to zero-shot prompting. Beyond its efficiency, EQD outperforms state-of-the-art domain-tuned models and advanced prompting strategies. We evaluate EQD in the financial domain, characterized by specialized knowledge and complex quantitative reasoning, across four benchmark datasets. Our method consistently improves QA performance by 0.6% to 10.5% across different LLMs. Our analysis reveals an important insight: in domain-specific QA, a single supporting question often provides greater benefit than detailed guidance steps.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.01526

Multi-Agent Analysis of Off-Exchange Public Information for Cryptocurrency Market Trend Prediction

By:	Kairan Hong; Jinling Gan; Qiushi Tian; Yanglinxuan Guo; Rui Guo; Runnan Li
Abstract:	Cryptocurrency markets present unique prediction challenges due to their extreme volatility, 24/7 operation, and hypersensitivity to news events, with existing approaches suffering from key information extraction and poor sideways market detection critical for risk management. We introduce a theoretically-grounded multi-agent cryptocurrency trend prediction framework that advances the state-of-the-art through three key innovations: (1) an information-preserving news analysis system with formal theoretical guarantees that systematically quantifies market impact, regulatory implications, volume dynamics, risk assessment, technical correlation, and temporal effects using large language models; (2) an adaptive volatility-conditional fusion mechanism with proven optimal properties that dynamically combines news sentiment and technical indicators based on market regime detection; (3) a distributed multi-agent coordination architecture with low communication complexity enabling real-time processing of heterogeneous data streams. Comprehensive experimental evaluation on Bitcoin across three prediction horizons demonstrates statistically significant improvements over state-of-the-art natural language processing baseline, establishing a new paradigm for financial machine learning with broad implications for quantitative trading and risk management systems.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.08268

AlphaSAGE: Structure-Aware Alpha Mining via GFlowNets for Robust Exploration

By:	Binqi Chen; Hongjun Ding; Ning Shen; Jinsheng Huang; Taian Guo; Luchen Liu; Ming Zhang
Abstract:	The automated mining of predictive signals, or alphas, is a central challenge in quantitative finance. While Reinforcement Learning (RL) has emerged as a promising paradigm for generating formulaic alphas, existing frameworks are fundamentally hampered by a triad of interconnected issues. First, they suffer from reward sparsity, where meaningful feedback is only available upon the completion of a full formula, leading to inefficient and unstable exploration. Second, they rely on semantically inadequate sequential representations of mathematical expressions, failing to capture the structure that determine an alpha's behavior. Third, the standard RL objective of maximizing expected returns inherently drives policies towards a single optimal mode, directly contradicting the practical need for a diverse portfolio of non-correlated alphas. To overcome these challenges, we introduce AlphaSAGE (Structure-Aware Alpha Mining via Generative Flow Networks for Robust Exploration), a novel framework is built upon three cornerstone innovations: (1) a structure-aware encoder based on Relational Graph Convolutional Network (RGCN); (2) a new framework with Generative Flow Networks (GFlowNets); and (3) a dense, multi-faceted reward structure. Empirical results demonstrate that AlphaSAGE outperforms existing baselines in mining a more diverse, novel, and highly predictive portfolio of alphas, thereby proposing a new paradigm for automated alpha mining. Our code is available at https://github.com/BerkinChen/AlphaSAGE.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.25055

Inducing State Anxiety in LLM Agents Reproduces Human-Like Biases in Consumer Decision-Making

By:	Ziv Ben-Zion; Zohar Elyoseph; Tobias Spiller; Teddy Lazebnik
Abstract:	Large language models (LLMs) are rapidly evolving from text generators to autonomous agents, raising urgent questions about their reliability in real-world contexts. Stress and anxiety are well known to bias human decision-making, particularly in consumer choices. Here, we tested whether LLM agents exhibit analogous vulnerabilities. Three advanced models (ChatGPT-5, Gemini 2.5, Claude 3.5-Sonnet) performed a grocery shopping task under budget constraints (24, 54, 108 USD), before and after exposure to anxiety-inducing traumatic narratives. Across 2, 250 runs, traumatic prompts consistently reduced the nutritional quality of shopping baskets (Change in Basket Health Scores of -0.081 to -0.126; all pFDR
Date:	2025–08
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.06222

LEMs: A Primer On Large Execution Models

By:	Remi Genet; Hugo Inzirillo
Abstract:	This paper introduces Large Execution Models (LEMs), a novel deep learning framework that extends transformer-based architectures to address complex execution problems with flexible time boundaries and multiple execution constraints. Building upon recent advances in neural VWAP execution strategies, LEMs generalize the approach from fixed-duration orders to scenarios where execution duration is bounded between minimum and maximum time horizons, similar to share buyback contract structures. The proposed architecture decouples market information processing from execution allocation decisions: a common feature extraction pipeline using Temporal Kolmogorov-Arnold Networks (TKANs), Variable Selection Networks (VSNs), and multi-head attention mechanisms processes market data to create informational context, while independent allocation networks handle the specific execution logic for different scenarios (fixed quantity vs. fixed notional, buy vs. sell orders). This architectural separation enables a unified model to handle diverse execution objectives while leveraging shared market understanding across scenarios. Through comprehensive empirical evaluation on intraday cryptocurrency markets and multi-day equity trading using DOW Jones constituents, we demonstrate that LEMs achieve superior execution performance compared to traditional benchmarks by dynamically optimizing execution paths within flexible time constraints. The unified model architecture enables deployment across different execution scenarios (buy/sell orders, varying duration boundaries, volume/notional targets) through a single framework, providing significant operational advantages over asset-specific approaches.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.25211

FinReflectKG - EvalBench: Benchmarking Financial KG with Multi-Dimensional Evaluation

By:	Fabrizio Dimino; Abhinav Arun; Bhaskarjit Sarmah; Stefano Pasquali
Abstract:	Large language models (LLMs) are increasingly being used to extract structured knowledge from unstructured financial text. Although prior studies have explored various extraction methods, there is no universal benchmark or unified evaluation framework for the construction of financial knowledge graphs (KG). We introduce FinReflectKG - EvalBench, a benchmark and evaluation framework for KG extraction from SEC 10-K filings. Building on the agentic and holistic evaluation principles of FinReflectKG - a financial KG linking audited triples to source chunks from S&P 100 filings and supporting single-pass, multi-pass, and reflection-agent-based extraction modes - EvalBench implements a deterministic commit-then-justify judging protocol with explicit bias controls, mitigating position effects, leniency, verbosity and world-knowledge reliance. Each candidate triple is evaluated with binary judgments of faithfulness, precision, and relevance, while comprehensiveness is assessed on a three-level ordinal scale (good, partial, bad) at the chunk level. Our findings suggest that, when equipped with explicit bias controls, LLM-as-Judge protocols provide a reliable and cost-efficient alternative to human annotation, while also enabling structured error analysis. Reflection-based extraction emerges as the superior approach, achieving best performance in comprehensiveness, precision, and relevance, while single-pass extraction maintains the highest faithfulness. By aggregating these complementary dimensions, FinReflectKG - EvalBench enables fine-grained benchmarking and bias-aware evaluation, advancing transparency and governance in financial AI applications.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.05710

Can language models boost the power of randomized experiments without statistical bias?

By:	Xinrui Ruan; Xinwei Ma; Yingfei Wang; Waverly Wei; Jingshen Wang
Abstract:	Randomized experiments or randomized controlled trials (RCTs) are gold standards for causal inference, yet cost and sample-size constraints limit power. Meanwhile, modern RCTs routinely collect rich, unstructured data that are highly prognostic of outcomes but rarely used in causal analyses. We introduce CALM (Causal Analysis leveraging Language Models), a statistical framework that integrates large language models (LLMs) predictions with established causal estimators to increase precision while preserving statistical validity. CALM treats LLM outputs as auxiliary prognostic information and corrects their potential bias via a heterogeneous calibration step that residualizes and optimally reweights predictions. We prove that CALM remains consistent even when LLM predictions are biased and achieves efficiency gains over augmented inverse probability weighting estimators for various causal effects. In particular, CALM develops a few-shot variant that aggregates predictions across randomly sampled demonstration sets. The resulting U-statistic-like predictor restores i.i.d. structure and also mitigates prompt-selection variability. Empirically, in simulations calibrated to a mobile-app depression RCT, CALM delivers lower variance relative to other benchmarking methods, is effective in zero- and few-shot settings, and remains stable across prompt designs. By principled use of LLMs to harness unstructured data and external knowledge learned during pretraining, CALM provides a practical path to more precise causal analyses in RCTs.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.05545

Enhanced fill probability estimates in institutional algorithmic bond trading using statistical learning algorithms with quantum computers

By:	Axel Ciceri; Austin Cottrell; Joshua Freeland; Daniel Fry; Hirotoshi Hirai; Philip Intallura; Hwajung Kang; Chee-Kong Lee; Abhijit Mitra; Kentaro Ohno; Das Pemmaraju; Manuel Proissl; Brian Quanz; Del Rajan; Noriaki Shimada; Kavitha Yograj
Abstract:	The estimation of fill probabilities for trade orders represents a key ingredient in the optimization of algorithmic trading strategies. It is bound by the complex dynamics of financial markets with inherent uncertainties, and the limitations of models aiming to learn from multivariate financial time series that often exhibit stochastic properties with hidden temporal patterns. In this paper, we focus on algorithmic responses to trade inquiries in the corporate bond market and investigate fill probability estimation errors of common machine learning models when given real production-scale intraday trade event data, transformed by a quantum algorithm running on IBM Heron processors, as well as on noiseless quantum simulators for comparison. We introduce a framework to embed these quantum-generated data transforms as a decoupled offline component that can be selectively queried by models in low-latency institutional trade optimization settings. A trade execution backtesting method is employed to evaluate the fill prediction performance of these models in relation to their input data. We observe a relative gain of up to ~ 34% in out-of-sample test scores for those models with access to quantum hardware-transformed data over those using the original trading data or transforms by noiseless quantum simulation. These empirical results suggest that the inherent noise in current quantum hardware contributes to this effect and motivates further studies. Our work demonstrates the emerging potential of quantum computing as a complementary explorative tool in quantitative finance and encourages applied industry research towards practical applications in trading.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.17715

Predicting Credit Spreads and Ratings with Machine Learning: The Role of Non-Financial Data

By:	Yanran Wu; Xinlei Zhang; Quanyi Xu; Qianxin Yang; Chao Zhang
Abstract:	We build a 167-indicator comprehensive credit risk indicator set, integrating macro, corporate financial, bond-specific indicators, and for the first time, 30 large-scale corporate non-financial indicators. We use seven machine learning models to construct a bond credit spread prediction model, test their spread predictive power and economic mechanisms, and verify their credit rating prediction effectiveness. Results show these models outperform Chinese credit rating agencies in explaining credit spreads. Specially, adding non-financial indicators more than doubles their out-of-sample performance vs. traditional feature-driven models. Mechanism analysis finds non-financial indicators far more important than traditional ones (macro-level, financial, bond features)-seven of the top 10 are non-financial (e.g., corporate governance, property rights nature, information disclosure evaluation), the most stable predictors. Models identify high-risk traits (deteriorating operations, short-term debt, higher financing constraints) via these indicators for spread prediction and risk identification. Finally, we pioneer a credit rating model using predicted spreads (predicted implied rating model), with full/sub-industry models achieving over 75% accuracy, recall, F1. This paper provides valuable guidance for bond default early warning, credit rating, and financial stability.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.19042

Uncovering Representation Bias for Investment Decisions in Open-Source Large Language Models

By:	Fabrizio Dimino; Krati Saxena; Bhaskarjit Sarmah; Stefano Pasquali
Abstract:	Large Language Models are increasingly adopted in financial applications to support investment workflows. However, prior studies have seldom examined how these models reflect biases related to firm size, sector, or financial characteristics, which can significantly impact decision-making. This paper addresses this gap by focusing on representation bias in open-source Qwen models. We propose a balanced round-robin prompting method over approximately 150 U.S. equities, applying constrained decoding and token-logit aggregation to derive firm-level confidence scores across financial contexts. Using statistical tests and variance analysis, we find that firm size and valuation consistently increase model confidence, while risk factors tend to decrease it. Confidence varies significantly across sectors, with the Technology sector showing the greatest variability. When models are prompted for specific financial categories, their confidence rankings best align with fundamental data, moderately with technical signals, and least with growth indicators. These results highlight representation bias in Qwen models and motivate sector-aware calibration and category-conditioned evaluation protocols for safe and fair financial LLM deployment.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.05702

Analysis of the Impact of an Execution Algorithm with an Order Book Imbalance Strategy on a Financial Market Using an Agent-based Simulation

By:	Shuto Endo; Takanobu Mizuta; Isao Yagi
Abstract:	Order book imbalance (OBI) - buy orders minus sell orders near the best quote - measures supply-demand imbalance that can move prices. OBI is positively correlated with returns, and some investors try to use it to improve performance. Large orders placed at once can reveal intent, invite front-running, raise volatility, and cause losses. Execution algorithms therefore split parent orders into smaller lots to limit price distortion. In principle, using OBI inside such algorithms could improve execution, but prior evidence is scarce because isolating OBI's effect in real markets is nearly impossible amid many external factors. Multi-agent simulation offers a way to study this. In an artificial market, individual actors are agents whose rules and interactions form the model. This study builds an execution algorithm that accounts for OBI, tests it across several market patterns in artificial markets, and analyzes mechanisms, comparing it with a conventional (OBI-agnostic) algorithm. Results: (i) In stable markets, the OBI strategy's performance depends on the number of order slices; outcomes vary with how the parent order is partitioned. (ii) In markets with unstable prices, the OBI-based algorithm outperforms the conventional approach. (iii) Under spoofing manipulation, the OBI strategy is not significantly worse than the conventional algorithm, indicating limited vulnerability to spoofing. Overall, OBI provides a useful signal for execution. Incorporating OBI can add value - especially in volatile conditions - while remaining reasonably robust to spoofing; in calm markets, benefits are sensitive to slicing design.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.16912

An Adaptive Multi Agent Bitcoin Trading System

By:	Aadi Singhi
Abstract:	This paper presents a Multi Agent Bitcoin Trading system that utilizes Large Lan- guage Models (LLMs) for alpha generation and portfolio management in the cryptocur- rencies market. Unlike equities, cryptocurrencies exhibit extreme volatility and are heavily influenced by rapidly shifting market sentiments and regulatory announcements, making them difficult to model using static regression models or neural networks trained solely on historical data [53]. The proposed framework overcomes this by structuring LLMs into specialised agents for technical analysis, sentiment evaluation, decision-making, and performance reflection. The system improves over time through a novel verbal feedback mechanism where a Reflect agent provides daily and weekly natural-language critiques of trading decisions. These textual evaluations are then injected into future prompts, al- lowing the system to adjust indicator priorities, sentiment weights, and allocation logic without parameter updates or finetuning. Back-testing on Bitcoin price data from July 2024 to April 2025 shows consistent outperformance across market regimes: the Quantita- tive agent delivered over 30% higher returns in bullish phases and 15% overall gains versus buy-and-hold, while the sentiment-driven agent turned sideways markets from a small loss into a gain of over 100%. Adding weekly feedback further improved total performance by 31% and reduced bearish losses by 10%. The results demonstrate that verbal feedback represents a new, scalable, and low-cost method of tuning LLMs for financial goals.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.08068

What Can Satellite Imagery and Machine Learning Measure?

By:	Jonathan Proctor; Tamma Carleton; Trinetta Chong; Taryn Fransen; Simon Greenhill; Jessica Katz; Hikari Murayama; Luke Sherman; Jeanette Tseng; Hannah Druckenmiller; Solomon Hsiang
Abstract:	Satellite imagery and machine learning (SIML) are increasingly being combined to remotely measure social and environmental outcomes, yet use of this technology has been limited by insufficient understanding of its strengths and weaknesses. Here, we undertake the most extensive effort yet to characterize the potential and limits of using a SIML technology to measure ground conditions. We conduct 115 standardized large-scale experiments using a composite high-resolution optical image of Earth and a generalizable SIML technology to evaluate what can be accurately measured and where this technology struggles. We find that SIML alone predicts roughly half the variation in ground measurements on average, and that variables describing human society (e.g. female literacy, R²=0.55) are generally as easily measured as natural variables (e.g. bird diversity, R²=0.55). Patterns of performance across measured variable type, space, income and population density indicate that SIML can likely support many new applications and decision-making use cases, although within quantifiable limits.
JEL:	C80 Q5
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:nbr:nberwo:34315

Predictive economics: Rethinking economic methodology with machine learning

By:	Miguel Alves Pereira
Abstract:	This article proposes predictive economics as a distinct analytical perspective within economics, grounded in machine learning and centred on predictive accuracy rather than causal identification. Drawing on the instrumentalist tradition (Friedman), the explanation-prediction divide (Shmueli), and the contrast between modelling cultures (Breiman), we formalise prediction as a valid epistemological and methodological objective. Reviewing recent applications across economic subfields, we show how predictive models contribute to empirical analysis, particularly in complex or data-rich contexts. This perspective complements existing approaches and supports a more pluralistic methodology - one that values out-of-sample performance alongside interpretability and theoretical structure.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.04726

The New Quant: A Survey of Large Language Models in Financial Prediction and Trading

By:	Weilong Fu
Abstract:	Large language models are reshaping quantitative investing by turning unstructured financial information into evidence-grounded signals and executable decisions. This survey synthesizes research with a focus on equity return prediction and trading, consolidating insights from domain surveys and more than fifty primary studies. We propose a task-centered taxonomy that spans sentiment and event extraction, numerical and economic reasoning, multimodal understanding, retrieval-augmented generation, time series prompting, and agentic systems that coordinate tools for research, backtesting, and execution. We review empirical evidence for predictability, highlight design patterns that improve faithfulness such as retrieval first prompting and tool-verified numerics, and explain how signals feed portfolio construction under exposure, turnover, and capacity controls. We assess benchmarks and datasets for prediction and trading and outline desiderata-for time safe and economically meaningful evaluation that reports costs, latency, and capacity. We analyze challenges that matter in production, including temporal leakage, hallucination, data coverage and structure, deployment economics, interpretability, governance, and safety. The survey closes with recommendations for standardizing evaluation, building auditable pipelines, and advancing multilingual and cross-market research so that language-driven systems deliver robust and risk-controlled performance in practice.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.05533

Private and public school efficiency gaps in Latin America-A combined DEA and machine learning approach based on PISA 2022

By:	Marcos Delprato
Abstract:	Latin America's education systems are fragmented and segregated, with substantial differences by school type. The concept of school efficiency (the ability of school to produce the maximum level of outputs given available resources) is policy relevant due to scarcity of resources in the region. Knowing whether private and public schools are making an efficient use of resources -- and which are the leading drivers of efficiency -- is critical, even more so after the learning crisis brought by the COVID-19 pandemic. In this paper, relying on data of 2, 034 schools and nine Latin American countries from PISA 2022, I offer new evidence on school efficiency (both on cognitive and non-cognitive dimensions) using Data Envelopment Analysis (DEA) by school type and, then, I estimate efficiency leading determinants through interpretable machine learning methods (IML). This hybrid DEA-IML approach allows to accommodate the issue of big data (jointly assessing several determinants of school efficiency). I find a cognitive efficiency gap of nearly 0.10 favouring private schools and of 0.045 for non-cognitive outcomes, and with a lower heterogeneity in private than public schools. For cognitive efficiency, leading determinants for the chance of a private school of being highly efficient are higher stock of books and PCs at home, lack of engagement in paid work and school's high autonomy; whereas low-efficient public schools are shaped by poor school climate, large rates of repetition, truancy and intensity of paid work, few books at home and increasing barriers for homework during the pandemic.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.25353

Increase Alpha: Performance and Risk of an AI-Driven Trading Framework

By:	Sid Ghatak; Arman Khaledian; Navid Parvini; Nariman Khaledian
Abstract:	There are inefficiencies in financial markets, with unexploited patterns in price, volume, and cross-sectional relationships. While many approaches use large-scale transformers, we take a domain-focused path: feed-forward and recurrent networks with curated features to capture subtle regularities in noisy financial data. This smaller-footprint design is computationally lean and reliable under low signal-to-noise, crucial for daily production at scale. At Increase Alpha, we built a deep-learning framework that maps over 800 U.S. equities into daily directional signals with minimal computational overhead. The purpose of this paper is twofold. First, we outline the general overview of the predictive model without disclosing its core underlying concepts. Second, we evaluate its real-time performance through transparent, industry standard metrics. Forecast accuracy is benchmarked against both naive baselines and macro indicators. The performance outcomes are summarized via cumulative returns, annualized Sharpe ratio, and maximum drawdown. The best portfolio combination using our signals provides a low-risk, continuous stream of returns with a Sharpe ratio of more than 2.5, maximum drawdown of around 3\%, and a near-zero correlation with the S\&P 500 market benchmark. We also compare the model's performance through different market regimes, such as the recent volatile movements of the US equity market in the beginning of 2025. Our analysis showcases the robustness of the model and significantly stable performance during these volatile periods. Collectively, these findings show that market inefficiencies can be systematically harvested with modest computational overhead if the right variables are considered. This report will emphasize the potential of traditional deep learning frameworks for generating an AI-driven edge in the financial market.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.16707

Asymptotic expansions as control variates for deep solvers to fully-coupled forward-backward stochastic differential equations "Forthcoming in PLOS ONE"

By:	Makoto Naito (Graduate School of Management, Tokyo Metropolitan University); Taiga Saito (Graduate School of Commerce, Senshu University); Akihiko Takahashi (Graduate School of Economics, The University of Tokyo); Kohta Takehara (Graduate School of Management, Tokyo Metropolitan University)
Abstract:	Coupled forward-backward stochastic differential equations (FBSDEs) are closely related to financially important issues such as optimal investment. However, it is well known that obtaining solutions is challenging, even when employing numerical methods. In this paper, we propose new methods that combine an algorithm recently developed for coupled FBSDEs and an asymptotic expansion approach to those FBSDEs as control variates for learning of the neural networks. The proposed method is demonstrated to perform better than the original algorithm in numerical examples, including one with a financial implication. The results show that the proposed method exhibits not only faster convergence but also greater stability in computation.
Date:	2025–03
URL:	https://d.repec.org/n?u=RePEc:cfi:fseres:cf600

Joint Stochastic Optimal Control and Stopping in Aquaculture: Finite-Difference and PINN-Based Approaches

By:	Kevin Kamm
Abstract:	This paper studies a joint stochastic optimal control and stopping (JCtrlOS) problem motivated by aquaculture operations, where the objective is to maximize farm profit through an optimal feeding strategy and harvesting time under stochastic price dynamics. We introduce a simplified aquaculture model capturing essential biological and economic features, distinguishing between biologically optimal and economically optimal feeding strategies. The problem is formulated as a Hamilton-Jacobi-Bellman variational inequality and corresponding free boundary problem. We develop two numerical solution approaches: First, a finite difference scheme that serves as a benchmark, and second, a Physics-Informed Neural Network (PINN)-based method, combined with a deep optimal stopping (DeepOS) algorithm to improve stopping time accuracy. Numerical experiments demonstrate that while finite differences perform well in medium-dimensional settings, the PINN approach achieves comparable accuracy and is more scalable to higher dimensions where grid-based methods become infeasible. The results confirm that jointly optimizing feeding and harvesting decisions outperforms strategies that neglect either control or stopping.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.02910

The AI Productivity Index (APEX)

By:	Bertie Vidgen; Abby Fennelly; Evan Pinnix; Chirag Mahapatra; Zach Richards; Austin Bridges; Calix Huang; Ben Hunsberger; Fez Zafar; Brendan Foody; Dominic Barton; Cass R. Sunstein; Eric Topol; Osvald Nitski
Abstract:	We introduce the first version of the AI Productivity Index (APEX), a benchmark for assessing whether frontier AI models can perform knowledge work with high economic value. APEX addresses one of the largest inefficiencies in AI research: outside of coding, benchmarks often fail to test economically relevant capabilities. APEX-v1.0 contains 200 test cases and covers four domains: investment banking, management consulting, law, and primary medical care. It was built in three steps. First, we sourced experts with top-tier experience e.g., investment bankers from Goldman Sachs. Second, experts created prompts that reflect high-value tasks in their day-to-day work. Third, experts created rubrics for evaluating model responses. We evaluate 23 frontier models on APEX-v1.0 using an LM judge. GPT 5 (Thinking = High) achieves the highest mean score (64.2%), followed by Grok 4 (61.3%) and Gemini 2.5 Flash (Thinking = On) (60.4%). Qwen 3 235B is the best performing open-source model and seventh best overall. There is a large gap between the performance of even the best models and human experts, highlighting the need for better measurement of models' ability to produce economically valuable work.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.25721

Machines are more productive than humans until they aren't, and vice versa

By:	Riccardo Zanardelli
Abstract:	With the growth of artificial skills, organizations are increasingly confronting the problem of optimizing skill policy decisions guided by economic principles. This paper addresses the underlying complexity of this challenge by developing an in-silico framework based on Monte Carlo simulations grounded in empirical realism to analyze the economic impact of human and machine skills, individually or jointly deployed, in the execution of tasks presenting varying levels of complexity. Our results provide quantitative support for the established notions that automation tends to be the most economically-effective strategy for tasks characterized by low-to-medium generalization difficulty, while automation may struggle to match the economic utility of human skills in more complex scenarios. Critically, our simulations highlight that, when a high level of generalization is required and the cost of errors is high, combining human and machine skills can be the most effective strategy, but only if genuine augmentation is achieved. In contrast, when failing to realize this synergy, the human-machine policy is severely penalized by the inherent costs of its dual skill structure, causing it to destroy value and become the worst choice from an economic perspective. The takeaway for decision-makers is unambiguous: in complex and critical contexts, simply allocating human and machine skills to a task may be insufficient, and a human-machine skill policy is neither a silver-bullet solution nor a low-risk compromise. Rather, it is a critical opportunity to boost competitiveness that demands a strong organizational commitment to enabling augmentation. Also, our findings show that improving the cost-effectiveness of machine skills over time, while useful, does not replace the fundamental need to focus on achieving augmentation.
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2509.14057

Artificial Intelligence in Port Logistics: A Bibliometric Analysis of Technological Integration and Research Dynamics

By:	Abdelhafid Khazzar; Yassine Sekaki; Yasser Lachhab; Said El-marzouki
Abstract:	The paper explores the transformation of port logistics operations with artificial intelligence during the port transformation into a smart port. The research integrates capabilities-based resource analysis and dynamic capabilities with sociotechnicalimplementations of technologies and resilience approaches of complex systems under disruptions. The system applies robustdata infrastructures to propel analytical and AI modules that become effective once integrated with sufficient governance systems and trained personnel and operational processes to transform planning and safety and sustainability operations.It applies Scopus bibliometric research to analyze 123 articles using a systematic approach with both a search protocol and a document screening and duplication verification. It incorporates annual behavior and distribution of author and country performance analysis with science mapping techniques that explore keyword relation and co-citation and bibliographic coupling and conceptual structuring tools that construct thematic maps and multiple correspondence analysis with community detection while applying explicit thresholding and robust tests.The research connects AI applications to smart port domains through specific data-to-impact pathways while providing a method for bibliometric analysis that enables future updates. The research presents a step-by-step approach for data readiness followed by predictive and optimization implementation and organizational integration. The paper supports public policy through recommendations for data sharing standards and complete environmental benefit assessments. The research proposes a future study plan whichcombines field-based testing with multiple port assessments to enhance both cause-effect understanding and research applicability.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.06556

Why Virtual Mileage Can Threaten Vehicle-to-Grid

By:	Pierre Dumont (GeePs - Laboratoire Génie électrique et électronique de Paris - CentraleSupélec - SU - Sorbonne Université - Université Paris-Saclay - CNRS - Centre National de la Recherche Scientifique, Stellantis (Centre technique de Carrières-sous-Poissy)); Lorenzo Nicoletti (Stellantis (Centre technique de Carrières-sous-Poissy)); Marc Petit (GeePs - Laboratoire Génie électrique et électronique de Paris - CentraleSupélec - SU - Sorbonne Université - Université Paris-Saclay - CNRS - Centre National de la Recherche Scientifique); Damien-Pierre Sainflou (Stellantis (Centre technique de Carrières-sous-Poissy))
Abstract:	Vehicle-to-grid (V2G) technology is gaining interest, particularly for electricity trading using electric vehicle (EV) batteries. This study focuses on the economic impact of V2Ginduced vehicle degradation. Unlike traditional approaches that estimate costs based on battery capacity loss, upcoming "virtual mileage" regulations aim to provide a more tangible metric. Virtual mileage, deduced from the energy reinjected onto the grid by the vehicle, is meant to represent a similar wear as real mileage. However, it should be noted that this metric is flawed as it tends to overestimate vehicle degradation since it tacitly includes wear on components that are not used during V2G (for instance: tires, brakes), and overlooks other factors like battery calendar ageing: for example, an EV with high virtual mileage could retain better battery health than one stored at full charge. Virtual mileage could hence significantly affect EV residual value, around ~1 c€ per virtual kilometre in order of magnitude, translating to ~0.05 € per discharged kWh. This depreciation would pose a substantial barrier to V2G profitability. Using simulations of EVs in the French day-ahead electricity market for 2019, the study finds that accounting for devaluation reduces average annual V2G benefits to just 6.96 €/EV, compared to 29.2 €/EV without it. The paper highlights the aforementioned limitations to virtual mileage and advocates alternative metrics such as the state-of-health to assess vehicle degradation, aiming to enhance the feasibility of V2G.
Keywords:	day-ahead market, energy arbitrage, residual value, virtual mileage, battery degradation, Vehicle-to-grid
Date:	2025–06–29
URL:	https://d.repec.org/n?u=RePEc:hal:journl:hal-05294002

Incorporating Non-Unitary Income Elasticities, Choke Prices and Choke Incomes into Applied General-Equilibrium Models

By:	James R. Markusen
Abstract:	Traditional applied general-equilibrium (AGE) models have always faced trade-offs between analytical and computational tractability and counter-empirical restrictions. One is the assumption of homothetic preferences implying unitary income elasticities of demand, significantly inconsistent with data. Similarly, there is no “choke” income level, below which a certain good is not purchased and there is no choke price above which a good is not purchased, implying no changes in the extensive margin of trade. Here I exploit what I will label a Stone-Geary Modified (SGM) formulation. This produces a model in which there are non-unitary income elasticities, choke income levels for some/all goods, and choke prices. The second approach modifies CRIE (constant relative income elasticity) preferences which are preferred for modeling income elasticities, but don’t by themselves permit choke income and prices. While other authors have explored these properties in alternative ways, both my approaches have considerable advantages for high-dimension simulation models in that they retain CES structures and functional forms so that they can slot right into existing modeling formats. They require only small modifications to off-the-shelf cost and expenditure functions, and therefore goods and factor demand functions via Shepard’s lemma. Unobserved parameters can be calibrated from observed data and econometric estimates.
JEL:	C63 C68 F1 F17
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:nbr:nberwo:34314

Progressiver Einkommensteuertarif und Ehegattenbesteuerung - Simulationsanalyse alternativer Besteuerungskonzepte

By:	Neugebauer, Claudia (Schumpeter School of Business and Economics, University of Wuppertal); Mattern, Marcel (Schumpeter School of Business and Economics, University of Wuppertal)
Abstract:	Der progressive Steuertarif ist ein zentrales Element des deutschen Steuersystems. Er zielt darauf ab, höhere Einkommen stärker zu besteuern und somit eine gerechtere Verteilung der Steuerlast zu gewährleisten. Auch bei einem progressiven Tarifverlauf soll die Gesamtsteuerbelastung verheirateter Personen nicht höher sein als das von zwei Ledigen. Als Alternative zum geltenden Ehegattensplitting wird schon seit Längerem die Individualbesteuerung diskutiert, ggf. kombiniert mit einem übertragbaren Grundfreibetrag oder aber der Berücksichtigung von Unterhaltsleistungen (Realsplitting). Der vorliegende Beitrag analysiert anhand der FAST Daten, wie sich die Gesamtsteuerbelastung zusammen veranlagter Ehegatten bei Anwendung alternativer Besteuerungsmodelle verändern würde. Die Resultate zeigen, dass die Effekte je nach Alter, Haushaltsgröße und Einkommensverteilung zwischen den Ehegatten stark variieren. Das Prinzip der Nichtdiskriminierung wird beim Splittingverfahren uneingeschränkt und bei alternativen Besteuerungsmodellen umso eher erreicht, je höher der zu berücksichtigende Unterhaltsbetrag ausfällt.
Abstract:	The progressive income tax rate is central to the German tax system. The aim is to tax higher incomes more heavily, thus ensuring a fairer distribution of the tax burden. Under this system, the total tax burden for married persons should not exceed that for two single individuals. Individual taxation has been discussed as an alternative to the current marriage splitting system, possibly combined with a transferable basic tax-free allowance or the consideration of maintenance payments (real splitting). This article uses FAST data to analyze how the total tax burden of jointly assessed spouses would change if alternative taxation models were applied. The results show that the effects vary greatly depending on age, household size and income Distribution between the spouses. The principle of non-discrimination is fully achieved under the Splitting method, and with alternative taxation models, this principle is better fulfilled the higher the amount of maintenance payments consid ered.
Keywords:	Ehegattenbesteuerung, Realsplitting, Tarifsimulation; joint taxation, individual taxation, tax rate simulation
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:bwu:schdps:sdp25001

Robust Identification in Repeated Games: An Empirical Approach to Algorithmic Competition

By:	Antonio Cozzolino (NYU Stern, New York University, New York, NY, USA); Cristina Gualdani (School of Economics and Finance, Queen Mary University of London, London, UK); Ivan Gufler (Department of Economics and Finance, University of Bonn, Bonn, Germany); Niccolò Lomys (CSEF and Department of Economics and Statistics, University of Naples Federico II, Naples, Italy); Lorenzo Magnolfi (Department of Economics, University of Wisconsin-Madison, Madison, WI, USA)
Abstract:	We develop an econometric framework for recovering structural primitives---such as marginal costs---from price or quantity data generated by firms whose decisions are governed by reinforcement-learning algorithms. Guided by recent theory and simulations showing that such algorithms can learn to approximate repeated-game equilibria, we impose only the minimal optimality conditions implied by equilibrium, while remaining agnostic about the algorithms’ hidden design choices and the resulting conduct---competitive, collusive, or anywhere in between. These weak restrictions yield set identification of the primitives; we characterise the resulting sets and construct estimators with valid confidence regions. Monte~Carlo simulations confirm that our bounds contain the true parameters across a wide range of algorithm specifications, and that the sets tighten substantially when exogenous demand variation across markets is exploited. The framework thus offers a practical tool for empirical analysis and regulatory assessment of algorithmic behaviour.
Keywords:	Algorithms; Reinforcement Learning; Repeated Games; Coarse Correlated Equilibrium; Partial Identification; Incomplete Models
JEL:	C1 C5 C7 D8 L1
Date:	2025–09
URL:	https://d.repec.org/n?u=RePEc:net:wpaper:2504

Inverse Portfolio Optimization with Synthetic Investor Data: Recovering Risk Preferences under Uncertainty

By:	Jinho Cha; Long Pham; Thi Le Hoa Vo; Jaeyoung Cho; Jaejin Lee
Abstract:	This study develops an inverse portfolio optimization framework for recovering latent investor preferences including risk aversion, transaction cost sensitivity, and ESG orientation from observed portfolio allocations. Using controlled synthetic data, we assess the estimator's statistical properties such as consistency, coverage, and dynamic regret. The model integrates robust optimization and regret-based inference to quantify welfare losses under preference misspecification and market shocks. Simulation experiments demonstrate accurate recovery of transaction cost parameters, partial identifiability of ESG penalties, and sublinear regret even under stochastic volatility and liquidity shocks. A real-data illustration using ETFs confirms that transaction-cost shocks dominate volatility shocks in welfare impact. The framework thus provides a statistically rigorous and economically interpretable tool for robust preference inference and portfolio design under uncertainty.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.06986

Cost estimation in the context of Manufacturing-as-a-Service

By:	Farah Abdoune (LS2N - Laboratoire des Sciences du Numérique de Nantes - Inria - Institut National de Recherche en Informatique et en Automatique - CNRS - Centre National de la Recherche Scientifique - IMT Atlantique - IMT Atlantique - IMT - Institut Mines-Télécom [Paris] - Nantes Univ - ECN - NANTES UNIVERSITÉ - École Centrale de Nantes - Nantes Univ - Nantes Université - Nantes univ - UFR ST - Nantes université - UFR des Sciences et des Techniques - Nantes Université - pôle Sciences et technologie - Nantes Univ - Nantes Université, LS2N - équipe CPS3 - Conception, Pilotage, Surveillance et Supervision des systèmes - LS2N - Laboratoire des Sciences du Numérique de Nantes - Inria - Institut National de Recherche en Informatique et en Automatique - CNRS - Centre National de la Recherche Scientifique - IMT Atlantique - IMT Atlantique - IMT - Institut Mines-Télécom [Paris] - Nantes Univ - ECN - NANTES UNIVERSITÉ - École Centrale de Nantes - Nantes Univ - Nantes Université - Nantes univ - UFR ST - Nantes université - UFR des Sciences et des Techniques - Nantes Université - pôle Sciences et technologie - Nantes Univ - Nantes Université, Nantes Univ - Nantes Université); Rasmus Andersen (AAU - Aalborg University [Denmark]); Ann-Louise Andersen (AAU - Aalborg University [Denmark]); Catherine da Cunha (LS2N - équipe CPS3 - Conception, Pilotage, Surveillance et Supervision des systèmes - LS2N - Laboratoire des Sciences du Numérique de Nantes - Inria - Institut National de Recherche en Informatique et en Automatique - CNRS - Centre National de la Recherche Scientifique - IMT Atlantique - IMT Atlantique - IMT - Institut Mines-Télécom [Paris] - Nantes Univ - ECN - NANTES UNIVERSITÉ - École Centrale de Nantes - Nantes Univ - Nantes Université - Nantes univ - UFR ST - Nantes université - UFR des Sciences et des Techniques - Nantes Université - pôle Sciences et technologie - Nantes Univ - Nantes Université, LS2N - Laboratoire des Sciences du Numérique de Nantes - Inria - Institut National de Recherche en Informatique et en Automatique - CNRS - Centre National de la Recherche Scientifique - IMT Atlantique - IMT Atlantique - IMT - Institut Mines-Télécom [Paris] - Nantes Univ - ECN - NANTES UNIVERSITÉ - École Centrale de Nantes - Nantes Univ - Nantes Université - Nantes univ - UFR ST - Nantes université - UFR des Sciences et des Techniques - Nantes Université - pôle Sciences et technologie - Nantes Univ - Nantes Université, Nantes Univ - ECN - NANTES UNIVERSITÉ - École Centrale de Nantes - Nantes Univ - Nantes Université)
Abstract:	Manufacturing as a Service (MaaS) represents a transformative shift in industrial production, offering flexible and scalable solutions through the use of shared manufacturing resources. However, the on-demand and variable nature of MaaS poses significant challenges in accurately estimating costs. This paper addresses these challenges by reviewing existing costing methods and selecting Activity-Based Costing (ABC) as the most suitable approach. A framework for cost estimation in this context is then proposed, followed by the use of simulation to support the implementation of ABC. The feasibility of this approach is demonstrated in a smart factory environment, showcasing how simulation can enhance cost estimation in a controlled setting. Finally, the methodology is examined from an industrial perspective, highlighting potential challenges and considerations for real-world application.
Keywords:	Activity-based costing, Costing, Digital enterprise, Modeling and simulation, Manufacturing-as-a-service
Date:	2025–06–30
URL:	https://d.repec.org/n?u=RePEc:hal:journl:hal-05288523

Fiscal Drag in Theory and in Practice: a European Perspective

By:	Esteban García-Miralles; Maximilian Freier; Sara Riscado; Chrysa Leventi; Alberto Mazzon; Glenn Abela; Laura Lehtonen; Laura Boyd; Baiba BrusbÄ rde; Marion Cochard; David Cornille; Emanuele Dicarlo; Ian Debattista; Mar Delgado-Téllez; Mathias Dolls; Ludmila Fadejeva; Maria Flevotomou; Florian Henne; Alena Harrer-Bachleitner; Viktor Jaszberenyi-Kiraly; Max Lay; Mauro Mastrogiacomo; Tara McIndoe-Calder; Mathias Moser; Martin Nevicky; Andreas Peichl; Myroslav Pidkuyko; Mojca Roter; Frédérique Savignac; Andreja Strojan Kastelec; Vaidotas Tuzikas; Nikos Ventouris; Lara Wemans
Abstract:	This paper presents a comprehensive characterization of â€œfiscal dragâ€ â€”the increase in tax revenue that occurs when nominal tax bases grow but nominal parameters of progressive tax legislation are not updated accordinglyâ€”across 21 European countries using a microsimulation approach. First, we estimate tax-to-base elasticities, showing that the progressivity built in each countryâ€™s personal income tax system induces elas- ticities around 1.7â€“1.9 for many countries, indicating a potential for large fiscal drag effects. We unpack these elasticities to show stark heterogeneity in their underlying mechanisms (tax brackets or tax deductions and credits), across income sources (labor, capital, self-employment, public benefits), and across the individual income distribu- tion. Second, we extend the analysis beyond these elasticities to study fiscal drag in practice between 2019 and 2023, incorporating observed income growth and legislative changes. We quantify the actual impact of fiscal drag and the extent to which govern- ment policies have offset it, either through indexation or other reforms. Our results provide new insights into the fiscal and distributional effects of fiscal drag in Europe, as well as useful statistics for modeling public finances.
Keywords:	Personal income tax; inflation; indexation; bracket creep;
JEL:	D31 H24 E62
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:dnb:dnbwpp:844

Modeling ROI in Chronic Disease Management, A Simulation-Based Framework Integrating Patient Adherence and Policy Timing

By:	Jinho Cha; Eunchan D. Cha; Emily Yoo; Hyoshin Song
Abstract:	Background: Chronic diseases impose a sustained burden on healthcare systems through progressive deterioration and long-term costs. Although adherence-enhancing interventions are widely promoted, their return on investment (ROI) remains uncertain, particularly under heterogeneous patient behavior and socioeconomic variation. Methods: We developed a simulation-based framework integrating disease progression, time-varying adherence, and policy timing. Cumulative healthcare costs were modeled over a 10-year horizon using continuous-time stochastic formulations calibrated with Medical Expenditure Panel Survey (MEPS) data stratified by income. ROI was estimated across adherence gains (delta) and policy costs (gamma). Results: Early and adaptive interventions yielded the highest ROI by sustaining adherence and slowing progression. ROI exceeded 20 percent when delta >= 0.20 and gamma
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.06379

Traffic jams and driver behavior archetypes

By:	Shawn Berry
Abstract:	Traffic congestion represents a complex urban phenomenon that has been the subject of extensive research employing various modeling techniques grounded in the principles of physics and molecular theory. Although factors such as road design, accidents, weather conditions, and construction activities contribute to traffic congestion, driver behavior and decision-making are primary determinants of traffic flow efficiency. This study introduces a driver behavior archetype model that quantifies the relationship between individual driver behavior and system-level traffic outcomes through game-theoretic modeling and simulation (N = 500, 000) of a three-lane roadway. Mann-Whitney U tests revealed statistically significant differences across all utility measures (p 2.0). In homogeneous populations, responsible drivers achieved substantially higher expected utility (M = -0.090) than irresponsible drivers (M = -1.470). However, in mixed environments (50/50), irresponsible drivers paradoxically outperformed responsible drivers (M = 0.128 vs. M = -0.127), illustrating a social dilemma wherein defection exploits cooperation. Pairwise comparisons across the six driver archetypes indicated that all irresponsible types achieved equivalent utilities while consistently surpassing responsible drivers. Lane-specific analyses revealed differential capacity patterns, with lane 1 exhibiting a more pronounced cumulative utility decline. These findings offer a robust framework for traffic management interventions, congestion prediction, and policy design that aligns individual incentives with collective efficiency. Directions for future research were also proposed.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.04740

Smart Contract-Enabled Procurement under Bounded Demand Variability: A Truncated Normal Approach

By:	Jinho Cha; Youngchul Kim; Junyeol Ryu; Sangjun Park; Jeongho Kang; Hyeyoung Hwang
Abstract:	This study develops a strategic procurement framework integrating blockchain-based smart contracts with bounded demand variability modeled through a truncated normal distribution. While existing research emphasizes the technical feasibility of smart contracts, the operational and economic implications of adoption under moderate uncertainty remain underexplored. We propose a multi-supplier model in which a centralized retailer jointly determines the optimal smart contract adoption intensity and supplier allocation decisions. The formulation endogenizes adoption costs, supplier digital readiness, and inventory penalties to capture realistic trade-offs among efficiency, sustainability, and profitability. Analytical results establish concavity and provide closed-form comparative statics for adoption thresholds and procurement quantities. Extensive numerical experiments demonstrate that moderate demand variability supports partial adoption strategies, whereas excessive investment in digital infrastructure can reduce overall profitability. Dynamic simulations further reveal how adaptive learning and declining implementation costs progressively enhance adoption intensity and supply chain performance. The findings provide theoretical and managerial insights for balancing digital transformation, resilience, and sustainability objectives in smart contract-enabled procurement.
Date:	2025–10
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2510.07801

This nep-cmp issue is ©2025 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.