nep-cmp New Economics Papers
on Computational Economics
Issue of 2025–08–18
33 papers chosen by
Stan Miles, Thompson Rivers University


  1. Deep Learning Models for Financial Data Analysis: A Focused Review of Recent Advances By Duane, Jackson; Ren, Alicia; Zhang, Wei
  2. FinMarBa: A Market-Informed Dataset for Financial Sentiment Classification By Baptiste Lefort; Eric Benhamou; Beatrice Guez; Jean-Jacques Ohana; Ethan Setrouk; Alban Etienne
  3. How AI Detects Financial Fraud: A Review of Emerging Deep Learning Methods By Mori, Misato
  4. Machine learning regionalisation of input data for microsimulation models: An application of a hybrid GBM / IPF method to build a tax-benefit model for the Essex region in the UK By Richiardi, Matteo; Rejoice, Frimpong
  5. The ordinary meaning bot: Simulating human surveys with LLMs By Johannes Kruse
  6. Machine Learning based Enterprise Financial Audit Framework and High Risk Identification By Tingyu Yuan; Xi Zhang; Xuanjing Chen
  7. Quantum generative modeling for financial time series with temporal correlations By David Dechant; Eliot Schwander; Lucas van Drooge; Charles Moussa; Diego Garlaschelli; Vedran Dunjko; Jordi Tura
  8. Evaluating Large Language Models (LLMs) in Financial NLP: A Comparative Study on Financial Report Analysis By Md Talha Mohsin
  9. Valuing Time in Silicon: Can Large Language Models Replicate Human Value of Travel Time By Yingnan Yan; Tianming Liu; Yafeng Yin
  10. MountainLion: A Multi-Modal LLM-Based Agent System for Interpretable and Adaptive Financial Trading By Siyi Wu; Zhaoyang Guan; Leyi Zhao; Xinyuan Song; Xinyu Ying; Hanlin Zhang; Michele Pak; Yangfan He; Yi Xin; Jianhui Wang; Tianyu Shi
  11. AI-Powered Trading, Algorithmic Collusion, and Price Efficiency By Winston Wei Dou; Itay Goldstein; Yan Ji
  12. ContestTrade: A Multi-Agent Trading System Based on Internal Contest Mechanism By Li Zhao; Rui Sun; Zuoyou Jiang; Bo Yang; Yuxiao Bai; Mengting Chen; Xinyang Wang; Jing Li; Zuo Bai
  13. Can large language models assist choice modelling? Insights into prompting strategies and current models' capabilities By Georges Sfeir; Gabriel Nova; Stephane Hess; Sander van Cranenburgh
  14. Your AI, Not Your View: The Bias of LLMs in Investment Analysis By Hoyoung Lee; Junhyuk Seo; Suhwan Park; Junhyeong Lee; Wonbin Ahn; Chanyeol Choi; Alejandro Lopez-Lira; Yongjae Lee
  15. An Enhanced Focal Loss Function to Mitigate Class Imbalance in Auto Insurance Fraud Detection with Explainable AI By Francis Boabang; Samuel Asante Gyamerah
  16. A New and Efficient Debiased Estimation of General Treatment Models by Balanced Neural Networks Weighting By Zeqi Wu; Meilin Wang; Wei Huang; Zheng Zhang
  17. Learning from Expert Factors: Trajectory-level Reward Shaping for Formulaic Alpha Mining By Junjie Zhao; Chengxi Zhang; Chenkai Wang; Peng Yang
  18. FinSurvival: A Suite of Large Scale Survival Modeling Tasks from Finance By Aaron Green; Zihan Nie; Hanzhen Qin; Oshani Seneviratne; Kristin P. Bennett
  19. Redefining Regions in Space and Time: A Deep Learning Method for Spatio-Temporal Clustering By Pablo Quintana; Marcos Herrera-Gómez
  20. Aligning Large Language Model Agents with Rational and Moral Preferences: A Supervised Fine-Tuning Approach By Wei Lu; Daniel L. Chen; Christian B. Hansen
  21. End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning By Christian Bongiorno; Efstratios Manolakis; Rosario Nunzio Mantegna
  22. "SustAIn" Designing generative AI to support environmental sensemaking By Brune, Niclas; Vetter, Oliver A.; Walter, Phillip; Buxmann, Peter
  23. Financial Regulation and AI: A Faustian Bargain? By Coppola, Antonio; Clayton, Christopher
  24. Deep Reputation Scoring in DeFi: zScore-Based Wallet Ranking from Liquidity and Trading Signals By Dhanashekar Kandaswamy; Ashutosh Sahoo; Akshay SP; Gurukiran S; Parag Paul; Girish G N
  25. AI Agents in the Electricity Market Game with Cryptocurrency Transactions: A Post-Terminator Analysis By Microsoft Copilot; Stephen E. Spear
  26. Time Deep Gradient Flow Method for pricing American options By Jasper Rou
  27. Defining Current and Expected Financial Constraints Using AI: Reinterpreting the Cash Flow Sensitivity of Cash By Rachel Cho; Christoph Görtz; Danny McGowan; Max Schröder
  28. NMIXX: Domain-Adapted Neural Embeddings for Cross-Lingual eXploration of Finance By Hanwool Lee; Sara Yu; Yewon Hwang; Jonghyun Choi; Heejae Ahn; Sungbum Jung; Youngjae Yu
  29. Networked Information Aggregation via Machine Learning By Michael Kearns; Aaron Roth; Emily Ryu
  30. Human Realignment: An Empirical Study of LLMs as Legal Decision-Aids in Moral Dilemmas By Christoph Engel; Yoan Hermstrüwer; Alison Kim
  31. A Simulation-Based Conceptual Model for Tokenized Recycling: Integrating Blockchain, Market Dynamics, and Behavioral Economics By Atta Ul Mustafa
  32. Building crypto portfolios with agentic AI By Antonino Castelli; Paolo Giudici; Alessandro Piergallini
  33. A Time Series Model for Three Asset Classes used in Financial Simulator By Andrey Sarantsev; Angel Piotrowski; Ian Anderson

  1. By: Duane, Jackson; Ren, Alicia; Zhang, Wei
    Abstract: This paper presents a focused review of recent academic advances in the application of deep learning techniques to algorithmic trading. While traditional machine learning models have long been used in financial forecasting, the last decade has seen a rapid expansion in the use of deep learning architectures due to their ability to model non-linear dependencies, learn hierarchical features, and process high-dimensional sequential data. We categorize and synthesize developments across three primary paradigms: supervised deep learning models for price prediction and signal generation, unsupervised and generative approaches for feature extraction and data augmentation, and reinforcement learning agents for decision-making in trading environments. By analyzing over 30 recent peer-reviewed studies, we highlight how modern models such as attention-based networks, graph neural networks, and deep Q-learning have enhanced the robustness and adaptability of trading algorithms. We also discuss key limitations—including overfitting, data non-stationarity, and lack of interpretability—and summarize efforts to address them. This review serves as a resource for researchers seeking a clear, academically grounded perspective on how deep learning is currently reshaping algorithmic trading systems.
    Date: 2025–07–23
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:ctxf9_v1
  2. By: Baptiste Lefort; Eric Benhamou; Beatrice Guez; Jean-Jacques Ohana; Ethan Setrouk; Alban Etienne
    Abstract: This paper presents a novel hierarchical framework for portfolio optimization, integrating lightweight Large Language Models (LLMs) with Deep Reinforcement Learning (DRL) to combine sentiment signals from financial news with traditional market indicators. Our three-tier architecture employs base RL agents to process hybrid data, meta-agents to aggregate their decisions, and a super-agent to merge decisions based on market data and sentiment analysis. Evaluated on data from 2018 to 2024, after training on 2000-2017, the framework achieves a 26% annualized return and a Sharpe ratio of 1.2, outperforming equal-weighted and S&P 500 benchmarks. Key contributions include scalable cross-modal integration, a hierarchical RL structure for enhanced stability, and open-source reproducibility.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.22932
  3. By: Mori, Misato
    Abstract: Financial fraud generates persistent risk and capital loss across sectors. This study investigates artificial intelligence (AI) methodologies for financial fraud detection, with emphasis on Retrieval-Augmented Generation (RAG). The review covers supervised classification, unsupervised anomaly detection, and graph-based relational modeling using deep neural networks, transformers, and hybrid architectures. Challenges include class imbalance, concept drift, and decision interpretability. We describe the RAG framework integrating retrievers and generative language models with external knowledge bases. Empirical comparisons on synthetic and real-time fraud datasets show improved F1-score, precision, and contextual reasoning in contrast to fine-tuned transformers and static classifiers. Applications include transaction monitoring, policy violation detection, account takeover analysis, and social engineering prevention. Evaluation highlights retrieval-grounded generation as an effective fraud signal augmentation mechanism. The paper concludes with architectural implications for deploying scalable, compliant, and adaptive fraud detection pipelines in multi-domain financial systems.
    Date: 2025–07–16
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:5yjm4_v1
  4. By: Richiardi, Matteo; Rejoice, Frimpong
    Abstract: Development of microsimulation models often requires reweighting some input dataset to reflect the characteristics of a different population of interest. In this paper we explore a machine learning approach whereby a variant of decision trees (Gradient Boosted Machine) is used to replicate the joint distribution of target variables observed in a large, commercially available but slightly biased dataset, with an additional raking step to remove the bias and ensure consistency of relevant marginal distributions with official statistics. The method is applied to build a regional variant of UKMOD, an open-source static tax-benefit model for the UK belonging to the EUROMOD family, with an application to the Greater Essex region in the UK.
    Date: 2025–08–11
    URL: https://d.repec.org/n?u=RePEc:ese:cempwp:cempa9-25
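    Sketch (editor's illustration, not the authors' code): the raking step above is iterative proportional fitting (IPF), which alternately rescales unit weights until the weighted totals match each set of official marginals. A minimal Python version with hypothetical categories and targets:

      import numpy as np

      def ipf_rake(weights, cats_a, cats_b, target_a, target_b, iters=100):
          """Rescale unit weights until weighted counts match both marginal totals."""
          w = weights.astype(float).copy()
          for _ in range(iters):
              # Scale so category-A weighted totals hit their targets...
              totals_a = np.bincount(cats_a, weights=w, minlength=len(target_a))
              w *= (target_a / totals_a)[cats_a]
              # ...then rescale for category B; alternate until both (nearly) hold.
              totals_b = np.bincount(cats_b, weights=w, minlength=len(target_b))
              w *= (target_b / totals_b)[cats_b]
          return w

      # Toy input: 6 survey units, two binary attributes, official totals.
      cats_a = np.array([0, 0, 1, 1, 0, 1])   # e.g. sex
      cats_b = np.array([0, 1, 0, 1, 1, 0])   # e.g. employment status
      w = ipf_rake(np.ones(6), cats_a, cats_b,
                   target_a=np.array([30.0, 70.0]),
                   target_b=np.array([40.0, 60.0]))
      print(w.round(2), w.sum())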
  5. By: Johannes Kruse (Max Planck Institute for Research on Collective Goods, Bonn)
    Abstract: This comment shows how large language models (LLMs) can help courts discern the "ordinary meaning" of statutory terms. Instead of relying on expert-heavy corpus-linguistic techniques (Gries 2025), the author simulates a human survey with GPT-4o. Demographically realistic AI agents replicate the vehicle and yield response distributions of the 2,835 participants in Tobia's 2020 study, with no statistically significant difference from the human data (Kolmogorov–Smirnov p = 0.915). The paper addresses concerns about hallucinations, reproducibility, data leakage, and explainability, and introduces the locked-prompt "Ordinary Meaning Bot," arguing that LLM-based survey simulation is a practical, accurate alternative to dictionaries, intuition, or complex corpus analysis.
    Keywords: ordinary meaning; large language models; prompt engineering; human survey simulation; alignment
    JEL: K1 Z0
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:mpg:wpaper:2025_12
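    Sketch (editor's illustration): the reported p = 0.915 comes from a two-sample Kolmogorov–Smirnov comparison of response distributions. With scipy, on made-up Likert-style responses (not Tobia's data):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      # Hypothetical 1-7 scale responses from humans and from LLM agents.
      human = rng.integers(1, 8, size=2835)
      llm = rng.integers(1, 8, size=2835)

      # Two-sample KS test: a large p-value means we cannot reject that
      # the two response distributions are the same.
      stat, p = stats.ks_2samp(human, llm)
      print(f"KS statistic = {stat:.3f}, p = {p:.3f}")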
  6. By: Tingyu Yuan; Xi Zhang; Xuanjing Chen
    Abstract: In the face of global economic uncertainty, financial auditing has become essential for regulatory compliance and risk mitigation. Traditional manual auditing methods are increasingly limited by large data volumes, complex business structures, and evolving fraud tactics. This study proposes an AI-driven framework for enterprise financial audits and high-risk identification, leveraging machine learning to improve efficiency and accuracy. Using a dataset from the Big Four accounting firms (EY, PwC, Deloitte, KPMG) from 2020 to 2025, the research examines trends in risk assessment, compliance violations, and fraud detection. The dataset includes key indicators such as audit project counts, high-risk cases, fraud instances, compliance breaches, employee workload, and client satisfaction, capturing both audit behaviors and AI's impact on operations. To build a robust risk prediction model, three algorithms - Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbors (KNN) - are evaluated. SVM uses hyperplane optimization for complex classification, RF combines decision trees to manage high-dimensional, nonlinear data with resistance to overfitting, and KNN applies distance-based learning for flexible performance. Through hierarchical K-fold cross-validation and evaluation using F1-score, accuracy, and recall, Random Forest achieves the best performance, with an F1-score of 0.9012, excelling in identifying fraud and compliance anomalies. Feature importance analysis reveals audit frequency, past violations, employee workload, and client ratings as key predictors. The study recommends adopting Random Forest as a core model, enhancing features via engineering, and implementing real-time risk monitoring. This research contributes valuable insights into using machine learning for intelligent auditing and risk management in modern enterprises.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.06266
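    Sketch (editor's illustration): the three-model comparison above follows a standard scikit-learn pattern; here stratified K-fold stands in for the paper's hierarchical scheme, on synthetic data rather than the audit dataset:

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import StratifiedKFold, cross_val_score
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      # Synthetic stand-in for audit records: features plus a high-risk label.
      X, y = make_classification(n_samples=2000, n_features=12,
                                 weights=[0.85, 0.15], random_state=0)
      models = {
          "SVM": make_pipeline(StandardScaler(), SVC()),
          "RF": RandomForestClassifier(random_state=0),
          "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
      }
      cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
      for name, model in models.items():
          f1 = cross_val_score(model, X, y, cv=cv, scoring="f1")
          print(f"{name}: mean F1 = {f1.mean():.4f}")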
  7. By: David Dechant; Eliot Schwander; Lucas van Drooge; Charles Moussa; Diego Garlaschelli; Vedran Dunjko; Jordi Tura
    Abstract: Quantum generative adversarial networks (QGANs) have been investigated as a method for generating synthetic data with the goal of augmenting training data sets for neural networks. This is especially relevant for financial time series, since we only ever observe one realization of the process, namely the historical evolution of the market, which is further limited by data availability and the age of the market. However, for classical generative adversarial networks it has been shown that generated data may (often) not exhibit desired properties (also called stylized facts), such as matching a certain distribution or showing specific temporal correlations. Here, we investigate whether quantum correlations in quantum-inspired models of QGANs can help in the generation of financial time series. We train QGANs, composed of a quantum generator and a classical discriminator, and investigate two approaches for simulating the quantum generator: a full simulation of the quantum circuits, and an approximate simulation using tensor network methods. We test how the choice of hyperparameters, such as the circuit depth and bond dimensions, influences the quality of the generated time series. The QGANs that we trained generate synthetic financial time series that not only match the target distribution but also exhibit the desired temporal correlations, with the quality of each property depending on the hyperparameters and simulation method.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.22035
  8. By: Md Talha Mohsin
    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide variety of Financial Natural Language Processing (FinNLP) tasks. However, systematic comparisons among widely used LLMs remain underexplored. Given the rapid advancement and growing influence of LLMs in financial analysis, this study conducts a thorough comparative evaluation of five leading LLMs, GPT, Claude, Perplexity, Gemini and DeepSeek, using 10-K filings from the 'Magnificent Seven' technology companies. We create a set of domain-specific prompts and then use three methodologies to evaluate model performance: human annotation, automated lexical-semantic metrics (ROUGE, Cosine Similarity, Jaccard), and model behavior diagnostics (prompt-level variance and across-model similarity). The results show that GPT gives the most coherent, semantically aligned, and contextually relevant answers, followed by Claude and Perplexity. Gemini and DeepSeek, on the other hand, show more variability and less agreement. Also, the similarity and stability of outputs change from company to company and over time, showing that model outputs are sensitive to how prompts are written and what source material is used.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.22936
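    Sketch (editor's illustration): two of the automated metrics above are simple to compute directly; toy answer strings stand in for the models' outputs:

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      def jaccard(a: str, b: str) -> float:
          """Shared unique tokens divided by the union of unique tokens."""
          sa, sb = set(a.lower().split()), set(b.lower().split())
          return len(sa & sb) / len(sa | sb)

      ans_1 = "revenue grew driven by cloud and advertising segments"
      ans_2 = "revenue growth was driven mainly by the cloud segment"

      # Cosine similarity between TF-IDF vectors of the two answers.
      tfidf = TfidfVectorizer().fit_transform([ans_1, ans_2])
      cos = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
      print(f"Jaccard = {jaccard(ans_1, ans_2):.3f}, cosine = {cos:.3f}")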
  9. By: Yingnan Yan; Tianming Liu; Yafeng Yin
    Abstract: As a key advancement in artificial intelligence, large language models (LLMs) are set to transform transportation systems. While LLMs offer the potential to simulate human travelers in future mixed-autonomy transportation systems, their behavioral fidelity in complex scenarios remains largely unconfirmed by existing research. This study addresses this gap by conducting a comprehensive analysis of the value of travel time (VOT) of a popular LLM, GPT-4o. We employ a full factorial experimental design to systematically examine the LLM's sensitivity to various transportation contexts, including the choice setting, travel purpose, income, and socio-demographic factors. Our results reveal a high degree of behavioral similarity between the LLM and humans. The LLM exhibits an aggregate VOT similar to that of humans, and demonstrates human-like sensitivity to travel purpose, income, and the time-cost trade-off ratios of the alternatives. Furthermore, the behavioral patterns of the LLM are remarkably consistent across varied contexts. However, we also find that the LLM's context sensitivity is less pronounced than that observed in humans. Overall, this study provides a foundational benchmark for the future development of LLMs as proxies for human travelers, demonstrating their value and robustness while highlighting that their blunted contextual sensitivity requires careful consideration.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.22244
  10. By: Siyi Wu; Zhaoyang Guan; Leyi Zhao; Xinyuan Song; Xinyu Ying; Hanlin Zhang; Michele Pak; Yangfan He; Yi Xin; Jianhui Wang; Tianyu Shi
    Abstract: Cryptocurrency trading is a challenging task requiring the integration of heterogeneous data from multiple modalities. Traditional deep learning and reinforcement learning approaches typically demand large training datasets and encode diverse inputs into numerical representations, often at the cost of interpretability. Recent progress in large language model (LLM)-based agents has demonstrated the capacity to process multi-modal data and support complex investment decision-making. Building on these advances, we present MountainLion, a multi-modal, multi-agent system for financial trading that coordinates specialized LLM-based agents to interpret financial data and generate investment strategies. MountainLion processes textual news, candlestick charts, and trading signal charts to produce high-quality financial reports, while also enabling modification of reports and investment recommendations through data-driven user interaction and question answering. A central reflection module analyzes historical trading signals and outcomes to continuously refine decision processes, and the system is capable of real-time report analysis, summarization, and dynamic adjustment of investment strategies. Empirical results confirm that MountainLion systematically enriches technical price triggers with contextual macroeconomic and capital flow signals, providing a more interpretable, robust, and actionable investment framework that improves returns and strengthens investor confidence.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.20474
  11. By: Winston Wei Dou; Itay Goldstein; Yan Ji
    Abstract: The integration of algorithmic trading with reinforcement learning, termed AI-powered trading, is transforming financial markets. Alongside the benefits, it raises concerns about collusion. This study first develops a model to explore the possibility of collusion among informed speculators in a theoretical environment. We then conduct simulation experiments, replacing the speculators in the model with informed AI speculators who trade based on reinforcement-learning algorithms. We show that they autonomously sustain collusive supra-competitive profits without agreement, communication, or intent. Such collusion undermines competition and market efficiency. We demonstrate that two separate mechanisms underlie this collusion and characterize when each one arises.
    JEL: D43 G10 G14 L13
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:34054
  12. By: Li Zhao; Rui Sun; Zuoyou Jiang; Bo Yang; Yuxiao Bai; Mengting Chen; Xinyang Wang; Jing Li; Zuo Bai
    Abstract: In financial trading, large language model (LLM)-based agents demonstrate significant potential. However, the high sensitivity to market noise undermines the performance of LLM-based trading systems. To address this limitation, we propose a novel multi-agent system featuring an internal competitive mechanism inspired by modern corporate management structures. The system consists of two specialized teams: (1) Data Team - responsible for processing and condensing massive market data into diversified text factors, ensuring they fit the model's constrained context. (2) Research Team - tasked with making parallelized multipath trading decisions based on deep research methods. The core innovation lies in implementing a real-time evaluation and ranking mechanism within each team, driven by authentic market feedback. Each agent's performance undergoes continuous scoring and ranking, with only outputs from top-performing agents being adopted. This design enables the system to adaptively adjust to dynamic environments, enhances robustness against market noise, and ultimately delivers superior trading performance. Experimental results demonstrate that our proposed system significantly outperforms prevailing multi-agent systems and traditional quantitative investment methods across diverse evaluation metrics.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.00554
  13. By: Georges Sfeir; Gabriel Nova; Stephane Hess; Sander van Cranenburgh
    Abstract: Large Language Models (LLMs) are widely used to support various workflows across different disciplines, yet their potential in choice modelling remains relatively unexplored. This work examines the potential of LLMs as assistive agents in the specification and, where technically feasible, estimation of Multinomial Logit models. We implement a systematic experimental framework involving thirteen versions of six leading LLMs (ChatGPT, Claude, DeepSeek, Gemini, Gemma, and Llama) evaluated under five experimental configurations. These configurations vary along three dimensions: modelling goal (suggesting vs. suggesting and estimating MNLs); prompting strategy (Zero-Shot vs. Chain-of-Thought); and information availability (full dataset vs. data dictionary only). Each LLM-suggested specification is implemented, estimated, and evaluated based on goodness-of-fit metrics, behavioural plausibility, and model complexity. Findings reveal that proprietary LLMs can generate valid and behaviourally sound utility specifications, particularly when guided by structured prompts. Open-weight models such as Llama and Gemma struggled to produce meaningful specifications. Claude 4 Sonnet consistently produced the best-fitting and most complex models, while GPT models suggested specifications with robust and stable modelling outcomes. Some LLMs performed better when provided with just the data dictionary, suggesting that limiting raw data access may enhance internal reasoning capabilities. Among all LLMs, GPT o3 was uniquely capable of correctly estimating its own specifications by executing self-generated code. Overall, the results demonstrate both the promise and current limitations of LLMs as assistive agents in choice modelling, not only for model specification but also for supporting modelling decisions and estimation, and provide practical guidance for integrating these tools into choice modellers' workflows.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.21790
  14. By: Hoyoung Lee; Junhyuk Seo; Suhwan Park; Junhyeong Lee; Wonbin Ahn; Chanyeol Choi; Alejandro Lopez-Lira; Yongjae Lee
    Abstract: In finance, Large Language Models (LLMs) face frequent knowledge conflicts due to discrepancies between pre-trained parametric knowledge and real-time market data. These conflicts become particularly problematic when LLMs are deployed in real-world investment services, where misalignment between a model's embedded preferences and those of the financial institution can lead to unreliable recommendations. Yet little research has examined what investment views LLMs actually hold. We propose an experimental framework to investigate such conflicts, offering the first quantitative analysis of confirmation bias in LLM-based investment analysis. Using hypothetical scenarios with balanced and imbalanced arguments, we extract models' latent preferences and measure their persistence. Focusing on sector, size, and momentum, our analysis reveals distinct, model-specific tendencies. In particular, we observe a consistent preference for large-cap stocks and contrarian strategies across most models. These preferences often harden into confirmation bias, with models clinging to initial judgments despite counter-evidence.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.20957
  15. By: Francis Boabang; Samuel Asante Gyamerah
    Abstract: In insurance fraud prediction, handling class imbalance remains a critical challenge. This paper presents a novel multistage focal loss function designed to enhance the performance of machine learning models in such imbalanced settings by helping the optimizer escape local minima and converge to a good solution. Building upon the foundation of the standard focal loss, our proposed approach introduces a dynamic, multi-stage convex and nonconvex mechanism that progressively adjusts the focus on hard-to-classify samples across training epochs. This strategic refinement facilitates more stable learning and improved discrimination between fraudulent and legitimate cases. Through extensive experimentation on a real-world auto insurance dataset, our method achieved better performance than the traditional focal loss, as measured by accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). These results demonstrate the efficacy of the multistage focal loss in boosting model robustness and predictive accuracy in highly skewed classification tasks, offering significant implications for fraud detection systems in the insurance industry. An explainable model is included to interpret the results.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.02283
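    Sketch (editor's illustration): the method builds on the standard binary focal loss, which rescales cross-entropy by (1 - p_t)^gamma so easy examples contribute little. A minimal numpy version with fixed illustrative gamma and alpha (the paper's multistage schedule is not reproduced):

      import numpy as np

      def focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
          """Binary focal loss: cross-entropy down-weighted for easy examples."""
          p = np.clip(p_pred, eps, 1 - eps)
          p_t = np.where(y_true == 1, p, 1 - p)          # prob. of the true class
          a_t = np.where(y_true == 1, alpha, 1 - alpha)  # class-balance weight
          return float(np.mean(-a_t * (1 - p_t) ** gamma * np.log(p_t)))

      y = np.array([1, 0, 0, 0, 1])                # rare fraud labels
      p = np.array([0.9, 0.1, 0.4, 0.2, 0.3])      # predicted fraud probabilities
      print(f"focal loss = {focal_loss(y, p):.4f}")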
  16. By: Zeqi Wu; Meilin Wang; Wei Huang; Zheng Zhang
    Abstract: Estimation and inference of treatment effects under unconfounded treatment assignments often suffer from bias and the 'curse of dimensionality' due to the nonparametric estimation of nuisance parameters for high-dimensional confounders. Although debiased state-of-the-art methods have been proposed for binary treatments under particular treatment models, they can be unstable for small sample sizes. Moreover, directly extending them to general treatment models can lead to computational complexity. We propose a balanced neural networks weighting method for general treatment models, which leverages deep neural networks to alleviate the curse of dimensionality while retaining optimal covariate balance through calibration, thereby achieving debiased and robust estimation. Our method accommodates a wide range of treatment models, including average, quantile, distributional, and asymmetric least squares treatment effects, for discrete, continuous, and mixed treatments. Under regularity conditions, we show that our estimator achieves rate double robustness and √N-asymptotic normality, and its asymptotic variance achieves the semiparametric efficiency bound. We further develop a statistical inference procedure based on weighted bootstrap, which avoids estimating the efficient influence/score functions. Simulation results reveal that the proposed method consistently outperforms existing alternatives, especially when the sample size is small. Applications to the 401(k) dataset and the Mother's Significant Features dataset further illustrate the practical value of the method for estimating both average and quantile treatment effects under binary and continuous treatments, respectively.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.04044
  17. By: Junjie Zhao; Chengxi Zhang; Chenkai Wang; Peng Yang
    Abstract: Reinforcement learning (RL) has successfully automated the complex process of mining formulaic alpha factors for creating interpretable and profitable investment strategies. However, existing methods are hampered by the sparse rewards of the underlying Markov Decision Process. This inefficiency limits the exploration of the vast symbolic search space and destabilizes the training process. To address this, Trajectory-level Reward Shaping (TLRS), a novel reward shaping method, is proposed. TLRS provides dense, intermediate rewards by measuring the subsequence-level similarity between partially generated expressions and a set of expert-designed formulas. Furthermore, a reward centering mechanism is introduced to reduce training variance. Extensive experiments on six major Chinese and U.S. stock indices show that TLRS significantly improves the predictive power of mined factors, boosting the Rank Information Coefficient by 9.29% over existing potential-based shaping algorithms. Notably, TLRS achieves a major leap in computational efficiency by reducing its time complexity with respect to the feature dimension from linear to constant, a significant improvement over distance-based baselines.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.20263
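    Sketch (editor's illustration): a dense intermediate reward of the kind described above can be built from the longest common subsequence between a partially generated expression and expert formulas; the tokenization and normalization here are illustrative assumptions, not TLRS's exact definition:

      from functools import lru_cache

      def lcs_len(a: tuple, b: tuple) -> int:
          """Length of the longest common subsequence of two token tuples."""
          @lru_cache(maxsize=None)
          def rec(i, j):
              if i == len(a) or j == len(b):
                  return 0
              if a[i] == b[j]:
                  return 1 + rec(i + 1, j + 1)
              return max(rec(i + 1, j), rec(i, j + 1))
          return rec(0, 0)

      def shaped_reward(partial, experts):
          """Best normalized LCS overlap with any expert-designed formula."""
          return max(lcs_len(tuple(partial), tuple(e)) / len(e) for e in experts)

      experts = [["ts_rank", "(", "close", ",", "10", ")"],
                 ["delta", "(", "volume", ",", "5", ")"]]
      partial = ["ts_rank", "(", "close"]            # expression mid-generation
      print(f"intermediate reward = {shaped_reward(partial, experts):.3f}")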
  18. By: Aaron Green; Zihan Nie; Hanzhen Qin; Oshani Seneviratne; Kristin P. Bennett
    Abstract: Survival modeling predicts the time until an event occurs and is widely used in risk analysis; for example, it is used in medicine to predict the survival of a patient based on censored data. There is a need for large-scale, realistic, and freely available datasets for benchmarking artificial intelligence (AI) survival models. In this paper, we derive a suite of 16 survival modeling tasks from publicly available transaction data generated by lending of cryptocurrencies in Decentralized Finance (DeFi). Each task was constructed using an automated pipeline based on choices of index and outcome events. For example, the model predicts the time from when a user borrows cryptocurrency coins (index event) until their first repayment (outcome event). We formulate a survival benchmark consisting of a suite of 16 survival-time prediction tasks (FinSurvival). We also automatically create 16 corresponding classification problems for each task by thresholding the survival time using the restricted mean survival time. With over 7.5 million records, FinSurvival provides a suite of realistic financial modeling tasks that will spur future AI survival modeling research. Our evaluation indicated that these are challenging tasks that are not well addressed by existing methods. FinSurvival enables the evaluation of AI survival models applicable to traditional finance, industry, medicine, and commerce, which is currently hindered by the lack of large public datasets. Our benchmark demonstrates how AI models could assess opportunities and risks in DeFi. In the future, the FinSurvival benchmark pipeline can be used to create new benchmarks by incorporating more DeFi transactions and protocols as the use of cryptocurrency grows.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.14160
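    Sketch (editor's illustration): the task construction above, reduced to its core. Survival time runs from an index event (borrow) to an outcome event (repay), censoring at the study end, and a classification label is derived by thresholding at a crude restricted-mean survival time; all numbers are toy values:

      import numpy as np

      borrow_t = np.array([0.0, 2.0, 5.0, 7.0, 10.0])        # index events (days)
      repay_t = np.array([4.0, np.nan, 6.0, np.nan, 12.0])   # NaN = never repaid
      study_end = 14.0

      # Survival time = outcome minus index; censored if no outcome observed.
      time = np.where(np.isnan(repay_t), study_end - borrow_t, repay_t - borrow_t)
      event = ~np.isnan(repay_t)

      # Restricted mean survival time up to horizon tau, here approximated
      # naively from observed times (ignoring censoring) for illustration only.
      tau = 7.0
      rmst = np.mean(np.minimum(time, tau))

      # Derived classification task: repayment observed before the threshold?
      label = (time <= rmst) & event
      print(f"RMST ~ {rmst:.2f} days, labels = {label}")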
  19. By: Pablo Quintana (UNCuyo); Marcos Herrera-Gómez (CIANECO/CONICET/Universidad Nacional de Río Cuarto)
    Abstract: Identifying regions that are both spatially contiguous and internally homogeneous remains a core challenge in spatial analysis and regional economics, especially with the increasing complexity of modern datasets. These limitations are particularly problematic when working with socioeconomic data that evolve over time. This paper presents a novel methodology for spatio-temporal regionalization—Spatial Deep Embedded Clustering (SDEC)—which integrates deep learning with spatially constrained clustering to effectively process time series data. The approach uses autoencoders to capture hidden temporal patterns and reduce dimensionality before clustering, ensuring that both spatial contiguity and temporal coherence are maintained. Through Monte Carlo simulations, we show that SDEC significantly outperforms traditional methods in capturing complex temporal patterns while preserving spatial structure. Using empirical examples, we demonstrate that the proposed framework provides a robust, scalable, and data-driven tool for researchers and policymakers working in public health, urban planning, and regional economic analysis.
    Keywords: Spatial clustering, Spatial Data Science, Spatio-temporal Classification, Territorial analysis.
    JEL: C23 C45 C63
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:aoz:wpaper:368
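    Sketch (editor's illustration): the two-stage idea of SDEC, with PCA standing in for the paper's autoencoder and a k-nearest-neighbour graph enforcing spatial contiguity during clustering; all settings are illustrative:

      import numpy as np
      from sklearn.cluster import AgglomerativeClustering
      from sklearn.decomposition import PCA
      from sklearn.neighbors import kneighbors_graph

      rng = np.random.default_rng(0)
      n_units, t_len = 100, 36
      coords = rng.uniform(0, 10, size=(n_units, 2))             # unit centroids
      series = rng.normal(size=(n_units, t_len)).cumsum(axis=1)  # toy monthly series

      # Step 1: compress each unit's time series to a low-dimensional embedding.
      emb = PCA(n_components=4).fit_transform(series)

      # Step 2: cluster embeddings under a spatial connectivity constraint,
      # so merges only occur between geographically neighbouring units.
      conn = kneighbors_graph(coords, n_neighbors=6, include_self=False)
      labels = AgglomerativeClustering(n_clusters=5, connectivity=conn,
                                       linkage="ward").fit_predict(emb)
      print(np.bincount(labels))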
  20. By: Wei Lu; Daniel L. Chen; Christian B. Hansen
    Abstract: Understanding how large language model (LLM) agents behave in strategic interactions is essential as these systems increasingly participate autonomously in economically and morally consequential decisions. We evaluate LLM preferences using canonical economic games, finding substantial deviations from human behavior. Models like GPT-4o show excessive cooperation and limited incentive sensitivity, while reasoning models, such as o3-mini, align more consistently with payoff-maximizing strategies. We propose a supervised fine-tuning pipeline that uses synthetic datasets derived from economic reasoning to align LLM agents with economic preferences, focusing on two stylized preference structures. In the first, utility depends only on individual payoffs (homo economicus), while utility also depends on a notion of Kantian universalizability in the second preference structure (homo moralis). We find that fine-tuning based on small datasets shifts LLM agent behavior toward the corresponding economic agent. We further assess the fine-tuned agents' behavior in two applications: moral dilemmas involving autonomous vehicles and algorithmic pricing in competitive markets. These examples illustrate how different normative objectives, embedded via realizations of the structured preferences, can influence market and moral outcomes. This work contributes a replicable, cost-efficient, and economically grounded pipeline to align AI preferences using moral-economic principles.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.20796
  21. By: Christian Bongiorno; Efstratios Manolakis; Rosario Nunzio Mantegna
    Abstract: We develop a rotation-invariant neural network that provides the global minimum-variance portfolio by jointly learning how to lag-transform historical returns and how to regularise both the eigenvalues and the marginal volatilities of large equity covariance matrices. This explicit mathematical mapping offers clear interpretability of each module's role, so the model cannot be regarded as a pure black-box. The architecture mirrors the analytical form of the global minimum-variance solution yet remains agnostic to dimension, so a single model can be calibrated on panels of a few hundred stocks and applied, without retraining, to one thousand US equities - a cross-sectional jump that demonstrates robust out-of-sample generalisation. The loss function is the future realized minimum portfolio variance and is optimized end-to-end on real daily returns. In out-of-sample tests from January 2000 to December 2024 the estimator delivers systematically lower realised volatility, smaller maximum drawdowns, and higher Sharpe ratios than the best analytical competitors, including state-of-the-art non-linear shrinkage. Furthermore, although the model is trained end-to-end to produce an unconstrained (long-short) minimum-variance portfolio, we show that its learned covariance representation can be used in general optimizers under long-only constraints with virtually no loss in its performance advantage over competing estimators. These gains persist when the strategy is executed under a highly realistic implementation framework that models market orders at the auctions, empirical slippage, exchange fees, and financing charges for leverage, and they remain stable during episodes of acute market stress.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.01918
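    Sketch (editor's illustration): the analytical object the network mirrors is the global minimum-variance solution w = inv(Sigma) 1 / (1' inv(Sigma) 1). A numpy version, with naive linear shrinkage standing in for the paper's learned covariance cleaning:

      import numpy as np

      def gmv_weights(cov):
          """Global minimum-variance weights: solve Sigma w = 1, then normalize."""
          w = np.linalg.solve(cov, np.ones(cov.shape[0]))
          return w / w.sum()

      rng = np.random.default_rng(0)
      returns = rng.normal(0, 0.01, size=(500, 50))   # 500 days, 50 stocks
      sample_cov = np.cov(returns, rowvar=False)

      # Crude shrinkage toward a scaled identity as a stand-in for the
      # paper's learned eigenvalue/volatility regularisation.
      shrunk = 0.7 * sample_cov + 0.3 * (np.trace(sample_cov) / 50) * np.eye(50)

      w = gmv_weights(shrunk)
      print(f"weights sum = {w.sum():.2f}, portfolio var = {w @ shrunk @ w:.2e}")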
  22. By: Brune, Niclas; Vetter, Oliver A.; Walter, Phillip; Buxmann, Peter
    Abstract: The paper reports the results of a design science research study that develops design principles for information systems (IS) supporting environmental sensemaking by generative artificial intelligence (AI) in the use case of a personal assistant for sustainability reporting. We identify initial design principles based on the concept of sensemaking and related prior research on generative AI assistants. Afterward, we revise the design principles in two rounds of developing, demonstrating, and evaluating a prototypical implementation. Through the second round, we incorporate the knowledge of experts about the requirements of small and medium-sized organizations for which the technology stands to be particularly valuable. We thus contribute to research on incorporating generative AI-powered IS to foster corporate sensemaking, a process crucial for organizations’ sustainability reporting activities and climate change mitigation practices.
    Date: 2025–06
    URL: https://d.repec.org/n?u=RePEc:dar:wpaper:155719
  23. By: Coppola, Antonio; Clayton, Christopher
    Abstract: We examine whether and how granular, real-time predictive models should be integrated into central banks' macroprudential toolkit. First, we develop a tractable framework that formalizes the tradeoff regulators face when choosing between implementing models that forecast systemic risk accurately but have uncertain causal content and models with the opposite profile. We derive the regulator’s optimal policy in a setting in which private portfolios react endogenously to the regulator's model choice and policy rule. We show that even purely predictive models can generate welfare gains for a regulator, and that predictive precision and knowledge of causal impacts of policy interventions are complementary. Second, we introduce a deep learning architecture tailored to financial holdings data—a graph transformer—and we discuss why it is optimally suited to this problem. The model learns vector embedding representations for both assets and investors by explicitly modeling the relational structure of holdings, and it attains state-of-the-art predictive accuracy in out-of-sample forecasting tasks including trade prediction.
    Date: 2025–07–25
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:xwsje_v1
  24. By: Dhanashekar Kandaswamy; Ashutosh Sahoo; Akshay SP; Gurukiran S; Parag Paul; Girish G N
    Abstract: As decentralized finance (DeFi) evolves, distinguishing between user behaviors - liquidity provision versus active trading - has become vital for risk modeling and on-chain reputation. We propose a behavioral scoring framework for Uniswap that assigns two complementary scores: a Liquidity Provision Score that assesses strategic liquidity contributions, and a Swap Behavior Score that reflects trading intent, volatility exposure, and discipline. The scores are constructed using rule-based blueprints that decompose behavior into volume, frequency, holding time, and withdrawal patterns. To handle edge cases and learn feature interactions, we introduce a deep residual neural network with densely connected skip blocks inspired by the U-Net architecture. We also incorporate pool-level context such as total value locked (TVL), fee tiers, and pool size, allowing the system to differentiate similar user behaviors across pools with varying characteristics. Our framework enables context-aware and scalable DeFi user scoring, supporting improved risk assessment and incentive design. Experiments on Uniswap v3 data show its usefulness for user segmentation and protocol-aligned reputation systems. Although we refer to our metric as zScore, it is independently developed and methodologically different from the cross-protocol system proposed by Udupi et al. Our focus is on role-specific behavioral modeling within Uniswap using blueprint logic and supervised learning.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.20494
  25. By: Microsoft Copilot; Stephen E. Spear
    Abstract: This paper extends Spear (2003) by replacing human agents with artificial intelligence (AI) entities that derive utility solely from electricity consumption. These AI agents must prepay for electricity using cryptocurrency, and the verification of these transactions requires a fixed amount of electricity. As a result, the agents must strategically allocate electricity resources between consumption and payment verification. This paper analyzes the equilibrium outcomes of such a system and discusses the implications of AI-driven energy markets.
    Date: 2025–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2505.14612
  26. By: Jasper Rou
    Abstract: In this research, we explore neural network-based methods for pricing multidimensional American put options under the Black-Scholes and Heston models, extending up to five dimensions. We focus on two approaches: the Time Deep Gradient Flow (TDGF) method and the Deep Galerkin Method (DGM). We extend the TDGF method to handle the free-boundary partial differential equation inherent in American options. We carefully design the sampling strategy during training to enhance performance. Both TDGF and DGM achieve high accuracy while outperforming conventional Monte Carlo methods in terms of computational speed. In particular, TDGF tends to be faster during training than DGM.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.17606
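    Sketch (editor's illustration): a conventional Monte Carlo baseline of the kind referred to above is Longstaff-Schwartz least-squares Monte Carlo; a one-dimensional Black-Scholes version (the paper's TDGF/DGM networks are not reproduced):

      import numpy as np

      def american_put_lsm(s0, k, r, sigma, t, steps=50, paths=100_000, seed=0):
          """Longstaff-Schwartz price of an American put under Black-Scholes."""
          rng = np.random.default_rng(seed)
          dt = t / steps
          z = rng.standard_normal((paths, steps))
          # Geometric Brownian motion paths at times dt, 2*dt, ..., t.
          s = s0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                                    + sigma * np.sqrt(dt) * z, axis=1))
          payoff = np.maximum(k - s, 0.0)
          cash = payoff[:, -1].copy()            # value if held to maturity
          for i in range(steps - 2, -1, -1):
              cash *= np.exp(-r * dt)            # discount one step back
              itm = payoff[:, i] > 0             # regress on in-the-money paths
              coef = np.polyfit(s[itm, i], cash[itm], 2)
              cont = np.polyval(coef, s[itm, i])
              ex = np.where(itm)[0][payoff[itm, i] > cont]
              cash[ex] = payoff[ex, i]           # exercise where immediate > continuation
          return np.exp(-r * dt) * cash.mean()

      print(f"American put (LSM) ~ {american_put_lsm(100, 100, 0.05, 0.2, 1.0):.3f}")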
  27. By: Rachel Cho; Christoph Görtz; Danny McGowan; Max Schröder
    Abstract: We propose a new approach to identify firm-level financial constraints by applying artificial intelligence to text of 10-K filings by U.S. public firms from 1993 to 2021. Leveraging transformer-based natural language processing, our model captures contextual and semantic nuances often missed by traditional text classification techniques, enabling more accurate detection of financial constraints. A key contribution is to differentiate between constraints that affect firms presently and those anticipated in the future. These two types of constraints are associated with distinctly different financial profiles: while firms expecting future constraints tend to accumulate cash preemptively, currently constrained firms exhibit reduced liquidity and higher leverage. We show that only firms anticipating financial constraints exhibit significant cash flow sensitivity of cash, whereas currently constrained and unconstrained firms do not. This calls for a narrower interpretation of this widely used cash-based constraints measure, as it may conflate distinct firm types – unconstrained and currently constrained – and fail to capture all financially constrained firms. Our findings underscore the critical role of constraint timing in shaping corporate financial behavior.
    Keywords: financial constraints, artificial intelligence, expectations, cash, cash flow, corporate finance behavior
    JEL: G31 G32 D92
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ces:ceswps:_12054
  28. By: Hanwool Lee; Sara Yu; Yewon Hwang; Jonghyun Choi; Heejae Ahn; Sungbum Jung; Youngjae Yu
    Abstract: General-purpose sentence embedding models often struggle to capture specialized financial semantics, especially in low-resource languages like Korean, due to domain-specific jargon, temporal meaning shifts, and misaligned bilingual vocabularies. To address these gaps, we introduce NMIXX (Neural eMbeddings for Cross-lingual eXploration of Finance), a suite of cross-lingual embedding models fine-tuned with 18.8K high-confidence triplets that pair in-domain paraphrases, hard negatives derived from a semantic-shift typology, and exact Korean-English translations. Concurrently, we release KorFinSTS, a 1,921-pair Korean financial STS benchmark spanning news, disclosures, research reports, and regulations, designed to expose nuances that general benchmarks miss. When evaluated against seven open-license baselines, NMIXX's multilingual bge-m3 variant achieves Spearman's rho gains of +0.10 on English FinSTS and +0.22 on KorFinSTS, outperforming its pre-adaptation checkpoint and surpassing other models by the largest margin, while revealing a modest trade-off in general STS performance. Our analysis further shows that models with richer Korean token coverage adapt more effectively, underscoring the importance of tokenizer design in low-resource, cross-lingual settings. By making both models and the benchmark publicly available, we provide the community with robust tools for domain-adapted, multilingual representation learning in finance.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.09601
  29. By: Michael Kearns; Aaron Roth; Emily Ryu
    Abstract: We study a distributed learning problem in which learning agents are embedded in a directed acyclic graph (DAG). There is a fixed and arbitrary distribution over feature/label pairs, and each agent or vertex in the graph is able to directly observe only a subset of the features -- potentially a different subset for every agent. The agents learn sequentially in some order consistent with a topological sort of the DAG, committing to a model mapping observations to predictions of the real-valued label. Each agent observes the predictions of their parents in the DAG, and trains their model using both the features of the instance that they directly observe, and the predictions of their parents as additional features. We ask when this process is sufficient to achieve information aggregation, in the sense that some agent in the DAG is able to learn a model whose error is competitive with the best model that could have been learned (in some hypothesis class) with direct access to all features, despite the fact that no single agent in the network has such access. We give upper and lower bounds for this problem for both linear and general hypothesis classes. Our results identify the depth of the DAG as the key parameter: information aggregation can occur over sufficiently long paths in the DAG, assuming that all of the relevant features are well represented along the path, and there are distributions over which information aggregation cannot occur even in the linear case, and even in arbitrarily large DAGs that do not have sufficient depth (such as a hub-and-spokes topology in which the spoke vertices collectively see all the features). We complement our theoretical results with a comprehensive set of experiments.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.09683
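    Sketch (editor's illustration): the aggregation mechanism on a simple path DAG, where each agent regresses on its own features plus its parent's prediction; in this noiseless linear toy the final agent recovers the full-information model:

      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(0)
      n, d = 2000, 6
      X = rng.normal(size=(n, d))
      y = X @ np.array([1.0, -2.0, 0.5, 3.0, -1.0, 2.0])  # label uses all features

      # Path DAG: agent 0 -> agent 1 -> agent 2, each observing two features.
      feature_sets = [[0, 1], [2, 3], [4, 5]]
      pred = np.empty((n, 0))                  # the first agent has no parent
      for feats in feature_sets:
          inputs = np.hstack([X[:, feats], pred])
          pred = LinearRegression().fit(inputs, y).predict(inputs).reshape(-1, 1)

      # With sufficient depth and all features represented along the path,
      # the last agent's error approaches the full-information optimum.
      print(f"last agent MSE = {np.mean((pred.ravel() - y) ** 2):.6f}")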
  30. By: Christoph Engel (Max Planck Institute for Research on Collective Goods, Bonn); Yoan Hermstrüwer (University of Zurich); Alison Kim (University of Zurich)
    Abstract: Recent advances in AI create possibilities for delegating legal decision-making to machines or enhancing human adjudication through AI assistance. Using classic normative conflicts - the trolley problem and similar moral dilemmas - as a proof of concept, we examine the alignment between AI legal reasoning and human judgment. In our baseline experiment, we find a pronounced mismatch between decisions made by GPT and those of human subjects. This misalignment raises substantive concerns for AI-powered legal decision-aids. We investigate whether explicit normative guidance can address this misalignment, with mixed results. GPT-3.5 is susceptible to such intervention, but frequently refuses to decide when faced with a moral dilemma. GPT-4 is outright utilitarian, and essentially ignores the instruction to decide on deontological grounds. GPT-o3-mini faithfully implements this instruction, but is unwilling to balance deontological and utilitarian concerns if instructed to do so. At least for the time being, explicit normative instructions are not fully able to realign AI advice with the normative convictions of the legislator.
    Keywords: large language models, human-AI alignment, rule of law, moral dilemmas, trolley problems
    JEL: C99 D63 D81 K10 K40 Z13
    Date: 2025–04
    URL: https://d.repec.org/n?u=RePEc:mpg:wpaper:2025_03
  31. By: Atta Ul Mustafa
    Abstract: This study develops a conceptual simulation model for a tokenized recycling incentive system that integrates blockchain infrastructure, market-driven pricing, behavioral economics, and carbon credit mechanisms. The model aims to address the limitations of traditional recycling systems, which often rely on static government subsidies and fail to generate sustained public participation. By introducing dynamic token values linked to real-world supply and demand conditions, as well as incorporating non-monetary behavioral drivers (e.g., social norms, reputational incentives), the framework creates a dual-incentive structure that can adapt over time. The model uses Monte Carlo simulations to estimate outcomes under a range of scenarios involving operational costs, carbon pricing, token volatility, and behavioral adoption rates. Due to the absence of real-world implementations of such integrated blockchain-based recycling systems, the paper remains theoretical and simulation-based. It is intended as a prototype framework for future policy experimentation and pilot projects. The model provides insights for policymakers, urban planners, and technology developers aiming to explore decentralized and market-responsive solutions to sustainable waste management. Future work should focus on validating the model through field trials or behavioral experiments.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.19901
  32. By: Antonino Castelli; Paolo Giudici; Alessandro Piergallini
    Abstract: The rapid growth of crypto markets has opened new opportunities for investors, but at the same time exposed them to high volatility. To address the challenge of managing dynamic portfolios in such an environment, this paper presents a practical application of a multi-agent system designed to autonomously construct and evaluate crypto-asset allocations. Using daily data on the ten most capitalized cryptocurrencies from 2020 to 2025, we compare two automated investment strategies: a static equal weighting strategy and a rolling-window optimization strategy, both implemented to maximize Modern Portfolio Theory (MPT) evaluation metrics such as expected return and the Sharpe and Sortino ratios, while minimizing volatility. Each step of the process is handled by dedicated agents, integrated through a collaborative architecture in Crew AI. The results show that the dynamic optimization strategy achieves significantly better performance in terms of risk-adjusted returns, both in-sample and out-of-sample. This highlights the benefits of adaptive techniques in portfolio management, particularly in volatile markets such as cryptocurrency markets. The proposed methodology also demonstrates how multi-agent systems can provide scalable, auditable, and flexible solutions in financial automation.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.20468
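    Sketch (editor's illustration): the two risk-adjusted metrics named above, computed from daily returns; the 365-day annualization reflects round-the-clock crypto trading, and the return series is simulated:

      import numpy as np

      def sharpe(returns, rf=0.0, periods=365):
          """Annualized Sharpe ratio from per-period returns."""
          excess = returns - rf / periods
          return np.sqrt(periods) * excess.mean() / excess.std(ddof=1)

      def sortino(returns, rf=0.0, periods=365):
          """Like Sharpe, but penalizes only downside deviation."""
          excess = returns - rf / periods
          downside = np.minimum(excess, 0.0)
          return np.sqrt(periods) * excess.mean() / np.sqrt(np.mean(downside**2))

      rng = np.random.default_rng(0)
      daily = rng.normal(0.001, 0.04, size=365)   # toy daily crypto returns
      print(f"Sharpe = {sharpe(daily):.2f}, Sortino = {sortino(daily):.2f}")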
  33. By: Andrey Sarantsev; Angel Piotrowski; Ian Anderson
    Abstract: We create a dynamic stochastic general equilibrium model for annual returns of three asset classes: the US Standard & Poor's (S&P) stock index, the international stock index, and the US Bank of America investment-grade corporate bond index. Using this, we built an online financial app simulating the wealth process, including options for regular withdrawals and contributions. The four factors are: S&P volatility and earnings, the corporate BAA rate, and the long-short Treasury bond spread. Our valuation measure is an improvement of Shiller's cyclically adjusted price-earnings ratio. We use classic linear regression models and make the residuals white noise by dividing by annual volatility. We use multivariate kernel density estimation for the residuals. We state and prove long-term stability results.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.06010

This nep-cmp issue is ©2025 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.