on Forecasting
By: | Gupta, Abhijit |
Abstract: | Commodity futures price volatility creates significant economic challenges, necessitating accurate multi-horizon forecasting. Predicting these prices is complicated by diverse interacting factors (macroeconomic, supply/demand, geopolitical). Current models often lack transparency, limiting strategic use. This paper presents a Regularized Sparse Autoencoder (RSAE), a deep learning framework for simultaneous multi-horizon commodity futures prediction and discovery of interpretable latent market drivers. The RSAE forecasts prices at multiple horizons (e.g., 1-day, 1-week, 1-month) using multivariate time series. L1 regularization on its latent vector enforces sparsity, promoting parsimonious explanations of market dynamics through learned factors representing underlying drivers (e.g., demand shifts, supply shocks). Drawing from energy-based models and sparse coding, the RSAE optimizes predictive accuracy while learning sparse representations. Evaluated on historical Copper and Crude Oil futures data with numerous indicators, our findings suggest the RSAE offers competitive multi-horizon forecasting accuracy and data-driven insights into price dynamics via its interpretable latent space, a notable advantage over traditional black-box approaches. |
Date: | 2025–05–10 |
URL: | https://d.repec.org/n?u=RePEc:osf:osfxxx:4rzky_v1 |
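A minimal sketch of the kind of L1-regularized latent forecaster described in the RSAE abstract above, written in PyTorch. The layer sizes, horizons, and the sparsity weight `lam` are illustrative assumptions, not the authors' specification.

```python
# Sketch only: multi-horizon forecaster with an L1 penalty on the latent code.
import torch
import torch.nn as nn

class SparseLatentForecaster(nn.Module):
    def __init__(self, n_features, latent_dim=8, horizons=(1, 5, 21)):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        # one prediction head per forecast horizon (e.g. 1-day, 1-week, 1-month)
        self.heads = nn.ModuleList([nn.Linear(latent_dim, 1) for _ in horizons])

    def forward(self, x):
        z = self.encoder(x)                       # sparse latent "market drivers"
        preds = [head(z) for head in self.heads]  # one forecast per horizon
        return z, preds

def loss_fn(z, preds, targets, lam=1e-3):
    # predictive accuracy across horizons plus an L1 sparsity penalty on the latent vector
    mse = sum(nn.functional.mse_loss(p.squeeze(-1), t) for p, t in zip(preds, targets))
    return mse + lam * z.abs().mean()
```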
By: | Bańbura, Marta; Bobeica, Elena; Giammaria, Alessandro; Porqueddu, Mario; van Spronsen, Josha |
Abstract: | Energy inflation is a major source of headline inflation volatility and forecast errors; it is therefore critical to model it accurately. This paper introduces a novel suite of Bayesian VAR models for euro area HICP energy inflation, which adopts a granular, bottom-up approach – disaggregating energy into subcomponents, such as fuels, gas, and electricity. The suite incorporates key features for energy prices: stochastic volatility, outlier correction, high-frequency indicators, and pre-tax price modelling. These characteristics enhance both in-sample explanatory power and forecast accuracy. Compared to standard benchmarks and official projections, our BVARs achieve better forecasting performance, particularly beyond the very short term. The suite also captures a sizable variation in the impact of commodity price shocks, pointing to higher elasticities at higher levels of commodity prices. Beyond forecasting, our framework is also useful for scenario and sensitivity analysis as an effective tool to gauge risks, which is especially relevant amid ongoing energy market transformations. JEL Classification: C32, C53, E31, E37 |
Keywords: | Bayesian VAR, gas prices, HICP, oil prices |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:ecb:ecbwps:20253062 |
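To illustrate the bottom-up aggregation idea in the abstract above: subcomponent forecasts are combined into headline energy inflation using expenditure weights. The weights and forecast numbers below are made up for illustration and are not taken from the paper.

```python
# Sketch: aggregate subcomponent forecasts into headline energy inflation.
import numpy as np

weights = {"fuels": 0.45, "gas": 0.25, "electricity": 0.30}   # assumed HICP weights
component_forecasts = {                                        # % y/y over three horizons (made up)
    "fuels":       np.array([2.1, 1.8, 1.5]),
    "gas":         np.array([4.0, 3.2, 2.6]),
    "electricity": np.array([1.2, 1.1, 1.0]),
}

headline_energy = sum(weights[c] * component_forecasts[c] for c in weights)
print(headline_energy)   # weighted aggregate of the subcomponent forecasts
```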
By: | Wishnu Badrawani |
Abstract: | This paper evaluates the performance of prominent machine learning (ML) algorithms in predicting Indonesia's inflation using the payment system, capital market, and macroeconomic data. We compare the forecasting performance of each ML model, namely shrinkage regression, ensemble learning, and support vector regression, to that of the univariate time series ARIMA and SARIMA models. We examine various out-of-bag sample periods in each ML model to determine the appropriate data-splitting ratios for the regression case study. This study indicates that all ML models produced lower RMSEs and reduced average forecast errors by 45.16 percent relative to the ARIMA benchmark, with the Extreme Gradient Boosting model outperforming other ML models and the benchmark. Using the Shapley value, we discovered that numerous payment system variables significantly predict inflation. We explore the ML forecast using local Shapley decomposition and show the relationship between the explanatory variables and inflation for interpretation. The interpretation of the ML forecast highlights some significant findings and offers insightful recommendations, enhancing previous economic research that uses a more established econometric method. Our findings advocate ML models as supplementary tools for the central bank to predict inflation and support monetary policy. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.10369 |
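A short sketch of the interpretation step described above: fit a gradient boosting model on payment-system and macro indicators, then decompose predictions into feature contributions with Shapley values. The file name and column names are hypothetical; it assumes the `xgboost` and `shap` packages are installed.

```python
# Sketch: Shapley-value decomposition of a gradient boosting inflation forecast.
import pandas as pd
import xgboost as xgb
import shap

df = pd.read_csv("inflation_panel.csv")            # hypothetical dataset
X = df.drop(columns=["inflation"])
y = df["inflation"]

model = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X, y)

explainer = shap.TreeExplainer(model)              # Shapley values for tree ensembles
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)                  # global importance of the predictors
```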
By: | Stephane Hess; Sander van Cranenburgh |
Abstract: | Travel behaviour modellers have an increasingly diverse set of models at their disposal, ranging from traditional econometric structures to models from mathematical psychology and data-driven approaches from machine learning. A key question arises as to how well these different models perform in prediction, especially when considering trips of different characteristics from those used in estimation, i.e. out-of-distribution prediction, and whether better predictions can be obtained by combining insights from the different models. Across two case studies, we show that while data-driven approaches excel in predicting mode choice for trips within the distance bands used in estimation, beyond that range, the picture is fuzzy. To leverage the relative advantages of the different model families and capitalise on the notion that multiple 'weak' models can result in more robust models, we put forward the use of a model averaging approach that allocates weights to different model families as a function of the distance between the characteristics of the trip for which predictions are made, and those used in model estimation. Overall, we see that the model averaging approach gives larger weight to models with stronger behavioural or econometric underpinnings the more we move outside the interval of trip distances covered in estimation. Across both case studies, we show that our model averaging approach obtains improved performance both on the estimation and validation data, and crucially also when predicting mode choices for trips of distances outside the range used in estimation. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.03693 |
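A minimal sketch of the distance-dependent model averaging idea described above: the further a trip lies outside the range of distances seen in estimation, the more weight shifts from the data-driven model to the econometric one. The exponential weighting function, its steepness, and the distance bounds are illustrative assumptions, not the authors' specification.

```python
# Sketch: distance-dependent weights for combining two mode-choice models.
import numpy as np

def averaging_weights(trip_km, est_min_km, est_max_km, steepness=0.5):
    # how far the trip falls outside the estimation interval (0 if inside)
    gap = max(est_min_km - trip_km, trip_km - est_max_km, 0.0)
    w_ml = np.exp(-steepness * gap)       # data-driven weight decays with the gap
    return {"ml": w_ml, "econometric": 1.0 - w_ml}

def averaged_prob(trip_km, p_ml, p_econ, est_min_km=1.0, est_max_km=30.0):
    w = averaging_weights(trip_km, est_min_km, est_max_km)
    return w["ml"] * p_ml + w["econometric"] * p_econ   # mixed mode-choice probability

print(averaged_prob(5.0,  p_ml=0.70, p_econ=0.55))   # inside the range: mostly data-driven
print(averaged_prob(60.0, p_ml=0.70, p_econ=0.55))   # far outside: mostly econometric
```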
By: | Haoyuan Wang; Chen Liu; Minh-Ngoc Tran; Chao Wang |
Abstract: | This paper introduces a novel multivariate volatility modeling framework, named Long Short-Term Memory enhanced BEKK (LSTM-BEKK), that integrates deep learning into multivariate GARCH processes. By combining the flexibility of recurrent neural networks with the econometric structure of BEKK models, our approach is designed to better capture nonlinear, dynamic, and high-dimensional dependence structures in financial return data. The proposed model addresses key limitations of traditional multivariate GARCH-based methods, particularly in capturing persistent volatility clustering and asymmetric co-movement across assets. Leveraging the data-driven nature of LSTMs, the framework adapts effectively to time-varying market conditions, offering improved robustness and forecasting performance. Empirical results across multiple equity markets confirm that the LSTM-BEKK model achieves superior performance in terms of out-of-sample portfolio risk forecast, while maintaining the interpretability from the BEKK models. These findings highlight the potential of hybrid econometric-deep learning models in advancing financial risk management and multivariate volatility forecasting. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.02796 |
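For reference, a sketch of the BEKK(1,1) covariance recursion that the LSTM-BEKK model above builds on. The paper's LSTM component adds a data-driven, time-varying term; here it appears only as a placeholder matrix `D_t`, which is an assumption rather than the authors' exact specification.

```python
# Sketch: BEKK(1,1) recursion with a placeholder for a neural, time-varying term.
import numpy as np

def bekk_step(H_prev, eps_prev, C, A, B, D_t=None):
    # H_t = C C' + A' e_{t-1} e_{t-1}' A + B' H_{t-1} B  (+ optional LSTM-driven term)
    H = C @ C.T + A.T @ np.outer(eps_prev, eps_prev) @ A + B.T @ H_prev @ B
    if D_t is not None:
        H = H + D_t @ D_t.T          # placeholder for the data-driven component
    return H

n = 3
C = np.tril(np.random.rand(n, n) * 0.1)   # lower-triangular intercept
A = np.eye(n) * 0.3                       # shock loading
B = np.eye(n) * 0.9                       # persistence
H = np.eye(n) * 0.02
for eps in np.random.randn(100, n) * 0.01:   # simulated daily returns
    H = bekk_step(H, eps, C, A, B)
```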
By: | Junzhe Jiang; Chang Yang; Xinrun Wang; Bo Li |
Abstract: | Stock market indices serve as fundamental market measurements that quantify systematic market dynamics. However, accurate index price prediction remains challenging, primarily because existing approaches treat indices as isolated time series and frame the prediction as a simple regression task. These methods fail to capture indices' inherent nature as aggregations of constituent stocks with complex, time-varying interdependencies. To address these limitations, we propose Cubic, a novel end-to-end framework that explicitly models the adaptive fusion of constituent stocks for index price prediction. Our main contributions are threefold. i) Fusion in the latent space: we introduce the fusion mechanism over the latent embedding of the stocks to extract the information from the vast number of stocks. ii) Binary encoding classification: since regression tasks are challenging due to continuous value estimation, we reformulate the regression as a classification task, in which the target value is converted to binary and the prediction of each digit is optimized with a cross-entropy loss. iii) Confidence-guided prediction and trading: we introduce a regularization loss to address market prediction uncertainty for index prediction and design rule-based trading policies based on the confidence. Extensive experiments across multiple stock markets and indices demonstrate that Cubic consistently outperforms state-of-the-art baselines in stock index prediction tasks, achieving superior performance on both forecasting accuracy metrics and downstream trading profitability. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.03153 |
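A small sketch of the binary-encoding idea described in the abstract above: quantize the continuous target, write it in binary, and train one binary classifier per digit with cross-entropy. The number of bits and the value range below are illustrative assumptions; the encoding/decoding round-trip is what the sketch demonstrates.

```python
# Sketch: converting a continuous index level to per-digit binary targets and back.
import numpy as np

def encode_target(price, lo, hi, n_bits=10):
    # map price into [0, 2^n_bits - 1] and take its binary digits as 0/1 labels
    level = int(round((price - lo) / (hi - lo) * (2 ** n_bits - 1)))
    level = min(max(level, 0), 2 ** n_bits - 1)
    return np.array([(level >> b) & 1 for b in reversed(range(n_bits))])

def decode_target(bits, lo, hi):
    level = int("".join(str(int(b)) for b in bits), 2)
    return lo + level / (2 ** len(bits) - 1) * (hi - lo)

bits = encode_target(4321.7, lo=3000.0, hi=5000.0)   # per-digit classification targets
print(bits, decode_target(bits, 3000.0, 5000.0))     # decodes back to roughly 4321.7
```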
By: | Feliks Bańka; Jarosław A. Chudziak
Abstract: | Accurate option pricing is essential for effective trading and risk management in financial markets, yet it remains challenging due to market volatility and the limitations of traditional models like Black-Scholes. In this paper, we investigate the application of the Informer neural network for option pricing, leveraging its ability to capture long-term dependencies and dynamically adjust to market fluctuations. This research contributes to the field of financial forecasting by introducing Informer's efficient architecture to enhance prediction accuracy and provide a more adaptable and resilient framework compared to existing methods. Our results demonstrate that Informer outperforms traditional approaches in option pricing, advancing the capabilities of data-driven financial forecasting in this domain. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.05565 |
By: | Frantisek Brazdik; Karel Musil; Tomas Pokorny; Tomas Sestorad; Jaromir Tonner; Jan Zacek |
Abstract: | We present the upgraded version of g3+, the Czech National Bank's core forecasting model, which became operational in April 2024, and summarize its additional modifications over 2024. This paper outlines the innovative features of the model and the motivations behind their adoption. The enhancements also reflect the period from 2020 to 2022, which was marked by extraordinary events such as the Covid-19 pandemic and a significant surge in energy commodity prices. The upgraded g3+ now includes, among other features, the endogenous decomposition of foreign economic activity into gap and trend components, a refined structure of foreign producer prices, and adjusted links between foreign and domestic economies. In addition, several model parameters have been recalibrated to reflect current and anticipated economic conditions. The introduction of these model changes and parameter adjustments leads to improved forecasting performance relative to the previous version of the model. |
Keywords: | Conditional forecast, DSGE, energy, g3+ model, small open economy, two-country model |
JEL: | C51 C53 E27 E37 F41 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:cnb:wpaper:2025/7 |
By: | Tom Boot; Bart Keijsers |
Abstract: | We study the accuracy of forecasts in the diffusion index forecast model with possibly weak loadings. The default option to construct forecasts is to estimate the factors through principal component analysis (PCA) on the available predictor matrix, and use the estimated factors to forecast the outcome variable. Alternatively, we can directly relate the outcome variable to the predictors through either ridge regression or random projections. We establish that forecasts based on PCA, ridge regression and random projections are consistent for the conditional mean under the same assumptions on the strength of the loadings. However, under weaker loadings the convergence rate is lower for ridge and random projections if the time dimension is small relative to the cross-section dimension. We assess the relevance of these findings in an empirical setting by comparing relative forecast accuracy for monthly macroeconomic and financial variables using different window sizes. The findings support the theoretical results, and at the same time show that regularization-based procedures may be more robust in settings not covered by the developed theory. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.09575 |
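A brief sketch of the two forecasting routes compared in the paper above: (i) extract principal-component factors from the predictor matrix and regress the target on them (the diffusion index forecast); (ii) regress the target directly on all predictors with a ridge penalty. The simulated data, number of factors, and penalty value are illustrative.

```python
# Sketch: PCA-factor (diffusion index) forecast vs. direct ridge forecast.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge

T, N, r = 200, 80, 3                       # time periods, predictors, number of factors
X = np.random.randn(T, N)
y = X[:, :5].mean(axis=1) + 0.1 * np.random.randn(T)

# (i) diffusion index: estimate factors by PCA, then regress y on the factors
F = PCA(n_components=r).fit_transform(X)
pca_forecast = LinearRegression().fit(F[:-1], y[1:]).predict(F[-1:])

# (ii) direct regularized regression on the full predictor set
ridge_forecast = Ridge(alpha=10.0).fit(X[:-1], y[1:]).predict(X[-1:])
print(pca_forecast, ridge_forecast)
```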
By: | Haochuan (Kevin) Wang
Abstract: | Cryptocurrency price dynamics are driven largely by microstructural supply-demand imbalances in the limit order book (LOB), yet the highly noisy nature of LOB data complicates the signal extraction process. Prior research has demonstrated that deep-learning architectures can yield promising predictive performance on pre-processed equity and futures LOB data, but they often treat model complexity as an unqualified virtue. In this paper, we aim to examine whether adding extra hidden layers or parameters to "black-box-ish" neural networks genuinely enhances short-term price forecasting, or whether gains are primarily attributable to data preprocessing and feature engineering. We benchmark a spectrum of models, from interpretable baselines (logistic regression, XGBoost) to deep architectures (DeepLOB, Conv1D+LSTM), on BTC/USDT LOB snapshots sampled at 100 ms to multi-second intervals using publicly available Bybit data. We introduce two data filtering pipelines (Kalman, Savitzky-Golay) and evaluate both binary (up/down) and ternary (up/flat/down) labeling schemes. Our analysis compares models on out-of-sample accuracy, latency, and robustness to noise. Results reveal that, with data preprocessing and hyperparameter tuning, simpler models can match and even exceed the performance of more complex networks, offering faster inference and greater interpretability. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.05764 |
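A sketch of the preprocessing and labeling steps mentioned above: smooth the mid-price series with a Savitzky-Golay filter, then assign ternary up/flat/down labels from the smoothed forward return. The window length, polynomial order, look-ahead horizon, and flat-return threshold are illustrative assumptions.

```python
# Sketch: Savitzky-Golay smoothing and ternary labeling of a mid-price series.
import numpy as np
from scipy.signal import savgol_filter

mid_price = np.cumsum(np.random.randn(1000)) + 50_000.0   # simulated mid-price path
smoothed = savgol_filter(mid_price, window_length=21, polyorder=3)

horizon, flat_band = 10, 1e-4                 # look-ahead steps, "flat" return threshold
fwd_ret = smoothed[horizon:] / smoothed[:-horizon] - 1.0
labels = np.where(fwd_ret > flat_band, 1, np.where(fwd_ret < -flat_band, -1, 0))
print(np.bincount(labels + 1))                # counts of down / flat / up labels
```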
By: | Michal Franta; Jan Vlcek |
Abstract: | Inflation at Risk provides a coherent description of the risks associated with an inflation outlook. This paper explores the practical applicability of this approach in central banks. The method is applied to Czech inflation to highlight issues related to a short data sample. A set of quantile regressions with a non-crossing quantiles constraint is estimated using monthly data from the year 2000 onwards, and the model's in-sample fit and out-of-sample forecasting performance are then assessed. Furthermore, we discuss the Inflation at Risk estimates in the context of several historical events and demonstrate how the approach can inform monetary policy. The estimation results suggest the presence of nonlinearities in the Czech inflation process, which are related to supply-side pressures. In addition, it appears that regime changes have occurred recently. |
Keywords: | Inflation dynamics, inflation risk, quantile regressions |
JEL: | E31 E37 E52 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:cnb:wpaper:2025/8 |
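A minimal sketch of the Inflation-at-Risk idea above: estimate conditional inflation quantiles from a small set of predictors. For brevity the sketch fits independent quantile regressions; the paper imposes a non-crossing constraint, which requires joint estimation. The variables and data below are hypothetical.

```python
# Sketch: conditional inflation quantiles via quantile regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "infl_ahead": np.random.randn(300),     # inflation h months ahead (placeholder data)
    "infl":       np.random.randn(300),     # current inflation
    "ppi":        np.random.randn(300),     # supply-side pressure proxy
})

quantiles = [0.05, 0.25, 0.50, 0.75, 0.95]
fits = {q: smf.quantreg("infl_ahead ~ infl + ppi", df).fit(q=q) for q in quantiles}
for q, res in fits.items():
    print(q, res.params.values)             # quantile-specific coefficients
```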
By: | Daniil Bargman |
Abstract: | This paper introduces a new least squares regression methodology called (C)LARX: a (constrained) latent variable autoregressive model with exogenous inputs. Two additional contributions are made as a side effect: First, a new matrix operator is introduced for matrices and vectors with blocks along one dimension; Second, a new latent variable regression (LVR) framework is proposed for economics and finance. The empirical section examines how well the stock market predicts real economic activity in the United States. (C)LARX models outperform the baseline OLS specification in out-of-sample forecasts and offer novel analytical insights about the underlying functional relationship. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.04488 |
By: | Heather Jane Ruberl; Remzi Baris Tercioglu; Adam Elderfield
Abstract: | Developing countries face uncertainties driven by global macroeconomic variables over which they have little to no control. Key exogenous factors faced by most developing countries include interest rates in high-income countries, commodity prices, global demand for exports, and remittance inflows. While these variables are sensitive to common global shocks, they also exhibit idiosyncratic fluctuations. This paper employs a Bayesian Vector Autoregression model to capture interdependencies of global variables and simulates global risks using the empirical joint distribution of global shocks as captured by joint Bayesian Vector Autoregression errors. The simulated shocks are then integrated into the World Bank’s macro-structural model to assess how a range of potential global disturbances could impact economic outcomes across countries. The methodology is applied to 115 countries, using the World Bank’s fall 2024 edition of the Macro-Poverty Outlook forecasts as a baseline. Although the individual country results are heterogeneous, the aggregate distribution of gross domestic product outcomes across the 115 countries suggests that global factors influence gross domestic product levels in individual developing countries by less than plus or minus 2 percent in most years, but by between 2 and 4 percent in about 3 in 10 years. |
Date: | 2025–05–27 |
URL: | https://d.repec.org/n?u=RePEc:wbk:wbrwps:11132 |
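A sketch of the simulation step described above: resample the joint residuals of the estimated VAR for the global variables so that simulated shocks preserve the empirical cross-variable dependence. The VAR estimation itself is omitted; `residuals` stands in for the estimated BVAR errors and is a placeholder here.

```python
# Sketch: jointly resampling VAR residuals to simulate correlated global shocks.
import numpy as np

rng = np.random.default_rng(0)
T, k = 120, 4                                  # sample length, number of global variables
residuals = rng.standard_normal((T, k))        # placeholder for estimated BVAR errors

n_sims, horizon = 1000, 8
draws = rng.integers(0, T, size=(n_sims, horizon))
simulated_shocks = residuals[draws]            # (n_sims, horizon, k); rows are drawn jointly
print(simulated_shocks.shape)
```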
By: | Harold D. Chiang; Jack Collison; Lorenzo Magnolfi; Christopher Sullivan |
Abstract: | This paper develops a flexible approach to predict the price effects of horizontal mergers using ML/AI methods. While standard merger simulation techniques rely on restrictive assumptions about firm conduct, we propose a data-driven framework that relaxes these constraints when rich market data are available. We develop and identify a flexible nonparametric model of supply that nests a broad range of conduct models and cost functions. To overcome the curse of dimensionality, we adapt the Variational Method of Moments (VMM) (Bennett and Kallus, 2023) to estimate the model, allowing for various forms of strategic interaction. Monte Carlo simulations show that our method significantly outperforms an array of misspecified models and rivals the performance of the true model, both in predictive performance and counterfactual merger simulations. As a way to interpret the economics of the estimated function, we simulate pass-through and reveal that the model learns markup and cost functions that imply approximately correct pass-through behavior. Applied to the American Airlines-US Airways merger, our method produces more accurate post-merger price predictions than traditional approaches. The results demonstrate the potential for machine learning techniques to enhance merger analysis while maintaining economic structure. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.05225 |
By: | Imad Talhartit (Université Hassan 1er [Settat], Ecole Nationale de Commerce et Gestion - Settat, Laboratory of Finance, Audit and Organizational Governance Research); Sanae Ait Jillali (Université Hassan 1er [Settat], Ecole Nationale de Commerce et Gestion - Settat, Laboratory of Finance, Audit and Organizational Governance Research); Mounime El Kabbouri (Université Hassan 1er [Settat], Ecole Nationale de Commerce et Gestion - Settat, Laboratory of Finance, Audit and Organizational Governance Research) |
Abstract: | In today's data-driven economy, predicting stock market behavior has become a key focus for both finance professionals and academics. Traditionally reliant on historical and economic data, stock price forecasting is now being enhanced by AI technologies, especially Deep Learning and Natural Language Processing (NLP), which allow the integration of qualitative data like news sentiment and investor opinions. Deep Learning uses multi-layered neural networks to analyze complex patterns, while NLP enables machines to interpret human language, making it useful for extracting sentiment from media sources. Though most research has focused on developed markets, emerging economies like Morocco offer a unique context due to their evolving financial systems and data limitations. This study takes a theoretical and exploratory approach, aiming to conceptually examine how macroeconomic indicators and sentiment analysis can be integrated using deep learning models to enhance stock price prediction in Morocco. Rather than building a model, the paper reviews literature, evaluates data sources, and identifies key challenges and opportunities. Ultimately, the study aims to bridge AI techniques with financial theory in an emerging market setting, providing a foundation for future empirical research and interdisciplinary collaboration. |
Keywords: | Stock Price Prediction, Deep Learning, Natural Language Processing (NLP), Sentiment Analysis, Macroeconomic Indicators, Emerging Markets, Moroccan Financial Market |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-05094029 |
By: | Fréchette, Guillaume R; Vespa, Emanuel; Yuksel, Sevgi |
Abstract: | Decision-makers sometimes rely on past data to learn statistical relationships between variables. However, when predicting a target variable, they must adjust how they aggregate past information depending on the observables available. If agents have information on all observables, it is optimal to understand how the observables jointly predict the target, while with only one observable, they should focus on the unconditional correlation. An experiment examining this process shows that predictions that require the use of unconditional correlations are more challenging for decision-makers. |
Keywords: | Economics, Economic Theory |
Date: | 2025–05–01 |
URL: | https://d.repec.org/n?u=RePEc:cdl:ucsdec:qt57x6d5sw |
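A small numerical illustration of the point made in the abstract above: when only one observable is available, the optimal weight on it is its unconditional (simple) regression coefficient, which differs from its coefficient in the joint regression whenever the observables are correlated. The data-generating process and numbers are made up.

```python
# Sketch: joint vs. unconditional regression weights with correlated observables.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x2 = rng.standard_normal(n)
x1 = 0.8 * x2 + rng.standard_normal(n)          # the two observables are correlated
y = 1.0 * x1 + 2.0 * x2 + rng.standard_normal(n)

joint = np.linalg.lstsq(np.column_stack([x1, x2]), y, rcond=None)[0]
simple = np.linalg.lstsq(x1.reshape(-1, 1), y, rcond=None)[0]
print(joint)    # roughly [1.0, 2.0]: correct weights when both observables are seen
print(simple)   # roughly [2.0]: the optimal weight when only x1 is observed
```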
By: | Konstantin Boss; Luigi Longo; Luca Onorante |
Abstract: | Using a state-of-the-art large language model, we extract forward-looking and context-sensitive signals related to inflation and unemployment in the euro area from millions of Reddit submissions and comments. We develop daily indicators that incorporate, in addition to posts, the social interaction among users. Our empirical results show consistent gains in out-of-sample nowcasting accuracy relative to daily newspaper sentiment and financial variables, especially in unusual times such as the (post-)COVID-19 period. We conclude that the application of AI tools to the analysis of social media, specifically Reddit, provides useful signals about inflation and unemployment in Europe at daily frequency and constitutes a useful addition to the toolkit available to economic forecasters and nowcasters. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.10546 |
By: | Hasan Fallahgoul |
Abstract: | Recent advances in machine learning have shown promising results for financial prediction using large, over-parameterized models. This paper provides theoretical foundations and empirical validation for understanding when and how these methods achieve predictive success. I examine three key aspects of high-dimensional learning in finance. First, I prove that within-sample standardization in Random Fourier Features implementations fundamentally alters the underlying Gaussian kernel approximation, replacing shift-invariant kernels with training-set dependent alternatives. Second, I derive sample complexity bounds showing when reliable learning becomes information-theoretically impossible under weak signal-to-noise ratios typical in finance. Third, VC-dimension analysis reveals that ridgeless regression's effective complexity is bounded by sample size rather than nominal feature dimension. Comprehensive numerical validation confirms these theoretical predictions, revealing systematic breakdown of claimed theoretical properties across realistic parameter ranges. These results show that when sample size is small and features are high-dimensional, observed predictive success is necessarily driven by low-complexity artifacts, not genuine high-dimensional learning. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.03780 |
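For concreteness, a sketch of the object studied in the paper above: Random Fourier Features approximate a Gaussian kernel via z(x) = sqrt(2/D) cos(Wx + b), and the final lines show the within-sample standardization step whose effect the paper analyses, which makes the implied kernel training-set dependent. The dimensions and kernel width are illustrative.

```python
# Sketch: Random Fourier Features and within-sample standardization.
import numpy as np

rng = np.random.default_rng(0)
n, d, D, gamma = 200, 15, 1000, 1.0            # samples, inputs, random features, kernel width

X = rng.standard_normal((n, d))
W = rng.standard_normal((d, D)) * np.sqrt(2 * gamma)   # frequencies for k(x,y)=exp(-gamma||x-y||^2)
b = rng.uniform(0, 2 * np.pi, D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)       # standard RFF map: Z Z' approximates the kernel

Z_std = (Z - Z.mean(axis=0)) / Z.std(axis=0)   # within-sample standardization
# Z_std Z_std' no longer approximates the shift-invariant Gaussian kernel
```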
By: | Issa Sugiura; Takashi Ishida; Taro Makino; Chieko Tazuke; Takanori Nakagawa; Kosuke Nakago; David Ha |
Abstract: | Financial analysis presents complex challenges that could leverage large language model (LLM) capabilities. However, the scarcity of challenging financial datasets, particularly for Japanese financial data, impedes academic innovation in financial analytics. As LLMs advance, this lack of accessible research resources increasingly hinders their development and evaluation in this specialized domain. To address this gap, we introduce EDINET-Bench, an open-source Japanese financial benchmark designed to evaluate the performance of LLMs on challenging financial tasks including accounting fraud detection, earnings forecasting, and industry prediction. EDINET-Bench is constructed by downloading annual reports from the past 10 years from Japan's Electronic Disclosure for Investors' NETwork (EDINET) and automatically assigning labels corresponding to each evaluation task. Our experiments reveal that even state-of-the-art LLMs struggle, performing only slightly better than logistic regression in binary classification for fraud detection and earnings forecasting. These results highlight significant challenges in applying LLMs to real-world financial applications and underscore the need for domain-specific adaptation. Our dataset, benchmark construction code, and evaluation code are publicly available to facilitate future research in finance with LLMs. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.08762 |