nep-for New Economics Papers
on Forecasting
Issue of 2025–09–08
24 papers chosen by
Malte Knüppel, Deutsche Bundesbank


  1. A Python Package to Assist Macroframework Forecasting: Concepts and Examples By Mr. Sakai Ando; Shuvam Das; Sultan Orazbayev
  2. Combining a Large Pool of Forecasts of Value-at-Risk and Expected Shortfall By James W. Taylor; Chao Wang
  3. FinCast: A Foundation Model for Financial Time-Series Forecasting By Zhuohang Zhu; Haodong Chen; Qiang Qu; Vera Chung
  4. Understanding How Exchange Rates are Perceived and How That Perception Affects Exchange Rate Forecasts By Yushi YOSHIDA
  5. Forecasting NYC Yellow Taxi Ridership Decline: A Time Series Analysis of Daily Passenger Counts (2017-2019) By Gaurav Singh
  6. Vector Autoregressive Models for Tax Forecasting By Susie McKenzie
  7. Forecasting Commodity Price Shocks Using Temporal and Semantic Fusion of Prices Signals and Agentic Generative AI Extracted Economic News By Mohammed-Khalil Ghali; Cecil Pang; Oscar Molina; Carlos Gershenson-Garcia; Daehan Won
  8. Binary Response Forecasting under a Factor-Augmented Framework By Tingting Cheng; Jiachen Cong; Fei Liu; Xuanbin Yang
  9. Is All the Information in the Price? LLM Embeddings versus the EMH in Stock Clustering By Bingyang Wang; Grant Johnson; Maria Hybinette; Tucker Balch
  10. Forecasting Binary Economic Events in Modern Mercantilism: Traditional methodologies coupled with PCA and K-means Quantitative Analysis of Qualitative Sentimental Data By Sebastian Kot
  11. Mitigating Distribution Shift in Stock Price Data via Return-Volatility Normalization for Accurate Prediction By Hyunwoo Lee; Jihyeong Jeon; Jaemin Hong; U Kang
  12. Estimation of the Unemployment Rate in Moldova: A Comparison of ARIMA and Machine Learning Models Including COVID-19 Pandemic Periods By Vîntu, Denis
  13. An Artificial Neural Network Experiment on the Prediction of the Unemployment Rate By Vîntu, Denis
  14. Prediction of linear fractional stable motions using codifference By Matthieu Garcin; Karl Sawaya; Thomas Valade
  15. Dissecting Medium-Term Growth Prospects for Asia By Ms. Natasha X Che; Mr. Federico J Diez; Anne Oeking; Weining Xin
  16. A Heterogeneous Spatiotemporal GARCH Model: A Predictive Framework for Volatility in Financial Networks By Atika Aouri; Philipp Otto
  17. Texas Service Sector Outlook Survey: Survey Methodology, Performance and Forecast Accuracy By Jesus Cañas; Emily Kerr; Diego Morales-Burnett
  18. Combined machine learning for stock selection strategy based on dynamic weighting methods By Lin Cai; Zhiyang He; Caiya Zhang
  19. Adaptive Alpha Weighting with PPO: Enhancing Prompt-Based LLM-Generated Alphas in Quant Trading By Qizhao Chen; Hiroaki Kawashima
  20. Dynamic Balance Sheet Simulation and Credit Default Prediction: A Stress Test Model for Colombian Firms By Diego Fernando Cuesta-Mora; Camilo Gómez
  21. Tracking the economy at high frequency By Freddy García-Albán; Juan Jarrín
  22. Modelling high frequency non-financial big time series with an application to jobless claims in Chile By Antoni Espasa; Guillermo Carlomagno
  23. An AI-powered Tool for Central Bank Business Liaisons: Quantitative Indicators and On-demand Insights from Firms By Nicholas Gray; Finn Lattimore; Kate McLoughlin; Callan Windsor
  24. Alternative Loss Function in Evaluation of Transformer Models By Jakub Michańków; Paweł Sakowski; Robert Ślepaczuk

  1. By: Mr. Sakai Ando; Shuvam Das; Sultan Orazbayev
    Abstract: In forecasting economic time series, statistical models often need to be complemented with a process to impose various constraints in a smooth manner. Systematically imposing constraints and retaining smoothness are important but challenging. Ando (2024) proposes a systematic approach, but a user-friendly package to implement it has not been developed. This paper addresses this gap by introducing a Python package, macroframe-forecast, that allows users to generate forecasts that are both smooth over time and consistent with user-specified constraints. We demonstrate the package’s functionality with two examples: forecasting US GDP and fiscal variables.
    Keywords: Forecast Reconciliation; Python Package; Macroframework
    Date: 2025–08–29
    URL: https://d.repec.org/n?u=RePEc:imf:imfwpa:2025/172
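The kind of constraint-consistent adjustment the package automates can be illustrated with a minimal sketch (hypothetical code, not the macroframe-forecast API): under a single linear adding-up constraint, the least-squares minimal adjustment spreads the gap equally across the component forecasts.

```python
def reconcile(forecasts, total):
    """Least-squares adjustment under one adding-up constraint:
    spread the gap equally so the components sum to the target."""
    gap = total - sum(forecasts)
    return [f + gap / len(forecasts) for f in forecasts]

# Hypothetical example: three fiscal components must sum to an envelope of 9.0.
components = reconcile([1.0, 2.0, 3.0], total=9.0)
```

The package additionally enforces smoothness over time; this sketch shows only the single-period constraint step.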
  2. By: James W. Taylor; Chao Wang
    Abstract: Value-at-risk (VaR) and expected shortfall (ES) have become widely used measures of risk for daily portfolio returns. As a result, many methods now exist for forecasting the VaR and ES. These include GARCH-based modelling, approaches involving quantile-based autoregressive models, and methods incorporating measures of realised volatility. When multiple forecasting methods are available, an alternative to method selection is forecast combination. In this paper, we consider the combination of a large pool of VaR and ES forecasts. As there have been few studies in this area, we implement a variety of new combining methods. Among simple methods, the large pool of forecasts leads us to use the median and mode in addition to the simple average. As a complement to the previously proposed performance-based weighted combinations, we use regularised estimation to limit the risk of overfitting due to the large number of weights. By viewing the forecasts of VaR and ES from each method as the bounds of an interval forecast, we are able to apply interval forecast combining methods from the decision analysis literature. These include different forms of trimmed mean, and a probability averaging method that involves a mixture of the probability distributions inferred from the VaR and ES forecasts. Among other methods, we consider smooth transition between two combining methods. Using six stock indices and a pool of 90 individual forecasting methods, we obtained particularly strong results for a trimmed mean approach, the probability averaging method, and performance-based weighted combining.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.16919
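One of the combining schemes studied above, the trimmed mean, is easy to sketch (illustrative values, not the paper's data): drop the most extreme forecasts on each side of the pool, then average the rest, which makes the combination robust to badly overfit members.

```python
def trimmed_mean(forecasts, trim_frac=0.2):
    """Drop the most extreme forecasts on each side, then average the rest."""
    xs = sorted(forecasts)
    k = int(len(xs) * trim_frac)
    kept = xs[k:len(xs) - k] if k > 0 else xs
    return sum(kept) / len(kept)

# Hypothetical pool of 1%-VaR forecasts (positive loss quantiles) for one day;
# one badly overfit model produces the outlier 9.9.
var_pool = [2.1, 2.3, 2.4, 2.6, 9.9]
combined_var = trimmed_mean(var_pool)
```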
  3. By: Zhuohang Zhu; Haodong Chen; Qiang Qu; Vera Chung
    Abstract: Financial time-series forecasting is critical for maintaining economic stability, guiding informed policymaking, and promoting sustainable investment practices. However, it remains challenging due to various underlying pattern shifts. These shifts arise primarily from three sources: temporal non-stationarity (distribution changes over time), multi-domain diversity (distinct patterns across financial domains such as stocks, commodities, and futures), and varying temporal resolutions (patterns differing across per-second, hourly, daily, or weekly indicators). While recent deep learning methods attempt to address these complexities, they frequently suffer from overfitting and typically require extensive domain-specific fine-tuning. To overcome these limitations, we introduce FinCast, the first foundation model specifically designed for financial time-series forecasting, trained on large-scale financial datasets. Remarkably, FinCast exhibits robust zero-shot performance, effectively capturing diverse patterns without domain-specific fine-tuning. Comprehensive empirical and qualitative evaluations demonstrate that FinCast surpasses existing state-of-the-art methods, highlighting its strong generalization capabilities.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.19609
  4. By: Yushi YOSHIDA
    Abstract: People perceive the same level of nominal exchange rate as overvalued at one point in time and undervalued at a different point in time. To capture the perception of the exchange rate at specific times, we suggest constructing the perceived exchange rate by counting newspaper articles containing the phrases 'appreciated currency' or 'depreciated currency.' A shift in the perceived exchange rate (PER) index alters the dynamic response of exchange rates in time series. The PER index is a valid threshold variable in forecasting future exchange rates. The forecast model with the PER index as a threshold variable (PER TAR) outperforms models utilizing the lagged exchange rates as a threshold variable. We also show that the forecast precision of the PER TAR model is as good as the survey forecasts by market participants.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:eti:dpaper:25079
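A one-step threshold-autoregressive (TAR) forecast of the kind described above can be sketched in a few lines; the regime parameters and threshold here are hypothetical, not the paper's estimates.

```python
def tar_forecast(x_last, threshold_var, regime_hi, regime_lo, threshold=0.0):
    """One-step TAR forecast: the threshold variable (e.g. a PER-style index)
    selects which AR(1) regime, (intercept, slope), generates the forecast."""
    a, b = regime_hi if threshold_var > threshold else regime_lo
    return a + b * x_last

# Hypothetical regimes: stronger mean reversion when the index signals
# the currency is perceived as overvalued.
rate_hat = tar_forecast(x_last=150.0, threshold_var=0.8,
                        regime_hi=(15.0, 0.88), regime_lo=(3.0, 0.99))
```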
  5. By: Gaurav Singh
    Abstract: This study analyzes and forecasts daily passenger counts for New York City's iconic yellow taxis during 2017-2019, a period of significant decline in ridership. Using a comprehensive dataset from the NYC Taxi and Limousine Commission, we employ various time series modeling approaches, including ARIMA models, to predict daily passenger volumes. Our analysis reveals strong seasonal patterns, with a consistent linear decline of approximately 200 passengers per day throughout the study period. After comparing multiple modeling approaches, we find that a first-order autoregressive model, combined with careful detrending and cycle removal, provides the most accurate predictions, achieving a test RMSE of 34,880 passengers on a mean ridership of 438,000 daily passengers. The research provides valuable insights for policymakers and stakeholders in understanding and potentially addressing the declining trajectory of NYC's yellow taxi service.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.10588
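The detrend-then-AR(1) pipeline used above can be sketched as follows (an illustration of the general technique, not the author's exact specification, which also removes seasonal cycles).

```python
def fit_trend(y):
    """OLS fit of y_t = a + b * t for t = 0..n-1; returns (a, b)."""
    n = len(y)
    tbar = (n - 1) / 2
    ybar = sum(y) / n
    sxy = sum((t - tbar) * (yt - ybar) for t, yt in enumerate(y))
    sxx = sum((t - tbar) ** 2 for t in range(n))
    b = sxy / sxx
    return ybar - b * tbar, b

def ar1_coef(resid):
    """Lag-1 sample autocorrelation of the detrended residuals."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(r * r for r in resid)
    return num / den

def forecast_next(y):
    """Detrend, fit AR(1) to the residuals, recombine for a one-step forecast."""
    a, b = fit_trend(y)
    resid = [yt - (a + b * t) for t, yt in enumerate(y)]
    phi = ar1_coef(resid)
    return a + b * len(y) + phi * resid[-1]
```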
  6. By: Susie McKenzie (The Treasury)
    Abstract: This paper explores the use of vector autoregressive (VAR) models to supplement the New Zealand Treasury’s tax forecasting models. The models are used to forecast both tax revenue and tax receipts. A suite of VAR models is developed for 20 different tax types with a focus on assessing the forecasting performance of six model specifications for each tax category. This paper shows that VAR models exhibit strong predictive performance for tax types with stable trends, such as total tax and source deductions. By contrast, models for corporate tax and other persons tax exhibit higher volatility and larger discrepancies. Several challenges were identified with these models. One challenge is that it is difficult to accommodate changes in tax rates through the sample period. A second challenge is that large shocks, such as the COVID-19 pandemic, introduce significant volatility and affect the accuracy of forecasts, particularly for tax receipts. Some model specifications also exhibit biases in their predictions for certain tax types. Comparing the forecasts to the official data release for 2024Q3, the VAR models for 13 out of 20 tax types produced forecasts within the range of the official tax release, while 7 tax types had discrepancies between $0.7 billion and $3.2 billion, with the largest discrepancies arising in tax receipts forecasts for total, indirect, and GST taxes.
    JEL: C53 E62 H20 C22
    Date: 2025–07–03
    URL: https://d.repec.org/n?u=RePEc:nzt:nztans:an25/03
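The VAR machinery behind the forecasts above can be illustrated with a minimal two-variable VAR(1) fit by equation-by-equation OLS (a toy sketch without intercepts, not the Treasury's model suite).

```python
def solve(A, b):
    """Gauss-Jordan elimination for a small linear system A x = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_var1(Y):
    """OLS fit of Y_t = A @ Y_{t-1} + e_t, one equation per variable."""
    lagged, current = Y[:-1], Y[1:]
    k = len(Y[0])
    XtX = [[sum(x[i] * x[j] for x in lagged) for j in range(k)] for i in range(k)]
    return [solve(XtX, [sum(x[i] * y[eq] for x, y in zip(lagged, current))
                        for i in range(k)])
            for eq in range(k)]

def var1_forecast(A, y_last):
    """One-step-ahead forecast A @ y_last."""
    return [sum(a * y for a, y in zip(row, y_last)) for row in A]
```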
  7. By: Mohammed-Khalil Ghali; Cecil Pang; Oscar Molina; Carlos Gershenson-Garcia; Daehan Won
    Abstract: Accurate forecasting of commodity price spikes is vital for countries with limited economic buffers, where sudden increases can strain national budgets, disrupt import-reliant sectors, and undermine food and energy security. This paper introduces a hybrid forecasting framework that combines historical commodity price data with semantic signals derived from global economic news, using an agentic generative AI pipeline. The architecture integrates dual-stream Long Short-Term Memory (LSTM) networks with attention mechanisms to fuse structured time-series inputs with semantically embedded, fact-checked news summaries collected from 1960 to 2023. The model is evaluated on a 64-year dataset comprising normalized commodity price series and temporally aligned news embeddings. Results show that the proposed approach achieves a mean AUC of 0.94 and an overall accuracy of 0.91, substantially outperforming traditional baselines such as logistic regression (AUC = 0.34), random forest (AUC = 0.57), and support vector machines (AUC = 0.47). Additional ablation studies reveal that the removal of attention or dimensionality reduction leads to moderate declines in performance, while eliminating the news component causes a steep drop in AUC to 0.46, underscoring the critical value of incorporating real-world context through unstructured text. These findings demonstrate that integrating agentic generative AI with deep learning can meaningfully improve early detection of commodity price shocks, offering a practical tool for economic planning and risk mitigation in volatile market environments while avoiding the very high cost of operating a full generative AI agent pipeline.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.06497
  8. By: Tingting Cheng; Jiachen Cong; Fei Liu; Xuanbin Yang
    Abstract: In this paper, we propose a novel factor-augmented forecasting regression model with a binary response variable. We develop a maximum likelihood estimation method for the regression parameters and establish the asymptotic properties of the resulting estimators. Monte Carlo simulation results show that the proposed estimation method performs very well in finite samples. Finally, we demonstrate the usefulness of the proposed model through an application to U.S. recession forecasting. The proposed model consistently outperforms conventional Probit regression across both in-sample and out-of-sample exercises, by effectively utilizing high-dimensional information through latent factors.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.16462
  9. By: Bingyang Wang; Grant Johnson; Maria Hybinette; Tucker Balch
    Abstract: This paper investigates whether artificial intelligence can enhance stock clustering compared to traditional methods. We consider this in the context of the semi-strong Efficient Markets Hypothesis (EMH), which posits that prices fully reflect all public information and, accordingly, that clusters based on price information cannot be improved upon. We benchmark three clustering approaches: (i) price-based clusters derived from historical return correlations, (ii) human-informed clusters defined by the Global Industry Classification Standard (GICS), and (iii) AI-driven clusters constructed from large language model (LLM) embeddings of stock-related news headlines. At each date, each method provides a classification in which each stock is assigned to a cluster. To evaluate a clustering, we transform it into a synthetic factor model following the Arbitrage Pricing Theory (APT) framework. This enables consistent evaluation of predictive performance in a roll forward, out-of-sample test. Using S&P 500 constituents from 2022 through 2024, we find that price-based clustering consistently outperforms both rule-based and AI-based methods, reducing root mean squared error (RMSE) by 15.9% relative to GICS and 14.7% relative to LLM embeddings. Our contributions are threefold: (i) a generalizable methodology that converts any equity grouping (manual, machine, or market-driven) into a real-time factor model for evaluation; (ii) the first direct comparison of price-based, human rule-based, and AI-based clustering under identical conditions; and (iii) empirical evidence reinforcing that short-horizon return information is largely contained in prices. These results support the EMH while offering practitioners a practical diagnostic for monitoring evolving sector structures and provide academics a framework for testing alternative hypotheses about how quickly markets absorb information.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.01590
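The paper scores clusterings through an APT-style factor model; as a much simpler stand-in for the same intuition (hypothetical code, not the authors' methodology), one can score a clustering by how well each stock's return is predicted by the leave-one-out mean return of its cluster.

```python
import math

def cluster_rmse(returns, labels):
    """RMSE when each stock's return is predicted by the leave-one-out
    mean return of the cluster it was assigned to."""
    groups = {}
    for r, l in zip(returns, labels):
        groups.setdefault(l, []).append(r)
    errs = []
    for r, l in zip(returns, labels):
        peers = groups[l]
        if len(peers) < 2:
            continue  # singleton clusters give no out-of-sample prediction
        pred = (sum(peers) - r) / (len(peers) - 1)
        errs.append((r - pred) ** 2)
    return math.sqrt(sum(errs) / len(errs))
```

A lower RMSE means the grouping explains more of the cross-section of returns, which is the quantity the three clustering methods are compared on.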
  10. By: Sebastian Kot
    Abstract: This paper examines Modern Mercantilism, characterized by rising economic nationalism, strategic technological decoupling, and geopolitical fragmentation, as a disruptive shift from the post-1945 globalization paradigm. It applies Principal Component Analysis (PCA) to 768-dimensional SBERT-generated semantic embeddings of curated news articles to extract orthogonal latent factors that discriminate binary event outcomes linked to protectionism, technological sovereignty, and bloc realignments. Analysis of principal component loadings identifies key semantic features driving classification performance, enhancing interpretability and predictive accuracy. This methodology provides a scalable, data-driven framework for quantitatively tracking emergent mercantilist dynamics through high-dimensional text analytics.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.09243
  11. By: Hyunwoo Lee; Jihyeong Jeon; Jaemin Hong; U Kang
    Abstract: How can we address distribution shifts in stock price data to improve stock price prediction accuracy? Stock price prediction has attracted attention from both academia and industry, driven by its potential to uncover complex market patterns and enhance decision-making. However, existing methods often fail to handle distribution shifts effectively, focusing on scaling or representation adaptation without fully addressing distributional discrepancies and shape misalignments between training and test data. We propose ReVol (Return-Volatility Normalization for Mitigating Distribution Shift in Stock Price Data), a robust method for stock price prediction that explicitly addresses the distribution shift problem. ReVol leverages three key strategies to mitigate these shifts: (1) normalizing price features to remove sample-specific characteristics, including return, volatility, and price scale, (2) employing an attention-based module to estimate these characteristics accurately, thereby reducing the influence of market anomalies, and (3) reintegrating the sample characteristics into the predictive process, restoring the traits lost during normalization. Additionally, ReVol combines geometric Brownian motion for long-term trend modeling with neural networks for short-term pattern recognition, unifying their complementary strengths. Extensive experiments on real-world datasets demonstrate that ReVol enhances the performance of the state-of-the-art backbone models in most cases, achieving an average improvement of more than 0.03 in IC and over 0.7 in SR across various settings.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.20108
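The normalize-then-reintegrate idea in steps (1) and (3) above can be sketched in a toy version (hypothetical helper names; the actual method estimates the characteristics with an attention module rather than sample moments).

```python
import math

def normalize_window(prices):
    """Strip price scale, mean return, and volatility from a price window.
    Returns standardized log-returns plus the stored characteristics."""
    rets = [math.log(p2 / p1) for p1, p2 in zip(prices, prices[1:])]
    mu = sum(rets) / len(rets)
    sigma = math.sqrt(sum((r - mu) ** 2 for r in rets) / len(rets))
    z = [(r - mu) / sigma for r in rets]
    return z, mu, sigma, prices[-1]

def denormalize_step(z_pred, mu, sigma, last_price):
    """Reintegrate the stored characteristics into a one-step price forecast."""
    return last_price * math.exp(mu + sigma * z_pred)
```

A model is trained on the scale-free series z; its prediction is mapped back to a price via the stored (mu, sigma, last_price), restoring the traits removed by normalization.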
  12. By: Vîntu, Denis
    Abstract: This study investigates the estimation of the unemployment rate in the Republic of Moldova, focusing on the impact of the COVID-19 pandemic. Two forecasting approaches are compared: the traditional ARIMA model and several machine learning models. The performance of these models is evaluated based on prediction accuracy metrics over pre-pandemic and pandemic periods. Results indicate that while ARIMA captures general trends effectively, machine learning models can better adapt to sudden shocks, such as those induced by the pandemic.
    Keywords: Simultaneous equations model; Labor market equilibrium; Unemployment rate determination; Wage-setting equation; Price-setting equation; Beveridge curve; Job matching function; Phillips curve; Structural unemployment; Natural rate of unemployment; Labor supply and demand; Endogenous unemployment; Disequilibrium model; Employment dynamics; Wage-unemployment relationship; Aggregate labor market model; Multivariate system estimation; Identification problem; Reduced form equations; Equilibrium unemployment rate
    JEL: C30 C31 C32 C33 C51 J64 J65 J68
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:125941
  13. By: Vîntu, Denis
    Abstract: Unemployment is one of the most important macroeconomic indicators for evaluating economic performance and social well-being. Forecasting unemployment is crucial for policymakers, yet traditional econometric models often fail to capture nonlinear and dynamic patterns. This paper presents an experiment applying artificial neural networks (ANNs) to predict the unemployment rate using macroeconomic data. Results show that ANNs outperform traditional ARIMA models, particularly during stable economic conditions. Implications for policy, limitations, and future research are discussed.
    Keywords: Simultaneous equations model; Labor market equilibrium; Unemployment rate determination; Wage-setting equation; Price-setting equation; Beveridge curve; Job matching function; Phillips curve; Structural unemployment; Natural rate of unemployment; Labor supply and demand; Endogenous unemployment; Disequilibrium model; Employment dynamics; Wage-unemployment relationship; Aggregate labor market model; Multivariate system estimation; Identification problem; Reduced form equations; Equilibrium unemployment rate
    JEL: C30 C31 C32 C33 J64 J68
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:125938
  14. By: Matthieu Garcin; Karl Sawaya; Thomas Valade
    Abstract: The linear fractional stable motion (LFSM) extends the fractional Brownian motion (fBm) by considering $\alpha$-stable increments. We propose a method to forecast future increments of the LFSM from past discrete-time observations, using the conditional expectation when $\alpha>1$ or a semimetric projection otherwise. It relies on the codifference, which describes the serial dependence of the process, instead of the covariance. Indeed, covariance is commonly used for predicting an fBm, but it is infinite when $\alpha<2$.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.15437
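The codifference replaces covariance with log characteristic functions, so it stays finite for heavy-tailed processes. A minimal empirical estimator at lag h (an illustration of the standard definition, not the authors' exact predictor) replaces the expectations with sample means:

```python
import cmath

def empirical_codifference(x, h):
    """Sample codifference at lag h:
    log E[exp(i(X_{t+h} - X_t))] - log E[exp(i X_{t+h})] - log E[exp(-i X_t)],
    with expectations replaced by sample means over t."""
    n = len(x) - h
    e_diff = sum(cmath.exp(1j * (x[t + h] - x[t])) for t in range(n)) / n
    e_lead = sum(cmath.exp(1j * x[t + h]) for t in range(n)) / n
    e_lag = sum(cmath.exp(-1j * x[t]) for t in range(n)) / n
    return cmath.log(e_diff) - cmath.log(e_lead) - cmath.log(e_lag)
```

For Gaussian processes the codifference coincides with the autocovariance, which is why it can stand in for covariance in the prediction formulas.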
  15. By: Ms. Natasha X Che; Mr. Federico J Diez; Anne Oeking; Weining Xin
    Abstract: This paper explores Asia-Pacific's medium-term growth prospects using two approaches. First, growth accounting analysis and machine-learning estimation reveal how demographics, capital deepening, productivity, and human capital shaped Asia's growth. Second, an innovative algorithm forecasts growth by matching countries' current conditions with historically analogous periods using Dynamic Time Warping (DTW). Comparing pattern-based forecasts with traditional projections highlights economic convergence and demographic headwinds. Results show that without ambitious reforms, Asia's growth will likely moderate, though remaining the world's fastest growing region. The paper offers data-driven tools for policymakers to identify growth drivers and generate robust forecasts.
    Keywords: Asia-Pacific; growth forecasting; demographics; productivity; Dynamic Time Warping; machine learning; growth accounting; economic convergence
    Date: 2025–08–22
    URL: https://d.repec.org/n?u=RePEc:imf:imfwpa:2025/168
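The DTW matching step at the heart of the pattern-based forecasts can be illustrated with the textbook dynamic-programming recursion (a sketch of the distance computation, not the paper's full algorithm).

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences: the minimal
    cumulative |a_i - b_j| cost over all monotone alignments."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

Because the alignment may stretch or compress time, two growth paths with the same shape but different pacing score as close, which is what makes DTW suited to finding historically analogous episodes.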
  16. By: Atika Aouri; Philipp Otto
    Abstract: We introduce a heterogeneous spatiotemporal GARCH model for geostatistical data or processes on networks, e.g., for modelling and predicting financial return volatility across firms in a latent spatial framework. The model combines classical GARCH(p, q) dynamics with spatially correlated innovations and spatially varying parameters, estimated using local likelihood methods. Spatial dependence is introduced through a geostatistical covariance structure on the innovation process, capturing contemporaneous cross-sectional correlation. This dependence propagates into the volatility dynamics via the recursive GARCH structure, allowing the model to reflect spatial spillovers and contagion effects in a parsimonious and interpretable way. In addition, this modelling framework allows for spatial volatility predictions at unobserved locations. In an empirical application, we demonstrate how the model can be applied to financial stock networks. Unlike other spatial GARCH models, our framework does not rely on a fixed adjacency matrix; instead, spatial proximity is defined in a proxy space constructed from balance sheet characteristics. Using daily log returns of 50 publicly listed firms over a one-year period, we evaluate the model's predictive performance in a cross-validation study.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.20101
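The recursive volatility dynamics the spatial dependence propagates through are the classical GARCH recursion; a plain univariate GARCH(1,1) sketch (not the paper's heterogeneous spatiotemporal specification) looks like:

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional-variance path h_t = omega + alpha * r_{t-1}^2 + beta * h_{t-1},
    seeded at the unconditional variance omega / (1 - alpha - beta)."""
    h = [omega / (1 - alpha - beta)]
    for r in returns:
        h.append(omega + alpha * r * r + beta * h[-1])
    return h
```

In the paper's framework, the innovations feeding this recursion are spatially correlated and the parameters vary over the network, which is how cross-sectional spillovers enter the volatility path.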
  17. By: Jesus Cañas; Emily Kerr; Diego Morales-Burnett
    Abstract: The Texas Service Sector Outlook Survey (TSSOS) is a monthly survey of service sector and retail firms in Texas conducted by the Federal Reserve Bank of Dallas. TSSOS indexes provide timely information about activity in the Texas private service sector, which makes up the bulk of the state economy. The survey provides invaluable information on regional economic conditions—information that the Dallas Fed president and economists use in formulating monetary policy and informing the public. This paper describes the survey methodology and analyzes the explanatory and predictive power of TSSOS indexes with regard to other measures of state economic activity. Regression analysis shows that several TSSOS indexes successfully track changes in Texas employment, gross domestic product and inflation. Forecasting exercises show that many TSSOS indexes are also useful in predicting future changes in some of the same metrics.
    Keywords: service sector; business outlook surveys; diffusion indexes
    JEL: B23 C83 C53 L80
    Date: 2025–08–13
    URL: https://d.repec.org/n?u=RePEc:fip:feddwp:101524
  18. By: Lin Cai (Department of Statistics, Columbia University, New York, USA); Zhiyang He (Department of Engineering and Informatics, University of Sussex, Brighton, UK); Caiya Zhang (Department of Statistics and Data Science, Hangzhou City University, Hangzhou, China)
    Abstract: This paper proposes a novel stock selection strategy framework based on combined machine learning algorithms. Two types of weighting methods for three representative machine learning algorithms are developed to predict the returns of the stock selection strategy. One is static weighting based on model evaluation metrics, the other is dynamic weighting based on Information Coefficients (IC). Using CSI 300 index data, we empirically evaluate the strategy's backtested performance and model predictive accuracy. The main results are as follows: (1) The strategy based on combined machine learning algorithms significantly outperforms single-model approaches in backtested returns. (2) IC-based weighting (particularly IC_Mean) demonstrates greater competitiveness than evaluation-metric-based weighting in both backtested returns and predictive performance. (3) Factor screening substantially enhances the performance of combined machine learning strategies.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.18592
  19. By: Qizhao Chen; Hiroaki Kawashima
    Abstract: This paper proposes a reinforcement learning framework that employs Proximal Policy Optimization (PPO) to dynamically optimize the weights of multiple large language model (LLM)-generated formulaic alphas for stock trading strategies. Formulaic alphas are mathematically defined trading signals derived from price, volume, sentiment, and other data. Although recent studies have shown that LLMs can generate diverse and effective alphas, a critical challenge lies in how to adaptively integrate them under varying market conditions. To address this gap, we leverage the deepseek-r1-distill-llama-70b model to generate fifty alphas for five major stocks: Apple, HSBC, Pepsi, Toyota, and Tencent, and then use PPO to adjust their weights in real time. Experimental results demonstrate that the PPO-optimized strategy achieves strong returns and high Sharpe ratios across most stocks, outperforming both an equal-weighted alpha portfolio and traditional benchmarks such as the Nikkei 225, S&P 500, and Hang Seng Index. The findings highlight the importance of reinforcement learning in the allocation of alpha weights and show the potential of combining LLM-generated signals with adaptive optimization for robust financial forecasting and trading.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.01393
  20. By: Diego Fernando Cuesta-Mora; Camilo Gómez
    Abstract: This paper presents a stress test model used by the Financial Stability Department of the Banco de la República to assess the financial vulnerability of Colombian non financial firms. The model supports the Central Bank’s biannual Financial Stability Report and informs policy decisions by identifying firms that are exposed to credit risk under adverse economic conditions. The proposed model integrates three components: a dynamic balance sheet simulation framework; a suite of machine learning models to estimate credit default probabilities; and a final module that identifies firms at risk of default. This tool strengthens the Central Bank’s capacity to monitor and evaluate risks in the corporate sector with a forward-looking perspective. The paper details each component and illustrates the model’s results using a stress scenario.
    Keywords: Stress Testing, Credit Risk, Credit Default, Machine Learning
    JEL: G3 G21 G01 G17
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:bdr:borrec:1325
  21. By: Freddy García-Albán; Juan Jarrín
    Abstract: This paper develops a high-frequency economic indicator using a Bayesian Dynamic Factor Model estimated with mixed-frequency data. The model incorporates weekly, monthly, and quarterly official indicators, and allows for dynamic heterogeneity and stochastic volatility. To ensure temporal consistency and avoid irregular aggregation artifacts, we introduce a pseudo-week structure that harmonizes the timing of observations. Our framework integrates dispersed and asynchronous official statistics into a unified High-Frequency Economic Index (HFEI), enabling real-time economic monitoring even in environments characterized by severe data limitations. We apply this framework to construct a high-frequency indicator for Ecuador, a country where official data are sparse and highly asynchronous, and compute pseudo-weekly recession probabilities using a time-varying mean regime-switching model fitted to the resulting index.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.07450
  22. By: Antoni Espasa; Guillermo Carlomagno
    Abstract: This paper explores the challenges of modelling high-frequency, non-financial big data time series. Focusing on daily, hourly, and even minute-level data, the study investigates the presence of various seasonalities (daily, weekly, monthly, and annual), how these cycles might interrelate, and how they may be influenced by weather patterns and calendar variations. By analyzing these cyclical characteristics and data responses to external factors, the paper explores the potential for regime-switching, dynamic, and non-linear models to capture these complexities. Furthermore, it proposes the use of Autometrics (an automated algorithm for identifying parsimonious models) to jointly account for all the data’s peculiarities. The resulting models, beyond structural analysis and forecasting, are useful for constructing real-time quantitative macroeconomic leading indicators, demand planning, and dynamic pricing strategies in various sectors that are sensitive to the factors identified in the analysis (e.g., utilities, retail stores, traffic, or labor market indicators). The paper includes an application to the daily series of jobless claims in Chile.
    Date: 2024–10
    URL: https://d.repec.org/n?u=RePEc:chb:bcchwp:1023
  23. By: Nicholas Gray (Reserve Bank of Australia); Finn Lattimore (Reserve Bank of Australia); Kate McLoughlin (Reserve Bank of Australia); Callan Windsor (Reserve Bank of Australia)
    Abstract: In a world of high policy uncertainty, central banks are relying more on soft information sources to complement traditional economic statistics and model-based forecasts. One valuable source of soft information comes from intelligence gathered through central bank liaison programs – structured programs in which central bank staff regularly talk with firms to gather insights. This paper introduces a new text analytics and retrieval tool that efficiently processes, organises, and analyses liaison intelligence gathered from firms using modern natural language processing techniques. The textual dataset spans around 25 years, integrates new information as soon as it becomes available, and covers a wide range of business sizes and industries. The tool uses both traditional text analysis techniques and powerful language models to provide analysts and researchers with three key capabilities: (1) quickly querying the entire history of business liaison meeting notes; (2) zooming in on particular topics to examine their frequency (topic exposure) and analysing the associated tone and uncertainty of the discussion; and (3) extracting precise numerical values from the text, such as firms' reported figures for wages and prices growth. We demonstrate how these capabilities are useful for assessing economic conditions by generating text-based indicators of wages growth and incorporating them into a nowcasting model. We find that adding these text-based features to current best-in-class predictive models, combined with the use of machine learning methods designed to handle many predictors, significantly improves the performance of nowcasts for wages growth. Predictive gains are driven by a small number of features, indicating a sparse signal in contrast to other predictive problems in macroeconomics, where the signal is typically dense.
    Keywords: central banking; macroeconomic policy; wages and labour costs; machine learning; econometric modelling; information retrieval systems; firm behaviour
    JEL: C5 C8 D2 E5 E6 J3
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:rba:rbardp:rdp2025-06
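The RBA tool described above combines traditional text analysis with large language models. Purely as a toy illustration of the "topic exposure" and "tone" concepts from capability (2), here is a bag-of-words sketch with a hypothetical hand-made lexicon — all term lists and function names are my own assumptions, not the paper's method:

```python
import re
from collections import Counter

# Hypothetical lexicons for illustration only
WAGE_TERMS = {"wage", "wages", "salary", "pay", "remuneration"}
UP_TERMS = {"increase", "rise", "higher", "growth", "pressure"}
DOWN_TERMS = {"decrease", "fall", "lower", "decline", "freeze"}

def tokenize(note):
    return re.findall(r"[a-z']+", note.lower())

def topic_exposure(note, topic=WAGE_TERMS):
    """Share of tokens in a liaison note that belong to the topic lexicon."""
    tokens = tokenize(note)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[t] for t in topic) / len(tokens)

def tone(note):
    """Net directional tone: (upward mentions - downward mentions) / total tokens."""
    tokens = tokenize(note)
    if not tokens:
        return 0.0
    up = sum(t in UP_TERMS for t in tokens)
    down = sum(t in DOWN_TERMS for t in tokens)
    return (up - down) / len(tokens)
```

Averaging such scores across firms and meeting dates yields a simple text-based time series; the paper's actual indicators are built with much richer NLP machinery than this keyword count.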
  24. By: Jakub Michańków; Paweł Sakowski; Robert Ślepaczuk
    Abstract: The proper design and architecture of testing machine learning models, especially in their application to quantitative finance problems, is crucial. The most important step in this process is selecting an adequate loss function for training, validation, estimation purposes, and hyperparameter tuning. Therefore, in this research, through empirical experiments on equity and cryptocurrency assets, we introduce the Mean Absolute Directional Loss (MADL) function, which is better suited to optimizing forecast-generating models used in algorithmic investment strategies. The MADL function results are compared for Transformer and LSTM models, and we show that in almost every case Transformer results are significantly better than those obtained with LSTM.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.16548
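The MADL loss was introduced in the authors' earlier work and is commonly written as MADL = (1/N) Σᵢ (−1) · sign(Rᵢ · R̂ᵢ) · |Rᵢ|, where Rᵢ is the realized return and R̂ᵢ the forecast. A minimal NumPy sketch under that definition (the function name is my own):

```python
import numpy as np

def madl(returns, preds):
    """Mean Absolute Directional Loss: rewards forecasts whose sign matches
    the realized return, weighted by the realized return's magnitude.
    Lower (more negative) values indicate better directional forecasts."""
    returns = np.asarray(returns, dtype=float)
    preds = np.asarray(preds, dtype=float)
    return np.mean(-np.sign(returns * preds) * np.abs(returns))
```

Unlike MSE or MAE, this loss does not penalize the size of the forecast error, only whether the predicted trade direction would have earned or lost the realized return, which is why it aligns training with strategy profitability.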

This nep-for issue is ©2025 by Malte Knüppel. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the Griffith Business School of Griffith University in Australia.