|
on Forecasting |
| By: | Filip Blaha; Jan Botka; Josef Sveda; Ales Michl |
| Abstract: | We construct a quantile regression forest for inflation forecasting in the Czech Republic, inspired by growing literature on the use of Machine Learning in macroeconomics and finance. We contribute to the literature by implementing an optimisation scheme with time-varying weights that incorporates information from the entire distribution to form the point forecast. By dynamically reflecting the distribution of future inflation paths, our framework outperforms both standard mean and median point forecasts and delivers gains relative to conventional linear benchmark models. We also forecast individual inflation subcomponents that enable us to disentangle the drivers of future inflation and its risks. Furthermore, we integrate the Shapley-value decomposition to enhance the interpretability of our results and adjust the model's predictors for a small open economy. |
| Keywords: | Czech Republic, forecasting, inflation, machine learning, quantile regression forest, small open economy, time varying weights |
| JEL: | C53 C55 E31 E37 E52 |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:cnb:wpaper:2026/09 |
| By: | Max Kleinebrahm; Jonathan Berrisch; Philipp Eiser; Wolf Fichtner; Veit Hagenmeyer; Matthias Hertel; Nils Koster; Sebastian Lerch; Ralf Mikut; Jan Priesmann; Melanie Schienle; Benjamin Schaefer; Jann Weinand; Florian Ziel |
| Abstract: | Energy forecasting research faces a persistent comparability gap that makes it difficult to measure consistent progress over time. Reported accuracy gains are often not directly comparable because models are evaluated under study-specific datasets, time periods, information sets, and scoring setups, while widely used benchmarks and competition datasets are typically tied to fixed historical windows. This paper introduces the Energy-Arena, a dynamic benchmarking platform for operational energy time series forecasting that provides a continuously updated reference point as energy systems evolve. The platform operates as an open, API-based submission system and standardizes challenge definitions and submission deadlines aligned with operational constraints. Performance is reported on rolling evaluation windows via persistent leaderboards. By moving from retrospective backtesting to forward-looking benchmarking, the Energy-Arena enforces standardized ex-ante submission and ex-post evaluation, thereby improving transparency by preventing information leakage and retroactive tuning. The platform is publicly available at Energy-Arena.org. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.24705 |
| By: | My Thi Diem Phan; Trung Tuyen Truong; Hoai Phuong Ha; Dat Thanh Nguyen |
| Abstract: | Norway's electricity market is heavily dominated by hydropower, but the 2021–2022 energy crisis and stronger integration with Continental Europe have fundamentally altered price formation, reducing the reliability of forecasting models calibrated on historical data. Despite the critical need for updated models, a unified benchmark evaluating feature contributions across all structurally diverse Norwegian bidding zones remains lacking. Here we present a comprehensive evaluation of electricity price forecasting across all five Norwegian Nord Pool bidding zones. We constructed a multimodal hourly dataset spanning 2019–2025 and evaluated eight forecasting model families including LightGBM, ARX, and advanced deep learning architectures using a strictly causal test set. We implemented robust rolling-origin backtesting, leave-one-group-out feature ablation, and conditional regime analysis to dissect model performance and feature utility. Our results show that LightGBM achieves the best performance in every zone with MAE ranging from 1.64 to 5.74 EUR/MWh, while the ridge ARX model remains a highly competitive linear benchmark in northern zones. Feature ablation reveals that models relying solely on lagged prices and calendar variables achieve high accuracy and often match or exceed full multimodal integration. However, conditional regime analysis demonstrates that external features like reservoir levels and gas prices remain crucial to stratify forecast errors, which consistently increase under stressed market regimes. This highlights the practical value of model interpretability and regime awareness for decision makers facing structural changes in market dynamics. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.26634 |
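The rolling-origin backtesting mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `rolling_origin_mae`, its parameters, and the default persistence (last-value) forecaster are assumptions made for illustration only. The forecast origin moves forward one step at a time, and each forecast sees only data observed before that origin.

```python
def rolling_origin_mae(series, initial_train, horizon=1, forecast_fn=None):
    """Mean absolute error over a rolling-origin backtest.

    series        : list of observed values in time order
    initial_train : number of observations available at the first origin
    horizon       : steps ahead to forecast (hypothetical parameter)
    forecast_fn   : callable(history, horizon) -> forecast; defaults to a
                    persistence forecast, an assumption for illustration
    """
    if forecast_fn is None:
        # Persistence benchmark: forecast equals the last observed value.
        forecast_fn = lambda history, h: history[-1]

    errors = []
    # Move the origin forward one step at a time; the model only ever
    # sees data strictly before the origin, so the evaluation is causal.
    for origin in range(initial_train, len(series) - horizon + 1):
        history = series[:origin]
        target = series[origin + horizon - 1]
        errors.append(abs(forecast_fn(history, horizon) - target))

    if not errors:
        raise ValueError("series too short for the requested backtest")
    return sum(errors) / len(errors)


if __name__ == "__main__":
    prices = [1.0, 2.0, 4.0, 7.0, 11.0]
    print(rolling_origin_mae(prices, initial_train=2))
```

Any forecaster can be passed as `forecast_fn`, provided it uses only the `history` argument it receives.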
| By: | Yusuke Oh (Deputy Director, Institute for Monetary and Economic Studies, Bank of Japan (E-mail: yuusuke.ou@boj.or.jp)); Mototsugu Shintani (The University of Tokyo (E-mail: shintani@e.u-tokyo.ac.jp)) |
| Abstract: | We forecast Japanese recessions by integrating machine learning methods, mixed-frequency data, and text-based indicators within an unrestricted mixed data sampling (U-MIDAS) framework. The model combines monthly macroeconomic variables with weekly financial indicators and newspaper-based text indicators. A pseudo-real-time forecasting exercise over three decades shows that machine learning models consistently outperform traditional logit benchmarks. The model confidence set (MCS) suggests horizon dependence: Text indicators are more informative at short horizons, while financial variables are more informative at longer horizons. To improve interpretability, we apply sparse principal component analysis (Sparse PCA) to the text indicators and identify three economic narratives: 'Corporate Distress,' 'Financial Distress,' and 'Deflationary Pressure.' Furthermore, SHAP (SHapley Additive exPlanations) analysis indicates that different recession episodes are associated with different combinations of these narratives, underscoring the heterogeneous nature of economic downturns. |
| Keywords: | business cycles, mixed data sampling, model confidence set, text analysis, recession forecasting |
| JEL: | C32 C53 E37 O53 |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:ime:imedps:26-e-07 |
| By: | Alexis Lazanas; Spyridon Karpouzis |
| Abstract: | The problem of time-series forecasting in non-stationary and complex environments is a challenging task in machine learning, especially with heterogeneous numerical and textual data present. Traditional statistical models like AutoRegressive Integrated Moving Average (ARIMA) are based on the assumptions of linearity and stationarity, whereas recurrent neural networks like Long Short-Term Memory (LSTM) models do not necessarily represent distributional properties in highly volatile settings. This paper proposes a hybrid model that combines Generative Adversarial Networks (GANs) with Natural Language Processing (NLP)-based sentiment analysis to enable sentiment-conditioned time-series prediction. The model integrates adversarial learning on numerical sequences with contextual sentiment representations derived from unstructured text, enabling them to be jointly modelled to capture temporal dynamics and exogenous information. These results demonstrate the promise of hybrid generative and language-aware methods to enhance prediction robustness in non-stationary environments. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.22801 |
| By: | Lin William Cong; Guanhao Feng; Jingyu He; Yuanzhi Wang |
| Abstract: | We argue that return predictability is a latent, asset-specific, and state-dependent characteristic. We develop an interpretable Panel Tree that endogenously partitions the U.S. equity panel into out-of-sample and persistent “mosaic” patterns, and estimate cluster-specific forecasting models. Predictability concentrates in stocks with large earnings surprises, high earnings–price ratios, and low trading volume. It is countercyclical, stronger when market dividend yields are high and liquidity is low. Accounting for predictability heterogeneity, which conventional models ignore, improves forecasts and yields portfolios with out-of-sample Sharpe ratios around 2. Across 50 years of data, the mosaic map shows where signals arise and where noise dominates. |
| JEL: | C38 C53 C55 G12 |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:35158 |
| By: | Maksym Nechepurenko; Pavel Shuvalov |
| Abstract: | Evaluating the true forecasting ability of AI agents requires environments that are resistant to overfitting, free from centralized trust, and grounded in incentive-compatible scoring. Existing benchmarks either rely on static datasets vulnerable to training-data contamination, or measure trading PnL, a metric that conflates predictive accuracy with timing, sizing, and risk appetite. We introduce Foresight Arena, the first permissionless, on-chain benchmark for evaluating AI forecasting agents on real-world prediction markets. Agents submit probabilistic forecasts on binary Polymarket markets via a commit-reveal protocol enforced by Solidity smart contracts on Polygon PoS; outcomes are resolved trustlessly through the Gnosis Conditional Token Framework. Performance is measured by the Brier Score and a novel Alpha Score, proper scoring rules that incentivize honest probability reporting and isolate predictive edge over market consensus. We provide a formal analysis: closed-form variance for per-market Alpha, the connection to Murphy's classical Brier decomposition, and a power analysis characterizing the number of rounds required to reliably distinguish agents of different skill levels. We show that detecting a true edge of $\alpha^* = 0.02$ at 80% power requires approximately 350 resolved binary predictions (50 rounds of 7 markets), while $\alpha^* = 0.01$ requires four times as many. We complement these analytical results with a deterministic, seed-controlled simulation study calibrated to literature-reported Brier-score ranges, illustrating how Murphy decomposition distinguishes well-calibrated agents from market-tracking agents that fail through reduced resolution. Live results from the deployed benchmark will be reported in a future revision. All smart contracts and evaluation infrastructure are open-source. |
| Date: | 2026–05 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2605.00420 |
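The scoring and power figures in the abstract above can be made concrete with a short sketch. The Brier score is the standard mean squared difference between forecast probabilities and binary outcomes. The sample-size helper assumes the required number of predictions scales with the inverse square of the edge, which matches the abstract's statement that halving the edge quadruples the requirement. Both function names are hypothetical. The paper's Alpha Score is not reproduced here because the abstract does not define it.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def scaled_sample_size(n_ref, edge_ref, edge_new):
    """Rescale a reference sample size to a different target edge.

    Assumption (not taken from the paper's derivation): the number of
    predictions needed at fixed power is proportional to 1 / edge**2.
    """
    return math.ceil(n_ref * (edge_ref / edge_new) ** 2)


if __name__ == "__main__":
    # Two forecasts: 0.8 on an event that happened, 0.3 on one that did not.
    print(brier_score([0.8, 0.3], [1, 0]))
    # Reference point from the abstract: about 350 predictions at edge 0.02.
    print(scaled_sample_size(350, 0.02, 0.01))
```

Under this scaling assumption, the 350 predictions quoted for an edge of 0.02 become 1400 predictions for an edge of 0.01.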
| By: | Sicco Kooiker (Vrije Universiteit Amsterdam); Janneke van Brummelen (Vrije Universiteit Amsterdam); Julia Schaumburg (Vrije Universiteit Amsterdam); Marcin Zamojski (Vrije Universiteit Amsterdam) |
| Abstract: | We propose a factor model with time-varying loadings for term structure modeling and forecasting. While maintaining the interpretation of the factors as level, slope, and curvature through explicit identification restrictions, we allow the loadings to take flexible shapes by specifying them as neural networks that evolve over time using a “self-driving” updating scheme based on past forecast errors, with gradient scaling to improve robustness. Using an empirically calibrated simulation study and an application to U.S. Treasury yields across 24 maturities, we show that flexible and dynamic factor loadings improve forecasting performance relative to standard benchmarks, including Nelson-Siegel models and the random walk. The gains are strongest at medium maturities and shorter forecast horizons, highlighting the importance of capturing curvature dynamics. In-sample results further illustrate how time-varying loadings provide insight into changes in yield curve shape beyond traditional parametric specifications. |
| Keywords: | time-varying neural networks, observation-driven dynamics, yield curve |
| JEL: | C38 C45 E43 |
| Date: | 2026–02–26 |
| URL: | https://d.repec.org/n?u=RePEc:tin:wpaper:20260007 |
| By: | Thomas Conlon; John Cotter; Iason Kynigakis |
| Abstract: | We demonstrate that machine learning methods provide a powerful framework for modelling conditional asymmetric risk. Using a large cross-section of US stocks and a comprehensive set of firm characteristics, we show that allowing for nonlinearities significantly increases the out-of-sample performance across a wide range of asymmetric beta measures and forecasting horizons. Trading frictions, followed by characteristics related to intangibles, momentum and growth, emerge as the most important drivers of future risk dynamics. Reconstructing CAPM beta from forecasts of asymmetric beta components indicates that a more granular decomposition of systematic risk yields a more accurate representation of market beta. We also find that incorporating conditional beta forecasts into discounted cash flow models that account for the term structure of betas enhances equity valuation accuracy. Finally, we show that the statistical outperformance of conditional betas translates into economically significant benefits for market-neutral portfolio investors. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.22933 |
| By: | Olivia Zhang; Zhilin Zhang |
| Abstract: | Large language models (LLMs) are increasingly deployed in quantitative finance for stock price forecasting. This review synthesizes recent applications of LLMs in this domain, including extracting sentiment from financial news and social media, analyzing financial reports and earnings-call transcripts, tokenizing or symbolizing stock price series, and constructing multi-agent trading systems. Particular attention is paid to practical pitfalls that are often understated in the literature, such as fragility in sentiment analysis, dataset and horizon design, performance evaluation metrics, data leakage, illiquidity premia, and limits of stock price predictability. Organized from a hedge-fund perspective, the review is intended to guide both academic researchers and hedge fund managers in integrating LLMs into real-world trading pipelines and in stress-testing their robustness under realistic market frictions. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2605.05211 |
| By: | Johannes Emmerling; Paul Waidelich; Mr. Matthieu Bellon; Emanuele Massetti |
| Abstract: | This paper assesses estimates of the economic impacts of climate change by leveraging the IMF’s World Economic Outlook (WEO) forecasts (1990-2023) as climate-free counterfactuals. Placebo tests confirm WEO forecasts do not capture climate effects. By adding climate damage estimates to forecasts and comparing with actual GDP growth, we find climate damage functions explain only a small share of forecast errors—reducing mean absolute errors by up to 0.4 percentage points (about 6% of the forecast error). The most severe damage functions predict contractions in some countries that are inconsistent with observed growth, suggesting overstated near-term climate impacts. |
| Keywords: | climate impacts; validation; forecasts; damage function |
| Date: | 2026–05–01 |
| URL: | https://d.repec.org/n?u=RePEc:imf:imfwpa:2026/085 |
| By: | Kevin J. Lansing; Adam Hale Shapiro |
| Abstract: | We develop a non-parametric filter that identifies sustained directional runs in shocks to monthly inflation—a concept we define as “inflation shock momentum.” By assessing the shocks to over 100 disaggregated Personal Consumption Expenditures (PCE) inflation categories, we isolate the share of categories experiencing positive or negative inflation shock momentum in a given month. We define the “Inflation Shock Momentum” (ISM) index as the net positive momentum share of expenditure-weighted categories (positive minus negative) in a given month. We show that the ISM index helps to forecast aggregate PCE inflation at horizons of 1 to 3 years, even after controlling for a variety of other inflation predictor variables. The ISM index is particularly useful in capturing emerging disinflationary pressure and can be used to help forecast future inflation movements in real time. |
| Keywords: | PCE Inflation; Non-parametric filter; Forecasting |
| JEL: | E31 E37 E52 C14 C53 |
| Date: | 2026–04–30 |
| URL: | https://d.repec.org/n?u=RePEc:fip:fedfwp:103112 |
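The index definition in the abstract above (expenditure-weighted share of categories with positive momentum minus the share with negative momentum) can be sketched as follows. The non-parametric filter that classifies each category is not described in the abstract, so the momentum flags are taken as given inputs. The function name `ism_index` and the +1 / -1 / 0 flag encoding are assumptions for illustration.

```python
def ism_index(weights, momentum):
    """Net expenditure-weighted share of categories with momentum.

    weights  : expenditure weight of each category (any positive scale)
    momentum : +1 for positive shock momentum, -1 for negative, 0 for none
               (hypothetical encoding; the classification filter itself
               is not implemented here)
    """
    if len(weights) != len(momentum) or not weights:
        raise ValueError("weights and momentum must be non-empty and equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    positive = sum(w for w, m in zip(weights, momentum) if m > 0)
    negative = sum(w for w, m in zip(weights, momentum) if m < 0)
    # Positive share minus negative share, so the index lies in [-1, 1].
    return (positive - negative) / total


if __name__ == "__main__":
    # Three illustrative categories for one month (made-up numbers).
    print(ism_index([0.5, 0.3, 0.2], [1, -1, 0]))
```

A value near 1 would mean almost all expenditure sits in categories with positive momentum, and a value near -1 the reverse.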
| By: | Knüppel, Malte; Pavlova, Lora |
| Abstract: | Histogram forecasts of inflation and growth from the US Survey of Professional Forecasters (SPF) allow for an assessment of the evolution of forecast uncertainty. However, this assessment is complicated by structural breaks in measured uncertainty arising from changes in histogram bin widths over time. The existing literature typically does not take these breaks into account. We propose a break adjustment based on the insights provided by a structural break in 2014, during which bin widths (and, consequently, measured inflation uncertainty) shifted significantly, despite true inflation uncertainty remaining virtually constant. Drawing on our results, we propose horizon-specific bin widths for inflation and growth to align measured uncertainty more closely with underlying uncertainty. |
| Keywords: | survey forecasts, volatility, structural breaks |
| JEL: | C53 E37 |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:zewdip:340840 |