nep-for New Economics Papers
on Forecasting
Issue of 2026–04–27
twelve papers chosen by
Malte Knüppel, Deutsche Bundesbank


  1. Probabilistic Forecasting for Day-ahead Electricity Prices, Battery Trading Strategies and the Economic Evaluation of Predictive Accuracy By Simon Hirsch; Florian Ziel
  2. Forecasting Oil Prices Across the Distribution: A Quantile VAR Approach By Hilde C. Bjørnland; Nicolás Hardy; Dimitris Korobilis
  3. Generalized Bayesian Composite Quantile Regression with an Application to Equity Premium Forecasting By Hardy, Nicolas; Korobilis, Dimitris
  4. Generalized Bayesian Composite Quantile Regression with an Application to Equity Premium Forecasting By Nicolas Hardy; Dimitris Korobilis
  5. Forecasting Forced Displacement Flows Using Machine Learning with Text Data By Ramón Talvi Robledo; Christopher Rauh; Ben Seimon; Hannes Mueller; Laura Mayoral
  6. Who Saw It Coming? Historical Experience and the 2021 Inflation Forecast Failure By Dalibor Stevanovic
  7. Who Saw It Coming? Historical Experience and the 2021 Inflation Forecast Failure By Dalibor Stevanovic
  8. The CTLNet for Shanghai Composite Index Prediction By Haibin Jiao
  9. aggreCAT: An R Package for Mathematically Aggregating Expert Judgments By Gould, Elliot; Gray, Charles T.; Willcox, Aaron; O'Dea, Rose E; Groenewegen, Rebecca; Wilkinson, David Peter
  10. Watching Trade from Space: Nowcasting and Spatial Extrapolation of Port-Level Maritime Trade Using Satellite Imagery By Yonggeun Jung
  11. Spurious Predictability in Financial Machine Learning By Sotirios D. Nikolopoulos
  12. GDP-Flash Estimates: An International Assessment By Philipp Wegmuller; Jan P.A.M. Jacobs; Marc Burri

  1. By: Simon Hirsch; Florian Ziel
    Abstract: Electricity price forecasting supports decision-making in energy markets and asset operation. Probabilistic forecasts are increasingly adopted to explicitly quantify uncertainty, typically issued as quantile predictions or ensembles of the full predictive distribution. However, how improvements in statistical forecast quality translate into economic value remains unclear. Battery storage arbitrage in day-ahead markets is a popular application-based benchmark for this purpose. We analyze quantile-based trading strategies (QBTS) and identify two critical flaws: they do not incentivize honest probabilistic forecasting and they ignore the intertemporal dependence structure of electricity prices. We therefore frame battery optimization as a stochastic program based on fully probabilistic forecasts and examine decision quality measurement for risk-neutral and risk-averse settings under different uncertainty models. Our discussion touches both sides of the coin: How reliable is the economic evaluation of forecasting models through (simplified) application studies, and how do improvements in statistical forecast quality for stochastic programs relate to decision quality and economic performance? We provide theoretical justification and empirical evidence from a case study on the German electricity market. Our results highlight the pitfalls of ranking forecasting models through battery trading strategies. We conclude with implications for evaluation practice and directions for future research in application-based forecast assessment.
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2604.19580
  2. By: Hilde C. Bjørnland; Nicolás Hardy; Dimitris Korobilis
    Abstract: We develop a Quantile Bayesian Vector Autoregression (QBVAR) to forecast real oil prices across different quantiles of the conditional distribution. The model allows predictor effects to vary across quantiles, capturing asymmetries that standard mean-focused approaches miss. Using monthly data from 1975 to 2025, we document three findings. First, the QBVAR improves median forecasts by 2-5% relative to Bayesian VARs, demonstrating that quantile-specific dynamics matter even for point prediction. Second, uncertainty and financial condition variables strongly predict downside risk, with left-tail forecast improvements of 10-25% that intensify during crisis episodes. Third, right-tail forecasting remains difficult; stochastic volatility models dominate for upside risk, though forecast combinations that include the QBVAR recover these losses. The results show that modeling the conditional distribution yields substantial gains for tail risk assessment, particularly during major oil market disruptions.
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:bny:wpaper:0148
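    Sketch: Percentage improvements in quantile forecasts, such as those quoted in the abstract above, are commonly measured with the pinball (quantile) loss. The abstract does not say which loss the authors use, so the following is only a minimal illustration under that assumption. The function names and the data are hypothetical.

```python
import numpy as np

def pinball_loss(y, q_pred, tau):
    """Average pinball (quantile) loss of forecasts q_pred at quantile level tau."""
    y = np.asarray(y, dtype=float)
    q_pred = np.asarray(q_pred, dtype=float)
    diff = y - q_pred
    # Under-prediction (diff > 0) is weighted by tau, over-prediction by 1 - tau.
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def relative_improvement(loss_model, loss_benchmark):
    """Percentage reduction in loss relative to a benchmark forecast."""
    return 100.0 * (1.0 - loss_model / loss_benchmark)

# Hypothetical realizations and 10% quantile forecasts, for illustration only.
y = [10.0, 12.0, 8.0, 11.0]
benchmark_q10 = [9.0, 9.0, 9.0, 9.0]
model_q10 = [9.0, 10.0, 7.0, 10.0]

loss_benchmark = pinball_loss(y, benchmark_q10, tau=0.1)
loss_model = pinball_loss(y, model_q10, tau=0.1)
gain = relative_improvement(loss_model, loss_benchmark)
```

    A low quantile level such as 0.1 penalizes forecasts that sit above the realization nine times as heavily as forecasts that sit below it, which is why the loss is suited to evaluating downside-risk forecasts.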
  3. By: Hardy, Nicolas; Korobilis, Dimitris
    Abstract: Composite quantile regression (CQR) is a robust and efficient estimator under heavy-tailed and contaminated errors. Existing Bayesian extensions rely on working likelihoods that require latent-variable augmentation and can deliver poorly calibrated credible intervals. We develop generalized Bayesian CQR, which exponentiates the composite quantile loss directly, targeting the same objective as frequentist CQR. Because generalized Bayes replaces point optimization with posterior averaging over the loss surface, it is especially relevant under heavy-tailed errors where the composite quantile loss flattens near its minimum. In generalized Bayes, posterior dispersion depends on a learning rate that we calibrate by matching marginal variances to their frequentist sandwich counterparts. The resulting credible intervals achieve near-nominal coverage in cross-sectional settings and substantially reduce the undercoverage of i.i.d. intervals under serial dependence, with a residual shortfall under high persistence that mirrors the finite-sample bias of frequentist HAC inference. The calibration has a closed-form solution under flat priors and extends to normal and spike-and-slab LASSO priors for shrinkage and variable selection. Sampling uses standard Metropolis-Hastings with no latent variables, achieving roughly 100-fold computational gains over likelihood-based Bayesian CQR at a common quantile grid. Monte Carlo experiments show competitive or improved point estimation relative to frequentist CQR, reliable coverage, and robust variable selection across Gaussian, heavy-tailed, and contaminated error distributions. An equity premium forecasting application demonstrates that the efficiency and robustness gains translate into economically meaningful improvements in out-of-sample portfolio performance.
    Keywords: Composite quantile regression, Gibbs posterior, Generalized Bayes, Learning rate calibration, Equity premium forecasting, Spike-and-slab priors
    JEL: C11 C14 C21 C52 C53 E37 G17
    Date: 2026–04–14
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:128752
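    Sketch: The idea of exponentiating the composite quantile loss can be illustrated as follows. This is a minimal sketch, not the authors' implementation: it assumes a linear model with one shared slope vector and a separate intercept per quantile level, a flat prior, and a placeholder learning rate eta = 1.0 rather than the calibrated value described in the abstract. All names and the simulated data are hypothetical.

```python
import numpy as np

def composite_quantile_loss(beta, intercepts, X, y, taus):
    """Sum of check losses over quantile levels, sharing one slope vector beta."""
    fitted = X @ np.asarray(beta, dtype=float)
    total = 0.0
    for b_k, tau in zip(intercepts, taus):
        u = y - b_k - fitted
        # Check loss: tau * u for u >= 0, (tau - 1) * u for u < 0.
        total += np.sum(u * (tau - (u < 0).astype(float)))
    return float(total)

def log_gibbs_posterior(beta, intercepts, X, y, taus, eta):
    """Unnormalized log generalized posterior under a flat prior: -eta * loss."""
    return -eta * composite_quantile_loss(beta, intercepts, X, y, taus)

# Simulated heavy-tailed data (Student-t errors with 3 degrees of freedom).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 1.0 + 2.0 * X[:, 0] + rng.standard_t(df=3, size=200)

taus = [0.25, 0.5, 0.75]
intercepts = [0.235, 1.0, 1.765]  # roughly 1 + quartiles of the t(3) errors
eta = 1.0                         # placeholder learning rate (assumption)

lp_true = log_gibbs_posterior([2.0], intercepts, X, y, taus, eta)
lp_wrong = log_gibbs_posterior([-2.0], intercepts, X, y, taus, eta)
```

    A sampler such as random-walk Metropolis-Hastings would only need log_gibbs_posterior as its target; no latent variables enter, which is consistent with the computational argument in the abstract.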
  4. By: Nicolas Hardy; Dimitris Korobilis
    Abstract: Composite quantile regression (CQR) is a robust and efficient estimator under heavy-tailed and contaminated errors. Existing Bayesian extensions rely on working likelihoods that require latent-variable augmentation and can deliver poorly calibrated credible intervals. We develop generalized Bayesian CQR, which exponentiates the composite quantile loss directly, targeting the same objective as frequentist CQR. Because generalized Bayes replaces point optimization with posterior averaging over the loss surface, it is especially relevant under heavy-tailed errors where the composite quantile loss flattens near its minimum. In generalized Bayes, posterior dispersion depends on a learning rate that we calibrate by matching marginal variances to their frequentist sandwich counterparts. The resulting credible intervals achieve near-nominal coverage in cross-sectional settings and substantially reduce the undercoverage of i.i.d. intervals under serial dependence, with a residual shortfall under high persistence that mirrors the finite-sample bias of frequentist HAC inference. The calibration has a closed-form solution under flat priors and extends to normal and spike-and-slab LASSO priors for shrinkage and variable selection. Sampling uses standard Metropolis-Hastings with no latent variables, achieving roughly 100-fold computational gains over likelihood-based Bayesian CQR at a common quantile grid. Monte Carlo experiments show competitive or improved point estimation relative to frequentist CQR, reliable coverage, and robust variable selection across Gaussian, heavy-tailed, and contaminated error distributions. An equity premium forecasting application demonstrates that the efficiency and robustness gains translate into economically meaningful improvements in out-of-sample portfolio performance.
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:bny:wpaper:0149
  5. By: Ramón Talvi Robledo; Christopher Rauh; Ben Seimon; Hannes Mueller; Laura Mayoral
    Abstract: Forced displacement is an important policy challenge, yet forecasting is hindered by sparse, annually observed flow data and reporting delays. This article proposes a forecasting method for country outflows and dyadic flows tailored to this sparse data setting. We combine slow-moving structural predictors with high-frequency text-based signals, compress high-dimensional news into low-dimensional topic representations via Latent Dirichlet Allocation to mitigate overfitting, and estimate a stacked ensemble of gradient-boosted trees that captures non-linear origin–destination interactions while making optimal use of the available data. We further apply conformal prediction to construct statistically valid prediction intervals for bilateral flows. Analysis of the text component shows that destination-specific search intensity of migration terms is a central predictor of subsequent dyadic displacement flows.
    Keywords: conformal prediction, dyadic, early warning, forced displacement, forecasting, Google trends, machine learning
    JEL: P16 C53 D72
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:bge:wpaper:1573
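    Sketch: The abstract does not state which conformal variant the authors use. Assuming the simplest one, split conformal prediction with absolute-residual scores, the interval construction can be sketched as follows. The function names and the calibration data are hypothetical.

```python
import math
import numpy as np

def split_conformal_radius(y_cal, pred_cal, alpha):
    """Half-width of a (1 - alpha) split-conformal interval from calibration data."""
    scores = np.sort(np.abs(np.asarray(y_cal, dtype=float)
                            - np.asarray(pred_cal, dtype=float)))
    n = scores.size
    # Finite-sample correction: use the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this coverage level.
        return math.inf
    return float(scores[k - 1])

def prediction_interval(pred_new, radius):
    """Symmetric interval around a new point forecast."""
    return pred_new - radius, pred_new + radius

# Hypothetical calibration set: realized flows and model predictions.
y_cal = np.arange(1.0, 11.0)   # 1, 2, ..., 10
pred_cal = np.zeros(10)

radius = split_conformal_radius(y_cal, pred_cal, alpha=0.25)
lower, upper = prediction_interval(5.0, radius)
```

    The coverage guarantee of this construction rests on exchangeability between calibration and test observations, an assumption that deserves scrutiny for time-ordered flow data.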
  6. By: Dalibor Stevanovic
    Abstract: This paper studies the 2021 U.S. inflation forecasting failure. The author shows that the failure was primarily driven by sample composition rather than functional-form misspecification: estimation samples dominated by the Great Moderation underweight supply-shock regimes, and expectations anchored to that regime were slow to recognize the shift. Three historically informed adjustments, an intercept correction, a similarity re-estimation on 1970s data, and a kernel-weighted estimator, substantially close the forecast gap, and the gains extend to eight additional U.S. price indices. Household survey respondents over 60, whose lifetime includes the 1970s, reported higher inflation expectations from early 2021, consistent with experience-based learning; younger cohorts remained anchored to the prevailing regime. A controlled experiment with large language models conditioned on “experienced” and “young” professional personas confirms that experiential priors generate significant forecast differences under a common training leakage assumption. Across all three exercises, the source of the prior mattered more than the sophistication of the model.
    Keywords: Inflation forecasting, regime change, historical analogy, experience-based learning, expectations anchoring, large language models
    JEL: C22 C53 D84 E31 E37
    Date: 2026–04–22
    URL: https://d.repec.org/n?u=RePEc:cir:cirwor:2026s-06
  7. By: Dalibor Stevanovic (University of Quebec in Montreal)
    Abstract: This paper studies the 2021 U.S. inflation forecasting failure. I show that the failure was primarily driven by sample composition rather than functional-form misspecification: estimation samples dominated by the Great Moderation underweight supply-shock regimes, and expectations anchored to that regime were slow to recognize the shift. Three historically informed adjustments, an intercept correction, a similarity re-estimation on 1970s data, and a kernel-weighted estimator, substantially close the forecast gap, and the gains extend to eight additional U.S. price indices. Household survey respondents over 60, whose lifetime includes the 1970s, reported higher inflation expectations from early 2021, consistent with experience-based learning; younger cohorts remained anchored to the prevailing regime. A controlled experiment with large language models conditioned on “experienced” and “young” professional personas confirms that experiential priors generate significant forecast differences under a common training leakage assumption. Across all three exercises, the source of the prior mattered more than the sophistication of the model.
    Keywords: Inflation forecasting, regime change, historical analogy, experience-based learning, expectations anchoring, large language models
    JEL: C22 C53 D84 E31 E37
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:bbh:wpaper:26-02
  8. By: Haibin Jiao
    Abstract: Shanghai Composite Index prediction has become a widely studied problem among investors and academic researchers. Deep learning models are widely applied in multivariate time series forecasting, including recurrent neural networks (RNN), convolutional neural networks (CNN), and transformers. Specifically, the Transformer encoder, with its unique attention mechanism and parallel processing capabilities, has become an important tool in time series prediction, and has an advantage in dealing with long sequence dependencies and multivariate data correlations. Drawing on the strengths of various models, we propose the CNN-Transformer-LSTM Networks (CTLNet). This paper explores the application of CTLNet to Shanghai Composite Index prediction, and comparative experiments show that the proposed model outperforms state-of-the-art baselines.
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2604.16835
  9. By: Gould, Elliot (Interdisciplinary MetaResearch Group (SCORE Project)); Gray, Charles T.; Willcox, Aaron (Melbourne University); O'Dea, Rose E; Groenewegen, Rebecca; Wilkinson, David Peter
    Abstract: Structured elicitation protocols, such as the IDEA protocol, are used to elicit probabilistic judgements from multiple domain experts about uncertain events across fields including ecology, biosecurity risk assessment, and metascience. Individual expert judgements must subsequently be mathematically aggregated into a single group forecast. While the simplest case involves combining a set of point-estimates from multiple individuals, this process is further complicated when judgements include uncertainty bounds, or when elicitation is conducted across multiple rounds. This paper presents aggreCAT, an open-source R package that provides 29 aggregation methods for combining individual expert judgements into a single probabilistic estimate, accommodating designs ranging from single-round point estimates to multi-round three-point elicitation. The package follows tidy data principles, enabling straightforward integration with existing R workflows for application at scale. Methods range from unweighted arithmetic combinations to performance-weighted schemes and Bayesian models, with weights derived from uncertainty intervals, shifts in judgements between elicitation rounds, and breadth of expert reasoning. We provide worked examples illustrating the mechanics of representative aggregation methods, a general workflow for batch aggregation across multiple forecasts and methods, and built-in functions for evaluating and visualising forecast performance against known outcomes. aggreCAT fills a substantive gap in open software for mathematically aggregating expert judgement, and is intended to support researchers and decision analysts in rapidly and rigorously synthesising outputs from structured elicitation exercises.
    Date: 2026–04–14
    URL: https://d.repec.org/n?u=RePEc:osf:metaar:74tfv_v2
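    Sketch: The two ends of the aggregation spectrum described in the abstract, an unweighted arithmetic combination and a scheme with weights derived from uncertainty intervals, can be illustrated in a few lines. This sketch does not use aggreCAT's API and is not claimed to reproduce any of its 29 methods. The inverse-interval-width weighting is an assumption chosen for simplicity, and all names and data are hypothetical.

```python
import numpy as np

def arithmetic_mean(best):
    """Unweighted average of the experts' best estimates."""
    return float(np.mean(np.asarray(best, dtype=float)))

def interval_weighted_mean(best, lower, upper):
    """Average of best estimates, weighting each expert by 1 / interval width."""
    best, lower, upper = (np.asarray(a, dtype=float) for a in (best, lower, upper))
    widths = upper - lower
    if np.any(widths <= 0):
        raise ValueError("every expert must supply upper > lower")
    # Narrower intervals (more confident experts) receive larger weights.
    weights = 1.0 / widths
    weights = weights / weights.sum()
    return float(np.sum(weights * best))

# Hypothetical three-point elicitation from three experts for one event.
best = [0.6, 0.8, 0.4]
lower = [0.5, 0.4, 0.0]
upper = [0.75, 0.9, 1.0]

simple = arithmetic_mean(best)
weighted = interval_weighted_mean(best, lower, upper)
```

    Here the interval widths are 0.25, 0.5, and 1.0, so the normalized weights are 4/7, 2/7, and 1/7, and the most confident expert pulls the group estimate toward 0.6.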
  10. By: Yonggeun Jung
    Abstract: Satellite data are increasingly used to measure economic activity, yet port-level trade remains largely unmeasured from space. This paper combines synthetic aperture radar imagery, nighttime lights, and port characteristics to measure monthly port-level maritime trade using only publicly available data. The model achieves strong out-of-sample accuracy for U.S. ports, with satellite signals and port attributes playing complementary roles. While absolute levels are difficult to extrapolate beyond the training domain, percentage changes are reliably recovered, as we confirm through a leave-one-region-out exercise and Monte Carlo simulation. Applying the framework to Russian ports after the 2022 sanctions, we detect shifts consistent with trade reorientation toward the Far East. The approach complements AIS-based methods by remaining robust to strategic signal manipulation.
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2604.15444
  11. By: Sotirios D. Nikolopoulos
    Abstract: Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against synthetic reference classes, including zero-predictability environments and microstructure placebos. Workflows generating significant walk-forward evidence in these environments are falsified. For passing workflows, we quantify selection-induced performance inflation using an absolute magnitude gap linking optimized in-sample evidence to disjoint walk-forward realizations, adjusted for effective multiplicity. Simulations validate extreme-value scaling under correlated searches and demonstrate detection power under genuine structure. Empirical case studies confirm that many apparent findings represent methodological artifacts rather than genuine predictability.
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2604.15531
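    Sketch: The abstract's opening claim, that specification search produces significant backtests even when nothing is predictable, can be illustrated with a small simulation. This is not the paper's falsification audit. It assumes independent candidate specifications whose returns are i.i.d. Gaussian noise (a special case of a martingale-difference null), and all names and settings are hypothetical.

```python
import numpy as np

def t_stat(x):
    """t-statistic of the sample mean against zero."""
    x = np.asarray(x, dtype=float)
    return float(x.mean() / (x.std(ddof=1) / np.sqrt(x.size)))

def search_on_noise(n_specs=1000, n_in=250, n_out=250, seed=0):
    """Pick the best in-sample specification on pure noise, then test it out of sample."""
    rng = np.random.default_rng(seed)
    # Each row holds the returns of one candidate specification; all have zero
    # true mean, so no specification has genuine predictive content.
    returns = rng.normal(0.0, 1.0, size=(n_specs, n_in + n_out))
    in_t = np.array([t_stat(r[:n_in]) for r in returns])
    best = int(np.argmax(in_t))
    # Evaluate the selected specification on the disjoint out-of-sample window.
    out_t = t_stat(returns[best, n_in:])
    return float(in_t[best]), out_t

best_in, best_out = search_on_noise()
```

    With 1000 candidates the selected in-sample t-statistic will almost surely exceed conventional significance thresholds, while the out-of-sample t-statistic of the same specification is just another draw from the null distribution. The gap between the two is the kind of selection-induced inflation the abstract refers to.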
  12. By: Philipp Wegmuller; Jan P.A.M. Jacobs; Marc Burri
    Abstract: We compile a harmonised real-time vintage dataset of quarterly GDP releases for 12 countries as well as the European Union and euro area aggregates, and evaluate the quality of flash estimates relative to later releases and more mature benchmarks. We document substantial cross-country heterogeneity in revision behaviour, with revision magnitudes increasing markedly during periods of elevated volatility. We further show that revision-aware state-space methods can, in some settings and depending on the evaluation benchmark, improve upon the raw flash release as an approximation to more mature GDP growth. Overall, the results highlight the trade-off between timeliness and precision in early national-accounts data and show that the real-time reliability of flash GDP depends importantly on benchmark choice, revision dynamics, and national compilation practices.
    Keywords: GDP, advanced releases, revisions, nowcasting, state-space models
    JEL: E23 E32 E37
    Date: 2026–04
    URL: https://d.repec.org/n?u=RePEc:een:camaaa:2026-26

This nep-for issue is ©2026 by Malte Knüppel. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the Griffith Business School of Griffith University in Australia.