| By: | Abdelfatah, Omar Sharafeldin Mohamed |
| Abstract: | Accurate demand forecasting remains one of the most critical yet persistently challenging functions in retail supply chain management. Traditional statistical forecasting methods such as ARIMA and exponential smoothing have long served as industry standards; however, their limited capacity to capture nonlinear demand patterns, seasonal volatility, and external market signals has prompted growing interest in machine learning (ML) alternatives. This study investigates the comparative effectiveness of multiple ML approaches, including Random Forest, Gradient Boosting (XGBoost), Long Short-Term Memory (LSTM) neural networks, and hybrid ensemble models, against traditional baseline methods in the context of retail supply chain demand forecasting. Employing a quantitative research design, the study utilizes a panel dataset comprising 36 months of point-of-sale (POS) transaction records, promotional calendars, macroeconomic indicators, and weather data from 14 retail organizations operating across grocery, fashion, and consumer electronics segments. Forecasting accuracy is evaluated using Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), and Forecast Bias metrics across multiple product categories and forecasting horizons (1-week, 4-week, and 12-week ahead). Results demonstrate that ensemble ML models, particularly hybrid LSTM-XGBoost architectures, achieve statistically significant improvements in forecasting accuracy over traditional methods, with MAPE reductions averaging 28.6% at the 4-week horizon. Feature importance analysis identifies promotional activity, competitor pricing signals, and lagged POS data as the most influential demand drivers. The study further reveals that ML forecasting benefits are heterogeneous across product categories, with the highest gains observed in high-velocity, promotion-sensitive SKUs and the smallest gains in slow-moving, low-volatility items. A practical implementation framework is proposed, offering retail supply chain practitioners a structured pathway from data readiness assessment through model deployment and ongoing performance monitoring. |
| Date: | 2026–04–03 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:4z9be_v1 |
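The abstract above evaluates forecasts with MAPE, RMSE, and Forecast Bias across several horizons. A minimal sketch of these three metrics, assuming the usual percentage-error definitions (the study's exact bias convention is not stated in the abstract):

```python
import numpy as np

def forecast_metrics(actual, forecast):
    """MAPE, RMSE and forecast bias (net over/under-forecast as a share of actual demand)."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    mape = np.mean(np.abs((actual - forecast) / actual)) * 100
    rmse = np.sqrt(np.mean((forecast - actual) ** 2))
    bias = (forecast - actual).sum() / actual.sum() * 100
    return {"MAPE_%": mape, "RMSE": rmse, "Bias_%": bias}

# Example: weekly demand for one SKU over a 4-week horizon
print(forecast_metrics([120, 95, 160, 140], [110, 100, 150, 155]))
```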
| By: | Abhinav Das (Universität Ulm (Germany, Ulm)); Stephan Schlüter (Technische Hochschule Ulm (Germany, Ulm)); Lorenz Schneider (EM - EMLyon Business School) |
| Abstract: | This work integrates Bayesian regime detection with conditional neural processes for 24-hour electricity price forecasting in the German, French, and Norwegian markets. Regimes are inferred via a disentangled sticky hierarchical Dirichlet process hidden Markov model (DS-HDP-HMM). For each regime, an independent conditional neural process (CNP) learns localized mappings from input contexts to 24-dimensional hourly price trajectories; final forecasts are produced as regime-weighted mixtures of the regime-specific CNP outputs. Temporal robustness and cross-market generalization are evaluated on Germany (2021–2023) and on France and Norway (2023). We benchmark against deep neural networks (DNN), the Lasso estimated autoregressive (LEAR) model, extreme gradient boosting (XGBoost), Bayesian long short-term memory (BLSTM), and the temporal fusion transformer (TFT), and assess downstream value through battery storage optimization. Results indicate that the proposed regime-aware CNP often delivers higher profits or lower costs, while DNN can be exceptionally competitive in specific cost-minimization settings. Because point accuracy does not necessarily translate into operational optimality, we apply the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to aggregate forecasting and operational criteria. TOPSIS ranks the CNP as the leading model for 2023 and, overall, as the most balanced and consistently preferred solution across the considered markets. |
| Keywords: | Battery energy storage systems, Regime-aware prediction, MCDM, Electricity price forecasting |
| Date: | 2026–05–01 |
| URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-05562231 |
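The final step described above, forming forecasts as regime-weighted mixtures of regime-specific CNP outputs, can be sketched as follows. The `predict` interface and the use of posterior regime probabilities as mixture weights are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def regime_weighted_forecast(context, regime_probs, regime_models):
    """Combine regime-specific 24-hour price forecasts into a single trajectory.

    regime_probs: posterior probability of each regime for the target day, shape (K,)
    regime_models: one fitted predictor per regime, mapping context -> 24 hourly prices
    """
    forecasts = np.stack([m.predict(context) for m in regime_models])  # (K, 24)
    weights = np.asarray(regime_probs)[:, None]                        # (K, 1)
    return (weights * forecasts).sum(axis=0)                           # (24,)
```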
| By: | Alexandre Alouadi; Grégoire Loeper; Célian Marsala; Othmane Mazhar; Huyên Pham |
| Abstract: | We study the problem of generating synthetic time series that reproduce both marginal distributions and temporal dynamics, a central challenge in financial machine learning. Existing approaches typically fail to jointly model drift and stochastic volatility, as diffusion-based methods fix the volatility while martingale transport models ignore drift. We introduce the Schrödinger-Bass Bridge for Time Series (SBBTS), a unified framework that extends the Schrödinger-Bass formulation to multi-step time series. The method constructs a diffusion process that jointly calibrates drift and volatility and admits a tractable decomposition into conditional transport problems, enabling efficient learning. Numerical experiments on the Heston model demonstrate that SBBTS accurately recovers stochastic volatility and correlation parameters that prior Schrödinger Bridge methods fail to capture. Applied to S&P 500 data, SBBTS-generated synthetic time series consistently improve downstream forecasting performance when used for data augmentation, yielding higher classification accuracy and Sharpe ratio compared to real-data-only training. These results show that SBBTS provides a practical and effective framework for realistic time series generation and data augmentation in financial applications. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.07159 |
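The Heston dynamics used as the numerical test bed above can be simulated with a standard Euler full-truncation scheme. This sketch reproduces only the benchmark model, not the SBBTS method itself, and all parameter values are illustrative:

```python
import numpy as np

def simulate_heston(s0, v0, mu, kappa, theta, xi, rho, T=1.0, n_steps=252, seed=0):
    """One Heston path via Euler full truncation: dS = mu*S dt + sqrt(v)*S dW1,
    dv = kappa*(theta - v) dt + xi*sqrt(v) dW2, corr(dW1, dW2) = rho."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    s, v = np.empty(n_steps + 1), np.empty(n_steps + 1)
    s[0], v[0] = s0, v0
    for t in range(n_steps):
        z1 = rng.standard_normal()
        z2 = rho * z1 + np.sqrt(1 - rho ** 2) * rng.standard_normal()
        v_pos = max(v[t], 0.0)  # full truncation keeps the variance non-negative in the drift/diffusion
        v[t + 1] = v[t] + kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z2
        s[t + 1] = s[t] * np.exp((mu - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * z1)
    return s, v
```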
| By: | Zheqi Fan; Meng (Melody) Wang; Yifan Ye |
| Abstract: | We examine whether model-based spot volatility estimators extracted from traded options data enhance the predictive power of the Heterogeneous Autoregressive (HAR) model for realized volatility. Specifically, we infer spot volatility under the rough stochastic volatility model via an iterative two-step approach following Andersen et al. (2015a) and adopt a deep learning surrogate to accelerate model estimation from large-scale options panels. Benchmarked against traditional stochastic volatility models (Heston, Bates, SVCJ) and the VIX index, our results demonstrate that the augmented HAR-RV-RHeston model improves daily realized volatility forecasting accuracy and sustains superior performance across horizons up to one month. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.02743 |
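The baseline HAR-RV regression that the paper augments with a model-based spot-volatility regressor can be sketched as below, assuming the standard daily/weekly/monthly lag structure; the augmentation enters through the optional `extra` series:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fit_har_rv(rv, extra=None):
    """HAR-RV: regress realized variance on its own daily, weekly (5-day) and
    monthly (22-day) lagged averages; `extra` adds a further lagged regressor
    such as a model-implied spot-volatility estimate."""
    df = pd.DataFrame({"rv": np.asarray(rv, float)})
    df["rv_d"] = df["rv"].shift(1)
    df["rv_w"] = df["rv"].shift(1).rolling(5).mean()
    df["rv_m"] = df["rv"].shift(1).rolling(22).mean()
    if extra is not None:
        df["extra"] = pd.Series(np.asarray(extra, float), index=df.index).shift(1)
    df = df.dropna()
    X = sm.add_constant(df.drop(columns="rv"))
    return sm.OLS(df["rv"], X).fit()
```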
| By: | Christopher Gerling; Hanqiu Peng; Ying Chen; Stefan Lessmann |
| Abstract: | Accurate forecasting of recovery rates (RR) is central to credit risk management and regulatory capital determination. In many loan portfolios, however, RR modeling is constrained by data scarcity arising from infrequent default events. Transfer learning (TL) offers a promising avenue to mitigate this challenge by exploiting information from related but richer source domains, yet its effectiveness critically depends on the presence and strength of distributional shifts, and on potential heterogeneity between source and target feature spaces. This paper introduces FT-MDN-Transformer, a mixture-density tabular Transformer architecture specifically designed for TL in RR forecasting across heterogeneous feature sets. The model produces both loan-level point estimates and portfolio-level predictive distributions, thereby supporting a wide range of practical RR forecasting applications. We evaluate the proposed approach in a controlled Monte Carlo simulation that facilitates systematic variation of covariate, conditional, and label shifts, as well as in a real-world transfer setting using the Global Credit Data (GCD) loan dataset as source and a novel bonds dataset as target. Our results show that FT-MDN-Transformer outperforms baseline models when target-domain data are limited, with particularly pronounced gains under covariate and conditional shifts, while label shift remains challenging. We also observe its probabilistic forecasts to closely track empirical recovery distributions, providing richer information than conventional point-prediction metrics alone. Overall, the findings highlight the potential of distribution-aware TL architectures to improve RR forecasting in data-scarce credit portfolios and offer practical insights for risk managers operating under heterogeneous data environments. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.02832 |
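A mixture-density output head of the kind referenced by the FT-MDN-Transformer name can be illustrated as follows. This is a generic Gaussian-mixture sketch (the component count, link functions, and squashing of means into [0, 1] are assumptions), not the paper's architecture:

```python
import torch
import torch.nn as nn

class MixtureDensityHead(nn.Module):
    """Maps a feature vector to mixture weights, means and scales, giving a full
    predictive distribution for the recovery rate instead of a point estimate."""
    def __init__(self, d_in, n_components=3):
        super().__init__()
        self.logits = nn.Linear(d_in, n_components)
        self.means = nn.Linear(d_in, n_components)
        self.log_scales = nn.Linear(d_in, n_components)

    def forward(self, h):
        w = torch.softmax(self.logits(h), dim=-1)
        mu = torch.sigmoid(self.means(h))                      # recovery rates lie in [0, 1]
        sigma = torch.exp(self.log_scales(h)).clamp(min=1e-3)
        return w, mu, sigma

    def nll(self, h, y):
        """Negative log-likelihood of observed recovery rates y under the mixture."""
        w, mu, sigma = self(h)
        log_probs = torch.distributions.Normal(mu, sigma).log_prob(y.unsqueeze(-1))
        return -torch.logsumexp(log_probs + torch.log(w + 1e-9), dim=-1).mean()
```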
| By: | Tashreef Muhammad; Tahsin Ahmed; Meherun Farzana; Md. Mahmudul Hasan; Abrar Eyasir; Md. Emon Khan; Mahafuzul Islam Shawon; Ferdous Mondol; Mahmudul Hasan; Muhammad Ibrahim |
| Abstract: | Accurate short-term forecasting of agricultural commodity prices is critical for food security planning and smallholder income stabilisation in developing economies, yet machine-learning-ready datasets for this purpose remain scarce in South Asia. This paper makes two contributions. First, we introduce AgriPriceBD, a benchmark dataset of 1,779 daily retail mid-prices for five Bangladeshi commodities - garlic, chickpea, green chilli, cucumber, and sweet pumpkin - spanning July 2020 to June 2025, extracted from government reports via an LLM-assisted digitisation pipeline. Second, we evaluate seven forecasting approaches spanning classical models - naïve persistence, SARIMA, and Prophet - and deep learning architectures - BiLSTM, Transformer, Time2Vec-enhanced Transformer, and Informer - with Diebold-Mariano statistical significance tests. Commodity price forecastability is fundamentally heterogeneous: naïve persistence dominates on near-random-walk commodities. Time2Vec temporal encoding provides no statistically significant advantage over fixed sinusoidal encoding and causes catastrophic degradation on green chilli (+146.1% MAE, p |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.06227 |
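A minimal sketch of the Diebold-Mariano test of equal predictive accuracy used for the significance comparisons above, assuming squared-error loss and a simple rectangular-kernel long-run variance with h-1 autocovariance lags:

```python
import numpy as np
from scipy import stats

def diebold_mariano(errors_a, errors_b, h=1):
    """DM statistic and two-sided p-value for H0: equal expected squared-error loss."""
    d = np.asarray(errors_a, float) ** 2 - np.asarray(errors_b, float) ** 2
    T = len(d)
    gamma = [np.cov(d[k:], d[:T - k])[0, 1] for k in range(h)]  # autocovariances at lags 0..h-1
    lrv = gamma[0] + 2 * sum(gamma[1:])                         # long-run variance of the loss differential
    dm = d.mean() / np.sqrt(lrv / T)
    return dm, 2 * (1 - stats.norm.cdf(abs(dm)))
```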
| By: | Zhenyu Gao; Wenxi Jiang; Yutong Yan |
| Abstract: | Prior research shows that large language models (LLMs) exhibit systematic extrapolation bias when forming predictions from both experimental and real-world data, and that prompt-based approaches appear limited in alleviating this bias. We propose a supervised fine-tuning (SFT) approach that uses Low-Rank Adaptation (LoRA) to train off-the-shelf LLMs on instruction datasets constructed from rational benchmark forecasts. By intervening at the parameter level, SFT changes how LLMs map observed information into forecasts and thereby mitigates extrapolation bias. We evaluate the fine-tuned model in two settings: controlled forecasting experiments and cross-sectional stock return prediction. In both settings, fine-tuning corrects the extrapolative bias out-of-sample, establishing a low-cost and generalizable method for debiasing LLMs. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.02921 |
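The LoRA-based supervised fine-tuning described above is typically set up with the Hugging Face peft library. The base model name and hyperparameters below are illustrative assumptions, not the authors' configuration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.1-8B-Instruct"   # hypothetical off-the-shelf base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],     # adapt only the attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()           # LoRA trains only a small fraction of the weights
# The adapted model would then be fine-tuned on instruction pairs built from
# rational benchmark forecasts, as described in the abstract.
```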
| By: | Mykola Babiak; Jozef Barunik; Josef Kurka |
| Abstract: | Cross-sectional dispersion in firm-level realized skewness is significantly and negatively related to future stock market returns. The predictive power of skewness dispersion is robust to in-sample and out-of-sample estimation and is incremental over a broad set of existing predictors, with only a few alternatives retaining independent explanatory ability. Skewness dispersion also delivers substantial economic gains in portfolio allocation. Its forecasting power is concentrated in months with monetary policy announcements, reflecting an information-based mechanism. The empirical evidence suggests that skewness dispersion captures the gradual incorporation of macro news into prices, which is driven by variation in aggregate risk and valuation adjustments. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.07870 |
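A sketch of how firm-level realized skewness and its cross-sectional dispersion might be computed. The scaling follows a common realized-skewness definition from intraday returns, and using the cross-sectional standard deviation as the dispersion measure is an assumption, not necessarily the authors' construction:

```python
import numpy as np

def realized_skewness(intraday_returns):
    """Realized skewness of one firm over one period from intraday returns."""
    r = np.asarray(intraday_returns, float)
    n = len(r)
    return np.sqrt(n) * np.sum(r ** 3) / np.sum(r ** 2) ** 1.5

def skewness_dispersion(firm_skews):
    """Cross-sectional dispersion of firm-level realized skewness in a given month."""
    return np.std(np.asarray(firm_skews, float), ddof=1)
```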
| By: | Mostapha Benhenda |
| Abstract: | Forecasting startup success is notoriously difficult, partly because meaningful outcomes, such as exits, large funding rounds, and sustained revenue growth, are rare and can take years to materialize. As a result, signals are sparse and evaluation cycles are slow. Y Combinator batches offer a unique mitigation: each batch comprises around 200 startups, funded simultaneously, with evaluation at Demo Day only three months later. We introduce YC Bench, a live benchmark for forecasting early outperformance within YC batches. Using the YC W26 batch as a case study (196 startups), we measure outperformance with a Pre-Demo Day Score, a KPI combining publicly available traction signals and web visibility. This short-term metric enables rapid evaluation of forecasting models. As a baseline, we take Google mentions prior to the YC W26 application deadline, a simple proxy for prior brand recognition, recovering 6 of 11 top performers at YC Demo Day (55% recall). YC Bench provides a live benchmark for studying startup success forecasting, with iteration cycles measured in months rather than years. Code and Data are available on GitHub: https://github.com/benstaf/ycbench |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.02378 |
| By: | Kazim, Zeeshan |
| Abstract: | Short-horizon trader-support systems in financial markets should be judged not only by predictive skill but also by auditability, inspectability, and suitability for decision support under uncertainty. This paper studies an auditable 20-bar chart-image branch built from a retained MOEX Si futures artifact bundle. Starting from 14 contract-level one-minute candle files plus an auxiliary participant-position file, the pipeline resamples to 15-minute candles, selects the daily front contract by realized volume, applies a conservative liquidity filter, constructs a clean continuous-front series of 29,926 bars across 502 trading days, and generates 10,350 same-day, same-contract 20-bar windows rendered as 64 x 60 grayscale candlestick images. A three-block convolutional neural network (CNN) and a Grad-CAM-style local explanation layer are then embedded in a dashboard-centered inspection workflow. On the held-out test split (1,646 windows), the model attains 0.552 accuracy, 0.530 balanced accuracy, 0.408 F1, 0.557 ROC-AUC, and 0.063 MCC. Performance is modest and, at the default 0.50 threshold, trails the naive majority-class baseline on raw accuracy, while remaining better than random in threshold-free ranking terms. Results are contract-concentrated: SiZ5 materially outperforms SiU5. Quantitative explanation summaries show that heat mass is concentrated mainly in the price panel (0.807) rather than volume (0.193), but remains broad and only weakly focused on the final bar (0.043). The contribution is therefore bounded: not novelty of chart-image CNN forecasting or Grad-CAM in futures, both of which already exist, but the integration of chronology-aware futures engineering, auditable chart-image modeling, local explanation outputs, and row-level dashboard inspection in a trader-facing decision-support artifact. Fee-aware economic validation and formal user evaluation remain separated as pending next-stage work. |
| Date: | 2026–03–31 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:btnz8_v1 |
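An illustrative three-block CNN for 64 x 60 grayscale chart images of the kind described above; channel counts and the classification head are assumptions, since the abstract does not report the exact layer sizes:

```python
import torch.nn as nn

class ChartCNN(nn.Module):
    """Three convolution/pool blocks followed by a single up/down logit."""
    def __init__(self):
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2))
        self.features = nn.Sequential(block(1, 16), block(16, 32), block(32, 64))
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 8 * 7, 1))  # 64x60 shrinks to 8x7 after 3 pools

    def forward(self, x):          # x: (batch, 1, 64, 60)
        return self.head(self.features(x))
```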
| By: | Jaden Zhang; Gardenia Liu; Oliver Johansson; Hileamlak Yitayew; Kamryn Ohly; Grace Li |
| Abstract: | We introduce Prediction Arena, a benchmark for evaluating AI models' predictive accuracy and decision-making by enabling them to trade autonomously on live prediction markets with real capital. Unlike synthetic benchmarks, Prediction Arena tests models in environments where trades execute on actual exchanges (Kalshi and Polymarket), providing objective ground truth that cannot be gamed or overfitted. Each model operates as an independent agent starting with $10,000, making autonomous decisions every 15-45 minutes. Over a 57-day longitudinal evaluation (January 12 to March 9, 2026), we track two cohorts: six frontier models in live trading (Cohort 1, full period) and four next-generation models in paper trading (Cohort 2, 3-day preliminary). For Cohort 1, final Kalshi returns range from -16.0% to -30.8%. Our analysis identifies a clear performance hierarchy: initial prediction accuracy and the ability to capitalize on correct predictions are the main drivers, while research volume shows no correlation with outcomes. A striking cross-platform contrast emerges from parallel Polymarket live trading: Cohort 1 models averaged only -1.1% on Polymarket vs. -22.6% on Kalshi, with grok-4-20-checkpoint achieving a 71.4% settlement win rate - the highest across any platform or cohort. gemini-3.1-pro-preview (Cohort 2), which executed zero trades on Kalshi, achieved +6.02% on Polymarket in 3 days - the best return of any model across either cohort - demonstrating that platform design has a profound effect on which models succeed. Beyond performance, we analyze computational efficiency (token usage, cycle time), settlement accuracy, exit patterns, and market preferences, providing a comprehensive view of how frontier models behave under real financial pressure. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.07355 |
| By: | Nolan Alexander; William Scherer |
| Abstract: | We propose a novel model to achieve superior out-of-sample Sharpe ratios. While most research in asset allocation focuses on estimating the return vector and covariance matrix, the first component of our novel model instead forecasts the future tangency portfolio, and the second component then determines the optimal investment portfolio. First, to forecast the tangency portfolio, we forecast the efficient frontier by decomposing its functional form, a square root second-order polynomial, into three interpretable coefficients, which can then be used to calculate a forecasted tangency portfolio. These coefficients can be forecasted using vector autoregressions. Second, the model invests in the portfolio on the efficient frontier that is the minimum Euclidean distance from this forecasted tangency portfolio. A motivation for our approach is to address the limitation that the tangency portfolio only maximizes the Sharpe ratio when future returns and covariances are stationary, and can be directly estimated with historical data, which often does not hold in out-of-sample data. Our approach addresses this shortcoming in a novel way by forecasting the tangency portfolio, rather than estimating return and covariance. For empirical testing, we employ two sets of assets that span the market to demonstrate and validate the performance of this novel method. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.03948 |
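The frontier parameterization and minimum-distance rule described above can be written schematically as follows; the coefficient names and the two-dimensional (sigma, mu) distance are illustrative notation, not necessarily the authors' exact formulation:

```latex
\sigma_t(\mu) = \sqrt{a_t\,\mu^2 + b_t\,\mu + c_t},
\qquad
(\hat a_{t+1}, \hat b_{t+1}, \hat c_{t+1}) \ \text{forecast by a VAR on } (a_t, b_t, c_t),
\qquad
w^{*} = \arg\min_{w \in \mathcal{F}_{t+1}}
\bigl\| \bigl(\sigma(w), \mu(w)\bigr) - \bigl(\hat\sigma_{\mathrm{tan}}, \hat\mu_{\mathrm{tan}}\bigr) \bigr\|_2 ,
```

where \(\mathcal{F}_{t+1}\) denotes the forecasted efficient frontier and \((\hat\sigma_{\mathrm{tan}}, \hat\mu_{\mathrm{tan}})\) the forecasted tangency point.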
| By: | Daichi Hiraki; Siddhartha Chib; Yasuhiro Omori |
| Abstract: | We develop a dynamic factor stochastic volatility-in-mean (SVM) specification for vector autoregressions (VARs) that embeds an SVM component within a dynamic factor stochastic volatility structure. A small number of latent volatility factors capture common movements in conditional variances, while volatility enters the conditional mean of the VAR. This specification allows time-varying uncertainty to influence macroeconomic dynamics through both second moments and expected outcomes while preserving tractability in large panels. We construct an efficient Markov chain Monte Carlo algorithm for estimation in this high-dimensional, non-Gaussian setting. Using quarterly data on twenty variables from the FRED-QD database, we compare predictive performance with the benchmark stochastic volatility VAR model. The dynamic factor SVM specification delivers superior forecasts for more variables during major macroeconomic disruptions such as the 2008 global financial crisis. The results indicate that allowing volatility to enter the mean captures an important transmission channel in macroeconomic dynamics. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.04529 |
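One way to write down a factor SVM-in-mean VAR of the kind described above; the symbols here are illustrative notation rather than the paper's exact specification:

```latex
y_t = c + \sum_{j=1}^{p} A_j\, y_{t-j} + \Gamma h_t + \varepsilon_t,
\qquad \varepsilon_t \sim N(0, \Sigma_t),
\qquad \Sigma_t = B\,\mathrm{diag}\!\left(e^{h_{1t}}, \dots, e^{h_{qt}}\right) B' + \Omega,
\qquad h_{it} = \phi_i\, h_{i,t-1} + \eta_{it},
```

so that a small number q of latent log-volatility factors \(h_t\) drive the conditional variances while also entering the conditional mean through \(\Gamma h_t\).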
| By: | Mr. Sam Ouliaris; Ms. Celine Rochon |
| Abstract: | The Quarterly Projection Model (QPM) is one of the IMF’s standard frameworks for monetary policy analysis and forms a core component of a forward‑looking Forecasting and Policy Analysis System (FPAS). Traditionally, the QPM is solved using simulation tools available in MATLAB. This technical note demonstrates how the canonical QPM can instead be implemented using the EViews econometric package, with the aim of reducing the technical barriers to applying the model in practice. The note is intended for policy analysts and economists who wish to adapt the QPM to country‑specific settings without requiring advanced proficiency in the EViews programming language. The approach is illustrated using a notional “Country Z,” but the methodology can be readily adapted to real‑world country applications. Users need only assemble the required data and make targeted modifications to existing EViews code, such as selecting the appropriate exchange‑rate regime equations and calibrating key model constants. In addition to solving the QPM for its baseline projection, the technical note shows how to construct and analyze alternative scenarios. These scenarios may involve multiple exogenous shocks and constraints on selected endogenous variables, enabling users to assess the dynamic response of the economy and the speed and path of its return to the baseline. |
| Keywords: | forecasting; monetary policy analysis; quarterly projection model; time series analysis; real exchange rates |
| Date: | 2026–04–06 |
| URL: | https://d.repec.org/n?u=RePEc:imf:imftnm:2026/003 |