| 
 | on Forecasting | 
| By: | Xiaoqian Wang; Yanfei Kang; Rob J Hyndman; Feng Li | 
| Abstract: | Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle challenges associated with forecasting ultra-long time series by utilizing the industry-standard MapReduce framework. The proposed model combination approach facilitates distributed time series forecasting by combining the local estimators of ARIMA (AutoRegressive Integrated Moving Average) models delivered from worker nodes and minimizing a global loss function. In this way, instead of unrealistically assuming the data generating process (DGP) of an ultra-long time series stays invariant, we make assumptions only on the DGP of subseries spanning shorter time periods. We investigate the performance of the proposed distributed ARIMA models on an electricity demand dataset. Compared to ARIMA models, our approach results in significantly improved forecasting accuracy and computational efficiency both in point forecasts and prediction intervals, especially for longer forecast horizons. Moreover, we explore some potential factors that may affect the forecasting performance of our approach. | 
| Keywords: | ultra-long time series, distributed forecasting, ARIMA models, least squares approximation, MapReduce | 
| Date: | 2020 | 
| URL: | http://d.repec.org/n?u=RePEc:msh:ebswps:2020-29&r=all | 
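The combination step described in the abstract above (local estimators from worker nodes merged by minimizing a global loss) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes a zero-mean AR(1) model instead of full ARIMA, simulated data, an equal-length split in place of MapReduce, and a weighted least-squares loss in which each local estimate is weighted by its estimated precision. All function names are hypothetical.

```python
import random


def simulate_ar1(n, phi, seed=0):
    """Simulate a zero-mean AR(1) series with standard normal noise (illustrative data)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def fit_ar1(y):
    """Least-squares AR(1) fit without intercept; returns (phi_hat, precision)."""
    pairs = list(zip(y[:-1], y[1:]))
    sxx = sum(a * a for a, _ in pairs)
    sxy = sum(a * b for a, b in pairs)
    phi_hat = sxy / sxx
    rss = sum((b - phi_hat * a) ** 2 for a, b in pairs)
    sigma2 = rss / (len(pairs) - 1)
    # Precision is the inverse of the estimated variance of phi_hat.
    return phi_hat, sxx / sigma2


def split(y, k):
    """Cut the series into k contiguous subseries of equal length (the 'map' inputs)."""
    size = len(y) // k
    return [y[i * size:(i + 1) * size] for i in range(k)]


def combine(local_fits):
    """Minimize sum_k precision_k * (phi - phi_k)^2, i.e. a precision-weighted mean."""
    total_precision = sum(p for _, p in local_fits)
    return sum(p * phi for phi, p in local_fits) / total_precision


series = simulate_ar1(20000, 0.6)
local_fits = [fit_ar1(chunk) for chunk in split(series, 10)]  # one fit per "worker"
phi_combined = combine(local_fits)
print(round(phi_combined, 3))
```

Each worker only needs its own subseries, and the combining node only needs one estimate and one precision per worker, which is what makes the scheme cheap to distribute.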
| By: | Jorge Fornero; Andrés Gatty | 
| Abstract: | Every forecast has an associated measure of predictive uncertainty. The Central Bank of Chile (CBoC) uses fan charts in the Monetary Policy Report (MPR) to communicate the uncertainty of its inflation and GDP growth projections. This work evaluates their properties ex post using empirical techniques. In general, we find that fan charts have been relatively accurate in illustrating the true density generated by the conditional mean within forecasting horizons of up to one year. While inflation forecasts are unbiased, forecasts of GDP growth have been optimistic on average. For a recent sub-sample in which risks for GDP growth were made explicit, we graphically examine whether asymmetric fan charts are more accurate ex post than symmetric fan charts. For these cases, the median projection seems to have provided a better guide than the mode. | 
| Date: | 2020–06 | 
| URL: | http://d.repec.org/n?u=RePEc:chb:bcchwp:881&r=all | 
| By: | Jesus Lago; Grzegorz Marcjasz; Bart De Schutter; Rafał Weron | 
| Abstract: | While the field of electricity price forecasting has benefited from plenty of contributions in the last two decades, it arguably lacks a rigorous approach to evaluating new predictive algorithms. The latter are often compared using unique, not publicly available datasets and across test samples that are too short and limited to one market. The proposed new methods are rarely benchmarked against well-established and well-performing simpler models, the accuracy metrics are sometimes inadequate, and testing the significance of differences in predictive performance is seldom conducted. Consequently, it is not clear which methods perform well nor what the best practices are when forecasting electricity prices. In this paper, we tackle these issues by performing a literature survey of state-of-the-art models, comparing state-of-the-art statistical and deep learning methods across multiple years and markets, and by putting forward a set of best practices. In addition, we make available the considered datasets, forecasts of the state-of-the-art models, and a specifically designed Python toolbox, so that new algorithms can be rigorously evaluated in future studies. | 
| Date: | 2020–08 | 
| URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.08004&r=all | 
| By: | Boyuan Zhang | 
| Abstract: | In this paper, we estimate and leverage latent constant group structure to generate point, set, and density forecasts for short dynamic panel data. We implement a nonparametric Bayesian approach to simultaneously identify coefficients and group membership in the random effects, which are heterogeneous across groups but fixed within a group. This method allows us to incorporate subjective prior knowledge on the group structure that potentially improves the predictive accuracy. In Monte Carlo experiments, we demonstrate that our Bayesian grouped random effects (BGRE) estimators produce accurate estimates and score predictive gains over standard panel data estimators. With a data-driven group structure, the BGRE estimators exhibit clustering accuracy comparable to the unsupervised machine learning algorithm K-means and outperform K-means in a two-step procedure. In the empirical analysis, we apply our method to forecast the investment rate across a broad range of firms and illustrate that the estimated latent group structure improves forecasts relative to standard panel data estimators. | 
| Date: | 2020–07 | 
| URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.02435&r=all | 
| By: | Grzegorz Marcjasz; Jesus Lago; Rafał Weron | 
| Abstract: | Recent advancements in the fields of artificial intelligence and machine learning have resulted in a significant increase in their popularity in the literature, including electricity price forecasting. These methods cover a very broad spectrum, from decision trees, through random forests, to various artificial neural network models and hybrid approaches. In electricity price forecasting, neural networks are the most popular machine learning method, as they provide a non-linear counterpart to well-tested linear regression models. Their application, however, is not straightforward, with multiple implementation factors to consider. One such factor is the network's structure. This paper provides a comprehensive comparison of the two most common structures when using deep neural networks: one that focuses on each hour of the day separately, and one that reflects the daily auction structure and models vectors of the prices. The results show a significant accuracy advantage of using the latter, confirmed on data from five distinct power exchanges. | 
| Date: | 2020–08 | 
| URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.08006&r=all | 
| By: | Jan Wessel (Institute of Transport Economics, Muenster) | 
| Abstract: | Although several papers have shown that bike ridership is affected by actual weather conditions, this is the first study to comprehensively investigate the impact of forecasted weather conditions on bike ridership. The results show that both actual and forecasted weather conditions can be used as useful explanatory variables for predicting bicycle usage. Even incorrect weather forecasts can affect bike ridership, which underlines the importance of weather forecast effects for traffic planners; for example, forecasted rain can reduce bike traffic by 3.6% in periods that turn out to be rain-free. Additionally, a digital image-processing method is used to calculate the darkness of the cloud coverage displayed on weather forecast maps. The results imply that bike ridership is significantly smaller in regions with darker forecasted clouds. It is also shown that weather forecasts have a stronger impact on recreational bike traffic than on utilitarian traffic. Furthermore, various lagging and leading effects of rain forecasts are outlined. Morning rain forecasts can, for example, reduce bike ridership in midday and afternoon hours that were predicted to be rain-free. To derive these results, hourly bicycle counts from 188 automated counting stations in Germany are collected for the years 2017 and 2018. They are linked to actual weather data from Germany's National Meteorological Service and to historical weather forecasts that are deduced from weather maps of Germany's most-watched television news program. Log-linear and negative binomial regression models are then used to estimate the weather forecast effects. | 
| Keywords: | Cycling, bike ridership, automated counting stations, weather conditions, weather forecasts, image processing | 
| JEL: | R49 | 
| Date: | 2020–06 | 
| URL: | http://d.repec.org/n?u=RePEc:mut:wpaper:32&r=all | 
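As a reading aid for the bike ridership abstract above: in a log-linear count model of the form log(count) = b0 + b1 * rain_forecast + ..., a dummy-variable coefficient b1 translates into a percentage change of exp(b1) - 1. The short sketch below shows that conversion in both directions. The abstract does not report any coefficients, so the value computed here is back-derived from the quoted 3.6% figure purely for illustration, and the function names are hypothetical.

```python
import math


def percent_effect(beta):
    """Percentage change in the expected count implied by a log-linear dummy coefficient."""
    return (math.exp(beta) - 1.0) * 100.0


def coefficient_for(percent_change):
    """Coefficient that would imply the given percentage change (inverse of percent_effect)."""
    return math.log(1.0 + percent_change / 100.0)


# Illustrative only: the coefficient that corresponds to a 3.6% reduction.
beta_rain_forecast = coefficient_for(-3.6)
print(round(beta_rain_forecast, 4))
print(round(percent_effect(beta_rain_forecast), 2))
```

For small effects the coefficient and the percentage change are nearly equal in magnitude, which is why log-linear coefficients are often read directly as approximate percentage effects.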
| By: | Kaukin, Andrei (Каукин, Андрей) (The Russian Presidential Academy of National Economy and Public Administration); Kosarev, Vladimir (Косарев, Владимир) (The Russian Presidential Academy of National Economy and Public Administration) | 
| Abstract: | The paper presents a method for conditional forecasting of the economic cycle taking into account industry dynamics. The predictive model includes a neural network auto-encoder and an adapted deep convolutional network of the «WaveNet» architecture. The first function block reduces the dimension of the data. The second block predicts the phase of the economic cycle of the studied industry. A neural network uses the main components of the explanatory factors as input. The proposed model can be used both as an independent and an additional method for estimating the growth rate of the industrial production index along with dynamic factor models. | 
| Date: | 2020–05 | 
| URL: | http://d.repec.org/n?u=RePEc:rnp:wpaper:052019&r=all | 
| By: | Hannes Mueller (Institut d’Analisi Economica (CSIC), Barcelona GSE); Christopher Rauh (Université de Montréal, CIREQ) | 
| Abstract: | There is a rising interest in conflict prevention and this interest provides a strong motivation for better conflict forecasting. A key problem of conflict forecasting for prevention is that predicting the start of conflict in previously peaceful countries is extremely hard. To make progress in this hard problem this project exploits both supervised and unsupervised machine learning. Specifically, the latent Dirichlet allocation (LDA) model is used for feature extraction from 3.8 million newspaper articles and these features are then used in a random forest model to predict conflict. We find that several features are negatively associated with the outbreak of conflict and these gain importance when predicting hard onsets. This is because the decision tree uses the text features in lower nodes where they are evaluated conditionally on conflict history, which allows the random forest to adapt to the hard problem and provides useful forecasts for prevention. | 
| Date: | 2019–04 | 
| URL: | http://d.repec.org/n?u=RePEc:mtl:montec:02-2019&r=all | 
| By: | Paulina Concha Larrauri; Upmanu Lall | 
| Abstract: | Frozen concentrated orange juice (FCOJ) is a commodity traded in the International Commodity Exchange. The FCOJ future price volatility is high because the world's orange production is concentrated in a few places, which results in extreme sensitivity to weather and disease. Most of the oranges produced in the United States are from Florida. The United States Department of Agriculture (USDA) issues orange production forecasts on the second week of each month from October to July. The October forecast in particular seems to affect FCOJ price volatility. We assess how a prediction of the directionality and magnitude of the error of the USDA October forecast could affect the decision making process of multiple FCOJ market participants, and if the "production uncertainty" of the forecast could be reduced by incorporating other climate variables. The models developed open up the opportunity to assess the application of the resulting probabilistic forecasts of the USDA production forecast error on the trading decisions of the different FCOJ stakeholders, and to perhaps consider the inclusion of climate predictors in the USDA forecast. | 
| Date: | 2020–07 | 
| URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.03015&r=all | 
| By: | Eduardo Ramos-Pérez; Pablo J. Alonso-González; José Javier Núñez-Velázquez | 
| Abstract: | Currently, legal requirements demand that insurance companies increase their emphasis on monitoring the risks linked to the underwriting and asset management activities. Regarding underwriting risks, the main uncertainties that insurers must manage are related to the premium sufficiency to cover future claims and the adequacy of the current reserves to pay outstanding claims. Both risks are calibrated using stochastic models due to their nature. This paper introduces a reserving model based on a set of machine learning techniques such as Gradient Boosting, Random Forest and Artificial Neural Networks. These algorithms and other widely used reserving models are stacked to predict the shape of the runoff. To compute the deviation around a former prediction, a log-normal approach is combined with the suggested model. The empirical results demonstrate that the proposed methodology can be used to improve the performance of the traditional reserving techniques based on Bayesian statistics and a Chain Ladder, leading to a more accurate assessment of the reserving risk. | 
| Date: | 2020–08 | 
| URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07564&r=all | 
| By: | Daniel Štifanić; Jelena Musulin; Adrijana Miočević; Sandi Baressi Šegota; Roman Šubić; Zlatan Car | 
| Abstract: | COVID-19 is an infectious disease that mostly affects the respiratory system. At the time this research was performed, there were more than 1.4 million cases of COVID-19, and one of the biggest anxieties is not just our health, but our livelihoods, too. In this research, the authors investigate the impact of COVID-19 on the global economy, more specifically, the impact of COVID-19 on the financial movement of the Crude Oil price and three U.S. stock indexes: DJI, S&P 500 and NASDAQ Composite. The proposed system for predicting commodity and stock prices integrates the Stationary Wavelet Transform (SWT) and Bidirectional Long Short-Term Memory (BDLSTM) networks. Firstly, SWT is used to decompose the data into approximation and detail coefficients. After decomposition, data of the Crude Oil price and stock market indexes along with COVID-19 confirmed cases were used as input variables for future price movement forecasting. As a result, the proposed system BDLSTM+WT-ADA achieved satisfactory results in terms of the five-day Crude Oil price forecast. | 
| Date: | 2020–07 | 
| URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.02673&r=all | 
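The decomposition step mentioned in the abstract above (SWT splitting a series into approximation and detail coefficients) can be sketched at a single level. The abstract does not state which wavelet, how many levels, or which boundary handling the authors used, so the Haar wavelet, one level, and periodic wrap-around below are assumptions, the price values are made up, and the function names are hypothetical. Library implementations may use different sign and shift conventions.

```python
import math


def haar_swt_level1(x):
    """One-level undecimated Haar transform with periodic wrap-around.

    Unlike the decimated wavelet transform, both outputs keep the input length.
    """
    n = len(x)
    s = math.sqrt(2.0)
    approx = [(x[t] + x[(t + 1) % n]) / s for t in range(n)]  # smoothed component
    detail = [(x[t] - x[(t + 1) % n]) / s for t in range(n)]  # local differences
    return approx, detail


def haar_iswt_level1(approx, detail):
    """Invert the transform above: (a[t] + d[t]) / sqrt(2) recovers x[t]."""
    s = math.sqrt(2.0)
    return [(a + d) / s for a, d in zip(approx, detail)]


prices = [20.0, 21.5, 19.8, 23.1, 22.4, 25.0]  # made-up values, not real market data
approx, detail = haar_swt_level1(prices)
restored = haar_iswt_level1(approx, detail)
```

Because no samples are discarded, the coefficient series stay aligned with the original time index, which makes them convenient as additional inputs to a sequence model.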
| By: | Nathaniel Tomasetti; Catherine Forbes; Anastasios Panagiotelis | 
| Abstract: | Variational Bayesian (VB) methods produce posterior inference in a time frame considerably smaller than traditional Markov Chain Monte Carlo approaches. Although the VB posterior is an approximation, it has been shown to produce good parameter estimates and predicted values when a rich class of approximating distributions is considered. In this paper we propose the use of recursive algorithms to update a sequence of VB posterior approximations in an online, time series setting, with the computation of each posterior update requiring only the data observed since the previous update. We show how importance sampling can be incorporated into online variational inference, allowing the user to trade accuracy for a substantial increase in computational speed. The proposed methods and their properties are detailed in two separate simulation studies. Two empirical illustrations of the methods are provided, including one where a Dirichlet Process Mixture model with a novel posterior dependence structure is repeatedly updated in the context of predicting the future behaviour of vehicles on a stretch of the US Highway 101. | 
| Keywords: | importance sampling, forecasting, clustering, Dirichlet process mixture, variational inference | 
| JEL: | C11 G18 G39 | 
| Date: | 2020 | 
| URL: | http://d.repec.org/n?u=RePEc:msh:ebswps:2020-27&r=all |