nep-for New Economics Papers
on Forecasting
Issue of 2013‒07‒20
thirteen papers chosen by
Rob J Hyndman
Monash University

  1. Mining Big Data Using Parsimonious Factor and Shrinkage Methods By Hyun Hak Kim; Norman Swanson
  2. Diffusion Index Model Specification and Estimation Using Mixed Frequency Datasets By Kihwan Kim; Norman Swanson
  3. Testing for Structural Stability of Factor Augmented Forecasting Models By Valentina Corradi; Norman Swanson
  4. Accuracy measures for forecast intervals By Bratu, Mihaela
  5. A Survey of Recent Advances in Forecast Accuracy Comparison Testing, with an Extension to Stochastic Dominance By Valentina Corradi; Norman Swanson
  6. Forecasting multivariate time series under present-value-model short- and long-run co-movement restrictions By Guillén, Osmani Teixeira de Carvalho; Hecq, Alain; Issler, João Victor; Saraiva, Diogo
  7. Density and Conditional Distribution Based Specification Analysis By Diep Duong; Norman Swanson
  8. Long-Run Risk and Hidden Growth Persistence By Pakos, Michal
  9. Predicting financial markets with Google Trends and not so random keywords By Challet Damien; Bel Hadj Ayed Ahmed
  10. Modified Conditional AIC in Linear Mixed Models By Yuki Kawakubo; Tatsuya Kubokawa
  11. On the Size of Fiscal Multipliers: A Counterfactual Analysis By Jan Kuckuck; Frank Westermann
  12. A Note on the Forward- and the Equity-Premium Puzzles: Two Symptoms of the Same Illness? By Costa, Carlos E. da; Issler, João Victor; Matos, Paulo F.
  13. A Random Coefficients Logit Analysis of the Counterfactual: A Merger and Divestiture in the Australian Cigarette Industry By Vivienne Pham; David Prentice

  1. By: Hyun Hak Kim (Bank of Korea); Norman Swanson (Rutgers University)
    Abstract: A number of recent studies in the economics literature have focused on the usefulness of factor models in the context of prediction using "big data". In this paper, our over-arching question is whether such "big data" are useful for modelling low frequency macroeconomic variables such as unemployment, inflation and GDP. In particular, we analyze the predictive benefits associated with the use of dimension-reducing independent component analysis (ICA) and sparse principal component analysis (SPCA), coupled with a variety of other factor estimation and data shrinkage methods, including bagging, boosting, and the elastic net, among others. We do so by carrying out a forecasting "horse race", involving the estimation of 28 different baseline model types, each constructed using a variety of specification approaches, estimation approaches, and benchmark econometric models; and all used in the prediction of 11 key macroeconomic variables relevant for monetary policy assessment. In many instances, we find that several of our benchmark specifications, including autoregressive (AR) models, AR models with exogenous variables, and (Bayesian) model averaging, do not dominate more complicated nonlinear methods, and that using a combination of factor and other shrinkage methods often yields superior predictions. For example, simple averaging methods are mean square forecast error (MSFE) "best" in only 9 of 33 key cases considered. This is rather surprising new evidence that model averaging methods do not necessarily yield MSFE-best predictions. However, in order to "beat" model averaging methods, including arithmetic mean and Bayesian averaging approaches, we have introduced into our "horse race" numerous complex new models that combine complicated factor estimation methods with interesting new forms of shrinkage. For example, SPCA yields MSFE-best prediction models in many cases, particularly when coupled with shrinkage.
This result provides strong new evidence of the usefulness of sophisticated factor based forecasting, and therefore, of the use of "big data" in macroeconometric forecasting.
    Keywords: prediction, independent component analysis, robust regression, shrinkage, factors
    JEL: C32 C53 G17
    Date: 2013–07–16
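The MSFE "horse race" logic described in the abstract can be sketched in a few lines. Everything below is a hypothetical toy: the AR(1) benchmark, the ridge-style shrinkage penalty, and the deterministic data are illustrative stand-ins, not the paper's 28 model types. Each candidate is re-estimated on an expanding window and judged by its mean square forecast error.

```python
def ar1_coef(y, ridge=0.0):
    """OLS (or ridge-penalized) slope of y_t on y_{t-1}, no intercept."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y))) + ridge
    return num / den

def recursive_msfe(y, first_forecast, ridge=0.0):
    """Re-estimate on an expanding window; average squared 1-step errors."""
    errors = []
    for t in range(first_forecast, len(y)):
        beta = ar1_coef(y[:t], ridge)
        errors.append((y[t] - beta * y[t - 1]) ** 2)
    return sum(errors) / len(errors)

# Toy AR(1)-like data with coefficient 0.8 and deterministic "noise".
y = [1.0]
for t in range(1, 120):
    y.append(0.8 * y[-1] + ((-1) ** t) * 0.1)

msfe_ols = recursive_msfe(y, first_forecast=60, ridge=0.0)
msfe_ridge = recursive_msfe(y, first_forecast=60, ridge=5.0)
best = "OLS AR(1)" if msfe_ols <= msfe_ridge else "ridge AR(1)"
```

In the paper's exercise the same scoring step is repeated across all specification/estimation combinations; the model with the lowest out-of-sample MSFE is declared "best" for that target variable.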
  2. By: Kihwan Kim (Rutgers University); Norman Swanson (Rutgers University)
    Abstract: In this chapter, we discuss the use of mixed frequency models and diffusion index approximation methods in the context of prediction. In particular, select recent specification and estimation methods are outlined, and an empirical illustration is provided wherein U.S. unemployment forecasts are constructed using both classical principal components based diffusion indexes as well as using a combination of diffusion indexes and factors formed using small mixed frequency datasets. Preliminary evidence that mixed frequency based forecasting models yield improvements over standard fixed frequency models is presented.
    Keywords: forecasting, diffusion index, mixed frequency, recursive estimation, Kalman filter
    JEL: C22 C51
    Date: 2013–07–16
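A diffusion index in the classical principal-components sense mentioned above can be illustrated as follows. The power-iteration extraction and the three-series toy panel are my own simplification, not the chapter's estimator: the first principal component of a demeaned panel serves as a single common factor.

```python
def first_factor(X, iters=100):
    """Scores on the first principal component of a T-by-N panel X."""
    T, N = len(X), len(X[0])
    means = [sum(row[j] for row in X) / T for j in range(N)]
    Z = [[X[t][j] - means[j] for j in range(N)] for t in range(T)]
    # N-by-N second-moment matrix of the demeaned panel
    S = [[sum(Z[t][i] * Z[t][j] for t in range(T)) for j in range(N)]
         for i in range(N)]
    # power iteration for the leading eigenvector (the factor loadings)
    v = [1.0] * N
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(N)) for i in range(N)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # factor scores: projection of each period's cross-section on loadings
    return [sum(Z[t][j] * v[j] for j in range(N)) for t in range(T)]

# Toy panel: three series that are exact multiples of one common component,
# so the extracted factor should track that component perfectly.
common = [((-1) ** t) * (1 + (t % 5)) for t in range(40)]
panel = [[c, 2 * c, -1.5 * c] for c in common]
f = first_factor(panel)
```

Real applications use hundreds of series and several factors, and the mixed-frequency extension combines factors estimated at different sampling frequencies (e.g. via the Kalman filter).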
  3. By: Valentina Corradi (Warwick University); Norman Swanson (Rutgers University)
    Abstract: Mild factor loading instability, particularly if sufficiently independent across the different constituent variables, does not affect the estimation of the number of factors, nor subsequent estimation of the factors themselves (see e.g. Stock and Watson (2009)). This result does not hold in the presence of large common breaks in the factor loadings, however. In this case, information criteria overestimate the number of factors. Additionally, estimated factors are no longer consistent estimators of "true" factors. Hence, various recent research papers in the diffusion index literature focus on testing the constancy of factor loadings. One reason why this is a positive development is that in applied work, factor augmented forecasting models are used widely for prediction, and it is important to understand when such models are stable. Now, forecast failure of factor augmented models can be due to either factor loading instability, regression coefficient instability, or both. To address this issue, we develop a test for the joint hypothesis of structural stability of both factor loadings and factor augmented forecasting model regression coefficients. The proposed statistic is based on the difference between full sample and rolling sample estimators of the sample covariance of the factors and the variable to be forecasted. Failure to reject the null ensures the structural stability of the factor augmented forecasting model. If the null is instead rejected, one can proceed to disentangle the cause of the rejection as being due to either (or both) of the aforementioned varieties of instability. Standard inference can be carried out, as the suggested statistic has a chi-squared limiting distribution. We also establish the first order validity of (block) bootstrap critical values. Finally, we provide an empirical illustration by testing for the structural stability of factor augmented forecasting models for 11 U.S. macroeconomic indicators.
    Keywords: diffusion index, factor loading stability, forecast failure, forecast stability, regression coefficient stability
    JEL: C12 C22 C53
    Date: 2013–07–16
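The core ingredient of the proposed statistic, the gap between full-sample and rolling-sample covariance estimators, can be sketched as follows. The scaling that delivers the chi-squared limit is omitted, and the factor series, target series, and window length are all made up for illustration; under stability the two estimators should stay close.

```python
def cov(x, y):
    """Sample covariance of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

def max_rolling_gap(f, y, window):
    """Largest absolute deviation of rolling from full-sample covariance."""
    full = cov(f, y)
    gaps = [abs(cov(f[t:t + window], y[t:t + window]) - full)
            for t in range(len(f) - window + 1)]
    return max(gaps)

# Toy factor and two targets: one with a stable loading, one whose loading
# quadruples halfway through the sample (a large common break).
f = [float((-1) ** t) for t in range(100)]
y_stable = [0.5 * x for x in f]
y_break = [0.5 * x for x in f[:50]] + [2.0 * x for x in f[50:]]
gap_stable = max_rolling_gap(f, y_stable, window=30)
gap_break = max_rolling_gap(f, y_break, window=30)
```

The break series produces a large gap while the stable series produces essentially none, which is the intuition the formal test turns into a chi-squared statistic with bootstrap critical values.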
  4. By: Bratu, Mihaela (Academy of Economic Studies, Faculty of Cybernetics, Statistics and Economic Informatics)
    Abstract: The objective of this research is to present some accuracy measures associated with forecast intervals, taking into account the fact that specific accuracy indicators for this type of prediction have not yet been proposed in the literature. For the quarterly inflation rate provided by the National Bank of Romania, forecast intervals were built over the horizon 2010-2012. According to the number of intervals that include the real value, and to an econometric procedure based on dummy variables, the intervals based on historical errors (RMSE, root mean squared error) are better than those based on the BCA bootstrap procedure. However, according to the new indicator proposed in this paper as a measure of global accuracy, the M indicator, the forecast intervals based on BCA bootstrapping are more accurate than the intervals based on historical RMSE. Bayesian intervals were also constructed for quarterly U.S. inflation in 2012 using a priori information, but the smaller intervals did not imply an increase in the degree of accuracy.
    Keywords: forecast intervals, accuracy, uncertainty, BCA bootstrap intervals, indicator M
    JEL: C10 C14 L6
    Date: 2013–07
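The "number of intervals that include the real value" criterion can be made concrete with a small sketch. The RMSE-based interval construction below (point forecast plus or minus z times the RMSE of past errors) is a generic textbook version, and all numbers are made up; it is not the paper's exact procedure.

```python
def rmse(errors):
    """Root mean squared error of a list of forecast errors."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5

def coverage(points, actuals, hist_errors, z=1.96):
    """Fraction of realized values falling inside the forecast intervals."""
    half = z * rmse(hist_errors)
    return sum(p - half <= a <= p + half
               for p, a in zip(points, actuals)) / len(actuals)

points = [2.0, 2.1, 1.9, 2.0]           # hypothetical point forecasts
actuals = [2.2, 2.0, 1.5, 2.8]          # realized values
hist_errors = [0.2, -0.1, 0.15, -0.25]  # past one-step forecast errors
cov95 = coverage(points, actuals, hist_errors)
```

Comparing such empirical coverage rates (here for nominally 95% intervals) across construction methods is the simple counting criterion the paper starts from, before moving to its global M indicator.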
  5. By: Valentina Corradi (Warwick University); Norman Swanson (Rutgers University)
    Abstract: In recent years, an impressive body of research on predictive accuracy testing and model comparison has been published in the econometrics discipline. Key contributions to this literature include the paper by Diebold and Mariano (DM: 1995) that sets the groundwork for much of the subsequent work in the area, West (1996) who considers a variant of the DM test that allows for parameter estimation error in certain contexts, and White (2000) who develops testing methodology suitable for comparing many models. In this chapter, we begin by reviewing various key testing results in the extant literature, both under vanishing and non-vanishing parameter estimation error, with focus on the construction of valid bootstrap critical values in the case of non-vanishing parameter estimation error, under recursive estimation schemes, drawing on Corradi and Swanson (2007a). We then review recent extensions to the evaluation of multiple confidence intervals and predictive densities, for both the case of a known conditional distribution (Corradi and Swanson 2006a,b) and of an unknown conditional distribution (Corradi and Swanson 2007b). Finally, we introduce a novel approach in which forecast combinations are evaluated via the examination of the quantiles of the expected loss distribution. More precisely, we compare models looking at cumulative distribution functions (CDFs) of prediction errors, for a given loss function, via the principle of stochastic dominance; and we choose the model whose CDF is stochastically dominated, over some given range of interest.
    Keywords: block bootstrap, recursive estimation scheme, reality check, parameter estimation error, forecasting
    JEL: C22 C51
    Date: 2013–07–15
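The closing idea, choosing among models by stochastic dominance of their loss CDFs, reduces to an empirical-CDF comparison. The sketch below uses made-up loss samples and a coarse grid: a model with stochastically smaller losses has a loss CDF that lies weakly above its rival's at every point of the range of interest.

```python
def ecdf(sample, x):
    """Empirical CDF of a sample evaluated at x."""
    return sum(s <= x for s in sample) / len(sample)

def dominates(losses_a, losses_b, grid):
    """True if A's loss CDF lies weakly above B's at every grid point,
    i.e. A's losses are (first-order) stochastically smaller."""
    return all(ecdf(losses_a, x) >= ecdf(losses_b, x) for x in grid)

losses_a = [0.1, 0.2, 0.3, 0.4]   # hypothetical squared forecast errors, model A
losses_b = [0.2, 0.3, 0.4, 0.5]   # hypothetical squared forecast errors, model B
grid = [i / 10 for i in range(11)]
prefer_a = dominates(losses_a, losses_b, grid)
```

The chapter's procedure adds the inferential machinery (accounting for sampling variability in these CDF estimates) that a raw pointwise comparison like this one lacks.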
  6. By: Guillén, Osmani Teixeira de Carvalho; Hecq, Alain; Issler, João Victor; Saraiva, Diogo
    Abstract: It is well known that cointegration between the levels of two variables (e.g. prices and dividends) is a necessary condition to assess the empirical validity of a present-value model (PVM) linking them. The work on cointegration, namely on long-run co-movements, has been so prevalent that it is often overlooked that another necessary condition for the PVM to hold is that the forecast error entailed by the model is orthogonal to the past. This amounts to investigating whether short-run co-movements stemming from common cyclical feature restrictions are also present in such a system. In this paper we test for the presence of such co-movements on long- and short-term interest rates and on prices and dividends for the U.S. economy. We focus on the potential improvement in forecasting accuracy when imposing those two types of restrictions coming from economic theory.
    Date: 2013–07–01
  7. By: Diep Duong (Rutgers University); Norman Swanson (Rutgers University)
    Abstract: The technique of using densities and conditional distributions to carry out consistent specification testing and model selection amongst multiple diffusion processes has received considerable attention from both financial theoreticians and empirical econometricians over the last two decades. One reason for this interest is that correct specification of diffusion models describing dynamics of financial assets is crucial for many areas in finance including equity and option pricing, term structure modeling, and risk management, for example. In this paper, we discuss advances to this literature introduced by Corradi and Swanson (2005), who compare the cumulative distribution (marginal or joint) implied by a hypothesized null model with corresponding empirical distributions of observed data. We also outline and expand upon further testing results from Bhardwaj, Corradi and Swanson (BCS: 2008) and Corradi and Swanson (2011). In particular, parametric specification tests in the spirit of the conditional Kolmogorov test of Andrews (1997) that rely on block bootstrap resampling methods in order to construct test critical values are first discussed. Thereafter, extensions due to BCS (2008) for cases where the functional form of the conditional density is unknown are introduced, and related continuous time simulation methods are introduced. Finally, we broaden our discussion from single process specification testing to multiple process model selection by discussing how to construct predictive densities and how to compare the accuracy of predictive densities derived from alternative (possibly misspecified) diffusion models. In particular, we generalize the simulation steps outlined in Cai and Swanson (2011) to multifactor models where the number of latent variables is larger than three.
These final tests can be thought of as continuous time generalizations of the discrete time "reality check" test statistics of White (2000), which are widely used in empirical finance (see e.g. Sullivan, Timmermann and White (1999, 2001)). We finish the chapter with an empirical illustration of model selection amongst alternative short term interest rate models.
    Keywords: multi-factor diffusion process, specification test, out-of-sample forecast, jump process, block bootstrap
    JEL: C22 C51
    Date: 2013–07–16
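The distribution-comparison ingredient shared by these tests is a Kolmogorov-type distance between a hypothesized CDF and the empirical CDF of the data. The uniform null and the evenly spaced sample below are purely illustrative; the chapter's tests use model-implied (conditional) distributions and block bootstrap critical values.

```python
def empirical_cdf(sample, x):
    """Empirical CDF of a sample evaluated at x."""
    return sum(s <= x for s in sample) / len(sample)

def ks_stat(sample, null_cdf, grid):
    """Max absolute gap between empirical and hypothesized CDFs on a grid."""
    return max(abs(empirical_cdf(sample, x) - null_cdf(x)) for x in grid)

def uniform_cdf(x):
    # CDF of the Uniform(0, 1) null, used here purely as an example
    return min(max(x, 0.0), 1.0)

sample = [i / 10 for i in range(1, 11)]   # evenly spread on (0, 1]
grid = [i / 100 for i in range(101)]
d = ks_stat(sample, uniform_cdf, grid)    # close to 0.09 for this sample
```

Large values of such a distance signal misspecification of the hypothesized model; the chapter's contribution is making this comparison valid for estimated, possibly simulated, diffusion models.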
  8. By: Pakos, Michal
    Abstract: An extensive literature has analyzed the implications of hidden shifts in the dividend growth rate. However, corresponding research on learning about growth persistence is completely lacking. Hidden persistence is a novel way to introduce long-run risk into standard business-cycle models of asset prices because it tightly intertwines the cyclical and long-run frequencies. Hidden persistence magnifies endogenous changes in the forecast variance of the long-run dividend growth rate despite homoscedastic consumption innovations. Not only does changing forecast variance make discrimination between protracted spells of anemic growth and brief business recessions difficult, it also endogenously induces additional variation in asset price discounts due to the preference for early uncertainty resolution.
    Keywords: Asset Pricing, Learning, Hidden Persistence, Forecast Variance, Economic Uncertainty, Business Cycles, Long-Run Risk, Peso Problem, Timing Premium
    JEL: E13 E21 E27 E32 E37 E44 G12 G14
    Date: 2013–04–17
  9. By: Challet Damien; Bel Hadj Ayed Ahmed
    Abstract: We check the claims that data from Google Trends contain enough information to predict future financial index returns. We first discuss the many subtle (and less subtle) biases that may affect the backtest of a trading strategy, particularly when based on such data. As expected, the choice of keywords is crucial: using an industry-grade backtesting system, we verify that random finance-related keywords do not contain more exploitable predictive information than random keywords related to illnesses, classic cars and arcade games. We do, however, show that other keywords applied to suitable assets yield robustly profitable strategies, thereby confirming the intuition of Preis et al. (2013).
    Date: 2013–07
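A minimal sketch of the kind of keyword backtest being scrutinized, in the spirit of the Preis et al. (2013) rule of trading against the change in search volume. The rule, the keyword series, and the returns below are all made up; none of this reflects the paper's actual backtesting system or its bias corrections.

```python
def strategy_pnl(search, returns):
    """Cumulative P&L of a contrarian rule: go short the period after
    search volume rises, long after it falls."""
    pnl = 0.0
    for t in range(1, len(returns)):
        signal = -1.0 if search[t] > search[t - 1] else 1.0
        pnl += signal * returns[t]
    return pnl

# Made-up weekly search volumes and index returns.
search = [1, 2, 1, 2, 1]
returns = [0.0, -0.1, 0.1, -0.1, 0.1]
pnl = strategy_pnl(search, returns)
```

The paper's point is precisely that a positive number from a backtest like this can be an artifact of keyword selection, look-ahead, or data-revision biases unless the evaluation is done carefully out of sample.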
  10. By: Yuki Kawakubo (Graduate School of Economics, University of Tokyo); Tatsuya Kubokawa (Faculty of Economics, University of Tokyo)
    Abstract: In linear mixed models, the conditional Akaike Information Criterion (cAIC) is a procedure for variable selection in light of the prediction of specific clusters or random effects. This is useful in problems involving the prediction of random effects, such as small area estimation, and the criterion has received much attention since it was suggested by Vaida and Blanchard (2005). A weak point of cAIC is that it is derived as an unbiased estimator of the conditional Akaike information (cAI) in the overspecified case, namely the case in which the candidate models include the true model. This results in larger biases in the underspecified case, in which the true model is not included among the candidate models. In this paper, we derive a modified cAIC (McAIC) to cover both the underspecified and overspecified cases, and investigate its properties. It is shown numerically that McAIC has smaller biases and smaller prediction errors than cAIC.
    Date: 2013–07
  11. By: Jan Kuckuck (Universitaet Osnabrueck); Frank Westermann (Universitaet Osnabrueck)
    Abstract: The Structural Vector Auto-regression (SVAR) approach to estimating fiscal multipliers, following the seminal paper by Blanchard and Perotti (2002), has been widely applied in the literature. In our paper we discuss the interpretation of these estimates and suggest that they are more useful for forecasting purposes than for policy advice. Our key point is that policy instruments often react to each other. We analyze a data set from the US and document that these interactions are economically and statistically significant. Increases in spending have been financed by subsequent increases in taxes. Increases in taxes have been complemented by additional spending cuts in subsequent quarters. In a counterfactual analysis we report fiscal multipliers that abstract from these dynamic responses of policy instruments to each other.
    Keywords: Fiscal policy, government spending, net revenues, structural vector autoregression
    JEL: E62 H20 H50
    Date: 2013–06–28
  12. By: Costa, Carlos E. da; Issler, João Victor; Matos, Paulo F.
    Abstract: We build a stochastic discount factor (SDF) using information on US domestic financial data only, and provide evidence that it accounts for foreign-market stylized facts that escape SDFs generated by consumption-based models. By interpreting our SDF as the projection of the pricing kernel from a fully specified model onto the space of returns, our results indicate that a model that accounts for the behavior of domestic assets goes a long way toward accounting for the behavior of foreign asset prices. In our tests, we address predictability, a defining feature of the Forward Premium Puzzle (FPP), by using instruments that are known to forecast excess returns in the moment restrictions associated with Euler equations in both the equity and the foreign markets.
    Date: 2013–07–12
  13. By: Vivienne Pham (School of Economics, La Trobe University); David Prentice (School of Economics, La Trobe University)
    Abstract: In this paper we empirically analyse two counterfactual situations facing an antitrust authority following the merger of two of the largest international cigarette companies. First we estimate a random coefficients model of demand for cigarettes. The implied elasticity of demand for smoking and the implied marginal costs are consistent with the independent estimates available. We then use the model to simulate the proposed merger and the partial divestiture that was accepted by the Australian antitrust authority. A comparison of the relative price changes predicted by the divestiture simulation with the actual post-divestiture price changes shows that the model is partially successful in predicting the ranking of price changes across companies following the divestiture. This suggests that structural econometric analysis using a random coefficients model can provide information for antitrust authorities assessing the implications of a potential merger and partial divestiture.
    Keywords: mergers, divestitures, cigarettes, tobacco, anti-trust policy, competition policy
    JEL: L41 L66
    Date: 2013

This nep-for issue is ©2013 by Rob J Hyndman. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.