
on Econometrics 
By:  David Harris; Hsein Kew; A. M. Robert Taylor 
Abstract:  This paper focuses on the estimation of the location of level breaks in time series whose shocks display nonstationary volatility (permanent changes in unconditional volatility). We propose a new feasible weighted least squares (WLS) estimator, based on an adaptive estimate of the volatility path of the shocks. We show that this estimator belongs to a generic class of weighted residual sum of squares estimators which also contains the ordinary least squares (OLS) and WLS estimators, the latter based on the true volatility process. For fixed magnitude breaks we show that the consistency rate of the generic estimator is unaffected by nonstationary volatility. We also provide local limiting distribution theory for cases where the break magnitude is either local-to-zero at some polynomial rate in the sample size or is exactly zero. The former includes the Pitman drift rate, which is shown via Monte Carlo experiments to predict well the key features of the finite sample behaviour of both the OLS and our feasible WLS estimators. The simulations highlight the importance of the break location, break magnitude, and the form of nonstationary volatility for the finite sample performance of these estimators, and show that our proposed feasible WLS estimator can deliver significant improvements over the OLS estimator under heteroskedasticity. We discuss how these results can be applied, by using level break fraction estimators on the first differences of the data, when testing for a unit root in the presence of trend breaks and/or nonstationary volatility. Methods to select between the break and no-break cases, using standard information criteria and feasible weighted information criteria based on our adaptive volatility estimator, are also discussed. Simulation evidence suggests that unit root tests based on these weighted quantities can display significantly improved finite sample behaviour under heteroskedasticity relative to their unweighted counterparts. 
An empirical illustration to U.S. and U.K. real GDP is also considered. 
Keywords:  Level break fraction, nonstationary volatility, adaptive estimation, feasible weighted estimator, information criteria, unit root tests and trend breaks. 
JEL:  C12 C22 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:20208&r=all 
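The weighted residual sum of squares class described in the abstract above can be illustrated with a minimal sketch (not the authors' implementation): for each candidate break date, fit pre- and post-break means by weighted least squares and pick the date minimizing the weighted SSR. The weight vector stands in for the paper's adaptive volatility estimate; unit weights give the OLS version, inverse-variance weights the (infeasible) WLS version.

```python
import numpy as np

def break_fraction(y, w, trim=0.1):
    """Estimate a level-break location by minimizing a weighted residual
    sum of squares over candidate break dates (trimmed at the endpoints).
    With w = 1 this reduces to the OLS break-fraction estimator; with
    w = 1/sigma_t^2 it is the (infeasible) WLS version."""
    T = len(y)
    lo, hi = int(trim * T), int((1 - trim) * T)
    best_k, best_ssr = lo, np.inf
    for k in range(lo, hi):
        m1 = np.average(y[:k], weights=w[:k])   # pre-break level
        m2 = np.average(y[k:], weights=w[k:])   # post-break level
        ssr = np.sum(w[:k] * (y[:k] - m1) ** 2) + np.sum(w[k:] * (y[k:] - m2) ** 2)
        if ssr < best_ssr:
            best_k, best_ssr = k, ssr
    return best_k / T

# Level break of magnitude 5 at fraction 0.5, mild noise
rng = np.random.default_rng(0)
y = np.r_[np.zeros(100), 5 * np.ones(100)] + 0.1 * rng.standard_normal(200)
print(break_fraction(y, np.ones(200)))  # close to 0.5
```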
By:  Yi He; Sombut Jaidee; Jiti Gao 
Abstract:  We propose a powerful quadratic test for the overall significance of many weak exogenous variables in a dense autoregressive model. By shrinking the classical weighting matrix on the sample moments to the identity, the test is asymptotically correct in high dimensions even when the number of coefficients is larger than the sample size. Our theory allows a nonparametric error distribution and estimation of the autoregressive coefficients. Using random matrix theory, we show that the test has optimal asymptotic power among a large class of competitors against local dense alternatives whose direction is free in the eigenbasis of the sample covariance matrix of the regressors. The asymptotic results are adaptive to the predictors' cross-sectional and temporal dependence structure, and do not require a limiting spectral law of their sample covariance matrix. The method extends beyond autoregressive models and allows more general nuisance parameters. Monte Carlo studies suggest good power performance of our proposed test against high-dimensional dense alternatives for various data generating processes. We apply our tests to detect the overall significance of over one hundred exogenous variables in the latest FRED-MD database for predicting the monthly growth in the US industrial production index. 
Keywords:  High-dimensional linear model, null hypothesis, uniformly powerful test. 
JEL:  C12 C21 C55 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:202013&r=all 
By:  Bo Zhang; Jiti Gao; Guangming Pan 
Abstract:  This paper considers a p-dimensional time series model of the form x(t) = Π x(t−1) + Σ^(1/2) y(t), 1 ≤ t ≤ T, where y(t) = (y(t,1), ..., y(t,p))^T and Σ^(1/2) is the square root of a symmetric positive definite matrix Σ. Here Π is a symmetric matrix which satisfies ∥Π∥_2 ≤ 1 and T(1 − ∥Π∥_min) is bounded. Each linear process y(t,j) is of the form ∑_{k=0}^∞ b(k) Z(t−k,j), where ∑_{i=0}^∞ |b(i)| < ∞ and the {Z(i,j)} are independent and identically distributed (i.i.d.) random variables with E Z(i,j) = 0, E Z(i,j)² = 1 and E Z(i,j)^4 < ∞. We first investigate the asymptotic behavior of the k largest eigenvalues of the sample covariance matrices of this time series model. We then propose a new estimator for the high-dimensional near-unit-root setting based on the largest eigenvalues of the sample covariance matrices, and use it to test for near unit roots. This approach is theoretically novel and addresses some important estimation and testing issues in the high-dimensional near-unit-root setting. Simulations are also conducted to demonstrate the finite-sample performance of the proposed test statistic. 
Keywords:  Asymptotic normality, largest eigenvalue, linear process, near-unit-root test. 
JEL:  C21 C32 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:202012&r=all 
By:  Andrea Gazzani (Bank of Italy); Alejandro Vicondoa (Instituto de Economía, Pontificia Universidad Católica de Chile) 
Abstract:  This paper proposes a novel methodology, the Bridge Proxy-SVAR, which exploits high-frequency information for the identification of Vector Autoregressive (VAR) models employed in macroeconomic analysis. The methodology comprises three steps: (I) identifying the structural shocks of interest in high-frequency systems; (II) aggregating the series of high-frequency shocks to a lower frequency; and (III) using the aggregated series of shocks as a proxy for the corresponding structural shock in lower-frequency VARs. We show, both formally and in Monte Carlo experiments, that the methodology correctly recovers the impact effect of the shocks. Thus the Bridge Proxy-SVAR can improve causal inference in macroeconomics, which typically relies on VARs identified at low frequency. In an empirical application, we identify uncertainty shocks in the U.S. by imposing weaker restrictions than the existing literature and find that they induce mildly recessionary effects. 
Keywords:  structural vector autoregression, external instrument, high-frequency identification, proxy variable, uncertainty shocks. 
JEL:  C32 C36 E32 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:bdi:wptemi:td_1274_20&r=all 
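A stylized sketch of the three Bridge Proxy-SVAR steps described above, under simplifying assumptions: the "identified" high-frequency shocks are simply drawn here, and the low-frequency VAR step is collapsed to a single-equation proxy (IV) regression, so none of the names or magnitudes below come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 12                      # high-frequency periods per low-frequency period
T = 500                     # low-frequency sample size

# Step I (assumed already done): identified high-frequency shocks
hf_shocks = rng.standard_normal((T, m))

# Step II: aggregate the high-frequency shocks to the lower frequency
z = hf_shocks.sum(axis=1)   # proxy for the low-frequency structural shock

# Step III: use z as an external instrument at the lower frequency.
s = z.copy()                             # true structural shock (by construction here)
x = s + rng.standard_normal(T)           # observed regressor, contaminated
y = 2.0 * s + rng.standard_normal(T)     # outcome with impact effect 2

b_ols = np.cov(y, x)[0, 1] / np.var(x, ddof=1)     # attenuated by contamination
b_iv = np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]     # proxy/IV estimate of the impact
print(b_ols, b_iv)  # b_iv should be near 2, b_ols below it
```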
By:  Edward P. Herbst; Benjamin K. Johannsen 
Abstract:  Local projections (LPs) are a popular tool in applied macroeconomic research. We survey the related literature and find that LPs are often used with very small samples in the time dimension. With small sample sizes, given the high degree of persistence in most macroeconomic data, impulse responses estimated by LPs can be severely biased. This is true even if the right-hand-side variable in the LP is i.i.d., or if the data set includes a large cross-section (i.e., panel data). We derive a simple expression to elucidate the source of the bias. Our expression highlights the interdependence between coefficients of LPs at different horizons. As a by-product, we propose a way to bias-correct LPs. Using U.S. macroeconomic data and identified monetary policy shocks, we demonstrate that the bias correction can be large. 
Keywords:  Local projections; Bias 
JEL:  C20 E00 
Date:  2020–01–31 
URL:  http://d.repec.org/n?u=RePEc:fip:fedgfe:202010&r=all 
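The local projections discussed above can be sketched as a horizon-by-horizon OLS regression of y_{t+h} on the shock at t. This toy version (an AR(1) with its own innovation as the "shock") illustrates the estimator only; it applies none of the paper's bias correction.

```python
import numpy as np

def local_projection(y, x, H):
    """Impulse responses by local projections: for each horizon h, regress
    y_{t+h} on a constant and x_t and keep the slope. No bias correction."""
    T = len(y)
    irf = np.empty(H + 1)
    for h in range(H + 1):
        X = np.column_stack([np.ones(T - h), x[:T - h]])
        beta, *_ = np.linalg.lstsq(X, y[h:], rcond=None)
        irf[h] = beta[1]
    return irf

# AR(1) example: y_t = 0.9 y_{t-1} + e_t, with shock series x_t = e_t
rng = np.random.default_rng(2)
T, rho = 5000, 0.9
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + e[t]
irf = local_projection(y, e, H=4)
print(irf)  # roughly rho**h: 1, 0.9, 0.81, ...
```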
By:  Natalia Bailey; George Kapetanios; M. Hashem Pesaran 
Abstract:  This paper proposes an estimator of factor strength and establishes its consistency and asymptotic distribution. The proposed estimator is based on the number of statistically significant factor loadings, taking account of the multiple testing problem. We focus on the case where the factors are observed, which is of primary interest in many applications in macroeconomics and finance. We also consider using cross-section averages as a proxy in the case of unobserved common factors. We face a fundamental factor identification issue when there is more than one unobserved common factor. We investigate the small sample properties of the proposed estimator by means of Monte Carlo experiments under a variety of scenarios. In general, we find that the estimator, and the associated inference, perform well. The test is conservative under the null hypothesis, but, nevertheless, has excellent power properties, especially when the factor strength is sufficiently high. Application of the proposed estimation strategy to factor models of asset returns shows that out of 146 factors recently considered in the finance literature, only the market factor is truly strong, while all other factors are at best semi-strong, with their strength varying considerably over time. Similarly, we only find evidence of semi-strong factors in an updated version of the Stock and Watson (2012) macroeconomic dataset. 
Keywords:  Factor models, factor strength, measures of pervasiveness, cross-sectional dependence, market factor. 
JEL:  C38 E20 G20 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:20207&r=all 
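The counting idea behind the factor strength estimator above can be sketched as follows. This is an illustration only: the Bonferroni-style threshold and the mapping from the count D of significant loadings to a strength exponent via D = N^alpha are common conventions, not necessarily the paper's exact multiple-testing correction.

```python
import numpy as np
from statistics import NormalDist

def factor_strength(R, f, size=0.05):
    """Count the statistically significant loadings of N series (columns of R)
    on an observed factor f, using a Bonferroni-style adjusted critical value,
    and map the count D into a strength exponent via D = N**alpha."""
    T, N = R.shape
    c = NormalDist().inv_cdf(1 - size / (2 * N))   # multiple-testing threshold
    X = np.column_stack([np.ones(T), f])
    xtx_inv = np.linalg.inv(X.T @ X)
    D = 0
    for i in range(N):
        beta, *_ = np.linalg.lstsq(X, R[:, i], rcond=None)
        resid = R[:, i] - X @ beta
        se = np.sqrt(resid @ resid / (T - 2) * xtx_inv[1, 1])
        D += abs(beta[1] / se) > c
    return D, np.log(max(D, 1)) / np.log(N)

# N = 100 series, the first 50 load on the factor
rng = np.random.default_rng(3)
T, N = 200, 100
f = rng.standard_normal(T)
load = np.r_[np.ones(50), np.zeros(50)]
R = np.outer(f, load) + rng.standard_normal((T, N))
D, alpha = factor_strength(R, f)
print(D, alpha)  # D close to 50, alpha close to log(50)/log(100)
```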
By:  Babii, Andrii; Florens, JeanPierre 
Abstract:  This paper documents the consequences of identification failures in a class of linear ill-posed inverse models. The Tikhonov-regularized estimator converges to a well-defined limit equal to the best approximation of the structural parameter in the orthogonal complement to the null space of the operator. We illustrate that in many instances the best approximation may coincide with the structural parameter or at least may reasonably approximate it. We obtain new non-asymptotic risk bounds in the uniform and the Hilbert space norms for the best approximation. Non-identification has important implications for the large sample distribution of the Tikhonov-regularized estimator, and we document the transition between the Gaussian and the weighted chi-squared limits. The theoretical results are illustrated for the nonparametric IV and the functional linear IV regressions and are further supported by Monte Carlo experiments. 
Keywords:  non-identified linear models; weak identification; nonparametric IV regression; functional linear IV regression; Tikhonov regularization. 
JEL:  C14 C26 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:tse:wpaper:124211&r=all 
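A finite-dimensional toy version of the limit described above: when the operator has a non-trivial null space, the Tikhonov-regularized estimator recovers the best approximation of the structural parameter in the orthogonal complement of that null space, not the parameter itself. A minimal sketch (noise-free, for illustration):

```python
import numpy as np

def tikhonov(K, y, alpha):
    """Tikhonov-regularized solution of y = K b: minimize
    ||y - K b||^2 + alpha ||b||^2, i.e. b = (K'K + alpha I)^{-1} K' y."""
    p = K.shape[1]
    return np.linalg.solve(K.T @ K + alpha * np.eye(p), K.T @ y)

# Rank-deficient (non-identified) operator: the second coordinate lies in
# the null space of K, so only the first coordinate is recoverable.
K = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [3.0, 0.0]])
b_true = np.array([1.5, 7.0])   # the 7.0 is not identified
y = K @ b_true
b_hat = tikhonov(K, y, alpha=1e-8)
print(b_hat)  # close to (1.5, 0): the projection of b_true onto the
              # orthogonal complement of the null space of K
```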
By:  Guowei Cui; Vasilis Sarafidis; Takashi Yamagata 
Abstract:  The present paper develops a new Instrumental Variables (IV) estimator for spatial, dynamic panel data models with interactive effects under large N and T asymptotics. For this class of models, the only approaches available in the literature are based on quasi-maximum likelihood estimation. The approach put forward in this paper is appealing from both a theoretical and a practical point of view for a number of reasons. Firstly, the proposed IV estimator is linear in the parameters of interest and it is computationally inexpensive. Secondly, the IV estimator is free from asymptotic bias. In contrast, existing QML estimators suffer from incidental parameter bias, depending on the magnitude of unknown parameters. Thirdly, the IV estimator retains the attractive feature of Method of Moments estimation in that it can accommodate endogenous regressors, so long as external exogenous instruments are available. The IV estimator is consistent and asymptotically normal as N, T → ∞, with N/T^2 → 0 and T/N^2 → 0. The proposed methodology is employed to study the determinants of risk attitude of banking institutions. The results of our analysis provide evidence that the more risk-sensitive capital regulation that was introduced by the Basel III framework in 2011 has succeeded in influencing banks' behaviour in a substantial manner. 
Keywords:  Panel data, instrumental variables, state dependence, social interactions, common factors, large N and T asymptotics 
JEL:  C33 C36 C38 C55 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:202011&r=all 
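The linearity the abstract above emphasizes is easiest to see in a scalar just-identified IV sketch (purely illustrative; the paper's estimator is for spatial dynamic panels with interactive effects, not this toy cross-section):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
z = rng.standard_normal(n)                        # external exogenous instrument
u = rng.standard_normal(n)                        # structural error
x = 0.8 * z + 0.5 * u + rng.standard_normal(n)    # endogenous regressor
y = 1.0 * x + u                                   # true coefficient is 1

beta_ols = (x @ y) / (x @ x)   # inconsistent: x is correlated with u
beta_iv = (z @ y) / (z @ x)    # just-identified linear IV: closed form, no search
print(beta_ols, beta_iv)       # beta_iv near 1, beta_ols biased upward
```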
By:  Jan Pablo Burgard; Joscha Krause; Ralf Münnich 
Abstract:  We consider a situation where the sample design of a survey is modified over time in order to save resources. The former design is a classical large-scale survey. The new design is a mixed-mode survey in which a smaller classical sample is augmented by records from an online survey. For the online survey no inclusion probabilities are available. We study how this change of data collection affects regression coefficient estimation when the model remains constant in the population over time. Special emphasis is placed on situations where the online records are selective with respect to the model. We develop a statistical framework to quantify so-called survey discontinuities in regression analysis. The term refers to differences between coefficient estimates that stem solely from the survey redesign. For this purpose, we apply hypothesis tests to identify whether observed differences in estimates are significant. Further, we discuss propensity estimation and calibration as potential methods to reduce selection biases stemming from the web survey. A Monte Carlo simulation study is conducted to test the methods under different degrees of selectivity. We find that even mild informativeness significantly impairs regression inference relative to the former survey, despite bias correction. 
Keywords:  Calibration, hypothesis test, informative sampling, propensity score estimation 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:trr:wpaper:202003&r=all 
By:  Lusompa, Amaze 
Abstract:  It is well known that Local Projections (LP) residuals are autocorrelated. Conventional wisdom says that LPs have to be estimated by OLS with Newey and West (1987) (or some other heteroskedasticity and autocorrelation consistent (HAC)) standard errors, and that GLS is not possible because the autocorrelation process is unknown. I show that the autocorrelation process of LPs is known and that the autocorrelation can be corrected for using GLS. Estimating LPs with GLS has three major implications: 1) LP GLS can be substantially more efficient and less biased than estimation by OLS with Newey-West standard errors. 2) Since the autocorrelation process can be modeled explicitly, it is possible to give a fully Bayesian treatment of LPs; that is, LPs can be estimated using frequentist/classical or fully Bayesian methods. 3) Since the autocorrelation process can be modeled explicitly, it is now possible to estimate time-varying parameter LPs. 
Keywords:  Impulse Response, Local Projections, Autocorrelation, GLS 
JEL:  C1 C11 C2 C22 C3 C32 
Date:  2019–11–14 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:99856&r=all 
By:  Sung Jae Jun; Sokbae Lee 
Abstract:  We investigate identification of causal parameters in case-control and related studies. The odds ratio in the sample is our main estimand of interest and we articulate its relationship with causal parameters under various scenarios. It turns out that the odds ratio is generally a sharp upper bound for the counterfactual relative risk under some monotonicity assumptions, without resorting either to strong ignorability or to the rare-disease assumption. Further, we propose semiparametrically efficient, easy-to-implement, machine-learning-friendly estimators of the aggregated (log) odds ratio by exploiting an explicit form of the efficient influence function. Using our new estimators, we develop methods for causal inference and illustrate the usefulness of our methods with a real-data example. 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2004.08318&r=all 
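The basic estimand above, the sample odds ratio, together with the standard Wald interval on the log scale, in a minimal sketch (the paper's efficient-influence-function estimators are more involved; the counts below are made up):

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Sample odds ratio from a 2x2 case-control table
          exposed  unexposed
    case     a         b
    control  c         d
    with a Wald 95% confidence interval computed on the log scale."""
    orr = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # s.e. of log odds ratio
    lo = math.exp(math.log(orr) - z * se)
    hi = math.exp(math.log(orr) + z * se)
    return orr, (lo, hi)

print(odds_ratio_ci(20, 80, 10, 90))  # OR = (20*90)/(80*10) = 2.25
```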
By:  Oskar Gustafsson; Mattias Villani; Pär Stockhammar 
Abstract:  Bayesian models often involve a small set of hyperparameters determined by maximizing the marginal likelihood. Bayesian optimization is a popular iterative method where a Gaussian process posterior of the underlying function is sequentially updated by new function evaluations. An acquisition strategy uses this posterior distribution to decide where to place the next function evaluation. We propose a novel Bayesian optimization framework for situations where the user controls the computational effort, and therefore the precision of the function evaluations. This is a common situation in econometrics where the marginal likelihood is often computed by Markov Chain Monte Carlo (MCMC) methods, with the precision of the marginal likelihood estimate determined by the number of MCMC draws. The proposed acquisition strategy gives the optimizer the option to explore the function with cheap noisy evaluations and therefore finds the optimum faster. Prior hyperparameter estimation in the steady-state Bayesian vector autoregressive (BVAR) model on US macroeconomic time series data is used for illustration. The proposed method is shown to find the optimum much quicker than traditional Bayesian optimization or grid search. 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2004.10092&r=all 
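The key ingredient of the framework above, a Gaussian process posterior in which each function evaluation carries its own noise level (cheap evaluations with few MCMC draws are noisier), can be sketched as follows. The RBF kernel and all values below are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def gp_posterior_mean(X, y, noise_var, Xs, ell=1.0, sf=1.0):
    """Posterior mean of a GP with an RBF kernel when each observation
    carries its own noise variance (heteroskedastic evaluations)."""
    def k(A, B):
        d2 = (A[:, None] - B[None, :]) ** 2
        return sf ** 2 * np.exp(-0.5 * d2 / ell ** 2)
    K = k(X, X) + np.diag(noise_var)       # per-point noise on the diagonal
    return k(Xs, X) @ np.linalg.solve(K, y)

X = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 0.0])
noise = np.array([1e-8, 1e-8, 1.0])        # third evaluation is cheap but noisy
m = gp_posterior_mean(X, y, noise, X)
print(m)  # first two points interpolated almost exactly; the noisy third
          # observation is only partially trusted, pulled toward its neighbors
```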
By:  S. Boragan Aruoba; Pablo Cuba-Borda; Kenji Higa-Flores; Frank Schorfheide; Sergio Villalvazo 
Abstract:  We develop an algorithm to construct approximate decision rules that are piecewise-linear and continuous for DSGE models with an occasionally binding constraint. The functional form of the decision rules allows us to derive a conditionally optimal particle filter (COPF) for the evaluation of the likelihood function that exploits the structure of the solution. We document the accuracy of the likelihood approximation and embed it into a particle Markov chain Monte Carlo algorithm to conduct Bayesian estimation. Compared with a standard bootstrap particle filter, the COPF significantly reduces the persistence of the Markov chain, improves the accuracy of Monte Carlo approximations of posterior moments, and drastically speeds up computations. We use the techniques to estimate a small-scale DSGE model to assess the effects of the government spending portion of the American Recovery and Reinvestment Act in 2009, when interest rates reached the zero lower bound. 
Keywords:  ZLB; Bayesian Estimation; Nonlinear Solution Methods; Nonlinear Filtering; Particle MCMC 
JEL:  C5 E5 E4 
Date:  2020–04–06 
URL:  http://d.repec.org/n?u=RePEc:fip:fedpwp:87720&r=all 
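For contrast with the conditionally optimal filter above, here is the standard bootstrap particle filter it is compared against, sketched for a toy local-level model (illustrative only; a COPF would instead propose particles conditionally on the current observation):

```python
import numpy as np

def bootstrap_pf(y, n_part=1000, sx=1.0, sy=1.0, seed=0):
    """Bootstrap particle filter log-likelihood for the local-level model
    x_t = x_{t-1} + N(0, sx^2), y_t = x_t + N(0, sy^2): propagate from the
    transition, weight by the measurement density, resample."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sx, n_part)                    # initial particles
    loglik = 0.0
    for yt in y:
        x = x + rng.normal(0.0, sx, n_part)            # propagate (blind proposal)
        logw = -0.5 * ((yt - x) / sy) ** 2 - np.log(sy * np.sqrt(2 * np.pi))
        c = logw.max()
        w = np.exp(logw - c)
        loglik += c + np.log(w.mean())                 # incremental likelihood
        idx = rng.choice(n_part, n_part, p=w / w.sum())  # multinomial resampling
        x = x[idx]
    return loglik

rng = np.random.default_rng(5)
T = 50
x_true = np.cumsum(rng.standard_normal(T))
y = x_true + rng.standard_normal(T)
print(bootstrap_pf(y))
```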
By:  Christiane Baumeister; James D. Hamilton 
Abstract:  This paper examines methods for structural interpretation of vector autoregressions when the identifying information is regarded as imperfect or incomplete. We suggest that a Bayesian approach offers a unifying theme for guiding inference in such settings. Among other advantages, the unified approach solves a problem with calculating elasticities that appears not to have been recognized by earlier researchers. We also call attention to some computational concerns of which researchers who approach this problem using other methods should be aware. 
JEL:  C11 C32 Q43 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:27014&r=all 
By:  Benjamin Avanzi; Gregory Clive Taylor; Bernard Wong; Xinda Yang 
Abstract:  In this paper, we focus on estimating ultimate claim counts in multiple insurance processes and thus extend the associated literature of micro-level stochastic reserving models to the multivariate context. Specifically, we develop a multivariate Cox process to model the joint arrival process of insurance claims in multiple Lines of Business. The dependency structure is introduced via multivariate shot noise intensity processes which are connected with the help of Lévy copulas. Such a construction is more parsimonious and tractable in higher dimensions than plain vanilla common shock models. We also consider practical implementation and explicitly incorporate known covariates, such as seasonality patterns and trends, which may explain some of the relationship between two insurance processes (or at least help tease out those relationships). We develop a filtering algorithm based on the reversible-jump Markov Chain Monte Carlo (RJMCMC) method to estimate the unobservable stochastic intensities. Model calibration is illustrated using real data from the AUSI data set. 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2004.11169&r=all 
By:  Marek Stelmach (Faculty of Economic Sciences, University of Warsaw); Marcin Chlebus (Faculty of Economic Sciences, University of Warsaw) 
Abstract:  Stacked ensemble approaches have recently been gaining importance in complex predictive problems where extraordinary performance is desirable. In this paper we develop a multi-layer stacking framework and apply it to a large dataset related to credit scoring with multiple, imbalanced classes. Diverse base estimators (among others, bagged and boosted tree algorithms, regularized logistic regression, neural networks, and a Naive Bayes classifier) are examined, and we propose three meta learners to be finally combined into a novel, weighted ensemble. To prevent bias in meta-feature construction, we introduce a nested cross-validation schema into the architecture, while a weighted log loss evaluation metric is used to overcome training bias towards the majority class. Additional emphasis is placed on proper data preprocessing steps and on Bayesian optimization for hyperparameter tuning to ensure that the solution does not overfit. Our study indicates better stacking results compared to all individual base classifiers, yet we stress the importance of assessing whether the improvement compensates for the increased computational time and design complexity. Furthermore, the conducted analysis shows extremely good performance among bagged and boosted trees, in both the base and meta learning phases. We conclude with the thesis that a weighted meta ensemble with regularization properties reveals the least overfitting tendencies. 
Keywords:  stacked ensembles, nested cross-validation, Bayesian optimization, multi-class problem, imbalanced classes 
JEL:  G32 C38 C51 C52 C55 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:war:wpaper:202008&r=all 
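The leakage-free meta-feature construction described above can be sketched with an out-of-fold loop: each observation's meta-feature comes from a base model fitted without that observation's fold. The two ridge regressions below are a stand-in for the paper's diverse base estimators, and the linear meta-learner is a simplification of its weighted ensemble.

```python
import numpy as np

def oof_predictions(X, y, fit_predict, k=5):
    """Out-of-fold predictions: fit on k-1 folds, predict the held-out fold,
    so meta-features never see their own observation's label."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    oof = np.empty(n)
    for f in folds:
        train = np.setdiff1d(np.arange(n), f)
        oof[f] = fit_predict(X[train], y[train], X[f])
    return oof

def ridge_fit_predict(lam):
    def fp(Xtr, ytr, Xte):
        p = Xtr.shape[1]
        b = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(p), Xtr.T @ ytr)
        return Xte @ b
    return fp

rng = np.random.default_rng(6)
n, p = 300, 5
X = rng.standard_normal((n, p))
y = X @ np.arange(1.0, 6.0) + rng.standard_normal(n)

# Meta-features from two base learners with different regularization strengths
Z = np.column_stack([oof_predictions(X, y, ridge_fit_predict(l)) for l in (0.1, 100.0)])
w, *_ = np.linalg.lstsq(Z, y, rcond=None)   # simple linear meta-learner
print(w)
```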
By:  Faria, Gonçalo; Verona, Fabio 
Abstract:  Any time series can be decomposed into cyclical components fluctuating at different frequencies. Accordingly, in this paper we propose a method to forecast the stock market's equity premium which exploits the frequency relationship between the equity premium and several predictor variables. We evaluate a large set of models and find that, by selecting the relevant frequencies for equity premium forecasting, this method significantly improves upon standard time series forecasting methods in both a statistical and an economic sense. This improvement is robust regardless of the predictor used, the out-of-sample period considered, and the frequency of the data used. 
JEL:  C58 G11 G17 
Date:  2020–04–27 
URL:  http://d.repec.org/n?u=RePEc:bof:bofrdp:2020_006&r=all 
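The frequency decomposition underlying the method above can be sketched with a simple FFT band split, in which the cyclical components sum back exactly to the original series. This is an illustration of the idea only; the paper's filters and band choices may differ, and the cutoffs below are arbitrary.

```python
import numpy as np

def frequency_bands(x, cutoffs):
    """Split a series into cyclical components by masking FFT coefficients
    into disjoint frequency bands; the components sum back to the series."""
    n = len(x)
    freqs = np.fft.rfftfreq(n)              # cycles per observation, in [0, 0.5]
    Xf = np.fft.rfft(x)
    edges = [0.0] + list(cutoffs) + [0.51]  # upper edge safely above 0.5
    comps = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        comps.append(np.fft.irfft(Xf * mask, n))
    return np.array(comps)

rng = np.random.default_rng(7)
x = rng.standard_normal(256)
comps = frequency_bands(x, cutoffs=(0.05, 0.2))  # low / medium / high bands
print(np.allclose(comps.sum(axis=0), x))  # True: the bands partition the spectrum
```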
By:  Jose Apesteguia; Miguel Ángel Ballester 
Abstract:  We study random utility models in which heterogeneity of preferences is modeled using an ordered collection of utilities, or types. The paper shows that these models are particularly amenable when combined with domains in which the alternatives of each decision problem are ordered by the structure of the types. We enhance their applicability by: (i) working with arbitrary domains composed of such decision problems, i.e., we do not need to assume any particularly rich data domain, and (ii) making no parametric assumption, i.e., we do not need to formulate any particular assumption on the distribution over the collection of types. We characterize the model by way of two simple properties and show the applicability of our result in settings involving decisions under risk. We also propose a goodness-of-fit measure for the model and prove the strong consistency of extremum estimators defined upon it. We conclude by applying the model to a dataset on lottery choices. 
Keywords:  random utility model, ordered type-dependent utilities, arbitrary domains, nonparametric, goodness-of-fit, extremum estimators, decision under risk 
JEL:  C00 D00 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:bge:wpaper:1176&r=all 
By:  JeanMarie Dufour; Emmanuel Flachaire; Lynda Khalaf; Abdallah Zalghout 
Abstract:  We propose confidence sets for inequality indices and their differences, which are robust to the fact that such measures involve possibly weakly identified parameter ratios. We also document the fragility of decisions that rely on traditional interpretations of "significant" or "insignificant" comparisons when the tested differences can be weakly identified. The proposed methods are applied to study economic convergence across U.S. states and non-OECD countries. With reference to the growth literature, which typically uses the variance of log per-capita income to measure dispersion, our results confirm the importance of accounting for micro-founded axioms and shed new light on enduring controversies surrounding convergence. 
Date:  2020–04–23 
URL:  http://d.repec.org/n?u=RePEc:cir:cirwor:2020s23&r=all 
By:  Alje van Dam; Andres Gomez-Lievano; Frank Neffke; Koen Frenken 
Abstract:  We propose a statistical framework to quantify location and co-location associations of economic activities using information-theoretic measures. We relate the resulting measures to existing measures of revealed comparative advantage, localization and specialization, and show that they can all be seen as part of the same framework. Using a Bayesian approach, we provide measures of uncertainty for the estimated quantities. Furthermore, the information-theoretic approach can be readily extended to move beyond pairwise co-locations and instead capture multivariate associations. To illustrate the framework, we apply our measures to the co-location of occupations in US cities, showing the associations between different groups of occupations. 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2004.10548&r=all 
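The information-theoretic association at the heart of the framework above reduces, for a pair of categories, to pointwise mutual information — the log of the classical location quotient / revealed comparative advantage index. A minimal sketch on a made-up count matrix:

```python
import numpy as np

def colocation_pmi(C):
    """Pointwise mutual information between activities (rows) and locations
    (columns) of a count matrix C: log of the observed joint share over the
    product of marginal shares, i.e. the log location quotient / RCA."""
    P = C / C.sum()
    pr = P.sum(axis=1, keepdims=True)   # activity marginal shares
    pc = P.sum(axis=0, keepdims=True)   # location marginal shares
    return np.log(P / (pr @ pc))

C = np.array([[30.0, 10.0],
              [10.0, 30.0]])
pmi = colocation_pmi(C)
print(pmi)  # positive on the diagonal (over-represented), negative off it
```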
By:  Wei Wei; Asger Lunde 
Abstract:  We propose a multi-factor model and an estimation method based on particle MCMC to identify risk factors in electricity prices. Our model identifies long-run prices, short-run deviations, and spikes as three main risk factors in electricity spot prices. Under our model, different risk factors have distinct impacts on futures prices and can carry different risk premia. We generalize the Fama-French regressions to analyze properties of true risk premia. We show that model specification plays an important role in detecting time-varying risk premia. Using spot and futures prices in the Germany/Austria market, we demonstrate that our proposed model surpasses alternative models with fewer risk factors in forecasting spot prices and in detecting time-varying risk premia. 
Keywords:  Risk factors, risk premia, futures, particle filter, MCMC. 
JEL:  C51 G13 Q4 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:202010&r=all 
By:  Sergio Firpo; Antonio F. Galvao; Martyna Kobus; Thomas Parker; Pedro Rosa-Dias 
Abstract:  In this paper we develop theoretical criteria and econometric methods to rank policy interventions in terms of welfare when individuals are loss-averse. The new criterion for "loss aversion-sensitive dominance" defines a weak partial ordering of the distributions of policy-induced gains and losses. It applies to the class of welfare functions which model individual preferences with non-decreasing and loss-averse attitudes towards changes in outcomes. We also develop new statistical methods to test loss aversion-sensitive dominance in practice, using nonparametric plug-in estimates. We establish the limiting distributions of uniform test statistics by showing that they are directionally differentiable. This implies that inference can be conducted by a special resampling procedure. Since point-identification of the distribution of policy-induced gains and losses may require very strong assumptions, we also extend the comparison criteria, test statistics, and resampling procedures to the partially-identified case. Finally, we illustrate our methods with an empirical application to the welfare comparison of two income support programs. 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2004.08468&r=all 
By:  Charles F. Manski; Francesca Molinari 
Abstract:  As a consequence of missing data on tests for infection and imperfect accuracy of tests, reported rates of population infection by the SARS-CoV-2 virus are lower than actual rates of infection. Hence, reported rates of severe illness conditional on infection are higher than actual rates. Understanding the time path of the COVID-19 pandemic has been hampered by the absence of bounds on infection rates that are credible and informative. This paper explains the logical problem of bounding these rates and reports illustrative findings, using data from Illinois, New York, and Italy. We combine the data with assumptions on the infection rate in the untested population and on the accuracy of the tests that appear credible in the current context. We find that the infection rate might be substantially higher than reported. We also find that the infection fatality rate in Italy is substantially lower than reported. 
JEL:  C13 I10 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:27023&r=all 
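The bounding logic described above can be sketched with hypothetical numbers: correct the positive-test share for test accuracy, then combine the tested and untested subpopulations, assuming only that the untested infection rate lies between a fraction of the tested rate and the tested rate itself. The sensitivity, specificity, shares, and the lower fraction below are illustrative assumptions, not the paper's calibration.

```python
def infection_rate_bounds(p_tested, p_pos_tested, untested_lo_frac,
                          sens=0.7, spec=1.0):
    """Bound the population infection rate from the tested share p_tested,
    the positive-test rate among the tested, an accuracy correction, and an
    interval assumption on the untested subpopulation's infection rate."""
    # Correct the positive-test share for sensitivity/specificity
    p_inf_tested = (p_pos_tested - (1 - spec)) / (sens - (1 - spec))
    p_untested = 1 - p_tested
    lo = p_tested * p_inf_tested + p_untested * untested_lo_frac * p_inf_tested
    hi = p_tested * p_inf_tested + p_untested * p_inf_tested
    return lo, hi

lo, hi = infection_rate_bounds(p_tested=0.02, p_pos_tested=0.3,
                               untested_lo_frac=0.1)
print(lo, hi)  # the bounds are wide: the untested dominate the population
```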