
on Econometrics 
By:  Bo Zhang; Jiti Gao; Guangming Pan; Yanrong Yang 
Abstract:  This paper establishes asymptotic properties for spiked empirical eigenvalues of sample covariance matrices for high-dimensional data with both cross-sectional dependence and a dependent sample structure. A new finding from the established theoretical results is that, under some scenarios, spiked empirical eigenvalues reflect the dependent sample structure rather than the cross-sectional structure, which indicates that principal component analysis (PCA) may provide inaccurate inference for cross-sectional structures. An illustrative example shows that some commonly used statistics based on spiked empirical eigenvalues misestimate the true number of common factors. As an application to high-dimensional time series, we propose a test statistic to distinguish a unit root from a factor structure and demonstrate its effective finite-sample performance on simulated data. Our results are then applied to analyze OECD healthcare expenditure data and U.S. mortality data, both of which possess cross-sectional dependence as well as nonstationary temporal dependence. It is worth mentioning that we contribute statistical justification for the benchmark paper by Lee and Carter [25] on mortality forecasting. 
Keywords:  factor model, high-dimensional data, principal component analysis, spiked empirical eigenvalue. 
JEL:  C21 C32 C55 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201931&r=all 
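The core phenomenon in the abstract above — a spiked empirical eigenvalue separating from the bulk of the spectrum under a factor structure — can be illustrated with a minimal numpy sketch. The one-factor design, dimensions, and signal strength below are illustrative assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 100, 200

# One common factor: X = loadings * factor + noise (illustrative design)
loadings = rng.normal(size=(p, 1))
factor = rng.normal(size=(1, n)) * 3.0
X = loadings @ factor + rng.normal(size=(p, n))

S = X @ X.T / n                               # sample covariance matrix
eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]

# The spiked (largest) eigenvalue separates clearly from the bulk
print(eigvals[:3])
assert eigvals[0] > 5 * eigvals[1]
```

With cross-sectional or temporal dependence added to the noise, the paper's point is that such a spike can reflect the sample structure rather than the factor structure.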
By:  Bodnar, Taras; Dette, Holger; Parolya, Nestor 
Abstract:  In this paper, new tests for the independence of two high-dimensional vectors are investigated. We consider the case where the dimension of the vectors increases with the sample size and propose multivariate analysis-of-variance-type statistics for the hypothesis of a block diagonal covariance matrix. The asymptotic properties of the new test statistics are investigated under the null hypothesis and the alternative hypothesis using random matrix theory. For this purpose, we study the weak convergence of linear spectral statistics of central and (conditionally) non-central Fisher matrices. In particular, a central limit theorem for linear spectral statistics of large-dimensional (conditionally) non-central Fisher matrices is derived, which is then used to analyse the power of the tests under the alternative. The theoretical results are illustrated by means of a simulation study, where we also compare the new tests with several alternatives, in particular with the commonly used corrected likelihood ratio test. It is demonstrated that the latter test does not keep its nominal level if the dimension of one subvector is relatively small compared to the dimension of the other subvector. On the other hand, the tests proposed in this paper provide a reasonable approximation of the nominal level in such situations. Moreover, we observe that one of the proposed tests is the most powerful under a variety of correlation scenarios. 
Keywords:  Testing for independence, large-dimensional covariance matrix, non-central Fisher random matrix, linear spectral statistics, asymptotic normality 
JEL:  C12 C18 
Date:  2019–08–03 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:97997&r=all 
By:  Maxwell King; Xibin Zhang; Muhammad Akram 
Abstract:  This paper presents a new approach to hypothesis testing based on a vector of statistics. It involves simulating the statistics under the null hypothesis and then estimating the joint density of the statistics. This allows the p-value of the smallest acceptance region test to be estimated. We prove this p-value is a consistent estimate under some regularity conditions. The small-sample properties of the proposed procedure are investigated in the context of testing for autocorrelation, testing for normality, and testing for model misspecification through the information matrix. We find that our testing procedure has appropriate size and good power. 
Keywords:  bootstrap, cross-market prediction, information matrix test, Markov chain Monte Carlo, multivariate kernel density, p-value. 
JEL:  C01 C12 C14 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201930&r=all 
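The three-step procedure described above (simulate under the null, estimate the joint density, read off the p-value) can be sketched as follows. This is a hedged illustration: the statistic vector, sample sizes, and the use of `scipy.stats.gaussian_kde` are my own choices, and reading the "smallest acceptance region" as a highest-density region is one plausible interpretation of the abstract, not the paper's exact construction:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Hypothetical example: two statistics computed from a N(0,1) null sample
def stats_vector(x):
    return np.array([np.mean(x), np.var(x, ddof=1)])

n, n_sim = 50, 2000
observed = stats_vector(rng.normal(size=n))

# Step 1: simulate the statistic vector under the null hypothesis
sims = np.column_stack([stats_vector(rng.normal(size=n)) for _ in range(n_sim)])

# Step 2: kernel estimate of the joint null density of the statistics
kde = gaussian_kde(sims)

# Step 3: p-value of the smallest (highest-density) acceptance region test:
# the proportion of simulated draws whose estimated density falls below
# the estimated density at the observed statistic vector
p_value = np.mean(kde(sims) <= kde(observed))
print(p_value)
assert 0.0 <= p_value <= 1.0
```

Because the observed sample is itself drawn from the null here, the estimated p-value should be far from zero.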
By:  Ruben LoaizaMaya; Gael M Martin; David T. Frazier 
Abstract:  We propose a new method for conducting Bayesian prediction that delivers accurate predictions without correctly specifying the unknown true data generating process. A prior is defined over a class of plausible predictive models. After observing data, we update the prior to a posterior over these models via a criterion that captures a user-specified measure of predictive accuracy. Under regularity conditions, this update yields posterior concentration onto the element of the predictive class that maximizes the expectation of the accuracy measure. In a series of simulation experiments and empirical examples, we find notable gains in predictive accuracy relative to conventional likelihood-based prediction. 
Keywords:  loss-based prediction, Bayesian forecasting, proper scoring rules, stochastic volatility model, expected shortfall, M4 forecasting competition. 
JEL:  C11 C53 C58 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:20201&r=all 
By:  D Bloznelis; Gerda Claeskens; Jing Zhou 
Abstract:  The composite quantile estimator is a robust and efficient alternative to the least-squares estimator in linear models. However, it is computationally demanding when the number of quantiles is large. We consider a model-averaged quantile estimator as a computationally cheaper alternative. We derive its asymptotic properties in high-dimensional linear models and compare its performance to the composite quantile estimator in both low- and high-dimensional settings. We also assess the effect on efficiency of using equal weights, theoretically optimal weights, and estimated optimal weights for combining the different quantiles. None of the estimators dominates in all settings under consideration, thus leaving room in practice for both model-averaged and composite estimators, with either equal or estimated optimal weights. 
Keywords:  Quantile regression, Model averaging, Composite estimation, Penalized estimation, Weight choice 
Date:  2018–10 
URL:  http://d.repec.org/n?u=RePEc:ete:kbiper:627929&r=all 
By:  Paul Hünermund (Maastricht University); Elias Bareinboim (Columbia University) 
Abstract:  Learning about cause and effect is arguably the main goal in applied econometrics. In practice, the validity of these causal inferences is contingent on a number of critical assumptions regarding the type of data and the substantive knowledge that is available about the phenomenon under investigation. For instance, unobserved confounding factors threaten the internal validity of estimates, data availability is often limited to non-random, selection-biased samples, causal effects need to be learned from surrogate experiments with imperfect compliance, and causal knowledge has to be extrapolated across structurally heterogeneous populations. A powerful causal inference framework is required to tackle all of these challenges, which plague essentially any data analysis to varying degrees. Building on the structural approach to causality introduced by Haavelmo (1943) and the graph-theoretic framework proposed by Pearl (1995), the AI literature has developed a wide array of techniques for causal learning that allow researchers to leverage information from various imperfect, heterogeneous, and biased data sources. In this paper, we discuss recent advances made in this literature that have the potential to contribute to econometric methodology along three broad dimensions. First, they provide a unified and comprehensive framework for causal inference, in which the above-mentioned problems can be addressed in full generality. Second, due to their origin in AI, they come with sound, efficient, and complete algorithmic criteria for automating the corresponding identification tasks. And third, because of the nonparametric description of structural models that graph-theoretic approaches build on, they combine the strengths of both structural econometrics and the potential outcomes framework, and thus offer a perfect middle ground between these two competing literature streams. 
Date:  2019–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1912.09104&r=all 
By:  Bakbergenuly, Ilyas; Hoaglin, David C.; Kulinskaya, Elena 
Abstract:  Methods for random-effects meta-analysis require an estimate of the between-study variance, $\tau^2$. The performance of estimators of $\tau^2$ (measured by bias and coverage) affects their usefulness in assessing heterogeneity of study-level effects, and also the performance of related estimators of the overall effect. For the effect measure log-response-ratio (LRR, also known as the logarithm of the ratio of means, RoM), we review four point estimators of $\tau^2$ (the popular methods of DerSimonian-Laird (DL), restricted maximum likelihood, and Mandel and Paule (MP), and the less-familiar method of Jackson), four interval estimators for $\tau^2$ (profile likelihood, Q-profile, Biggerstaff and Jackson, and Jackson), five point estimators of the overall effect (the four related to the point estimators of $\tau^2$ and an estimator whose weights use only study-level sample sizes), and seven interval estimators for the overall effect (four based on the point estimators for $\tau^2$, the Hartung-Knapp-Sidik-Jonkman (HKSJ) interval, a modification of HKSJ that uses the MP estimator of $\tau^2$ instead of the DL estimator, and an interval based on the sample-size-weighted estimator). We obtain empirical evidence from extensive simulations of data from normal distributions. Simulations from lognormal distributions are in a separate report (Bakbergenuly et al., 2019b). 
Date:  2020–01–07 
URL:  http://d.repec.org/n?u=RePEc:osf:metaar:3bnxs&r=all 
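For concreteness, the popular DerSimonian-Laird (DL) moment estimator of $\tau^2$ mentioned above can be written in a few lines. The formula (Cochran's Q minus its null expectation, rescaled and truncated at zero) is standard; the study data below are invented for illustration:

```python
import numpy as np

def dersimonian_laird_tau2(y, v):
    """DL moment estimator of the between-study variance tau^2.

    y : study-level effect estimates (e.g. log response ratios)
    v : their within-study variances
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(v, dtype=float)          # inverse-variance weights
    mu_fixed = np.sum(w * y) / np.sum(w)          # fixed-effect pooled estimate
    Q = np.sum(w * (y - mu_fixed) ** 2)           # Cochran's Q statistic
    k = len(y)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    return max(0.0, (Q - (k - 1)) / c)            # truncate negative values at zero

# Toy data: five hypothetical studies
y = [0.10, 0.30, -0.05, 0.40, 0.15]
v = [0.01, 0.02, 0.015, 0.03, 0.01]
tau2 = dersimonian_laird_tau2(y, v)
print(tau2)
assert tau2 >= 0.0
```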
By:  Döhrn, Roland 
Abstract:  The Diebold-Mariano test has become a common tool for comparing the accuracy of macroeconomic forecasts. Since these are typically model-free forecasts, distribution-free tests might be a good alternative to the Diebold-Mariano test. This paper suggests a permutation test. Stochastic simulations show that permutation tests outperform the Diebold-Mariano test. Furthermore, a test statistic based on absolute errors seems to be more sensitive to differences in forecast accuracy than a statistic based on squared errors. 
Keywords:  macroeconomic forecast, forecast accuracy, Diebold-Mariano test, permutation test 
JEL:  C14 C15 C53 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:zbw:rwirep:833&r=all 
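A paired sign-flip scheme is one standard way to build the kind of permutation test the abstract proposes; whether it matches the author's exact design is an assumption on my part. The sketch below compares two forecasts under the absolute-error loss the paper favors:

```python
import numpy as np

def permutation_test(loss1, loss2, n_perm=5000, seed=0):
    """Paired sign-flip permutation test for equal forecast accuracy.

    A distribution-free alternative to the Diebold-Mariano test: under the
    null the loss differential d_t is symmetric about zero, so its sign is
    exchangeable and can be flipped at random.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(loss1, dtype=float) - np.asarray(loss2, dtype=float)
    observed = abs(d.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    perm_means = np.abs((signs * d).mean(axis=1))
    return (1 + np.sum(perm_means >= observed)) / (n_perm + 1)

# Hypothetical forecast errors; the second forecast is clearly less accurate
rng = np.random.default_rng(42)
e1 = rng.normal(0.0, 1.0, size=100)
e2 = rng.normal(0.0, 2.0, size=100)
p = permutation_test(np.abs(e1), np.abs(e2))    # absolute-error loss
print(p)
assert 0.0 < p <= 1.0
```

With such a large accuracy gap, the returned p-value should be small.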
By:  HAFNER Christian M., (Université catholique de Louvain, Belgium); WANG Linqi, (Université catholique de Louvain, Belgium) 
Abstract:  This paper proposes a new model for the dynamics of correlation matrices, where the dynamics are driven by the likelihood score with respect to the matrix logarithm of the correlation matrix. In analogy to the exponential GARCH model for volatility, this transformation ensures that the correlation matrices remain positive definite, even in high dimensions. For the conditional distribution of returns, we assume a Student-t copula to explain the dependence structure and univariate Student-t distributions for the marginals, with potentially different degrees of freedom. The separation into volatility and correlation parts allows two-step estimation, which facilitates estimation in high dimensions. We derive estimation theory for one-step and two-step estimation. In an application to a set of six asset indices, including financial and alternative assets, we show that the model performs well in terms of various diagnostics and specification tests. 
Keywords:  score, correlation, matrix logarithm, identification 
JEL:  C14 C43 Z11 
Date:  2019–12–17 
URL:  http://d.repec.org/n?u=RePEc:cor:louvco:2019031&r=all 
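The key device in the abstract — working with the matrix logarithm so that correlation matrices stay positive definite — can be illustrated directly: any symmetric matrix maps back through the matrix exponential to a positive definite matrix, which can then be rescaled to unit diagonal. This is a static sketch of the transformation only, not the authors' score-driven dynamics:

```python
import numpy as np

def expm_sym(A):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.exp(vals)) @ vecs.T   # V diag(e^lambda) V'

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
A = (B + B.T) / 2                  # unconstrained symmetric "log-space" matrix
C = expm_sym(A)                    # positive definite by construction
d = np.sqrt(np.diag(C))
R = C / np.outer(d, d)             # rescale to unit diagonal

assert np.all(np.linalg.eigvalsh(R) > 0)   # positive definite
assert np.allclose(np.diag(R), 1.0)        # valid correlation matrix
print(np.round(R, 3))
```

Because the log-space matrix is unconstrained, score-driven updates can move it freely without ever leaving the set of valid correlation matrices.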
By:  BAUWENS Luc, (Université catholique de Louvain, CORE, Belgium); XU Yongdeng, (Cardiff University) 
Abstract:  This paper introduces the DCC-HEAVY and DECO-HEAVY models, which are dynamic models for conditional variances and correlations of daily returns based on measures of realized variances and correlations built from intraday data. Formulas for multi-step forecasts of conditional variances and correlations are provided. Asymmetric versions of the models are developed. An empirical study shows that in terms of forecasts the new HEAVY models outperform the BEKK-HEAVY model based on realized covariances, and the BEKK, DCC, and DECO multivariate GARCH models based exclusively on daily data. 
Keywords:  dynamic conditional correlations, forecasting, multivariate HEAVY, multivariate GARCH, realized correlations 
JEL:  C32 C58 G17 
Date:  2019–12–17 
URL:  http://d.repec.org/n?u=RePEc:cor:louvco:2019025&r=all 
By:  Badi Baltagi (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); Long Liu (College of Business, University of Texas at San Antonio) 
Abstract:  This paper derives the best linear unbiased prediction (BLUP) for an unbalanced panel data model. Starting with a simple error component regression model with unbalanced panel data and random effects, it generalizes the BLUP derived by Taub (1979) to unbalanced panels. Next it derives the BLUP for an unequally spaced panel data model with serial correlation of the AR(1) type in the remainder disturbances considered by Baltagi and Wu (1999). This in turn extends the BLUP for a panel data model with AR(1) type remainder disturbances derived by Baltagi and Li (1992) from the balanced to the unequally spaced panel data case. The derivations are easily implemented and reduce to tractable expressions using an extension of the Fuller and Battese (1974) transformation from the balanced to the unbalanced panel data case. 
Keywords:  Forecasting, BLUP, Unbalanced Panel Data, Unequally Spaced Panels, Serial Correlation 
JEL:  C33 
Date:  2020–01 
URL:  http://d.repec.org/n?u=RePEc:max:cprwps:221&r=all 
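The Fuller and Battese (1974) transformation that the derivations above rely on is, in the unbalanced random-effects case, a quasi-demeaning with a unit-specific weight theta_i = 1 - sqrt(sigma2_nu / (T_i * sigma2_mu + sigma2_nu)). A minimal sketch, assuming the variance components are known (in practice they must be estimated):

```python
import numpy as np

def fuller_battese_transform(y, ids, sigma2_mu, sigma2_nu):
    """Quasi-demeaning for the unbalanced random-effects model (sketch).

    For unit i with T_i observations, the GLS transform subtracts
    theta_i = 1 - sqrt(sigma2_nu / (T_i*sigma2_mu + sigma2_nu))
    times the unit mean from each observation of unit i.
    """
    y = np.asarray(y, dtype=float)
    ids = np.asarray(ids)
    out = np.empty_like(y)
    for i in np.unique(ids):
        mask = ids == i
        T_i = mask.sum()
        theta = 1.0 - np.sqrt(sigma2_nu / (T_i * sigma2_mu + sigma2_nu))
        out[mask] = y[mask] - theta * y[mask].mean()
    return out

# Unbalanced toy panel: unit 'a' has 3 periods, unit 'b' has 2
y = np.array([1.0, 2.0, 3.0, 4.0, 6.0])
ids = np.array(['a', 'a', 'a', 'b', 'b'])
y_star = fuller_battese_transform(y, ids, sigma2_mu=1.0, sigma2_nu=1.0)
print(y_star)
```

With T_i = 3 and both variances equal to 1, theta_a = 1 - sqrt(1/4) = 0.5, so unit 'a' is half-demeaned; the same regressors would be transformed identically before running OLS on the transformed data.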
By:  Rafael Wildauer (University of Greenwich); Jakob Kapeller 
Abstract:  Taking survey data on household wealth as our major example, this short paper discusses some of the issues applied researchers face when fitting (type I) Pareto distributions to complex survey data. The contribution of this paper is twofold: First, we propose a new and intuitive way of deriving Gabaix and Ibragimov’s (2011) bias correction for Pareto tail estimation, from which the generalization to complex survey data follows naturally. Second, we summarise how Kolmogorov-Smirnov and Cramér-von Mises goodness-of-fit tests can be generalized to complex survey data. Taken together, we think the paper provides a concise and useful presentation of the fundamentals of Pareto tail fitting with complex survey data. 
Keywords:  Pareto distribution, complex survey data, wealth distribution 
JEL:  C46 C83 D31 
Date:  2020–01 
URL:  http://d.repec.org/n?u=RePEc:pke:wpaper:pkwp2001&r=all 
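The Gabaix and Ibragimov (2011) correction discussed above amounts to regressing log(rank − 1/2) on log(size) over the upper tail, the −1/2 shift being the small-sample bias correction. Below is an equal-weight sketch; the paper's actual contribution, extending this to complex survey weights, is not reproduced here, and the simulated sample and tail fraction are illustrative:

```python
import numpy as np

def pareto_tail_exponent(x, tail_frac=0.2):
    """Gabaix-Ibragimov log(rank - 1/2) regression for the Pareto exponent.

    Regress log(rank - 1/2) on log(size) over the upper tail; the negative
    of the slope estimates the tail exponent alpha.
    """
    x = np.sort(np.asarray(x, dtype=float))[::-1]   # sizes, descending
    k = max(int(tail_frac * len(x)), 3)             # number of tail observations
    tail = x[:k]
    ranks = np.arange(1, k + 1)
    slope, intercept = np.polyfit(np.log(tail), np.log(ranks - 0.5), 1)
    return -slope

# Simulated type-I Pareto sample with true alpha = 2 (illustrative)
rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=20000) + 1.0          # Pareto I with scale 1
alpha_hat = pareto_tail_exponent(sample)
print(alpha_hat)
assert abs(alpha_hat - 2.0) < 0.3
```

With survey data, each observation would instead enter the rank (and the regression) with its sampling weight, which is the generalization the paper derives.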
By:  Abito, Jose Miguel 
Abstract:  I propose an estimation procedure that can accommodate fixed effects in the widely used proxy variable approach to estimating production functions. The procedure allows unobserved productivity to have a permanent component in addition to a (nonlinear) Markov shock. The procedure does not rely on differencing out the fixed effect and thus is not restricted to within-firm variation for identification. Finally, the procedure is easy to implement, as it only entails adding a two-stage least squares step using internal instruments. 
Keywords:  Production function, Estimation, Fixed effects, Unobserved productivity, Proxy variables, Errors-in-variables, Instrumental variables 
JEL:  C0 C01 L0 L00 O4 
Date:  2019–12–24 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:97825&r=all 
By:  Maxwell Kellogg; Magne Mogstad; Guillaume Pouliot; Alexander Torgovitsky 
Abstract:  The synthetic control (SC) method is widely used in comparative case studies to adjust for differences in pretreatment characteristics. A major attraction of the method is that it limits the extrapolation bias that can occur when untreated units with different pretreatment characteristics are combined using a traditional adjustment, such as a linear regression. Instead, the SC estimator is susceptible to interpolation bias because it uses a convex weighted average of the untreated units to create a synthetic untreated unit with pretreatment characteristics similar to those of the treated unit. More traditional matching estimators exhibit the opposite behavior: they limit interpolation bias at the potential expense of extrapolation bias. We propose combining the matching and synthetic control estimators through model averaging. We show how to use a rolling-origin cross-validation procedure to train the model averaging estimator to resolve trade-offs between interpolation and extrapolation bias. We evaluate the estimator through Monte Carlo simulations and placebo studies before using it to reexamine the economic costs of conflicts. Not only does the model averaging estimator perform far better than synthetic controls and other alternatives in the simulations and placebo exercises; it also yields treatment effect estimates that are substantially different from those of the other estimators. 
JEL:  C0 H0 J0 
Date:  2020–01 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:26624&r=all 
By:  Rahal, Charles 
Abstract:  We outline a gridbased approach to provide further evidence against the misconception that the results of spatial econometric models are sensitive to the exact specification of the exogenously set weighting matrix (otherwise known as the 'biggest myth in spatial econometrics'). Our application estimates three large sets of specifications using an original dataset which contains information on the Prime Central London housing market. We show that while posterior model probabilities may indicate a strong preference for an extremely small number of models, and while the spatial autocorrelation parameter varies substantially, median direct effects remain stable across the entire permissible spatial weighting matrix space. We argue that spatial econometric models should be estimated across this entire space, as opposed to the current convention of merely estimating a cursory number of points for robustness. 
Date:  2019–12–10 
URL:  http://d.repec.org/n?u=RePEc:osf:socarx:nt2yq&r=all 
By:  Christiane Baumeister; James D. Hamilton 
Abstract:  This paper discusses the problems associated with using information about the signs of certain magnitudes as a basis for drawing structural conclusions in vector autoregressions. We also review available tools to solve these problems. For illustration we use Dahlhaus and Vasishtha's (2019) study of the effects of a U.S. monetary contraction on capital flows to emerging markets. We explain why sign restrictions alone are not enough to allow us to answer the question and suggest alternative approaches that could be used. 
JEL:  C30 E5 F2 
Date:  2020–01 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:26606&r=all 
By:  James J. Heckman (University of Chicago) 
Abstract:  This paper examines the case for randomized controlled trials in economics. I revisit my previous paper, "Randomization and Social Policy Evaluation," and update its message. I present a brief summary of the history of randomization in economics. I identify two waves of enthusiasm for the method as "Two Awakenings" because of the near-religious zeal associated with each wave. The First Wave substantially contributed to the development of microeconometrics because of the flawed nature of the experimental evidence. The Second Wave has improved experimental designs to avoid some of the technical statistical issues identified by econometricians in the wake of the First Wave. However, the deep conceptual issues about the parameters estimated, and the economic interpretation and policy relevance of the experimental results, have not been addressed in the Second Wave. 
Keywords:  field experiments, randomized controlled trials 
JEL:  C93 
Date:  2020–01 
URL:  http://d.repec.org/n?u=RePEc:hka:wpaper:2020001&r=all 