nep-ecm New Economics Papers
on Econometrics
Issue of 2020‒01‒20
seventeen papers chosen by
Sune Karlsson
Örebro universitet

  1. Spiked Eigenvalues of High-Dimensional Separable Sample Covariance Matrices By Bo Zhang; Jiti Gao; Guangming Pan; Yanrong Yang
  2. Testing for independence of large dimensional vectors By Bodnar, Taras; Dette, Holger; Parolya, Nestor
  3. Hypothesis Testing Based on A Vector of Statistics By Maxwell King; Xibin Zhang; Muhammad Akram
  4. Focused Bayesian Prediction By Ruben Loaiza-Maya; Gael M Martin; David T. Frazier
  5. Composite versus model-averaged quantile regression By D Bloznelis; Gerda Claeskens; Jing Zhou
  6. Causal Inference and Data-Fusion in Econometrics By Paul Hünermund; Elias Bareinboim
  7. Simulation study of estimating between-study variance and overall effect in meta-analyses of log-response-ratio for normal data By Bakbergenuly, Ilyas; Hoaglin, David C.; Kulinskaya, Elena
  8. Comparing forecast accuracy in small samples By Döhrn, Roland
  9. A dynamic conditional score model for the log correlation matrix By HAFNER Christian M.; WANG Linqi
  10. DCC-HEAVY: A multivariate GARCH model based on realized variances and correlations By BAUWENS Luc; XU Yongdeng
  11. Forecasting with Unbalanced Panel Data By Badi Baltagi; Long Liu
  12. A comment on fitting Pareto tails to complex survey data By Rafael Wildauer; Jakob Kapeller
  13. Estimating Production Functions with Fixed Effects By Abito, Jose Miguel
  14. Combining Matching and Synthetic Controls to Trade off Biases from Extrapolation and Interpolation By Maxwell Kellogg; Magne Mogstad; Guillaume Pouliot; Alexander Torgovitsky
  15. A Grid Based Approach to Analysing Spatial Weighting Matrix Specification By Rahal, Charles
  16. Drawing Conclusions from Structural Vector Autoregressions Identified on the Basis of Sign Restrictions By Christiane Baumeister; James D. Hamilton
  17. Randomization and Social Policy Evaluation Revisited By James J. Heckman

  1. By: Bo Zhang; Jiti Gao; Guangming Pan; Yanrong Yang
    Abstract: This paper establishes asymptotic properties for spiked empirical eigenvalues of sample covariance matrices for high-dimensional data with both cross-sectional dependence and a dependent sample structure. A new finding from the established theoretical results is that spiked empirical eigenvalues will reflect the dependent sample structure instead of the cross-sectional structure under some scenarios, which indicates that principal component analysis (PCA) may provide inaccurate inference for cross-sectional structures. An illustrative example is provided to show that some commonly used statistics based on spiked empirical eigenvalues misestimate the true number of common factors. As an application of high-dimensional time series, we propose a test statistic to distinguish the unit root from the factor structure and demonstrate its effective finite sample performance on simulated data. Our results are then applied to analyze OECD healthcare expenditure data and U.S. mortality data, both of which possess cross-sectional dependence as well as non-stationary temporal dependence. Notably, our results provide statistical justification for the benchmark mortality-forecasting model of Lee and Carter [25].
    Keywords: factor model, high-dimensional data, principal component analysis, spiked empirical eigenvalue.
    JEL: C21 C32 C55
    Date: 2019
  2. By: Bodnar, Taras; Dette, Holger; Parolya, Nestor
    Abstract: In this paper, new tests for the independence of two high-dimensional vectors are investigated. We consider the case where the dimension of the vectors increases with the sample size and propose multivariate analysis of variance-type statistics for the hypothesis of a block diagonal covariance matrix. The asymptotic properties of the new test statistics are investigated under the null hypothesis and the alternative hypothesis using random matrix theory. For this purpose, we study the weak convergence of linear spectral statistics of central and (conditionally) noncentral Fisher matrices. In particular, a central limit theorem for linear spectral statistics of large dimensional (conditionally) noncentral Fisher matrices is derived which is then used to analyse the power of the tests under the alternative. The theoretical results are illustrated by means of a simulation study where we also compare the new tests with several alternatives, in particular with the commonly used corrected likelihood ratio test. It is demonstrated that the latter test does not keep its nominal level if the dimension of one sub-vector is relatively small compared to the dimension of the other sub-vector. On the other hand, the tests proposed in this paper provide a reasonable approximation of the nominal level in such situations. Moreover, we observe that one of the proposed tests is most powerful under a variety of correlation scenarios.
    Keywords: Testing for independence, large dimensional covariance matrix, noncentral Fisher random matrix, linear spectral statistics, asymptotic normality
    JEL: C12 C18
    Date: 2019–08–03
  3. By: Maxwell King; Xibin Zhang; Muhammad Akram
    Abstract: This paper presents a new approach to hypothesis testing based on a vector of statistics. It involves simulating the statistics under the null hypothesis and then estimating the joint density of the statistics. This allows the p-value of the smallest acceptance region test to be estimated. We prove this p-value is a consistent estimate under some regularity conditions. The small-sample properties of the proposed procedure are investigated in the context of testing for autocorrelation, testing for normality, and testing for model misspecification through the information matrix. We find that our testing procedure has appropriate size and good power.
    Keywords: bootstrap, cross-market prediction, information matrix test, Markov chain Monte Carlo, multivariate kernel density, p-value.
    JEL: C01 C12 C14
    Date: 2019
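  The smallest-acceptance-region idea described above can be sketched in a few lines (an illustrative simplification, not the authors' code: the Gaussian-kernel density estimate, the function name, and the toy mean/variance example are our assumptions):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

def smallest_acceptance_region_pvalue(observed, simulate_null, n_sim=2000):
    """Estimate the p-value of the smallest acceptance region test.

    `observed` is the vector of statistics computed from the data;
    `simulate_null` draws one such vector under the null hypothesis.
    """
    sims = np.column_stack([simulate_null() for _ in range(n_sim)])
    kde = gaussian_kde(sims)                 # joint density of the statistics
    dens_obs = kde(observed.reshape(-1, 1))[0]
    dens_sim = kde(sims)
    # simulated vectors with lower density than the observed one lie
    # outside the smallest acceptance region through the observed point
    return np.mean(dens_sim <= dens_obs)

# toy example: two statistics (mean and variance) of an i.i.d. N(0,1) sample
def sim():
    x = rng.standard_normal(50)
    return np.array([x.mean(), x.var(ddof=1)])

obs = np.array([0.1, 1.05])   # close to the null, so a large p-value is expected
p = smallest_acceptance_region_pvalue(obs, sim)
print(p)
```

  A small p-value means the observed vector falls in a low-density region of the simulated null distribution, i.e. outside every acceptance region that contains the bulk of the simulated statistics.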
  4. By: Ruben Loaiza-Maya; Gael M Martin; David T. Frazier
    Abstract: We propose a new method for conducting Bayesian prediction that delivers accurate predictions without requiring the unknown true data generating process to be correctly specified. A prior is defined over a class of plausible predictive models. After observing data, we update the prior to a posterior over these models, via a criterion that captures a user-specified measure of predictive accuracy. Under regularity conditions, this update yields posterior concentration onto the element of the predictive class that maximizes the expectation of the accuracy measure. In a series of simulation experiments and empirical examples we find notable gains in predictive accuracy relative to conventional likelihood-based prediction.
    Keywords: loss-based prediction, Bayesian forecasting, proper scoring rules, stochastic volatility model, expected shortfall, M4 forecasting competition.
    JEL: C11 C53 C58
    Date: 2020
  5. By: D Bloznelis; Gerda Claeskens; Jing Zhou
    Abstract: The composite quantile estimator is a robust and efficient alternative to the least-squares estimator in linear models. However, it is computationally demanding when the number of quantiles is large. We consider a model-averaged quantile estimator as a computationally cheaper alternative. We derive its asymptotic properties in high-dimensional linear models and compare its performance to the composite quantile estimator in both low- and high-dimensional settings. We also assess the effect on efficiency of using equal weights, theoretically optimal weights, and estimated optimal weights for combining the different quantiles. None of the estimators dominates in all settings under consideration, leaving room in practice for both the model-averaged and the composite estimator, with either equal or estimated optimal weights.
    Keywords: Quantile regression, Model averaging, Composite estimation, Penalized estimation, Weight choice
    Date: 2018–10
  6. By: Paul Hünermund (Maastricht University); Elias Bareinboim (Columbia University)
    Abstract: Learning about cause and effect is arguably the main goal in applied econometrics. In practice, the validity of these causal inferences is contingent on a number of critical assumptions regarding the type of data and the substantive knowledge that is available about the phenomenon under investigation. For instance, unobserved confounding factors threaten the internal validity of estimates, data availability is often limited to non-random, selection-biased samples, causal effects need to be learned from surrogate experiments with imperfect compliance, and causal knowledge has to be extrapolated across structurally heterogeneous populations. A powerful causal inference framework is required in order to tackle all of these challenges, which plague essentially any data analysis to varying degrees. Building on the structural approach to causality introduced by Haavelmo (1943) and the graph-theoretic framework proposed by Pearl (1995), the AI literature has developed a wide array of techniques for causal learning that make it possible to leverage information from various imperfect, heterogeneous, and biased data sources. In this paper, we discuss recent advances made in this literature that have the potential to contribute to econometric methodology along three broad dimensions. First, they provide a unified and comprehensive framework for causal inference, in which the above-mentioned problems can be addressed in full generality. Second, due to their origin in AI, they come together with sound, efficient, and complete algorithmic criteria for automation of the corresponding identification task. And third, because of the nonparametric description of structural models that graph-theoretic approaches build on, they combine the strengths of both structural econometrics and the potential outcomes framework, and thus offer a perfect middle ground between these two competing strands of literature.
    Date: 2019–12
  7. By: Bakbergenuly, Ilyas; Hoaglin, David C.; Kulinskaya, Elena
    Abstract: Methods for random-effects meta-analysis require an estimate of the between-study variance, $\tau^2$. The performance of estimators of $\tau^2$ (measured by bias and coverage) affects their usefulness in assessing heterogeneity of study-level effects, and also the performance of related estimators of the overall effect. For the effect measure log-response-ratio (LRR, also known as the logarithm of the ratio of means, RoM), we review four point estimators of $\tau^2$ (the popular methods of DerSimonian-Laird (DL), restricted maximum likelihood, and Mandel and Paule (MP), and the less-familiar method of Jackson), four interval estimators for $\tau^2$ (profile likelihood, Q-profile, Biggerstaff and Jackson, and Jackson), five point estimators of the overall effect (the four related to the point estimators of $\tau^2$ and an estimator whose weights use only study-level sample sizes), and seven interval estimators for the overall effect (four based on the point estimators for $\tau^2$, the Hartung-Knapp-Sidik-Jonkman (HKSJ) interval, a modification of HKSJ that uses the MP estimator of $\tau^2$ instead of the DL estimator, and an interval based on the sample-size-weighted estimator). We obtain empirical evidence from extensive simulations of data from normal distributions. Simulations from lognormal distributions are in a separate report (Bakbergenuly et al., 2019b).
    Date: 2020–01–07
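  The DerSimonian-Laird point estimator reviewed here has a standard closed form, sketched below (the function name and the toy numbers are invented for illustration; the other estimators compared in the paper are not reproduced):

```python
import numpy as np

def dersimonian_laird(y, v):
    """DerSimonian-Laird estimate of the between-study variance tau^2,
    plus the corresponding random-effects estimate of the overall effect.

    y : study-level effect estimates (e.g. log response ratios)
    v : their within-study variances
    """
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v                                  # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fe) ** 2)             # Cochran's Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)      # truncated at zero
    w_re = 1.0 / (v + tau2)                      # random-effects weights
    mu_re = np.sum(w_re * y) / np.sum(w_re)
    return tau2, mu_re

# toy meta-analysis of four studies
tau2, mu = dersimonian_laird([0.2, 0.5, -0.1, 0.4], [0.04, 0.05, 0.03, 0.06])
print(tau2, mu)
```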
  8. By: Döhrn, Roland
    Abstract: The Diebold-Mariano test has become a common tool for comparing the accuracy of macroeconomic forecasts. Since these are typically model-free forecasts, distribution-free tests may be a good alternative to the Diebold-Mariano test. This paper suggests a permutation test. Stochastic simulations show that permutation tests outperform the Diebold-Mariano test. Furthermore, a test statistic based on absolute errors appears to be more sensitive to differences in forecast accuracy than a statistic based on squared errors.
    Keywords: macroeconomic forecast, forecast accuracy, Diebold-Mariano test, permutation test
    JEL: C14 C15 C53
    Date: 2019
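  A minimal permutation test of this kind can be sketched as follows (a sign-flip scheme that assumes the loss differential is symmetric about zero under the null; the function and toy data are ours, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(42)

def permutation_test(errors_a, errors_b, loss=np.abs, n_perm=5000):
    """Sign-flip permutation test for equal forecast accuracy.

    Under the null, the loss differential d_t is symmetric about zero,
    so its sign can be flipped at random without changing its distribution.
    The default loss uses absolute errors, as the paper favours.
    """
    d = loss(np.asarray(errors_a)) - loss(np.asarray(errors_b))
    t_obs = abs(d.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    t_perm = np.abs((signs * d).mean(axis=1))
    # add-one correction keeps the p-value strictly positive
    return (1 + np.sum(t_perm >= t_obs)) / (1 + n_perm)

# toy example: forecaster B has systematically larger errors
e_a = rng.normal(0.0, 1.0, size=40)
e_b = rng.normal(0.0, 2.0, size=40)
p = permutation_test(e_a, e_b)
print(p)
```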
  9. By: HAFNER Christian M. (Université catholique de Louvain, Belgium); WANG Linqi (Université catholique de Louvain, Belgium)
    Abstract: This paper proposes a new model for the dynamics of correlation matrices, where the dynamics are driven by the likelihood score with respect to the matrix logarithm of the correlation matrix. In analogy to the exponential GARCH model for volatility, this transformation ensures that the correlation matrices remain positive definite, even in high dimensions. For the conditional distribution of returns we assume a Student-t copula to explain the dependence structure and univariate Student-t for the marginals with potentially different degrees of freedom. The separation into volatility and correlation parts allows two-step estimation, which facilitates estimation in high dimensions. We derive estimation theory for one-step and two-step estimation. In an application to a set of six asset indices including financial and alternative assets we show that the model performs well in terms of various diagnostics and specification tests.
    Keywords: score, correlation, matrix logarithm, identification
    JEL: C14 C43 Z11
    Date: 2019–12–17
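  The key property exploited by the matrix-logarithm parameterization — exponentiating any symmetric matrix yields a positive definite one — can be checked directly. Below is a sketch (the rescaling to unit diagonal is our simplification of the identification step discussed in the paper, and the function name and example matrix are ours):

```python
import numpy as np
from scipy.linalg import expm

def corr_from_log(a):
    """Map an unconstrained symmetric matrix A to a correlation matrix.

    expm(A) is symmetric positive definite for any symmetric A, which is
    why dynamics specified on the matrix logarithm keep the correlation
    matrix well defined in any dimension; rescaling to a unit diagonal
    preserves positive definiteness (it is a congruence transform).
    """
    a = 0.5 * (a + a.T)            # symmetrize for numerical safety
    s = expm(a)                    # symmetric positive definite
    d = 1.0 / np.sqrt(np.diag(s))
    return d[:, None] * s * d[None, :]

a = np.array([[0.0, 0.8, -0.3],
              [0.8, 0.0, 0.5],
              [-0.3, 0.5, 0.0]])
r = corr_from_log(a)
print(np.diag(r))                          # unit diagonal
print(np.all(np.linalg.eigvalsh(r) > 0))   # positive definite
```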
  10. By: BAUWENS Luc (Université catholique de Louvain, CORE, Belgium); XU Yongdeng (Cardiff University)
    Abstract: This paper introduces the DCC-HEAVY and DECO-HEAVY models, which are dynamic models for conditional variances and correlations for daily returns based on measures of realized variances and correlations built from intraday data. Formulas for multi-step forecasts of conditional variances and correlations are provided. Asymmetric versions of the models are developed. An empirical study shows that in terms of forecasts the new HEAVY models outperform the BEKK-HEAVY model based on realized covariances, and the BEKK, DCC and DECO multivariate GARCH models based exclusively on daily data.
    Keywords: dynamic conditional correlations, forecasting, multivariate HEAVY, multivariate GARCH, realized correlations
    JEL: C32 C58 G17
    Date: 2019–12–17
  11. By: Badi Baltagi (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); Long Liu (College of Business, University of Texas at San Antonio)
    Abstract: This paper derives the best linear unbiased prediction (BLUP) for an unbalanced panel data model. Starting with a simple error component regression model with unbalanced panel data and random effects, it generalizes the BLUP derived by Taub (1979) to unbalanced panels. Next it derives the BLUP for an unequally spaced panel data model with serial correlation of the AR(1) type in the remainder disturbances considered by Baltagi and Wu (1999). This in turn extends the BLUP for a panel data model with AR(1) type remainder disturbances derived by Baltagi and Li (1992) from the balanced to the unequally spaced panel data case. The derivations are easily implemented and reduce to tractable expressions using an extension of the Fuller and Battese (1974) transformation from the balanced to the unbalanced panel data case.
    Keywords: Forecasting, BLUP, Unbalanced Panel Data, Unequally Spaced Panels, Serial Correlation
    JEL: C33
    Date: 2020–01
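  The Fuller and Battese (1974) quasi-demeaning that these derivations extend can be sketched for the unbalanced random-effects case (variance components are taken as known here, whereas in practice they are estimated; the function name and example are ours):

```python
import numpy as np

def fuller_battese_transform(y, ids, sigma2_mu, sigma2_nu):
    """Quasi-demean an unbalanced random-effects panel.

    Each observation y_it is replaced by y_it - theta_i * ybar_i, where
    theta_i = 1 - sqrt(sigma2_nu / (T_i * sigma2_mu + sigma2_nu)) and
    T_i is the number of observations on unit i, so theta_i varies with
    the unit's length in the unbalanced case. Applying OLS to the
    transformed y and X yields the GLS estimator.
    """
    y = np.asarray(y, float)
    ids = np.asarray(ids)
    out = np.empty_like(y)
    for i in np.unique(ids):
        mask = ids == i
        t_i = mask.sum()
        theta = 1.0 - np.sqrt(sigma2_nu / (t_i * sigma2_mu + sigma2_nu))
        out[mask] = y[mask] - theta * y[mask].mean()
    return out

# toy unbalanced panel: unit 0 has three periods, unit 1 has two
y = np.array([1.0, 2.0, 3.0, 10.0, 12.0])
ids = [0, 0, 0, 1, 1]
print(fuller_battese_transform(y, ids, sigma2_mu=4.0, sigma2_nu=1.0))
```

  When sigma2_mu is zero, theta_i is zero for every unit and the transformation leaves the data unchanged, as it should: with no random effect, OLS is already GLS.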
  12. By: Rafael Wildauer (University of Greenwich); Jakob Kapeller
    Abstract: Taking survey data on household wealth as our major example, this short paper discusses some of the issues applied researchers face when fitting (type I) Pareto distributions to complex survey data. The contribution of this paper is twofold: First, we propose a new and intuitive way of deriving Gabaix and Ibragimov’s (2011) bias correction for Pareto tail estimations, from which the generalization to complex survey data follows naturally. Second, we summarise how Kolmogorov-Smirnov and Cramér-von Mises goodness of fit tests can be generalized to complex survey data. Taken together, we think the paper provides a concise and useful presentation of the fundamentals of Pareto tail fitting with complex survey data.
    Keywords: Pareto distribution, complex survey data, wealth distribution
    JEL: C46 C83 D31
    Date: 2020–01
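  The Gabaix and Ibragimov (2011) rank-1/2 correction that the paper starts from can be sketched for equal-weight data (the generalization to complex survey weights is the paper's contribution and is not reproduced here; the function name and toy draws are ours):

```python
import numpy as np

def pareto_tail_exponent(sizes):
    """Gabaix-Ibragimov rank-1/2 estimator of the Pareto tail exponent.

    Sort observations in decreasing order and regress log(rank - 1/2)
    on log(size); minus the OLS slope estimates the exponent alpha.
    Subtracting 1/2 from the rank removes the leading small-sample bias
    of the plain log-rank regression.
    """
    x = np.sort(np.asarray(sizes, float))[::-1]     # largest first
    ranks = np.arange(1, x.size + 1)
    slope, _ = np.polyfit(np.log(x), np.log(ranks - 0.5), 1)
    return -slope

# toy check: exact Pareto(alpha = 2) draws via the inverse CDF
rng = np.random.default_rng(1)
u = rng.uniform(size=5000)
draws = (1.0 - u) ** (-1.0 / 2.0)        # Pareto, scale 1, alpha = 2
alpha = pareto_tail_exponent(draws)
print(alpha)                             # should be close to 2
```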
  13. By: Abito, Jose Miguel
    Abstract: I propose an estimation procedure that can accommodate fixed effects in the widely used proxy variable approach to estimating production functions. The procedure allows unobserved productivity to have a permanent component in addition to a (nonlinear) Markov shock. The procedure does not rely on differencing out the fixed effect and thus is not restricted to within-firm variation for identification. Finally, the procedure is easy to implement as it only entails adding a two stage least squares step using internal instruments.
    Keywords: Production function, Estimation, Fixed Effects, Unobserved productivity, Proxy variables, Errors-in-Variables, Instrumental variables
    JEL: C0 C01 L0 L00 O4
    Date: 2019–12–24
  14. By: Maxwell Kellogg; Magne Mogstad; Guillaume Pouliot; Alexander Torgovitsky
    Abstract: The synthetic control (SC) method is widely used in comparative case studies to adjust for differences in pre-treatment characteristics. A major attraction of the method is that it limits extrapolation bias that can occur when untreated units with different pre-treatment characteristics are combined using a traditional adjustment, such as a linear regression. Instead, the SC estimator is susceptible to interpolation bias because it uses a convex weighted average of the untreated units to create a synthetic untreated unit with pre-treatment characteristics similar to those of the treated unit. More traditional matching estimators exhibit the opposite behavior: they limit interpolation bias at the potential expense of extrapolation bias. We propose combining the matching and synthetic control estimators through model averaging. We show how to use a rolling-origin cross-validation procedure to train the model averaging estimator to resolve trade-offs between interpolation and extrapolation bias. We evaluate the estimator through Monte Carlo simulations and placebo studies before using it to re-examine the economic costs of conflicts. Not only does the model averaging estimator perform far better than synthetic controls and other alternatives in the simulations and placebo exercises; it also yields treatment effect estimates that are substantially different from those of the other estimators.
    JEL: C0 H0 J0
    Date: 2020–01
  15. By: Rahal, Charles
    Abstract: We outline a grid-based approach to provide further evidence against the misconception that the results of spatial econometric models are sensitive to the exact specification of the exogenously set weighting matrix (otherwise known as the 'biggest myth in spatial econometrics'). Our application estimates three large sets of specifications using an original dataset which contains information on the Prime Central London housing market. We show that while posterior model probabilities may indicate a strong preference for an extremely small number of models, and while the spatial autocorrelation parameter varies substantially, median direct effects remain stable across the entire permissible spatial weighting matrix space. We argue that spatial econometric models should be estimated across this entire space, as opposed to the current convention of merely estimating a cursory number of points for robustness.
    Date: 2019–12–10
  16. By: Christiane Baumeister; James D. Hamilton
    Abstract: This paper discusses the problems associated with using information about the signs of certain magnitudes as a basis for drawing structural conclusions in vector autoregressions. We also review available tools to solve these problems. For illustration we use Dahlhaus and Vasishtha's (2019) study of the effects of a U.S. monetary contraction on capital flows to emerging markets. We explain why sign restrictions alone are not enough to allow us to answer the question and suggest alternative approaches that could be used.
    JEL: C30 E5 F2
    Date: 2020–01
  17. By: James J. Heckman (University of Chicago)
    Abstract: This paper examines the case for randomized controlled trials in economics. I revisit my previous paper--"Randomization and Social Policy Evaluation"--and update its message. I present a brief summary of the history of randomization in economics. I identify two waves of enthusiasm for the method as "Two Awakenings" because of the near-religious zeal associated with each wave. The First Wave substantially contributed to the development of microeconometrics because of the flawed nature of the experimental evidence. The Second Wave has improved experimental designs to avoid some of the technical statistical issues identified by econometricians in the wake of the First Wave. However, the deep conceptual issues about the parameters being estimated, the economic interpretation of the experimental results, and their policy relevance have not been addressed in the Second Wave.
    Keywords: field experiments, randomized control trials
    JEL: C93
    Date: 2020–01

This nep-ecm issue is ©2020 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at . For comments, please write to the director of NEP, Marco Novarese, at <>. Put “NEP” in the subject line; otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.