nep-ecm New Economics Papers
on Econometrics
Issue of 2022‒08‒08
eighteen papers chosen by
Sune Karlsson
Örebro universitet

  1. CCE Estimation of High-Dimensional Panel Data Models with Interactive Fixed Effects By Vogt, M.; Walsh, C.; Linton, O.
  2. GMM Estimation for High-Dimensional Panel Data Models By Tingting Cheng; Chaohua Dong; Jiti Gao; Oliver Linton
  3. A mixture of ordered probit models with endogenous switching between two latent classes By Jochem Huismans; Andrei Sirchenko; Jan Willem Nijenhuis
  4. A Constructive GAN-based Approach to Exact Estimate Treatment Effect without Matching By Boyang You; Kerry Papps
  5. Consistency without inference: instrumental variables in practical application By Young, Alwyn
  6. Should Copula Endogeneity Correction Include Generated Regressors for Higher-order Terms? No, It Hurts By Yi Qian; Hui Xie; Anthony Koschmann
  7. A Closed-form Alternative Estimator for GLM with Categorical Explanatory Variables By Alexandre Brouste; Christophe Dutang; Tom Rohmer
  8. Optimality of Matched-Pair Designs in Randomized Controlled Trials By Yuehao Bai
  9. Instrumented Common Confounding By Christian Tien
  10. Estimation of DSGE Models With the Effective Lower Bound By Gregor Boehl; Felix Strobel
  11. Ensemble MCMC Sampling for DSGE Models By Gregor Boehl
  12. Likelihood ratio test for structural changes in factor models By Jushan Bai; Jiangtao Duan; Xu Han
  13. On the Performance of the Neyman Allocation with Small Pilots By Yong Cai; Ahnaf Rafi
  14. Nonparametric Analysis of the Mixed-Demand Model By Hjertstrand, Per
  15. A Maximum Entropy Estimate of Uncertainty about a Wine Rating By Bodington, Jeff
  16. Misspecification and Weak Identification in Asset Pricing By Frank Kleibergen; Zhaoguo Zhan
  17. Robust Knockoffs for Controlling False Discoveries With an Application to Bond Recovery Rates By Konstantin Görgen; Abdolreza Nazemi; Melanie Schienle
  18. Quantum Monte Carlo for Economics: Stress Testing and Macroeconomic Deep Learning By Vladimir Skavysh; Sofia Priazhkina; Diego Guala; Thomas Bromley

  1. By: Vogt, M.; Walsh, C.; Linton, O.
    Abstract: Interactive fixed effects are a popular means to model unobserved heterogeneity in panel data. Models with interactive fixed effects are well studied in the low-dimensional case where the number of parameters to be estimated is small. However, they are largely unexplored in the high-dimensional case where the number of parameters is large, potentially much larger than the sample size itself. In this paper, we develop new econometric methods for the estimation of high-dimensional panel data models with interactive fixed effects. Our estimator is based on similar ideas as the very popular common correlated effects (CCE) estimator which is frequently used in the low-dimensional case. We thus call our estimator a high-dimensional CCE estimator. We derive theory for the estimator both in the large-T-case, where the time series length T tends to infinity, and in the small-T-case, where T is a fixed natural number. The theoretical analysis of the paper is complemented by a simulation study which evaluates the finite sample performance of the estimator.
    Keywords: CCE estimator, high-dimensional model, interactive fixed effects, lasso, panel data
    JEL: C13 C23 C55
    Date: 2022–06–28
  2. By: Tingting Cheng; Chaohua Dong; Jiti Gao; Oliver Linton
    Abstract: In this paper, we study a class of high dimensional moment restriction panel data models with interactive effects, where factors are unobserved and factor loadings are nonparametrically unknown smooth functions of individual characteristics variables. We allow the dimension of the parameter vector and the number of moment conditions to diverge with sample size. This is a very general framework and includes many existing linear and nonlinear panel data models as special cases. In order to estimate the unknown parameters, factors and factor loadings, we propose a sieve-based generalized method of moments estimation method and we show that under a set of simple identification conditions, all those unknown quantities can be consistently estimated. Further we establish asymptotic distributions of the proposed estimators. In addition, we propose tests for over-identification, specification of factor loading functions, and establish their large sample properties. Moreover, a number of simulation studies are conducted to examine the performance of the proposed estimators and test statistics in finite samples. An empirical example on stock return prediction is studied to demonstrate the usefulness of the proposed framework and corresponding estimation methods and testing procedures.
    Keywords: generalized method of moments, high dimensional moment model, interactive effect, over-identification issue, panel data, sieve method
    JEL: C13 C14 C23
    Date: 2022
  3. By: Jochem Huismans (Universiteit van Amsterdam); Andrei Sirchenko (Universiteit Maastricht); Jan Willem Nijenhuis (Universiteit Twente)
    Abstract: Ordinal responses can be generated, in a time-series context, by different latent regimes or, in a cross-sectional context, by different unobserved classes of population. We introduce a new command, swopit, that fits a mixture of ordered probit models with either exogenous or endogenous switching between two latent classes (or regimes). Switching is endogenous if the unobservables in the class-assignment model are correlated with the unobservables in the outcome models. We provide a battery of postestimation commands, assess by Monte Carlo experiments the finite-sample performance of the maximum likelihood estimator of the parameters, probabilities and their standard errors (both the asymptotic and bootstrap ones), and apply the new command to model the policy interest rates.
    Date: 2022–06–10
  4. By: Boyang You; Kerry Papps
    Abstract: Matching has become the mainstream approach to counterfactual inference, as it can substantially reduce selection bias between sample groups. In practice, however, when estimating the average treatment effect on the treated (ATT) via matching, a trade-off between estimation accuracy and information loss always exists, whichever method is used. Attempting to replace the matching process entirely, this paper proposes the GAN-ATT estimator, which integrates a generative adversarial network (GAN) into the counterfactual inference framework. Through GAN machine learning, the probability density functions (PDFs) of samples in both the treatment group and the control group can be approximated. By differentiating the conditional PDFs of the two groups under an identical input condition, the conditional average treatment effect (CATE) can be estimated, and the ensemble average of the corresponding CATEs over all treatment group samples is the estimate of the ATT. With GAN-based infinite sample augmentation, problems arising from insufficient samples or a lack of common support can be easily solved. In theory, if the GAN learns the PDFs perfectly, our estimator provides an exact estimate of the ATT. To check the performance of the GAN-ATT estimator, three data sets are used for ATT estimation. Two toy data sets, with one- or two-dimensional covariate inputs and constant or covariate-dependent treatment effects, are tested; the GAN-ATT estimates prove close to the ground truth and better than those of traditional matching approaches. A real firm-level data set with high-dimensional input is also tested, and the applicability to real data sets is evaluated by comparison with matching approaches. Based on the evidence from the three tests, we believe that the GAN-ATT estimator has significant advantages over traditional matching methods in estimating the ATT.
    Date: 2022–06
  5. By: Young, Alwyn
    Abstract: I use Monte Carlo simulations, the jackknife and multiple forms of the bootstrap to study a comprehensive sample of 1309 instrumental variables regressions in 30 papers published in the journals of the American Economic Association. Monte Carlo simulations based upon published regressions show that non-iid error processes in highly leveraged regressions, both prominent features of published work, adversely affect the size and power of IV tests, while increasing the bias and mean squared error of IV relative to OLS. Weak instrument pre-tests based upon F-statistics are found to be largely uninformative of both size and bias. In published papers IV has little power as, despite producing substantively different estimates, it rarely rejects the OLS point estimate or the null that OLS is unbiased, while the statistical significance of excluded instruments is exaggerated.
    Keywords: Elsevier deal
    JEL: J1
    Date: 2022–04–10
  6. By: Yi Qian; Hui Xie; Anthony Koschmann
    Abstract: Causal inference in empirical studies is often challenging because of the presence of endogenous regressors. The classical approach to the problem requires using instrumental variables that must satisfy the stringent condition of exclusion restriction. A forefront of recent research is a new paradigm of handling endogenous regressors without using instrumental variables. Park and Gupta (Marketing Science, 2012) proposed instrument-free estimation using copulas that has been increasingly used in practical applications to address endogeneity bias. A relevant issue not studied is how to handle the higher-order terms (e.g., interaction and quadratic terms) of endogenous regressors using the copula approach. Recent applications of the approach have used disparate ways of handling these higher-order endogenous terms with unclear consequences. We show that once copula correction terms for the main effects of endogenous regressors are included as generated regressors, there is no need to include additional correction terms for the higher-order terms. This simplicity in handling higher-order endogenous regression terms is a merit of the instrument-free copula bias correction approach. More importantly, adding these unnecessary correction terms has harmful effects and leads to sub-optimal solutions of endogeneity bias, including finite-sample estimation bias and substantially inflated variability in estimates.
    JEL: C01 C1
    Date: 2022–04
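The recommendation in item 6 can be illustrated with a minimal Python sketch. It assumes the usual form of the Park and Gupta (2012) generated regressor, the standard normal quantile of the empirical CDF of the endogenous regressor. The function name `copula_term`, the rank/(n+1) scaling (used here to keep the quantile finite), and the toy data are assumptions for illustration, not taken from the paper. The design rows contain a quadratic term in the regressor but only one correction term, the one for the main effect.

```python
from statistics import NormalDist

def copula_term(p):
    """Generated regressor: standard normal quantile of the empirical CDF of p.

    Ranks are scaled by n + 1 (an assumption of this sketch) so that the
    largest observation does not map to an infinite quantile. Ties are not
    handled; the regressor is assumed continuous.
    """
    n = len(p)
    order = sorted(range(n), key=lambda i: p[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]

# Hypothetical endogenous regressor.
p = [2.0, 0.5, 1.5]
c = copula_term(p)

# Design rows: intercept, main effect, quadratic term, and a single
# correction term for the main effect. No extra correction term is
# generated for p**2.
rows = [[1.0, pi, pi ** 2, ci] for pi, ci in zip(p, c)]
for row in rows:
    print(row)
```

The outcome would then be regressed on these columns by ordinary least squares; the sketch stops at the design matrix to stay within the standard library.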
  7. By: Alexandre Brouste (LMM - Laboratoire Manceau de Mathématiques - UM - Le Mans Université); Christophe Dutang (CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - Université Paris Dauphine-PSL - PSL - Université Paris sciences et lettres - CNRS - Centre National de la Recherche Scientifique); Tom Rohmer (GenPhySE - Génétique Physiologie et Systèmes d'Elevage - ENVT - Ecole Nationale Vétérinaire de Toulouse - Toulouse INP - Institut National Polytechnique (Toulouse) - Université Fédérale Toulouse Midi-Pyrénées - École nationale supérieure agronomique de Toulouse [ENSAT] - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement)
    Abstract: The parameters of generalized linear models (GLMs) are usually estimated by the maximum likelihood estimator (MLE), which is known to be asymptotically efficient. But the MLE is computed using a Newton-Raphson-type algorithm, which is time-consuming for a large number of variables or modalities, or a large sample size. An alternative closed-form estimator is proposed in this paper for the case of categorical explanatory variables. Asymptotic properties of the alternative estimator are studied. The performance of the proposed estimator, in terms of both computation time and asymptotic variance, is compared with that of the MLE for a Gamma-distributed GLM.
    Keywords: Regression models, explicit estimators, categorical explanatory variables, GLM, asymptotic distribution
    Date: 2022–06–07
  8. By: Yuehao Bai
    Abstract: In randomized controlled trials (RCTs), treatment is often assigned by stratified randomization. I show that among all stratified randomization schemes which treat all units with probability one half, a certain matched-pair design achieves the maximum statistical precision for estimating the average treatment effect (ATE). In an important special case, the optimal design pairs units according to the baseline outcome. In a simulation study based on datasets from 10 RCTs, this design lowers the standard error for the estimator of the ATE by 10% on average, and by up to 34%, relative to the original designs.
    Date: 2022–06
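The special case mentioned in item 8, pairing units on the baseline outcome, can be sketched roughly as follows. This is an illustrative Python sketch, not the paper's implementation: the function name `matched_pair_assignment` and the toy data are hypothetical, and an even number of units is assumed. Units are sorted by baseline outcome, adjacent units form a pair, and exactly one unit per pair is treated, so each unit is treated with probability one half.

```python
import random

def matched_pair_assignment(baseline, seed=0):
    """Return a 0/1 treatment list from a matched-pair design on `baseline`."""
    if len(baseline) % 2 != 0:
        raise ValueError("this sketch assumes an even number of units")
    rng = random.Random(seed)
    # Sort unit indices by baseline outcome so neighbours are similar.
    order = sorted(range(len(baseline)), key=lambda i: baseline[i])
    treatment = [0] * len(baseline)
    for k in range(0, len(order), 2):
        pair = (order[k], order[k + 1])
        # Treat one unit of the pair, chosen with probability one half.
        treated = pair[rng.randrange(2)]
        treatment[treated] = 1
    return treatment

# Hypothetical baseline outcomes for six units.
baseline = [3.1, 0.4, 2.2, 5.0, 0.9, 4.7]
assignment = matched_pair_assignment(baseline)
print(assignment)
```

With these data the pairs are units (1, 4), (2, 0) and (5, 3) in zero-based indexing, and each pair contributes exactly one treated unit.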
  9. By: Christian Tien
    Abstract: Causal inference is difficult in the presence of unobserved confounders. We introduce the instrumented common confounding (ICC) approach to (nonparametrically) identify causal effects with instruments, which are exogenous only conditional on some unobserved common confounders. The ICC approach is most useful in rich observational data with multiple sources of unobserved confounding, where instruments are at most exogenous conditional on some unobserved common confounders. Suitable examples of this setting are various identification problems in the social sciences, nonlinear dynamic panels, and problems with multiple endogenous confounders. The ICC identifying assumptions are closely related to those in mixture models, negative control and IV. Compared to mixture models [Bonhomme et al., 2016], we require fewer conditionally independent variables and do not need to model the unobserved confounder. Compared to negative control [Cui et al., 2020], we allow for non-common confounders, with respect to which the instruments are exogenous. Compared to IV [Newey and Powell, 2003], we allow instruments to be exogenous conditional on some unobserved common confounders, for which a set of relevant observed variables exists. We prove point identification under outcome-model restrictions or, alternatively, first-stage restrictions. We provide a practical step-by-step guide to the ICC model assumptions and present the causal effect of education on income as a motivating example.
    Date: 2022–06
  10. By: Gregor Boehl; Felix Strobel
    Abstract: We propose a set of tools for the efficient and robust Bayesian estimation of medium- and large-scale DSGE models while accounting for the effective lower bound on nominal interest rates. We combine a novel nonlinear recursive filter with a computationally efficient piece-wise linear solution method and a state-of-the-art MCMC sampler. The filter allows for fast likelihood approximations, in particular of models with large state spaces. Using artificial data, we demonstrate that our methods accurately capture the true model parameters even with very long lower bound episodes. We apply our approach to analyze post-2008 US business cycle properties.
    Keywords: Effective Lower Bound, Bayesian Estimation, Great Recession, Business Cycles
    JEL: C11 C63 E31 E32 E44
    Date: 2022–06
  11. By: Gregor Boehl
    Abstract: This paper develops an adaptive differential evolution Markov chain Monte Carlo (ADEMC) sampler. The sampler satisfies five requirements that make it suitable especially for the estimation of models with high-dimensional posterior distributions and which are computationally expensive to evaluate: (i) A large number of chains (the "ensemble") where the number of chains scales inversely (nearly one-to-one) with the number of necessary ensemble iterations until convergence, (ii) fast burn-in and convergence (thereby superseding the need for numerical optimization), (iii) good performance for bimodal distributions, (iv) an endogenous proposal density generated from the state of the full ensemble, which (v) respects the bounds of prior distribution. Consequently, ADEMC is straightforward to parallelize. I use the sampler to estimate a heterogeneous agent New Keynesian (HANK) model including the micro parameters linked to the stationary distribution of the model.
    Keywords: Bayesian Estimation, Monte Carlo Methods, DSGE Models, Heterogeneous Agents
    JEL: C11 C13 C15 E10
    Date: 2022–06
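For readers unfamiliar with ensemble samplers of the kind described in item 11, the following Python sketch shows a plain differential evolution MCMC update in the style of ter Braak (2006). It is not the adaptive ADEMC sampler of the paper. The proposal for each chain is built from the difference of two other chains in the ensemble, which is what makes the proposal density endogenous to the ensemble state. The function names, the standard normal target, the jitter size and all tuning values are assumptions chosen for illustration.

```python
import math
import random

def log_target(x):
    """Log density (up to a constant) of a standard normal, a stand-in target."""
    return -0.5 * sum(v * v for v in x)

def de_mc(log_density, n_chains=20, n_iter=500, dim=1, seed=0):
    """Basic differential evolution MCMC; returns all draws from all chains."""
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2 * dim)  # common default scaling for DE-MC
    chains = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_chains)]
    logp = [log_density(x) for x in chains]
    draws = []
    for _ in range(n_iter):
        for i in range(n_chains):
            # Pick two distinct other chains and propose along their difference.
            a, b = rng.sample([j for j in range(n_chains) if j != i], 2)
            proposal = [
                chains[i][k] + gamma * (chains[a][k] - chains[b][k]) + rng.gauss(0.0, 1e-4)
                for k in range(dim)
            ]
            lp = log_density(proposal)
            # Metropolis acceptance step.
            if rng.random() < math.exp(min(0.0, lp - logp[i])):
                chains[i], logp[i] = proposal, lp
        draws.extend(list(x) for x in chains)
    return draws

samples = de_mc(log_target)
kept = samples[20 * 100:]  # discard the first 100 sweeps as burn-in
mean = sum(x[0] for x in kept) / len(kept)
print(len(kept), round(mean, 3))
```

Each chain's log density evaluation within a sweep could be farmed out to a separate worker in a batch-updating variant, which is the sense in which such samplers parallelize well; this sketch updates chains sequentially for simplicity.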
  12. By: Jushan Bai; Jiangtao Duan; Xu Han
    Abstract: A factor model with a break in its factor loadings is observationally equivalent to a model without changes in the loadings but with a change in the variance of its factors. This effectively transforms a structural change problem of high dimension into a problem of low dimension. This paper considers the likelihood ratio (LR) test for a variance change in the estimated factors. The LR test implicitly exploits a special feature of the estimated factors: the pre-break and post-break variances can be singular matrices under the alternative hypothesis, making the LR test diverge faster and thus more powerful than Wald-type tests. The better power property of the LR test is also confirmed by simulations. We also consider mean changes and multiple breaks. We apply the procedure to factor modelling and structural change in US employment using monthly industry-level data.
    Date: 2022–06
  13. By: Yong Cai; Ahnaf Rafi
    Abstract: The Neyman Allocation and its conditional counterpart are used in many papers on experiment design, which typically assume that researchers have access to large pilot studies. This may not be realistic. To understand the properties of the Neyman Allocation with small pilots, we study its behavior in a novel asymptotic framework for two-wave experiments in which the pilot size is assumed to be fixed while the main wave sample size grows. Our analysis shows that the Neyman Allocation can lead to estimates of the ATE with higher asymptotic variance than with (non-adaptive) balanced randomization, particularly when the population is relatively homoskedastic. We also provide a series of empirical examples showing that the Neyman Allocation may perform poorly for values of homoskedasticity that are relevant for researchers. Our results suggest caution when employing experiment design methods involving the Neyman Allocation estimated from a small pilot study.
    Date: 2022–06
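The concern raised in item 13 can be made concrete with a small Python sketch. It assumes the textbook form of the Neyman Allocation, in which the treated share is proportional to the treated-arm standard deviation, s1 / (s1 + s0), and the variance of the difference in means scaled by total sample size is s1^2 / share + s0^2 / (1 - share). The function names and the pilot data are hypothetical, and the sketch is not the paper's asymptotic framework.

```python
import statistics

def neyman_share(pilot_treated, pilot_control):
    """Treated share implied by the Neyman Allocation from pilot standard deviations."""
    s1 = statistics.stdev(pilot_treated)
    s0 = statistics.stdev(pilot_control)
    if s1 + s0 == 0:
        return 0.5  # no variation in the pilot: fall back to a balanced design
    return s1 / (s1 + s0)

def scaled_variance(share, sd_treated, sd_control):
    """Variance of the difference in means times total sample size; needs 0 < share < 1."""
    return sd_treated ** 2 / share + sd_control ** 2 / (1 - share)

# Hypothetical pilot with three units per arm.
pilot_treated = [1.0, 3.0, 5.0]   # sample standard deviation 2.0
pilot_control = [2.0, 3.0, 4.0]   # sample standard deviation 1.0
share = neyman_share(pilot_treated, pilot_control)

# Suppose the population is in fact homoskedastic with unit standard deviations.
var_neyman = scaled_variance(share, 1.0, 1.0)
var_balanced = scaled_variance(0.5, 1.0, 1.0)
print(share, var_neyman, var_balanced)
```

In this toy case the noisy pilot pushes the treated share to two thirds, and under homoskedasticity the resulting scaled variance (4.5) exceeds that of balanced randomization (4.0).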
  14. By: Hjertstrand, Per (Research Institute of Industrial Economics (IFN))
    Abstract: The mixed-demand model allows for very flexible specification of what should be considered endogenous and exogenous in demand system estimation. This paper introduces a revealed preference framework to analyze the mixed-demand model. The proposed methods can be used to test whether observed data (with measurement errors) are consistent with the mixed-demand model and calculate goodness-of-fit measures. The framework is purely non-parametric in the sense that it does not require any functional form assumptions on the direct or indirect utility functions. The framework is applied to demand data for food and provides the first nonparametric empirical analysis of the mixed-demand model.
    Keywords: Demand systems; Measurement errors; Mixed-demand; Non-parametric; Revealed preference
    JEL: D11 D12
    Date: 2022–05–11
  15. By: Bodington, Jeff
    Abstract: Much research shows that the ratings that judges assign to wines are uncertain. An acute difficulty in ratings-related research, and in calculating consensus among judges, is that each rating is one observation drawn from a unique and latent distribution that is wine- and judge-specific. A simple maximum entropy estimator is proposed that yields a maximum-entropy probability distribution for sample sizes of none, one, and more. A test of that estimator yields results that are consistent with the results of experiments in which blind replicates are embedded within flights of wines evaluated by trained and tested judges.
    Keywords: Research Methods/ Statistical Methods, Agribusiness
    Date: 2021
  16. By: Frank Kleibergen; Zhaoguo Zhan
    Abstract: The widespread co-existence of misspecification and weak identification in asset pricing has led to an overstated performance of risk factors. Because the conventional Fama and MacBeth (1973) methodology is jeopardized by misspecification and weak identification, we infer risk premia by using a double robust Lagrange multiplier test that remains reliable in the presence of these two empirically relevant issues. Moreover, we show how the identification, and the resulting appropriate interpretation, of the risk premia is governed by the relative magnitudes of the misspecification J-statistic and the identification IS-statistic. We revisit several prominent empirical applications and all specifications with one to six factors from the factor zoo of Feng, Giglio, and Xiu (2020) to emphasize the widespread occurrence of misspecification and weak identification.
    Date: 2022–06
  17. By: Konstantin Görgen; Abdolreza Nazemi; Melanie Schienle
    Abstract: We address challenges in variable selection with highly correlated data, which are frequently present in finance and economics but also in complex natural systems such as weather. We develop a robustified version of the knockoff framework, which addresses challenges with high dependence among possibly many influencing factors and strong time correlation. In particular, the repeated subsampling strategy tackles the variability of the knockoffs and the dependency of factors. Simultaneously, we also control the proportion of false discoveries over a grid of all possible values, which mitigates variability of selected factors from ad-hoc choices of a specific false discovery level. In the application to corporate bond recovery rates, we identify new important groups of relevant factors on top of the known standard drivers. We also show that, out of sample, the resulting sparse model has similar predictive power to state-of-the-art machine learning models that use the entire set of predictors.
    Date: 2022–06
  18. By: Vladimir Skavysh; Sofia Priazhkina; Diego Guala; Thomas Bromley
    Abstract: Computational methods both open the frontiers of economic analysis and serve as a bottleneck in what can be achieved. Using the quantum Monte Carlo (QMC) algorithm, we are the first to study whether quantum computing can improve the run time of economic applications and the challenges in doing so. We identify a large class of economic problems suitable for improvements. Then, we illustrate how to formulate and encode on a quantum circuit two applications: (a) a bank stress testing model with credit shocks and fire sales and (b) a dynamic stochastic general equilibrium (DSGE) model solved with deep learning, and further demonstrate potential efficiency gains. We also present a few innovations in the QMC algorithm itself and in how to benchmark it to classical MC.
    Keywords: Business fluctuations and cycles; Central bank research; Econometric and statistical methods; Economic models; Financial stability
    Date: 2022–06

This nep-ecm issue is ©2022 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.