Econometrics
http://lists.repec.org/mailman/listinfo/nep-ecm
Econometrics
2022-08-08
CCE Estimation of High-Dimensional Panel Data Models with Interactive Fixed Effects
http://d.repec.org/n?u=RePEc:cam:camdae:2242&r=&r=ecm
Interactive fixed effects are a popular means to model unobserved heterogeneity in panel data. Models with interactive fixed effects are well studied in the low-dimensional case where the number of parameters to be estimated is small. However, they are largely unexplored in the high-dimensional case where the number of parameters is large, potentially much larger than the sample size itself. In this paper, we develop new econometric methods for the estimation of high-dimensional panel data models with interactive fixed effects. Our estimator is based on similar ideas as the very popular common correlated effects (CCE) estimator which is frequently used in the low-dimensional case. We thus call our estimator a high-dimensional CCE estimator. We derive theory for the estimator both in the large-T case, where the time series length T tends to infinity, and in the small-T case, where T is a fixed natural number. The theoretical analysis of the paper is complemented by a simulation study which evaluates the finite sample performance of the estimator.
Vogt, M.
Walsh, C.
Linton, O.
CCE estimator, high-dimensional model, interactive fixed effects, lasso, panel data
2022-06-28
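The core CCE idea above, proxying the unobserved factors by cross-sectional averages of the observables and projecting them out, can be sketched in a few lines. This is a low-dimensional, pooled illustration on simulated data, not the authors' high-dimensional estimator; all variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 50, 100, 2.0

# One interactive fixed effect: y_it = beta * x_it + lam_i * f_t + e_it,
# with loadings correlated across the y- and x-equations so that
# pooled OLS ignoring the factor structure is biased.
f = rng.normal(size=T)
lam = 1.0 + 0.5 * rng.normal(size=N)
gam = lam + 0.5 * rng.normal(size=N)
x = gam[:, None] * f[None, :] + rng.normal(size=(N, T))
y = beta * x + lam[:, None] * f[None, :] + rng.normal(size=(N, T))

# CCE: use cross-sectional averages of y and x as observable proxies
# for the unobserved factor, and project them out before pooling.
Z = np.column_stack([np.ones(T), y.mean(axis=0), x.mean(axis=0)])
M = np.eye(T) - Z @ np.linalg.solve(Z.T @ Z, Z.T)
num = sum(x[i] @ M @ y[i] for i in range(N))
den = sum(x[i] @ M @ x[i] for i in range(N))
beta_cce = num / den
```

With the factor projected out, the pooled estimate lands close to the true slope of 2 despite the endogenous factor structure.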
GMM Estimation for High-Dimensional Panel Data Models
http://d.repec.org/n?u=RePEc:msh:ebswps:2022-11&r=&r=ecm
In this paper, we study a class of high-dimensional moment restriction panel data models with interactive effects, where factors are unobserved and factor loadings are nonparametrically unknown smooth functions of individual characteristics. We allow the dimension of the parameter vector and the number of moment conditions to diverge with the sample size. This is a very general framework that includes many existing linear and nonlinear panel data models as special cases. In order to estimate the unknown parameters, factors and factor loadings, we propose a sieve-based generalized method of moments (GMM) estimation method, and we show that under a set of simple identification conditions, all these unknown quantities can be consistently estimated. Further, we establish asymptotic distributions of the proposed estimators. In addition, we propose tests for over-identification and for the specification of the factor loading functions, and establish their large-sample properties. Moreover, a number of simulation studies examine the performance of the proposed estimators and test statistics in finite samples. An empirical example on stock return prediction demonstrates the usefulness of the proposed framework and the corresponding estimation methods and testing procedures.
Tingting Cheng
Chaohua Dong
Jiti Gao
Oliver Linton
generalized method of moments, high dimensional moment model, interactive effect, over-identification issue, panel data, sieve method
2022
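The GMM building block underlying the paper, choosing parameters to set sample moment conditions as close to zero as possible, can be illustrated with a deliberately simple linear instrumental-variables case: one parameter, two moments, nothing sieve-based or high-dimensional. The simulation design is an assumption of ours.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 2))                  # two instruments
u = rng.normal(size=n)                       # structural error
x = z @ np.array([1.0, 0.5]) + 0.5 * u + rng.normal(size=n)  # endogenous regressor
y = 2.0 * x + u

def gbar(theta):
    """Sample moment vector E_n[z * (y - x*theta)]: 2 moments, 1 parameter."""
    return z.T @ (y - x * theta) / n

# GMM: minimize the quadratic form g(theta)' W g(theta) with identity W.
W = np.eye(2)
theta_hat = minimize(lambda t: gbar(t[0]) @ W @ gbar(t[0]), x0=[0.0]).x[0]
```

Because x is correlated with u, OLS would be biased; the moment conditions built on the instruments recover the true slope of 2.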
A mixture of ordered probit models with endogenous switching between two latent classes
http://d.repec.org/n?u=RePEc:boc:dsug22:02&r=&r=ecm
Ordinal responses can be generated, in a time-series context, by different latent regimes or, in a cross-sectional context, by different unobserved classes of the population. We introduce a new command, swopit, that fits a mixture of ordered probit models with either exogenous or endogenous switching between two latent classes (or regimes). Switching is endogenous if the unobservables in the class-assignment model are correlated with the unobservables in the outcome models. We provide a battery of postestimation commands, assess by Monte Carlo experiments the finite-sample performance of the maximum likelihood estimator of the parameters, probabilities and their standard errors (both asymptotic and bootstrap), and apply the new command to model policy interest rates.
Jochem Huismans
Andrei Sirchenko
Jan Willem Nijenhuis
2022-06-10
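The swopit command itself is Stata, but the outcome probabilities of the simpler exogenous-switching mixture are easy to write down: a pi-weighted average of two ordered-probit probability vectors. A sketch (our notation, not the swopit implementation):

```python
import numpy as np
from scipy.stats import norm

def oprobit_probs(xb, cuts):
    """Ordered-probit outcome probabilities for one observation:
    successive differences of Phi(cut - xb) over increasing cut points."""
    cdf = np.concatenate([[0.0], norm.cdf(np.asarray(cuts) - xb), [1.0]])
    return np.diff(cdf)

def mixture_probs(xb1, cuts1, xb2, cuts2, pi):
    """Two-class mixture with exogenous switching: a pi-weighted average
    of each latent class's ordered-probit probabilities."""
    return pi * oprobit_probs(xb1, cuts1) + (1 - pi) * oprobit_probs(xb2, cuts2)

# Three outcome categories (two cut points per class).
p = mixture_probs(0.3, [-1.0, 1.0], -0.2, [-0.5, 0.5], 0.6)
```

Endogenous switching would additionally correlate the class-assignment and outcome errors, which is the harder case the command handles.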
A Constructive GAN-based Approach to Exact Estimate Treatment Effect without Matching
http://d.repec.org/n?u=RePEc:arx:papers:2206.06116&r=&r=ecm
Matching has become the mainstream approach in counterfactual inference, with which selection bias between sample groups can be significantly reduced. In practice, however, when estimating the average treatment effect on the treated (ATT) via matching, whichever method is used, a trade-off between estimation accuracy and information loss persists. Attempting to replace the matching process entirely, this paper proposes the GAN-ATT estimator, which integrates a generative adversarial network (GAN) into the counterfactual inference framework. Through GAN machine learning, the probability density functions (PDFs) of samples in both the treatment group and the control group can be approximated. By differencing the conditional PDFs of the two groups at identical input conditions, the conditional average treatment effect (CATE) can be estimated, and the ensemble average of the corresponding CATEs over all treatment group samples is the estimate of the ATT. Utilizing GAN-based infinite sample augmentation, problems arising from insufficient samples or a lack of common support can be eased. Theoretically, when the GAN learns the PDFs perfectly, our estimator provides an exact estimate of the ATT. To check the performance of the GAN-ATT estimator, three data sets are used for ATT estimation: two toy data sets with one- or two-dimensional covariate inputs and constant or covariate-dependent treatment effects, where the GAN-ATT estimates are shown to be close to the ground truth and better than traditional matching approaches; and a real firm-level data set with high-dimensional inputs, where applicability to real data is evaluated by comparison with matching approaches. On the evidence of these three tests, we believe that the GAN-ATT estimator has significant advantages over traditional matching methods in estimating the ATT.
Boyang You
Kerry Papps
2022-06
Consistency without inference: instrumental variables in practical application
http://d.repec.org/n?u=RePEc:ehl:lserod:115011&r=&r=ecm
I use Monte Carlo simulations, the jackknife and multiple forms of the bootstrap to study a comprehensive sample of 1309 instrumental variables regressions in 30 papers published in the journals of the American Economic Association. Monte Carlo simulations based upon published regressions show that non-iid error processes in highly leveraged regressions, both prominent features of published work, adversely affect the size and power of IV tests, while increasing the bias and mean squared error of IV relative to OLS. Weak instrument pre-tests based upon F-statistics are found to be largely uninformative of both size and bias. In published papers IV has little power as, despite producing substantively different estimates, it rarely rejects the OLS point estimate or the null that OLS is unbiased, while the statistical significance of excluded instruments is exaggerated.
Young, Alwyn
2022-04-10
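One of the resampling schemes the paper deploys, the pairs bootstrap of an IV estimate, resamples (y, x, z) rows jointly and re-estimates. A toy just-identified version with heteroskedastic errors (the simulation design is ours, not one of the paper's 1309 regressions):

```python
import numpy as np

def tsls(y, x, z):
    """Just-identified IV slope without intercept: (z'y) / (z'x)."""
    return (z @ y) / (z @ x)

rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)                               # instrument
x = 0.5 * z + rng.normal(size=n)                     # regressor
y = 1.0 * x + 0.5 * (1 + z**2) * rng.normal(size=n)  # non-iid (heteroskedastic) errors

# Pairs bootstrap: resample rows with replacement and re-estimate.
boot = [tsls(y[idx], x[idx], z[idx])
        for idx in (rng.integers(0, n, size=n) for _ in range(499))]
se_boot = float(np.std(boot))
```

The spread of the bootstrap replications gives a standard error that, unlike textbook iid formulas, reflects the non-iid error process.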
Should Copula Endogeneity Correction Include Generated Regressors for Higher-order Terms? No, It Hurts
http://d.repec.org/n?u=RePEc:nbr:nberwo:29978&r=&r=ecm
Causal inference in empirical studies is often challenging because of the presence of endogenous regressors. The classical approach to the problem requires instrumental variables that must satisfy the stringent condition of exclusion restriction. A forefront of recent research is a new paradigm of handling endogenous regressors without using instrumental variables. Park and Gupta (Marketing Science, 2012) proposed instrument-free estimation using copulas, which has been increasingly used in practical applications to address endogeneity bias. A relevant issue that has not been studied is how to handle higher-order terms (e.g., interaction and quadratic terms) of endogenous regressors in the copula approach. Recent applications of the approach have handled these higher-order endogenous terms in disparate ways, with unclear consequences. We show that once copula correction terms for the main effects of endogenous regressors are included as generated regressors, there is no need to include additional correction terms for the higher-order terms. This simplicity in handling higher-order endogenous regression terms is a merit of the instrument-free copula bias-correction approach. More importantly, adding these unnecessary correction terms is harmful: it leads to sub-optimal corrections of endogeneity bias, including finite-sample estimation bias and substantially inflated variability in estimates.
Yi Qian
Hui Xie
Anthony Koschmann
2022-04
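The paper's prescription, include the copula correction term only for the main effect even when higher-order terms of the endogenous regressor enter the model, can be sketched with the Park-Gupta generated regressor on simulated data (the design and variable names are ours):

```python
import numpy as np
from scipy.stats import norm, rankdata

def copula_term(p):
    """Generated regressor p* = Phi^{-1}(F_hat(p)), using the empirical
    CDF rescaled by (n + 1) so the values stay inside (0, 1)."""
    return norm.ppf(rankdata(p) / (len(p) + 1))

rng = np.random.default_rng(1)
n = 20000
z = rng.normal(size=n)
p = np.exp(z)                              # non-normal endogenous regressor
e = 0.8 * z + 0.5 * rng.normal(size=n)     # error tied to p through a Gaussian copula
y = 1.0 + 2.0 * p + 0.5 * p**2 + e

# One correction term for the main effect only: no extra generated
# regressor for the quadratic term is added.
X = np.column_stack([np.ones(n), p, p**2, copula_term(p)])
b = np.linalg.lstsq(X, y, rcond=None)[0]
```

Both the linear and the quadratic coefficients are recovered without a separate correction term for p squared, which is the simplicity the abstract highlights.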
A Closed-form Alternative Estimator for GLM with Categorical Explanatory Variables
http://d.repec.org/n?u=RePEc:hal:journl:hal-03689206&r=&r=ecm
The parameters of generalized linear models (GLMs) are usually estimated by the maximum likelihood estimator (MLE), which is known to be asymptotically efficient. But the MLE is computed using a Newton-Raphson-type algorithm, which is time-consuming for a large number of variables or modalities, or a large sample size. An alternative closed-form estimator is proposed in this paper in the case of categorical explanatory variables. Asymptotic properties of the alternative estimator are studied. The performance of the proposed estimator, in terms of both computation time and asymptotic variance, is compared with that of the MLE for a Gamma-distributed GLM.
Alexandre Brouste
Christophe Dutang
Tom Rohmer
Regression models, explicit estimators, categorical explanatory variables, GLM, asymptotic distribution
2022-06-07
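The flavor of such closed-form estimation can be conveyed in one line for the simplest case: with a single categorical covariate and a log link, applying the link to each cell mean needs no iteration. This is a sketch of the idea under our own simulation design, not necessarily the paper's exact estimator.

```python
import numpy as np

def closed_form_loglink(y, g):
    """Closed-form coefficients for a log-link GLM with one categorical
    covariate: the link applied to each cell mean, no Newton-Raphson."""
    return {level: float(np.log(y[g == level].mean())) for level in np.unique(g)}

rng = np.random.default_rng(0)
g = rng.integers(0, 3, size=3000)             # categorical covariate, 3 levels
mu = np.exp(np.array([0.0, 0.5, 1.0]))[g]     # true cell means exp(beta_level)
y = rng.gamma(shape=2.0, scale=mu / 2.0)      # Gamma outcomes with mean mu
est = closed_form_loglink(y, g)
```

Each estimated coefficient is just log of a sample average, so the cost is one pass over the data regardless of sample size.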
Optimality of Matched-Pair Designs in Randomized Controlled Trials
http://d.repec.org/n?u=RePEc:arx:papers:2206.07845&r=&r=ecm
In randomized controlled trials (RCTs), treatment is often assigned by stratified randomization. I show that among all stratified randomization schemes which treat all units with probability one half, a certain matched-pair design achieves the maximum statistical precision for estimating the average treatment effect (ATE). In an important special case, the optimal design pairs units according to the baseline outcome. In a simulation study based on datasets from 10 RCTs, this design lowers the standard error for the estimator of the ATE by 10% on average, and by up to 34%, relative to the original designs.
Yuehao Bai
2022-06
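The optimal design described above, pair units on the baseline outcome and then randomize within pairs, is straightforward to implement. A sketch for an even number of units (function name is ours):

```python
import numpy as np

def matched_pair_assignment(baseline, rng):
    """Pair units by sorted baseline outcome, then flip a fair coin
    within each pair to decide which member is treated."""
    order = np.argsort(baseline)              # sort units by baseline outcome
    d = np.zeros(len(baseline), dtype=int)
    for k in range(0, len(order) - 1, 2):
        i, j = order[k], order[k + 1]         # adjacent units form a pair
        if rng.random() < 0.5:
            d[i] = 1
        else:
            d[j] = 1
    return d

rng = np.random.default_rng(0)
baseline = rng.normal(size=100)
d = matched_pair_assignment(baseline, rng)
```

By construction exactly half the units are treated, and treated and control units within a pair have nearly identical baseline outcomes, which is the source of the precision gain.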
Instrumented Common Confounding
http://d.repec.org/n?u=RePEc:arx:papers:2206.12919&r=&r=ecm
Causal inference is difficult in the presence of unobserved confounders. We introduce the instrumented common confounding (ICC) approach to (nonparametrically) identify causal effects with instruments, which are exogenous only conditional on some unobserved common confounders. The ICC approach is most useful in rich observational data with multiple sources of unobserved confounding, where instruments are at most exogenous conditional on some unobserved common confounders. Suitable examples of this setting are various identification problems in the social sciences, nonlinear dynamic panels, and problems with multiple endogenous confounders. The ICC identifying assumptions are closely related to those in mixture models, negative control and IV. Compared to mixture models [Bonhomme et al., 2016], we require fewer conditionally independent variables and do not need to model the unobserved confounder. Compared to negative control [Cui et al., 2020], we allow for non-common confounders, with respect to which the instruments are exogenous. Compared to IV [Newey and Powell, 2003], we allow instruments to be exogenous conditional on some unobserved common confounders, for which a set of relevant observed variables exists. We prove point identification with outcome-model restrictions and, alternatively, with first-stage restrictions. We provide a practical step-by-step guide to the ICC model assumptions and present the causal effect of education on income as a motivating example.
Christian Tien
2022-06
Estimation of DSGE Models With the Effective Lower Bound
http://d.repec.org/n?u=RePEc:bon:boncrc:crctr224_2022_356&r=&r=ecm
We propose a set of tools for the efficient and robust Bayesian estimation of medium- and large-scale DSGE models while accounting for the effective lower bound on nominal interest rates. We combine a novel nonlinear recursive filter with a computationally efficient piece-wise linear solution method and a state-of-the-art MCMC sampler. The filter allows for fast likelihood approximations, in particular of models with large state spaces. Using artificial data, we demonstrate that our methods accurately capture the true model parameters even with very long lower bound episodes. We apply our approach to analyze post-2008 US business cycle properties.
Gregor Boehl
Felix Strobel
Effective Lower Bound, Bayesian Estimation, Great Recession, Business Cycles
2022-06
Ensemble MCMC Sampling for DSGE Models
http://d.repec.org/n?u=RePEc:bon:boncrc:crctr224_2022_355&r=&r=ecm
This paper develops an adaptive differential evolution Markov chain Monte Carlo (ADEMC) sampler. The sampler satisfies five requirements that make it especially suitable for the estimation of models with high-dimensional posterior distributions that are computationally expensive to evaluate: (i) a large number of chains (the "ensemble"), where the number of chains scales inversely (nearly one-to-one) with the number of necessary ensemble iterations until convergence; (ii) fast burn-in and convergence (thereby removing the need for numerical optimization); (iii) good performance for bimodal distributions; (iv) an endogenous proposal density generated from the state of the full ensemble, which (v) respects the bounds of the prior distribution. Consequently, ADEMC is straightforward to parallelize. I use the sampler to estimate a heterogeneous agent New Keynesian (HANK) model including the micro parameters linked to the stationary distribution of the model.
Gregor Boehl
Bayesian Estimation, Monte Carlo Methods, DSGE Models, Heterogeneous Agents
2022-06
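The differential-evolution proposal at the heart of such ensemble samplers moves chain i by a scaled difference of two other chains, so the proposal density is generated endogenously from the ensemble state. A minimal non-adaptive sweep (a generic DE-MC sketch, not the ADEMC sampler itself):

```python
import numpy as np

def demc_sweep(chains, log_prob, rng, gamma=None):
    """One differential-evolution MCMC sweep: each chain proposes
    x_i + gamma * (x_j - x_k) + small jitter, for two other randomly
    chosen chains j and k, and accepts by the Metropolis rule."""
    n, d = chains.shape
    if gamma is None:
        gamma = 2.38 / np.sqrt(2 * d)   # standard DE-MC scaling
    lp = np.array([log_prob(c) for c in chains])
    for i in range(n):
        j, k = rng.choice([m for m in range(n) if m != i], size=2, replace=False)
        prop = chains[i] + gamma * (chains[j] - chains[k]) + 1e-4 * rng.normal(size=d)
        lp_prop = log_prob(prop)
        if np.log(rng.random()) < lp_prop - lp[i]:
            chains[i], lp[i] = prop, lp_prop
    return chains

# Target a 2-d standard normal with an ensemble of 20 chains.
rng = np.random.default_rng(0)
chains = rng.normal(size=(20, 2))
for _ in range(500):
    chains = demc_sweep(chains, lambda v: -0.5 * v @ v, rng)
```

Because every chain's update depends only on the current ensemble, the per-sweep likelihood evaluations can be distributed across workers, which is the parallelization the abstract refers to.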
Likelihood ratio test for structural changes in factor models
http://d.repec.org/n?u=RePEc:arx:papers:2206.08052&r=&r=ecm
A factor model with a break in its factor loadings is observationally equivalent to a model without changes in the loadings but with a change in the variance of its factors. This effectively transforms a structural change problem of high dimension into a problem of low dimension. This paper considers the likelihood ratio (LR) test for a variance change in the estimated factors. The LR test implicitly exploits a special feature of the estimated factors: the pre-break and post-break variances can form a singular matrix under the alternative hypothesis, making the LR test diverge faster, and thus be more powerful, than Wald-type tests. The better power of the LR test is also confirmed by simulations. We also consider mean changes and multiple breaks. We apply the procedure to factor modelling and structural change in US employment using monthly industry-level data.
Jushan Bai
Jiangtao Duan
Xu Han
2022-06
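In the simplest scalar case with a known break date, the LR statistic for a variance change just compares pooled and split Gaussian quasi-likelihoods. This toy version is for intuition only; the paper's setting is multivariate, with estimated factors and possibly singular covariance changes.

```python
import numpy as np

def lr_variance_change(f, k):
    """LR statistic for a variance change at known date k in a scalar,
    mean-zero series under a Gaussian quasi-likelihood."""
    T = len(f)
    s_all = np.mean(f**2)                            # pooled variance
    s1, s2 = np.mean(f[:k]**2), np.mean(f[k:]**2)    # pre/post-break variances
    return T * np.log(s_all) - k * np.log(s1) - (T - k) * np.log(s2)

rng = np.random.default_rng(0)
stable = rng.normal(size=200)
broken = np.concatenate([rng.normal(size=100), 3.0 * rng.normal(size=100)])
lr_stable = lr_variance_change(stable, 100)
lr_broken = lr_variance_change(broken, 100)
```

Since the pooled variance is the weighted average of the split variances and log is concave, the statistic is nonnegative, and it is large when the variance actually changes.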
On the Performance of the Neyman Allocation with Small Pilots
http://d.repec.org/n?u=RePEc:arx:papers:2206.04643&r=&r=ecm
The Neyman Allocation and its conditional counterpart are used in many papers on experiment design, which typically assume that researchers have access to large pilot studies. This may not be realistic. To understand the properties of the Neyman Allocation with small pilots, we study its behavior in a novel asymptotic framework for two-wave experiments in which the pilot size is assumed to be fixed while the main wave sample size grows. Our analysis shows that the Neyman Allocation can lead to estimates of the ATE with higher asymptotic variance than with (non-adaptive) balanced randomization, particularly when the population is relatively homoskedastic. We also provide a series of empirical examples showing that the Neyman Allocation may perform poorly for values of homoskedasticity that are relevant for researchers. Our results suggest caution when employing experiment design methods involving the Neyman Allocation estimated from a small pilot study.
Yong Cai
Ahnaf Rafi
2022-06
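The Neyman Allocation itself is simple: assign the treatment share in proportion to the outcome standard deviations of the two arms. The paper's point is that estimating those standard deviations from a tiny pilot makes the share noisy, which the sketch below (our simulation, equal true variances) makes visible:

```python
import numpy as np

def neyman_allocation(pilot_treated, pilot_control):
    """Share of the main wave assigned to treatment under the Neyman
    Allocation: proportional to the pilot outcome standard deviations."""
    s1 = np.std(pilot_treated, ddof=1)
    s0 = np.std(pilot_control, ddof=1)
    return s1 / (s1 + s0)

# With 5 pilot observations per arm, the estimated share scatters
# widely around the balanced value of one half.
rng = np.random.default_rng(0)
shares = [neyman_allocation(rng.normal(size=5), rng.normal(size=5))
          for _ in range(1000)]
```

Under homoskedasticity the optimal share is exactly one half, so any spread in the estimated shares is pure noise from the small pilot, which is what drives the variance inflation the abstract documents.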
Nonparametric Analysis of the Mixed-Demand Model
http://d.repec.org/n?u=RePEc:hhs:iuiwop:1430&r=&r=ecm
The mixed-demand model allows for very flexible specification of what should be considered endogenous and exogenous in demand system estimation. This paper introduces a revealed preference framework to analyze the mixed-demand model. The proposed methods can be used to test whether observed data (with measurement errors) are consistent with the mixed-demand model and calculate goodness-of-fit measures. The framework is purely non-parametric in the sense that it does not require any functional form assumptions on the direct or indirect utility functions. The framework is applied to demand data for food and provides the first nonparametric empirical analysis of the mixed-demand model.
Hjertstrand, Per
Demand systems; Measurement errors; Mixed-demand; Non-parametric; Revealed preference
2022-05-11
A Maximum Entropy Estimate of Uncertainty about a Wine Rating
http://d.repec.org/n?u=RePEc:ags:aawewp:321847&r=&r=ecm
Much research shows that the ratings judges assign to wines are uncertain. An acute difficulty in ratings-related research, and in calculating consensus among judges, is that each rating is one observation drawn from a unique and latent distribution that is wine- and judge-specific. A simple maximum entropy estimator is proposed that yields a maximum-entropy probability distribution for sample sizes of zero, one, and more. A test of that estimator yields results that are consistent with the results of experiments in which blind replicates are embedded within flights of wines evaluated by trained and tested judges.
Bodington, Jeff
Research Methods/ Statistical Methods, Agribusiness
2021
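One classical construction behind such estimators: the maximum-entropy distribution on a discrete rating scale subject to a mean constraint is the exponentially tilted (Gibbs) distribution, with the tilt solved numerically. This is a textbook sketch under our own notation, not necessarily the estimator in the paper, which also covers the zero-observation case.

```python
import numpy as np
from scipy.optimize import brentq

def maxent_rating_dist(scores, mean_obs):
    """Maximum-entropy distribution on a discrete rating scale subject
    to a mean constraint: p_k proportional to exp(t * s_k), with the
    tilt t solved so the implied mean equals mean_obs."""
    s = np.asarray(scores, dtype=float)
    def implied_mean(t):
        w = np.exp(t * s - np.max(t * s))   # stabilized weights
        return (s * w).sum() / w.sum()
    t = brentq(lambda t: implied_mean(t) - mean_obs, -50.0, 50.0)
    w = np.exp(t * s - np.max(t * s))
    return w / w.sum()

# Five-point rating scale, observed mean rating 3.4.
p = maxent_rating_dist(range(1, 6), 3.4)
```

The returned distribution matches the observed mean exactly while spreading probability as evenly as the constraint allows, which is the maximum-entropy notion of "least committal" uncertainty about a rating.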
Misspecification and Weak Identification in Asset Pricing
http://d.repec.org/n?u=RePEc:arx:papers:2206.13600&r=&r=ecm
The widespread co-existence of misspecification and weak identification in asset pricing has led to an overstated performance of risk factors. Because the conventional Fama and MacBeth (1973) methodology is jeopardized by misspecification and weak identification, we infer risk premia by using a double robust Lagrange multiplier test that remains reliable in the presence of these two empirically relevant issues. Moreover, we show how the identification, and the resulting appropriate interpretation, of the risk premia is governed by the relative magnitudes of the misspecification J-statistic and the identification IS-statistic. We revisit several prominent empirical applications and all specifications with one to six factors from the factor zoo of Feng, Giglio, and Xiu (2020) to emphasize the widespread occurrence of misspecification and weak identification.
Frank Kleibergen
Zhaoguo Zhan
2022-06
Robust Knockoffs for Controlling False Discoveries With an Application to Bond Recovery Rates
http://d.repec.org/n?u=RePEc:arx:papers:2206.06026&r=&r=ecm
We address challenges in variable selection with highly correlated data, as frequently encountered in finance and economics, but also in complex natural systems such as the weather. We develop a robustified version of the knockoff framework that addresses challenges from high dependence among possibly many influencing factors and strong time correlation. In particular, a repeated subsampling strategy tackles the variability of the knockoffs and the dependency of factors. Simultaneously, we control the proportion of false discoveries over a grid of all possible values, which mitigates the variability of the selected factors arising from ad-hoc choices of a specific false discovery level. In the application to corporate bond recovery rates, we identify new important groups of relevant factors on top of the known standard drivers. We also show that, out of sample, the resulting sparse model has predictive power similar to state-of-the-art machine learning models that use the entire set of predictors.
Konstantin Görgen
Abdolreza Nazemi
Melanie Schienle
2022-06
Quantum Monte Carlo for Economics: Stress Testing and Macroeconomic Deep Learning
http://d.repec.org/n?u=RePEc:bca:bocawp:22-29&r=&r=ecm
Computational methods both open the frontiers of economic analysis and serve as a bottleneck in what can be achieved. Using the quantum Monte Carlo (QMC) algorithm, we are the first to study whether quantum computing can improve the run time of economic applications, and the challenges in doing so. We identify a large class of economic problems suitable for improvements. Then, we illustrate how to formulate and encode on a quantum circuit two applications: (a) a bank stress testing model with credit shocks and fire sales and (b) a dynamic stochastic general equilibrium (DSGE) model solved with deep learning, and further demonstrate potential efficiency gains. We also present a few innovations in the QMC algorithm itself and in how to benchmark it against classical MC.
Vladimir Skavysh
Sofia Priazhkina
Diego Guala
Thomas Bromley
Business fluctuations and cycles; Central bank research; Econometric and statistical methods; Economic models; Financial stability
2022-06