nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒03‒13
thirteen papers chosen by
Sune Karlsson
Örebro universitet

  1. Structural Break Detection in Quantile Predictive Regression Models with Persistent Covariates By Christis Katsouris
  2. Time-Weighted Difference-in-Differences: Accounting for Common Factors in Short T Panels By Timo Schenk
  3. On semiparametric estimation of the intercept of the sample selection model: a kernel approach By Zhewen Pan
  4. A two sample size estimator for large data sets By O’Connell, Martin; Smith, Howard; Thomassen, Øyvind
  5. Instrument Strength in IV Estimation and Inference: A Guide to Theory and Practice By Michael Keane; Timothy Neal
  6. Minimax Instrumental Variable Regression and $L_2$ Convergence Guarantees without Identification or Closedness By Andrew Bennett; Nathan Kallus; Xiaojie Mao; Whitney Newey; Vasilis Syrgkanis; Masatoshi Uehara
  7. Sparse High-Dimensional Vector Autoregressive Bootstrap By Robert Adamek; Stephan Smeekes; Ines Wilms
  8. Extensions for Inference in Difference-in-Differences with Few Treated Clusters By Luis Alvarez; Bruno Ferman
  9. Spurious Precision in Meta-Analysis By Zuzana Irsova; Pedro R. D. Bom; Tomas Havranek; Heiko Rachinger
  10. Flexible Non-parametric Regression Models for Compositional Response Data with Zeros By Michail Tsagris; Abdulaziz Alenazi; Connie Stewart
  11. Cauchy Robust Principal Component Analysis with Applications to High-Dimensional Data Sets By Aisha Fayomi; Yannis Pantazis; Michail Tsagris; Andrew Wood
  12. Axiomatization of Random Utility Model with Unobservable Alternatives By Haruki Kono; Kota Saito; Alec Sandroni
  13. Heterogeneity of Consumption Responses to Income Shocks in the Presence of Nonlinear Persistence By Manuel Arellano; Richard Blundell; Stéphane Bonhomme; Jack Light

  1. By: Christis Katsouris
    Abstract: We propose an econometric environment for structural break detection in nonstationary quantile predictive regressions. We establish the limit distributions for a class of Wald- and fluctuation-type statistics based on both the ordinary least squares estimator and the endogenous instrumental regression estimator proposed by Phillips and Magdalinos (2009a, Econometric Inference in the Vicinity of Unity. Working paper, Singapore Management University). Although the asymptotic distribution of these test statistics appears to depend on the chosen estimator, the IVX-based tests are shown to be asymptotically nuisance parameter-free regardless of the degree of persistence and consistent under local alternatives. The finite-sample performance of both tests is evaluated via simulation experiments. An empirical application to house price index returns demonstrates the practicality of the proposed break tests for regression quantiles of nonstationary time series data.
    Date: 2023–02
  2. By: Timo Schenk (University of Amsterdam)
    Abstract: This paper proposes a time-weighted difference-in-differences (TWDID) estimation approach that is robust against interactive fixed effects in short T panels. Time weighting substantially reduces both bias and variance compared to conventional DID estimation through balancing the pre-treatment and post-treatment unobserved common factors. To conduct valid inference on the average treatment effect, I develop a correction term that adjusts conventional standard errors for weight estimation uncertainty. Revisiting a study on the effect of a cap-and-trade program on NOx emissions, TWDID estimation reduces the standard errors of the estimated treatment effect by 10% compared to a conventional DID approach. In a second application I illustrate how to implement TWDID in settings with staggered adoption of the treatment.
    Keywords: synthetic difference-in-differences, dynamic treatment effects, interactive fixed effects, panel data
    Date: 2023–02–03
  3. By: Zhewen Pan
    Abstract: This paper presents a new perspective on the identification at infinity for the intercept of the sample selection model as identification at the boundary via a transformation of the selection index. This perspective suggests generalizations of estimation at infinity to kernel regression estimation at the boundary and further to local linear estimation at the boundary. The proposed kernel-type estimators with an estimated transformation are proven to be nonparametric-rate consistent and asymptotically normal under mild regularity conditions. A fully data-driven method of selecting the optimal bandwidths for the estimators is developed. The Monte Carlo simulation shows the desirable finite sample properties of the proposed estimators and bandwidth selection procedures.
    Date: 2023–02
  4. By: O’Connell, Martin (Dept. of Economics, University of Wisconsin-Madison); Smith, Howard (Dept. of Economics, Oxford University); Thomassen, Øyvind (Dept. of Business and Management Science, Norwegian School of Economics)
    Abstract: In GMM estimators moment conditions with additive error terms involve an observed component and a predicted component. If the predicted component is computationally costly to evaluate, it may not be feasible to estimate the model with all the available data. We propose an estimator that uses the full data set for the computationally cheap observed component, but a reduced sample size for the predicted component. We show consistency, asymptotic normality, and derive standard errors and a practical criterion for when our estimator is variance-reducing. We demonstrate the estimator’s properties on a range of models through Monte Carlo studies and an empirical application to alcohol demand.
    Keywords: GMM; estimation; micro data
    JEL: C20 C51 C55
    Date: 2023–02–17
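The two-sample-size idea can be sketched in a few lines: evaluate the cheap observed component of the moment on the full sample, the costly predicted component on a subsample only, and solve the moment condition. Everything below (the linear model, the instrument choice, the sample sizes) is an illustrative assumption, not the authors' specification:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = x * theta + noise, with instrument z = x (exogenous here).
N, m, theta_true = 100_000, 2_000, 1.5
x = rng.normal(1.0, 0.3, N)
y = x * theta_true + rng.normal(0, 0.5, N)
z = x

# Moment condition E[z*(y - f(x; theta))] = 0 with f(x; theta) = x * theta.
# Pretend f is expensive: evaluate the observed component E[z*y] on all N
# draws (cheap), but the predicted component E[z*f(x; theta)] on a random
# subsample of size m (costly).
sub = rng.choice(N, size=m, replace=False)
theta_hat = np.mean(z * y) / np.mean(z[sub] * x[sub])

print(theta_hat)  # close to the true value 1.5
```

The full-sample average pins down the observed side of the moment at negligible cost, while only m evaluations of the "expensive" predicted component are needed; the paper's criterion tells you when this trade is variance-reducing.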
  5. By: Michael Keane (School of Economics); Timothy Neal (UNSW School of Economics)
    Abstract: 2SLS has poor properties if instruments are exogenous but weak. But how strong must instruments be for 2SLS estimates and test statistics to exhibit acceptable properties? A common standard is a first-stage F ≥ 10. This is adequate to ensure two-tailed t-tests have small size distortions. But other problems persist: in particular, we show 2SLS standard errors tend to be artificially small in samples where the estimate is most contaminated by the OLS bias. Hence, if the bias is positive, the t-test has little power to detect true negative effects, and inflated power to find positive effects. This phenomenon, which we call a “power asymmetry,” persists even if first-stage F is in the thousands. Robust tests like Anderson-Rubin perform better, and should be used in lieu of the t-test even with strong instruments. We also show how 2SLS test statistics typically suffer from very low power when first-stage F is near 10, leading us to suggest a higher standard of instrument strength in empirical practice.
    Keywords: Instrumental variables, weak instruments, 2SLS, endogeneity, F-test, size distortion, Anderson-Rubin test, likelihood ratio test, LIML, GMM, Fuller, JIVE
    JEL: C12 C26 C36
    Date: 2022–11
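The two diagnostics the abstract contrasts are straightforward to compute. Below is a minimal sketch of a first-stage F statistic and an Anderson-Rubin statistic in a simulated just-identified design; the data-generating process and parameter values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def ftest_joint(y, X_restricted, X_full):
    """F-statistic for the exclusion of the extra columns in X_full."""
    rss = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    q = X_full.shape[1] - X_restricted.shape[1]
    dof = len(y) - X_full.shape[1]
    return ((rss(X_restricted) - rss(X_full)) / q) / (rss(X_full) / dof)

rng = np.random.default_rng(1)
n, beta = 2000, -0.5
z = rng.normal(size=(n, 1))                     # one instrument
u = rng.normal(size=n)                          # structural error
v = 0.8 * u + 0.6 * rng.normal(size=n)          # correlated first-stage error
x = 0.5 * z[:, 0] + v                           # strong first stage
y = beta * x + u
ones = np.ones((n, 1))

first_stage_F = ftest_joint(x, ones, np.hstack([ones, z]))

# Anderson-Rubin statistic at a hypothesized beta0: regress y - beta0*x on
# the instruments and test their joint significance.
ar = lambda b0: ftest_joint(y - b0 * x, ones, np.hstack([ones, z]))

print(first_stage_F)           # far above 10 in this strong-instrument design
print(ar(beta), ar(beta + 1))  # small at the true beta, large at a wrong one
```

Inverting the AR statistic over a grid of candidate values of beta0 yields a confidence set whose coverage does not depend on instrument strength, which is the sense in which AR "performs better" than the 2SLS t-test.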
  6. By: Andrew Bennett; Nathan Kallus; Xiaojie Mao; Whitney Newey; Vasilis Syrgkanis; Masatoshi Uehara
    Abstract: In this paper, we study nonparametric estimation of instrumental variable (IV) regressions. Recently, many flexible machine learning methods have been developed for instrumental variable estimation. However, these methods have at least one of the following limitations: (1) restricting the IV regression to be uniquely identified; (2) only obtaining estimation error rates in terms of pseudometrics (e.g., projected norm) rather than valid metrics (e.g., $L_2$ norm); or (3) imposing the so-called closedness condition that requires a certain conditional expectation operator to be sufficiently smooth. In this paper, we present the first method and analysis that can avoid all three limitations, while still permitting general function approximation. Specifically, we propose a new penalized minimax estimator that can converge to a fixed IV solution even when there are multiple solutions, and we derive a strong $L_2$ error rate for our estimator under lax conditions. Notably, this guarantee only needs a widely-used source condition and realizability assumptions, but not the so-called closedness condition. We argue that the source condition and the closedness condition are inherently conflicting, so relaxing the latter significantly improves upon the existing literature that requires both conditions. Our estimator can achieve this improvement because it builds on a novel formulation of the IV estimation problem as a constrained optimization problem.
    Date: 2023–02
  7. By: Robert Adamek; Stephan Smeekes; Ines Wilms
    Abstract: We introduce a high-dimensional multiplier bootstrap for time series data that captures dependence through a sparsely estimated vector autoregressive model. We prove its consistency for inference on high-dimensional means under two different moment assumptions on the errors, namely sub-Gaussian moments and a finite number of absolute moments. In establishing these results, we derive a Gaussian approximation for the maximum mean of a linear process, which may be of independent interest.
    Date: 2023–02
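The scheme can be sketched as: fit a VAR, multiply its residuals by i.i.d. Gaussian multipliers, regenerate the series recursively, and take quantiles of the bootstrap statistic. The sketch below substitutes plain OLS for the sparse (lasso-type) estimator the paper actually uses, so it is schematic only:

```python
import numpy as np

def var1_multiplier_bootstrap(X, B=200, seed=0):
    """Multiplier bootstrap for the maximum entry of the sample mean of a
    VAR(1) process. Illustrative: the paper estimates the VAR with a sparse
    (lasso-type) method; plain OLS is used here as a stand-in."""
    rng = np.random.default_rng(seed)
    Y, Z = X[1:], X[:-1]
    A = np.linalg.lstsq(Z, Y, rcond=None)[0]      # fitted VAR(1) coefficients
    resid = Y - Z @ A
    stats = np.empty(B)
    for b in range(B):
        xi = rng.standard_normal(len(resid))      # Gaussian multipliers
        Xb = np.empty_like(X)
        Xb[0] = X[0]
        for t in range(1, len(X)):                # regenerate recursively
            Xb[t] = Xb[t - 1] @ A + xi[t - 1] * resid[t - 1]
        stats[b] = np.abs(Xb.mean(axis=0)).max()
    return np.quantile(stats, 0.95)               # bootstrap critical value

# A small stationary VAR(1) with a sparse (diagonal) coefficient matrix:
rng = np.random.default_rng(2)
A0 = np.diag([0.5, -0.3, 0.4])
X = np.zeros((300, 3))
for t in range(1, 300):
    X[t] = X[t - 1] @ A0 + 0.1 * rng.standard_normal(3)

crit = var1_multiplier_bootstrap(X, B=100)
```

In high dimensions the OLS step would be replaced by the sparse estimator, which is what makes the regeneration step feasible when the number of series is large relative to T.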
  8. By: Luis Alvarez; Bruno Ferman
    Abstract: In settings with few treated units, Difference-in-Differences (DID) estimators are not consistent, and are not generally asymptotically normal. This poses relevant challenges for inference. While there are inference methods that are valid in these settings, some of these alternatives are not readily available when there is variation in treatment timing and heterogeneous treatment effects; or for deriving uniform confidence bands for event-study plots. We present alternatives in settings with few treated units that are valid with variation in treatment timing and/or that allow for uniform confidence bands.
    Date: 2023–02
  9. By: Zuzana Irsova (Charles University, Prague); Pedro R. D. Bom (University of Deusto, Bilbao); Tomas Havranek (Charles University, Prague & Centre for Economic Policy Research, London & Meta-Research Innovation Center, Stanford); Heiko Rachinger (University of the Balearic Islands, Palma)
    Abstract: Meta-analysis upweights studies reporting lower standard errors and hence more precision. But in empirical practice, notably in observational research, precision is not given to the researcher. Precision must be estimated, and thus can be p-hacked to achieve statistical significance. Simulations show that a modest dose of spurious precision creates a formidable problem for inverse-variance weighting and bias-correction methods based on the funnel plot. Selection models fail to solve the problem, and the simple mean can beat sophisticated estimators. Cures to publication bias may become worse than the disease. We introduce an approach that surmounts spuriousness: the Meta-Analysis Instrumental Variable Estimator (MAIVE), which employs inverse sample size as an instrument for reported variance.
    Keywords: Publication bias, p-hacking, selection models, meta-regression, funnel plot, inverse-variance weighting
    JEL: C15 C26 C83
    Date: 2023–02
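A minimal sketch of the instrumenting step, assuming a textbook two-stage least squares with reported variance as the regressor and inverse sample size as the instrument; the published MAIVE specification may differ in functional form and weighting:

```python
import numpy as np

def maive_sketch(est, se, n):
    """Sketch of the MAIVE idea: regress reported estimates on reported
    variances, instrumenting variance with inverse sample size (2SLS).
    The intercept is read off as the bias-corrected mean effect. This is
    a hypothetical simplification of the published estimator."""
    est, v, z = np.asarray(est), np.asarray(se) ** 2, 1.0 / np.asarray(n)
    Z = np.column_stack([np.ones_like(z), z])
    vhat = Z @ np.linalg.lstsq(Z, v, rcond=None)[0]    # first stage
    X = np.column_stack([np.ones_like(vhat), vhat])
    return np.linalg.lstsq(X, est, rcond=None)[0][0]   # intercept

# Simulated meta-analysis: true effect 0.5, reported estimates inflated in
# proportion to their variance (a stylized publication-bias pattern).
rng = np.random.default_rng(3)
n = rng.integers(50, 5000, size=200)
se = 1.0 / np.sqrt(n)
est = 0.5 + 2.0 * se**2 + rng.normal(0, 0.05, size=200)

corrected = maive_sketch(est, se, n)
print(corrected)  # close to the true effect 0.5
```

The point of the instrument is that sample size is hard to p-hack, whereas the reported standard error is not; fitted variances from the first stage therefore carry only the legitimate part of the precision variation.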
  10. By: Michail Tsagris; Abdulaziz Alenazi; Connie Stewart
    Abstract: Compositional data arise in many real-life applications and versatile methods for properly analyzing this type of data in the regression context are needed. When parametric assumptions do not hold or are difficult to verify, non-parametric regression models can provide a convenient alternative method for prediction. To this end, we consider an extension to the classical k-NN regression, termed α-k-NN regression, that yields a highly flexible non-parametric regression model for compositional data through the use of the α-transformation.
    Keywords: compositional data, regression, α-transformation, k-NN algorithm, kernel regression
    JEL: C14
    Date: 2023–02–08
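A rough sketch of the α-k-NN idea: map compositions through (a simplified form of) the α-transformation, then run ordinary k-NN averaging in the transformed space. The transformation below omits the Helmert sub-matrix projection used in the compositional-data literature, and all tuning values are illustrative:

```python
import numpy as np

def alpha_transform(x, a):
    """Power-transform compositions (rows of x sum to 1) toward Euclidean
    space. Simplified form without the Helmert sub-matrix projection;
    requires a != 0."""
    u = x ** a
    u = u / u.sum(axis=1, keepdims=True)
    return (x.shape[1] * u - 1.0) / a

def aknn_regress(X_train, y_train, X_test, a=0.5, k=3):
    """alpha-k-NN sketch: k-nearest-neighbour averaging after applying
    the alpha-transformation to the compositional covariates."""
    Zt, Zs = alpha_transform(X_train, a), alpha_transform(X_test, a)
    d2 = ((Zs[:, None, :] - Zt[None, :, :]) ** 2).sum(-1)   # squared distances
    nn = np.argsort(d2, axis=1)[:, :k]                      # k nearest rows
    return y_train[nn].mean(axis=1)

# Toy data: 3-part compositions, response driven by the first component.
rng = np.random.default_rng(4)
X = rng.dirichlet([2, 3, 4], size=200)
y = 5.0 * X[:, 0] + rng.normal(0, 0.1, 200)
pred = aknn_regress(X[:150], y[:150], X[150:], a=0.5, k=5)
```

Note that for a > 0 the power transform is well defined when some components are exactly zero, which is one reason this family is attractive for the zero-laden data the paper targets.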
  11. By: Aisha Fayomi; Yannis Pantazis; Michail Tsagris; Andrew Wood
    Abstract: In this paper, we propose a modified formulation of the principal components analysis, based on the use of a multivariate Cauchy likelihood instead of the Gaussian likelihood, which has the effect of robustifying the principal components. We present an algorithm to compute these robustified principal components. We additionally derive the relevant influence function of the first component and examine its theoretical properties.
    Keywords: Principal component analysis, robust, Cauchy log-likelihood, high-dimensional data
    JEL: C13
    Date: 2023–02–08
  12. By: Haruki Kono; Kota Saito; Alec Sandroni
    Abstract: The random utility model is one of the most fundamental models in discrete choice analysis in economics. Although Falmagne (1978) obtained an axiomatization of the random utility model, his characterization requires strong observability of choices, i.e., that the frequency of choices must be observed from all subsets of the set of alternatives. Little is known, however, about the axiomatization when a dataset is incomplete, i.e., the frequencies on some choice sets are not observable. In fact, it is known that in some cases, obtaining a tight characterization is NP-hard. On the other hand, datasets in reality almost always violate the requirements on observability assumed by Falmagne (1978). We consider an incomplete dataset in which frequencies involving some alternatives are unobserved, while frequencies for all other alternatives are observed. For such a dataset, we obtain a finite system of linear inequalities that is necessary and sufficient for the dataset to be rationalized by a random utility model. Moreover, the necessary and sufficient condition is tight in the sense that none of the inequalities is implied by the others, and dropping any one of them makes the condition insufficient.
    Date: 2023–02
  13. By: Manuel Arellano (CEMFI, Centro de Estudios Monetarios y Financieros); Richard Blundell (UCL and IFS); Stéphane Bonhomme (University of Chicago); Jack Light (University of Chicago)
    Abstract: In this paper we use the enhanced consumption data in the Panel Study of Income Dynamics (PSID) from 2005 to 2017 to explore the transmission of income shocks to consumption. We build on the nonlinear quantile framework introduced in Arellano, Blundell and Bonhomme (2017). Our focus is on the estimation of consumption responses to persistent nonlinear income shocks in the presence of unobserved heterogeneity. To reliably estimate heterogeneous responses in our unbalanced panel, we develop Sequential Monte Carlo computational methods. We find substantial heterogeneity in consumption responses, and uncover latent types of households with different life-cycle consumption behavior. Ordering types according to their average log-consumption, we find that low-consumption types respond more strongly to income shocks at the beginning of the life cycle and when their assets are low, as standard life-cycle theory would predict. In contrast, high-consumption types respond less on average, and in a way that changes little with age or assets. We examine various mechanisms that might explain this heterogeneity.
    Keywords: Nonlinear income persistence, consumption dynamics, partial insurance, heterogeneity, panel data.
    JEL: C23 D31 D91
    Date: 2023–02

This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at the project homepage. For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.