nep-ecm New Economics Papers
on Econometrics
Issue of 2022‒01‒31
fourteen papers chosen by
Sune Karlsson
Örebro universitet

  1. Inferential Theory for Granular Instrumental Variables in High Dimensions By Saman Banafti; Tae-Hwy Lee
  2. Machine Learning Based Semiparametric Time Series Conditional Variance: Estimation and Forecasting By Justin Dang; Aman Ullah
  3. Binary response model with many weak instruments By Dakyung Seong
  4. Efficient Likelihood-based Estimation via Annealing for Dynamic Structural Macrofinance Models By Andras Fulop; Jeremy Heng; Junye Li
  5. Generalized Kernel Ridge Regression for Long Term Causal Inference: Treatment Effects, Dose Responses, and Counterfactual Distributions By Rahul Singh
  6. Optimal Out-of-Sample Forecast Evaluation under Stationarity By Filip Stanek
  7. Compensatory model for quantile estimation and application to VaR By Shuzhen Yang
  8. A GMM Approach for Non-monotone Missingness on Both Treatment and Outcome Variables By Shenshen Yang
  9. Optimal Fixed-Budget Best Arm Identification using the Augmented Inverse Probability Estimator in Two-Armed Gaussian Bandits with Unknown Variances By Masahiro Kato; Kaito Ariu; Masaaki Imaizumi; Masatoshi Uehara; Masahiro Nomura; Chao Qin
  10. Visualization, Identification, and Estimation in the Linear Panel Event-Study Design By Simon Freyaldenhoven; Christian Hansen; Jorge Perez Perez; Jesse Shapiro
  11. What's Trending in Difference-in-Differences? A Synthesis of the Recent Econometrics Literature By Jonathan Roth; Pedro H. C. Sant'Anna; Alyssa Bilinski; John Poe
  12. ‘Misbehaving’ RCTs: the confounding problem of human agency By Kabeer, Naila
  13. Behavioral Foundations of Nested Stochastic Choice and Nested Logit By Matthew Kovach; Gerelt Tserenjigmid
  14. Ranking and Selection from Pairwise Comparisons: Empirical Bayes Methods for Citation Analysis By Jiaying Gu; Roger Koenker

  1. By: Saman Banafti (University of California Riverside); Tae-Hwy Lee (Department of Economics, University of California Riverside)
    Abstract: The Granular Instrumental Variables (GIV) methodology exploits panels with factor error structures to construct instruments to estimate structural time series models with endogeneity even after controlling for latent factors. We extend the GIV methodology in several dimensions. First, we extend the identification procedure to a large $N$ and large $T$ framework, which depends on the asymptotic Herfindahl index of the size distribution of $N$ cross-sectional units. Second, we treat both the factors and loadings as unknown and show that the sampling error in the estimated instrument and factors is negligible when considering the limiting distribution of the structural parameters. Third, we show that the sampling error in the high-dimensional precision matrix is negligible in our estimation algorithm. Fourth, we overidentify the structural parameters with additional constructed instruments, which leads to efficiency gains. Monte Carlo evidence is presented to support our asymptotic theory and application to the global crude oil market leads to new results.
    Keywords: Interactive effects, Factor error structure, Simultaneity, Power-law tails, Asymptotic Herfindahl index, Global crude oil market, Precision matrix.
    JEL: C26 C36 C38
    Date: 2022–01
  2. By: Justin Dang (UCR); Aman Ullah (Department of Economics, University of California Riverside)
    Abstract: This paper proposes a new combined semiparametric estimator of the conditional variance that takes the product of a parametric estimator and a nonparametric estimator based on machine learning. A popular kernel based machine learning algorithm, known as kernel regularized least squares estimator, is used to estimate the nonparametric component. We discuss how to estimate the semiparametric estimator using real data and how to use this estimator to make forecasts for the conditional variance. Simulations are conducted to show the dominance of the proposed estimator in terms of mean squared error. An empirical application using S&P 500 daily returns is analyzed, and the semiparametric estimator effectively forecasts future volatility.
    Keywords: Conditional variance; Nonparametric estimator; Semiparametric models; Forecasting; Machine Learning
    JEL: C01 C14 C51
    Date: 2021–01
  3. By: Dakyung Seong
    Abstract: This paper considers an endogenous binary response model with many weak instruments. We employ a control function approach and a regularization scheme to obtain better estimation results for the endogenous binary response model in the presence of many weak instruments. Two consistent and asymptotically normally distributed estimators are provided: a regularized conditional maximum likelihood estimator (RCMLE) and a regularized nonlinear least squares estimator (RNLSE). Monte Carlo simulations show that the proposed estimators outperform the existing estimators when many weak instruments are present. We apply our estimation method to study the effect of family income on college completion.
    Date: 2022–01
  4. By: Andras Fulop; Jeremy Heng; Junye Li
    Abstract: Most solved dynamic structural macrofinance models are non-linear and/or non-Gaussian state-space models with high-dimensional and complex structures. We propose an annealed controlled sequential Monte Carlo method that delivers numerically stable and low variance estimators of the likelihood function. The method relies on an annealing procedure to gradually introduce information from observations and constructs globally optimal proposal distributions by solving associated optimal control problems that yield zero variance likelihood estimators. To perform parameter inference, we develop a new adaptive SMC$^2$ algorithm that employs likelihood estimators from annealed controlled sequential Monte Carlo. We provide a theoretical stability analysis that elucidates the advantages of our methodology and asymptotic results concerning the consistency and convergence rates of our SMC$^2$ estimators. We illustrate the strengths of our proposed methodology by estimating two popular macrofinance models: a non-linear new Keynesian dynamic stochastic general equilibrium model and a non-linear non-Gaussian consumption-based long-run risk model.
    Date: 2022–01
  5. By: Rahul Singh
    Abstract: I propose kernel ridge regression estimators for long term causal inference, where a short term experimental data set containing randomized treatment and short term surrogates is fused with a long term observational data set containing short term surrogates and long term outcomes. I propose estimators of treatment effects, dose responses, and counterfactual distributions with closed form solutions in terms of kernel matrix operations. I allow covariates, treatment, and surrogates to be discrete or continuous, and low, high, or infinite dimensional. For long term treatment effects, I prove $\sqrt{n}$ consistency, Gaussian approximation, and semiparametric efficiency. For long term dose responses, I prove uniform consistency with finite sample rates. For long term counterfactual distributions, I prove convergence in distribution.
    Date: 2022–01
  6. By: Filip Stanek
    Abstract: It is common practice to split time-series into in-sample and pseudo out-of-sample segments and to estimate the out-of-sample loss of a given statistical model by evaluating forecasting performance over the pseudo out-of-sample segment. We propose an alternative estimator of the out-of-sample loss which, contrary to conventional wisdom, utilizes both measured in-sample and out-of-sample performance via a carefully constructed system of affine weights. We prove that, provided that the time-series is stationary, the proposed estimator is the best linear unbiased estimator of the out-of-sample loss and outperforms the conventional estimator in terms of sampling variance. Applying the optimal estimator to Diebold-Mariano type tests of predictive ability leads to a substantial power gain without worsening finite sample level distortions. An extensive evaluation on real world time-series from the M4 forecasting competition confirms the superiority of the proposed estimator and also demonstrates a substantial robustness to the violation of the underlying assumption of stationarity.
    Keywords: loss estimation; forecast evaluation; cross-validation; model selection
    JEL: C22 C52 C53
    Date: 2021–11
  7. By: Shuzhen Yang
    Abstract: In contrast to the usual procedure of estimating the distribution of a time series and then obtaining the quantile from the distribution, we develop a compensatory model to improve the quantile estimation under a given distribution estimation. A novel penalty term is introduced in the compensatory model. We prove that the penalty term can control the convergence error of the quantile estimation of a given time series, and obtain an adaptive adjusted quantile estimation. Simulation and empirical analysis indicate that the compensatory model can significantly improve the performance of the value at risk (VaR) under a given distribution estimation.
    Date: 2021–12
  8. By: Shenshen Yang
    Abstract: I examine the common problem of multiple missingness on both the endogenous treatment and outcome variables. Two types of dependence assumptions for missing mechanisms are proposed for identification, based on which a two-step AIPW GMM estimator is proposed. This estimator is unbiased and more efficient than the previously used estimation methods. Statistical properties are discussed case by case. This method is applied to the Oregon Health Insurance Experiment and shows the significant effects of enrolling in the Oregon Health Plan on improving health-related outcomes and reducing out-of-pocket costs for medical care. There is evidence that simply dropping the incomplete data creates downward biases for some of the chosen outcome variables. Moreover, the estimator proposed in this paper reduced standard errors by 6-24% of the estimated effects of the Oregon Health Plan.
    Date: 2022–01
  9. By: Masahiro Kato; Kaito Ariu; Masaaki Imaizumi; Masatoshi Uehara; Masahiro Nomura; Chao Qin
    Abstract: We consider the fixed-budget best arm identification problem in two-armed Gaussian bandits with unknown variances. The tightest lower bound on the complexity and an algorithm whose performance guarantee matches the lower bound have long been open problems when the variances are unknown and when the algorithm is agnostic to the optimal proportion of the arm draws. In this paper, we propose a strategy comprising a sampling rule with randomized sampling (RS) following the estimated target allocation probabilities of arm draws and a recommendation rule using the augmented inverse probability weighting (AIPW) estimator, which is often used in the causal inference literature. We refer to our strategy as the RS-AIPW strategy. In the theoretical analysis, we first derive a large deviation principle for martingales, which can be used when the second moment converges in mean, and apply it to our proposed strategy. Then, we show that the proposed strategy is asymptotically optimal in the sense that the probability of misidentification achieves the lower bound by Kaufmann et al. (2016) when the sample size becomes infinitely large and the gap between the two arms goes to zero.
    Date: 2022–01
  10. By: Simon Freyaldenhoven; Christian Hansen; Jorge Perez Perez; Jesse Shapiro
    Abstract: Linear panel models, and the “event-study plots” that often accompany them, are popular tools for learning about policy effects. We discuss the construction of event-study plots and suggest ways to make them more informative. We examine the economic content of different possible identifying assumptions. We explore the performance of the corresponding estimators in simulations, highlighting that a given estimator can perform well or poorly depending on the economic environment. An accompanying Stata package, xtevent, facilitates adoption of our suggestions.
    Keywords: linear panel data models; difference-in-differences; staggered adoption; pre-trends; event study
    JEL: C23 C52
    Date: 2021–12–20
  11. By: Jonathan Roth; Pedro H. C. Sant'Anna; Alyssa Bilinski; John Poe
    Abstract: This paper synthesizes recent advances in the econometrics of difference-in-differences (DiD) and provides concrete recommendations for practitioners. We begin by articulating a simple set of "canonical" assumptions under which the econometrics of DiD are well-understood. We then argue that recent advances in DiD methods can be broadly classified as relaxing some components of the canonical DiD setup, with a focus on $(i)$ multiple periods and variation in treatment timing, $(ii)$ potential violations of parallel trends, or $(iii)$ alternative frameworks for inference. Our discussion highlights the different ways that the DiD literature has advanced beyond the canonical model, and helps to clarify when each of the papers will be relevant for empirical work. We conclude by discussing some promising areas for future research.
    Date: 2022–01
  12. By: Kabeer, Naila
    Abstract: This paper argues that the theoretical model of causal inference underpinning RCTs is frequently undermined by the failure of different actors involved in their implementation to behave in ways required by the model. This is not a problem unique to RCTs, but it poses a greater challenge to them because it undercuts their claims to methodological superiority based on the ‘clean identification’ of causal effects.
    JEL: J1
    Date: 2020–03–01
  13. By: Matthew Kovach; Gerelt Tserenjigmid
    Abstract: We provide the first behavioral characterization of nested logit, a foundational and widely applied discrete choice model, through the introduction of a non-parametric version of nested logit that we call Nested Stochastic Choice (NSC). NSC is characterized by a single axiom that weakens Independence of Irrelevant Alternatives based on revealed similarity to allow for the similarity effect. Nested logit is characterized by an additional menu-independence axiom. Our axiomatic characterization leads to a practical, data-driven algorithm that identifies the true nest structure from choice data. We also discuss limitations of generalizing nested logit by studying the testable implications of cross-nested logit.
    Date: 2021–12
  14. By: Jiaying Gu; Roger Koenker
    Abstract: We study the Stigler model of citation flows among journals adapting the pairwise comparison model of Bradley and Terry to do ranking and selection of journal influence based on nonparametric empirical Bayes procedures. Comparisons with several other rankings are made.
    Date: 2021–12

This nep-ecm issue is ©2022 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.