nep-ecm New Economics Papers
on Econometrics
Issue of 2022‒01‒03
eighteen papers chosen by
Sune Karlsson
Örebro universitet

  1. When Can We Ignore Measurement Error in the Running Variable? By Yingying Dong; Michal Kolesár
  2. Simple Alternatives to the Common Correlated Effects Model By Nicholas L. Brown; Peter Schmidt; Jeffrey M. Wooldridge
  3. Asymptotics for Time-Varying Vector MA(∞) Processes By Yayi Yan; Jiti Gao; Bin Peng
  4. The Fixed-b Limiting Distribution and the ERP of HAR Tests Under Nonstationarity By Alessandro Casini
  5. Identification Of Mixtures Of Dynamic Discrete Choices By Higgins, Ayden; Jochmans, Koen
  6. Approximating Bayes in the 21st Century By Gael M. Martin; David T. Frazier; Christian P. Robert
  7. Comment on Giacomini, Kitagawa and Read's 'Narrative Restrictions and Proxies' By Lutz Kilian
  8. RIF Regression via Sensitivity Curves By Javier Alejo; Gabriel Montes-Rojas; Walter Sosa-Escudero
  9. Instrumental-Variable Estimation Of Exponential Regression Models With Two-Way Fixed Effects With An Application To Gravity Equations By Jochmans, Koen; Verardi, Vincenzo
  10. An explicit split point procedure in model-based trees allowing for a quick fitting of GLM trees and GLM forests By Christophe Dutang; Quentin Guibert
  11. Bias In Instrumental-Variable Estimators Of Fixed-Effect Models For Count Data By Jochmans, Koen
  12. Long and short memory in dynamic term structure models By Salman Huseynov
  13. Estimating initial conditions for dynamical systems with incomplete information By Farmer, J. Doyne; Kolic, Blas; Sabuco, Juan
  14. Structured Additive Regression and Tree Boosting By Michael Mayer; Steven C. Bourassa; Martin Hoesli; Donato Scognamiglio
  15. Realized GARCH, CBOE VIX, and the Volatility Risk Premium By Peter Reinhard Hansen; Zhuo Huang; Chen Tong; Tianyi Wang
  16. A Nifty Fix for Published Distribution Statistics: Simplified Distribution-Free Statistical Inference By Charles Beach
  17. Selection in Surveys By Deniz Dutz; Ingrid Huitfeldt; Santiago Lacouture; Magne Mogstad; Alexander Torgovitsky; Winnie van Dijk
  18. Estimation of nonlinear functions using coarsely discrete measures in panel data: The relationship between land prices and earthquake risk in the Tokyo Metropolitan District By Gu, Tao; Nakagawa, Masayuki; Saito, Makoto; Yamaga, Hisaki

  1. By: Yingying Dong; Michal Kolesár
    Abstract: In many empirical applications of regression discontinuity designs, the running variable used by the administrator to assign treatment is only observed with error. This paper provides easily interpretable conditions under which ignoring the measurement error nonetheless yields an estimate with a causal interpretation: the average treatment effect for units with the value of the observed running variable equal to the cutoff. To accommodate various types of measurement error, we propose to conduct inference using recently developed bias-aware methods, which remain valid even when discreteness or irregular support in the observed running variable may lead to partial identification. We illustrate the results for both sharp and fuzzy designs in an empirical application.
    Date: 2021–11
  2. By: Nicholas L. Brown; Peter Schmidt; Jeffrey M. Wooldridge
    Abstract: We study estimation of factor models in a fixed-T panel data setting and significantly relax the common correlated effects (CCE) assumptions pioneered by Pesaran (2006) and used in dozens of papers since. In the simplest case, we model the unobserved factors as functions of the cross-sectional averages of the explanatory variables and show that this is implied by Pesaran's assumptions when the number of factors does not exceed the number of explanatory variables. Our approach allows discrete explanatory variables and flexible functional forms in the covariates. Plus, it extends to a framework that easily incorporates general functions of cross-sectional moments, in addition to heterogeneous intercepts and time trends. Our proposed estimators include Pesaran's pooled correlated common effects (CCEP) estimator as a special case. We also show that in the presence of heterogeneous slopes our estimator is consistent under assumptions much weaker than those previously used. We derive the fixed-T asymptotic normality of a general estimator and show how to adjust for estimation of the population moments in the factor loading equation.
    Date: 2021–12
  3. By: Yayi Yan; Jiti Gao; Bin Peng
    Abstract: Moving average infinity (MA(∞)) processes play an important role in modeling time series data. While a strand of literature on time series analysis emphasizes the importance of modeling smooth changes over time and therefore is shifting its focus from parametric models to nonparametric ones, MA(∞) processes with constant parameters are often part of the fundamental data generating mechanism. Along this line of research, an intuitive question is how to allow the underlying data generating mechanism to evolve over time. To better capture the dynamics, this paper considers a new class of time-varying vector moving average infinity (VMA(∞)) processes. Accordingly, we establish some new asymptotic properties, including the law of large numbers, the uniform convergence, the central limit theory, the bootstrap consistency, and the long-run covariance matrix estimation for the class of time-varying VMA(∞) processes. Finally, we demonstrate the empirical relevance and usefulness of the newly proposed model and estimation theory through extensive simulated and real data studies.
    Keywords: multivariate time series, nonparametric kernel estimation, time-varying Beveridge–Nelson decomposition
    JEL: C14 C32 E52
    Date: 2021
  4. By: Alessandro Casini
    Abstract: We show that the nonstandard limiting distribution of HAR test statistics under fixed-b asymptotics is not pivotal (even after studentization) when the data are nonstationary. It takes the form of a complicated function of Gaussian processes and depends on the integrated local long-run variance and on the second moments of the relevant series (e.g., of the regressors and errors for the case of the linear regression model). Hence, existing fixed-b inference methods based on stationarity are not theoretically valid in general. The nuisance parameters entering the fixed-b limiting distribution can be consistently estimated under small-b asymptotics but only with nonparametric rate of convergence. Hence, we show that the error in rejection probability (ERP) is an order of magnitude larger than that under stationarity and is also larger than that of HAR tests based on HAC estimators under conventional asymptotics. These theoretical results reconcile with recent finite-sample evidence in Casini (2021) and Casini, Deng and Perron (2021), who show that fixed-b HAR tests can perform poorly when the data are nonstationary. They can be conservative under the null hypothesis and have non-monotonic power under the alternative hypothesis irrespective of how large the sample size is.
    Date: 2021–11
  5. By: Higgins, Ayden; Jochmans, Koen
    Abstract: This paper provides new identification results for finite mixtures of Markov processes. Our arguments are constructive and show that identification can be achieved from knowledge of the cross-sectional distribution of three (or more) effective time-series observations under simple conditions. Our approach is contrasted with the ones taken in prior work by Kasahara and Shimotsu (2009) and Hu and Shum (2012). Most notably, monotonicity restrictions that link conditional distributions to latent types are not needed. Maximum likelihood is considered for the purpose of estimation and inference. Implementation via the EM algorithm is straightforward. Its performance is evaluated in a simulation exercise.
    Keywords: Discrete choice; heterogeneity; Markov process; mixture; state dependence
    JEL: C14 C23 C51
    Date: 2021–11–30
  6. By: Gael M. Martin; David T. Frazier; Christian P. Robert
    Abstract: The 21st century has seen an enormous growth in the development and use of approximate Bayesian methods. Such methods produce computational solutions to certain 'intractable' statistical problems that challenge exact methods like Markov chain Monte Carlo: for instance, models with unavailable likelihoods, high-dimensional models, and models featuring large data sets. These approximate methods are the subject of this review. The aim is to help new researchers in particular, and more generally those interested in adopting a Bayesian approach to empirical work, distinguish between different approximate techniques; understand the sense in which they are approximate; appreciate when and why particular methods are useful; and see the ways in which they can be combined.
    Keywords: Approximate Bayesian inference, intractable Bayesian problems, approximate Bayesian computation, Bayesian synthetic likelihood, variational Bayes, integrated nested Laplace approximation
    Date: 2021
  7. By: Lutz Kilian
    Abstract: In a series of recent studies, Raffaella Giacomini and Toru Kitagawa have developed an innovative new methodological approach to estimating sign-identified structural VAR models that seeks to build a bridge between Bayesian and frequentist approaches in the literature. Their latest paper with Matthew Read contains thought-provoking new insights about modeling narrative restrictions in sign-identified structural VAR models. My discussion puts their contribution into the context of Giacomini and Kitagawa’s broader research agenda and relates it to the larger literature on estimating structural VAR models subject to sign restrictions.
    Keywords: Structural VAR; single prior; multiple prior; posterior; joint inference; impulse response; narrative restrictions
    JEL: C11 C32 C52
    Date: 2021–12–17
  8. By: Javier Alejo; Gabriel Montes-Rojas; Walter Sosa-Escudero
    Abstract: This paper proposes an empirical method to implement the recentered influence function (RIF) regression of Firpo, Fortin and Lemieux (2009), a relevant method to study the effect of covariates on many statistics beyond the mean. In empirically relevant situations where the influence function is not available or is difficult to compute, we suggest using the sensitivity curve (Tukey, 1977) as a feasible alternative. This may be computationally cumbersome when the sample size is large. The relevance of the proposed strategy derives from the fact that, under general conditions, the sensitivity curve converges in probability to the influence function. In order to save computational time, we propose using a nonparametric cubic-spline method for a random subsample and then interpolating to the remaining cases where it was not computed. Monte Carlo simulations show good finite sample properties. We illustrate the proposed estimator with an application to the polarization index of Duclos, Esteban and Ray (2004).
    Date: 2021–12
  9. By: Jochmans, Koen; Verardi, Vincenzo
    Abstract: This paper introduces instrumental-variable estimators for exponential-regression models that feature two-way fixed effects. These techniques allow us to develop a theory-consistent approach to the estimation of cross-sectional gravity equations that can accommodate the endogeneity of policy variables. We apply this approach to a data set in which the policy decision of interest is the engagement in a free trade agreement. We explore ways to exploit the transitivity observed in the formation of trade agreements to construct instrumental variables with considerable predictive ability. Within a bilateral model, the use of these instruments has strong theoretical foundations. We obtain point estimates of the partial effect of a preferential-trade agreement on trade volume that range between 20% and 30% and find no statistical evidence of endogeneity.
    Keywords: Bias correction; count data; differencing estimator; endogeneity; fixed effects; gravity equation; instrumental variable; transitivity
    JEL: C23 C26 F14
    Date: 2021–11–30
  10. By: Christophe Dutang (CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - CNRS - Centre National de la Recherche Scientifique - Université Paris Dauphine-PSL - PSL - Université Paris sciences et lettres); Quentin Guibert (CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - CNRS - Centre National de la Recherche Scientifique - Université Paris Dauphine-PSL - PSL - Université Paris sciences et lettres)
    Abstract: Classification and regression trees (CART) prove to be a true alternative to full parametric models such as linear models (LM) and generalized linear models (GLM). Although CART suffer from a biased variable selection issue, they are commonly applied to various topics and used for tree ensembles and random forests because of their simplicity and computation speed. Conditional inference trees and model-based trees algorithms for which variable selection is tackled via fluctuation tests are known to give more accurate and interpretable results than CART, but yield longer computation times. Using a closed-form maximum likelihood estimator for GLM, this paper proposes a split point procedure based on the explicit likelihood in order to save time when searching for the best split for a given splitting variable. A simulation study for non-Gaussian response is performed to assess the computational gain when building GLM trees. We also propose a benchmark on simulated and empirical datasets of GLM trees against CART, conditional inference trees and LM trees in order to identify situations where GLM trees are efficient. This approach is extended to multiway split trees and log-transformed distributions. Making GLM trees possible through a new split point procedure allows us to investigate the use of GLM in ensemble methods. We propose a numerical comparison of GLM forests against other random forest-type approaches. Our simulation analyses show cases where GLM forests are good challengers to random forests.
    Keywords: GLM, model-based recursive partitioning, GLM trees, random forest, GLM forest
    Date: 2021–11–11
  11. By: Jochmans, Koen
    Abstract: This note looks at the properties of instrumental-variable estimators of models for non-negative outcomes in the presence of individual effects. We show that fixed-effect versions of the estimators of Mullahy (1997) and Windmeijer and Santos Silva (1997) are inconsistent under conventional asymptotics, in general, and that inference based on them in long panels requires bias correction. Such corrections are derived and their effectiveness is investigated in numerical experiments. Consistent estimation in short panels is nonetheless possible in the setting underlying Mullahy’s (1997) approach using a differencing strategy along the lines of Wooldridge (1997) and Windmeijer (2000).
    Keywords: Count data; Bias; Fixed effects; Inconsistency; Instrumental variable; Multiplicative-error model; Poisson.
    JEL: C23 C26
    Date: 2021–10
  12. By: Salman Huseynov (Aarhus University, Department of Economics and Business Economics and CREATES)
    Abstract: I provide a unified theoretical framework for long memory term structure models and show that the recent state-space approach suffers from a parameter identification problem. I propose a different framework to estimate long memory models in a state-space setup, which addresses the shortcomings of the existing approach. The proposed framework allows asymmetrically treating the physical and risk-neutral dynamics, which simplifies estimation considerably and helps to conduct an extensive comparison with standard term structure models. Relying on a battery of tests, I find that standard term structure models perform just as well as the more complicated long memory models and produce plausible term premium estimates.
    Keywords: Dynamic term structure models, Long memory, Affine model, Shadow rate model
    JEL: C32 E43 G12
    Date: 2021–12–20
  13. By: Farmer, J. Doyne; Kolic, Blas; Sabuco, Juan
    Abstract: In this paper we study the problem of inferring the initial conditions of a dynamical system under incomplete information. Studying several model systems, we infer the latent microstates that best reproduce an observed time series when the observations are sparse, noisy and aggregated under a (possibly) nonlinear observation operator. This is done by minimizing the least-squares distance between the observed time series and a model-simulated time series using gradient-based methods. We validate this method for the Lorenz and Mackey-Glass systems by making out-of-sample predictions. Finally, we analyze the predictive power of our method as a function of the number of observations available. We find a critical transition for the Mackey-Glass system, beyond which it can be initialized with arbitrary precision.
    Date: 2021–09
  14. By: Michael Mayer (Schweizerische Mobiliar Versicherungsgesellschaft); Steven C. Bourassa (Florida Atlantic University); Martin Hoesli (University of Geneva - Geneva School of Economics and Management (GSEM); Swiss Finance Institute; University of Aberdeen - Business School); Donato Scognamiglio (IAZI AG and University of Bern)
    Abstract: Structured additive regression (STAR) models are a rich class of regression models that include the generalized linear model (GLM) and the generalized additive model (GAM). STAR models can be fitted by Bayesian approaches, component-wise gradient boosting, penalized least-squares, and deep learning. Using feature interaction constraints, we show that such models can be implemented also by the gradient boosting powerhouses XGBoost and LightGBM, thereby benefiting from their excellent predictive capabilities. Furthermore, we show how STAR models can be used for supervised dimension reduction and explain under what circumstances covariate effects of such models can be described in a transparent way. We illustrate the methodology with case studies pertaining to house price modeling, with very encouraging results regarding both interpretability and predictive performance.
    Keywords: machine learning, structured additive regression, gradient boosting, interpretability, transparency
    JEL: C13 C21 C45 C51 C52 C55 R31
    Date: 2021–09
  15. By: Peter Reinhard Hansen; Zhuo Huang; Chen Tong; Tianyi Wang
    Abstract: We show that the Realized GARCH model yields closed-form expressions for both the Volatility Index (VIX) and the volatility risk premium (VRP). The Realized GARCH model is driven by two shocks, a return shock and a volatility shock, and these are natural state variables in the stochastic discount factor (SDF). The volatility shock endows the exponentially affine SDF with a compensation for volatility risk. This leads to dissimilar dynamic properties under the physical and risk-neutral measures that can explain time-variation in the VRP. In an empirical application with the S&P 500 returns, the VIX, and the VRP, we find that the Realized GARCH model significantly outperforms conventional GARCH models.
    Date: 2021–12
  16. By: Charles Beach
    Abstract: This paper applies the tool box measures of disaggregative income inequality characterization and the statistical methodology of Beach (2021) to percentile-based distribution statistics such as quintile income shares and decile means typically published by official statistical agencies. It derives standard error formulas for those measures which are distribution-free and easy to implement. The approach is illustrated with Canadian Labour Force Survey data over 1997-2015. It is found that widely shared real earnings gains were experienced over this period, but that the gains were very unevenly shared with middle-class workers losing out relatively and top earners having highly statistically significant earnings gains.
    Keywords: income inequality, inequality inference, statistical inference
    JEL: C12 C46 D31 D63
    Date: 2021–09
  17. By: Deniz Dutz; Ingrid Huitfeldt (Statistics Norway); Santiago Lacouture; Magne Mogstad (Statistics Norway); Alexander Torgovitsky; Winnie van Dijk
    Abstract: We evaluate how nonresponse affects conclusions drawn from survey data and consider how researchers can reliably test and correct for nonresponse bias. To do so, we examine a survey on labor market conditions during the COVID-19 pandemic that used randomly assigned financial incentives to encourage participation. We link the survey data to administrative data sources, allowing us to observe a ground truth for participants and nonparticipants. We find evidence of large nonresponse bias, even after correcting for observable differences between participants and nonparticipants. We apply a range of existing methods that account for nonresponse bias due to unobserved differences, including worst-case bounds, bounds that incorporate monotonicity assumptions, and approaches based on parametric and nonparametric selection models. These methods produce bounds (or point estimates) that are either too wide to be useful or far from the ground truth. We show how these shortcomings can be addressed by modeling how nonparticipation can be both active (declining to participate) and passive (not seeing the survey invitation). The model makes use of variation from the randomly assigned financial incentives, as well as the timing of reminder emails. Applying the model to our data produces bounds (or point estimates) that are narrower and closer to the ground truth than the other methods.
    Keywords: Survey; nonresponse; nonresponse bias
    JEL: C01 C81 C83
    Date: 2021–12
  18. By: Gu, Tao; Nakagawa, Masayuki; Saito, Makoto; Yamaga, Hisaki
    Abstract: This paper proposes a simple method to estimate a nonlinear function using only coarsely discrete explanatory variables in panel data. The basic premise is to distinguish carefully between two types of discrete variables by assuming that if the variable changes between two points in time, it increases (decreases) marginally from near the upper (lower) bound one rank below (above). The dynamic pricing behavior at the boundary between two consecutive ranks is then properly approximated. Applying the proposed method, we estimate the nonlinear relationship between land prices and earthquake risk, with the latter being assessed over only five ranks. The panel datasets used comprise some two thousand fixed places over time in the Tokyo Metropolitan District. We interpret the estimated nonlinear land pricing functions using prospect theory from behavioral economics.
    JEL: R14 R30 D91
    Date: 2021–12

This nep-ecm issue is ©2022 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.