
on Econometrics 
By:  Yingying Dong; Michal Kolesár 
Abstract:  In many empirical applications of regression discontinuity designs, the running variable used by the administrator to assign treatment is only observed with error. This paper provides easily interpretable conditions under which ignoring the measurement error nonetheless yields an estimate with a causal interpretation: the average treatment effect for units with the value of the observed running variable equal to the cutoff. To accommodate various types of measurement error, we propose to conduct inference using recently developed bias-aware methods, which remain valid even when discreteness or irregular support in the observed running variable may lead to partial identification. We illustrate the results for both sharp and fuzzy designs in an empirical application. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.07388&r= 
By:  Nicholas L. Brown; Peter Schmidt; Jeffrey M. Wooldridge 
Abstract:  We study estimation of factor models in a fixed-T panel data setting and significantly relax the common correlated effects (CCE) assumptions pioneered by Pesaran (2006) and used in dozens of papers since. In the simplest case, we model the unobserved factors as functions of the cross-sectional averages of the explanatory variables and show that this is implied by Pesaran's assumptions when the number of factors does not exceed the number of explanatory variables. Our approach allows discrete explanatory variables and flexible functional forms in the covariates. It also extends to a framework that easily incorporates general functions of cross-sectional moments, in addition to heterogeneous intercepts and time trends. Our proposed estimators include Pesaran's pooled correlated common effects (CCEP) estimator as a special case. We also show that in the presence of heterogeneous slopes our estimator is consistent under assumptions much weaker than those previously used. We derive the fixed-T asymptotic normality of a general estimator and show how to adjust for estimation of the population moments in the factor loading equation. 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2112.01486&r= 
By:  Yayi Yan; Jiti Gao; Bin Peng 
Abstract:  Moving average infinity (MA(∞)) processes play an important role in modeling time series data. While a strand of literature on time series analysis emphasizes the importance of modeling smooth changes over time and therefore is shifting its focus from parametric models to nonparametric ones, MA(∞) processes with constant parameters are often part of the fundamental data generating mechanism. Along this line of research, an intuitive question is how to allow the underlying data generating mechanism to evolve over time. To better capture the dynamics, this paper considers a new class of time-varying vector moving average infinity (VMA(∞)) processes. Accordingly, we establish some new asymptotic properties, including the law of large numbers, the uniform convergence, the central limit theory, the bootstrap consistency, and the long-run covariance matrix estimation for the class of time-varying VMA(∞) processes. Finally, we demonstrate the empirical relevance and usefulness of the newly proposed model and estimation theory through extensive simulated and real data studies. 
Keywords:  multivariate time series, nonparametric kernel estimation, time-varying Beveridge–Nelson decomposition 
JEL:  C14 C32 E52 
Date:  2021 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:202122&r= 
By:  Alessandro Casini 
Abstract:  We show that the nonstandard limiting distribution of HAR test statistics under fixed-b asymptotics is not pivotal (even after studentization) when the data are nonstationary. It takes the form of a complicated function of Gaussian processes and depends on the integrated local long-run variance and on the second moments of the relevant series (e.g., of the regressors and errors for the case of the linear regression model). Hence, existing fixed-b inference methods based on stationarity are not theoretically valid in general. The nuisance parameters entering the fixed-b limiting distribution can be consistently estimated under small-b asymptotics but only with a nonparametric rate of convergence. Hence, we show that the error in rejection probability (ERP) is an order of magnitude larger than that under stationarity and is also larger than that of HAR tests based on HAC estimators under conventional asymptotics. These theoretical results are consistent with recent finite-sample evidence in Casini (2021) and Casini, Deng and Perron (2021), who show that fixed-b HAR tests can perform poorly when the data are nonstationary. They can be conservative under the null hypothesis and have non-monotonic power under the alternative hypothesis irrespective of how large the sample size is. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.14590&r= 
By:  Higgins, Ayden; Jochmans, Koen 
Abstract:  This paper provides new identification results for finite mixtures of Markov processes. Our arguments are constructive and show that identification can be achieved from knowledge of the cross-sectional distribution of three (or more) effective time-series observations under simple conditions. Our approach is contrasted with the ones taken in prior work by Kasahara and Shimotsu (2009) and Hu and Shum (2012). Most notably, monotonicity restrictions that link conditional distributions to latent types are not needed. Maximum likelihood is considered for the purpose of estimation and inference. Implementation via the EM algorithm is straightforward. Its performance is evaluated in a simulation exercise. 
Keywords:  Discrete choice; heterogeneity; Markov process; mixture; state dependence 
JEL:  C14 C23 C51 
Date:  2021–11–30 
URL:  http://d.repec.org/n?u=RePEc:tse:wpaper:126197&r= 
By:  Gael M. Martin; David T. Frazier; Christian P. Robert 
Abstract:  The 21st century has seen an enormous growth in the development and use of approximate Bayesian methods. Such methods produce computational solutions to certain 'intractable' statistical problems that challenge exact methods like Markov chain Monte Carlo: for instance, models with unavailable likelihoods, high-dimensional models, and models featuring large data sets. These approximate methods are the subject of this review. The aim is to help new researchers in particular, and more generally those interested in adopting a Bayesian approach to empirical work, distinguish between different approximate techniques; understand the sense in which they are approximate; appreciate when and why particular methods are useful; and see the ways in which they can be combined. 
Keywords:  Approximate Bayesian inference, intractable Bayesian problems, approximate Bayesian computation, Bayesian synthetic likelihood, variational Bayes, integrated nested Laplace approximation 
Date:  2021 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:202124&r= 
By:  Lutz Kilian 
Abstract:  In a series of recent studies, Raffaella Giacomini and Toru Kitagawa have developed an innovative new methodological approach to estimating sign-identified structural VAR models that seeks to build a bridge between Bayesian and frequentist approaches in the literature. Their latest paper with Matthew Read contains thought-provoking new insights about modeling narrative restrictions in sign-identified structural VAR models. My discussion puts their contribution into the context of Giacomini and Kitagawa’s broader research agenda and relates it to the larger literature on estimating structural VAR models subject to sign restrictions. 
Keywords:  Structural VAR; single prior; multiple prior; posterior; joint inference; impulse response; narrative restrictions 
JEL:  C11 C32 C52 
Date:  2021–12–17 
URL:  http://d.repec.org/n?u=RePEc:fip:feddwp:93526&r= 
By:  Javier Alejo; Gabriel Montes-Rojas; Walter Sosa-Escudero 
Abstract:  This paper proposes an empirical method to implement the recentered influence function (RIF) regression of Firpo, Fortin and Lemieux (2009), a relevant method to study the effect of covariates on many statistics beyond the mean. In empirically relevant situations where the influence function is not available or is difficult to compute, we suggest using the sensitivity curve (Tukey, 1977) as a feasible alternative. This may be computationally cumbersome when the sample size is large. The relevance of the proposed strategy derives from the fact that, under general conditions, the sensitivity curve converges in probability to the influence function. To save computational time, we propose applying a nonparametric cubic-spline method to a random subsample and then interpolating to the remaining cases where the curve was not computed. Monte Carlo simulations show good finite-sample properties. We illustrate the proposed estimator with an application to the polarization index of Duclos, Esteban and Ray (2004). 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2112.01435&r= 
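The sensitivity-curve idea in the abstract above can be illustrated with a minimal Python sketch. It assumes a leave-one-out variant of the curve, (n - 1) * (T(all) - T(all but i)), which is an assumption made here for illustration and not necessarily the definition used in the paper; the function names and the sample values are hypothetical. The RIF-style pseudo-outcome is formed as the statistic plus the curve value, following the general definition of the RIF as the statistic plus its influence function.

```python
from statistics import mean

def sensitivity_curve(sample, statistic):
    """Leave-one-out sensitivity curve: (n - 1) * (T(all) - T(all but i))."""
    n = len(sample)
    full = statistic(sample)
    return [
        (n - 1) * (full - statistic(sample[:i] + sample[i + 1:]))
        for i in range(n)
    ]

def rif_proxy(sample, statistic):
    """RIF-style pseudo-outcome: the statistic plus the sensitivity-curve value."""
    full = statistic(sample)
    return [full + sc for sc in sensitivity_curve(sample, statistic)]

# Hypothetical data, for illustration only.
incomes = [12.0, 15.0, 20.0, 22.0, 31.0]
sc_mean = sensitivity_curve(incomes, mean)
rif_mean = rif_proxy(incomes, mean)
```

For the mean, this leave-one-out curve equals each observation's deviation from the sample mean, so the pseudo-outcome reproduces the observation itself. Any other statistic that accepts a list (a quantile or an inequality index, say) can be passed in place of `mean`, at the cost of one recomputation per observation, which is the computational burden the abstract refers to.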
By:  Jochmans, Koen; Verardi, Vincenzo 
Abstract:  This paper introduces instrumental-variable estimators for exponential-regression models that feature two-way fixed effects. These techniques allow us to develop a theory-consistent approach to the estimation of cross-sectional gravity equations that can accommodate the endogeneity of policy variables. We apply this approach to a data set in which the policy decision of interest is the engagement in a free trade agreement. We explore ways to exploit the transitivity observed in the formation of trade agreements to construct instrumental variables with considerable predictive ability. Within a bilateral model, the use of these instruments has strong theoretical foundations. We obtain point estimates of the partial effect of a preferential-trade agreement on trade volume that range between 20% and 30% and find no statistical evidence of endogeneity. 
Keywords:  Bias correction; count data; differencing estimator; endogeneity; fixed effects; gravity equation; instrumental variable; transitivity 
JEL:  C23 C26 F14 
Date:  2021–11–30 
URL:  http://d.repec.org/n?u=RePEc:tse:wpaper:126195&r= 
By:  Christophe Dutang (CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - CNRS - Centre National de la Recherche Scientifique - Université Paris Dauphine-PSL - PSL - Université Paris sciences et lettres); Quentin Guibert (CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - CNRS - Centre National de la Recherche Scientifique - Université Paris Dauphine-PSL - PSL - Université Paris sciences et lettres) 
Abstract:  Classification and regression trees (CART) prove to be a true alternative to full parametric models such as linear models (LM) and generalized linear models (GLM). Although CART suffer from a biased variable selection issue, they are commonly applied to various topics and used for tree ensembles and random forests because of their simplicity and computation speed. Conditional inference trees and model-based tree algorithms, for which variable selection is tackled via fluctuation tests, are known to give more accurate and interpretable results than CART, but yield longer computation times. Using a closed-form maximum likelihood estimator for GLM, this paper proposes a split point procedure based on the explicit likelihood in order to save time when searching for the best split for a given splitting variable. A simulation study for non-Gaussian response is performed to assess the computational gain when building GLM trees. We also propose a benchmark on simulated and empirical datasets of GLM trees against CART, conditional inference trees and LM trees in order to identify situations where GLM trees are efficient. This approach is extended to multiway split trees and log-transformed distributions. Making GLM trees possible through a new split point procedure allows us to investigate the use of GLM in ensemble methods. We propose a numerical comparison of GLM forests against other random forest-type approaches. Our simulation analyses show cases where GLM forests are good challengers to random forests. 
Keywords:  GLM, model-based recursive partitioning, GLM trees, random forest, GLM forest 
Date:  2021–11–11 
URL:  http://d.repec.org/n?u=RePEc:hal:journl:hal03448250&r= 
By:  Jochmans, Koen 
Abstract:  This note looks at the properties of instrumental-variable estimators of models for nonnegative outcomes in the presence of individual effects. We show that fixed-effect versions of the estimators of Mullahy (1997) and Windmeijer and Santos Silva (1997) are inconsistent under conventional asymptotics, in general, and that inference based on them in long panels requires bias correction. Such corrections are derived and their effectiveness is investigated in numerical experiments. Consistent estimation in short panels is nonetheless possible in the setting underlying Mullahy’s (1997) approach using a differencing strategy along the lines of Wooldridge (1997) and Windmeijer (2000). 
Keywords:  Count data; Bias; Fixed effects; Inconsistency; Instrumental variable; Multiplicative-error model; Poisson. 
JEL:  C23 C26 
Date:  2021–10 
URL:  http://d.repec.org/n?u=RePEc:tse:wpaper:126253&r= 
By:  Salman Huseynov (Aarhus University, Department of Economics and Business Economics and CREATES) 
Abstract:  I provide a unified theoretical framework for long memory term structure models and show that the recent state-space approach suffers from a parameter identification problem. I propose a different framework to estimate long memory models in a state-space setup, which addresses the shortcomings of the existing approach. The proposed framework allows asymmetrically treating the physical and risk-neutral dynamics, which simplifies estimation considerably and helps to conduct an extensive comparison with standard term structure models. Relying on a battery of tests, I find that standard term structure models perform just as well as the more complicated long memory models and produce plausible term premium estimates. 
Keywords:  Dynamic term structure models, Long memory, Affine model, Shadow rate model 
JEL:  C32 E43 G12 
Date:  2021–12–20 
URL:  http://d.repec.org/n?u=RePEc:aah:create:202115&r= 
By:  Farmer, J. Doyne; Kolic, Blas; Sabuco, Juan 
Abstract:  In this paper we study the problem of inferring the initial conditions of a dynamical system under incomplete information. Studying several model systems, we infer the latent microstates that best reproduce an observed time series when the observations are sparse, noisy and aggregated under a (possibly) nonlinear observation operator. This is done by minimizing the least-squares distance between the observed time series and a model-simulated time series using gradient-based methods. We validate this method for the Lorenz and Mackey-Glass systems by making out-of-sample predictions. Finally, we analyze the predictive power of our method as a function of the number of observations available. We find a critical transition for the Mackey-Glass system, beyond which it can be initialized with arbitrary precision. 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:amz:wpaper:202120&r= 
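The least-squares initialization described in the abstract above can be sketched on a toy problem. The Python example below is only an illustration under stated assumptions: it swaps in a one-dimensional logistic-growth model (not the Lorenz or Mackey-Glass systems of the paper), uses noise-free observations, approximates the gradient by central finite differences, and runs plain gradient descent. All function names, the step size, and the starting guess are hypothetical choices.

```python
def simulate(x0, steps=30, dt=0.1, rate=1.0):
    """Euler path of logistic growth dx/dt = rate * x * (1 - x)."""
    path = [x0]
    for _ in range(steps):
        x = path[-1]
        path.append(x + dt * rate * x * (1.0 - x))
    return path

def observe(path, every=10):
    """Sparse, nonlinear observation operator: record x**2 every `every` steps."""
    return [path[t] ** 2 for t in range(every, len(path), every)]

def loss(x0, observed):
    """Least-squares distance between simulated and observed series."""
    simulated = observe(simulate(x0))
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))

def infer_initial_condition(observed, guess=0.5, lr=0.02, iters=500, eps=1e-6):
    """Gradient descent on the loss, with a finite-difference gradient."""
    x0 = guess
    for _ in range(iters):
        grad = (loss(x0 + eps, observed) - loss(x0 - eps, observed)) / (2 * eps)
        x0 -= lr * grad
    return x0

# Hypothetical "true" initial condition used to generate the observations.
true_x0 = 0.2
observed = observe(simulate(true_x0))
estimate = infer_initial_condition(observed)
```

In this toy setting the loss has a single minimum in the initial condition, so gradient descent recovers it from three observations. Chaotic systems such as those studied in the paper have far more irregular loss surfaces, which is why the number of available observations matters there.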
By:  Michael Mayer (Schweizerische Mobiliar Versicherungsgesellschaft); Steven C. Bourassa (Florida Atlantic University); Martin Hoesli (University of Geneva - Geneva School of Economics and Management (GSEM); Swiss Finance Institute; University of Aberdeen - Business School); Donato Scognamiglio (IAZI AG and University of Bern) 
Abstract:  Structured additive regression (STAR) models are a rich class of regression models that include the generalized linear model (GLM) and the generalized additive model (GAM). STAR models can be fitted by Bayesian approaches, component-wise gradient boosting, penalized least squares, and deep learning. Using feature interaction constraints, we show that such models can also be implemented by the gradient boosting powerhouses XGBoost and LightGBM, thereby benefiting from their excellent predictive capabilities. Furthermore, we show how STAR models can be used for supervised dimension reduction and explain under what circumstances covariate effects of such models can be described in a transparent way. We illustrate the methodology with case studies pertaining to house price modeling, with very encouraging results regarding both interpretability and predictive performance. 
Keywords:  machine learning, structured additive regression, gradient boosting, interpretability, transparency 
JEL:  C13 C21 C45 C51 C52 C55 R31 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:chf:rpseri:rp2183&r= 
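As a rough illustration of the feature interaction constraints mentioned in the abstract above, the Python sketch below builds the nested list of feature-index groups that boosting libraries typically accept: features in the same group may appear together in a tree, while every other feature stands alone and therefore enters additively. The helper name, the feature names, and the grouping are hypothetical. Both XGBoost and LightGBM document a parameter named `interaction_constraints`, but the exact accepted format should be checked against the installed version; the sketch deliberately stops short of calling either library.

```python
def additive_constraints(feature_names, interacting_groups=()):
    """Index groups: listed groups may interact, all other features stand alone."""
    index = {name: i for i, name in enumerate(feature_names)}
    grouped = set()
    constraints = []
    for group in interacting_groups:
        constraints.append([index[name] for name in group])
        grouped.update(group)
    # Every remaining feature gets its own singleton group (purely additive effect).
    constraints.extend([index[name]] for name in feature_names if name not in grouped)
    return constraints

# Hypothetical house-price features, for illustration only.
features = ["living_area", "building_age", "latitude", "longitude"]
constraints = additive_constraints(features, [("latitude", "longitude")])

# Assumed usage: pass the list under the library's interaction-constraint parameter.
params = {"interaction_constraints": constraints}
```

Here latitude and longitude are allowed to interact, giving a joint location effect, while living area and building age each contribute a separate one-dimensional effect that can be plotted and inspected on its own.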
By:  Peter Reinhard Hansen; Zhuo Huang; Chen Tong; Tianyi Wang 
Abstract:  We show that the Realized GARCH model yields closed-form expressions for both the Volatility Index (VIX) and the volatility risk premium (VRP). The Realized GARCH model is driven by two shocks, a return shock and a volatility shock, and these are natural state variables in the stochastic discount factor (SDF). The volatility shock endows the exponentially affine SDF with a compensation for volatility risk. This leads to dissimilar dynamic properties under the physical and risk-neutral measures that can explain time-variation in the VRP. In an empirical application with the S&P 500 returns, the VIX, and the VRP, we find that the Realized GARCH model significantly outperforms conventional GARCH models. 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2112.05302&r= 
By:  Charles Beach 
Abstract:  This paper applies the toolbox measures of disaggregative income inequality characterization and the statistical methodology of Beach (2021) to percentile-based distribution statistics such as quintile income shares and decile means typically published by official statistical agencies. It derives standard error formulas for those measures which are distribution-free and easy to implement. The approach is illustrated with Canadian Labour Force Survey data over 1997–2015. It is found that widely shared real earnings gains were experienced over this period, but that the gains were very unevenly shared, with middle-class workers losing out relatively and top earners having highly statistically significant earnings gains. 
Keywords:  income inequality, inequality inference, statistical inference 
JEL:  C12 C46 D31 D63 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:qed:wpaper:1477&r= 
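For readers unfamiliar with the percentile-based statistics named in the abstract above, the following Python sketch computes quintile income shares and decile means from a sample. It is a minimal illustration with hypothetical data and a hypothetical function name; it assumes the sample size is a multiple of the number of groups and does not implement the standard error formulas derived in the paper.

```python
def group_stats(incomes, groups):
    """Split sorted incomes into equal-sized groups; return (shares, means)."""
    ordered = sorted(incomes)
    n = len(ordered)
    if n % groups != 0:
        raise ValueError("sample size must be a multiple of the number of groups")
    size = n // groups
    total = sum(ordered)
    chunks = [ordered[i * size:(i + 1) * size] for i in range(groups)]
    shares = [sum(chunk) / total for chunk in chunks]  # share of total income
    means = [sum(chunk) / size for chunk in chunks]    # mean income in the group
    return shares, means

# Hypothetical incomes, for illustration only.
incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
quintile_shares, quintile_means = group_stats(incomes, 5)
decile_shares, decile_means = group_stats(incomes, 10)
```

The shares sum to one by construction, and comparing them across years is the kind of exercise for which the paper's standard errors would be needed to judge statistical significance.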
By:  Deniz Dutz; Ingrid Huitfeldt (Statistics Norway); Santiago Lacouture; Magne Mogstad (Statistics Norway); Alexander Torgovitsky; Winnie van Dijk 
Abstract:  We evaluate how nonresponse affects conclusions drawn from survey data and consider how researchers can reliably test and correct for nonresponse bias. To do so, we examine a survey on labor market conditions during the COVID-19 pandemic that used randomly assigned financial incentives to encourage participation. We link the survey data to administrative data sources, allowing us to observe a ground truth for participants and nonparticipants. We find evidence of large nonresponse bias, even after correcting for observable differences between participants and nonparticipants. We apply a range of existing methods that account for nonresponse bias due to unobserved differences, including worst-case bounds, bounds that incorporate monotonicity assumptions, and approaches based on parametric and nonparametric selection models. These methods produce bounds (or point estimates) that are either too wide to be useful or far from the ground truth. We show how these shortcomings can be addressed by modeling how nonparticipation can be both active (declining to participate) and passive (not seeing the survey invitation). The model makes use of variation from the randomly assigned financial incentives, as well as the timing of reminder emails. Applying the model to our data produces bounds (or point estimates) that are narrower and closer to the ground truth than the other methods. 
Keywords:  Survey; nonresponse; nonresponse bias 
JEL:  C01 C81 C83 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:ssb:dispap:971&r= 
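The worst-case bounds mentioned in the abstract above have a simple form for a bounded outcome: nonrespondents are assumed to all sit at the lowest or the highest possible value. The Python sketch below illustrates that textbook construction only; the function name and the data are hypothetical, and it does not reproduce the paper's own model of active and passive nonparticipation.

```python
def worst_case_bounds(respondent_outcomes, n_invited, y_min, y_max):
    """Worst-case bounds on the population mean of an outcome in [y_min, y_max]."""
    n_resp = len(respondent_outcomes)
    if not 0 < n_resp <= n_invited:
        raise ValueError("need at least one respondent and n_resp <= n_invited")
    rate = n_resp / n_invited                       # response rate
    mean_resp = sum(respondent_outcomes) / n_resp   # mean among respondents
    lower = rate * mean_resp + (1 - rate) * y_min   # nonrespondents all at y_min
    upper = rate * mean_resp + (1 - rate) * y_max   # nonrespondents all at y_max
    return lower, upper

# Hypothetical binary employment outcomes for 4 respondents out of 10 invited.
employed = [1, 1, 0, 1]
lower, upper = worst_case_bounds(employed, n_invited=10, y_min=0.0, y_max=1.0)
```

The width of the interval equals the nonresponse rate times the outcome range, which shows why such bounds tend to be too wide to be useful when response rates are low.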
By:  Gu, Tao; Nakagawa, Masayuki; Saito, Makoto; Yamaga, Hisaki 
Abstract:  This paper proposes a simple method to estimate a nonlinear function using only coarsely discrete explanatory variables in panel data. The basic premise is to distinguish carefully between two types of discrete variables by assuming that if the variable changes between two points in time, it increases (decreases) marginally from near the upper (lower) bound one rank below (above). The dynamic pricing behavior at the boundary between two consecutive ranks is then properly approximated. Applying the proposed method, we estimate the nonlinear relationship between land prices and earthquake risk, with the latter being assessed over only five ranks. The panel datasets used comprise some two thousand fixed places over time in the Tokyo Metropolitan District. We interpret the estimated nonlinear land pricing functions using prospect theory from behavioral economics. 
JEL:  R14 R30 D91 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:hit:hituec:729&r= 