
on Econometrics 
By:  Xingyu Li; Yan Shen; Qiankun Zhou 
Abstract:  We consider the construction of confidence intervals for treatment effects estimated using panel models with interactive fixed effects. We first use the factor-based matrix completion technique proposed by Bai and Ng (2021) to estimate the treatment effects, and then use a bootstrap method to construct confidence intervals of the treatment effects for treated units at each post-treatment period. Our construction of confidence intervals requires neither specific distributional assumptions on the error terms nor a large number of post-treatment periods. We also establish the validity of the proposed bootstrap procedure by showing that these confidence intervals have asymptotically correct coverage probabilities. Simulation studies show that these confidence intervals have satisfactory finite-sample performance, and empirical applications using classical datasets yield treatment effect estimates of similar magnitudes and reliable confidence intervals. 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2202.12078&r= 
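The estimation-plus-bootstrap pipeline described in the abstract can be sketched in stylized form. This is a minimal illustration under strong simplifying assumptions (a constant effect, factors estimated by plain PCA on the control units rather than the actual Bai and Ng matrix completion, and a naive residual bootstrap), not the paper's procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical factor-model panel: N0 control units plus one treated unit,
# T0 pre-treatment and T1 post-treatment periods, constant effect delta.
N0, T0, T1, r, delta = 40, 60, 10, 2, 2.0
T = T0 + T1
F = rng.normal(size=(T, r))                    # latent factors
L = rng.normal(size=(N0 + 1, r))               # loadings (last row = treated unit)
Y = L @ F.T + 0.5 * rng.normal(size=(N0 + 1, T))
Y[-1, T0:] += delta                            # treatment effect in post periods

def factor_counterfactual(Y, T0, r):
    """PCA factors from the control units, then a pre-period regression for the
    treated unit: a simplification of Bai-Ng style matrix completion."""
    _, _, Vt = np.linalg.svd(Y[:-1], full_matrices=False)
    Fhat = Vt[:r].T
    lam, *_ = np.linalg.lstsq(Fhat[:T0], Y[-1, :T0], rcond=None)
    return Fhat @ lam                          # counterfactual for all periods

fit = factor_counterfactual(Y, T0, r)
effects = Y[-1, T0:] - fit[T0:]                # per-period effect estimates

# Residual bootstrap over pre-treatment residuals gives per-period intervals.
resid = Y[-1, :T0] - fit[:T0]
boot = []
for _ in range(500):
    Yb = Y.copy()
    Yb[-1] = fit + rng.choice(resid, size=T, replace=True)
    Yb[-1, T0:] += effects                     # re-impose estimated effects
    fb = factor_counterfactual(Yb, T0, r)
    boot.append(Yb[-1, T0:] - fb[T0:])
lo, hi = np.percentile(np.array(boot), [2.5, 97.5], axis=0)
```

The per-period intervals `[lo, hi]` are the object of interest here; no distributional assumption on the errors is used beyond resampling.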
By:  Mikkel Plagborg-Møller (Princeton University); Christian K. Wolf (Princeton University) 
Abstract:  We prove that local projections (LPs) and Vector Autoregressions (VARs) estimate the same impulse responses. This nonparametric result only requires unrestricted lag structures. We discuss several implications: (i) LP and VAR estimators are not conceptually separate procedures; rather, they are simply two dimension reduction techniques with a common estimand but different finite-sample properties. (ii) VAR-based structural identification – including short-run, long-run, or sign restrictions – can equivalently be performed using LPs, and vice versa. (iii) Structural estimation with an instrument (proxy) can be carried out by ordering the instrument first in a recursive VAR, even under non-invertibility. (iv) Linear VARs are as robust to nonlinearities as linear LPs. 
Keywords:  external instrument, impulse response function, local projection, proxy variable, structural vector autoregression 
JEL:  C32 C36 
Date:  2020–10 
URL:  http://d.repec.org/n?u=RePEc:pri:econom:202016&r= 
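The LP/VAR equivalence can be illustrated numerically in the simplest possible setting, a univariate AR(1), where the VAR-implied impulse response rho^h and the horizon-h local projection coefficient estimate the same quantity. The DGP and sample size below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a univariate AR(1): y_t = 0.5 * y_{t-1} + e_t (hypothetical DGP).
rho, T = 0.5, 100_000
e = rng.normal(size=T)
y = np.empty(T)
y[0] = e[0]
for t in range(1, T):
    y[t] = rho * y[t - 1] + e[t]

def ols_slope(x, z):
    """Slope of a no-constant-needed OLS of z on demeaned x."""
    x = x - x.mean()
    z = z - z.mean()
    return (x @ z) / (x @ x)

# VAR (here: AR) approach: estimate rho once, iterate it forward.
rho_hat = ols_slope(y[:-1], y[1:])
irf_var = [rho_hat ** h for h in range(5)]

# Local projections: one regression of y_{t+h} on y_t per horizon h.
irf_lp = [ols_slope(y[: T - h], y[h:]) for h in range(5)]
```

With unrestricted lag structures the two estimands coincide; in a sample this large the two estimated impulse responses agree to a couple of decimal places at every horizon.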
By:  Ivan Fernandez-Val; Wayne Yuan Gao; Yuan Liao; Francis Vella 
Abstract:  We consider estimation of a dynamic distribution regression panel data model with heterogeneous coefficients across units. The objects of interest are functionals of these coefficients, including linear projections on unit-level covariates. We also consider predicted actual and stationary distributions of the outcome variable. We investigate how changes in initial conditions or covariate values affect these objects. Coefficients and their functionals are estimated via fixed effect methods, which are debiased to deal with the incidental parameter problem. We propose a cross-sectional bootstrap method for uniformly valid inference on function-valued objects. This avoids coefficient re-estimation and is shown to be consistent for a large class of data generating processes. We employ PSID annual labor income data to illustrate various important empirical issues we can address. We first predict the impact of a reduction in income on future income via hypothetical tax policies. Second, we examine the impact on the distribution of labor income of increasing the education level of a chosen group of workers. Finally, we demonstrate the existence of heterogeneity in income mobility, which leads to substantial variation in individuals' chances of being trapped in poverty. We also provide simulation evidence confirming that our procedures work. 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2202.04154&r= 
By:  Liu, Yanbo (Shandong University); Phillips, Peter C. B. (Yale University); Yu, Jun (Singapore Management University) 
Abstract:  This study provides new mechanisms for identifying and estimating explosive bubbles in mixed-root panel autoregressions with a latent group structure. A post-clustering approach is employed that combines a recursive k-means clustering algorithm with panel-data test statistics for testing the presence of explosive roots in time series trajectories. Uniform consistency of the k-means clustering algorithm is established, showing that the post-clustering estimate is asymptotically equivalent to the oracle counterpart that uses the true group identities. Based on the estimated group membership, right-tailed self-normalized t-tests and coefficient-based J-tests, each with pivotal limit distributions, are introduced to detect the explosive roots. The usual Information Criterion (IC) for selecting the correct number of groups is found to be inconsistent, and a new method that combines IC with a Hausman-type specification test is proposed that consistently estimates the true number of groups. Extensive Monte Carlo simulations provide strong evidence that in finite samples, the recursive k-means clustering algorithm can correctly recover latent group membership in data of this type and the proposed post-clustering panel-data tests lead to substantial power gains compared with the time series approach. The proposed methods are used to identify bubble behavior in US and Chinese housing markets and the US stock market, leading to new findings concerning speculative behavior in these markets. 
Keywords:  Bubbles; Clustering; Mildly explosive behavior; k-means; Latent membership detection 
JEL:  C22 C33 C51 G01 
Date:  2022–02–15 
URL:  http://d.repec.org/n?u=RePEc:ris:smuesw:2022_001&r= 
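A stripped-down version of the post-clustering idea: estimate a unit-level autoregressive root, cluster the estimates with k-means, and flag the high-root cluster. The panel below is simulated with two hypothetical groups (a stationary root of 0.5 and a mildly explosive root of 1.03); the paper's recursive algorithm and test statistics are far more elaborate:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical mixed-root panel: units 0-39 stationary, units 40-59 explosive.
N, T = 60, 200
roots = np.where(np.arange(N) < 40, 0.5, 1.03)
Y = np.zeros((N, T))
for t in range(1, T):
    Y[:, t] = roots * Y[:, t - 1] + rng.normal(size=N)

# Unit-by-unit AR(1) OLS estimates of the root.
rho_hat = np.array([(y[:-1] @ y[1:]) / (y[:-1] @ y[:-1]) for y in Y])

def kmeans_1d(x, k=2, iters=50):
    """Plain 1-d k-means with spread-out initial centers."""
    centers = np.sort(x)[[len(x) // 4, 3 * len(x) // 4]]
    for _ in range(iters):
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        centers = np.array([x[labels == j].mean() for j in range(k)])
    return labels, centers

labels, centers = kmeans_1d(rho_hat)
explosive_group = centers.argmax()
flagged = np.where(labels == explosive_group)[0]  # candidate bubble units
```

Because explosive roots are estimated super-consistently, the two clusters separate sharply and the flagged set recovers the true explosive group; in the paper, formal right-tailed tests are then run within the estimated groups.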
By:  Chenchuan (Mark) Li (Princeton University); Ulrich K. Müller (Princeton University) 
Abstract:  We consider inference about a scalar coefficient in a linear regression model. One previously considered approach to dealing with many controls imposes sparsity, that is, it is assumed known that nearly all control coefficients are zero, or at least very nearly so. We instead impose a bound on the quadratic mean of the controls' effect on the dependent variable. We develop a simple inference procedure that exploits this additional information in general heteroskedastic models. We study its asymptotic efficiency properties and compare it to a sparsity-based approach in a Monte Carlo study. The method is illustrated in three empirical applications. 
Keywords:  high dimensional linear regression, limit of experiments, L2 bound, invariance to linear reparameterizations 
JEL:  C30 C39 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:pri:econom:202057&r= 
By:  Jackson Bunting 
Abstract:  In dynamic discrete choice (DDC) analysis, it is common to use finite mixture models to control for unobserved heterogeneity, that is, by assuming there is a finite number of agent `types'. However, consistent estimation typically requires both a priori knowledge of the number of agent types and a high-level injectivity condition that is difficult to verify. This paper provides low-level conditions for identification of continuous permanent unobserved heterogeneity in DDC models. The results apply to both finite- and infinite-horizon DDC models, require neither a full support assumption nor a large panel, and place no parametric restriction on the distribution of unobserved heterogeneity. Furthermore, I present a semi-nonparametric estimator that is computationally attractive and can be implemented using familiar parametric methods. Finally, in an empirical application, I apply this estimator to the labor force participation model of Altug and Miller (1998). In this model, permanent unobserved heterogeneity may be interpreted as individual-specific labor productivity, and my results imply that the distribution of labor productivity can be estimated from the participation model. 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2202.03960&r= 
By:  Astill, Sam; Harvey, David I; Leybourne, Stephen J; Taylor, AM Robert 
Abstract:  We develop tests for predictability that are robust to both the magnitude of the initial condition and the degree of persistence of the predictor. While the popular Bonferroni Q test of Campbell and Yogo (2006) displays excellent power properties for strongly persistent predictors with an asymptotically negligible initial condition, it can suffer from severe size distortions and power losses when either the initial condition is asymptotically non-negligible or the predictor is weakly persistent. The Bonferroni t-test of Cavanagh et al. (1995), although displaying power well below that of the Bonferroni Q test for strongly persistent predictors with an asymptotically negligible initial condition, displays superior size control and power when the initial condition is asymptotically non-negligible. In the case where the predictor is weakly persistent, a conventional regression t-test compared against standard normal quantiles is known to be asymptotically optimal under Gaussianity. Based on these properties, we propose two asymptotically size-controlled hybrid tests that are functions of the Bonferroni Q, Bonferroni t, and conventional t tests. Our proposed hybrid tests exhibit very good power regardless of the magnitude of the initial condition or the degree of persistence of the predictor. An empirical application to the data originally analysed by Campbell and Yogo (2006) shows our new hybrid tests are much more likely to find evidence of predictability than the Bonferroni Q test when the initial condition of the predictor is estimated to be large in magnitude. 
Keywords:  predictive regression; initial condition; unknown regressor persistence; Bonferroni tests; hybrid tests 
Date:  2022–03–03 
URL:  http://d.repec.org/n?u=RePEc:esy:uefcwp:32447&r= 
By:  Tommaso Mariotti; Fabrizio Lillo; Giacomo Toscano 
Abstract:  The estimation of volatility with high-frequency data is plagued by the presence of microstructure noise, which leads to biased measures. Alternative estimators have been developed and tested either on specific structures of the noise or by the speed of convergence to their asymptotic distributions. Gatheral and Oomen (2010) proposed to use the Zero-Intelligence model of the limit order book to test the finite-sample performance of several estimators of the integrated variance. Building on this approach, in this paper we introduce three main innovations: (i) we use as data-generating process the Queue-Reactive model of the limit order book (Huang et al. (2015)), which, compared to the Zero-Intelligence model, generates more realistic microstructure dynamics, as shown here by using a Hausman test; (ii) we consider estimators not only of the integrated volatility but also of the spot volatility; (iii) we show the relevance of the estimator in the prediction of the variance of the cost of a simulated VWAP execution. Overall we find that, for the integrated volatility, the pre-averaging estimator minimizes the estimation bias, while the unified and the alternation estimators lead to optimal mean squared error values. In contrast, for the spot volatility, the Fourier estimator yields the optimal accuracy, in terms of both bias and mean squared error. The latter estimator also leads to the optimal prediction of the cost variance of a VWAP execution. 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2202.12137&r= 
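A toy simulation shows why microstructure noise motivates estimators of this kind: realized variance computed on noisy prices is massively biased upward, while a crude pre-averaging step (block means before differencing, with a 3/2 correction for the induced smoothing of the Brownian increments) recovers the integrated variance. All parameter values are made up for illustration, and this is not the bias-corrected pre-averaging estimator used in the literature:

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulated trading day: efficient log-price is a Brownian motion with
# integrated variance IV; observations carry i.i.d. microstructure noise.
n, IV, omega = 23_400, 1e-4, 5e-4
dt = 1.0 / n
X = np.cumsum(rng.normal(0.0, np.sqrt(IV * dt), size=n))  # efficient price
Y = X + rng.normal(0.0, omega, size=n)                    # observed price

# Naive realized variance: biased upward by roughly 2 * n * omega^2.
rv_naive = np.sum(np.diff(Y) ** 2)

# Crude pre-averaging: average prices within blocks of K observations before
# differencing, shrinking the noise contribution by a factor of K^2; the 3/2
# factor undoes the attenuation of the signal caused by block averaging.
K = 100
means = Y[: n - n % K].reshape(-1, K).mean(axis=1)
iv_pre = 1.5 * np.sum(np.diff(means) ** 2)
```

Here `rv_naive` is dominated by noise (two orders of magnitude above IV), whereas `iv_pre` lands close to the true integrated variance.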
By:  Clément de Chaisemartin; Xavier D'Haultfoeuille 
Abstract:  We study regressions with period and group fixed effects and several treatment variables. Under a parallel trends assumption, the coefficient on each treatment identifies the sum of two terms. The first term is a weighted sum of the effect of that treatment in each group and period, with weights that may be negative and sum to one. The second term is a sum of the effects of the other treatments, with weights summing to zero. Accordingly, coefficients in those regressions are not robust to heterogeneous effects, and may be contaminated by the effect of other treatments. We propose alternative difference-in-differences estimators. To estimate, say, the effect of the first treatment, our estimators compare the outcome evolution of a group whose first treatment changes while its other treatments remain unchanged to control groups whose treatments all remain unchanged and which have the same baseline treatments or treatment history as the switching group. Those carefully selected comparisons are robust to heterogeneous effects and do not suffer from the contamination problem. 
JEL:  C21 C23 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:29734&r= 
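The comparison logic of the proposed estimators can be sketched on a toy two-period, two-treatment panel: the switcher's outcome evolution is compared only to groups whose treatments all stay unchanged and whose baseline treatments match. The data and group names below are hypothetical:

```python
import pandas as pd

# Toy panel: group g1 switches treatment 1 on in period 1; g2 matches g1's
# baseline treatments and stays untreated; g3 is stable but has a different
# baseline (d1 = 1), so it is excluded from the comparison.
df = pd.DataFrame({
    "group":  ["g1", "g1", "g2", "g2", "g3", "g3"],
    "period": [0, 1, 0, 1, 0, 1],
    "d1":     [0, 1, 0, 0, 1, 1],   # treatment 1
    "d2":     [0, 0, 0, 0, 1, 1],   # treatment 2: unchanged everywhere
    "y":      [1.0, 3.5, 1.2, 1.7, 2.0, 2.4],
})

wide = df.pivot(index="group", columns="period")
switcher = "g1"
base_d1 = wide.loc[switcher, ("d1", 0)]
base_d2 = wide.loc[switcher, ("d2", 0)]

# Valid controls: no change in either treatment, same baseline treatments.
stable = (wide[("d1", 0)] == wide[("d1", 1)]) & (wide[("d2", 0)] == wide[("d2", 1)])
same_base = (wide[("d1", 0)] == base_d1) & (wide[("d2", 0)] == base_d2)
controls = wide[stable & same_base & (wide.index != switcher)]

# Difference-in-differences restricted to those carefully selected controls.
did = (wide.loc[switcher, ("y", 1)] - wide.loc[switcher, ("y", 0)]) \
      - (controls[("y", 1)] - controls[("y", 0)]).mean()
```

Only `g2` survives the selection, so the estimate is (3.5 - 1.0) - (1.7 - 1.2) = 2.0; including `g3` would contaminate the comparison with the other treatment's dynamics.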
By:  Magne Mogstad; Joseph P. Romano; Azeem Shaikh; Daniel Wilhelm 
Abstract:  Economists are obsessed with rankings of institutions, journals, or scholars according to the value of some feature of interest. These rankings are invariably computed using estimates rather than the true values of such features. As a result, there may be considerable uncertainty concerning the ranks. In this paper, we consider the problem of accounting for such uncertainty by constructing confidence sets for the ranks. We consider both the problem of constructing marginal confidence sets for the rank of, say, a particular journal as well as simultaneous confidence sets for the ranks of all journals. We apply these confidence sets to draw inferences about uncertainty in the ranking of economics journals and universities by impact factors. 
JEL:  A0 C12 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:29768&r= 
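A minimal sketch of a marginal confidence set for a rank, using a parametric bootstrap from hypothetical impact-factor estimates and standard errors. The paper's construction differs; this only conveys the idea that near-ties make ranks uncertain:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical impact-factor estimates and standard errors for five journals.
names = ["A", "B", "C", "D", "E"]
est = np.array([2.9, 2.7, 2.0, 1.2, 1.1])
se = np.array([0.15, 0.15, 0.10, 0.20, 0.20])

def rank_of(values, j):
    """Rank of item j (1 = largest value)."""
    return 1 + int((values > values[j]).sum())

def rank_confidence_set(j, level=0.95, B=5000):
    """Redraw estimates from N(est, se^2) and keep the most frequent ranks of
    item j until their bootstrap mass reaches the target level."""
    ranks = np.array([rank_of(rng.normal(est, se), j) for _ in range(B)])
    counts = np.bincount(ranks, minlength=len(est) + 1)
    keep, mass = [], 0
    for r in np.argsort(counts)[::-1]:        # most frequent ranks first
        if mass >= level * B:
            break
        keep.append(int(r))
        mass += counts[r]
    return sorted(keep)

cs_A = rank_confidence_set(0)   # A and B are nearly tied
cs_C = rank_confidence_set(2)   # C is well separated from its neighbors
```

Journal A's set is {1, 2} because B is within sampling noise of it, while journal C's set collapses to {3}: exactly the kind of rank uncertainty the paper's confidence sets quantify.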
By:  Ulrich K. Müller (Princeton University); Mark W. Watson (Princeton University) 
Abstract:  This chapter discusses econometric methods for studying low-frequency variation and covariation in economic time series. We use the term low-frequency for dynamics over time spans that are a non-negligible fraction of the sample period. For example, when studying 70 years of post-WWII quarterly data, decadal variation is low-frequency, and when studying a decade of daily return data, yearly variation is low-frequency. Much of this chapter is organized around a set of empirical exercises that feature questions about low-frequency variability and covariability, and there is no better way to introduce the topics to be covered than to look at the data featured in these exercises. 
Keywords:  Econometrics 
JEL:  C01 C10 
Date:  2020–09 
URL:  http://d.repec.org/n?u=RePEc:pri:econom:202013&r= 
By:  Chen, Zezhun; Dassios, Angelos 
Abstract:  In this paper, we consider Poisson-thinning integer-valued time series models, namely the integer-valued moving average (INMA) model and the integer-valued autoregressive moving average (INARMA) model, and their relationship with cluster point processes, the Cox point process and the dynamic contagion process. We derive the probability generating functionals of INARMA models and compare them to those of cluster point processes. The main aim of this paper is to prove that, under a specific parametric setting, INMA and INARMA models are just discrete versions of continuous cluster point processes and hence converge weakly to them as the length of the subintervals goes to zero. 
Keywords:  Stochastic intensity model; dynamic contagion process; integer-valued time series; Poisson thinning 
JEL:  C1 
Date:  2022–02–10 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:113652&r= 
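The Poisson-thinning operator at the heart of these models is easy to simulate. The sketch below builds an INMA(1) process with Poisson innovations and checks its stationary mean, lam * (1 + alpha); the parameter values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)

def thin(alpha, n):
    """Binomial thinning: alpha ∘ n is the sum of n Bernoulli(alpha) draws."""
    return rng.binomial(n, alpha)

def simulate_inma1(alpha, lam, T):
    """INMA(1) with Poisson(lam) innovations: X_t = alpha ∘ eps_{t-1} + eps_t.
    Its stationary mean is lam * (1 + alpha)."""
    eps = rng.poisson(lam, size=T + 1)
    return np.array([thin(alpha, eps[t - 1]) + eps[t] for t in range(1, T + 1)])

alpha, lam, T = 0.4, 3.0, 100_000
x = simulate_inma1(alpha, lam, T)
# the sample mean should be close to lam * (1 + alpha) = 4.2
```

Because thinning maps counts to counts, every realization stays a non-negative integer, which is exactly what makes these models discrete analogues of the cluster point processes discussed in the abstract.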
By:  Elena Ivona Dumitrescu (EconomiX - UPN - Université Paris Nanterre - CNRS - Centre National de la Recherche Scientifique); Sullivan Hué (LEO - Laboratoire d'Économie d'Orléans - UO - Université d'Orléans - UT - Université de Tours, AMSE - Aix-Marseille Sciences Économiques - EHESS - École des hautes études en sciences sociales - AMU - Aix-Marseille Université - ECM - École Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique); Christophe Hurlin (LEO - Laboratoire d'Économie d'Orléans - UO - Université d'Orléans - UT - Université de Tours); Sessi Tokpavi (LEO - Laboratoire d'Économie d'Orléans - UO - Université d'Orléans - UT - Université de Tours) 
Abstract:  In the context of credit scoring, ensemble methods based on decision trees, such as the random forest method, provide better classification performance than standard logistic regression models. However, logistic regression remains the benchmark in the credit risk industry mainly because the lack of interpretability of ensemble methods is incompatible with the requirements of financial regulators. In this paper, we propose a high-performance and interpretable credit scoring method called penalised logistic tree regression (PLTR), which uses information from decision trees to improve the performance of logistic regression. Formally, rules extracted from various short-depth decision trees built with the original predictive variables are used as predictors in a penalised logistic regression model. PLTR allows us to capture nonlinear effects that can arise in credit scoring data while preserving the intrinsic interpretability of the logistic regression model. Monte Carlo simulations and empirical applications using four real credit default datasets show that PLTR predicts credit risk significantly more accurately than logistic regression and compares competitively to the random forest method. 
Keywords:  Risk management, Credit scoring, Machine learning, Interpretability, Econometrics 
Date:  2022–03–16 
URL:  http://d.repec.org/n?u=RePEc:hal:journl:hal03331114&r= 
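The two-step structure of PLTR (tree-derived rules as inputs to a penalised logit) can be sketched with scikit-learn on synthetic data. The data-generating process, tree depth, and penalty level here are arbitrary illustrative choices, not those of the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)

# Hypothetical credit data (not the paper's datasets): default is driven by a
# non-monotonic rule on the first score, which a linear logit cannot represent.
n = 4000
X = rng.normal(size=(n, 2))
y = ((np.abs(X[:, 0]) > 1.0) ^ (rng.random(n) < 0.05)).astype(int)  # 5% noise
Xtr, Xte, ytr, yte = X[:3000], X[3000:], y[:3000], y[3000:]

# Step 1: a short-depth tree learns threshold rules from the raw predictors.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(Xtr, ytr)

# Step 2: leaf-membership dummies enter a penalised (L1) logistic regression,
# keeping the final model a plain, interpretable logit over simple rules.
enc = OneHotEncoder(handle_unknown="ignore")
Ztr = enc.fit_transform(tree.apply(Xtr).reshape(-1, 1))
Zte = enc.transform(tree.apply(Xte).reshape(-1, 1))
pltr = LogisticRegression(penalty="l1", C=1.0, solver="liblinear").fit(Ztr, ytr)

plain = LogisticRegression().fit(Xtr, ytr)       # benchmark linear logit
acc_pltr = pltr.score(Zte, yte)
acc_plain = plain.score(Xte, yte)
```

On this interaction-free but non-monotonic design, the rule-based logit captures the threshold effect while the plain logit cannot do better than the majority class, which is the gap PLTR is designed to close.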
By:  Christian Bontemps (ENAC - École Nationale de l'Aviation Civile, TSE - Toulouse School of Economics - UT1 - Université Toulouse 1 Capitole - Université Fédérale Toulouse Midi-Pyrénées - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement); Raquel Menezes Bezerra Sampaio (UFRN - Universidade Federal do Rio Grande do Norte [Natal]) 
Abstract:  In this paper we review the literature on static entry games and show how they can be used to estimate the market structure of the airline industry. The econometric challenges are presented, in particular the problem of multiple equilibria, and some solutions used in the literature are discussed. We also show how these models, whether in the complete information setting or the incomplete information one, can be estimated from i.i.d. data on market presence and market characteristics. We illustrate this by estimating a static entry game with heterogeneous firms by Simulated Maximum Likelihood on European data for the year 2015. 
Keywords:  Estimation, Airlines, Multiple equilibria, Entry, Industrial organization 
Date:  2020–12 
URL:  http://d.repec.org/n?u=RePEc:hal:journl:hal02137358&r= 
By:  Philippe Goulet Coulombe 
Abstract:  Many problems plague the estimation of Phillips curves. Among them is the hurdle that the two key components, inflation expectations and the output gap, are both unobserved. Traditional remedies include creating reasonable proxies for the notable absentees or extracting them via some form of assumptions-heavy filtering procedure. I propose an alternative route: a Hemisphere Neural Network (HNN) whose peculiar architecture yields a final layer where components can be interpreted as latent states within a Neural Phillips Curve. There are benefits. First, HNN conducts the supervised estimation of nonlinearities that arise when translating a high-dimensional set of observed regressors into latent states. Second, computations are fast. Third, forecasts are economically interpretable. Fourth, inflation volatility can also be predicted by merely adding a hemisphere to the model. Among other findings, the contribution of real activity to inflation appears severely underestimated in traditional econometric specifications. Also, HNN captures out-of-sample the 2021 upswing in inflation and attributes it first to an abrupt and sizable disanchoring of the expectations component, followed by a wildly positive gap starting from late 2020. HNN's unique gap path comes from dispensing with unemployment and GDP in favor of an amalgam of nonlinearly processed alternative tightness indicators, some of which are skyrocketing as of early 2022. 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2202.04146&r= 
By:  Narayanaswamy Balakrishnan (McMaster University); Efe A. Ok (New York University); Pietro Ortoleva (Princeton University) 
Abstract:  Despite being the fundamental primitive of the study of decision-making in economics, choice correspondences are not observable: even for a single menu of options, we observe at most one choice of an individual at a given point in time, as opposed to the set of all choices she deems most desirable in that menu. However, it may be possible to observe a person choose from a feasible menu at various times, repeatedly. We propose a method of inferring the choice correspondence of an individual from this sort of choice data. First, we derive our method axiomatically, assuming an ideal dataset. Next, we develop statistical techniques to implement this method for real-world situations where the sample at hand is often fairly small. As an application, we use the data of two famed choice experiments from the literature to infer the choice correspondences of the participating subjects. 
Keywords:  Choice Correspondences, Estimation, Stochastic Choice Functions, Transitivity of Preferences 
JEL:  C81 D11 D12 D81 
Date:  2021–02 
URL:  http://d.repec.org/n?u=RePEc:pri:econom:202160&r= 