
on Econometrics 
By:  Harold D Chiang; Yukitoshi Matsushita; Taisuke Otsu 
Abstract:  This paper is concerned with estimation and inference on average treatment effects in randomized controlled trials when researchers observe potentially many covariates. By employing Neyman's (1923) finite population perspective, we propose a bias-corrected regression adjustment estimator using cross-fitting, and show that the proposed estimator has favorable properties over existing alternatives. For inference, we derive the first and second order terms in the stochastic component of the regression adjustment estimators, study higher order properties of the existing inference methods, and propose a bias-corrected version of the HC3 standard error. Simulation studies show our cross-fitted estimator, combined with the bias-corrected HC3, delivers precise point estimates and robust size control over a wide range of DGPs. To illustrate, the proposed methods are applied to a real dataset on randomized experiments of incentives and services for college achievement following Angrist, Lang, and Oreopoulos (2009). 
Keywords:  Randomized controlled trials, regression adjustment, many covariates 
JEL:  C14 
Date:  2023–02 
URL:  http://d.repec.org/n?u=RePEc:cep:stiecm:627&r=ecm 
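The HC3 standard error that the bias correction above targets has a simple closed form: inflate each squared OLS residual by its leverage, e_i^2 / (1 - h_ii)^2, before forming the sandwich variance. A minimal sketch of the plain (uncorrected) HC3 estimator, not the paper's bias-corrected variant:

```python
import numpy as np

def hc3_se(X, y):
    """OLS point estimates with HC3 (leverage-adjusted) robust standard errors.

    Illustrative sketch of the classic HC3 estimator (MacKinnon-White),
    not the bias-corrected variant proposed in the paper.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                                  # OLS residuals
    h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)       # leverages h_ii
    omega = (e / (1.0 - h)) ** 2                      # e_i^2 / (1 - h_ii)^2
    V = XtX_inv @ (X.T * omega) @ X @ XtX_inv         # sandwich variance
    return beta, np.sqrt(np.diag(V))
```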
By:  Alain Hecq; Luca Margaritella; Stephan Smeekes 
Abstract:  In this paper we construct an inferential procedure for Granger causality in high-dimensional nonstationary vector autoregressive (VAR) models. Our method does not require knowledge of the order of integration of the time series under consideration. We augment the VAR with at least as many lags as the suspected maximum order of integration, an approach which has been proven to be robust against the presence of unit roots in low dimensions. We prove that we can restrict the augmentation to only the variables of interest for the testing, thereby making the approach suitable for high dimensions. We combine this lag augmentation with a post-double-selection procedure in which a set of initial penalized regressions is performed to select the relevant variables for both the Granger causing and caused variables. We then establish uniform asymptotic normality of a second-stage regression involving only the selected variables. Finite sample simulations show good performance, and an application to investigate the (predictive) causes and effects of economic uncertainty illustrates the need to allow for unknown orders of integration. 
Date:  2023–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2302.01434&r=ecm 
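The lag-augmentation idea, estimating a VAR with extra lags but testing only the original ones, can be sketched in a bivariate setting. This is an illustrative Toda-Yamamoto-style implementation, not the authors' high-dimensional post-double-selection procedure:

```python
import numpy as np

def lag_augmented_wald(y, x, p=1, d_max=1):
    """Wald statistic for 'x does not Granger-cause y' with lag augmentation.

    Regress y_t on p + d_max lags of y and x, but test only the first p lags
    of x; the untested augmentation lags absorb possible unit roots.
    Hypothetical bivariate sketch, not the paper's high-dimensional method.
    """
    T, L = len(y), p + d_max
    rows = []
    for t in range(L, T):
        row = [1.0]
        row += [y[t - j] for j in range(1, L + 1)]
        row += [x[t - j] for j in range(1, L + 1)]
        rows.append(row)
    Z = np.array(rows)
    yy = y[L:]
    b = np.linalg.lstsq(Z, yy, rcond=None)[0]
    e = yy - Z @ b
    s2 = e @ e / (len(yy) - Z.shape[1])
    V = s2 * np.linalg.inv(Z.T @ Z)
    idx = [1 + L + j for j in range(p)]              # first p lags of x
    bx, Vx = b[idx], V[np.ix_(idx, idx)]
    return float(bx @ np.linalg.solve(Vx, bx))       # ~ chi2(p) under H0
```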
By:  Liudas Giraitis; Yufei Li; Peter C.B. Phillips (Cowles Foundation, Yale University) 
Abstract:  Considerable evidence in past research shows size distortion in standard tests for zero autocorrelation or cross-correlation when time series are not independent identically distributed random variables, pointing to the need for more robust procedures. Recent tests for serial correlation and cross-correlation in Dalla, Giraitis, and Phillips (2022) provide a more robust approach, allowing for heteroskedasticity and dependence in uncorrelated data under restrictions that require a smooth, slowly evolving deterministic heteroskedasticity process. The present work removes those restrictions and validates the robust testing methodology for a wider class of heteroskedastic time series models and innovations. The updated analysis given here enables more extensive use of the methodology in practical applications. Monte Carlo experiments confirm excellent finite sample performance of the robust test procedures even for extremely complex white noise processes. The empirical examples show that use of robust testing methods can materially reduce spurious evidence of correlations found by standard testing procedures. 
Date:  2023–02 
URL:  http://d.repec.org/n?u=RePEc:cwl:cwldpp:2354&r=ecm 
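Self-normalized statistics of the kind used in this literature are easy to compute: standardize the sample autocovariance at lag k by sqrt(sum e_t^2 e_{t-k}^2), which keeps the statistic approximately N(0, 1) under heteroskedasticity. An illustrative minimal version in this spirit, not the authors' code:

```python
import numpy as np

def robust_corr_t(e, k=1):
    """Heteroskedasticity-robust t-statistic for zero autocorrelation at lag k.

    Self-normalized form t_k = sum(e_t e_{t-k}) / sqrt(sum(e_t^2 e_{t-k}^2)),
    approximately N(0, 1) under uncorrelatedness. Illustrative sketch in the
    spirit of the robust tests discussed above, not the authors' procedure.
    """
    a, b = e[k:], e[:-k]
    return float(np.sum(a * b) / np.sqrt(np.sum(a ** 2 * b ** 2)))
```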
By:  Pesaran, M. H.; Smith, R. P. 
Abstract:  This paper examines the role of pricing errors in linear factor pricing models, allowing for observed strong and semi-strong factors, and latent weak factors. It focusses on the estimation of Φk = λk − μk, which plays a pivotal role not only in the estimation of risk premia but also in tests of market efficiency, where λk and μk are, respectively, the risk premium and the mean of the kth risk factor. It proposes a two-step estimator of Φk with a Shanken-type bias correction, and derives its asymptotic distribution under a general setting that allows for idiosyncratic pricing errors, weak missing factors, as well as weak error cross-sectional dependence. The implications of semi-strong factors for the asymptotic distribution of the proposed estimator are also investigated. Small sample results from extensive Monte Carlo experiments show that the proposed estimator has the correct size with good power properties. The paper also provides an empirical application to a large number of U.S. securities with risk factors selected from a large number of potential risk factors according to their strength. 
Keywords:  Factor strength, pricing errors, risk premia, missing factors, Fama-French factors, panel R2 
JEL:  C38 G10 
Date:  2023–02–13 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:2317&r=ecm 
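The baseline two-pass logic behind Φk = λk − μk can be sketched directly: time-series regressions give betas, a cross-sectional regression of mean returns on betas gives λ, and subtracting the factor mean gives Φ. The sketch below omits the paper's Shanken-type bias correction:

```python
import numpy as np

def two_pass_phi(R, F):
    """Two-pass estimate of Phi_k = lambda_k - mu_k.

    R: T x N excess returns, F: T x K factors. Illustrative sketch without
    the Shanken-type bias correction proposed in the paper.
    """
    T, N = R.shape
    # first pass: time-series regressions give a K x N matrix of betas
    B = np.linalg.lstsq(np.column_stack([np.ones(T), F]), R, rcond=None)[0][1:]
    # second pass: cross-sectional regression of mean returns on betas
    X = np.column_stack([np.ones(N), B.T])
    lam = np.linalg.lstsq(X, R.mean(axis=0), rcond=None)[0][1:]
    return lam - F.mean(axis=0)                     # Phi_k = lambda_k - mu_k
```

For a tradable factor, λk equals μk in population, so Φk estimated on simulated data from such a model should be near zero.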
By:  Nicholas Brown (Queen's University) 
Abstract:  I consider linear panel data models with unobserved factor structures when the number of time periods is small relative to the number of cross-sectional units. I examine two popular methods of estimation: the first eliminates the factors with a parameterized quasi-long-differencing (QLD) transformation. The other, referred to as common correlated effects (CCE), uses the cross-sectional averages of the independent and response variables to project out the space spanned by the factors. I show that the classical CCE assumptions imply unused moment conditions that can be exploited by the QLD transformation to derive new linear estimators, which weaken identifying assumptions and have desirable theoretical properties. I prove asymptotic normality of the linear QLD estimators under a heterogeneous slope model that allows for a trade-off between identifying conditions. These estimators do not require the number of independent variables to be less than the number of time periods minus one, a strong restriction when the number of time periods is fixed in the asymptotic analysis. Finally, I investigate the effects of per-student expenditure on standardized test performance using data from the state of Michigan. 
Keywords:  factor models, common correlated effects, quasi-long-differencing, fixed effects, correlated random coefficients 
JEL:  C36 C38 
Date:  2023–02 
URL:  http://d.repec.org/n?u=RePEc:qed:wpaper:1498&r=ecm 
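The CCE side of the comparison is straightforward to implement: augment each unit's regression with cross-sectional averages of the dependent and independent variables, which proxy the latent factors. A minimal pooled-CCE sketch, not the QLD estimators developed in the paper:

```python
import numpy as np

def cce_pooled(Y, X):
    """Pooled Common Correlated Effects estimator (Pesaran-style sketch).

    Y: T x N, X: T x N x K. Each unit's regression is augmented with the
    cross-sectional averages of y and x, which proxy the latent factors.
    Illustrative only, not the QLD estimators developed in the paper.
    """
    T, N, K = X.shape
    zbar = np.column_stack([np.ones(T), Y.mean(axis=1), X.mean(axis=1)])
    M = np.eye(T) - zbar @ np.linalg.pinv(zbar)     # annihilator of averages
    A, rhs = np.zeros((K, K)), np.zeros(K)
    for i in range(N):
        Xi = M @ X[:, i, :]
        A += Xi.T @ Xi
        rhs += Xi.T @ (M @ Y[:, i])
    return np.linalg.solve(A, rhs)
```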
By:  Chronopoulos, Ilias; Raftapostolos, Aristeidis; Kapetanios, George 
Abstract:  In this paper we use a deep quantile estimator, based on neural networks and their universal approximation property, to examine a nonlinear association between the conditional quantiles of a dependent variable and predictors. This methodology is versatile and allows both the use of different penalty functions and high-dimensional covariates. We present a Monte Carlo exercise where we examine the finite sample properties of the deep quantile estimator and show that it delivers good finite sample performance. We use the deep quantile estimator to forecast Value-at-Risk and find significant gains over linear quantile regression alternatives and other models, which are supported by various testing schemes. Further, we also consider an alternative architecture that allows the use of mixed frequency data in neural networks. This paper also contributes to the interpretability of neural network output by comparing the commonly used SHAP values with an alternative method based on partial derivatives. 
Keywords:  Quantile regression, machine learning, neural networks, Value-at-Risk, forecasting 
Date:  2023–02–07 
URL:  http://d.repec.org/n?u=RePEc:esy:uefcwp:34837&r=ecm 
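Quantile estimators of this kind are trained by minimizing the pinball (check) loss. The snippet below shows the loss and a linear quantile regression fitted by subgradient descent; it is a deliberately simple stand-in for the deep estimator (replace the linear index with a neural network and the same loss applies):

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0}), averaged."""
    u = y - q
    return np.mean(u * (tau - (u < 0)))

def linear_quantile_fit(X, y, tau, lr=0.05, steps=2000):
    """Linear quantile regression by subgradient descent.

    A simple stand-in for the deep quantile estimator described above:
    swap X @ b for a neural network and train with the same loss.
    """
    n, k = X.shape
    b = np.zeros(k)
    for _ in range(steps):
        u = y - X @ b
        b -= lr * (-X.T @ (tau - (u < 0)) / n)      # subgradient step
    return b
```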
By:  Jan Pr\"user; Florian Huber 
Abstract:  Modeling and predicting extreme movements in GDP is notoriously difficult, and the selection of appropriate covariates and/or possible forms of nonlinearities is key to obtaining precise forecasts. In this paper, our focus is on using large datasets in quantile regression models to forecast the conditional distribution of US GDP growth. To capture possible nonlinearities we include several nonlinear specifications. The resulting models are high-dimensional, and we thus rely on a set of shrinkage priors. Since Markov chain Monte Carlo estimation becomes slow in these dimensions, we rely on fast variational Bayes approximations to the posterior distribution of the coefficients and the latent states. We find that our proposed set of models produces precise forecasts. These gains are especially pronounced in the tails. Using Gaussian processes to approximate the nonlinear component of the model further improves the good performance in the tails. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.13604&r=ecm 
By:  Alain Hecq; Marie Ternes; Ines Wilms 
Abstract:  Reverse Unrestricted MIxed DAta Sampling (RU-MIDAS) regressions are used to model high-frequency responses by means of low-frequency variables. However, due to the periodic structure of RU-MIDAS regressions, the dimensionality grows quickly if the frequency mismatch between the high- and low-frequency variables is large. Additionally, the number of high-frequency observations available for estimation decreases. We propose to counteract this reduction in sample size by pooling the high-frequency coefficients and to further reduce the dimensionality through a sparsity-inducing convex regularizer that accounts for the temporal ordering among the different lags. To this end, the regularizer prioritizes the inclusion of lagged coefficients according to the recency of the information they contain. We demonstrate the proposed method on an empirical application for daily realized volatility forecasting, where we explore whether modeling high-frequency volatility data in terms of low-frequency macroeconomic data pays off. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.10592&r=ecm 
By:  Cheng Peng; Stanislav Uryasev 
Abstract:  This paper considers the problem of estimating the distribution of a response variable conditioned on observing some factors. Existing approaches are often deficient in one of the qualities of flexibility, interpretability and tractability. We propose a model that possesses these desirable properties. The proposed model, analogous to classic mixture regression models, models the conditional quantile function as a mixture (weighted sum) of basis quantile functions, with the weight of each basis quantile function being a function of the factors. The model can approximate any bounded conditional quantile model. It has a factor model structure with a closed-form expression. The calibration problem is formulated as convex optimization, which can be viewed as conducting quantile regressions at all confidence levels simultaneously, and it does not suffer from quantile crossing by design. The calibration is equivalent to minimization of the Continuous Ranked Probability Score (CRPS). We prove the asymptotic normality of the estimator. Additionally, based on the risk quadrangle framework, we generalize the proposed approach to conditional distributions defined by Conditional Value-at-Risk (CVaR), expectile and other functions of uncertainty measures. Based on the CP decomposition of tensors, we propose a dimensionality reduction method that reduces the rank of the parameter tensor, together with an alternating algorithm for estimating the parameter tensor. Our numerical experiments demonstrate the efficiency of the approach. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.13843&r=ecm 
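The link between all-level quantile regression and CRPS minimization rests on the identity CRPS = 2 * integral over tau in (0, 1) of rho_tau(y - q(tau)), where rho_tau is the pinball loss. A short sketch that checks the identity numerically on a grid of quantile levels:

```python
import numpy as np

def crps_from_quantiles(q_fn, y, taus):
    """Approximate CRPS via the quantile (pinball) representation.

    Uses CRPS = 2 * integral_0^1 rho_tau(y - q(tau)) d tau, discretized on
    a grid of quantile levels; illustrative check of the link noted above.
    """
    u = y - q_fn(taus)
    return 2.0 * np.mean(u * (taus - (u < 0)))
```

For a Uniform(0, 1) predictive distribution, q(tau) = tau, and the CRPS at observation y is y^3/3 + (1 - y)^3/3, which the grid approximation should reproduce.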
By:  Rainer Winkelmann 
Abstract:  This paper explores an algebraic relationship between two types of coefficients for a regression with several predictors and an additive binary group variable. In the general regression, the coefficients are allowed to be group-specific; the restricted regression imposes constant coefficients. The key result is that the restricted coefficients imposing homogeneity are not necessarily a convex average of the unrestricted coefficients obtained from the more general regression. In the context of treatment effect estimation with several treatment arms and group-level controls, this means that the estimated effect of a specific treatment can be nonzero, and statistically significant, even if the estimated unrestricted effects are zero in each group. 
Keywords:  Ordinary least squares, subsample heterogeneity, variance-weighting, average treatment effect 
JEL:  C21 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:zur:econwp:426&r=ecm 
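The phenomenon is easy to reproduce numerically: construct two groups in which the coefficient on x1 is exactly zero, yet the pooled regression with a group dummy yields a large nonzero coefficient on x1 because the two regressors covary differently within each group. The construction below is our own illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Group 1: y = 1 * x2, and x1 nearly collinear with x2
x2a = rng.normal(size=n); x1a = x2a + 0.1 * rng.normal(size=n); ya = x2a
# Group 2: y = 3 * x2, and x1 unrelated to x2
x2b = rng.normal(size=n); x1b = rng.normal(size=n); yb = 3.0 * x2b

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Unrestricted, group-specific regressions: coefficient on x1 is 0 in BOTH groups
b_a = ols(np.column_stack([np.ones(n), x1a, x2a]), ya)
b_b = ols(np.column_stack([np.ones(n), x1b, x2b]), yb)

# Restricted (pooled) regression with a group dummy: coefficient on x1 is far from 0
D = np.concatenate([np.zeros(n), np.ones(n)])
X = np.column_stack([np.ones(2 * n), D,
                     np.concatenate([x1a, x1b]), np.concatenate([x2a, x2b])])
b_pool = ols(X, np.concatenate([ya, yb]))
```

Here the pooled coefficient on x1 is clearly negative even though the group-specific coefficients are exactly zero, so it cannot be a convex average of them.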
By:  Hongjian Shi; Mathias Drton; Marc Hallin; Fang Han 
Abstract:  Defining multivariate generalizations of the classical univariate ranks has been a longstanding open problem in statistics. Optimal transport has been shown to offer a solution in which multivariate ranks are obtained by transporting data points to a grid that approximates a uniform reference measure (Chernozhukov et al., 2017; Hallin, 2017; Hallin et al., 2021). We take up this new perspective to develop and study multivariate analogues of the sign covariance/quadrant statistic, Kendall's tau, and Spearman's rho. The resulting tests of multivariate independence are genuinely distribution-free, hence uniformly valid irrespective of the actual (absolutely continuous) distributions of the observations. Our results provide asymptotic distribution theory for these new test statistics, with asymptotic approximations to critical values to be used for testing independence, as well as a power analysis of the resulting tests. This includes a multivariate elliptical Chernoff–Savage property, which guarantees that, under ellipticity, our nonparametric tests of independence enjoy an asymptotic relative efficiency of one or larger with respect to the classical Gaussian procedures. 
Keywords:  Semiparametrically efficient tests, multivariate independence, center-outward quadrant, Spearman and Kendall statistics 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:eca:wpaper:2013/355918&r=ecm 
By:  Harold D Chiang; Yuya Sasaki 
Abstract:  Thousands of papers have reported two-way cluster-robust (TWCR) standard errors. However, the recent econometrics literature points out the potential non-Gaussianity of two-way cluster sample means, and thus the invalidity of inference based on TWCR standard errors. Fortunately, simulation studies nonetheless show that Gaussianity is common rather than exceptional. This paper provides theoretical support for this encouraging observation. Specifically, we derive a novel central limit theorem for two-way clustered triangular arrays that justifies the use of TWCR standard errors under very mild and interpretable conditions. We therefore hope that this paper will provide a theoretical justification for the legitimacy of most, if not all, of the thousands of empirical papers that have used TWCR standard errors. We also provide practical guidance as to when a researcher can employ TWCR standard errors. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.13775&r=ecm 
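The TWCR variance itself is typically computed by inclusion-exclusion over the two cluster dimensions, V = V_g1 + V_g2 - V_g1xg2 (Cameron, Gelbach, and Miller). A textbook-style sketch; since the inclusion-exclusion difference need not be positive semi-definite, the sketch clips negative variance estimates at zero:

```python
import numpy as np

def _cluster_V(X, e, groups):
    """One-way cluster-robust sandwich variance for OLS."""
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        s = X[groups == g].T @ e[groups == g]
        meat += np.outer(s, s)
    return XtX_inv @ meat @ XtX_inv

def twcr_se(X, y, g1, g2):
    """Two-way cluster-robust SEs via inclusion-exclusion:
    V = V_{g1} + V_{g2} - V_{g1 x g2} (Cameron-Gelbach-Miller sketch).
    g1, g2 must be non-negative integer labels; negative variance
    estimates from the non-PSD difference are clipped at zero."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    inter = g1 * (np.max(g2) + 1) + g2              # intersection clusters
    V = _cluster_V(X, e, g1) + _cluster_V(X, e, g2) - _cluster_V(X, e, inter)
    return beta, np.sqrt(np.clip(np.diag(V), 0.0, None))
```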
By:  Jean-Jacques Forneron 
Abstract:  A practical challenge for structural estimation is the requirement to accurately minimize a sample objective function which is often nonsmooth, nonconvex, or both. This paper proposes a simple algorithm designed to find accurate solutions without performing an exhaustive search. It augments each iteration from a new Gauss-Newton algorithm with a grid search step. A finite sample analysis derives its optimization and statistical properties simultaneously using only econometric assumptions. After a finite number of iterations, the algorithm automatically transitions from global to fast local convergence, producing accurate estimates with high probability. Simulated examples and an empirical application illustrate the results. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.07196&r=ecm 
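The global-to-local idea, augmenting a local step with a grid-search step and keeping whichever candidate improves the objective, can be sketched generically. The code below is a stylized illustration with a scalar Newton-type local step, not the paper's algorithm or its theoretical tuning:

```python
import numpy as np

def local_step_with_grid(obj, step, theta0, grid, iters=20):
    """Each iteration augments a local (Newton/Gauss-Newton-type) step with
    a grid-search step, keeping whichever candidate lowers the objective.
    Stylized sketch of the global-to-local idea, not the paper's algorithm."""
    theta, best = theta0, obj(theta0)
    for _ in range(iters):
        for c in [step(theta)] + list(grid):
            v = obj(c)
            if v < best:
                best, theta = v, c
    return theta, best
```

On a double-well objective, a local step started in the wrong basin stalls at the local minimum, while the grid step jumps to the global basin and the local step then refines it.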
By:  Geert Dhaene; Martin Weidner 
Abstract:  Inference on common parameters in panel data models with individual-specific fixed effects is a classic example of Neyman and Scott's (1948) incidental parameter problem (IPP). One solution to this IPP is functional differencing (Bonhomme 2012), which works when the number of time periods T is fixed (and may be small), but this solution is not applicable to all panel data models of interest. Another solution, which applies to a larger class of models, is "large-T" bias correction (pioneered by Hahn and Kuersteiner 2002 and Hahn and Newey 2004), but this is only guaranteed to work well when T is sufficiently large. This paper provides a unified approach that connects those two seemingly disparate solutions to the IPP. In doing so, we provide an approximate version of functional differencing, that is, an approximate solution to the IPP that is applicable to a large class of panel data models even when T is relatively small. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.13736&r=ecm 
By:  Richard K. Crump; Nikolay Gospodinov; Hunter Wieman 
Abstract:  The low-frequency movements of many economic variables play a prominent role in policy analysis and decision-making. We develop a robust estimation approach for these slow-moving trend processes, which is guided by a judicious choice of priors and is characterized by sparsity. We present some novel stylized facts from longer-run survey expectations that inform the structure of the estimation procedure. The general version of the proposed Bayesian estimator with a slab-and-spike prior accounts explicitly for cyclical dynamics. The practical implementation of the method is discussed in detail, and we show that it performs well in simulations against some relevant benchmarks. We report empirical estimates of trend growth for U.S. output (and its components), productivity, and annual mean temperature. These estimates allow policymakers to assess shortfalls and overshoots in these variables from their economic and ecological targets. 
Keywords:  sparsity; Bayesian inference; latent variable models; trend output growth; slow-moving trends 
JEL:  C13 C30 C33 E27 E32 
Date:  2023–02–01 
URL:  http://d.repec.org/n?u=RePEc:fip:fednsr:95589&r=ecm 
By:  Bo E. Honore; Luojia Hu 
Abstract:  This paper studies semiparametric versions of the classical sample selection model (Heckman (1976, 1979)) without exclusion restrictions. We extend the analysis in Honoré and Hu (2020) by allowing for parameter heterogeneity and derive implications of this model. We also consider models that allow for heteroskedasticity and briefly discuss other extensions. The key ideas are illustrated in a simple wage regression for females. We find that the derived implications of a semiparametric version of Heckman's classical sample selection model are consistent with the data for women with no college education, but strongly rejected for women with a college degree or more. 
Keywords:  Selection; heterogeneity; heteroskedasticity; exclusion restrictions; identification 
JEL:  C01 C14 C21 C24 
Date:  2021–07 
URL:  http://d.repec.org/n?u=RePEc:fip:fedhwp:95177&r=ecm 
By:  Phuong Anh Nguyen; Michael Wolf 
Abstract:  Return event studies generally involve several companies, but there are also cases when only one company is involved. This makes the relevant testing problems, for the abnormal return (AR) and the cumulative abnormal return (CAR), more difficult, since one cannot exploit the multitude of companies (by using a relevant central limit theorem, say). We propose a permutation test that is valid under weaker conditions than the tests that have previously been proposed in the literature in this context. We address the question of the power of the test via a brief simulation study and also illustrate the method with two applications to real data. 
Keywords:  Cumulative abnormal return, event study, permutation test 
JEL:  C12 G14 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:zur:econwp:425&r=ecm 
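For a single company, a permutation test can compare the observed event-window CAR with CARs of windows re-drawn at random from the combined sample, exploiting exchangeability under the null of no abnormal return. An illustrative sketch of this idea (our own construction, not the authors' exact procedure):

```python
import numpy as np

def permutation_car_pvalue(est_ar, event_ar, n_perm=5000, seed=0):
    """Permutation p-value for the CAR of a single firm.

    Under the null, event-window abnormal returns are exchangeable with
    estimation-window ones, so the observed CAR is compared with CARs of
    randomly re-drawn windows of the same length. Illustrative sketch,
    not the authors' exact procedure.
    """
    rng = np.random.default_rng(seed)
    pool = np.concatenate([est_ar, event_ar])
    m, obs = len(event_ar), np.sum(event_ar)
    perm = np.array([np.sum(rng.choice(pool, size=m, replace=False))
                     for _ in range(n_perm)])
    # two-sided p-value with the +1 correction for exactness
    return (1 + np.sum(np.abs(perm) >= abs(obs))) / (n_perm + 1)
```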
By:  Benjamin Avanzi; Greg Taylor; Melantha Wang; Bernard Wong 
Abstract:  High-cardinality categorical features are pervasive in actuarial data (e.g. occupation in commercial property insurance). Standard categorical encoding methods like one-hot encoding are inadequate in these settings. In this work, we present a novel Generalised Linear Mixed Model Neural Network ("GLMMNet") approach to the modelling of high-cardinality categorical features. The GLMMNet integrates a generalised linear mixed model in a deep learning framework, offering the predictive power of neural networks and the transparency of random effects estimates, the latter of which cannot be obtained from entity embedding models. Further, its flexibility to deal with any distribution in the exponential dispersion (ED) family makes it widely applicable to many actuarial contexts and beyond. We illustrate and compare the GLMMNet against existing approaches in a range of simulation experiments as well as in a real-life insurance case study. Notably, we find that the GLMMNet often outperforms or at least performs comparably with an entity embedding neural network, while providing the additional benefit of transparency, which is particularly valuable in practical applications. Importantly, while our model was motivated by actuarial applications, it has wider applicability. The GLMMNet would suit any application that involves high-cardinality categorical variables and where the response cannot be sufficiently modelled by a Gaussian distribution. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.12710&r=ecm 
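The random-effects machinery that the GLMMNet embeds can be illustrated in its simplest Gaussian form: each category's mean deviation from the grand mean is shrunk toward zero by n_c / (n_c + sigma^2/tau^2), so sparse categories borrow strength from the overall mean. A sketch of this shrinkage assuming known variance components, not the GLMMNet itself:

```python
import numpy as np

def shrunken_category_effects(categories, y, sigma2, tau2):
    """BLUP-style random intercepts in a Gaussian random-effects model.

    Each category's mean deviation from the grand mean is shrunk toward
    zero by n_c / (n_c + sigma2 / tau2). This is the mixed-model idea the
    GLMMNet embeds in a neural network; illustrative sketch with known
    variance components, not the GLMMNet itself.
    """
    mu = y.mean()
    effects = {}
    for c in np.unique(categories):
        m = categories == c
        n_c = m.sum()
        effects[c] = (n_c / (n_c + sigma2 / tau2)) * (y[m].mean() - mu)
    return effects
```

Categories with few observations get a small shrinkage factor, which is precisely what distinguishes this approach from plain (unregularized) target encoding.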
By:  Matteo Barigozzi; Filippo Pellegrino 
Abstract:  This paper generalises dynamic factor models for multidimensional dependent data. In doing so, it develops an interpretable technique to study complex information sources ranging from repeated surveys with a varying number of respondents to panels of satellite images. We specialise our results to model microeconomic data on US households jointly with macroeconomic aggregates. This results in a powerful tool able to generate localised predictions, counterfactuals and impulse response functions for individual households, accounting for traditional time-series complexities depicted in the state-space literature. The model is also compatible with the growing focus of policymakers on real-time economic analysis, as it is able to process observations online, while handling missing values and asynchronous data releases. 
Date:  2023–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2301.12499&r=ecm 
By:  Mario Forni (Università di Modena e Reggio Emilia, CEPR and RECent); Luca Gambetti (Universitat Autònoma de Barcelona, BSE, Università di Torino, CCA); Giovanni Ricco (École Polytechnique, University of Warwick, OFCE-SciencesPo, and CEPR) 
Keywords:  Proxy-SVAR, SVAR-IV, Impulse response functions, Variance Decomposition, Historical Decomposition, Monetary Policy Shock 
JEL:  C32 E32 
Date:  2022–01–29 
URL:  http://d.repec.org/n?u=RePEc:crs:wpaper:202303&r=ecm 