nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒02‒27
twenty papers chosen by
Sune Karlsson
Örebro universitet

  1. Regression adjustment in randomized controlled trials with many covariates By Harold D Chiang; Yukitoshi Matsushita; Taisuke Otsu
  2. Inference in Non-stationary High-Dimensional VARs By Alain Hecq; Luca Margaritella; Stephan Smeekes
  3. Robust Inference on Correlation under General Heterogeneity By Liudas Giraitis; Yufei Li; Peter C.B. Phillips
  4. The Role of Pricing Errors in Linear Asset Pricing Models with Strong, Semi-strong, and Latent Factors By Pesaran, M. H.; Smith, R. P.
  5. Moment-Based Estimation of Linear Panel Data Models with Factor-Augmented Errors By Nicholas Brown
  6. Forecasting Value-at-Risk using deep neural network quantile regression By Chronopoulos, Ilias; Raftapostolos, Aristeidis; Kapetanios, George
  7. Nonlinearities in Macroeconomic Tail Risk through the Lens of Big Data Quantile Regressions By Jan Prüser; Florian Huber
  8. Hierarchical Regularizers for Reverse Unrestricted Mixed Data Sampling Regressions By Alain Hecq; Marie Ternes; Ines Wilms
  9. Factor Model of Mixtures By Cheng Peng; Stanislav Uryasev
  10. Neglected heterogeneity and the algebra of least squares By Rainer Winkelmann
  11. Semiparametrically Efficient Tests of Multivariate Independence Using Center-Outward Quadrant, Spearman, and Kendall Statistics By Hongjian Shi; Mathias Drton; Marc Hallin; Fang Han
  12. On Using The Two-Way Cluster-Robust Standard Errors By Harold D Chiang; Yuya Sasaki
  13. Noisy, Non-Smooth, Non-Convex Estimation of Moment Condition Models By Jean-Jacques Forneron
  14. Approximate Functional Differencing By Geert Dhaene; Martin Weidner
  15. Sparse Trend Estimation By Richard K. Crump; Nikolay Gospodinov; Hunter Wieman
  16. Sample Selection Models Without Exclusion Restrictions: Parameter Heterogeneity and Partial Identification By Bo E. Honore; Luojia Hu
  17. A note on testing AR and CAR for event studies By Phuong Anh Nguyen; Michael Wolf
  18. Machine Learning with High-Cardinality Categorical Features in Actuarial Applications By Benjamin Avanzi; Greg Taylor; Melantha Wang; Bernard Wong
  19. Multidimensional dynamic factor models By Matteo Barigozzi; Filippo Pellegrino
  20. External Instrument SVAR Analysis for Noninvertible Shocks By Mario Forni; Luca Gambetti; Giovanni Ricco

  1. By: Harold D Chiang; Yukitoshi Matsushita; Taisuke Otsu
    Abstract: This paper is concerned with estimation and inference on average treatment effects in randomized controlled trials when researchers observe potentially many covariates. By employing Neyman's (1923) finite population perspective, we propose a bias-corrected regression adjustment estimator using cross-fitting, and show that the proposed estimator has favorable properties over existing alternatives. For inference, we derive the first and second order terms in the stochastic component of the regression adjustment estimators, study higher order properties of the existing inference methods, and propose a bias-corrected version of the HC3 standard error. Simulation studies show our cross-fitted estimator, combined with the bias-corrected HC3, delivers precise point estimates and robust size control over a wide range of DGPs. To illustrate, the proposed methods are applied to a real dataset on randomized experiments of incentives and services for college achievement following Angrist, Lang, and Oreopoulos (2009).
    Keywords: Randomized controlled trials, regression adjustment, many covariates
    JEL: C14
    Date: 2023–02
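The HC3 covariance that the paper's bias correction builds on inflates each squared OLS residual by its leverage. A minimal numpy sketch of the standard (uncorrected) HC3 standard errors; the helper name `ols_hc3` is illustrative, not from the paper:

```python
import numpy as np

def ols_hc3(y, X):
    """OLS point estimates with HC3 heteroskedasticity-robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # leverage values h_ii from the hat matrix H = X (X'X)^{-1} X'
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # HC3 inflates each squared residual by 1 / (1 - h_ii)^2
    omega = resid**2 / (1.0 - h) ** 2
    V = XtX_inv @ (X.T * omega) @ X @ XtX_inv
    return beta, np.sqrt(np.diag(V))
```

In a randomized trial, regressing the outcome on a treatment dummy plus covariates and reading off the treatment coefficient with its HC3 standard error is the (uncorrected) baseline the paper improves on.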
  2. By: Alain Hecq; Luca Margaritella; Stephan Smeekes
    Abstract: In this paper we construct an inferential procedure for Granger causality in high-dimensional non-stationary vector autoregressive (VAR) models. Our method does not require knowledge of the order of integration of the time series under consideration. We augment the VAR with at least as many lags as the suspected maximum order of integration, an approach which has been proven to be robust against the presence of unit roots in low dimensions. We prove that we can restrict the augmentation to only the variables of interest for the testing, thereby making the approach suitable for high dimensions. We combine this lag augmentation with a post-double-selection procedure in which a set of initial penalized regressions is performed to select the relevant variables for both the Granger causing and caused variables. We then establish uniform asymptotic normality of a second-stage regression involving only the selected variables. Finite sample simulations show good performance; an application to investigate the (predictive) causes and effects of economic uncertainty illustrates the need to allow for unknown orders of integration.
    Date: 2023–02
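The lag-augmentation idea — include extra lags to absorb possible unit roots, but exclude them from the test — can be sketched in a bivariate special case. This is a toy illustration only (the paper's setting is high-dimensional with penalized selection), and `lag_aug_granger_wald` is a hypothetical name:

```python
import numpy as np

def lag_aug_granger_wald(y, x, p, d=1):
    """Wald statistic for 'x does not Granger-cause y': regress y on p tested
    lags of (y, x) plus d augmentation lags that are included but untested,
    guarding against unit roots. Approximately chi2(p) under the null."""
    L = p + d
    T = len(y)
    Y = y[L:]
    regs = [np.ones(T - L)]
    for l in range(1, L + 1):
        regs.append(y[L - l : T - l])  # lag l of y
        regs.append(x[L - l : T - l])  # lag l of x
    X = np.column_stack(regs)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ Y
    e = Y - X @ beta
    sigma2 = e @ e / (len(Y) - X.shape[1])
    # columns 2, 4, ... hold the lags of x; test only the first p of them
    idx = [2 * l for l in range(1, p + 1)]
    b = beta[idx]
    V = sigma2 * XtX_inv[np.ix_(idx, idx)]
    return b @ np.linalg.solve(V, b)
```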
  3. By: Liudas Giraitis; Yufei Li; Peter C.B. Phillips (Cowles Foundation, Yale University)
    Abstract: Considerable evidence in past research shows size distortion in standard tests for zero autocorrelation or cross-correlation when time series are not independent identically distributed random variables, pointing to the need for more robust procedures. Recent tests for serial correlation and cross-correlation in Dalla, Giraitis, and Phillips (2022) provide a more robust approach, allowing for heteroskedasticity and dependence in uncorrelated data under restrictions that require a smooth, slowly-evolving deterministic heteroskedasticity process. The present work removes those restrictions and validates the robust testing methodology for a wider class of heteroskedastic time series models and innovations. The updated analysis given here enables more extensive use of the methodology in practical applications. Monte Carlo experiments confirm excellent finite sample performance of the robust test procedures even for extremely complex white noise processes. The empirical examples show that use of robust testing methods can materially reduce spurious evidence of correlations found by standard testing procedures.
    Date: 2023–02
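The self-normalisation that drives this kind of robustness can be illustrated for a single cross-correlation at lag zero. This is a simplified sketch in the spirit of the cited Dalla, Giraitis, and Phillips (2022) statistic, not the paper's full procedure:

```python
import numpy as np

def robust_corr_tstat(x, y):
    """Heteroskedasticity-robust t-statistic for zero cross-correlation:
    self-normalise the sample cross-moment by its own empirical variance,
    so no homoskedasticity or i.i.d. assumption is needed.
    Approximately N(0, 1) under the null of no correlation."""
    ex = x - x.mean()
    ey = y - y.mean()
    num = np.sum(ex * ey)
    den = np.sqrt(np.sum((ex * ey) ** 2))
    return num / den
```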
  4. By: Pesaran, M. H.; Smith, R. P.
    Abstract: This paper examines the role of pricing errors in linear factor pricing models, allowing for observed strong and semi-strong factors, and latent weak factors. It focusses on the estimation of Φk = λk - μk which plays a pivotal role, not only in the estimation of risk premia but also in tests of market efficiency, where λk and μk are respectively the risk premium and the mean of the kth risk factor. It proposes a two-step estimator of Φk with Shanken type bias-correction, and derives its asymptotic distribution under a general setting that allows for idiosyncratic pricing errors, weak missing factors, as well as weak error cross-sectional dependence. The implications of semi-strong factors for the asymptotic distribution of the proposed estimator are also investigated. Small sample results from extensive Monte Carlo experiments show that the proposed estimator has the correct size with good power properties. The paper also provides an empirical application to a large number of U.S. securities with risk factors selected from a large number of potential risk factors according to their strength.
    Keywords: Factor strength, pricing errors, risk premia, missing factors, Fama-French factors, panel R2
    JEL: C38 G10
    Date: 2023–02–13
  5. By: Nicholas Brown (Queen's University)
    Abstract: I consider linear panel data models with unobserved factor structures when the number of time periods is small relative to the number of cross-sectional units. I examine two popular methods of estimation: the first eliminates the factors with a parameterized quasi-long-differencing (QLD) transformation. The other, referred to as common correlated effects (CCE), uses the cross-sectional averages of the independent and response variables to project out the space spanned by the factors. I show that the classical CCE assumptions imply unused moment conditions that can be exploited by the QLD transformation to derive new linear estimators, which weaken identifying assumptions and have desirable theoretical properties. I prove asymptotic normality of the linear QLD estimators under a heterogeneous slope model that allows for a tradeoff between identifying conditions. These estimators do not require the number of independent variables to be less than the number of time periods minus one, a strong restriction when the number of time periods is fixed in the asymptotic analysis. Finally, I investigate the effects of per-student expenditure on standardized test performance using data from the state of Michigan.
    Keywords: factor models, common correlated effects, quasi-long differencing, fixed effects, correlated random coefficients
    JEL: C36 C38
    Date: 2023–02
  6. By: Chronopoulos, Ilias; Raftapostolos, Aristeidis; Kapetanios, George
    Abstract: In this paper we use a deep quantile estimator, based on neural networks and their universal approximation property, to examine a non-linear association between the conditional quantiles of a dependent variable and predictors. This methodology is versatile and allows both the use of different penalty functions and high dimensional covariates. We present a Monte Carlo exercise where we examine the finite sample properties of the deep quantile estimator and show that it delivers good finite sample performance. We use the deep quantile estimator to forecast Value-at-Risk and find significant gains over linear quantile regression alternatives and other models, which are supported by various testing schemes. Further, we also consider an alternative architecture that allows the use of mixed frequency data in neural networks. This paper also contributes to the interpretability of neural network output by making comparisons between the commonly used SHAP values and an alternative method based on partial derivatives.
    Keywords: Quantile regression, machine learning, neural networks, value-at-risk, forecasting
    Date: 2023–02–07
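Quantile estimators, linear or deep, minimise the pinball (quantile) loss; replacing the linear index below with a neural network gives a deep quantile estimator of the kind the abstract describes. A hedged numpy sketch with illustrative function names, not the paper's implementation:

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Quantile ('pinball') loss evaluated at fitted quantiles q."""
    e = y - q
    return np.mean(np.maximum(tau * e, (tau - 1.0) * e))

def linear_quantile_fit(X, y, tau, lr=0.5, steps=3000):
    """Subgradient descent on the pinball loss for a linear quantile model;
    a deep quantile estimator replaces X @ beta with a neural network."""
    beta = np.zeros(X.shape[1])
    for t in range(steps):
        e = y - X @ beta
        # subgradient of the pinball loss w.r.t. beta
        subgrad = -X.T @ np.where(e > 0, tau, tau - 1.0) / len(y)
        beta -= lr / np.sqrt(t + 1.0) * subgrad
    return beta
```

Fitting at tau = 0.05 yields a conditional 5% quantile, i.e. a one-period Value-at-Risk forecast (up to sign convention).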
  7. By: Jan Prüser; Florian Huber
    Abstract: Modeling and predicting extreme movements in GDP is notoriously difficult and the selection of appropriate covariates and/or possible forms of nonlinearities are key in obtaining precise forecasts. In this paper, our focus is on using large datasets in quantile regression models to forecast the conditional distribution of US GDP growth. To capture possible non-linearities we include several nonlinear specifications. The resulting models will be huge dimensional and we thus rely on a set of shrinkage priors. Since Markov Chain Monte Carlo estimation becomes slow in these dimensions, we rely on fast variational Bayes approximations to the posterior distribution of the coefficients and the latent states. We find that our proposed set of models produces precise forecasts. These gains are especially pronounced in the tails. Using Gaussian processes to approximate the nonlinear component of the model further improves the good performance in the tails.
    Date: 2023–01
  8. By: Alain Hecq; Marie Ternes; Ines Wilms
    Abstract: Reverse Unrestricted MIxed DAta Sampling (RU-MIDAS) regressions are used to model high-frequency responses by means of low-frequency variables. However, due to the periodic structure of RU-MIDAS regressions, the dimensionality grows quickly if the frequency mismatch between the high- and low-frequency variables is large. Additionally, the number of high-frequency observations available for estimation decreases. We propose to counteract this reduction in sample size by pooling the high-frequency coefficients and further reduce the dimensionality through a sparsity-inducing convex regularizer that accounts for the temporal ordering among the different lags. To this end, the regularizer prioritizes the inclusion of lagged coefficients according to the recency of the information they contain. We demonstrate the proposed method on an empirical application for daily realized volatility forecasting where we explore whether modeling high-frequency volatility data in terms of low-frequency macroeconomic data pays off.
    Date: 2023–01
  9. By: Cheng Peng; Stanislav Uryasev
    Abstract: This paper considers the problem of estimating the distribution of a response variable conditioned on observing some factors. Existing approaches are often deficient in one of the qualities of flexibility, interpretability and tractability. We propose a model that possesses these desirable properties. The proposed model, analogous to classic mixture regression models, represents the conditional quantile function as a mixture (weighted sum) of basis quantile functions, with the weight of each basis quantile function being a function of the factors. The model can approximate any bounded conditional quantile model. It has a factor model structure with a closed-form expression. The calibration problem is formulated as convex optimization, which can be viewed as conducting quantile regressions of all confidence levels simultaneously and does not suffer from quantile crossing by design. The calibration is equivalent to minimization of the Continuous Ranked Probability Score (CRPS). We prove the asymptotic normality of the estimator. Additionally, based on the risk quadrangle framework, we generalize the proposed approach to conditional distributions defined by Conditional Value-at-Risk (CVaR), expectile and other functions of uncertainty measures. Based on the CP decomposition of tensors, we propose a dimensionality reduction method that reduces the rank of the parameter tensor, together with an alternating algorithm for estimating it. Our numerical experiments demonstrate the efficiency of the approach.
    Date: 2023–01
  10. By: Rainer Winkelmann
    Abstract: This paper explores an algebraic relationship between two types of coefficients for a regression with several predictors and an additive binary group variable. In a general regression, the regression coefficients are allowed to be group-specific; the restricted regression imposes constant coefficients. The key result is that the restricted coefficients imposing homogeneity are not necessarily a convex average of the unrestricted coefficients obtained from the more general regression. In the context of treatment effect estimation with several treatment arms and group-level controls, this means that the estimated effect of a specific treatment can be non-zero, and statistically significant, even if the estimated unrestricted effects are zero in each group.
    Keywords: Ordinary least squares, subsample heterogeneity, variance-weighting, average treatment effect
    JEL: C21
    Date: 2023–01
  11. By: Hongjian Shi; Mathias Drton; Marc Hallin; Fang Han
    Abstract: Defining multivariate generalizations of the classical univariate ranks has been a long-standing open problem in statistics. Optimal transport has been shown to offer a solution in which multivariate ranks are obtained by transporting data points to a grid that approximates a uniform reference measure (Chernozhukov et al., 2017; Hallin, 2017; Hallin et al., 2021). We take up this new perspective to develop and study multivariate analogues of the sign covariance/quadrant statistic, Kendall’s tau, and Spearman’s rho. The resulting tests of multivariate independence are genuinely distribution-free, hence uniformly valid irrespective of the actual (absolutely continuous) distributions of the observations. Our results provide asymptotic distribution theory for these new test statistics, with asymptotic approximations to critical values to be used for testing independence as well as a power analysis of the resulting tests. This includes a multivariate elliptical Chernoff–Savage property, which guarantees that, under ellipticity, our nonparametric tests of independence enjoy an asymptotic relative efficiency of one or larger with respect to the classical Gaussian procedures.
    Keywords: Semiparametrically Efficient Tests, Multivariate Independence, Center-Outward Quadrant, Spearman, Kendall Statistics
    Date: 2023–01
  12. By: Harold D Chiang; Yuya Sasaki
    Abstract: Thousands of papers have reported two-way cluster-robust (TWCR) standard errors. However, the recent econometrics literature points out the potential non-gaussianity of two-way cluster sample means, and thus the invalidity of inference based on TWCR standard errors. Fortunately, simulation studies nonetheless show that gaussianity is common rather than exceptional. This paper provides theoretical support for this encouraging observation. Specifically, we derive a novel central limit theorem for two-way clustered triangular arrays that justifies the use of TWCR standard errors under very mild and interpretable conditions. We therefore hope that this paper provides a theoretical justification for the legitimacy of most, if not all, of the thousands of empirical papers that have used TWCR standard errors. We also provide practical guidance on when a researcher can employ TWCR standard errors.
    Date: 2023–01
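For the simplest case of a two-way clustered sample mean, the TWCR variance is the Cameron–Gelbach–Miller combination V(i) + V(j) − V(i, j): cluster on the first dimension, plus the second, minus the intersection. A minimal sketch (`twcr_var_mean` is an illustrative name):

```python
import numpy as np

def twcr_var_mean(u, i_ids, j_ids):
    """Two-way cluster-robust variance of the sample mean of u, clustered on
    i_ids and j_ids: V(i) + V(j) - V(i, j) (Cameron, Gelbach, Miller, 2011)."""
    n = len(u)
    e = u - u.mean()

    def cvar(ids):
        # sum of squared within-cluster sums, scaled by n^2
        sums = {}
        for c, v in zip(ids, e):
            sums[c] = sums.get(c, 0.0) + v
        return sum(s * s for s in sums.values()) / n**2

    both = list(zip(i_ids, j_ids))
    return cvar(i_ids) + cvar(j_ids) - cvar(both)
```

When every observation forms its own cluster in both dimensions, the formula collapses to the usual heteroskedasticity-robust variance of a mean.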
  13. By: Jean-Jacques Forneron
    Abstract: A practical challenge for structural estimation is the requirement to accurately minimize a sample objective function which is often non-smooth, non-convex, or both. This paper proposes a simple algorithm designed to find accurate solutions without performing an exhaustive search. It augments each iteration from a new Gauss-Newton algorithm with a grid search step. A finite sample analysis derives its optimization and statistical properties simultaneously using only econometric assumptions. After a finite number of iterations, the algorithm automatically transitions from global to fast local convergence, producing accurate estimates with high probability. Simulated examples and an empirical application illustrate the results.
    Date: 2023–01
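The paper augments Gauss-Newton iterations with a grid search step. A toy one-dimensional analogue (a coarse grid scan followed by local interval refinement in place of Gauss-Newton, with an illustrative function name) shows why combining a global and a local step finds accurate minimizers of non-smooth objectives without an exhaustive search:

```python
import numpy as np

def grid_then_local(obj, lo, hi, n_grid=200, n_local=50):
    """Minimise obj on [lo, hi]: coarse grid scan for a global starting point,
    then ternary-search refinement around the best grid point. The paper's
    algorithm interleaves a grid step with Gauss-Newton iterations instead."""
    grid = np.linspace(lo, hi, n_grid)
    best = grid[np.argmin([obj(t) for t in grid])]
    step = (hi - lo) / n_grid
    a, b = best - step, best + step
    for _ in range(n_local):
        m1, m2 = a + (b - a) / 3.0, b - (b - a) / 3.0
        if obj(m1) < obj(m2):
            b = m2  # minimiser lies in [a, m2]
        else:
            a = m1  # minimiser lies in [m1, b]
    return (a + b) / 2.0
```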
  14. By: Geert Dhaene; Martin Weidner
    Abstract: Inference on common parameters in panel data models with individual-specific fixed effects is a classic example of Neyman and Scott's (1948) incidental parameter problem (IPP). One solution to this IPP is functional differencing (Bonhomme 2012), which works when the number of time periods T is fixed (and may be small), but this solution is not applicable to all panel data models of interest. Another solution, which applies to a larger class of models, is "large-T" bias correction (pioneered by Hahn and Kuersteiner 2002 and Hahn and Newey 2004), but this is only guaranteed to work well when T is sufficiently large. This paper provides a unified approach that connects those two seemingly disparate solutions to the IPP. In doing so, we provide an approximate version of functional differencing, that is, an approximate solution to the IPP that is applicable to a large class of panel data models even when T is relatively small.
    Date: 2023–01
  15. By: Richard K. Crump; Nikolay Gospodinov; Hunter Wieman
    Abstract: The low-frequency movements of many economic variables play a prominent role in policy analysis and decision-making. We develop a robust estimation approach for these slow-moving trend processes, which is guided by a judicious choice of priors and is characterized by sparsity. We present some novel stylized facts from longer-run survey expectations that inform the structure of the estimation procedure. The general version of the proposed Bayesian estimator with a spike-and-slab prior accounts explicitly for cyclical dynamics. The practical implementation of the method is discussed in detail, and we show that it performs well in simulations against some relevant benchmarks. We report empirical estimates of trend growth for U.S. output (and its components), productivity, and annual mean temperature. These estimates allow policymakers to assess shortfalls and overshoots in these variables from their economic and ecological targets.
    Keywords: sparsity; Bayesian inference; latent variable models; trend output growth; slow-moving trends
    JEL: C13 C30 C33 E27 E32
    Date: 2023–02–01
  16. By: Bo E. Honore; Luojia Hu
    Abstract: This paper studies semiparametric versions of the classical sample selection model (Heckman (1976, 1979)) without exclusion restrictions. We extend the analysis in Honoré and Hu (2020) by allowing for parameter heterogeneity and derive implications of this model. We also consider models that allow for heteroskedasticity and briefly discuss other extensions. The key ideas are illustrated in a simple wage regression for females. We find that the derived implications of a semiparametric version of Heckman's classical sample selection model are consistent with the data for women with no college education, but strongly rejected for women with a college degree or more.
    Keywords: Selection; heterogeneity; heteroskedasticity; exclusion restrictions; identification
    JEL: C01 C14 C21 C24
    Date: 2021–07
  17. By: Phuong Anh Nguyen; Michael Wolf
    Abstract: Return event studies generally involve several companies, but there are also cases when only one company is involved. This makes the relevant testing problems, abnormal return (AR) and cumulative abnormal return (CAR), more difficult, since one cannot exploit the multitude of companies (by using a relevant central limit theorem, say). We propose a permutation test that is valid under weaker conditions than the tests that have previously been proposed in the literature in this context. We address the question of the power of the test via a brief simulation study and also illustrate the method with two applications to real data.
    Keywords: Cumulative abnormal return, event study, permutation test
    JEL: C12 G14
    Date: 2023–01
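A permutation test for this single-company setting can be sketched by treating abnormal returns as exchangeable under the null of no event effect: permute the series, recompute the CAR over the event window, and compare. This is an illustrative simplification, not the paper's exact test:

```python
import numpy as np

def car_permutation_pvalue(ar, event_len, n_perm=2000, seed=0):
    """Permutation p-value for the cumulative abnormal return (CAR) over the
    last `event_len` observations of `ar`, under the null that abnormal
    returns are exchangeable (no event effect)."""
    rng = np.random.default_rng(seed)
    observed = np.abs(ar[-event_len:].sum())
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(ar)
        if np.abs(perm[-event_len:].sum()) >= observed:
            count += 1
    # add-one correction keeps the p-value strictly positive
    return (count + 1) / (n_perm + 1)
```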
  18. By: Benjamin Avanzi; Greg Taylor; Melantha Wang; Bernard Wong
    Abstract: High-cardinality categorical features are pervasive in actuarial data (e.g. occupation in commercial property insurance). Standard categorical encoding methods like one-hot encoding are inadequate in these settings. In this work, we present a novel Generalised Linear Mixed Model Neural Network ("GLMMNet") approach to the modelling of high-cardinality categorical features. The GLMMNet integrates a generalised linear mixed model in a deep learning framework, offering the predictive power of neural networks and the transparency of random effects estimates, the latter of which cannot be obtained from entity embedding models. Further, its flexibility to deal with any distribution in the exponential dispersion (ED) family makes it widely applicable to many actuarial contexts and beyond. We illustrate and compare the GLMMNet against existing approaches in a range of simulation experiments as well as in a real-life insurance case study. Notably, we find that the GLMMNet often outperforms or at least performs comparably with an entity embedded neural network, while providing the additional benefit of transparency, which is particularly valuable in practical applications. Importantly, while our model was motivated by actuarial applications, it can have wider applicability. The GLMMNet would suit any application that involves high-cardinality categorical variables and where the response cannot be sufficiently modelled by a Gaussian distribution.
    Date: 2023–01
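The partial pooling that GLMM random effects provide, and that one-hot encoding lacks, can be shown in a stripped-down Gaussian random-intercept form. This is not the paper's model (GLMMNet embeds the random effects in a neural network); all names and the closed-form shrinkage weight below are the textbook random-intercept BLUP:

```python
import numpy as np

def shrunken_category_means(categories, y, sigma2_e, sigma2_u):
    """Predicted category effects under a Gaussian random-intercept model:
    each category's mean deviation from the grand mean is shrunk by
    n_c * sigma2_u / (n_c * sigma2_u + sigma2_e), so sparsely observed
    categories are pulled harder toward zero effect. One-hot encoding
    corresponds to no shrinkage at all."""
    y = np.asarray(y, dtype=float)
    cats = np.asarray(categories)
    grand = y.mean()
    out = {}
    for c in set(categories):
        mask = cats == c
        n_c = mask.sum()
        w = n_c * sigma2_u / (n_c * sigma2_u + sigma2_e)
        out[c] = w * (y[mask].mean() - grand)
    return out
```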
  19. By: Matteo Barigozzi; Filippo Pellegrino
    Abstract: This paper generalises dynamic factor models for multidimensional dependent data. In doing so, it develops an interpretable technique to study complex information sources ranging from repeated surveys with a varying number of respondents to panels of satellite images. We specialise our results to model microeconomic data on US households jointly with macroeconomic aggregates. This results in a powerful tool able to generate localised predictions, counterfactuals and impulse response functions for individual households, accounting for traditional time-series complexities studied in the state-space literature. The model is also compatible with the growing focus of policymakers on real-time economic analysis, as it is able to process observations online, while handling missing values and asynchronous data releases.
    Date: 2023–01
  20. By: Mario Forni (Università di Modena e Reggio Emilia, CEPR and RECent); Luca Gambetti (Universitat Autònoma de Barcelona, BSE, Università di Torino, CCA); Giovanni Ricco (École Polytechnique, University of Warwick, OFCE-SciencesPo, and CEPR)
    Keywords: Proxy-SVAR, SVAR-IV, Impulse response functions, Variance Decomposition, Historical Decomposition, Monetary Policy Shock
    JEL: C32 E32
    Date: 2022–01–29

This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.