nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒09‒25
24 papers chosen by
Sune Karlsson, Örebro universitet

  1. Linear Regression with Weak Exogeneity By Anna Mikusheva; Mikkel Sølvsten
  2. Weak Identification with Many Instruments By Anna Mikusheva; Liyang Sun
  3. Highly Irregular Serial Correlation Tests By Dante Amengual; Xinyue Bei; Enrique Sentana
  4. High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods By Christis Katsouris
  5. Subvector inference for Varying Coefficient Models with Partial Identification By Shengjie Hong; Yu-Chin Hsu
  6. Optimal Shrinkage Estimation of Fixed Effects in Linear Panel Data Models By Soonwoo Kwon
  7. Target PCA: Transfer Learning Large Dimensional Panel Data By Junting Duan; Markus Pelger; Ruoxuan Xiong
  8. SGMM: Stochastic Approximation to Generalized Method of Moments By Xiaohong Chen; Sokbae Lee; Yuan Liao; Myung Hwan Seo; Youngki Shin; Myunghyun Song
  9. James–Stein for the leading eigenvector By Goldberg, Lisa R; Kercheval, Alec N
  10. Simulation Experiments as a Causal Problem By Tyrel Stokes; Ian Shrier; Russell Steele
  11. Recovering Stars in Macroeconomics By Daniel Buncic; Adrian Pagan; Tim Robinson
  12. Quantile Time Series Regression Models Revisited By Christis Katsouris
  13. Model-agnostic auditing: a lost cause? By Hansen, Sakina; Loftus, Joshua
  14. Modeling Event Studies with Heterogeneous Treatment Effects By Laura Argys; Thomas Mroz; M. Melinda Pitts
  15. What is a relevant control?: An algorithmic proposal By Fernando Delbianco; Fernando Tohmé
  16. Spatial autoregressive fractionally integrated moving average model By Otto, Philipp; Sibbertsen, Philipp
  17. Identification and Estimation of Demand Models with Endogenous Product Entry and Exit By Victor Aguirregabiria; Alessandro Iaria; Senay Sokullu
  18. Scalable Estimation of Multinomial Response Models with Uncertain Consideration Sets By Siddhartha Chib; Kenichi Shimizu
  19. The Dispersion Bias By Goldberg, Lisa R; Papanicolaou, Alex; Shkolnik, Alex
  20. Black-Litterman, Bayesian Shrinkage, and Factor Models in Portfolio Selection: You Can Have It All By Kwong Yu Chong
  21. Spatial and Spatiotemporal Volatility Models: A Review By Philipp Otto; Osman Doğan; Süleyman Taşpınar; Wolfgang Schmid; Anil K. Bera
  22. GARHCX-NoVaS: A Model-free Approach to Incorporate Exogenous Variables By Kejin Wu; Sayar Karmakar
  23. A Coefficient of Variation for Multivariate Ordered Categorical Outcomes. By Gordon Anderson
  24. Unpacking P-Hacking and Publication Bias By Abel Brodeur; Scott E. Carrell; David N. Figlio; Lester R. Lusher

  1. By: Anna Mikusheva; Mikkel Sølvsten
    Abstract: This paper studies linear time series regressions with many regressors. Weak exogeneity is the most used identifying assumption in time series. Weak exogeneity requires the structural error to have zero conditional expectation given the present and past regressor values, allowing errors to correlate with future regressor realizations. We show that weak exogeneity in time series regressions with many controls may produce substantial biases and even render the least squares (OLS) estimator inconsistent. The bias arises in settings with many regressors because the normalized OLS design matrix remains asymptotically random and correlates with the regression error when only weak (but not strict) exogeneity holds. This bias's magnitude increases with the number of regressors and their average autocorrelation. To address this issue, we propose an innovative approach to bias correction that yields a new estimator with improved properties relative to OLS. We establish consistency and conditional asymptotic Gaussianity of this new estimator and provide a method for inference.
    Date: 2023–08
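The distinction between weak and strict exogeneity in the abstract above can be illustrated with a textbook case that is not taken from the paper: an AR(1) regression, where the lagged dependent variable is weakly but not strictly exogenous because today's error feeds into future regressor values. The sketch below is a minimal simulation under assumed values (rho = 0.5, T = 30, 2000 replications, Gaussian errors); it shows only the classical small-sample bias of OLS with one regressor, not the many-regressor setting or the bias correction the authors propose.

```python
import random


def ols_ar1_slope(y):
    """OLS slope (with intercept) from regressing y[t] on y[t-1]."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    return sxz / sxx


def simulate_mean_estimate(rho=0.5, T=30, reps=2000, seed=0):
    """Average OLS estimate of rho over many simulated AR(1) paths.

    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0)]
        for _ in range(T):
            # The error is mean-zero given the past, but it enters y[t],
            # which becomes the regressor in the next period.
            y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
        total += ols_ar1_slope(y)
    return total / reps


mean_hat = simulate_mean_estimate()
print("true rho = 0.5, average OLS estimate =", round(mean_hat, 3))
```

In this single-regressor case the bias shrinks as T grows; the abstract's point is that with many regressors it need not.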
  2. By: Anna Mikusheva; Liyang Sun
    Abstract: Linear instrumental variable regressions are widely used to estimate causal effects. Many instruments arise from the use of "technical" instruments and more recently from the empirical strategy of "judge design". This paper surveys and summarizes ideas from recent literature on estimation and statistical inferences with many instruments. We discuss how to assess the strength of the instruments and how to conduct weak identification-robust inference under heteroscedasticity. We establish new results for a jack-knifed version of the Lagrange Multiplier (LM) test statistic. Many exogenous regressors arise often in practice to ensure the validity of the instruments. We extend the weak-identification-robust tests to settings with both many exogenous regressors and many instruments. We propose a test that properly partials out many exogenous regressors while preserving the re-centering property of the jack-knife. The proposed tests have uniformly correct size and good power properties.
    Date: 2023–08
  3. By: Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Xinyue Bei (Duke University); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros)
    Abstract: We develop tests for neglected serial correlation when the information matrix is repeatedly singular under the null. Specifically, we consider white noise against a multiplicative seasonal AR model, and a local-level model against a nesting UCARIMA one. Our proposals, which involve higher-order derivatives, are asymptotically equivalent to the likelihood ratio test but only require estimation under the null. Remarkably, we show that our proposed tests effectively check that certain autocorrelations of the observations are 0, so their asymptotic distribution is standard. We conduct Monte Carlo exercises that study their finite sample size and power properties, comparing them to alternative approaches.
    Keywords: Generalized extremum tests, higher-order identifiability, likelihood ratio test.
    JEL: C22 C32 C52 C12
    Date: 2023–05
  4. By: Christis Katsouris
    Abstract: These lecture notes provide an overview of existing methodologies and recent developments for estimation and inference with high dimensional time series regression models. First, we present the main limit theory results for high dimensional dependent data that are relevant to covariance matrix structures as well as to dependent time series sequences. Second, we present the main aspects of the asymptotic theory related to time series regression models with many covariates. Third, we discuss various applications of statistical learning methodologies for time series analysis.
    Date: 2023–08
  5. By: Shengjie Hong; Yu-Chin Hsu
    Abstract: This paper develops inference methods for a general class of varying coefficient models defined by a set of moment inequalities and/or equalities, where unknown functional parameters are not necessarily point-identified. We propose an inferential procedure for a subvector of the parameters and establish the asymptotic validity of the resulting confidence sets uniformly over a broad family of data-generating processes. We also propose a specification test for the varying coefficient models considered in this paper. Monte Carlo studies show that the proposed methods work well in finite samples.
    Keywords: Varying coefficient; Moment inequalities; Partial-identification; Multiplier bootstrap
    JEL: C12 C14 C15
    Date: 2023–08–31
  6. By: Soonwoo Kwon
    Abstract: Shrinkage methods are frequently used to estimate fixed effects to reduce the noisiness of the least square estimators. However, widely used shrinkage estimators guarantee such noise reduction only under strong distributional assumptions. I develop an estimator for the fixed effects that obtains the best possible mean squared error within a class of shrinkage estimators. This class includes conventional shrinkage estimators and the optimality does not require distributional assumptions. The estimator has an intuitive form and is easy to implement. Moreover, the fixed effects are allowed to vary with time and to be serially correlated, and the shrinkage optimally incorporates the underlying correlation structure in this case. In such a context, I also provide a method to forecast fixed effects one period ahead.
    Date: 2023–08
  7. By: Junting Duan; Markus Pelger; Ruoxuan Xiong
    Abstract: This paper develops a novel method to estimate a latent factor model for a large target panel with missing observations by optimally using the information from auxiliary panel data sets. We refer to our estimator as target-PCA. Transfer learning from auxiliary panel data allows us to deal with a large fraction of missing observations and weak signals in the target panel. We show that our estimator is more efficient and can consistently estimate weak factors, which are not identifiable with conventional methods. We provide the asymptotic inferential theory for target-PCA under very general assumptions on the approximate factor model and missing patterns. In an empirical study of imputing data in a mixed-frequency macroeconomic panel, we demonstrate that target-PCA significantly outperforms all benchmark methods.
    Date: 2023–08
  8. By: Xiaohong Chen; Sokbae Lee; Yuan Liao; Myung Hwan Seo; Youngki Shin; Myunghyun Song
    Abstract: We introduce a new class of algorithms, Stochastic Generalized Method of Moments (SGMM), for estimation and inference on (overidentified) moment restriction models. Our SGMM is a novel stochastic approximation alternative to the popular Hansen (1982) (offline) GMM, and offers fast and scalable implementation with the ability to handle streaming datasets in real time. We establish the almost sure convergence, and the (functional) central limit theorem for the inefficient online 2SLS and the efficient SGMM. Moreover, we propose online versions of the Durbin-Wu-Hausman and Sargan-Hansen tests that can be seamlessly integrated within the SGMM framework. Extensive Monte Carlo simulations show that as the sample size increases, the SGMM matches the standard (offline) GMM in terms of estimation accuracy and gains over computational efficiency, indicating its practical value for both large-scale and online datasets. We demonstrate the efficacy of our approach by a proof of concept using two well known empirical examples with large sample sizes.
    Date: 2023–08
  9. By: Goldberg, Lisa R; Kercheval, Alec N
    Abstract: Recent research identifies and corrects bias, such as excess dispersion, in the leading sample eigenvector of a factor-based covariance matrix estimated from a high-dimension low sample size (HL) data set. We show that eigenvector bias can have a substantial impact on variance-minimizing optimization in the HL regime, while bias in estimated eigenvalues may have little effect. We describe a data-driven eigenvector shrinkage estimator in the HL regime called "James-Stein for eigenvectors" (JSE) and its close relationship with the James-Stein (JS) estimator for a collection of averages. We show, both theoretically and with numerical experiments, that, for certain variance-minimizing problems of practical importance, efforts to correct eigenvalues have little value in comparison to the JSE correction of the leading eigenvector. When certain extra information is present, JSE is a consistent estimator of the leading eigenvector.
    Keywords: Bias, Sample Size, asymptotic regime, shrinkage, factor model, optimization, covariance matrix
    Date: 2023–01–10
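The abstract above relates JSE to the James-Stein (JS) estimator for a collection of averages. As a reminder of what that classical estimator does, here is a minimal sketch of the positive-part variant that shrinks observed averages toward their grand mean. It assumes a known, common noise variance and at least four averages; the function name and the example numbers are hypothetical, and this is not the eigenvector estimator described in the paper.

```python
def james_stein_toward_mean(xs, noise_var):
    """Positive-part James-Stein shrinkage of averages toward their grand mean.

    Assumes each entry of xs has the same known sampling variance noise_var.
    """
    p = len(xs)
    if p < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 averages")
    grand = sum(xs) / p
    spread = sum((x - grand) ** 2 for x in xs)
    if spread == 0.0:
        return list(xs)
    # Shrinkage factor in [0, 1): the noisier the averages relative to
    # their observed spread, the stronger the pull toward the grand mean.
    c = max(0.0, 1.0 - (p - 3) * noise_var / spread)
    return [grand + c * (x - grand) for x in xs]


observed = [1.0, 2.0, 3.0, 4.0, 10.0]
shrunk = james_stein_toward_mean(observed, noise_var=1.0)
print(shrunk)
```

The grand mean is unchanged by the shrinkage; only the dispersion around it is reduced.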
  10. By: Tyrel Stokes; Ian Shrier; Russell Steele
    Abstract: Simulation methods are among the most ubiquitous methodological tools in statistical science. In particular, statisticians often use simulation to explore properties of statistical functionals in models for which developed statistical theory is insufficient or to assess finite sample properties of theoretical results. We show that the design of simulation experiments can be viewed from the perspective of causal intervention on a data generating mechanism. We then demonstrate the use of causal tools and frameworks in this context. Our perspective is agnostic to the particular domain of the simulation experiment which increases the potential impact of our proposed approach. In this paper, we consider two illustrative examples. First, we re-examine a predictive machine learning example from a popular textbook designed to assess the relationship between mean function complexity and the mean-squared error. Second, we discuss a traditional causal inference method problem, simulating the effect of unmeasured confounding on estimation, specifically to illustrate bias amplification. In both cases, applying causal principles and using graphical models with parameters and distributions as nodes in the spirit of influence diagrams can 1) make precise which estimand the simulation targets, 2) suggest modifications to better attain the simulation goals, and 3) provide scaffolding to discuss performance criteria for a particular simulation design.
    Date: 2023–08
  11. By: Daniel Buncic; Adrian Pagan; Tim Robinson
    Abstract: Many key macroeconomic variables such as the NAIRU, potential GDP, and the neutral real rate of interest—which are needed for policy analysis—are latent. Collectively, these latent variables are known as ‘stars’ and are typically estimated using the Kalman filter or smoother from models that can be expressed in State Space form. When these models contain more shocks than observed variables, they are ‘short’, and potentially create issues in recovering the star variable of interest from the observed data. Recovery issues can occur when the model is correctly specified and its parameters are known. In this paper, we summarize the literature on shock recovery and demonstrate its implications for estimating stars in a number of widely used models in policy analysis. The ability of many popular and recent models to recover stars is shown to be limited. We suggest ways this can be addressed.
    Keywords: Kalman filter and smoother, State Space models, shock recovery, short systems, natural rate of interest, macroeconomic policy, Beveridge-Nelson decomposition
    JEL: C22 C32 E58
    Date: 2023–09
  12. By: Christis Katsouris
    Abstract: This article discusses recent developments in the literature of quantile time series models in the cases of stationary and nonstationary underlying stochastic processes.
    Date: 2023–08
  13. By: Hansen, Sakina; Loftus, Joshua
    Abstract: Tools for interpretable machine learning (IML) or explainable artificial intelligence (xAI) can be used to audit algorithms for fairness or other desiderata. In a black-box setting without access to the algorithm’s internal structure an auditor may be limited to methods that are model-agnostic. These methods have severe limitations with important consequences for outcomes such as fairness. Among model-agnostic IML methods, visualizations such as the partial dependence plot (PDP) or individual conditional expectation (ICE) plots are popular and useful for displaying qualitative relationships. Although we focus on fairness auditing with PDP/ICE plots, the consequences we highlight generalize to other auditing or IML/xAI applications. This paper questions the validity of auditing in high-stakes settings with contested values or conflicting interests if the audit methods are model-agnostic.
    Keywords: artificial intelligence; black-box auditing; causal models; CEUR Workshop Proceedings; counterfactual fairness; individual conditional expectation; machine learning; partial dependence plots; supervised learning; visualization
    JEL: C1
    Date: 2023–07–16
  14. By: Laura Argys; Thomas Mroz; M. Melinda Pitts
    Abstract: This paper develops a simple approach to overcome the shortcomings of using a standard, single treatment-effect event study to assess the ability of an empirical model to measure heterogeneous treatment effects. Equally important, we discuss how the standard errors reported in a typical event-study analysis for the posttreatment event-time effects are, without additional information, of limited use for assessing posttreatment variations in the treatment effects. The simple reformulation of the standard event-study approach described and illustrated with artificially constructed data in this paper overcomes the limitations of conventional event-study analyses.
    Keywords: event studies; heterogeneous treatment effects
    JEL: C22 C23
    Date: 2023–09–07
  15. By: Fernando Delbianco (UNS/CONICET); Fernando Tohmé (UNS/CONICET)
    Abstract: Individualized inference (or prediction) is an approach to data analysis that is increasingly relevant thanks to the availability of large datasets. In this paper, we present an algorithm that starts by detecting the relevant observations for a given query. Further refinement of that subsample is obtained by selecting the ones with the largest Shapley values. The probability distribution over this selection allows us to generate synthetic controls, which in turn can be used to generate a robust inference (or prediction). Data collected from repeating this procedure for different queries provide a deeper understanding of the general process that generates the data.
    Keywords: Individualized inference, Relevance selection, and classification, Synthetic controls
    JEL: C6 C15 C63
    Date: 2023–08
  16. By: Otto, Philipp; Sibbertsen, Philipp
    Abstract: In this paper, we introduce the concept of fractional integration for spatial autoregressive models. We show that the range of the dependence can be spatially extended or diminished by introducing a further fractional integration parameter to spatial autoregressive moving average models (SARMA). This new model is called the spatial autoregressive fractionally integrated moving average model, briefly sp-ARFIMA. We show the relation to time-series ARFIMA models and also to (higher-order) spatial autoregressive models. Moreover, an estimation procedure based on the maximum-likelihood principle is introduced and analysed in a series of simulation studies. Finally, the use of the model is illustrated by an empirical example of atmospheric fine particles, so-called aerosol optical thickness, which is important in weather, climate and environmental science.
    Keywords: Spatial ARFIMA; spatial fractional integration; long-range dependence; aerosol optical depth
    JEL: C22 C23
    Date: 2023–09
  17. By: Victor Aguirregabiria; Alessandro Iaria; Senay Sokullu
    Abstract: This paper deals with the endogeneity of firms' entry and exit decisions in demand estimation. Product entry decisions lack a single crossing property in terms of demand unobservables, which causes the inconsistency of conventional methods dealing with selection. We present a novel and straightforward two-step approach to estimate demand while addressing endogenous product entry. In the first step, our method estimates a finite mixture model of product entry accommodating latent market types. In the second step, it estimates demand controlling for the propensity scores of all latent market types. We apply this approach to data from the airline industry.
    Keywords: Demand for differentiated product; Endogenous product availability; Selection bias; Market entry and exit; Multiple equilibria; Identification; Estimation; Demand for airlines
    JEL: C14 C34 C35 C57 D22 L13 L93
    Date: 2023–08–27
  18. By: Siddhartha Chib; Kenichi Shimizu
    Abstract: A standard assumption in the fitting of unordered multinomial response models for J mutually exclusive nominal categories, on cross-sectional or longitudinal data, is that the responses arise from the same set of J categories between subjects. However, when responses measure a choice made by the subject, it is more appropriate to assume that the distribution of multinomial responses is conditioned on a subject-specific consideration set, where this consideration set is drawn from the power set of {1, 2, ..., J}. Because the cardinality of this power set is exponential in J, estimation is infeasible in general. In this paper, we provide an approach to overcoming this problem. A key step in the approach is a probability model over consideration sets, based on a general representation of probability distributions on contingency tables. Although the support of this distribution is exponentially large, the posterior distribution over consideration sets given parameters is typically sparse, and is easily sampled as part of an MCMC scheme that iterates sampling of subject-specific consideration sets given parameters, followed by parameters given consideration sets. The effectiveness of the procedure is documented in simulated longitudinal data sets with J=100 categories and real data from the cereal market with J=73 brands.
    Date: 2023–08
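To make concrete why the abstract above calls estimation over all consideration sets infeasible in general, the following sketch enumerates candidate sets for a small J and counts them for the J = 100 case mentioned in the abstract. Excluding the empty set is an assumption made here for illustration, and the function name is hypothetical; the abstract itself only says the set is drawn from the power set of {1, 2, ..., J}.

```python
from itertools import combinations


def consideration_sets(J):
    """Yield every non-empty subset of {1, ..., J} as a tuple."""
    items = range(1, J + 1)
    for size in range(1, J + 1):
        yield from combinations(items, size)


# Brute-force enumeration is only practical for very small J.
small = list(consideration_sets(4))
print(len(small), "non-empty subsets when J = 4")

# For J = 100 the count is 2**100 - 1, far too many to enumerate.
print(len(str(2 ** 100 - 1)), "digits in the count when J = 100")
```

This is why the approach in the abstract samples consideration sets within an MCMC scheme rather than summing over all of them.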
  19. By: Goldberg, Lisa R; Papanicolaou, Alex; Shkolnik, Alex
    Abstract: We identify and correct excess dispersion in the leading eigenvector of a sample covariance matrix when the number of variables vastly exceeds the number of observations. Our correction is data-driven, and it materially diminishes the substantial impact of estimation error on weights and risk forecasts of minimum variance portfolios. We quantify that impact with a novel metric, the optimization bias, which has a positive lower bound prior to correction and tends to zero almost surely after correction. Our analysis sheds light on aspects of how estimation error corrupts an estimated covariance matrix and is transmitted to portfolios via quadratic optimization.
    Keywords: dispersion bias, optimization bias, eigenvector, minimum variance portfolio, covariance matrix, shrinkage, Applied Mathematics, Statistics, Banking, Finance and Investment
    Date: 2022–06–01
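For readers unfamiliar with the minimum variance portfolios mentioned in the abstract above, the sketch below computes the standard fully invested minimum-variance weights, proportional to the solution of cov * w = 1 and rescaled to sum to one. It is a generic textbook formula under the assumption of a known, invertible covariance matrix; the helper names are hypothetical, and it does not implement the dispersion-bias correction from the paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def min_variance_weights(cov):
    """Fully invested minimum-variance weights for covariance matrix cov."""
    raw = solve(cov, [1.0] * len(cov))
    total = sum(raw)
    return [v / total for v in raw]


# Two uncorrelated assets with variances 1 and 4 (illustrative numbers).
weights = min_variance_weights([[1.0, 0.0], [0.0, 4.0]])
print(weights)
```

Because the weights depend on the inverse of the covariance matrix, errors in an estimated covariance matrix pass directly into the portfolio, which is the channel the abstract studies.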
  20. By: Kwong Yu Chong
    Abstract: Mean-variance analysis is widely used in portfolio management to identify the best portfolio that makes an optimal trade-off between expected return and volatility. Yet, this method has its limitations, notably its vulnerability to estimation errors and its reliance on historical data. While shrinkage estimators and factor models have been introduced to improve estimation accuracy through bias-variance trade-offs, and the Black-Litterman model has been developed to integrate investor opinions, a unified framework combining these three approaches has been lacking. Our study debuts a Bayesian blueprint that fuses shrinkage estimation with view inclusion, conceptualizing both as Bayesian updates. This model is then applied within the context of the Fama-French approach factor models, thereby integrating the advantages of each methodology. Finally, through a comprehensive empirical study in the US equity market spanning a decade, we show that the model outperforms both the simple 1/N portfolio and the optimal portfolios based on sample estimators.
    Date: 2023–08
  21. By: Philipp Otto; Osman Doğan; Süleyman Taşpınar; Wolfgang Schmid; Anil K. Bera
    Abstract: Spatial and spatiotemporal volatility models are a class of models designed to capture spatial dependence in the volatility of spatial and spatiotemporal data. Spatial dependence in the volatility may arise due to spatial spillovers among locations; that is, if two locations are in close proximity, they can exhibit similar volatilities. In this paper, we aim to provide a comprehensive review of the recent literature on spatial and spatiotemporal volatility models. We first briefly review time series volatility models and their multivariate extensions to motivate their spatial and spatiotemporal counterparts. We then review various spatial and spatiotemporal volatility specifications proposed in the literature along with their underlying motivations and estimation strategies. Through this analysis, we effectively compare all models and provide practical recommendations for their appropriate usage. We highlight possible extensions and conclude by outlining directions for future research.
    Date: 2023–08
  22. By: Kejin Wu; Sayar Karmakar
    Abstract: In this work, we further explore the forecasting ability of a recently proposed normalizing and variance-stabilizing (NoVaS) transformation after wrapping exogenous variables. In practice, especially in the area of financial econometrics, extra knowledge such as fundamentals- and sentiments-based information could be beneficial to improve the prediction accuracy of market volatility if it is incorporated into the forecasting process. In a classical approach, people usually apply GARCHX-type methods to include the exogenous variables. Being a model-free prediction method, NoVaS has been shown to be more accurate and stable than classical GARCH-type methods. We are interested in whether the novel NoVaS method can also sustain its superiority after exogenous covariates are taken into account. We provide the NoVaS transformation based on the GARCHX model and then describe the corresponding prediction procedure when exogenous variables are present. Also, simulation studies verify that the NoVaS method still outperforms traditional methods, especially for long-term time aggregated predictions.
    Date: 2023–08
  23. By: Gordon Anderson
    Abstract: Comparing the relative variation of ordinal variates defined on diverse populations is challenging. Pearson’s Coefficient of Variation or its inverse (the Sharpe Ratio), each used extensively for comparing relative variation or risk-tempered location in cardinal paradigms, cannot be employed in ordinal data environments unless a cardinal scale is attributed to ordered categories. Unfortunately, due to the scale dependencies of the Coefficient of Variation’s denominator and numerator, such arbitrary attribution can result in equivocal comparisons. Here, based upon the notion of probabilistic distance, unequivocal, scale-independent Coefficient of Variation and Sharpe Ratio analogues for use with Multivariate Ordered Categorical Data are introduced and exemplified in an analysis of Canadian Human Resource distributions.
    Keywords: ordinal outcomes, variation coefficient, Sharpe Ratio
    JEL: C18 I32 G10
    Date: 2023–09–05
  24. By: Abel Brodeur; Scott E. Carrell; David N. Figlio; Lester R. Lusher
    Abstract: We use unique data from journal submissions to identify and unpack publication bias and p-hacking. We find that initial submissions display significant bunching, suggesting the distribution among published statistics cannot be fully attributed to a publication bias in peer review. Desk-rejected manuscripts display greater heaping than those sent for review, i.e., marginally significant results are more likely to be desk rejected. Reviewer recommendations, in contrast, are positively associated with statistical significance. Overall, the peer review process has little effect on the distribution of test statistics. Lastly, we track rejected papers and present evidence that the prevalence of publication biases is perhaps not as prominent as feared.
    JEL: A0
    Date: 2023–08

This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.