Econometrics
http://lists.repec.org/mailman/listinfo/nep-ecm
2023-09-25
Linear Regression with Weak Exogeneity
http://d.repec.org/n?u=RePEc:arx:papers:2308.08958&r=ecm
This paper studies linear time series regressions with many regressors. Weak exogeneity is the most commonly used identifying assumption in time series. Weak exogeneity requires the structural error to have zero conditional expectation given the present and past regressor values, allowing errors to correlate with future regressor realizations. We show that weak exogeneity in time series regressions with many controls may produce substantial biases and even render the least squares (OLS) estimator inconsistent. The bias arises in settings with many regressors because the normalized OLS design matrix remains asymptotically random and correlates with the regression error when only weak (but not strict) exogeneity holds. The magnitude of this bias increases with the number of regressors and their average autocorrelation. To address this issue, we propose an innovative approach to bias correction that yields a new estimator with improved properties relative to OLS. We establish consistency and conditional asymptotic Gaussianity of this new estimator and provide a method for inference.
Anna Mikusheva
Mikkel Sølvsten
2023-08
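The bias mechanism described in the abstract above can be illustrated with a small Monte Carlo. This is a hedged sketch under an assumed data-generating process (AR(1) regressors fed by the lagged error, so weak but not strict exogeneity holds), not the paper's actual design or its bias correction:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_ols_estimate(T=200, K=50, rho=0.9, gamma=0.5, reps=200):
    """Average OLS slope estimate when the true coefficients are all zero.

    Weak exogeneity holds: e_t is independent of x_t and its past,
    but e_t feeds into the next period's regressors x_{t+1}.
    """
    means = []
    for _ in range(reps):
        e = rng.standard_normal(T)
        X = np.zeros((T, K))
        for t in range(1, T):
            # lagged error enters future regressors: weak, not strict, exogeneity
            X[t] = rho * X[t - 1] + gamma * e[t - 1] + rng.standard_normal(K)
        y = e  # true coefficient vector is zero
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        means.append(beta.mean())
    return float(np.mean(means))

few, many = mean_ols_estimate(K=5), mean_ols_estimate(K=50)
print(few, many)  # with many autocorrelated regressors the estimate tends to drift from zero
```

Under strict exogeneity the same experiment would center the OLS estimates at zero for any K; here the drift comes from the correlation between the random design matrix and the error that the abstract describes.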
Weak Identification with Many Instruments
http://d.repec.org/n?u=RePEc:arx:papers:2308.09535&r=ecm
Linear instrumental variable regressions are widely used to estimate causal effects. Many instruments arise from the use of "technical" instruments and more recently from the empirical strategy of "judge design". This paper surveys and summarizes ideas from recent literature on estimation and statistical inference with many instruments. We discuss how to assess the strength of the instruments and how to conduct weak identification-robust inference under heteroscedasticity. We establish new results for a jack-knifed version of the Lagrange Multiplier (LM) test statistic. Many exogenous regressors often arise in practice from the need to ensure the validity of the instruments. We extend the weak-identification-robust tests to settings with both many exogenous regressors and many instruments. We propose a test that properly partials out many exogenous regressors while preserving the re-centering property of the jack-knife. The proposed tests have uniformly correct size and good power properties.
Anna Mikusheva
Liyang Sun
2023-08
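A concrete instance of the jack-knife idea in the many-instrument setting is the leave-one-out first stage (the JIVE estimator). The sketch below uses an assumed simulated design and illustrates the own-observation term that jack-knifing removes; it is not the paper's jack-knifed LM test:

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 500, 30
Z = rng.standard_normal((n, K))          # many instruments
u = rng.standard_normal(n)
x = Z @ np.full(K, 0.1) + 0.5 * u        # endogenous regressor
y = x + u                                # true coefficient is 1

P = Z @ np.linalg.solve(Z.T @ Z, Z.T)    # projection onto the instruments
h = np.diag(P)
# leave-one-out fitted values: drop each observation's own contribution,
# the source of the many-instrument bias in plain 2SLS
xhat_loo = (P @ x - h * x) / (1.0 - h)

beta_2sls = (P @ x) @ y / ((P @ x) @ x)
beta_jive = xhat_loo @ y / (xhat_loo @ x)
print(beta_2sls, beta_jive)
```

The same leave-one-out device is what gives jack-knifed test statistics their re-centering property when the number of instruments grows with the sample size.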
Highly Irregular Serial Correlation Tests
http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2023_2302&r=ecm
We develop tests for neglected serial correlation when the information matrix is repeatedly singular under the null. Specifically, we consider white noise against a multiplicative seasonal AR model, and a local-level model against a nesting UCARIMA one. Our proposals, which involve higher-order derivatives, are asymptotically equivalent to the likelihood ratio test but only require estimation under the null. Remarkably, we show that our proposed tests effectively check that certain autocorrelations of the observations are 0, so their asymptotic distribution is standard. We conduct Monte Carlo exercises that study their finite sample size and power properties, comparing them to alternative approaches.
Dante Amengual
Xinyue Bei
Enrique Sentana
Generalized extremum tests, higher-order identifiability, likelihood ratio test.
2023-05
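The reduction described in the abstract above (the proposed tests effectively check that certain autocorrelations of the observations are zero, so their distribution is standard) can be illustrated in its simplest form. This is a generic lag-1 autocorrelation check, not the paper's higher-order-derivative statistics:

```python
import numpy as np

rng = np.random.default_rng(2)

def first_autocorr_stat(x):
    """sqrt(T) times the lag-1 sample autocorrelation.

    Under a white-noise null this statistic is asymptotically standard
    normal, the kind of standard limiting distribution obtained when a
    test reduces to checking that an autocorrelation is zero.
    """
    x = x - x.mean()
    r1 = (x[1:] @ x[:-1]) / (x @ x)
    return np.sqrt(len(x)) * r1

z = first_autocorr_stat(rng.standard_normal(2000))
print(z)  # typically within +/- 2 under the white-noise null
```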
High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods
http://d.repec.org/n?u=RePEc:arx:papers:2308.16192&r=ecm
These lecture notes provide an overview of existing methodologies and recent developments for estimation and inference with high dimensional time series regression models. First, we present the main limit theory results for high dimensional dependent data, which are relevant both to covariance matrix structures and to dependent time series sequences. Second, we present the main aspects of the asymptotic theory related to time series regression models with many covariates. Third, we discuss various applications of statistical learning methodologies for time series analysis.
Christis Katsouris
2023-08
Subvector inference for Varying Coefficient Models with Partial Identification
http://d.repec.org/n?u=RePEc:tor:tecipa:tecipa-756&r=ecm
This paper develops inference methods for a general class of varying coefficient models defined by a set of moment inequalities and/or equalities, where unknown functional parameters are not necessarily point-identified. We propose an inferential procedure for a subvector of the parameters and establish the asymptotic validity of the resulting confidence sets uniformly over a broad family of data-generating processes. We also propose a specification test for the varying coefficient models considered in this paper. Monte Carlo studies show that the proposed methods work well in finite samples.
Shengjie Hong
Yu-Chin Hsu
Varying coefficient; Moment inequalities; Partial identification; Multiplier bootstrap
2023-08-31
Optimal Shrinkage Estimation of Fixed Effects in Linear Panel Data Models
http://d.repec.org/n?u=RePEc:arx:papers:2308.12485&r=ecm
Shrinkage methods are frequently used to estimate fixed effects to reduce the noisiness of the least squares estimators. However, widely used shrinkage estimators guarantee such noise reduction only under strong distributional assumptions. I develop an estimator for the fixed effects that obtains the best possible mean squared error within a class of shrinkage estimators. This class includes conventional shrinkage estimators and the optimality does not require distributional assumptions. The estimator has an intuitive form and is easy to implement. Moreover, the fixed effects are allowed to vary with time and to be serially correlated, and the shrinkage optimally incorporates the underlying correlation structure in this case. In such a context, I also provide a method to forecast fixed effects one period ahead.
Soonwoo Kwon
2023-08
Target PCA: Transfer Learning Large Dimensional Panel Data
http://d.repec.org/n?u=RePEc:arx:papers:2308.15627&r=ecm
This paper develops a novel method to estimate a latent factor model for a large target panel with missing observations by optimally using the information from auxiliary panel data sets. We refer to our estimator as target-PCA. Transfer learning from auxiliary panel data allows us to deal with a large fraction of missing observations and weak signals in the target panel. We show that our estimator is more efficient and can consistently estimate weak factors, which are not identifiable with conventional methods. We provide the asymptotic inferential theory for target-PCA under very general assumptions on the approximate factor model and missing patterns. In an empirical study of imputing data in a mixed-frequency macroeconomic panel, we demonstrate that target-PCA significantly outperforms all benchmark methods.
Junting Duan
Markus Pelger
Ruoxuan Xiong
2023-08
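The imputation task behind target-PCA can be illustrated with the baseline it builds on: EM-style PCA imputation of a single factor panel with missing entries. This is a minimal sketch under an assumed factor DGP; it uses no auxiliary panels and is not the target-PCA estimator itself:

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, r = 100, 50, 2
F = rng.standard_normal((T, r))
L = rng.standard_normal((N, r))
X = F @ L.T + 0.1 * rng.standard_normal((T, N))   # approximate factor panel
mask = rng.random((T, N)) < 0.2                    # 20% missing at random
Xobs = np.where(mask, np.nan, X)

# EM-style PCA imputation: fill missing cells, take a rank-r SVD,
# refill from the low-rank fit, and iterate
Xfill = np.where(mask, 0.0, Xobs)
for _ in range(50):
    U, s, Vt = np.linalg.svd(Xfill, full_matrices=False)
    Xhat = (U[:, :r] * s[:r]) @ Vt[:r]             # rank-r reconstruction
    Xfill = np.where(mask, Xhat, Xobs)             # keep observed cells fixed

err = np.sqrt(np.mean((Xfill[mask] - X[mask]) ** 2))
print(err)  # small relative to the unit-scale factor component
```

Target-PCA's contribution is to bring in auxiliary panels so that this kind of imputation still works with large missing fractions and weak factors, where the single-panel version above breaks down.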
SGMM: Stochastic Approximation to Generalized Method of Moments
http://d.repec.org/n?u=RePEc:arx:papers:2308.13564&r=ecm
We introduce a new class of algorithms, Stochastic Generalized Method of Moments (SGMM), for estimation and inference on (overidentified) moment restriction models. Our SGMM is a novel stochastic approximation alternative to the popular Hansen (1982) (offline) GMM, and offers fast and scalable implementation with the ability to handle streaming datasets in real time. We establish the almost sure convergence and the (functional) central limit theorem for the inefficient online 2SLS and the efficient SGMM. Moreover, we propose online versions of the Durbin-Wu-Hausman and Sargan-Hansen tests that can be seamlessly integrated within the SGMM framework. Extensive Monte Carlo simulations show that as the sample size increases, the SGMM matches the standard (offline) GMM in terms of estimation accuracy and gains in computational efficiency, indicating its practical value for both large-scale and online datasets. We demonstrate the efficacy of our approach by a proof of concept using two well-known empirical examples with large sample sizes.
Xiaohong Chen
Sokbae Lee
Yuan Liao
Myung Hwan Seo
Youngki Shin
Myunghyun Song
2023-08
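The stochastic-approximation idea can be shown in its simplest form: a just-identified online IV recursion, a toy analogue of the online 2SLS the paper studies (not the efficient SGMM, and with an assumed simulated DGP):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
z = rng.standard_normal(n)        # instrument
u = rng.standard_normal(n)
x = 0.8 * z + 0.5 * u             # endogenous regressor
y = 2.0 * x + u                   # true beta = 2

beta = 0.0
for t in range(n):
    step = 1.0 / (t + 10)         # Robbins-Monro step size
    # one stochastic-approximation step on the moment E[z(y - x*beta)] = 0,
    # processing each observation once, as in a streaming setting
    beta += step * z[t] * (y[t] - x[t] * beta)
print(beta)
```

Each observation is touched once and then discarded, which is what makes this class of algorithms attractive for streaming data; the offline 2SLS answer would require storing and inverting moment matrices over the full sample.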
James–Stein for the leading eigenvector
http://d.repec.org/n?u=RePEc:cdl:econwp:qt3mm9r9pp&r=ecm
Recent research identifies and corrects bias, such as excess dispersion, in the leading sample eigenvector of a factor-based covariance matrix estimated from a high-dimension low sample size (HL) data set. We show that eigenvector bias can have a substantial impact on variance-minimizing optimization in the HL regime, while bias in estimated eigenvalues may have little effect. We describe a data-driven eigenvector shrinkage estimator in the HL regime called "James-Stein for eigenvectors" (JSE) and its close relationship with the James-Stein (JS) estimator for a collection of averages. We show, both theoretically and with numerical experiments, that, for certain variance-minimizing problems of practical importance, efforts to correct eigenvalues have little value in comparison to the JSE correction of the leading eigenvector. When certain extra information is present, JSE is a consistent estimator of the leading eigenvector.
Goldberg, Lisa R
Kercheval, Alec N
Bias, Sample Size, asymptotic regime, shrinkage, factor model, optimization, covariance matrix
2023-01-10
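The abstract above notes the close relationship between JSE and the James-Stein estimator for a collection of averages. The latter, in its textbook form with known unit variance, is sketched below; the eigenvector version (JSE) is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(5)
p = 100
theta = rng.normal(0.0, 1.0, p)           # true means
x = theta + rng.standard_normal(p)        # one noisy estimate per mean

# classic James-Stein shrinkage toward the grand mean (unit noise variance)
xbar = x.mean()
s2 = np.sum((x - xbar) ** 2)
js = xbar + (1.0 - (p - 3) / s2) * (x - xbar)

mse_raw = np.mean((x - theta) ** 2)
mse_js = np.mean((js - theta) ** 2)
print(mse_raw, mse_js)  # shrinkage lowers the total squared error
```

JSE applies the same logic to the entries of the leading sample eigenvector in the HL regime, shrinking them toward a common target to remove excess dispersion.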
Simulation Experiments as a Causal Problem
http://d.repec.org/n?u=RePEc:arx:papers:2308.10823&r=ecm
Simulation methods are among the most ubiquitous methodological tools in statistical science. In particular, statisticians often use simulation to explore properties of statistical functionals in models for which developed statistical theory is insufficient or to assess finite sample properties of theoretical results. We show that the design of simulation experiments can be viewed from the perspective of causal intervention on a data generating mechanism. We then demonstrate the use of causal tools and frameworks in this context. Our perspective is agnostic to the particular domain of the simulation experiment, which increases the potential impact of our proposed approach. In this paper, we consider two illustrative examples. First, we re-examine a predictive machine learning example from a popular textbook designed to assess the relationship between mean function complexity and the mean-squared error. Second, we discuss a traditional causal inference methods problem, simulating the effect of unmeasured confounding on estimation, specifically to illustrate bias amplification. In both cases, applying causal principles and using graphical models with parameters and distributions as nodes in the spirit of influence diagrams can 1) make precise which estimand the simulation targets, 2) suggest modifications to better attain the simulation goals, and 3) provide scaffolding to discuss performance criteria for a particular simulation design.
Tyrel Stokes
Ian Shrier
Russell Steele
2023-08
Recovering Stars in Macroeconomics
http://d.repec.org/n?u=RePEc:een:camaaa:2023-43&r=ecm
Many key macroeconomic variables such as the NAIRU, potential GDP, and the neutral real rate of interest, which are needed for policy analysis, are latent. Collectively, these latent variables are known as 'stars' and are typically estimated using the Kalman filter or smoother from models that can be expressed in state space form. When these models contain more shocks than observed variables, they are 'short', and potentially create issues in recovering the star variable of interest from the observed data. Recovery issues can occur even when the model is correctly specified and its parameters are known. In this paper, we summarize the literature on shock recovery and demonstrate its implications for estimating stars in a number of widely used models in policy analysis. The ability of many popular and recent models to recover stars is shown to be limited. We suggest ways this can be addressed.
Daniel Buncic
Adrian Pagan
Tim Robinson
Kalman filter and smoother, State Space models, shock recovery, short systems, natural rate of interest, macroeconomic policy, Beveridge-Nelson decomposition
2023-09
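The filtering step at the heart of the paper above can be illustrated with the simplest member of the state-space class it discusses: a Kalman filter for the local-level model, with parameters assumed known and an assumed simulated latent 'star':

```python
import numpy as np

rng = np.random.default_rng(6)
T = 300
q, r = 0.1, 1.0                                       # state / measurement noise variances
mu = np.cumsum(np.sqrt(q) * rng.standard_normal(T))   # latent 'star' (random walk)
y = mu + np.sqrt(r) * rng.standard_normal(T)          # observable

# Kalman filter for the local-level model: mu_t = mu_{t-1} + eta_t, y_t = mu_t + eps_t
a, P = 0.0, 10.0                                      # diffuse-ish initial state
filt = np.empty(T)
for t in range(T):
    P = P + q                                         # predict
    k = P / (P + r)                                   # Kalman gain
    a = a + k * (y[t] - a)                            # update with the new observation
    P = (1.0 - k) * P
    filt[t] = a

rmse_filter = np.sqrt(np.mean((filt - mu) ** 2))
rmse_naive = np.sqrt(np.mean((y - mu) ** 2))
print(rmse_naive, rmse_filter)
```

This model has one shock per observable, so the state is recoverable; the paper's point is that 'short' systems, with more shocks than observables, need not share this property even when correctly specified.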
Quantile Time Series Regression Models Revisited
http://d.repec.org/n?u=RePEc:arx:papers:2308.06617&r=ecm
This article discusses recent developments in the literature on quantile time series models in the cases of stationary and nonstationary underlying stochastic processes.
Christis Katsouris
2023-08
Model-agnostic auditing: a lost cause?
http://d.repec.org/n?u=RePEc:ehl:lserod:120114&r=ecm
Tools for interpretable machine learning (IML) or explainable artificial intelligence (xAI) can be used to audit algorithms for fairness or other desiderata. In a black-box setting without access to the algorithm’s internal structure an auditor may be limited to methods that are model-agnostic. These methods have severe limitations with important consequences for outcomes such as fairness. Among model-agnostic IML methods, visualizations such as the partial dependence plot (PDP) or individual conditional expectation (ICE) plots are popular and useful for displaying qualitative relationships. Although we focus on fairness auditing with PDP/ICE plots, the consequences we highlight generalize to other auditing or IML/xAI applications. This paper questions the validity of auditing in high-stakes settings with contested values or conflicting interests if the audit methods are model-agnostic.
Hansen, Sakina
Loftus, Joshua
artificial intelligence; black-box auditing; causal models; CEUR Workshop Proceedings (CEUR-WS.org); counterfactual fairness; individual conditional expectation; machine learning; partial dependence plots; supervised learning; visualization
2023-07-16
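A partial dependence plot of the kind the paper above audits with is easy to compute from black-box access alone, and the computation also shows a limitation the paper highlights: averaging over the other features can hide interactions. A minimal sketch with an assumed toy 'black box':

```python
import numpy as np

def partial_dependence(model, X, j, grid):
    """PDP values: average prediction with feature j forced to each grid value."""
    out = []
    for v in grid:
        Xv = X.copy()
        Xv[:, j] = v               # intervene on feature j only
        out.append(float(model(Xv).mean()))
    return np.array(out)

# toy black box: a pure interaction that the one-dimensional PDP averages away
model = lambda X: X[:, 0] * X[:, 1]
rng = np.random.default_rng(7)
X = rng.standard_normal((1000, 2))
pdp0 = partial_dependence(model, X, 0, np.array([-1.0, 0.0, 1.0]))
print(pdp0)  # roughly flat: the PDP conceals the dependence on feature 0
```

An auditor relying on this nearly flat curve could wrongly conclude the model ignores feature 0, which is the kind of model-agnostic blind spot the paper questions.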
Modeling Event Studies with Heterogeneous Treatment Effects
http://d.repec.org/n?u=RePEc:fip:fedawp:96718&r=ecm
This paper develops a simple approach to overcome the shortcomings of using a standard, single-treatment-effect event study to assess the ability of an empirical model to measure heterogeneous treatment effects. Equally important, we discuss how the standard errors reported in a typical event-study analysis for the posttreatment event-time effects are, without additional information, of limited use for assessing posttreatment variations in the treatment effects. The simple reformulation of the standard event-study approach described and illustrated with artificially constructed data in this paper overcomes the limitations of conventional event-study analyses.
Laura Argys
Thomas Mroz
M. Melinda Pitts
event studies; heterogeneous treatment effects
2023-09-07
What is a relevant control?: An algorithmic proposal
http://d.repec.org/n?u=RePEc:aoz:wpaper:269&r=ecm
Individualized inference (or prediction) is an approach to data analysis that is increasingly relevant thanks to the availability of large datasets. In this paper, we present an algorithm that starts by detecting the relevant observations for a given query. Further refinement of that subsample is obtained by selecting the observations with the largest Shapley values. The probability distribution over this selection allows us to generate synthetic controls, which in turn can be used to generate a robust inference (or prediction). Data collected from repeating this procedure for different queries provides a deeper understanding of the general process that generates the data.
Fernando Delbianco
Fernando Tohmé
Individualized inference; Relevance selection and classification; Synthetic controls
2023-08
Spatial autoregressive fractionally integrated moving average model
http://d.repec.org/n?u=RePEc:han:dpaper:dp-712&r=ecm
In this paper, we introduce the concept of fractional integration for spatial autoregressive models. We show that the range of the dependence can be spatially extended or diminished by introducing a further fractional integration parameter to spatial autoregressive moving average models (SARMA). This new model is called the spatial autoregressive fractionally integrated moving average model, briefly sp-ARFIMA. We show the relation to time-series ARFIMA models and also to (higher-order) spatial autoregressive models. Moreover, an estimation procedure based on the maximum-likelihood principle is introduced and analysed in a series of simulation studies. Finally, the use of the model is illustrated by an empirical example of atmospheric fine particles, so-called aerosol optical thickness, which is important in weather, climate and environmental science.
Otto, Philipp
Sibbertsen, Philipp
Spatial ARFIMA; spatial fractional integration; long-range dependence; aerosol optical depth
2023-09
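The fractional integration the sp-ARFIMA model carries over to the spatial lag can be made concrete with the time-series building block: the coefficients of the fractional difference operator (1 - L)^d. The spatial operator itself is not reproduced here; this is only the standard recursion:

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients of the fractional difference operator (1 - L)^d,
    via the standard recursion pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

print(frac_diff_weights(1.0, 4))  # [ 1. -1.  0.  0.] -- ordinary first differencing
print(frac_diff_weights(0.3, 4))  # slowly decaying weights: long-range dependence
```

For non-integer d the weights decay hyperbolically rather than truncating, which is the source of the long-range dependence that the spatial version of the operator extends across locations.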
Identification and Estimation of Demand Models with Endogenous Product Entry and Exit
http://d.repec.org/n?u=RePEc:tor:tecipa:tecipa-755&r=ecm
This paper deals with the endogeneity of firms' entry and exit decisions in demand estimation. Product entry decisions lack a single crossing property in terms of demand unobservables, which causes the inconsistency of conventional methods dealing with selection. We present a novel and straightforward two-step approach to estimate demand while addressing endogenous product entry. In the first step, our method estimates a finite mixture model of product entry accommodating latent market types. In the second step, it estimates demand controlling for the propensity scores of all latent market types. We apply this approach to data from the airline industry.
Victor Aguirregabiria
Alessandro Iaria
Senay Sokullu
Demand for differentiated product; Endogenous product availability; Selection bias; Market entry and exit; Multiple equilibria; Identification; Estimation; Demand for airlines
2023-08-27
Scalable Estimation of Multinomial Response Models with Uncertain Consideration Sets
http://d.repec.org/n?u=RePEc:arx:papers:2308.12470&r=ecm
A standard assumption in the fitting of unordered multinomial response models for J mutually exclusive nominal categories, on cross-sectional or longitudinal data, is that the responses arise from the same set of J categories between subjects. However, when responses measure a choice made by the subject, it is more appropriate to assume that the distribution of multinomial responses is conditioned on a subject-specific consideration set, where this consideration set is drawn from the power set of {1, 2, ..., J}. Because the cardinality of this power set is exponential in J, estimation is infeasible in general. In this paper, we provide an approach to overcoming this problem. A key step in the approach is a probability model over consideration sets, based on a general representation of probability distributions on contingency tables. Although the support of this distribution is exponentially large, the posterior distribution over consideration sets given parameters is typically sparse, and is easily sampled as part of an MCMC scheme that iterates sampling of subject-specific consideration sets given parameters, followed by parameters given consideration sets. The effectiveness of the procedure is documented in simulated longitudinal data sets with J=100 categories and real data from the cereal market with J=73 brands.
Siddhartha Chib
Kenichi Shimizu
2023-08
The Dispersion Bias
http://d.repec.org/n?u=RePEc:cdl:econwp:qt4kt5g2x3&r=ecm
We identify and correct excess dispersion in the leading eigenvector of a sample covariance matrix when the number of variables vastly exceeds the number of observations. Our correction is data-driven, and it materially diminishes the substantial impact of estimation error on weights and risk forecasts of minimum variance portfolios. We quantify that impact with a novel metric, the optimization bias, which has a positive lower bound prior to correction and tends to zero almost surely after correction. Our analysis sheds light on aspects of how estimation error corrupts an estimated covariance matrix and is transmitted to portfolios via quadratic optimization.
Goldberg, Lisa R
Papanicolaou, Alex
Shkolnik, Alex
dispersion bias, optimization bias, eigenvector, minimum variance portfolio, covariance matrix, shrinkage, Applied Mathematics, Statistics, Banking, Finance and Investment
2022-06-01
Black-Litterman, Bayesian Shrinkage, and Factor Models in Portfolio Selection: You Can Have It All
http://d.repec.org/n?u=RePEc:arx:papers:2308.09264&r=ecm
Mean-variance analysis is widely used in portfolio management to identify the best portfolio that makes an optimal trade-off between expected return and volatility. Yet, this method has its limitations, notably its vulnerability to estimation errors and its reliance on historical data. While shrinkage estimators and factor models have been introduced to improve estimation accuracy through bias-variance trade-offs, and the Black-Litterman model has been developed to integrate investor opinions, a unified framework combining the three approaches has been lacking. Our study debuts a Bayesian blueprint that fuses shrinkage estimation with view inclusion, conceptualizing both as Bayesian updates. This model is then applied within the context of Fama-French factor models, thereby integrating the advantages of each methodology. Finally, through a comprehensive empirical study in the US equity market spanning a decade, we show that the model outperforms both the simple $1/N$ portfolio and the optimal portfolios based on sample estimators.
Kwong Yu Chong
2023-08
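The abstract's central point, that shrinkage and view inclusion are both Bayesian updates, reduces to one piece of conjugate-normal algebra. The sketch below shows it with hypothetical numbers; it is not the paper's full Black-Litterman-with-factors model:

```python
import numpy as np

def posterior_mean(prior_mu, prior_var, sample_mu, sample_var):
    """Precision-weighted average: the conjugate-normal posterior mean.

    The same algebra describes shrinkage toward a prior target and
    blending an investor 'view' with a data-based estimate.
    """
    w = prior_var / (prior_var + sample_var)   # weight on the sample mean
    return (1.0 - w) * prior_mu + w * sample_mu

# hypothetical numbers: a factor-implied prior mean of 5% held with more
# confidence than a noisy 12% sample mean
post = posterior_mean(0.05, 0.02**2, 0.12, 0.04**2)
print(post)  # 0.064: the sample mean is pulled 80% of the way to the prior
```

Reading the prior as an equilibrium or factor-implied return makes this a Black-Litterman update; reading it as a fixed shrinkage target makes it a shrinkage estimator, which is the unification the paper formalizes.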
Spatial and Spatiotemporal Volatility Models: A Review
http://d.repec.org/n?u=RePEc:arx:papers:2308.13061&r=ecm
Spatial and spatiotemporal volatility models are a class of models designed to capture spatial dependence in the volatility of spatial and spatiotemporal data. Spatial dependence in the volatility may arise due to spatial spillovers among locations; that is, if two locations are in close proximity, they can exhibit similar volatilities. In this paper, we aim to provide a comprehensive review of the recent literature on spatial and spatiotemporal volatility models. We first briefly review time series volatility models and their multivariate extensions to motivate their spatial and spatiotemporal counterparts. We then review various spatial and spatiotemporal volatility specifications proposed in the literature along with their underlying motivations and estimation strategies. Through this analysis, we effectively compare all models and provide practical recommendations for their appropriate usage. We highlight possible extensions and conclude by outlining directions for future research.
Philipp Otto
Osman Doğan
Süleyman Taşpınar
Wolfgang Schmid
Anil K. Bera
2023-08
GARHCX-NoVaS: A Model-free Approach to Incorporate Exogenous Variables
http://d.repec.org/n?u=RePEc:arx:papers:2308.13346&r=ecm
In this work, we further explore the forecasting ability of a recently proposed normalizing and variance-stabilizing (NoVaS) transformation after incorporating exogenous variables. In practice, especially in the area of financial econometrics, extra knowledge such as fundamentals- and sentiments-based information could be beneficial for improving the prediction accuracy of market volatility if it is incorporated into the forecasting process. In a classical approach, GARCHX-type methods are usually applied to include the exogenous variables. Being a model-free prediction method, NoVaS has been shown to be more accurate and stable than classical GARCH-type methods. We are interested in whether the novel NoVaS method can sustain its superiority after exogenous covariates are taken into account. We develop the NoVaS transformation based on the GARCHX model and then propose the corresponding prediction procedure in the presence of exogenous variables. Simulation studies verify that the NoVaS method still outperforms traditional methods, especially for long-term time-aggregated predictions.
Kejin Wu
Sayar Karmakar
2023-08
A Coefficient of Variation for Multivariate Ordered Categorical Outcomes.
http://d.repec.org/n?u=RePEc:tor:tecipa:tecipa-757&r=ecm
Comparing the relative variation of ordinal variates defined on diverse populations is challenging. Pearson's Coefficient of Variation or its inverse (the Sharpe Ratio), each used extensively for comparing relative variation or risk-tempered location in cardinal paradigms, cannot be employed in ordinal data environments unless cardinal scale is attributed to ordered categories. Unfortunately, due to the scale dependencies of the Coefficient of Variation's denominator and numerator, such arbitrary attribution can result in equivocal comparisons. Here, based upon the notion of probabilistic distance, unequivocal, scale-independent Coefficient of Variation and Sharpe Ratio analogues for use with multivariate ordered categorical data are introduced and exemplified in an analysis of Canadian human resource distributions.
Gordon Anderson
ordinal outcomes, variation coefficient, Sharpe Ratio
2023-09-05
Unpacking P-Hacking and Publication Bias
http://d.repec.org/n?u=RePEc:nbr:nberwo:31548&r=ecm
We use unique data from journal submissions to identify and unpack publication bias and p-hacking. We find that initial submissions display significant bunching, suggesting the distribution among published statistics cannot be fully attributed to a publication bias in peer review. Desk-rejected manuscripts display greater heaping than those sent for review; i.e., marginally significant results are more likely to be desk rejected. Reviewer recommendations, in contrast, are positively associated with statistical significance. Overall, the peer review process has little effect on the distribution of test statistics. Lastly, we track rejected papers and present evidence that the prevalence of publication bias is perhaps not as prominent as feared.
Abel Brodeur
Scott E. Carrell
David N. Figlio
Lester R. Lusher
2023-08