New Economics Papers on Econometrics
By: | Tassos Magdalinos; Katerina Petrova |
Abstract: | A unified theory of estimation and inference is developed for an autoregressive process with root in (-∞, ∞) that includes the stationary, local-to-unity, explosive and all intermediate regions. The discontinuity of the limit distribution of the t-statistic outside the stationary region and its dependence on the distribution of the innovations in the explosive regions (-∞, -1) ∪ (1, ∞) are addressed simultaneously. A novel estimation procedure, based on a data-driven combination of a near-stationary and a mildly explosive artificially constructed instrument, delivers mixed-Gaussian limit theory and gives rise to an asymptotically standard normal t-statistic across all autoregressive regions. The resulting hypothesis tests and confidence intervals are shown to have correct asymptotic size (uniformly over the space of autoregressive parameters and the space of innovation distribution functions) in autoregressive, predictive regression and local projection models, thereby establishing a general and unified framework for inference with autoregressive processes. Extensive Monte Carlo simulation shows that the proposed methodology exhibits very good finite sample properties over the entire autoregressive parameter space (-∞, ∞) and compares favorably to existing methods within their parametric (-1, 1] validity range. We demonstrate how our procedure can be used to construct valid confidence intervals in standard epidemiological models as well as to test in real-time for speculative bubbles in the price of the Magnificent Seven tech stocks. |
Keywords: | uniform inference; central limit theory (CLT); autoregression; predictive regressions; instrumentation; mixed-Gaussianity; t-statistic; confidence intervals |
JEL: | C12 C22 |
Date: | 2025–04–01 |
URL: | https://d.repec.org/n?u=RePEc:fip:fednsr:99905 |
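To make the instrumentation idea above concrete, here is a minimal sketch of IV estimation of an AR(1) coefficient with an artificially constructed, mildly integrated instrument, in the spirit of IVX-type procedures. Only a single near-stationary instrument is built; the paper's data-driven combination with a mildly explosive instrument and its uniform-validity refinements are not reproduced, and the tuning constants `c` and `a` are illustrative choices.

```python
# Minimal sketch: IV estimation of an AR(1) with an artificially constructed
# mildly integrated instrument (IVX-style). The data-driven instrument
# combination of the paper is NOT reproduced; `c` and `a` are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(rho, n, sigma=1.0):
    """Simulate y_t = rho * y_{t-1} + e_t with y_0 = 0."""
    e = rng.normal(scale=sigma, size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = rho * y[t - 1] + e[t]
    return y

def ivx_t_stat(y, c=1.0, a=0.9):
    """IV estimate and t-statistic for rho, using the instrument
    z_t = rho_n * z_{t-1} + dy_t with rho_n = 1 - c / n**a."""
    n = len(y)
    dy = np.diff(y)                       # innovation proxy: first differences
    rho_n = 1.0 - c / n**a                # near-stationary instrument root
    z = np.zeros(n - 1)
    for t in range(n - 1):                # build the instrument recursively
        z[t] = rho_n * (z[t - 1] if t > 0 else 0.0) + dy[t]
    y_lag, y_cur = y[1:-1], y[2:]         # regress y_t on y_{t-1}
    z_lag = z[:-1]
    rho_iv = np.dot(z_lag, y_cur) / np.dot(z_lag, y_lag)
    resid = y_cur - rho_iv * y_lag
    sigma2 = np.mean(resid**2)
    se = np.sqrt(sigma2 * np.dot(z_lag, z_lag)) / abs(np.dot(z_lag, y_lag))
    return rho_iv, (rho_iv - 1.0) / se    # t-statistic for H0: rho = 1

y = simulate_ar1(rho=0.98, n=500)
print(ivx_t_stat(y))
```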
By: | Alain-Philippe Fortin (University of Geneva); Patrick Gagliardini (University of Lugano; Swiss Finance Institute); O. Scaillet (Swiss Finance Institute - University of Geneva) |
Abstract: | We derive optimal maximin tests for error sphericity in latent factor analysis of short panels. We rely on a Generalized Method of Moments setting with optimal weighting under a large cross-sectional dimension n and a fixed time series dimension T. We outline the asymptotic distributions of the estimators as well as the asymptotic maximin optimality of the Wald, Lagrange Multiplier, and Likelihood Ratio-type tests. The characterisation of optimality relies on finding the limit Gaussian experiment in strongly identified GMM models under a block-dependence structure and unobserved heterogeneity. We reject sphericity in an empirical application to a large cross-section of U.S. stocks, which casts doubt on the validity of routinely applying Principal Component Analysis to short panels of monthly financial returns. |
Keywords: | Latent factor analysis, Generalized Method of Moments, maximin test, Gaussian experiment, fixed effects, panel data, sphericity, large n and fixed T asymptotics, equity returns |
JEL: | C12 C23 C38 C58 G12 |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2527 |
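For a concrete point of comparison, the block below sketches the classical likelihood-ratio test of sphericity for Gaussian data, which needs many more time periods than variables. It is a textbook benchmark, not the maximin GMM test of the paper, whose point is precisely the large-n, fixed-T short-panel setting in which this benchmark breaks down.

```python
# Baseline sketch only: classical likelihood-ratio test of sphericity
# (H0: covariance = sigma^2 * I) for Gaussian data with T much larger than n.
# This is NOT the paper's maximin GMM test for short panels.
import numpy as np
from scipy import stats

def lr_sphericity_test(X):
    """X: T x n data matrix (rows = time periods). Returns (statistic, p-value)."""
    T, n = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / T                          # MLE of the covariance matrix
    stat = -T * (np.linalg.slogdet(S)[1] - n * np.log(np.trace(S) / n))
    df = n * (n + 1) // 2 - 1                  # number of restrictions under H0
    return stat, stats.chi2.sf(stat, df)

rng = np.random.default_rng(1)
X_null = rng.normal(size=(200, 5))                         # spherical errors
X_alt = X_null @ np.diag([1.0, 1.0, 1.0, 2.0, 3.0])        # heteroskedastic errors
print(lr_sphericity_test(X_null))                          # should not reject
print(lr_sphericity_test(X_alt))                           # should reject
```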
By: | Stefano DellaVigna; Guido Imbens; Woojin Kim; David M. Ritzwoller |
Abstract: | Empirical research in economics often examines the behavior of agents located in a geographic space. In such cases, statistical inference is complicated by the interdependence of economic outcomes across locations. A common approach to account for this dependence is to cluster standard errors based on a predefined geographic partition. A second strategy is to model dependence in terms of the distance between units. Dependence, however, does not necessarily stop at borders and is typically not determined by distance alone. This paper introduces a method that leverages observations of multiple outcomes to adjust standard errors for cross-sectional dependence. Specifically, a researcher, while interested in a particular outcome variable, often observes dozens of other variables for the same units. We show that these outcomes can be used to estimate dependence under the assumption that the cross-sectional correlation structure is shared across outcomes. We develop a procedure, which we call Thresholding Multiple Outcomes (TMO), that uses this estimate to adjust standard errors in a given regression setting. We show that adjustments of this form can lead to sizable reductions in the bias of standard errors in calibrated U.S. county-level regressions. Re-analyzing nine recent papers, we find that the proposed correction can make a substantial difference in practice. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.13295 |
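The sketch below illustrates the general idea behind the TMO adjustment as described in the abstract: residualize many auxiliary outcomes, estimate which pairs of units are correlated, zero out weak correlations, and plug the result into a sandwich variance. The thresholding rule, scaling, and inference details are simplified placeholders rather than the authors' procedure.

```python
# Illustrative sketch of the "Thresholding Multiple Outcomes" (TMO) idea:
# use residuals from many auxiliary outcomes to estimate cross-unit
# dependence, threshold weak correlations to zero, and plug the result into
# a sandwich variance for the regression of interest. Simplified placeholder,
# not the authors' exact procedure.
import numpy as np

def tmo_standard_error(x, y, Y_aux, threshold=0.2):
    """OLS slope of y on x (n units) with a dependence-adjusted SE.
    Y_aux is an n x K matrix of auxiliary outcomes for the same units."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ beta                                   # residuals of interest

    # Residualize each auxiliary outcome on the same regressors, then use the
    # K outcomes to estimate an (approximate) n x n cross-unit correlation.
    U_aux = Y_aux - X @ np.linalg.lstsq(X, Y_aux, rcond=None)[0]
    U_std = U_aux / U_aux.std(axis=1, keepdims=True)
    R = U_std @ U_std.T / U_aux.shape[1]
    R_thr = np.where(np.abs(R) >= threshold, R, 0.0)
    np.fill_diagonal(R_thr, 1.0)

    Omega = np.outer(u, u) * R_thr                     # diag(u) @ R_thr @ diag(u)
    bread = np.linalg.inv(X.T @ X)
    V = bread @ X.T @ Omega @ X @ bread                # sandwich variance
    return beta[1], np.sqrt(V[1, 1])

rng = np.random.default_rng(2)
n, K = 300, 40
common = np.repeat(rng.normal(size=30), 10)            # 30 "regions" of 10 units
x = rng.normal(size=n) + common
y = 0.5 * x + common + rng.normal(size=n)
Y_aux = common[:, None] + rng.normal(size=(n, K))      # auxiliary outcomes share the dependence
print(tmo_standard_error(x, y, Y_aux))
```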
By: | Xixi Hu; Yi Qian; Hui Xie |
Abstract: | Given the ubiquitous presence of endogenous regressors and the challenges in finding good instruments to overcome the endogeneity problem, a forefront of recent research is the development and application of endogeneity correction methods without requiring instruments. In this article, we formulate the regressor endogeneity problem using a novel conditional copula endogeneity model to capture the regressor-error dependence unexplained by exogenous regressors. The model relaxes the key assumption of a Gaussian copula regressor-error dependence structure and eliminates unnecessary modeling of regressors. Under the model, we develop an instrument-free two-stage nonparametric copula endogeneity control function approach (2sCOPE-np), which generalizes existing copula endogeneity correction methods and minimizes the assumptions in the first-stage auxiliary model for endogenous regressors. Specifically, 2sCOPE-np employs robust model-free kernel estimates of copula control functions. We elucidate and demonstrate the robustness and broad applicability of 2sCOPE-np compared to existing copula endogeneity correction methods. Simulation studies further demonstrate that 2sCOPE-np outperforms existing methods. We illustrate the use of 2sCOPE-np in an empirical application to store sales demand estimation. |
JEL: | C0 C01 C10 C14 C51 |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33607 |
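As background for the copula control-function idea, the block below sketches the simplest instrument-free correction of this type, the Gaussian-copula generated regressor in the tradition of Park and Gupta (2012): add the normal scores of the endogenous regressor to the outcome regression. 2sCOPE-np generalizes this step with kernel-based, model-free control functions, which are not reproduced here; the data-generating process is illustrative.

```python
# Sketch of the classic Gaussian-copula control-function correction, shown
# only as the baseline that the instrument-free 2sCOPE-np approach generalizes.
import numpy as np
from scipy import stats

def copula_control_function_ols(p, y, exog=None):
    """Regress y on endogenous regressor p, adding the generated regressor
    Phi^{-1}(F_hat(p)) to absorb regressor-error dependence."""
    n = len(p)
    ranks = stats.rankdata(p) / (n + 1)          # empirical CDF, kept in (0, 1)
    p_star = stats.norm.ppf(ranks)               # normal scores of p
    cols = [np.ones(n), p, p_star]
    if exog is not None:
        cols.append(exog)
    X = np.column_stack(cols)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return beta                                   # beta[1] is the corrected slope

# Simulated endogeneity: a skewed regressor correlated with the error.
rng = np.random.default_rng(3)
n = 2000
e = rng.normal(size=n)
p = np.exp(0.8 * e + rng.normal(size=n))         # non-normal, endogenous regressor
y = 1.0 + 2.0 * p + e
print("OLS slope:      ", np.polyfit(p, y, 1)[0])                  # biased away from 2
print("Copula-CF slope:", copula_control_function_ols(p, y)[1])    # close to 2
```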
By: | Jiankun Chen (School of Economics, University of International Business and Economics, Beijing, China); Yanli Lin (University of Western Australia Business School, Perth, Australia); Yang Yang (Ma Yinchu School of Economics, Tianjin University, Tianjin, China) |
Abstract: | This paper introduces a model featuring two hierarchically structured layers of spatial or social networks in a cross-sectional setting. Individuals interact within groups, while groups also interact with one another, generating network dependence at both the individual and group levels. The network structures can be flexibly specified using general measures of proximity. The model accommodates individual random effects with heteroskedasticity, as well as unobserved random group effects. Given the complex error structure, we consider a Generalized Method of Moments (GMM) approach for estimation. The linear moment conditions exploit exogenous variations in individual and group characteristics to identify the network parameters at both levels. To enhance identification when linear moments are weak, we also propose a new set of quadratic moments that are robust to heteroskedasticity. Building on the method of Lin and Lee (2010), we can consistently estimate the variance-covariance (VC) matrix of these heteroskedasticity-robust moments, enabling the construction of a GMM estimator with optimally weighted moments. The asymptotic properties of both a generic and the "optimal" GMM estimator are derived. Monte Carlo simulations demonstrate that the proposed estimators perform well in finite samples. The model is applicable to a variety of social and economic contexts where network effects at two distinct levels are of particular interest, with peer effects among students within the same class and spillovers between classes serving as a leading example. |
Keywords: | Hierarchical networks, Spatial model, Social interaction, Random effect, GMM |
JEL: | C31 C51 |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:uwa:wpaper:25-03 |
By: | Souvik Banerjee; Anirban Basu; Shubham Das |
Abstract: | Causal inference methods are widely used in empirical research; however, there is a paucity of evidence on the properties of shared latent factor estimators in the presence of a contaminated instrumental variable (IV) when a strong IV may not be available. We present a theoretical formulation to depict how the strength and degree of contamination of the IV simultaneously determine the optimal choice of estimator. We perform Monte Carlo simulations with four outcome variables and an endogenous treatment variable, with sample sizes of 1000 and 2000, and for 1000 iterations, to compare the finite sample properties of the OLS, 2SLS, Shared Latent Factor without IV (SLF), and Shared Latent Factor with IV (SLF+IV) estimators. Finally, we demonstrate the applicability of the proposed estimators to study the causal impact of maternal parity on various maternal and child health indicators: child’s height-for-age percentile, child’s weight-for-age percentile, child’s haemoglobin count, and mother’s haemoglobin count, using data from the 2019-21 Round 5 of the National Family Health Survey (NFHS-5) from India. Our simulation results indicate that for a given degree of contamination of the IV, there exists a threshold strength of the IV, such that the SLF+IV estimator has a lower (greater) bias than the SLF estimator when the strength of the IV lies above (below) that threshold. The empirical results suggest that a lower parity is associated with higher height-for-age and weight-for-age percentiles and haemoglobin counts in children and a higher haemoglobin count in mothers. |
JEL: | C3 C31 I1 J13 |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33620 |
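The following Monte Carlo sketch only illustrates the strength-versus-contamination trade-off that the paper formalizes, using OLS and a simple IV estimator; the shared latent factor estimators (SLF, SLF+IV) are not implemented, and the data-generating process and parameter values are illustrative.

```python
# Illustrative Monte Carlo of the strength/contamination trade-off: a
# "contaminated" instrument is correlated with the structural error, so IV is
# biased too, and whether it beats OLS depends on instrument strength.
import numpy as np

rng = np.random.default_rng(4)

def simulate_bias(strength, contamination, n=2000, reps=500, beta=1.0):
    ols_b, iv_b = [], []
    for _ in range(reps):
        z = rng.normal(size=n)
        u = rng.normal(size=n)
        e = contamination * z + rng.normal(size=n)      # error "contaminated" by z
        d = strength * z + u + 0.8 * e                  # endogenous treatment
        y = beta * d + e
        ols = np.cov(d, y)[0, 1] / np.var(d, ddof=1)
        iv = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]    # simple IV (Wald) estimator
        ols_b.append(ols - beta)
        iv_b.append(iv - beta)
    return np.mean(ols_b), np.mean(iv_b)                # (OLS bias, IV bias)

for strength in (0.2, 1.0):
    for contamination in (0.0, 0.3):
        print(strength, contamination, simulate_bias(strength, contamination))
```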
By: | Bulat Gafarov; Matthias Meier; Jos\'e Luis Montiel Olea |
Abstract: | We study the properties of projection inference for set-identified Structural Vector Autoregressions. A nominal 1-α projection region collects the structural parameters that are compatible with a 1-α Wald ellipsoid for the model's reduced-form parameters (autoregressive coefficients and the covariance matrix of residuals). We show that projection inference can be applied to a general class of stationary models, is computationally feasible, and -- as the sample size grows large -- produces regions for the structural parameters and their identified set with both frequentist coverage and robust Bayesian credibility of at least 1-α. A drawback of the projection approach is that both coverage and robust credibility may be strictly above their nominal level. Following the work of Kaido, Molinari, and Stoye (2014), we "calibrate" the radius of the Wald ellipsoid to guarantee that -- for a given posterior on the reduced-form parameters -- the robust Bayesian credibility of the projection method is exactly 1-α. If the bounds of the identified set are differentiable, our calibrated projection also covers the identified set with probability 1-α. We illustrate the main results of the paper using the demand/supply model for the U.S. labor market in Baumeister and Hamilton (2015). |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.14106 |
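The toy sketch below illustrates the projection idea for a static, bivariate, sign-restricted model: collect the impact response over all reduced-form covariance matrices inside a Wald ellipsoid combined with all rotations satisfying the sign restriction. The reduced-form estimate, its sampling covariance, and the restriction are made-up numbers, and the calibrated-radius refinement of the paper is not implemented.

```python
# Toy sketch of projection inference for a set-identified impact response.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Reduced form: vech(Sigma) = (s11, s21, s22). The point estimate and its
# sampling covariance below are made-up numbers standing in for a first stage.
vech_hat = np.array([1.0, 0.3, 0.8])
V_hat = np.diag([0.02, 0.01, 0.02])
chi2_crit = stats.chi2.ppf(0.95, df=3)            # squared radius of the 95% Wald ellipsoid

def impact_response_bounds(vech, n_grid=360):
    """Identified set (min, max) of the impact response of variable 1 to shock 1,
    under the sign restriction that both variables respond non-negatively."""
    s11, s21, s22 = vech
    Sigma = np.array([[s11, s21], [s21, s22]])
    if np.any(np.linalg.eigvalsh(Sigma) <= 0):
        return None                                # not a valid covariance matrix
    C = np.linalg.cholesky(Sigma)
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid)
    cols = C @ np.vstack([np.cos(theta), np.sin(theta)])   # first column of C @ Q
    ok = (cols >= 0).all(axis=0)                   # impose the sign restriction
    return (cols[0, ok].min(), cols[0, ok].max()) if ok.any() else None

# Projection: union of identified sets over reduced-form values in the ellipsoid.
L = np.linalg.cholesky(V_hat)
lo, hi = np.inf, -np.inf
for _ in range(2000):
    z = rng.normal(size=3)
    z *= np.sqrt(chi2_crit) * rng.uniform() ** (1 / 3) / np.linalg.norm(z)
    bounds = impact_response_bounds(vech_hat + L @ z)
    if bounds is not None:
        lo, hi = min(lo, bounds[0]), max(hi, bounds[1])
print("projection interval for the impact response:", (lo, hi))
```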
By: | Marcell T. Kurbucz; Betsabé Pérez Garrido; Antal Jakovác |
Abstract: | This paper introduces the Eigenvalue-Based Randomness (EBR) test - a novel approach rooted in the Tracy-Widom law from random matrix theory - and applies it to the context of residual analysis in panel data models. Unlike traditional methods, which target specific issues like cross-sectional dependence or autocorrelation, the EBR test simultaneously examines multiple assumptions by analyzing the largest eigenvalue of a symmetrized residual matrix. Monte Carlo simulations demonstrate that the EBR test is particularly robust in detecting not only standard violations such as autocorrelation and linear cross-sectional dependence (CSD) but also more intricate non-linear and non-monotonic dependencies, making it a comprehensive and highly flexible tool for enhancing the reliability of panel data analyses. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.05297 |
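A rough sketch of the eigenvalue idea described above: symmetrize and standardize a square residual matrix, take its largest eigenvalue, and compare it with the null distribution for an i.i.d. matrix. For simplicity the null is simulated rather than evaluated against Tracy-Widom quantiles, and the exact residual construction and standardization of the EBR test may differ from this toy version.

```python
# Rough sketch of an eigenvalue-based residual randomness check: compare the
# largest eigenvalue of a symmetrized, standardized residual matrix with its
# simulated distribution under i.i.d. normal residuals.
import numpy as np

rng = np.random.default_rng(6)

def largest_eig_of_symmetrized(U):
    """U: N x N residual matrix; returns lambda_max of (U + U') / sqrt(2N)
    after standardizing the entries of U."""
    N = U.shape[0]
    U = (U - U.mean()) / U.std()
    S = (U + U.T) / np.sqrt(2 * N)
    return np.linalg.eigvalsh(S)[-1]

def ebr_style_pvalue(U, n_sim=500):
    stat = largest_eig_of_symmetrized(U)
    null = [largest_eig_of_symmetrized(rng.normal(size=U.shape))
            for _ in range(n_sim)]
    return stat, np.mean(np.array(null) >= stat)

# Square residual matrices: i.i.d. versus a strong common cross-sectional factor.
N = 60
iid_resid = rng.normal(size=(N, N))
factor_resid = iid_resid + 0.5 * np.outer(rng.normal(size=N), np.ones(N))
print("iid residuals:          ", ebr_style_pvalue(iid_resid))      # large p-value expected
print("common-factor residuals:", ebr_style_pvalue(factor_resid))   # small p-value expected
```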
By: | F. Marta L. Di Lascio (Free University of Bozen-Bolzano, Italy); Aurora Gatto (Free University of Bozen-Bolzano, Italy) |
Abstract: | Missing values in multivariate dependent variables may occur during data collection, requiring imputation methods capable of handling complex intervariable relationships. We propose a nonparametric copula-based method for imputing dependent multivariate missing data, called NPCoImp. By leveraging the conditional empirical beta copula of the missing variables given the observed ones, NPCoImp imputes data while accounting for its distributional shape, particularly radial symmetry, and adjusting the multivariate values used for imputation accordingly. NPCoImp is highly flexible and can handle multivariate missing data with any type of missingness pattern. The performance of NPCoImp has been evaluated through an extensive Monte Carlo study and compared with classical imputation methods, as well as with its direct competitor, the CoImp algorithm. Our findings indicate that NPCoImp is particularly effective in preserving microdata and the dependence structure. The strong performance of the proposed method is further supported by empirical case studies in the agricultural sector. Finally, the NPCoImp algorithm has been implemented in the R package CoImp, which is available on CRAN. |
Keywords: | Asymmetry, Conditional copula, Empirical copula, Imputation methods, Multivariate missing data, NPCoImp. |
JEL: | C1 C14 C63 Q1 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:bzn:wpaper:bemps112 |
By: | Mikhail Chernov (UCLA Anderson); Bryan T. Kelly (Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)); Semyon Malamud (Ecole Polytechnique Federale de Lausanne; Centre for Economic Policy Research (CEPR); Swiss Finance Institute); Johannes Schwab (École Polytechnique Fédérale de Lausanne (EPFL)) |
Abstract: | We generalize the seminal Gibbons-Ross-Shanken test to the empirically relevant case where the number of test assets far exceeds the number of observations. In such a setting, one needs to use a regularized estimator of the covariance matrix of test assets, which leads to biases in the original test statistic. Random Matrix Theory allows us to account for these biases and to evaluate the test's power. Power increases with the number of test assets and reaches the maximum for a broad range of local alternatives. These conclusions are supported by an extensive simulation study. We implement the test empirically for state-of-the-art candidate efficient portfolios and test assets. |
Keywords: | efficient portfolio, cross-section of stock returns, testing, regularization, random matrix theory |
JEL: | C12 C40 C55 C57 G12 |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2526 |
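For reference, the block below sketches the classical GRS statistic, which requires more time periods than test assets; the regularized, Random-Matrix-Theory-corrected version for the case with many more assets than observations developed in the paper is not implemented, and the simulated data are illustrative.

```python
# Sketch of the classical Gibbons-Ross-Shanken (GRS) test (requires T > N + K),
# i.e., the statistic that the paper generalizes to the N >> T setting.
import numpy as np
from scipy import stats

def grs_test(R, F):
    """R: T x N excess returns of test assets; F: T x K factor excess returns.
    Returns the GRS statistic and its p-value under H0: all alphas are zero."""
    T, N = R.shape
    K = F.shape[1]
    X = np.column_stack([np.ones(T), F])
    B = np.linalg.lstsq(X, R, rcond=None)[0]         # (1+K) x N coefficients
    alpha = B[0]                                     # intercepts (pricing errors)
    resid = R - X @ B
    Sigma = resid.T @ resid / (T - K - 1)            # residual covariance
    mu_f = F.mean(axis=0)
    Omega = np.atleast_2d(np.cov(F, rowvar=False))   # factor covariance
    quad_alpha = alpha @ np.linalg.solve(Sigma, alpha)
    quad_f = mu_f @ np.linalg.solve(Omega, mu_f)
    grs = (T - N - K) / N * quad_alpha / (1.0 + quad_f)
    return grs, stats.f.sf(grs, N, T - N - K)

# Simulated example: one factor prices 10 assets exactly (alphas = 0).
rng = np.random.default_rng(7)
T, N = 600, 10
f = rng.normal(0.05, 0.2, size=(T, 1))
beta = rng.uniform(0.5, 1.5, size=N)
R = f @ beta[None, :] + rng.normal(0, 0.1, size=(T, N))
print(grs_test(R, f))                                # should not reject H0
```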
By: | Evan Munro |
Abstract: | Equilibrium effects make it challenging to evaluate the impact of an individual-level treatment on outcomes in a single market, even with data from a randomized trial. In some markets, however, a centralized mechanism allocates goods and imposes useful structure on spillovers. For a class of strategy-proof "cutoff" mechanisms, we propose an estimator for global treatment effects using individual-level data from one market, where treatment assignment is unconfounded. Algorithmically, we re-run a weighted and perturbed version of the mechanism. Under a continuum market approximation, the estimator is asymptotically normal and semi-parametrically efficient. We extend this approach to learn spillover-aware treatment rules with vanishing asymptotic regret. Empirically, adjusting for equilibrium effects notably diminishes the estimated effect of information on inequality in the Chilean school system. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.07217 |
By: | Bobeica, Elena; Holton, Sarah; Huber, Florian; Martínez Hernández, Catalina |
Abstract: | We propose a novel empirical structural inflation model that captures non-linear shock transmission using a Bayesian machine learning framework that combines VARs with non-linear structural factor models. Unlike traditional linear models, our approach allows for non-linear effects at all impulse response horizons. Identification is achieved via sign, zero, and magnitude restrictions within the factor model. Applying our method to euro area energy shocks, we find that inflation reacts disproportionately to large shocks, while small shocks trigger no significant response. These non-linearities are present along the pricing chain, more pronounced upstream and gradually attenuating downstream. |
Keywords: | energy, euro area, inflation, machine learning, non-linear model |
JEL: | E31 C32 C38 Q43 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:ecb:ecbwps:20253052 |
By: | JD Opdyke |
Abstract: | We live in a multivariate world, and effective modeling of financial portfolios, including their construction, allocation, forecasting, and risk analysis, simply is not possible without explicitly modeling the dependence structure of their assets. Dependence structure can drive portfolio results more than many other parameters in investment and risk models, sometimes even more than their combined effects, but the literature provides relatively little to define the finite-sample distributions of dependence measures in useable and useful ways under challenging, real-world financial data conditions. Yet this is exactly what is needed to make valid inferences about their estimates, and to use these inferences for a myriad of essential purposes, such as hypothesis testing, dynamic monitoring, realistic and granular scenario and reverse scenario analyses, and mitigating the effects of correlation breakdowns during market upheavals (which is when we need valid inferences the most). This work develops a new and straightforward method, Nonparametric Angles-based Correlation (NAbC), for defining the finite-sample distributions of any dependence measure whose matrix of pairwise associations is positive definite (e.g. Pearson's, Kendall's Tau, Spearman's Rho, Chatterjee's, Lancaster's, Szekely's, and their many variants). The solution remains valid under marginal asset distributions characterized by notably different and varying degrees of serial correlation, non-stationarity, heavy-tailedness, and asymmetry. Notably, NAbC's p-values and confidence intervals remain analytically consistent at both the matrix level and the pairwise cell level. Finally, NAbC maintains validity even when selected cells in the matrix are frozen for a given scenario or stress test, that is, unaffected by the scenario, thus enabling flexible, granular, and realistic scenarios. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.15268 |
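The sketch below shows the spherical (Cholesky "angles") parametrization of a correlation matrix that angles-based approaches such as NAbC build on: every vector of angles in (0, π) maps to a valid positive-definite correlation matrix, and the map is invertible. NAbC's nonparametric finite-sample distributions over these angles are not implemented here.

```python
# Sketch of the spherical/Cholesky "angles" parametrization of a correlation
# matrix: angles in (0, pi) <-> positive-definite correlation matrix.
import numpy as np

def angles_to_corr(theta):
    """theta: list of per-row angle arrays (row i has i angles, i = 1..n-1).
    Returns the n x n correlation matrix R = B B' with B lower triangular."""
    n = len(theta) + 1
    B = np.zeros((n, n))
    B[0, 0] = 1.0
    for i in range(1, n):
        ang = np.asarray(theta[i - 1])
        for j in range(i):                    # cosine times the product of earlier sines
            B[i, j] = np.cos(ang[j]) * np.prod(np.sin(ang[:j]))
        B[i, i] = np.prod(np.sin(ang))        # remaining mass on the diagonal
    return B @ B.T

def corr_to_angles(R):
    """Inverse map: recover the per-row angles from the Cholesky factor of R."""
    B = np.linalg.cholesky(R)
    theta = []
    for i in range(1, len(R)):
        ang, sin_prod = [], 1.0
        for j in range(i):
            ang.append(np.arccos(np.clip(B[i, j] / sin_prod, -1.0, 1.0)))
            sin_prod *= np.sin(ang[-1])
        theta.append(np.array(ang))
    return theta

R = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, -0.3],
              [0.2, -0.3, 1.0]])
print(np.allclose(angles_to_corr(corr_to_angles(R)), R))   # round trip recovers R
```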
By: | Manuel Quintero; William T. Stephenson; Advik Shreekumar; Tamara Broderick |
Abstract: | In science and social science, we often wish to explain why an outcome is different in two populations. For instance, if a jobs program benefits members of one city more than another, is that due to differences in program participants (particular covariates) or the local labor markets (outcomes given covariates)? The Kitagawa-Oaxaca-Blinder (KOB) decomposition is a standard tool in econometrics that explains the difference in the mean outcome across two populations. However, the KOB decomposition assumes a linear relationship between covariates and outcomes, while the true relationship may be meaningfully nonlinear. Modern machine learning boasts a variety of nonlinear functional decompositions for the relationship between outcomes and covariates in one population. It seems natural to extend the KOB decomposition using these functional decompositions. We observe that a successful extension should not attribute the differences to covariates -- or, respectively, to outcomes given covariates -- if those are the same in the two populations. Unfortunately, we demonstrate that, even in simple examples, two common decompositions -- functional ANOVA and Accumulated Local Effects -- can attribute differences to outcomes given covariates, even when they are identical in two populations. We provide a characterization of when functional ANOVA misattributes, as well as a general property that any discrete decomposition must satisfy to avoid misattribution. We show that if the decomposition is independent of its input distribution, it does not misattribute. We further conjecture that misattribution arises in any reasonable additive decomposition that depends on the distribution of the covariates. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.16864 |
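Since the paper takes the linear KOB decomposition as its starting point, a worked sketch may help: the mean-outcome gap between two populations splits exactly into a covariate (composition) part and a coefficient (structure) part. The choice of group B's coefficients as the reference weights is one of several standard conventions, and the simulated data are illustrative.

```python
# Worked sketch of the linear Kitagawa-Oaxaca-Blinder (KOB) decomposition:
# gap in means = explained by covariates + unexplained (coefficients).
import numpy as np

def kob_decomposition(X_a, y_a, X_b, y_b):
    """Returns (total gap, explained by covariates, unexplained/coefficients)."""
    Xa = np.column_stack([np.ones(len(y_a)), X_a])
    Xb = np.column_stack([np.ones(len(y_b)), X_b])
    beta_a = np.linalg.lstsq(Xa, y_a, rcond=None)[0]
    beta_b = np.linalg.lstsq(Xb, y_b, rcond=None)[0]
    xbar_a, xbar_b = Xa.mean(axis=0), Xb.mean(axis=0)
    explained = (xbar_a - xbar_b) @ beta_b        # covariate (composition) part
    unexplained = xbar_a @ (beta_a - beta_b)      # coefficient (structure) part
    return y_a.mean() - y_b.mean(), explained, unexplained

rng = np.random.default_rng(8)
X_a = rng.normal(1.0, 1.0, size=(5000, 1))        # population A has higher covariates
X_b = rng.normal(0.0, 1.0, size=(5000, 1))
y_a = 1.0 + 2.0 * X_a[:, 0] + rng.normal(size=5000)   # same slope...
y_b = 0.5 + 2.0 * X_b[:, 0] + rng.normal(size=5000)   # ...different intercept
gap, explained, unexplained = kob_decomposition(X_a, y_a, X_b, y_b)
print(gap, explained, unexplained, np.isclose(gap, explained + unexplained))
```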
By: | Hannah O’Keeffe; Katerina Petrova |
Abstract: | In this paper, we propose a component-based dynamic factor model for nowcasting GDP growth. We combine ideas from “bottom-up” approaches, which utilize the national income accounting identity through modelling and predicting sub-components of GDP, with a dynamic factor (DF) model, which is suitable for dimension reduction as well as parsimonious real-time monitoring of the economy. The advantages of the new model are twofold: (i) in contrast to existing dynamic factor models, it respects the GDP accounting identity; (ii) in contrast to existing “bottom-up” approaches, it models all GDP components jointly through the dynamic factor model, inheriting its main advantages. An additional advantage of the resulting CBDF approach is that it generates nowcast densities and impact decompositions for each component of GDP as a by-product. We present a comprehensive forecasting exercise, where we evaluate the model’s performance in terms of point and density forecasts, and we compare it to existing models (e.g. the model of Almuzara, Baker, O’Keeffe, and Sbordone (2023)) currently used by the New York Fed, as well as the model of Higgins (2014) currently used by the Atlanta Fed. We demonstrate that, on average, the point nowcast performance (in terms of RMSE) of the standard DF model can be improved by 15 percent and its density nowcast performance (in terms of log-predictive scores) can be improved by 20 percent over a large historical sample. |
Keywords: | Dynamic factor model; GDP nowcasting |
JEL: | C32 C38 C53 |
Date: | 2025–04–01 |
URL: | https://d.repec.org/n?u=RePEc:fip:fednsr:99906 |
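A toy illustration of the component-based idea on simulated data: extract a common factor from monthly indicators, project each GDP component's growth on it, and aggregate the component nowcasts with accounting weights so the headline figure respects the identity by construction. This is not the CBDF model itself (no state space, no Kalman filtering, no nowcast densities), and all series and weights below are simulated or illustrative.

```python
# Toy "component-based" nowcast: factor from indicators -> component
# projections -> identity-consistent aggregation. Not the CBDF model itself.
import numpy as np

rng = np.random.default_rng(9)
T, n_ind, components = 120, 20, ["C", "I", "G", "NX"]
weights = np.array([0.68, 0.18, 0.17, -0.03])          # illustrative GDP shares

# Simulated monthly indicators and component growth driven by one common factor.
factor = np.zeros(T)
for t in range(1, T):
    factor[t] = 0.8 * factor[t - 1] + rng.normal()
indicators = np.outer(factor, rng.uniform(0.5, 1.5, n_ind)) + rng.normal(size=(T, n_ind))
comp_growth = (np.outer(factor, rng.uniform(0.2, 1.0, len(components)))
               + rng.normal(scale=0.5, size=(T, len(components))))

# Step 1: extract the common factor by principal components.
Z = (indicators - indicators.mean(0)) / indicators.std(0)
f_hat = np.linalg.svd(Z, full_matrices=False)[0][:, 0] * np.sqrt(T)

# Step 2: project each component's growth on the estimated factor,
# then pseudo-nowcast the final period from the factor in that period.
X = np.column_stack([np.ones(T - 1), f_hat[:-1]])
betas = np.linalg.lstsq(X, comp_growth[:-1], rcond=None)[0]
component_nowcasts = np.array([1.0, f_hat[-1]]) @ betas

# Step 3: aggregate with accounting weights, so the GDP nowcast equals the
# weighted sum of its component nowcasts by construction.
gdp_nowcast = weights @ component_nowcasts
print(dict(zip(components, component_nowcasts.round(3))), round(gdp_nowcast, 3))
```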