New Economics Papers on Econometrics
By: | Kuanhao Jiang; Rajarshi Mukherjee; Subhabrata Sen; Pragya Sur |
Abstract: | Estimation of the average treatment effect (ATE) is a central problem in causal inference. In recent times, inference for the ATE in the presence of high-dimensional covariates has been extensively studied. Among the diverse approaches that have been proposed, augmented inverse probability weighting (AIPW) with cross-fitting has emerged as a popular choice in practice. In this work, we study this cross-fit AIPW estimator under well-specified outcome regression and propensity score models in a high-dimensional regime where the number of features and samples are both large and comparable. Under assumptions on the covariate distribution, we establish a new CLT for the suitably scaled cross-fit AIPW that applies without any sparsity assumptions on the underlying high-dimensional parameters. Our CLT uncovers two crucial phenomena among others: (i) the AIPW exhibits a substantial variance inflation that can be precisely quantified in terms of the signal-to-noise ratio and other problem parameters, (ii) the asymptotic covariance between the pre-cross-fit estimates is non-negligible even on the root-n scale. In fact, these cross-covariances turn out to be negative in our setting. These findings are strikingly different from their classical counterparts. On the technical front, our work utilizes a novel interplay between three distinct tools--approximate message passing theory, the theory of deterministic equivalents, and the leave-one-out approach. We believe our proof techniques should be useful for analyzing other two-stage estimators in this high-dimensional regime. Finally, we complement our theoretical results with simulations that demonstrate both the finite sample efficacy of our CLT and its robustness to our assumptions. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.10198&r= |
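To fix ideas, here is a minimal sketch of a cross-fit AIPW estimator of the ATE, assuming a binary treatment and off-the-shelf scikit-learn working models; the high-dimensional proportional regime studied in the paper, and the corrected variance it derives, are not reproduced here.

```python
# Minimal cross-fit AIPW sketch (illustrative working models; not the paper's
# high-dimensional analysis). The reported SE is the classical one, which the
# paper shows understates the variance when p/n is non-negligible.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

def cross_fit_aipw(Y, D, X, n_folds=2, seed=0):
    """Cross-fit AIPW estimate of the ATE and its naive standard error."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    folds = np.array_split(rng.permutation(n), n_folds)
    psi = np.empty(n)                                   # AIPW scores
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # nuisance fits on the complementary folds (the "pre-cross-fit" estimates)
        ps = LogisticRegression(max_iter=1000).fit(X[train], D[train])
        mu1 = LinearRegression().fit(X[train][D[train] == 1], Y[train][D[train] == 1])
        mu0 = LinearRegression().fit(X[train][D[train] == 0], Y[train][D[train] == 0])
        e = np.clip(ps.predict_proba(X[test])[:, 1], 1e-3, 1 - 1e-3)
        m1, m0 = mu1.predict(X[test]), mu0.predict(X[test])
        # outcome regression plus inverse-probability-weighted residuals
        psi[test] = (m1 - m0
                     + D[test] * (Y[test] - m1) / e
                     - (1 - D[test]) * (Y[test] - m0) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)
```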
By: | Qingliang Fan; Zijian Guo; Ziwei Mei |
Abstract: | This paper proposes a new test of overidentifying restrictions (called the Q test) with high-dimensional data. This test is based on estimation and inference for a quadratic form of high-dimensional parameters. It is shown to have the desired asymptotic size and power properties under heteroskedasticity, even if the number of instruments and covariates is larger than the sample size. Simulation results show that the new test performs favorably compared to existing alternative tests (Chao et al., 2014; Kolesar, 2018; Carrasco and Doukali, 2021), both in scenarios where those tests are feasible and in scenarios where they are not. An empirical example on the trade and economic growth nexus demonstrates the usefulness of the proposed test. |
Date: | 2022–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.00171&r= |
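For context, the sketch below computes the classical heteroskedasticity-robust two-step GMM (Hansen J) test of overidentifying restrictions in a low-dimensional linear IV model. It is the conventional benchmark, not the paper's Q test, which is designed for settings where instruments and covariates can outnumber observations.

```python
# Classical two-step GMM J test for a linear IV model (low-dimensional benchmark).
import numpy as np
from scipy import stats

def hansen_j_test(y, X, Z):
    """y: (n,), X: (n,k) regressors, Z: (n,m) instruments with m > k."""
    n, k, m = len(y), X.shape[1], Z.shape[1]
    W1 = np.linalg.inv(Z.T @ Z / n)                               # first-step weights
    b1 = np.linalg.solve(X.T @ Z @ W1 @ Z.T @ X, X.T @ Z @ W1 @ Z.T @ y)
    u = y - X @ b1
    S = (Z * u[:, None]).T @ (Z * u[:, None]) / n                 # robust variance of moments
    W2 = np.linalg.inv(S)
    b2 = np.linalg.solve(X.T @ Z @ W2 @ Z.T @ X, X.T @ Z @ W2 @ Z.T @ y)
    g = Z.T @ (y - X @ b2) / n                                    # sample moments
    J = n * g @ W2 @ g
    return J, 1 - stats.chi2.cdf(J, df=m - k)                     # statistic, p-value
```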
By: | Sung Jae Jun; Sokbae Lee |
Abstract: | The log odds ratio is a common parameter for measuring association between (binary) outcome and exposure variables. Much attention has been paid to its parametric but robust estimation, or its nonparametric estimation as a function of confounders. However, discussion of how to summarize the log odds ratio function by averaging it is surprisingly difficult to find, despite the popularity and importance of averaging in other contexts such as estimating the average treatment effect. We propose two efficient double/debiased machine learning (DML) estimators of the average log odds ratio, where the odds ratios are adjusted for observed (potentially high-dimensional) confounders and are averaged over them. The estimators are built from two equivalent forms of the efficient influence function. The first estimator uses a prospective probability of the outcome conditional on the exposure and confounders; the second one employs a retrospective probability of the exposure conditional on the outcome and confounders. Our framework encompasses random sampling as well as outcome-based or exposure-based sampling. Finally, we illustrate how to apply the proposed estimators using real data. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.14048&r= |
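As a rough illustration of the prospective formulation, the sketch below cross-fits a flexible estimate of the outcome probability given exposure and confounders, forms the conditional log odds ratio, and averages it over the confounders. The learner choice is an arbitrary assumption, and the efficient-influence-function correction that makes the paper's estimators debiased is omitted.

```python
# Cross-fitted plug-in version of the average log odds ratio (prospective form);
# the debiasing correction from the efficient influence function is not included.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import KFold

def avg_log_odds_ratio(Y, A, X, n_splits=2, eps=1e-3):
    """Y, A binary (n,), X confounders (n, p)."""
    n = len(Y)
    lor = np.empty(n)
    XA = np.column_stack([X, A])
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(X):
        clf = GradientBoostingClassifier().fit(XA[train], Y[train])
        p1 = np.clip(clf.predict_proba(np.column_stack([X[test], np.ones(len(test))]))[:, 1], eps, 1 - eps)
        p0 = np.clip(clf.predict_proba(np.column_stack([X[test], np.zeros(len(test))]))[:, 1], eps, 1 - eps)
        lor[test] = np.log(p1 / (1 - p1)) - np.log(p0 / (1 - p0))  # conditional log OR
    return lor.mean()                                              # averaged over confounders
```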
By: | Philipp Ratz |
Abstract: | Artificial Neural Networks (ANN) have been employed for a range of modelling and prediction tasks using financial data. However, evidence on their predictive performance, especially for time-series data, has been mixed. Whereas some applications find that ANNs provide better forecasts than more traditional estimation techniques, others find that they barely outperform basic benchmarks. This article aims to provide guidance on when ANNs can be expected to yield better results in a general setting. We propose a flexible nonparametric model and extend existing theoretical results for the rate of convergence to include the popular Rectified Linear Unit (ReLU) activation function, and we compare this rate to those of other nonparametric estimators. Finite sample properties are then studied with the help of Monte Carlo simulations to provide further guidance. An application to estimating the Value-at-Risk of portfolios of varying sizes is also considered to show the practical implications. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.07101&r= |
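A minimal simulation sketch in this spirit compares a ReLU network with a simple nonparametric benchmark on a regression function with low-dimensional structure; the models, sample sizes, and data-generating process are illustrative assumptions, not the paper's setup.

```python
# ReLU feed-forward network vs. a k-NN benchmark on a simulated nonlinear regression.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.uniform(-1, 1, size=(n, d))
f = lambda x: np.sin(3 * x[:, 0]) + x[:, 1] * x[:, 2]        # low-dimensional structure
y = f(X) + 0.3 * rng.standard_normal(n)
X_test = rng.uniform(-1, 1, size=(n, d))

ann = MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                   max_iter=2000, random_state=0).fit(X, y)
knn = KNeighborsRegressor(n_neighbors=25).fit(X, y)
for name, model in [("ReLU ANN", ann), ("k-NN", knn)]:
    print(name, "out-of-sample MSE:", np.mean((model.predict(X_test) - f(X_test)) ** 2))
```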
By: | Oorschot, Jochem; Segers, Johan (Université catholique de Louvain, LIDAM/ISBA, Belgium); Zhou, Chen |
Abstract: | Extreme U-statistics arise when the kernel of a U-statistic has a high degree but depends only on its arguments through a small number of top order statistics. As the kernel degree of the U-statistic grows to infinity with the sample size, estimators built out of such statistics form an intermediate family between those constructed in the block maxima and peaks-over-threshold frameworks in extreme value analysis. The asymptotic normality of extreme U-statistics based on location-scale invariant kernels is established. Although the asymptotic variance corresponds to that of the Hájek projection, the proof goes beyond considering the first term in Hoeffding’s variance decomposition; instead, a growing number of terms needs to be incorporated in the proof. To show the usefulness of extreme U-statistics, we propose a kernel depending on the three highest order statistics leading to an unbiased estimator of the shape parameter of the generalized Pareto distribution. When applied to samples in the max-domain of attraction of an extreme value distribution, the extreme U-statistic based on this kernel produces a location-scale invariant estimator of the extreme value index which is asymptotically normal and whose finite-sample performance is competitive with that of the pseudo-maximum likelihood estimator. |
Keywords: | U-statistic ; Generalized Pareto distribution ; Hájek projection ; Extreme value index |
Date: | 2022–03–16 |
URL: | http://d.repec.org/n?u=RePEc:aiz:louvad:2022014&r= |
By: | Ziyu Wang; Yuhao Zhou; Jun Zhu |
Abstract: | We investigate nonlinear instrumental variable (IV) regression given high-dimensional instruments. We propose a simple algorithm which combines kernelized IV methods and an arbitrary, adaptive regression algorithm, accessed as a black box. Our algorithm enjoys faster-rate convergence and adapts to the dimensionality of informative latent features, while avoiding an expensive minimax optimization procedure, which has been necessary to establish similar guarantees. It further brings the benefit of flexible machine learning models to quasi-Bayesian uncertainty quantification, likelihood-based model selection, and model averaging. Simulation studies demonstrate the competitive performance of our method. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.10772&r= |
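For orientation, the sketch below implements a basic sieve two-stage least-squares baseline for nonlinear IV: expand the endogenous variable in a polynomial basis, project each basis function onto functions of the instruments, and regress the outcome on the projections. It is a textbook benchmark, not the kernelized, quasi-Bayesian procedure proposed in the paper.

```python
# Sieve 2SLS baseline for nonparametric IV regression (illustrative basis choices).
import numpy as np

def sieve_2sls(y, x, Z, deg_x=4, deg_z=2):
    """y: (n,), x: (n,) endogenous scalar, Z: (n,m) instruments."""
    Psi = np.column_stack([x ** j for j in range(deg_x + 1)])        # basis in x
    B = np.column_stack([np.ones(len(y))] + [Z ** j for j in range(1, deg_z + 1)])
    P = B @ np.linalg.lstsq(B, Psi, rcond=None)[0]                   # first-stage projections
    beta = np.linalg.lstsq(P, y, rcond=None)[0]                      # second-stage coefficients
    return lambda x_new: np.column_stack([x_new ** j for j in range(deg_x + 1)]) @ beta
```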
By: | Ryan Zischke; Gael M. Martin; David T. Frazier; Donald S. Poskitt |
Abstract: | We investigate the performance and sampling variability of estimated forecast combinations, with particular attention given to the combination of forecast distributions. Unknown parameters in the forecast combination are optimized according to criterion functions based on proper scoring rules, which are chosen to reward the form of forecast accuracy that matters for the problem at hand, and forecast performance is measured using the out-of-sample expectation of said scoring rule. Our results provide novel insights into the behavior of estimated forecast combinations. Firstly, we show that, asymptotically, the sampling variability in the performance of standard forecast combinations is determined solely by estimation of the constituent models, with estimation of the combination weights contributing no sampling variability whatsoever, at first order. Secondly, we show that, if computationally feasible, forecast combinations produced in a single step -- in which the constituent model and combination function parameters are estimated jointly -- have superior predictive accuracy and lower sampling variability than standard forecast combinations -- where constituent model and combination function parameters are estimated in two steps. These theoretical insights are demonstrated numerically, both in simulation settings and in an extensive empirical illustration using a time series of S&P500 returns. |
Keywords: | forecast combination, forecast combination puzzle, probabilistic forecasting, scoring rules, S&P500 forecasting, two-stage estimation |
Date: | 2022 |
URL: | http://d.repec.org/n?u=RePEc:msh:ebswps:2022-6&r= |
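A minimal sketch of the standard two-step construction discussed above: constituent density-forecast models are estimated first, and the combination weight is then chosen by optimizing a proper scoring rule (here the log score). The Gaussian and Student-t components and the simulated series are illustrative assumptions only, not the paper's empirical setup.

```python
# Two-step estimation of a density forecast combination via the log score.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
y = rng.standard_t(df=5, size=1500)                        # stand-in "returns" series
fit_win, weight_win = y[:1000], y[1000:]

# Step 1: estimate the constituent density-forecast models.
f1 = stats.norm(loc=fit_win.mean(), scale=fit_win.std(ddof=1))   # Gaussian model
f2 = stats.t(*stats.t.fit(fit_win))                               # Student-t model

# Step 2: choose the combination weight by maximizing the average log score.
def neg_log_score(w):
    return -np.mean(np.log(w * f1.pdf(weight_win) + (1 - w) * f2.pdf(weight_win)))

w_hat = optimize.minimize_scalar(neg_log_score, bounds=(0, 1), method="bounded").x
print("weight on the Gaussian component:", round(float(w_hat), 3))
```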
By: | Martin Magris; Mostafa Shabani; Alexandros Iosifidis |
Abstract: | We develop an optimization algorithm suitable for Bayesian learning in complex models. Our approach relies on natural gradient updates within a general black-box framework for efficient training with limited model-specific derivations. It applies to the class of exponential-family variational posterior distributions; we discuss the Gaussian case in detail, for which the updates take a particularly simple form. Our Quasi Black-box Variational Inference (QBVI) framework is readily applicable to a wide class of Bayesian inference problems and is simple to implement, as the updates of the variational posterior involve neither gradients with respect to the model parameters nor the Fisher information matrix. We develop QBVI under different hypotheses for the posterior covariance matrix, discuss details of its robust and feasible implementation, and provide a number of real-world applications to demonstrate its effectiveness. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.11568&r= |
By: | Charles F. Manski |
Abstract: | Incomplete observability of data generates an identification problem. There is no panacea for missing data. What one can learn about a population parameter depends on the assumptions one finds credible to maintain. The credibility of assumptions varies with the empirical setting. No specific assumptions can provide a realistic general solution to the problem of inference with missing data. Yet Rubin has promoted random multiple imputation (RMI) as a general way to deal with missing values in public-use data. This recommendation has been influential among empirical researchers who seek a simple fix for the nuisance of missing data. This paper adds to my earlier critiques of imputation. It provides a transparent assessment of the mix of Bayesian and frequentist thinking used by Rubin to argue for RMI. It evaluates random imputation to replace missing outcome or covariate data when the objective is to learn a conditional expectation. It considers steps that might help combat the allure of making stuff up. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.07388&r= |
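As a small numerical illustration of the identification issue at stake, the sketch below contrasts worst-case (assumption-free) bounds on a mean with a naive mean-imputation point estimate when bounded outcomes are missing not at random. It is not Rubin's RMI procedure, and the missingness mechanism is an assumption made purely for illustration.

```python
# Worst-case bounds vs. mean imputation for a mean of Y in [0, 1] with missing outcomes.
import numpy as np

rng = np.random.default_rng(2)
n = 5000
y = rng.binomial(1, 0.6, n).astype(float)
observed = rng.random(n) < np.where(y == 1, 0.9, 0.5)      # missingness depends on Y

p_obs = observed.mean()
m_obs = y[observed].mean()
lower = p_obs * m_obs + (1 - p_obs) * 0.0                  # all missing Y set to 0
upper = p_obs * m_obs + (1 - p_obs) * 1.0                  # all missing Y set to 1
imputed = np.where(observed, y, m_obs).mean()              # mean-imputation point estimate

print(f"worst-case bounds: [{lower:.3f}, {upper:.3f}]  imputation: {imputed:.3f}  truth: {y.mean():.3f}")
```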
By: | Clara Bicalho; Adam Bouyamourn; Thad Dunning |
Abstract: | Researchers often use covariate balance tests to assess whether a treatment variable is assigned "as-if" at random. However, standard tests may shed no light on a key condition for causal inference: the independence of treatment assignment and potential outcomes. We focus on a key factor that affects the sensitivity and specificity of balance tests: the extent to which covariates are prognostic, that is, predictive of potential outcomes. We propose a "conditional balance test" based on the weighted sum of covariate differences of means, where the weights are coefficients from a standardized regression of observed outcomes on covariates. Our theory and simulations show that this approach increases power relative to other global tests when potential outcomes are imbalanced, while limiting spurious rejections due to imbalance on irrelevant covariates. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.10478&r= |
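A minimal sketch of the test statistic as described in the abstract, assuming a binary treatment: covariates are standardized, prognostic weights come from a regression of the observed outcome on the covariates, and the weighted difference in covariate means is compared against a permutation distribution. Implementation details (studentization, how the weights are estimated) may differ from the paper's.

```python
# Prognostically weighted balance test with a permutation p-value.
import numpy as np

def conditional_balance_test(Y, D, X, n_perm=2000, seed=0):
    rng = np.random.default_rng(seed)
    Xs = (X - X.mean(0)) / X.std(0)                        # standardized covariates
    beta = np.linalg.lstsq(np.column_stack([np.ones(len(Y)), Xs]), Y, rcond=None)[0][1:]

    def stat(d):
        diff = Xs[d == 1].mean(0) - Xs[d == 0].mean(0)     # covariate mean differences
        return abs(diff @ beta)                            # prognostically weighted sum

    t_obs = stat(D)
    perms = np.array([stat(rng.permutation(D)) for _ in range(n_perm)])
    return t_obs, (1 + (perms >= t_obs).sum()) / (1 + n_perm)   # permutation p-value
```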
By: | Nathan Kallus |
Abstract: | The fundamental problem of causal inference -- that we never observe counterfactuals -- prevents us from identifying how many might be negatively affected by a proposed intervention. If, in an A/B test, half of users click (or buy, or watch, or renew, etc.), whether exposed to the standard experience A or a new one B, hypothetically it could be because the change affects no one, because the change positively affects half the user population to go from no-click to click while negatively affecting the other half, or something in between. While unknowable, this impact is clearly of material importance to the decision to implement a change or not, whether due to fairness, long-term, systemic, or operational considerations. We therefore derive the tightest-possible (i.e., sharp) bounds on the fraction negatively affected (and other related estimands) given data with only factual observations, whether experimental or observational. Naturally, the more we can stratify individuals by observable covariates, the tighter the sharp bounds. Since these bounds involve unknown functions that must be learned from data, we develop a robust inference algorithm that is efficient almost regardless of how and how fast these functions are learned, remains consistent when some are mislearned, and still gives valid conservative bounds when most are mislearned. Our methodology altogether therefore strongly supports credible conclusions: it avoids spuriously point-identifying this unknowable impact, focusing on the best bounds instead, and it permits exceedingly robust inference on these. We demonstrate our method in simulation studies and in a case study of career counseling for the unemployed. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.10327&r= |
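To make the bounding idea concrete, the sketch below computes the basic Fréchet-Hoeffding bounds on the fraction negatively affected, P(Y(0)=1, Y(1)=0), for a binary outcome in a randomized experiment, optionally stratified on a discrete covariate. The paper's contribution, sharp covariate-tightened bounds with debiased and learner-robust inference, is not reproduced here.

```python
# Stratified Frechet-Hoeffding bounds on the fraction negatively affected
# for a binary outcome Y and binary treatment D in a randomized experiment.
import numpy as np

def fna_bounds(Y, D, strata):
    """Y, D binary (n,); strata: discrete covariate labels (n,)."""
    lo = hi = 0.0
    for s in np.unique(strata):
        m = strata == s
        w = m.mean()
        p1 = Y[m & (D == 1)].mean()                        # P(Y=1 | treated, stratum)
        p0 = Y[m & (D == 0)].mean()                        # P(Y=1 | control, stratum)
        lo += w * max(0.0, p0 - p1)                        # stratum lower bound
        hi += w * min(p0, 1.0 - p1)                        # stratum upper bound
    return lo, hi
```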
By: | Graham Elliott; Nikolay Kudrin; Kaspar W\"uthrich |
Abstract: | $p$-Hacking can undermine the validity of empirical studies. A flourishing empirical literature investigates the prevalence of $p$-hacking based on the empirical distribution of reported $p$-values across studies. Interpreting results in this literature requires a careful understanding of the power of methods used to detect different types of $p$-hacking. We theoretically study the implications of likely forms of $p$-hacking on the distribution of reported $p$-values and the power of existing methods for detecting it. Power can be quite low, depending crucially on the particular $p$-hacking strategy and the distribution of actual effects tested by the studies. We relate the power of the tests to the costs of $p$-hacking and show that power tends to be larger when $p$-hacking is very costly. Monte Carlo simulations support our theoretical results. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.07950&r= |
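The simulation sketch below illustrates one commonly studied p-hacking strategy, reporting the smallest p-value across several overlapping specifications, and how it distorts the distribution of reported p-values under the null. The strategy, number of specifications, and sample sizes are illustrative assumptions, not the paper's calibration.

```python
# Monte Carlo: distribution of reported p-values with and without specification search.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_studies, n_obs, n_specs = 20000, 100, 5
honest, hacked = [], []
for _ in range(n_studies):
    x = rng.standard_normal(n_obs)                         # true effect is zero
    pvals = []
    for _ in range(n_specs):
        sub = rng.random(n_obs) < 0.8                      # e.g., varying sample filters
        pvals.append(stats.ttest_1samp(x[sub], 0.0).pvalue)
    honest.append(pvals[0])                                # report the first specification
    hacked.append(min(pvals))                              # report the best specification

for name, p in [("honest", np.array(honest)), ("hacked", np.array(hacked))]:
    print(name, "share of p-values below 0.05:", (p < 0.05).mean())
```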
By: | Bjoern Schulte-Tillman; Mawuli Segnon; Bernd Wilfling |
Abstract: | We propose four multiplicative-component volatility MIDAS models to disentangle short- and long-term volatility sources. Three of our models specify short-term volatility as Markov-switching processes. We establish statistical properties, covariance-stationarity conditions, and an estimation framework using regime-switching filter techniques. A simulation study shows the robustness of the estimates against several mis-specifications. An out-of-sample forecasting analysis with daily S&P500 returns and quarterly-sampled (macro)economic variables yields two major results. (i) Specific long-term variables in the MIDAS models significantly improve forecast accuracy (over the non-MIDAS benchmarks). (ii) We robustly find superior performance of one Markov-switching MIDAS specification (among a set of competitor models) when using the 'Term structure' as the long-term variable. |
Keywords: | MIDAS volatility modeling, Hierarchical hidden Markov models, Markov-switching, Forecasting, Model confidence sets |
JEL: | C51 C53 C58 E44 |
Date: | 2022–06 |
URL: | http://d.repec.org/n?u=RePEc:cqe:wpaper:9922&r= |
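As a small illustration of the MIDAS building block, the sketch below computes Beta-lag polynomial weights and the implied long-term volatility component driven by a low-frequency variable. The functional form and parameter values are common defaults assumed for illustration; the paper's Markov-switching short-term components are not reproduced.

```python
# Beta-lag MIDAS weights and a long-term volatility component from K low-frequency lags.
import numpy as np

def beta_weights(K, theta1=1.0, theta2=5.0):
    x = np.arange(1, K + 1) / (K + 1)                      # lag grid in (0, 1)
    w = x ** (theta1 - 1) * (1 - x) ** (theta2 - 1)
    return w / w.sum()                                     # weights sum to one

def long_term_component(macro_lags, m=0.0, theta=0.1, theta1=1.0, theta2=5.0):
    """macro_lags: (K,) most recent K quarterly observations, newest first."""
    tau = m + theta * beta_weights(len(macro_lags), theta1, theta2) @ macro_lags
    return np.exp(tau)                                     # exponential form keeps it positive
```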
By: | Onishi, Rikuto; Otsu, Taisuke |
Abstract: | This paper follows up on the sensitivity analysis of Andrews, Gentzkow and Shapiro (2017) for biases in GMM estimators due to local violations of identifying assumptions, and proposes complementary bias measures that are sensitive to different choices of GMM weight matrices by considering a specific form of the local perturbation. Our method accommodates the two-step and continuous updating GMM estimators with or without centering. The proposed bias measures are illustrated with a consumption-based asset pricing model using Japanese data. |
Keywords: | sensitivity analysis; generalized method of moments; misspecification |
JEL: | J1 |
Date: | 2021–01–01 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:107522&r= |
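For reference, the sketch below computes the Andrews-Gentzkow-Shapiro (2017) sensitivity matrix that this line of work starts from: with moment Jacobian G and weight matrix W, the first-order bias of the GMM estimator under a local moment violation delta is Lambda @ delta, where Lambda = -(G'WG)^{-1} G'W. The paper's complementary, weight-matrix-dependent measures are not reproduced, and the toy numbers are assumptions.

```python
# Andrews-Gentzkow-Shapiro sensitivity matrix for a GMM estimator.
import numpy as np

def sensitivity(G, W):
    """G: (m, k) Jacobian of the moments in the parameters, W: (m, m) weight matrix."""
    return -np.linalg.solve(G.T @ W @ G, G.T @ W)

# Toy linear IV example: moments g(beta) = E[Z (y - X beta)], so G = -E[Z X'].
G = -np.array([[1.0, 0.2], [0.3, 1.0], [0.5, 0.5]])        # 3 instruments, 2 parameters
W = np.eye(3)
delta = np.array([0.1, 0.0, 0.0])                          # small violation of moment 1
print("first-order bias in beta_hat:", sensitivity(G, W) @ delta)
```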
By: | Matteo Escudé; Paula Onuchic; Ludvig Sinander; Quitzé Valenzuela-Stookey |
Abstract: | We revisit Chambers and Echenique's (2021) characterization of Phelps-Aigner-Cain-type statistical discrimination. We first propose an alternative interpretation of their "identification" property. On our interpretation, their main result is a dismal one: discrimination is inevitable. Second, we show how Blackwell's Theorem characterizes statistical discrimination in terms of statistical informativeness. Its corollaries include half of Chambers and Echenique's main result and some finer-grained properties of statistical discrimination. Finally, we illustrate how the close link between discrimination and informativeness may be used to characterize other forms of discrimination. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.07128&r= |
By: | Mika Meitz; Pentti Saikkonen |
Abstract: | In this paper, we consider subgeometric ergodicity of univariate nonlinear autoregressions with autoregressive conditional heteroskedasticity (ARCH). The notion of subgeometric ergodicity was introduced in the Markov chain literature in the 1980s and means that the transition probability measures converge to the stationary measure at a rate slower than geometric; this rate is also closely related to the convergence rate of $\beta$-mixing coefficients. While the existing literature on subgeometrically ergodic autoregressions assumes a homoskedastic error term, this paper provides an extension to the case of conditionally heteroskedastic ARCH-type errors, considerably widening the scope of potential applications. Specifically, we consider suitably defined higher-order nonlinear autoregressions with possibly nonlinear ARCH errors and show that they are, under appropriate conditions, subgeometrically ergodic at a polynomial rate. An empirical example using energy sector volatility index data illustrates the use of subgeometrically ergodic AR-ARCH models. |
Date: | 2022–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2205.11953&r= |
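A minimal simulation sketch of the model class in question, a nonlinear (threshold-type) second-order autoregression with ARCH(1) errors; the parameter values are illustrative assumptions and the paper's conditions for subgeometric ergodicity are not verified here.

```python
# Simulate a threshold AR(2) process with ARCH(1) errors (illustrative parameters).
import numpy as np

def simulate_ar_arch(n, seed=4):
    rng = np.random.default_rng(seed)
    y = np.zeros(n + 2)
    u = 0.0
    for t in range(2, n + 2):
        # nonlinear AR(2) conditional mean: different regimes above/below zero
        mean = (0.6 if y[t - 1] > 0 else -0.3) * y[t - 1] + 0.2 * y[t - 2]
        h = 0.1 + 0.4 * u ** 2                             # ARCH(1) conditional variance
        u = np.sqrt(h) * rng.standard_normal()
        y[t] = mean + u
    return y[2:]

path = simulate_ar_arch(1000)
print("sample mean and std:", round(path.mean(), 3), round(path.std(), 3))
```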