on Econometrics
By: | Shahnaz Parsaeian (Department of Economics, University of Kansas, Lawrence, KS 66045) |
Abstract: | This paper develops a Stein-like combined estimator for large heterogeneous panel data models under common structural breaks. The model allows for cross-sectional dependence through a general multifactor error structure. By utilizing the common correlated effects (CCE) estimation technique, we propose a Stein-like combined estimator of the CCE full-sample estimator (i.e., estimation using both the pre-break and post-break observations) and the CCE post-break estimator (i.e., estimation using only the post-break sample observations). The proposed Stein-like combined estimator benefits from exploiting the pre-break sample observations. We derive the optimal combination weight by minimizing the asymptotic risk. We show the superiority of the CCE Stein-like combined estimator over the CCE post-break estimator in terms of the asymptotic risk. Further, we establish the asymptotic properties of the CCE mean group Stein-like combined estimator. The finite sample performance of our proposed estimator is investigated using Monte Carlo experiments and an empirical application of predicting the output growth of industrialized countries. |
Keywords: | Common correlated effects, Cross-sectional dependence, Heterogeneous panels, Structural breaks. |
JEL: | C13 C23 C33 |
Date: | 2024–08 |
URL: | https://d.repec.org/n?u=RePEc:kan:wpaper:202409 |
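To make the combination step described above concrete, here is a minimal numerical sketch of a generic Stein-type combination of a full-sample and a post-break estimator. The simple OLS slopes, the Wald-type distance statistic, the tuning constant `tau`, and the toy data-generating process are illustrative assumptions; the paper's estimator is CCE-based and uses an optimal weight derived from the asymptotic risk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy DGP: the slope shifts from 1.0 to 1.5 at the break; the post-break slope is the target.
T, T_break = 200, 120
x = rng.normal(size=T)
beta = np.where(np.arange(T) < T_break, 1.0, 1.5)
y = beta * x + rng.normal(scale=0.5, size=T)

def ols_slope(xs, ys):
    """OLS slope and a simple (homoskedastic) variance estimate."""
    b = xs @ ys / (xs @ xs)
    resid = ys - b * xs
    var_b = resid @ resid / (len(xs) - 1) / (xs @ xs)
    return b, var_b

b_full, v_full = ols_slope(x, y)                       # uses both pre- and post-break observations
b_post, v_post = ols_slope(x[T_break:], y[T_break:])   # uses only post-break observations

# Wald-type distance between the two estimators; a large D signals a sizeable break.
D = (b_post - b_full) ** 2 / (v_post + v_full)

# Stein-type weight on the full-sample estimator: shrink towards it only when D is small.
tau = 1.0            # illustrative tuning constant (the paper derives an optimal weight)
w = min(1.0, tau / D)

b_combined = w * b_full + (1.0 - w) * b_post
print(f"full={b_full:.3f}  post={b_post:.3f}  combined={b_combined:.3f}  weight={w:.3f}")
```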
By: | Zongwu Cai (Department of Economics, The University of Kansas, Lawrence, KS 66045, USA); Gunawan (Faculty of Economics and Business, Universitas Gadjah Mada, Yogyakarta 55281, Indonesia); Yuying Sun (Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China) |
Abstract: | This paper proposes a new nonparametric forecasting procedure based on a weighted local linear estimator for a nonparametric model with structural breaks. The proposed method assigns each observation a weight based on both its distance to the predictor covariates and its location in time, and the weight is chosen by multifold forward-validation to account for the time-series nature of the data. We investigate the asymptotic properties of the proposed estimator and show that the weight estimated by multifold forward-validation is asymptotically optimal in the sense of achieving the lowest possible out-of-sample prediction risk. Additionally, a nonparametric method is adopted to estimate the break date, and the proposed approach allows the features of the predictors to differ before and after the break. A Monte Carlo simulation study provides evidence that the proposed method outperforms the regular nonparametric post-break and full-sample estimators in forecasting. Finally, an empirical application to volatility forecasting compares several popular parametric and nonparametric methods, including the proposed weighted local linear estimator, demonstrating its superiority over the alternatives.
Keywords: | Combination Forecasting; Model Averaging; multifold forward-validation; Nonparametric Model; Structural Break Model; Weighted Local Linear Fitting |
JEL: | C14 C22 C53 |
Date: | 2024–09 |
URL: | https://d.repec.org/n?u=RePEc:kan:wpaper:202412 |
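The sketch below illustrates the flavour of a weighted local linear fit in which pre-break observations are downweighted and the time weight is chosen by a forward-validation grid search. The Gaussian kernel, the single pre-break weight `lam`, the bandwidth, and the validation fold are assumptions made for illustration; they are not the paper's procedure, which also estimates the break date nonparametrically.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy series whose regression function changes at t = T_break.
T, T_break = 300, 200
x = rng.uniform(-2, 2, size=T)
m = np.where(np.arange(T) < T_break, np.sin(x), np.sin(x) + 0.8 * x)
y = m + rng.normal(scale=0.3, size=T)

def wll_fit(x0, xs, ys, w_time, h=0.4):
    """Weighted local linear fit at x0: Gaussian kernel weights in x times time weights."""
    k = np.exp(-0.5 * ((xs - x0) / h) ** 2) * w_time
    X = np.column_stack([np.ones_like(xs), xs - x0])
    W = np.diag(k)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ ys)
    return beta[0]                                      # local intercept = fitted value at x0

def forecast(lam, t_end, x0):
    """Prediction at x0 using data up to t_end; pre-break data downweighted by lam in [0, 1]."""
    w_time = np.where(np.arange(t_end) < T_break, lam, 1.0)
    return wll_fit(x0, x[:t_end], y[:t_end], w_time)

# Forward-validation over a grid of pre-break weights, using the last 40 points as folds.
grid = np.linspace(0.0, 1.0, 11)
folds = range(T - 40, T)
risk = [np.mean([(y[t] - forecast(lam, t, x[t])) ** 2 for t in folds]) for lam in grid]
lam_star = grid[int(np.argmin(risk))]
print(f"selected pre-break weight: {lam_star:.1f}")
```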
By: | Zongwu Cai (Department of Economics, The University of Kansas, Lawrence, KS 66045, USA); Ying Fang (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, Fujian 361005, China and Department of Statistics & Data Science, School of Economics, Xiamen University, Xiamen, Fujian 361005, China); Ming Lin (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, Fujian 361005, China and Department of Statistics and Data Science, School of Economics, Xiamen University, Xiamen, Fujian 361005, China); Yaqian Wu (School of Economics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China) |
Abstract: | In this paper, we propose a new method to estimate counterfactual distribution functions via optimal distribution balancing weights, thereby avoiding the estimation of inverse propensity weights, which is sensitive to model specification, can produce unstable estimates, and often fails to adequately balance covariates. First, we demonstrate that the estimated weights exactly balance the estimated conditional distributions among the treated, untreated, and combined groups via a well-defined convex optimization problem. Second, we show that the resulting estimator of the counterfactual distribution function converges weakly to a mean-zero Gaussian process at the parametric root-n rate. Additionally, we show, with theoretical justification, that a properly designed bootstrap method can be used to construct confidence intervals for statistical inference. Furthermore, with the estimates of the counterfactual distribution functions, we provide methods to estimate quantile treatment effects and to test stochastic dominance relationships between the potential outcome distributions. Moreover, Monte Carlo simulations illustrate that the finite sample performance of the proposed estimator is better than that of inverse propensity score weighted estimators in many scenarios. Finally, our empirical study revisits the effect of maternal smoking on infant birth weight.
Keywords: | Counterfactual distribution function; Covariate balance; Quantile treatment effect; Stochastic dominance; Weighting scheme. |
JEL: | C01 C14 C54 |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:kan:wpaper:202315 |
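As a rough illustration of balancing weights obtained from a convex optimization problem, the sketch below chooses weights on the untreated group so that the weighted covariate distribution matches the combined sample at a grid of thresholds, with a quadratic penalty on the weights. The threshold grid, the penalty, and the toy data are all assumptions; the paper's optimal distribution balancing weights and their convex program differ in detail.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Toy data: one covariate, with different distributions in the treated and untreated groups.
n1, n0 = 150, 200
x_treat = rng.normal(0.5, 1.0, n1)
x_ctrl = rng.normal(0.0, 1.2, n0)
x_all = np.concatenate([x_treat, x_ctrl])

# Balance the distribution of X at a grid of thresholds (indicator moments).
grid = np.quantile(x_all, np.linspace(0.1, 0.9, 9))
target = np.array([(x_all <= c).mean() for c in grid])          # combined-sample CDF at the grid
moments = np.array([[xi <= c for c in grid] for xi in x_ctrl], dtype=float)

def objective(w):
    # Penalise dispersion of the weights to keep the effective sample size large.
    return np.sum((w - 1.0 / n0) ** 2)

constraints = [
    {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},            # weights sum to one
    {"type": "eq", "fun": lambda w: moments.T @ w - target},     # distributional balance
]
res = minimize(objective, np.full(n0, 1.0 / n0), method="SLSQP",
               bounds=[(0.0, None)] * n0, constraints=constraints)
w = res.x

# The weighted control-group CDF now tracks the combined-sample CDF at the grid points.
print(np.round(moments.T @ w - target, 4))
```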
By: | Yadav, Anil (Central Bank of Ireland); McHale, John (University of Galway); Harold, Jason (University of Galway); O'Neill, Stephen (London School of Hygiene & Tropical Medicine) |
Abstract: | Difference-in-Differences and Event-study methods with staggered intervention may provide biased estimates when implemented with a two-way fixed effects (TWFE) estimator in the presence of heterogeneous effects. Recent literature has proposed alternative estimators that are unbiased; however, to date, attention has focused primarily on linear outcome models. This study addresses this gap by extending five of these alternative estimators to count and binary outcomes and assessing their accuracy against the TWFE estimator in Monte Carlo simulations. While unbiased for linear models, some of the estimators yield biased estimates for nonlinear outcomes. An application revisits the statistical association between citations and star coauthorship.
Keywords: | Nonlinear difference-in-differences and Event-study; Staggered intervention; Count and Binary outcomes; Treatment effect heterogeneity. |
JEL: | C13 C18 C22 C23 C35 |
Date: | 2024–07 |
URL: | https://d.repec.org/n?u=RePEc:cbi:wpaper:4/rt/24 |
By: | Jinyuan Chang; Yue Du; Guanglin Huang; Qiwei Yao |
Abstract: | We investigate identification and estimation for matrix time series CP-factor models. Unlike the generalized eigenanalysis-based method of Chang et al. (2023), which requires the two factor loading matrices to be full-ranked, the newly proposed estimation can handle rank-deficient factor loading matrices. The estimation procedure consists of the spectral decomposition of several matrices and a matrix joint diagonalization algorithm, resulting in low computational cost. The theoretical guarantee, established without the stationarity assumption, shows that the proposed estimation exhibits a faster convergence rate than that of Chang et al. (2023). In fact, the new estimator is free from the adverse impact of any eigen-gaps, unlike most eigenanalysis-based methods such as that of Chang et al. (2023). Furthermore, in terms of the error rates of the estimation, the proposed procedure is equivalent to handling a vector time series of dimension $\max(p, q)$ instead of $p \times q$, where $(p, q)$ are the dimensions of the matrix time series concerned. We achieve this without assuming the "near orthogonality" of the loadings under the various incoherence conditions often imposed in the CP-decomposition literature; see Han and Zhang (2022), Han et al. (2024), and the references therein. Illustration with both simulated and real matrix time series data shows the usefulness of the proposed approach.
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2410.05634 |
By: | Liu, Yirui; Qiao, Xinghao; Pei, Yulong; Wang, Liying |
Abstract: | This paper introduces the Deep Functional Factor Model (DF2M), a Bayesian nonparametric model designed for analysis of high-dimensional functional time series. DF2M is built upon the Indian Buffet Process and the multi-task Gaussian Process, incorporating a deep kernel function that captures non-Markovian and nonlinear temporal dynamics. Unlike many black-box deep learning models, DF2M offers an explainable approach to utilizing neural networks by constructing a factor model and integrating deep neural networks within the kernel function. Additionally, we develop a computationally efficient variational inference algorithm to infer DF2M. Empirical results from four real-world datasets demonstrate that DF2M provides better explainability and superior predictive accuracy compared to conventional deep learning models for high-dimensional functional time series. |
JEL: | C1 |
Date: | 2024–07–21 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:125587 |
By: | Gabriele Fiorentini; Alessio Moneta; Francesca Papagni |
Abstract: | We establish the identification of a specific shock in a structural vector autoregressive model under the assumption that this shock is independent of the other shocks in the system, without requiring the latter shocks to be mutually independent, unlike the typical assumptions in the independent component analysis literature. The shock of interest can be either non-Gaussian or Gaussian, but, in the latter case, the other shocks must be jointly non-Gaussian. We formally prove the global identification of the shock and the associated column of the impact multiplier matrix, and discuss parameter estimation by maximum likelihood. We conduct a detailed Monte Carlo simulation to illustrate the finite sample behavior of our identification and estimation procedure. Finally, we estimate the dynamic effect of a contraction in economic activity on some measures of economic policy uncertainty. |
Keywords: | Independent component analysis, Non-Gaussian maximum likelihood, Impact multipliers, Economic policy uncertainty |
Date: | 2024–10–31 |
URL: | https://d.repec.org/n?u=RePEc:ssa:lemwps:2024/28 |
By: | Yang, Xuzhi; Wang, Tengyao |
Abstract: | Composite quantile regression has been used to obtain robust estimators of regression coefficients in linear models with good statistical efficiency. By revealing an intrinsic link between the composite quantile regression loss function and the Wasserstein distance from the residuals to the set of quantiles, we establish a generalization of composite quantile regression to multiple-output settings. Theoretical convergence rates of the proposed estimator are derived both under the setting where the additive error possesses only a finite ℓ-th moment (for ℓ > 2) and where it exhibits a sub-Weibull tail. In doing so, we develop novel techniques for analyzing M-estimation problems that involve the Wasserstein distance in the loss. Numerical studies confirm the practical effectiveness of our proposed procedure.
Keywords: | multivariate quantiles; optimal transport; quantile regression; robust estimation |
JEL: | C1 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:125589 |
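For context, the classical single-output composite quantile regression estimator that the paper generalizes minimizes a sum of pinball losses across several quantile levels while sharing one slope vector. The sketch below solves this numerically with a generic optimizer; the quantile grid, the heavy-tailed toy errors, and the Nelder-Mead solver are illustrative choices, and the paper's multiple-output, Wasserstein-based formulation is not implemented here.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

# Toy linear model with heavy-tailed (t_3) noise; the slope vector is the target.
n, p = 200, 2
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + rng.standard_t(df=3, size=n)

taus = np.array([0.1, 0.3, 0.5, 0.7, 0.9])   # composite quantile levels

def cqr_loss(theta):
    """Sum of pinball losses over all quantile levels, sharing one slope vector."""
    beta, b = theta[:p], theta[p:]
    loss = 0.0
    for tau, bk in zip(taus, b):
        u = y - X @ beta - bk
        loss += np.sum(np.where(u >= 0, tau * u, (tau - 1) * u))
    return loss

# Start from OLS for the slopes and residual quantiles for the level-specific intercepts.
beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
b0 = np.quantile(y - X @ beta0, taus)
res = minimize(cqr_loss, np.concatenate([beta0, b0]), method="Nelder-Mead",
               options={"maxiter": 20000, "xatol": 1e-6, "fatol": 1e-6})

print("CQR slope estimate:", np.round(res.x[:p], 3))
print("OLS slope estimate:", np.round(beta0, 3))
```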
By: | Simar, Léopold (Université catholique de Louvain, LIDAM/ISBA, Belgium); Wilson, Paul (Clemson University) |
Abstract: | Kneip, Simar and Wilson (Journal of Business and Economic Statistics, 2016) and Daraio, Simar and Wilson (The Econometrics Journal, 2018) provide non-parametric tests of (i) convexity versus non-convexity of the production set, (ii) constant versus non-constant returns-to-scale of the frontier, and (iii) separability versus non-separability of the frontier with respect to environmental variables. Among other uses, these tests are essential for deciding which non-parametric efficiency estimator should be used to estimate technical efficiency. Each test requires randomly splitting the sample. Although theory establishes that the tests are valid for any random split, results can vary with different splits. This paper provides a computationally efficient method to aggregate test outcomes across multiple sample-splits using ideas from the statistical literature on controlling false discovery rates in multiple testing situations. We provide tests using multiple sample-splits (to remove the ambiguity resulting from a single sample-split) and extensive Monte Carlo evidence on the size and power of our tests. The computational time required by the new tests is about 0.001 times the computational time required by the bootstrap method proposed by Simar and Wilson (Journal of Productivity Analysis, 2020).
Keywords: | Hypothesis testing ; inference ; multiple splits ; convexity ; returns to scale ; separability ; DEA ; FDH |
JEL: | C12 C44 C63 |
Date: | 2024–04–10 |
URL: | https://d.repec.org/n?u=RePEc:aiz:louvad:2024012 |
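One well-known device for aggregating results across repeated random sample-splits is the quantile rule of Meinshausen, Meier and Bühlmann (2009), sketched below for p-values; with gamma = 0.5 it reduces to "twice the median". This is shown only to illustrate the general idea of removing single-split ambiguity; the paper's own aggregation is based on false-discovery-rate control and differs from this rule.

```python
import numpy as np

def aggregate_pvalues(p_values, gamma=0.5):
    """
    Aggregate p-values from repeated random sample-splits of the same test into a single
    valid p-value via the quantile rule of Meinshausen, Meier and Buehlmann (2009):
    p_agg = min(1, gamma-quantile of {p_s / gamma}).  gamma = 0.5 gives 'twice the median'.
    """
    p_values = np.asarray(p_values, dtype=float)
    return min(1.0, np.quantile(p_values / gamma, gamma))

# Example: p-values from 100 hypothetical random splits (illustrative numbers only).
rng = np.random.default_rng(4)
p_splits = rng.beta(0.4, 2.0, size=100)
print(f"median p across splits: {np.median(p_splits):.3f}")
print(f"aggregated p-value:     {aggregate_pvalues(p_splits):.3f}")
```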
By: | Yixiao Sun (University of California, San Diego); Peter C. B. Phillips (Yale University); Igor L. Kheifets (University of North Carolina at Charlotte) |
Abstract: | This note shows that the mixed normal asymptotic limit of the trend IV estimator with a fixed number of deterministic instruments (fTIV) holds in both singular (multicointegrated) and nonsingular cointegration systems, thereby relaxing the exogeneity condition in Phillips and Kheifets (2024, Theorem 1(ii)). The mixed normality of the limiting distribution of fTIV allows for asymptotically pivotal F tests about the cointegration parameters and for simple efficiency comparisons of the estimators for different numbers K of instruments, as well as comparisons with the trend IV estimator when K → ∞ with the sample size.
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:cwl:cwldpp:2410 |
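The sketch below illustrates the basic idea of a trend IV estimator: instrumenting an integrated regressor with deterministic functions of time and applying two-stage least squares. The random-walk toy system, the trigonometric trend instruments, and the choice K = 4 are assumptions; the note's fTIV estimator and its mixed normal limit theory are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy cointegrated system: x_t is a random walk and y_t = beta * x_t + u_t, where u_t is
# correlated with the innovations of x_t (endogeneity).
T, beta = 500, 2.0
e = rng.normal(size=T)
x = np.cumsum(e)
u = 0.6 * e + rng.normal(scale=0.5, size=T)
y = beta * x + u

# Deterministic trend instruments: K trigonometric functions of scaled time.
t = np.arange(1, T + 1) / T
K = 4
Z = np.column_stack([np.sin(np.pi * k * t) for k in range(1, K + 1)])

# Two-stage least squares with the deterministic instruments.
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]     # first stage: project x on the trends
beta_iv = (x_hat @ y) / (x_hat @ x)                  # second stage
beta_ols = (x @ y) / (x @ x)

print(f"OLS: {beta_ols:.3f}   trend-IV (K={K}): {beta_iv:.3f}   true: {beta}")
```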
By: | Liudas Giraitis (Queen Mary University of London); Fulvia Marotta (De Nederlandsche Bank, University of Oxford); Peter C B Phillips (Yale University) |
Abstract: | This paper builds on methodology that corrects for irregular spacing between realizations of unevenly spaced time series and provides appropriately corrected estimates of autoregressive model parameters. Using these methods for dealing with missing data, we develop time series tools for forecasting and estimation of autoregressions with cyclically varying parameters in which periodicity is assumed. To illustrate the robustness and flexibility of the methodology, an application is conducted to model daily temperature data. The approach helps to uncover cyclical (daily as well as annual) patterns in the data without imposing restrictive assumptions. Using the Central England Temperature (CET) time series (1772–present), we find, with a high level of accuracy, that intra-year temperature averages and persistence have increased in the later sample (1850–2020) compared with the earlier sample (1772–1850), especially in the winter months, whereas the estimated variance of the random shocks in the autoregression appears to have decreased over time.
Date: | 2024–09 |
URL: | https://d.repec.org/n?u=RePEc:cwl:cwldpp:2409 |
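A standard way to handle irregular spacing in an AR(1), in the spirit of the correction described above, is to exploit that E[y_{t+d} | y_t] = phi^d y_t and fit phi by nonlinear least squares across the observed gaps. The sketch below does exactly that on simulated data; it is a stylized device and does not include the paper's cyclically varying parameters or forecasting tools.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(6)

# Simulate a daily AR(1) series, then observe it only at irregularly spaced dates.
T, phi_true = 2000, 0.8
y_full = np.zeros(T)
for t in range(1, T):
    y_full[t] = phi_true * y_full[t - 1] + rng.normal()
obs_times = np.sort(rng.choice(T, size=600, replace=False))
y = y_full[obs_times]
gaps = np.diff(obs_times)            # uneven spacing between consecutive observations

def nls_objective(phi):
    """For an AR(1), E[y_{t+d} | y_t] = phi**d * y_t; fit this relation across uneven gaps d."""
    pred = phi ** gaps * y[:-1]
    return np.sum((y[1:] - pred) ** 2)

res = minimize_scalar(nls_objective, bounds=(0.01, 0.99), method="bounded")
print(f"true phi: {phi_true}   estimated phi from unevenly spaced data: {res.x:.3f}")
```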
By: | Breitung, Jörg; Bolwin, Lennart; Töns, Justus |
JEL: | C22 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:zbw:vfsc24:302344 |
By: | Li, Mengxue (Université catholique de Louvain, LIDAM/ISBA, Belgium); von Sachs, Rainer (Université catholique de Louvain, LIDAM/ISBA, Belgium); Pircalabelu, Eugen (Université catholique de Louvain, LIDAM/ISBA, Belgium) |
Abstract: | Recent interest has emerged in community detection for dynamic networks observed along a trajectory of points in time. In this paper, we present a time-varying degree-corrected stochastic block model to fit a dynamic network, allowing the heterogeneity in node degrees within a community to evolve over time. To account for the influence of the varying time window over which network information from different time points is aggregated, we propose a smoothing-based estimation method to recover the time-varying degree parameters and communities. We also provide rates of consistency of our smoothed estimators for the degree parameters and communities using a time-localised profile-likelihood approach. Extensive simulation studies and applications to two different real data sets complete our work.
Keywords: | Dynamic network ; Community detection ; Time-localised profile-likelihood ; Nonparametric curve estimation |
Date: | 2024–04–21 |
URL: | https://d.repec.org/n?u=RePEc:aiz:louvad:2024014 |
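A common first pass at community detection in such dynamic networks is to kernel-smooth the adjacency matrices over time and apply a spectral method with row normalisation to absorb degree heterogeneity, as sketched below. The Gaussian time kernel, bandwidth, and spectral clustering step are illustrative assumptions; the paper's estimation is based on a time-localised profile-likelihood, not on this spectral shortcut.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)

# Toy dynamic network: n nodes, 2 communities, S snapshots, node-specific degree parameters.
n, S, K = 60, 10, 2
z = np.repeat([0, 1], n // 2)                         # true community labels
theta = rng.uniform(0.5, 1.5, size=n)                 # degree-correction parameters
B = np.array([[0.5, 0.1], [0.1, 0.5]])                # block connection probabilities

A = np.zeros((S, n, n))
for s in range(S):
    P = np.clip(np.outer(theta, theta) * B[np.ix_(z, z)], 0, 1)
    upper = rng.random((n, n)) < P
    A[s] = np.triu(upper, 1).astype(float)
    A[s] += A[s].T                                     # symmetric adjacency, no self-loops

# Kernel-smoothed adjacency around a target time point s0 (Gaussian weights in time).
s0, h = 5, 2.0
w = np.exp(-0.5 * ((np.arange(S) - s0) / h) ** 2)
A_bar = np.tensordot(w / w.sum(), A, axes=1)

# Spectral step: leading eigenvectors of the smoothed adjacency, rows normalised to absorb
# degree heterogeneity, then k-means on the rows.
vals, vecs = np.linalg.eigh(A_bar)
U = vecs[:, np.argsort(np.abs(vals))[-K:]]
U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(U)

agreement = max(np.mean(labels == z), np.mean(labels == 1 - z))
print(f"community recovery agreement at t={s0}: {agreement:.2f}")
```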
By: | Bauwens, Luc (Université catholique de Louvain, LIDAM/CORE, Belgium); Dzuverovic, Emilija (Universita di Pisa); Hafner, Christian (Université catholique de Louvain, LIDAM/ISBA, Belgium) |
Abstract: | We introduce asymmetric effects in the BEKK-type conditional autoregressive Wishart model for realized covariance matrices. The asymmetry terms are specified either by interacting the lagged realized covariances with the signs of the lagged daily returns or by using the decomposition of the lagged realized covariance matrix into positive, negative, and mixed semi-covariances, thus relying on the lagged intra-daily returns and their signs. We provide a detailed comparison of models with different complexity, for example with respect to restrictions on the parameter matrices. In an extensive empirical study, our results suggest that the asymmetric models outperform the symmetric one in terms of statistical and economic criteria. The asymmetric models using the signs of the daily returns tend to have a better in-sample fit and out-of-sample predictive ability than the models using the signed intra-daily returns. |
Keywords: | High frequency data ; asymmetric volatility ; realized covariance ; conditional autoregressive Wishart model |
Date: | 2024–10–08 |
URL: | https://d.repec.org/n?u=RePEc:aiz:louvad:2024022 |
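The decomposition of a realized covariance matrix into positive, negative, and mixed semicovariances mentioned above can be computed directly from the signed intra-daily returns, as in the sketch below (the toy return matrix is an assumption): writing each return vector as the sum of its positive and negative parts gives RC = P + N + M by construction.

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy intraday data: m intra-daily return vectors on d assets for one trading day.
m, d = 78, 3
cov = 0.01 * np.array([[1.0, 0.3, 0.2],
                       [0.3, 1.0, 0.4],
                       [0.2, 0.4, 1.0]])
r = rng.multivariate_normal(np.zeros(d), cov, size=m)

r_pos = np.maximum(r, 0.0)          # positive parts of the intra-daily returns
r_neg = np.minimum(r, 0.0)          # negative parts

# Realized semicovariance decomposition: RC = P + N + M.
P = r_pos.T @ r_pos                               # positive semicovariance
N = r_neg.T @ r_neg                               # negative semicovariance
M = r_pos.T @ r_neg + r_neg.T @ r_pos             # mixed semicovariance
RC = r.T @ r                                      # realized covariance matrix

print("max |P + N + M - RC| =", np.abs(P + N + M - RC).max())
```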
By: | Gaurab Aryal; Isabelle Perrigne; Quang Vuong; Haiqing Xu |
Abstract: | In this paper, we address the identification and estimation of insurance models where insurees have private information about their risk and risk aversion. The model includes random damages and allows for several claims, while insurers choose from a finite number of coverages. We show that the joint distribution of risk and risk aversion is nonparametrically identified despite bunching due to multidimensional types and a finite number of coverages. Our identification strategy exploits the observed number of claims as well as an exclusion restriction, and a full support assumption. Furthermore, our results apply to any form of competition. We propose a novel estimation procedure combining nonparametric estimators and GMM estimation that we illustrate in a Monte Carlo study. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2410.08416 |
By: | Schuessler, Julian (Aarhus University) |
Abstract: | Causal inference plays a central role in the social sciences. This chapter discusses key questions in causal inquiry: What distinguishes causal questions from descriptive or predictive ones? How can we reason about the assumptions required for causal analysis, and how can we test these assumptions? Using structural causal models and directed acyclic graphs, the chapter explores how to define causal estimands, assess the feasibility of learning about them from data (identification), and evaluate sensitivity to assumption violations. It discusses concrete problems and phenomena such as choosing control variables, post-treatment bias, causal interaction, effect heterogeneity, and mediation. Central issues are exemplified by an analysis of the relationship between exposure to violence and attitudes towards peace among survey respondents in Darfur.
Date: | 2024–10–10 |
URL: | https://d.repec.org/n?u=RePEc:osf:osfxxx:wam94 |
By: | Yongchan Kwon; Sokbae Lee; Guillaume A. Pouliot |
Abstract: | We propose a variant of the Shapley value, the group Shapley value, to interpret counterfactual simulations in structural economic models by quantifying the importance of different components. Our framework compares two sets of parameters, partitioned into multiple groups, and applying group Shapley value decomposition yields unique additive contributions to the changes between these sets. The relative contributions sum to one, enabling us to generate an importance table that is as easily interpretable as a regression table. The group Shapley value can be characterized as the solution to a constrained weighted least squares problem. Using this property, we develop robust decomposition methods to address scenarios where inputs for the group Shapley value are missing. We first apply our methodology to a simple Roy model and then illustrate its usefulness by revisiting two published papers. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2410.06875 |
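The sketch below computes group Shapley values by brute-force enumeration of subsets of parameter groups: the value function evaluates a hypothetical model output with the groups in a subset switched from the baseline to the new parameter values, and each group's contribution is the usual Shapley-weighted average of its marginal contributions. The three-group toy model is an assumption; the paper's weighted-least-squares characterization and robustness to missing inputs are not implemented.

```python
import numpy as np
from itertools import combinations
from math import factorial

# Illustrative model output as a function of three parameter groups (hypothetical).
def model_output(params):
    a, b, c = params["a"], params["b"], params["c"]
    return a * b + np.exp(0.5 * c) + 0.1 * a * c

baseline = {"a": 1.0, "b": 2.0, "c": 0.0}     # first set of parameters (e.g. counterfactual)
new = {"a": 1.5, "b": 2.5, "c": 1.0}          # second set of parameters (e.g. factual)

groups = list(baseline)
G = len(groups)

def value(S):
    """Model output when groups in S take their new values and the rest stay at baseline."""
    params = {g: (new[g] if g in S else baseline[g]) for g in groups}
    return model_output(params)

# Group Shapley value: weighted average of marginal contributions over all subsets.
phi = {}
for g in groups:
    others = [h for h in groups if h != g]
    total = 0.0
    for size in range(G):
        for S in combinations(others, size):
            weight = factorial(size) * factorial(G - size - 1) / factorial(G)
            total += weight * (value(set(S) | {g}) - value(set(S)))
    phi[g] = total

change = value(set(groups)) - value(set())
shares = {g: phi[g] / change for g in groups}
print("additive contributions:", {g: round(v, 4) for g, v in phi.items()})
print("relative shares (sum to 1):", {g: round(v, 4) for g, v in shares.items()})
print("efficiency check (should be ~0):", round(sum(phi.values()) - change, 10))
```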
By: | Duarte, Belmiro P.M.; Atkinson, Anthony C.; Oliveira, Nuno M.C. |
Abstract: | Nonlinear regression is frequently used to fit nonlinear relations between response variables and regressors in process data. The procedure involves minimizing the squared norm of the residuals with respect to the model parameters. Nonlinear least squares may lead to parametric collinearity, multiple optima, and computational inefficiency. One strategy to handle collinearity is model reparameterization, i.e., the replacement of the original set of parameters by another with better orthogonality properties. In this paper we propose a systematic strategy for model reparameterization based on the response surface generated from a carefully chosen set of points. This is illustrated with the support points of locally K-optimal experimental designs, which generate a set of analytical equations that allow the construction of a transformation to a set of parameters with better orthogonality properties. Recognizing the difficulties in generalizing the technique to complex models, we propose a related alternative approach based on a first-order Taylor approximation of the model. Our approach is tested with both linear and nonlinear models. The Variance Inflation Factor and the condition number, as well as the orientation and eccentricity of the parametric confidence region, are used for comparisons.
Keywords: | K-optimal design of experiments; model reparameterization; nonlinear regression; semidefinite programming; support points |
JEL: | C1 |
Date: | 2023–08–15 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:122986 |
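The collinearity diagnostics mentioned in the abstract can be illustrated on a small nonlinear model by evaluating the Jacobian of the model with respect to its parameters and computing the condition number and variance inflation factors of its columns, as sketched below. The exponential toy model and the evaluation point are assumptions; the reparameterization strategy itself is not implemented here.

```python
import numpy as np

# Illustrative two-parameter exponential model y = theta1 * exp(-theta2 * x); its parameters
# are often strongly correlated, which the diagnostics below pick up.
x = np.linspace(0.0, 2.0, 25)
theta = np.array([2.0, 1.3])                     # evaluation point (e.g. current estimates)

# Jacobian of the model with respect to the parameters, evaluated at theta.
J = np.column_stack([np.exp(-theta[1] * x),                  # d f / d theta1
                     -theta[0] * x * np.exp(-theta[1] * x)]) # d f / d theta2

# Condition number of the column-scaled Jacobian: large values signal near-collinearity.
Js = J / np.linalg.norm(J, axis=0)
cond = np.linalg.cond(Js)

# Variance inflation factors from the correlation matrix of the Jacobian columns.
Rinv = np.linalg.inv(np.corrcoef(J, rowvar=False))
vif = np.diag(Rinv)

print(f"condition number: {cond:.2f}")
print("VIFs:", np.round(vif, 2))
```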
By: | Haodong Liang; Krishnakumar Balasubramanian; Lifeng Lai |
Abstract: | We explore the capability of transformers to address endogeneity in in-context linear regression. Our main finding is that transformers inherently possess a mechanism to handle endogeneity effectively using instrumental variables (IV). First, we demonstrate that the transformer architecture can emulate a gradient-based bi-level optimization procedure that converges to the widely used two-stage least squares $(\textsf{2SLS})$ solution at an exponential rate. Next, we propose an in-context pretraining scheme and provide theoretical guarantees showing that the global minimizer of the pre-training loss achieves a small excess loss. Our extensive experiments validate these theoretical findings, showing that the trained transformer provides more robust and reliable in-context predictions and coefficient estimates than the $\textsf{2SLS}$ method, in the presence of endogeneity. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2410.01265 |
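For reference, the two-stage least squares solution that the transformer is shown to emulate has a simple closed form, sketched below on a toy endogenous design (the instrument strength and error correlation are assumptions): project the endogenous regressor on the instruments, then regress the outcome on the fitted values.

```python
import numpy as np

rng = np.random.default_rng(9)

# Linear model with endogeneity: x is correlated with the error u, z is a valid instrument set.
n = 1000
z = rng.normal(size=(n, 2))                       # instruments
v = rng.normal(size=n)
u = 0.8 * v + rng.normal(scale=0.5, size=n)       # structural error, correlated with x through v
x = z @ np.array([1.0, -0.5]) + v                 # first-stage relation
beta_true = 1.5
y = beta_true * x + u

X = x[:, None]

# Two-stage least squares: project X on Z, then regress y on the fitted values.
X_hat = z @ np.linalg.lstsq(z, X, rcond=None)[0]  # first-stage fitted values
beta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"OLS (biased): {beta_ols[0]:.3f}   2SLS: {beta_2sls[0]:.3f}   true: {beta_true}")
```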
By: | Li, Ting; Shi, Chengchun; Wen, Qianglin; Sui, Yang; Qin, Yongli; Lai, Chunbo; Zhu, Hongtu |
Abstract: | This paper studies policy evaluation with multiple data sources, especially in scenarios that involve one experimental dataset with two arms, complemented by a historical dataset generated under a single control arm. We propose novel data integration methods that linearly combine base policy value estimators constructed from the experimental and historical data, with weights optimized to minimize the mean square error (MSE) of the resulting combined estimator. We further apply the pessimistic principle to obtain more robust estimators, and extend these developments to sequential decision making. Theoretically, we establish non-asymptotic error bounds for the MSEs of our proposed estimators, and derive their oracle, efficiency, and robustness properties across a broad spectrum of reward shift scenarios. Numerical experiments and real-data-based analyses from a ridesharing company demonstrate the superior performance of the proposed estimators.
JEL: | C1 |
Date: | 2024–07–21 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:125588 |
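A stylized version of the linear integration step is sketched below: if the experimental estimator is unbiased and the historical estimator carries a (known or estimated) bias from a reward shift, and the two are independent, the MSE-minimising combination weight has a closed form. The numerical inputs are assumptions, and the paper's estimators, pessimistic adjustment, and sequential extensions are not reproduced.

```python
import numpy as np

def combine(theta_exp, var_exp, theta_hist, var_hist, bias_hist):
    """
    MSE-minimising linear combination w*theta_exp + (1-w)*theta_hist when theta_exp is
    unbiased with variance var_exp, theta_hist has bias bias_hist and variance var_hist,
    and the two estimators are independent:
        MSE(w) = w**2 * var_exp + (1-w)**2 * (var_hist + bias_hist**2),
    minimised at w = (var_hist + bias_hist**2) / (var_exp + var_hist + bias_hist**2).
    """
    w = (var_hist + bias_hist ** 2) / (var_exp + var_hist + bias_hist ** 2)
    theta = w * theta_exp + (1 - w) * theta_hist
    mse = w ** 2 * var_exp + (1 - w) ** 2 * (var_hist + bias_hist ** 2)
    return theta, w, mse

# Example: a noisy experimental estimate and a precise but slightly shifted historical one.
theta_exp, var_exp = 1.10, 0.04        # experimental arm: unbiased, higher variance
theta_hist, var_hist = 0.95, 0.005     # historical data: lower variance, possible reward shift
bias_hist = 0.10                       # assumed magnitude of the shift

theta_c, w, mse = combine(theta_exp, var_exp, theta_hist, var_hist, bias_hist)
print(f"weight on experimental estimator: {w:.3f}")
print(f"combined estimate: {theta_c:.3f}  (MSE {mse:.4f} vs experimental-only {var_exp:.4f})")
```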
By: | Allen, Sam (ETH Zurich); Koh, Jonathan (University of Bern); Segers, Johan (Université catholique de Louvain, LIDAM/ISBA, Belgium); Ziegel, Johanna (ETH Zurich) |
Abstract: | Probabilistic forecasts comprehensively describe the uncertainty in the unknown future outcome, making them essential for decision making and risk management. While several methods have been introduced to evaluate probabilistic forecasts, existing evaluation techniques are ill-suited to the evaluation of tail properties of such forecasts. However, these tail properties are often of particular interest to forecast users due to the severe impacts caused by extreme outcomes. In this work, we introduce a general notion of tail calibration for probabilistic forecasts, which allows forecasters to assess the reliability of their predictions for extreme outcomes. We study the relationships between tail calibration and standard notions of forecast calibration, and discuss connections to peaks-over-threshold models in extreme value theory. Diagnostic tools are introduced and applied in a case study on European precipitation forecasts. |
Keywords: | Extreme event ; proper scoring rule ; forecast evaluation ; tail calibration diagnostic plot ; precipitation forecast |
Date: | 2024–07–04 |
URL: | https://d.repec.org/n?u=RePEc:aiz:louvad:2024018 |
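One simple diagnostic in the spirit of tail calibration is to restrict probability integral transform (PIT) values to outcomes exceeding a high forecast quantile and rescale them to the conditional forecast distribution given an exceedance; under tail calibration these rescaled values should again look roughly uniform. The sketch below applies this to a toy example in which the issued forecasts are too light-tailed; the exact definitions and diagnostic plots used in the paper may differ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)

# Toy setup: outcomes are N(0, 1.3^2) but the forecaster issues N(0, 1) forecasts, so the
# forecasts look roughly calibrated in the centre yet are too light-tailed.
n = 20000
y = rng.normal(scale=1.3, size=n)
forecast = stats.norm(loc=0.0, scale=1.0)        # the issued predictive distribution

# Standard PIT values (uniform on [0, 1] under probabilistic calibration).
pit = forecast.cdf(y)

# Tail-conditional PIT: restrict to outcomes above a high forecast quantile and rescale the
# PIT to the conditional forecast distribution given an exceedance.
threshold = forecast.ppf(0.95)
exceed = y > threshold
pit_tail = (forecast.cdf(y[exceed]) - 0.95) / 0.05

print(f"mean PIT overall (target 0.5):             {pit.mean():.3f}")
print(f"mean tail-conditional PIT (target 0.5):    {pit_tail.mean():.3f}")
print(f"share of tail PITs above 0.9 (target 0.1): {(pit_tail > 0.9).mean():.3f}")
```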