New Economics Papers on Econometrics |
By: | Chen, J.; Li, D.; Li, Y.; Linton, O. B. |
Abstract: | We explore time-varying networks for high-dimensional locally stationary time series, using the large VAR model framework with both the transition and (error) precision matrices evolving smoothly over time. Two types of time-varying graphs are investigated: one containing directed edges of Granger causality linkages, and the other containing undirected edges of partial correlation linkages. Under the sparse structural assumption, we propose a penalised local linear method with time-varying weighted group LASSO to jointly estimate the transition matrices and identify their significant entries, and a time-varying CLIME method to estimate the precision matrices. The estimated transition and precision matrices are then used to determine the time-varying network structures. Under some mild conditions, we derive the theoretical properties of the proposed estimates including the consistency and oracle properties. In addition, we extend the methodology and theory to cover highly-correlated large-scale time series, for which the sparsity assumption becomes invalid and we allow for common factors before estimating the factor-adjusted time-varying networks. We provide extensive simulation studies and an empirical application to a large U.S. macroeconomic dataset to illustrate the finite-sample performance of our methods. |
Keywords: | CLIME, Factor model, Granger causality, lasso, local linear smoothing, partial correlation, time-varying network, VAR |
JEL: | C13 C14 C32 C38 |
Date: | 2022–12–14 |
URL: | http://d.repec.org/n?u=RePEc:cam:camdae:2273&r=ecm |
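To make the local smoothing idea above concrete, here is a minimal Python sketch (not the authors' estimator) of kernel-weighted least squares for a smoothly varying VAR(1) transition matrix. It omits the local linear correction, the time-varying weighted group LASSO penalty, and the CLIME step for the precision matrix described in the abstract; the function name tv_var1_local_ls and the toy data are purely illustrative.

```python
import numpy as np

def tv_var1_local_ls(X, t0, bandwidth):
    """Kernel-weighted least squares for a time-varying VAR(1) transition
    matrix A(t0). X has shape (T, p). Local-constant illustration only:
    no local linear terms, no group LASSO penalty, no CLIME step."""
    T, p = X.shape
    times = np.arange(1, T) / T                      # rescaled time of each response
    u = (times - t0) / bandwidth
    w = np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)  # Epanechnikov kernel weights
    Y = X[1:]                                        # responses X_t
    Z = X[:-1]                                       # lagged regressors X_{t-1}
    W = np.diag(w)
    # weighted least squares: A(t0)' solves (Z'WZ) B = Z'WY
    B, *_ = np.linalg.lstsq(Z.T @ W @ Z, Z.T @ W @ Y, rcond=None)
    return B.T                                       # rows = equations, cols = lags

# toy example: VAR(1) whose transition matrix drifts smoothly over time
rng = np.random.default_rng(0)
T, p = 400, 3
X = np.zeros((T, p))
for t in range(1, T):
    A_t = np.eye(p) * (0.2 + 0.5 * t / T)            # diagonal rises from 0.2 to 0.7
    X[t] = A_t @ X[t - 1] + rng.normal(scale=0.5, size=p)

print(np.round(tv_var1_local_ls(X, t0=0.25, bandwidth=0.15), 2))
print(np.round(tv_var1_local_ls(X, t0=0.75, bandwidth=0.15), 2))
```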
By: | Hugo Bodory; Martin Huber; Michael Lechner |
Abstract: | This paper investigates the finite sample performance of a range of parametric, semi-parametric, and non-parametric instrumental variable estimators when controlling for a fixed set of covariates to evaluate the local average treatment effect. Our simulation designs are based on empirical labor market data from the US and vary in several dimensions, including effect heterogeneity, instrument selectivity, instrument strength, outcome distribution, and sample size. Among the estimators and simulations considered, non-parametric estimation based on the random forest (a machine learner controlling for covariates in a data-driven way) performs competitively in terms of the average coverage rates of the (bootstrap-based) 95% confidence intervals, while also being relatively precise. Non-parametric kernel regression as well as certain versions of semi-parametric radius matching on the propensity score, pair matching on the covariates, and inverse probability weighting also have decent coverage, but are less precise than the random forest-based method. In terms of the average root mean squared error of LATE estimation, kernel regression performs best, closely followed by the random forest method, which has the lowest average absolute bias. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.07379&r=ecm |
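For readers unfamiliar with covariate-adjusted LATE estimation, the sketch below shows only the textbook inverse-probability-weighting representation of the LATE (the ratio of weighted intention-to-treat effects on the outcome and on take-up); it is not the random-forest, kernel, or matching implementation compared in the paper, and the Newton-Raphson logit and helper names are assumptions of this illustration.

```python
import numpy as np

def logit_fit(X, z, iters=50):
    """Simple Newton-Raphson logistic regression (intercept added); returns fitted P(z=1|X)."""
    Xd = np.column_stack([np.ones(len(z)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-Xd @ beta))
        beta += np.linalg.solve(Xd.T @ (Xd * (p * (1 - p))[:, None]), Xd.T @ (z - p))
    return 1 / (1 + np.exp(-Xd @ beta))

def ipw_late(y, d, z, X):
    """Weighting estimator of the LATE: ratio of IPW intention-to-treat
    effects on the outcome and on treatment take-up."""
    p = np.clip(logit_fit(X, z), 1e-3, 1 - 1e-3)
    itt_y = np.mean(y * z / p - y * (1 - z) / (1 - p))
    itt_d = np.mean(d * z / p - d * (1 - z) / (1 - p))
    return itt_y / itt_d

# toy data: instrument z depends on x, no always-takers (treatment only possible when z = 1)
rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=(n, 2))
z = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))
d = z * rng.binomial(1, 0.7, n)
y = 1.0 * d + x[:, 1] + rng.normal(size=n)          # true LATE = 1
print(round(ipw_late(y, d, z, x), 2))
```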
By: | Takahide Yanagi |
Abstract: | We develop a difference-in-differences method in a general setting in which the treatment variable of interest may be non-binary and its value may change in each time period. It is generally difficult to estimate treatment parameters defined with the potential outcome given the entire path of treatment adoption, as each treatment path may be experienced by only a small number of observations. We propose an empirically tractable alternative using the concept of effective treatment, which summarizes the treatment path into a low-dimensional variable. Under a parallel trends assumption conditional on observed covariates, we show that doubly robust difference-in-differences estimands can identify certain average treatment effects for movers, even when the chosen effective treatment is misspecified. We consider doubly robust estimation and multiplier bootstrap inference, which are asymptotically justifiable if either an outcome regression function for stayers or a generalized propensity score is correctly parametrically specified. We illustrate the usefulness of our method by estimating the instantaneous and dynamic effects of union membership on wages. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.13226&r=ecm |
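As a point of reference for the doubly robust estimands discussed above, here is a minimal sketch of the canonical two-period, binary-treatment doubly robust difference-in-differences ATT (in the spirit of Sant'Anna and Zhao, 2020), with simple linear and logit nuisance models; the paper's general non-binary, multi-period effective-treatment framework and multiplier bootstrap are not implemented here.

```python
import numpy as np

def dr_did_att(dy, d, X):
    """Doubly robust DiD ATT for the canonical two-period, binary-treatment case:
    dy is the first-differenced outcome, d the treatment-group dummy."""
    Xd = np.column_stack([np.ones(len(d)), X])
    # outcome regression for the comparison group: E[dy | d = 0, X], linear sketch
    beta, *_ = np.linalg.lstsq(Xd[d == 0], dy[d == 0], rcond=None)
    m0 = Xd @ beta
    # propensity score P(d = 1 | X), simple Newton-Raphson logit
    gamma = np.zeros(Xd.shape[1])
    for _ in range(50):
        p = 1 / (1 + np.exp(-Xd @ gamma))
        gamma += np.linalg.solve(Xd.T @ (Xd * (p * (1 - p))[:, None]), Xd.T @ (d - p))
    p = np.clip(1 / (1 + np.exp(-Xd @ gamma)), 1e-3, 1 - 1e-3)
    # treated weights and normalized odds weights for the comparison group
    w1 = d / d.mean()
    w0 = ((1 - d) * p / (1 - p)) / np.mean((1 - d) * p / (1 - p))
    return np.mean((w1 - w0) * (dy - m0))

# toy data: parallel trends hold conditional on x, true ATT = 2
rng = np.random.default_rng(2)
n = 4000
x = rng.normal(size=n)
d = rng.binomial(1, 1 / (1 + np.exp(-x)))
dy = 1.0 + 0.5 * x + 2.0 * d + rng.normal(size=n)
print(round(dr_did_att(dy, d, np.column_stack([x])), 2))
```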
By: | Kazuhiko Shinoda; Takahiro Hoshino |
Abstract: | In various fields of data science, researchers are often interested in estimating the ratio of conditional expectation functions (CEFR). Specifically in causal inference problems, it is sometimes natural to consider ratio-based treatment effects, such as odds ratios and hazard ratios, and even difference-based treatment effects are identified as CEFR in some empirically relevant settings. This chapter develops the general framework for estimation and inference on CEFR, which allows the use of flexible machine learning for infinite-dimensional nuisance parameters. In the first stage of the framework, the orthogonal signals are constructed using debiased machine learning techniques to mitigate the negative impacts of the regularization bias in the nuisance estimates on the target estimates. The signals are then combined with a novel series estimator tailored for CEFR. We derive the pointwise and uniform asymptotic results for estimation and inference on CEFR, including the validity of the Gaussian bootstrap, and provide low-level sufficient conditions to apply the proposed framework to some specific examples. We demonstrate the finite-sample performance of the series estimator constructed under the proposed framework by numerical simulations. Finally, we apply the proposed method to estimate the causal effect of the 401(k) program on household assets. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.13145&r=ecm |
By: | Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Gabriele Fiorentini (Università di Firenze); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros) |
Abstract: | We propose specification tests for independent component analysis and structural vector autoregressions that assess the assumed cross-sectional independence of the non-Gaussian shocks. Our tests effectively compare their joint cumulative distribution with the product of their marginals at discrete or continuous grids of values for its arguments, the latter yielding a consistent test. We explicitly consider the sampling variability from using consistent estimators to compute the shocks. We study the finite sample size of our tests in several simulation exercises, with special attention to resampling procedures. We also show that they have non-negligible power against a variety of empirically plausible alternatives. |
Keywords: | Consistent tests, copulas, finite normal mixtures, independence tests, pseudo maximum likelihood estimators. |
JEL: | C32 C52 |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2022_2212&r=ecm |
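A bare-bones illustration of the testing idea follows: compare the empirical joint CDF of the shocks with the product of their empirical marginals over a grid of sample points. Unlike the paper's tests, this toy version treats the shocks as directly observed and uses a naive permutation p-value rather than resampling schemes that account for the sampling variability from estimating the shocks.

```python
import numpy as np

def cdf_independence_stat(e):
    """Max absolute gap between the empirical joint CDF of the columns of e
    and the product of the empirical marginal CDFs, over the sample points."""
    below = (e[:, None, :] <= e[None, :, :])        # 1{e_ij <= grid point m, component j}
    joint = below.all(axis=2).mean(axis=0)          # empirical joint CDF at each grid point
    marginals = below.mean(axis=0)                  # empirical marginal CDFs
    return np.max(np.abs(joint - marginals.prod(axis=1)))

def permutation_pvalue(e, n_perm=500, seed=0):
    """Permutation p-value: shuffle each column independently to impose
    the null of cross-sectional independence."""
    rng = np.random.default_rng(seed)
    stat = cdf_independence_stat(e)
    null = [cdf_independence_stat(np.column_stack([rng.permutation(col) for col in e.T]))
            for _ in range(n_perm)]
    return stat, np.mean(np.array(null) >= stat)

# toy example: two dependent non-Gaussian shocks
rng = np.random.default_rng(3)
u = rng.laplace(size=300)
shocks = np.column_stack([u, 0.6 * u + 0.8 * rng.laplace(size=300)])
print(permutation_pvalue(shocks))
```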
By: | Irene Botosaru; Chris Muris |
Abstract: | We present identification results for counterfactual parameters in a class of nonlinear semiparametric panel models with fixed effects and time effects. This class accommodates both discrete and continuous outcomes and discrete and continuous regressors, and includes the binary choice model with two-way fixed effects, the ordered choice model with time-varying thresholds, the censored regression model with time-varying censoring, and various transformation models for continuous dependent variables. We show that the survival distribution of counterfactual outcomes is identified (point or partial) in this class of models. This parameter is a building block for most partial and marginal effects of interest in applied practice that are based on the average structural function as defined by Blundell and Powell (2003, 2004). Our main results focus on static models, with a set of results applying to models without any exogeneity conditions. Our results do not require parametric assumptions on the distribution of the error terms and do not require time-homogeneity on the outcome equation. To the best of our knowledge, ours are the first results on average partial and marginal effects for binary choice and ordered choice models with fixed effects, time effects, and non-logistic errors. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.09193&r=ecm |
By: | Siqi Wei (CEMFI, Centro de Estudios Monetarios y Financieros) |
Abstract: | The Expectation-Maximization (EM) algorithm is a popular tool for estimating models with latent variables. In complex models, simulated versions, such as stochastic EM, are often implemented to overcome the difficulties in computing expectations analytically. A drawback of the EM algorithm and its variants is slow convergence in some cases, especially when the models contain high-dimensional latent variables. Liu et al. (1998) proposed a parameter-expanded algorithm (PX-EM) to speed up convergence. This paper explores the potential of parameter expansion ideas for estimating nonlinear panel models using the stochastic EM algorithm. We develop PX-SEM methods for two types of nonlinear panel data models: 1) binary choice models with individual effects and persistent shocks, and 2) persistent-transitory dynamic quantile processes. We find that PX-SEM can greatly speed up convergence, especially when the initial guess is relatively far away from the true values. |
Keywords: | Stochastic EM, parameter-expansion, discrete choice model, dynamic quantile regression, latent variables. |
JEL: | C13 C33 C63 |
Date: | 2022–07 |
URL: | http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2022_2206&r=ecm |
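The sketch below illustrates only the generic stochastic EM loop (replace the analytic E-step with a single simulated draw of the latent variables, then run a complete-data M-step), here on a two-component Gaussian mixture chosen for brevity; it is not the paper's PX-SEM algorithm and does not implement the binary choice or dynamic quantile panel models studied there.

```python
import numpy as np

def stochastic_em_mixture(y, n_iter=200, seed=0):
    """Stochastic EM for a two-component Gaussian mixture: the expectation
    step is replaced by one simulated draw of the latent component labels,
    followed by a complete-data M-step."""
    rng = np.random.default_rng(seed)
    pi, mu = 0.5, np.array([y.min(), y.max()])       # crude starting values
    sigma = np.array([y.std(), y.std()])
    for _ in range(n_iter):
        # simulation step: draw latent labels from their current posterior
        dens = np.column_stack([
            (1 - pi) * np.exp(-0.5 * ((y - mu[0]) / sigma[0]) ** 2) / sigma[0],
            pi * np.exp(-0.5 * ((y - mu[1]) / sigma[1]) ** 2) / sigma[1],
        ])
        z = rng.random(len(y)) < dens[:, 1] / dens.sum(axis=1)
        # maximization step: complete-data MLE given the simulated labels
        if 0 < z.sum() < len(z):                     # guard against empty components
            pi = z.mean()
            mu = np.array([y[~z].mean(), y[z].mean()])
            sigma = np.array([y[~z].std(), y[z].std()])
    return pi, mu, sigma

rng = np.random.default_rng(4)
y = np.concatenate([rng.normal(-2, 1, 600), rng.normal(3, 1, 400)])
print(stochastic_em_mixture(y))
```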
By: | Andrii Babii; Eric Ghysels; Junsu Pan |
Abstract: | In this paper, we develop new methods for analyzing high-dimensional tensor datasets. A tensor factor model describes a high-dimensional dataset as a sum of a low-rank component and idiosyncratic noise, generalizing traditional factor models for panel data. We propose an estimation algorithm, called tensor principal component analysis (PCA), which generalizes the traditional PCA applicable to panel data. The algorithm involves unfolding the tensor into a sequence of matrices along different dimensions and applying PCA to the unfolded matrices. We provide theoretical results on the consistency and asymptotic distribution of the tensor PCA estimators of loadings and factors. The algorithm demonstrates good performance in Monte Carlo experiments and is applied to sorted portfolios. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.12981&r=ecm |
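A minimal numpy sketch of the unfold-then-PCA idea described above: matricize the tensor along each mode and take the leading eigenvectors of the unfolded data as loading estimates. This is only a schematic version under a toy three-way factor model, not the paper's estimator or its asymptotic refinements.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor along one mode: the mode-0 unfolding of an (I, J, K)
    array has shape (I, J*K), and so on."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tensor_pca_loadings(Y, ranks):
    """Estimate loading matrices mode by mode via PCA on each unfolding
    (leading eigenvectors of the unfolded data's outer-product matrix)."""
    loadings = []
    for mode, r in enumerate(ranks):
        Ym = unfold(Y, mode)
        eigval, eigvec = np.linalg.eigh(Ym @ Ym.T)
        loadings.append(eigvec[:, ::-1][:, :r])      # top-r eigenvectors
    return loadings

# toy 3-way factor model: Y = core x1 A x2 B x3 C + noise
rng = np.random.default_rng(5)
I, J, K, r = 30, 25, 20, 2
A, B, C = rng.normal(size=(I, r)), rng.normal(size=(J, r)), rng.normal(size=(K, r))
core = rng.normal(size=(r, r, r))
signal = np.einsum('abc,ia,jb,kc->ijk', core, A, B, C)
Y = signal + 0.1 * rng.normal(size=(I, J, K))

A_hat = tensor_pca_loadings(Y, (r, r, r))[0]
# the estimated loadings should span (approximately) the column space of A
proj = A_hat @ A_hat.T
print(round(np.linalg.norm(proj @ A - A) / np.linalg.norm(A), 3))
```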
By: | Alexandre Belloni; Fei Fang; Alexander Volfovsky |
Abstract: | Estimating causal effects has become an integral part of most applied fields. Solving these modern causal questions requires tackling violations of many classical causal assumptions. In this work we consider the violation of the classical no-interference assumption, meaning that the treatment of one individual might affect the outcomes of another. To make interference tractable, we consider a known network that describes how interference may travel. However, unlike previous work in this area, the radius (and intensity) of the interference experienced by a unit is unknown and can depend on different sub-networks of those treated and untreated that are connected to this unit. We study estimators for the average direct treatment effect on the treated in such a setting. The proposed estimator builds upon a Lepski-like procedure that searches over the possible relevant radii and treatment assignment patterns. In contrast to previous work, the proposed procedure aims to approximate the relevant network interference patterns. We establish oracle inequalities and corresponding adaptive rates for the estimation of the interference function. We leverage such estimates to propose and analyze two estimators for the average direct treatment effect on the treated. We address several challenges stemming from the data-driven creation of the patterns (i.e. feature engineering) and the network dependence. In addition to rates of convergence, under mild regularity conditions, we show that one of the proposed estimators is asymptotically normal and unbiased. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.03683&r=ecm |
By: | Pacifico, Antonio |
Abstract: | A novel methodology for multivariate dynamic panel data analysis with correlated random effects is proposed for estimating high-dimensional parameter spaces. A semiparametric hierarchical Bayesian strategy is used to jointly deal with incidental parameters, endogeneity issues, and model misspecification problems. The underlying methodology involves an ad-hoc model selection based on conjugate informative proper mixture priors to select promising subsets of predictors affecting outcomes. Monte Carlo algorithms are then conducted on the resulting submodels to construct empirical Bayes estimators and investigate ratio-optimality and posterior consistency for forecasting purposes and policy issues. An empirical application to a large panel of economies describes the functioning of the model. Simulations based on Monte Carlo designs are also performed to account for relative regrets dealing with cross-sectional heterogeneity. |
Keywords: | Multidimensional data; Bayesian Inference; Conditional Forecasting; Incidental Parameters; Tweedie Correction; Multicountry Analysis. |
JEL: | C1 C5 O1 |
Date: | 2021 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:115711&r=ecm |
By: | Ng Cheuk Fai |
Abstract: | The cluster-robust standard error (Liang and Zeger, 1986) is widely used by empirical researchers to account for cluster dependence in linear models. It is well known that this standard error is biased. We show that the bias does not vanish under high-dimensional asymptotics by revisiting Chesher and Jewitt (1987)'s approach. An alternative leave-cluster-out crossfit (LCOC) estimator that is unbiased, consistent, and robust to cluster dependence is provided under the high-dimensional setting introduced by Cattaneo, Jansson and Newey (2018). Since the LCOC estimator nests the leave-one-out crossfit estimator of Kline, Saggio and Solvsten (2019), the two papers are unified. Monte Carlo comparisons are provided to give insights into its finite sample properties. The LCOC estimator is then applied to Angrist and Lavy's (2009) study of the effects of high school achievement awards and Donohue III and Levitt's (2001) study of the impact of abortion on crime. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.05554&r=ecm |
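For context, the sketch below computes the standard Liang-Zeger cluster-robust variance estimator for OLS — the object whose bias the paper analyzes; the proposed leave-cluster-out crossfit (LCOC) correction is not implemented here, and the usual small-sample degrees-of-freedom adjustment is omitted.

```python
import numpy as np

def ols_cluster_se(y, X, cluster):
    """OLS with the Liang-Zeger cluster-robust variance estimator
    (sandwich form, no finite-sample G/(G-1) correction)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    XtX_inv = np.linalg.inv(Xd.T @ Xd)
    beta = XtX_inv @ Xd.T @ y
    resid = y - Xd @ beta
    meat = np.zeros((Xd.shape[1], Xd.shape[1]))
    for g in np.unique(cluster):
        score = Xd[cluster == g].T @ resid[cluster == g]   # cluster-level score
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V))

# toy data with cluster-level shocks in both the regressor and the error
rng = np.random.default_rng(6)
G, m = 50, 20                                   # 50 clusters of 20 observations
cluster = np.repeat(np.arange(G), m)
x = rng.normal(size=G * m) + np.repeat(rng.normal(size=G), m)
y = 1.0 + 0.5 * x + np.repeat(rng.normal(size=G), m) + rng.normal(size=G * m)

beta, se = ols_cluster_se(y, np.column_stack([x]), cluster)
print(np.round(beta, 2), np.round(se, 3))
```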
By: | Baltagi, Badi H. (Syracuse University); Bresson, Georges (University of Paris 2); Chaturvedi, Anoop (University of Allahabad); Lacroix, Guy (Université Laval) |
Abstract: | This paper extends the Baltagi et al. (2018, 2021) static and dynamic ε-contamination papers to dynamic space-time models. We investigate the robustness of Bayesian panel data models to possible misspecification of the prior distribution. The proposed robust Bayesian approach departs from the standard Bayesian framework in two ways. First, we consider the ε-contamination class of prior distributions for the model parameters as well as for the individual effects. Second, both the base elicited priors and the ε-contamination priors use Zellner (1986)’s g-priors for the variance-covariance matrices. We propose a general “toolbox” for a wide range of specifications which includes the dynamic space-time panel model with random effects, with cross-correlated effects à la Chamberlain, for the Hausman-Taylor world, and for dynamic panel data models with homogeneous/heterogeneous slopes and cross-sectional dependence. Using an extensive Monte Carlo simulation study, we compare the finite sample properties of our proposed estimator to those of standard classical estimators. We illustrate our robust Bayesian estimator using the same data as in Keane and Neal (2020). We obtain short-run as well as long-run effects of climate change on corn producers in the United States. |
Keywords: | climate change, robust Bayesian estimator, ε-contamination, crop yields, panel data, dynamic model, space-time |
JEL: | C11 C23 C26 Q15 Q54 |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp15815&r=ecm |
By: | Miguel Cabello |
Abstract: | Statistical identification of structural vector autoregressive moving-average (SVARMA) models requires the structural shocks to be an independent process, to have mutually independent components, and each component must be non-Gaussian distributed. Taking the former two conditions as granted, common procedures for testing joint Gaussianity of the structural error vector are not sufficient to validate the latter requirement, because rejection of the null hypothesis only implies the existence of at least one structural shock that is non-Gaussian distributed. It is therefore necessary to estimate the number of non-Gaussian components in the structural disturbance vector. This work addresses this problem with a sequential testing procedure, which generalizes current proposals, designed only for fundamental SVAR models, and allows for possibly non-fundamental SVARMA models. Under our setup, current procedures are invalid since the reduced-form errors are a possibly infinite linear combination of present, past and future values of the structural errors, and they are only serially uncorrelated, not independent. Our approach employs the third- and fourth-order cumulant spectra to construct arrays whose rank equals the number of non-Gaussian structural errors. Monte Carlo simulations show that our approach estimates the number of non-Gaussian components satisfactorily. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.07263&r=ecm |
By: | Zhexiao Lin; Fang Han |
Abstract: | Imputing missing potential outcomes using an estimated regression function is a natural idea for estimating causal effects. In the literature, estimators that combine imputation and regression adjustments are believed to be comparable to augmented inverse probability weighting. Accordingly, people for a long time conjectured that such estimators, while avoiding directly constructing the weights, are also doubly robust (Imbens, 2004; Stuart, 2010). Generalizing an earlier result of the authors (Lin et al., 2021), this paper formalizes this conjecture, showing that a large class of regression-adjusted imputation methods are indeed doubly robust for estimating the average treatment effect. In addition, they are provably semiparametrically efficient as long as both the density and regression models are correctly specified. Notable examples of imputation methods covered by our theory include kernel matching, (weighted) nearest neighbor matching, local linear matching, and (honest) random forests. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.05424&r=ecm |
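A minimal sketch of a regression-adjusted imputation estimator of the ATE, using within-arm linear regressions as a stand-in for the matching and random-forest imputations covered by the paper's theory: fit an outcome regression in each treatment arm, impute every unit's missing potential outcome, and average the differences.

```python
import numpy as np

def imputation_ate(y, d, X):
    """Regression-adjusted imputation estimator of the ATE: a linear outcome
    regression is fit within each treatment arm (illustrative stand-in for
    matching or random forests), missing potential outcomes are imputed,
    and the imputed individual effects are averaged."""
    Xd = np.column_stack([np.ones(len(y)), X])
    b1, *_ = np.linalg.lstsq(Xd[d == 1], y[d == 1], rcond=None)
    b0, *_ = np.linalg.lstsq(Xd[d == 0], y[d == 0], rcond=None)
    y1 = np.where(d == 1, y, Xd @ b1)       # observed where treated, imputed otherwise
    y0 = np.where(d == 0, y, Xd @ b0)
    return np.mean(y1 - y0)

# toy data: confounded treatment assignment, true ATE = 1
rng = np.random.default_rng(7)
n = 5000
x = rng.normal(size=n)
d = rng.binomial(1, 1 / (1 + np.exp(-x)))
y = x + d * (1.0 + 0.5 * x) + rng.normal(size=n)
print(round(imputation_ate(y, d, np.column_stack([x])), 2))
```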
By: | Arthur Lewbel (Boston College); Xi Qu (Shanghai Jiao Tong University); Xun Tang (Rice University) |
Abstract: | We propose an adjusted 2SLS estimator for social network models when some existing network links are missing from the sample (due, e.g., to recall errors by survey respondents, or lapses in data input). In the feasible structural form, missing links make all covariates endogenous and add a new source of correlation between the structural errors and endogenous peer outcomes (in addition to simultaneity), thus invalidating conventional estimators used in the literature. We resolve these issues by rescaling peer outcomes with estimates of missing rates and constructing instruments that exploit properties of the noisy network measures. We apply our method to study peer effects in household decisions to participate in a microfinance program in Indian villages. We find that ignoring missing links and applying conventional instruments would result in a sizeable upward bias in peer effect estimates. |
Keywords: | social networks, 2SLS, missing links |
Date: | 2022–12–20 |
URL: | http://d.repec.org/n?u=RePEc:boc:bocoec:1056&r=ecm |
By: | Mboya, Mwasi; Sibbertsen, Philipp |
Abstract: | We develop methods to obtain optimal forecasts under long memory in the presence of a discrete structural break based on different weighting schemes for the observations. We observe significant changes in the forecasts when long-range dependence is taken into account. Using Monte Carlo simulations, we confirm that our methods substantially improve the forecasting performance under long memory. We further present an empirical application to inflation rates that emphasizes the importance of our methods. |
Keywords: | Long memory; Forecasting; Structural break; Optimal weight; ARFIMA model |
JEL: | C12 C22 |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:han:dpaper:dp-705&r=ecm |
By: | Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Gabriele Fiorentini (Universitá di Firenze); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros) |
Abstract: | Arellano (1989a) showed that valid equality restrictions on covariance matrices could result in efficiency losses for Gaussian PMLEs in simultaneous equations models. We revisit his two-equation example using finite normal mixtures PMLEs instead, which are also consistent for mean and variance parameters regardless of the true distribution of the shocks. Because such mixtures provide good approximations to many distributions, we relate the asymptotic variance of our estimators to the relevant semiparametric efficiency bound. Our Monte Carlo results indicate that they systematically dominate MD, and that the version that imposes the valid covariance restriction is more efficient than the unrestricted one. |
Keywords: | Covariance restrictions, distributional misspecification, efficiency bound, finite normal mixtures, partial adaptivity. |
JEL: | C30 C36 |
Date: | 2022–10 |
URL: | http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2022_2210&r=ecm |
By: | Ruonan Xu; Jeffrey M. Wooldridge |
Abstract: | When observing spatial data, what standard errors should we report? With the finite population framework, we identify three channels of spatial correlation: sampling scheme, assignment design, and model specification. The Eicker-Huber-White standard error, the cluster-robust standard error, and the spatial heteroskedasticity and autocorrelation consistent standard error are compared under different combinations of the three channels. Then, we provide guidelines for whether standard errors should be adjusted for spatial correlation for both linear and nonlinear estimators. As it turns out, the answer to this question also depends on the magnitude of the sampling probability. |
Date: | 2022–11 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2211.14354&r=ecm |
By: | Eva Biswas (Department of Statistics, Iowa State University); Farzad Sabzikar (Department of Statistics, Iowa State University); Peter C. B. Phillips (Cowles Foundation, Yale University) |
Abstract: | This paper extends recent asymptotic theory developed for the Hodrick-Prescott (HP) filter and boosted HP (bHP) filter to long-range dependent time series that have fractional Brownian motion (fBM) limit processes after suitable standardization. Under general conditions it is shown that the asymptotic form of the HP filter is a smooth curve, analogous to the finding in Phillips and Jin (2021) for integrated time series and series with deterministic drifts. Boosting the filter using the iterative procedure suggested in Phillips and Shi (2021) leads, under well-defined rate conditions, to a consistent estimate of the fBM limit process, or of the fBM limit process with an accompanying deterministic drift when that is present. A stopping criterion is used to automate the boosting algorithm, giving a data-determined method for practical implementation. The theory is illustrated in simulations and two real data examples that highlight the differences between simple HP filtering and the use of boosting. The analysis is assisted by employing a uniformly and almost surely convergent trigonometric series representation of fBM. |
Date: | 2022–08 |
URL: | http://d.repec.org/n?u=RePEc:cwl:cwldpp:2347&r=ecm |
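The sketch below implements the textbook HP trend (penalized second differences) and a simple boosted version that re-applies the smoother to the remaining cycle and accumulates the fitted trends; the stopping rule here is an ad-hoc variance threshold, not the data-determined criterion of Phillips and Shi (2021) referenced above, and the toy series is invented for illustration.

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott trend: minimizes ||y - f||^2 + lam * ||D2 f||^2,
    where D2 is the second-difference operator."""
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)             # (n-2, n) second differences
    return np.linalg.solve(np.eye(n) + lam * D2.T @ D2, y)

def boosted_hp(y, lam=1600.0, max_iter=20, tol=0.01):
    """Boosted HP filter: repeatedly re-apply the HP smoother to the remaining
    cycle and accumulate the fitted trends; stops when the incremental trend
    is small relative to the series variance (ad-hoc rule)."""
    trend = np.zeros_like(y, dtype=float)
    cycle = y.astype(float)
    for _ in range(max_iter):
        step = hp_filter(cycle, lam)
        trend += step
        cycle -= step
        if np.var(step) < tol * np.var(y):
            break
    return trend, cycle

# toy series: smooth trend plus persistent noise
rng = np.random.default_rng(8)
t = np.arange(200)
y = 0.05 * t + np.sin(t / 20) + np.cumsum(rng.normal(scale=0.1, size=200))
trend, cycle = boosted_hp(y)
print(round(np.corrcoef(trend, 0.05 * t + np.sin(t / 20))[0, 1], 3))
```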
By: | Zadrozny, Peter A. |
Abstract: | Linear rational-expectations models (LREMs) are conventionally "forwardly" estimated as follows. Structural coefficients are restricted by economic restrictions in terms of deep parameters. For given deep parameters, structural equations are solved for "rational-expectations solution" (RES) equations that determine endogenous variables. For given vector autoregressive (VAR) equations that determine exogenous variables, RES equations reduce to reduced-form VAR equations for endogenous variables with exogenous variables (VARX). The combined endogenous-VARX and exogenous-VAR equations comprise the reduced-form overall VAR (OVAR) equations of all variables in a LREM. The sequence of specified, solved, and combined equations defines a mapping from deep parameters to OVAR coefficients that is used to forwardly estimate a LREM in terms of deep parameters. Forwardly-estimated deep parameters determine forwardly-estimated RES equations that Lucas (1976) advocated for making policy predictions in his critique of policy predictions made with reduced-form equations. Sims (1980) called economic identifying restrictions on deep parameters of forwardly-estimated LREMs "incredible", because he considered in-sample fits of forwardly-estimated OVAR equations inadequate and out-of-sample policy predictions of forwardly-estimated RES equations inaccurate. Sims (1980, 1986) instead advocated directly estimating OVAR equations restricted by statistical shrinkage restrictions and directly using the directly-estimated OVAR equations to make policy predictions. However, if assumed or predicted out-of-sample policy variables in directly-made policy predictions differ significantly from in-sample values, then, the out-of-sample policy predictions won't satisfy Lucas's critique. If directly-estimated OVAR equations are reduced-form equations of underlying RES and LREM-structural equations, then, identification 2 derived in the paper can linearly "inversely" estimate the underlying RES equations from the directly-estimated OVAR equations and the inversely-estimated RES equations can be used to make policy predictions that satisfy Lucas's critique. If Sims considered directly-estimated OVAR equations to fit in-sample data adequately (credibly) and their inversely-estimated RES equations to make accurate (credible) out-of-sample policy predictions, then, he should consider the inversely-estimated RES equations to be credible. Thus, inversely-estimated RES equations by identification 2 can reconcile Lucas's advocacy for making policy predictions with RES equations and Sims's advocacy for directly estimating OVAR equations. The paper also derives identification 1 of structural coefficients from RES coefficients that contributes mainly by showing that directly estimated reduced-form OVAR equations can have underlying LREM-structural equations. |
Keywords: | cross-equation restrictions of rational expectations, factorization of matrix polynomials, reconciliation of Lucas's advocacy of rational-expectations modelling and policy predictions and Sims's advocacy of VAR modelling |
JEL: | C32 C43 C53 C63 |
Date: | 2022 |
URL: | http://d.repec.org/n?u=RePEc:zbw:cfswop:682&r=ecm |
By: | Jiafeng Chen; Jonathan Roth |
Abstract: | Researchers frequently estimate the average treatment effect (ATE) in logs, which has the desirable property that its units approximate percentages. When the outcome takes on zero values, researchers often use alternative transformations (e.g., $\log(1+Y)$, $\mathrm{arcsinh}(Y)$) that behave like $\log(Y)$ for large values of $Y$, and interpret the units as percentages. In this paper, we show that ATEs for transformations other than $\log(Y)$ cannot be interpreted as percentages, at least if one imposes the seemingly reasonable requirement that a percentage does not depend on the original scaling of the outcome (e.g. dollars versus cents). We first show that if $m(y)$ is a function that behaves like $\log(y)$ for large values of $y$ and the treatment affects the probability that $Y=0$, then the ATE for $m(Y)$ can be made arbitrarily large or small in magnitude by re-scaling the units of $Y$. Moreover, we show that any parameter of the form $\theta_g = E[ g(Y(1),Y(0)) ]$ is necessarily scale dependent if it is point-identified and defined with zero-valued outcomes. We conclude by outlining a variety of options available to empirical researchers dealing with zero-valued outcomes, including (i) estimating ATEs for normalized outcomes, (ii) explicitly calibrating the value placed on the extensive versus intensive margins, or (iii) estimating separate effects for the intensive and extensive margins. |
Date: | 2022–12 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2212.06080&r=ecm |
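The scale-dependence point is easy to verify numerically: in the toy example below, treatment changes the probability that the outcome is zero, and the "ATE" of arcsinh(Y) moves substantially when the outcome is re-expressed in different units (the distributions and numbers are invented purely for illustration).

```python
import numpy as np

def ate_arcsinh(y1, y0, scale=1.0):
    """'ATE in logs' computed with the arcsinh transformation after re-scaling
    the outcome (e.g. dollars -> cents corresponds to scale=100)."""
    return np.mean(np.arcsinh(scale * y1) - np.arcsinh(scale * y0))

# toy potential outcomes in which treatment moves some units off zero
rng = np.random.default_rng(9)
n = 100_000
y0 = np.where(rng.random(n) < 0.30, 0.0, rng.lognormal(3.0, 1.0, n))   # 30% zeros untreated
y1 = np.where(rng.random(n) < 0.10, 0.0, rng.lognormal(3.0, 1.0, n))   # 10% zeros treated

# the same data, re-scaled, gives very different "percentage" effects
for scale in (1.0, 100.0, 0.01):
    print(scale, round(ate_arcsinh(y1, y0, scale), 3))
```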