
on Econometrics 
By:  Chen, J.; Li, D.; Li, Y.; Linton, O. B. 
Abstract:  We explore time-varying networks for high-dimensional locally stationary time series, using the large VAR model framework with both the transition and (error) precision matrices evolving smoothly over time. Two types of time-varying graphs are investigated: one containing directed edges of Granger causality linkages, and the other containing undirected edges of partial correlation linkages. Under the sparse structural assumption, we propose a penalised local linear method with time-varying weighted group LASSO to jointly estimate the transition matrices and identify their significant entries, and a time-varying CLIME method to estimate the precision matrices. The estimated transition and precision matrices are then used to determine the time-varying network structures. Under some mild conditions, we derive the theoretical properties of the proposed estimates including the consistency and oracle properties. In addition, we extend the methodology and theory to cover highly-correlated large-scale time series, for which the sparsity assumption becomes invalid and we allow for common factors before estimating the factor-adjusted time-varying networks. We provide extensive simulation studies and an empirical application to a large U.S. macroeconomic dataset to illustrate the finite-sample performance of our methods.
Keywords:  CLIME, Factor model, Granger causality, lasso, local linear smoothing, partial correlation, time-varying network, VAR
JEL:  C13 C14 C32 C38 
Date:  2022–12–14 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:2273&r=ecm 
By:  Hugo Bodory; Martin Huber; Michael Lechner 
Abstract:  This paper investigates the finite sample performance of a range of parametric, semiparametric, and nonparametric instrumental variable estimators when controlling for a fixed set of covariates to evaluate the local average treatment effect. Our simulation designs are based on empirical labor market data from the US and vary in several dimensions, including effect heterogeneity, instrument selectivity, instrument strength, outcome distribution, and sample size. Among the estimators and simulations considered, nonparametric estimation based on the random forest (a machine learner controlling for covariates in a data-driven way) performs competitively in terms of the average coverage rates of the (bootstrap-based) 95% confidence intervals, while also being relatively precise. Nonparametric kernel regression as well as certain versions of semiparametric radius matching on the propensity score, pair matching on the covariates, and inverse probability weighting also have decent coverage, but are less precise than the random forest-based method. In terms of the average root mean squared error of LATE estimation, kernel regression performs best, closely followed by the random forest method, which has the lowest average absolute bias.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.07379&r=ecm 
By:  Takahide Yanagi 
Abstract:  We develop a difference-in-differences method in a general setting in which the treatment variable of interest may be non-binary and its value may change in each time period. It is generally difficult to estimate treatment parameters defined with the potential outcome given the entire path of treatment adoption, as each treatment path may be experienced by only a small number of observations. We propose an empirically tractable alternative using the concept of effective treatment, which summarizes the treatment path into a low-dimensional variable. Under a parallel trends assumption conditional on observed covariates, we show that doubly robust difference-in-differences estimands can identify certain average treatment effects for movers, even when the chosen effective treatment is misspecified. We consider doubly robust estimation and multiplier bootstrap inference, which are asymptotically justifiable if either an outcome regression function for stayers or a generalized propensity score is correctly parametrically specified. We illustrate the usefulness of our method by estimating the instantaneous and dynamic effects of union membership on wages.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.13226&r=ecm 
By:  Kazuhiko Shinoda; Takahiro Hoshino 
Abstract:  In various fields of data science, researchers are often interested in estimating the ratio of conditional expectation functions (CEFR). Specifically in causal inference problems, it is sometimes natural to consider ratio-based treatment effects, such as odds ratios and hazard ratios, and even difference-based treatment effects are identified as CEFR in some empirically relevant settings. This chapter develops the general framework for estimation and inference on CEFR, which allows the use of flexible machine learning for infinite-dimensional nuisance parameters. In the first stage of the framework, the orthogonal signals are constructed using debiased machine learning techniques to mitigate the negative impacts of the regularization bias in the nuisance estimates on the target estimates. The signals are then combined with a novel series estimator tailored for CEFR. We derive the pointwise and uniform asymptotic results for estimation and inference on CEFR, including the validity of the Gaussian bootstrap, and provide low-level sufficient conditions to apply the proposed framework to some specific examples. We demonstrate the finite-sample performance of the series estimator constructed under the proposed framework by numerical simulations. Finally, we apply the proposed method to estimate the causal effect of the 401(k) program on household assets.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.13145&r=ecm 
By:  Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Gabriele Fiorentini (Università di Firenze); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros) 
Abstract:  We propose specification tests for independent component analysis and structural vector autoregressions that assess the assumed cross-sectional independence of the non-Gaussian shocks. Our tests effectively compare their joint cumulative distribution with the product of their marginals at discrete or continuous grids of values for its arguments, the latter yielding a consistent test. We explicitly consider the sampling variability from using consistent estimators to compute the shocks. We study the finite sample size of our tests in several simulation exercises, with special attention to resampling procedures. We also show that they have non-negligible power against a variety of empirically plausible alternatives.
Keywords:  Consistent tests, copulas, finite normal mixtures, independence tests, pseudo maximum likelihood estimators.
JEL:  C32 C52 
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2022_2212&r=ecm 
By:  Irene Botosaru; Chris Muris 
Abstract:  We present identification results for counterfactual parameters in a class of nonlinear semiparametric panel models with fixed effects and time effects. This class accommodates both discrete and continuous outcomes and discrete and continuous regressors, and includes the binary choice model with two-way fixed effects, the ordered choice model with time-varying thresholds, the censored regression model with time-varying censoring, and various transformation models for continuous dependent variables. We show that the survival distribution of counterfactual outcomes is identified (point or partial) in this class of models. This parameter is a building block for most partial and marginal effects of interest in applied practice that are based on the average structural function as defined by Blundell and Powell (2003, 2004). Our main results focus on static models, with a set of results applying to models without any exogeneity conditions. Our results do not require parametric assumptions on the distribution of the error terms and do not require time-homogeneity on the outcome equation. To the best of our knowledge, ours are the first results on average partial and marginal effects for binary choice and ordered choice models with fixed effects, time effects, and non-logistic errors.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.09193&r=ecm 
By:  Siqi Wei (CEMFI, Centro de Estudios Monetarios y Financieros) 
Abstract:  The Expectation-Maximization (EM) algorithm is a popular tool for estimating models with latent variables. In complex models, simulated versions, such as stochastic EM, are often implemented to overcome the difficulties in computing expectations analytically. A drawback of the EM algorithm and its variants is the slow convergence in some cases, especially when the models contain high-dimensional latent variables. Liu et al. (1998) proposed a parameter-expanded algorithm (PX-EM) to speed up convergence. This paper explores the potential of parameter expansion ideas for estimating nonlinear panel models using the stochastic EM algorithm. We develop PX-SEM methods for two types of nonlinear panel data models: 1) binary choice models with individual effects and persistent shocks, and 2) persistent-transitory dynamic quantile processes. We find that PX-SEM can greatly speed up convergence, especially when the initial guess is relatively far away from the true values.
Keywords:  Stochastic EM, parameter expansion, discrete choice model, dynamic quantile regression, latent variables.
JEL:  C13 C33 C63 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2022_2206&r=ecm 
By:  Andrii Babii; Eric Ghysels; Junsu Pan 
Abstract:  In this paper, we develop new methods for analyzing high-dimensional tensor datasets. A tensor factor model describes a high-dimensional dataset as a sum of a low-rank component and an idiosyncratic noise, generalizing traditional factor models for panel data. We propose an estimation algorithm, called tensor principal component analysis (PCA), which generalizes the traditional PCA applicable to panel data. The algorithm involves unfolding the tensor into a sequence of matrices along different dimensions and applying PCA to the unfolded matrices. We provide theoretical results on the consistency and asymptotic distribution for the tensor PCA estimator of loadings and factors. The algorithm demonstrates good performance in Monte Carlo experiments and is applied to sorted portfolios.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.12981&r=ecm 
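The unfold-then-PCA idea described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `unfold` and `tensor_pca_loadings`, the rank-one simulated tensor, and the use of left singular vectors without demeaning are choices made here, not the authors' implementation, and the paper's normalization may differ.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index the chosen dimension, columns all the others."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tensor_pca_loadings(tensor, rank=1):
    """For each mode, take the top-`rank` left singular vectors of the unfolded matrix
    (equivalent to PCA on the uncentered unfolding)."""
    loadings = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        loadings.append(u[:, :rank])
    return loadings

# Simulated rank-one signal plus small idiosyncratic noise (made-up data).
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=6), rng.normal(size=5), rng.normal(size=4)
signal = 5.0 * np.einsum("i,j,k->ijk", a, b, c)
noise = 0.1 * rng.normal(size=signal.shape)
est = tensor_pca_loadings(signal + noise, rank=1)
```

Each element of `est` is an estimated loading matrix for one dimension of the tensor, identified only up to sign and scale.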
By:  Alexandre Belloni; Fei Fang; Alexander Volfovsky 
Abstract:  Estimating causal effects has become an integral part of most applied fields. Solving these modern causal questions requires tackling violations of many classical causal assumptions. In this work we consider the violation of the classical no-interference assumption, meaning that the treatment of one individual might affect the outcomes of another. To make interference tractable, we consider a known network that describes how interference may travel. However, unlike previous work in this area, the radius (and intensity) of the interference experienced by a unit is unknown and can depend on different sub-networks of those treated and untreated that are connected to this unit. We study estimators for the average direct treatment effect on the treated in such a setting. The proposed estimator builds upon a Lepski-like procedure that searches over the possible relevant radii and treatment assignment patterns. In contrast to previous work, the proposed procedure aims to approximate the relevant network interference patterns. We establish oracle inequalities and corresponding adaptive rates for the estimation of the interference function. We leverage such estimates to propose and analyze two estimators for the average direct treatment effect on the treated. We address several challenges stemming from the data-driven creation of the patterns (i.e. feature engineering) and the network dependence. In addition to rates of convergence, under mild regularity conditions, we show that one of the proposed estimators is asymptotically normal and unbiased.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.03683&r=ecm 
By:  Pacifico, Antonio 
Abstract:  A novel method for multivariate dynamic panel data analysis with correlated random effects is proposed when estimating high dimensional parameter spaces. A semiparametric hierarchical Bayesian strategy is used to jointly deal with incidental parameters, endogeneity issues, and model misspecification problems. The underlying methodology involves addressing an ad hoc model selection based on conjugate informative proper mixture priors to select promising subsets of predictors affecting outcomes. Monte Carlo algorithms are then conducted on the resulting submodels to construct empirical Bayes estimators and investigate ratio-optimality and posterior consistency for forecasting purposes and policy issues. An empirical approach to a large panel of economies is conducted describing the functioning of the model. Simulations based on Monte Carlo designs are also performed to account for relative regrets dealing with cross-sectional heterogeneity.
Keywords:  Multidimensional data; Bayesian Inference; Conditional Forecasting; Incidental Parameters; Tweedie Correction; Multicountry Analysis. 
JEL:  C1 C5 O1 
Date:  2021 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:115711&r=ecm 
By:  Ng Cheuk Fai 
Abstract:  The cluster standard error (Liang and Zeger, 1986) is widely used by empirical researchers to account for cluster dependence in linear models. It is well known that this standard error is biased. We show that the bias does not vanish under high-dimensional asymptotics by revisiting Chesher and Jewitt (1987)'s approach. An alternative leave-cluster-out cross-fit (LCOC) estimator that is unbiased, consistent and robust to cluster dependence is provided under the high-dimensional setting introduced by Cattaneo, Jansson and Newey (2018). Since the LCOC estimator nests the leave-one-out cross-fit estimator of Kline, Saggio and Solvsten (2019), the two papers are unified. Monte Carlo comparisons are provided to give insights on its finite sample properties. The LCOC estimator is then applied to Angrist and Lavy's (2009) study of the effects of high school achievement award and Donohue III and Levitt's (2001) study of the impact of abortion on crime.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.05554&r=ecm 
By:  Baltagi, Badi H. (Syracuse University); Bresson, Georges (University of Paris 2); Chaturvedi, Anoop (University of Allahabad); Lacroix, Guy (Université Laval) 
Abstract:  This paper extends the Baltagi et al. (2018, 2021) static and dynamic ε-contamination papers to dynamic space-time models. We investigate the robustness of Bayesian panel data models to possible misspecification of the prior distribution. The proposed robust Bayesian approach departs from the standard Bayesian framework in two ways. First, we consider the ε-contamination class of prior distributions for the model parameters as well as for the individual effects. Second, both the base elicited priors and the ε-contamination priors use Zellner (1986)’s g-priors for the variance-covariance matrices. We propose a general “toolbox” for a wide range of specifications which includes the dynamic space-time panel model with random effects, with cross-correlated effects à la Chamberlain, for the Hausman-Taylor world and for dynamic panel data models with homogeneous/heterogeneous slopes and cross-sectional dependence. Using an extensive Monte Carlo simulation study, we compare the finite sample properties of our proposed estimator to those of standard classical estimators. We illustrate our robust Bayesian estimator using the same data as in Keane and Neal (2020). We obtain short run as well as long run effects of climate change on corn producers in the United States.
Keywords:  climate change, robust Bayesian estimator, ε-contamination, crop yields, panel data, dynamic model, space-time
JEL:  C11 C23 C26 Q15 Q54 
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:iza:izadps:dp15815&r=ecm 
By:  Miguel Cabello 
Abstract:  Statistical identification of structural vector autoregressive moving-average (SVARMA) models requires structural shocks to be an independent process, to have mutually independent components, and each component to be non-Gaussian distributed. Taking the former two conditions for granted, common procedures for testing joint Gaussianity of the structural error vector are not sufficient to validate the latter requirement, because rejection of the null hypothesis only implies the existence of at least one structural shock that is non-Gaussian distributed. Therefore, it is required to estimate the number of non-Gaussian components in the structural disturbance vector. This work addresses this problem with a sequential testing procedure, which generalizes the current proposals, designed only for fundamental SVAR models, and allows for possibly non-fundamental SVARMA models. Under our setup, current procedures are invalid since reduced-form errors are a possibly infinite linear combination of present, past and future values of structural errors, and they are only serially uncorrelated, but not independent. Our approach employs third- and fourth-order cumulant spectra to construct some arrays whose rank is equal to the number of non-Gaussian structural errors. Monte Carlo simulations show that our approach satisfactorily estimates the number of non-Gaussian components.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.07263&r=ecm 
By:  Zhexiao Lin; Fang Han 
Abstract:  Imputing missing potential outcomes using an estimated regression function is a natural idea for estimating causal effects. In the literature, estimators that combine imputation and regression adjustments are believed to be comparable to augmented inverse probability weighting. Accordingly, people for a long time conjectured that such estimators, while avoiding directly constructing the weights, are also doubly robust (Imbens, 2004; Stuart, 2010). Generalizing an earlier result of the authors (Lin et al., 2021), this paper formalizes this conjecture, showing that a large class of regression-adjusted imputation methods are indeed doubly robust for estimating the average treatment effect. In addition, they are provably semiparametrically efficient as long as both the density and regression models are correctly specified. Notable examples of imputation methods covered by our theory include kernel matching, (weighted) nearest neighbor matching, local linear matching, and (honest) random forests.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.05424&r=ecm 
By:  Arthur Lewbel (Boston College); Xi Qu (Shanghai Jiao Tong University); Xun Tang (Rice University) 
Abstract:  We propose an adjusted 2SLS estimator for social network models when some existing network links are missing from the sample (due, e.g., to recall errors by survey respondents, or lapses in data input). In the feasible structural form, missing links make all covariates endogenous and add a new source of correlation between the structural errors and endogenous peer outcomes (in addition to simultaneity), thus invalidating conventional estimators used in the literature. We resolve these issues by rescaling peer outcomes with estimates of missing rates and constructing instruments that exploit properties of the noisy network measures. We apply our method to study peer effects in household decisions to participate in a microfinance program in Indian villages. We find that ignoring missing links and applying conventional instruments would result in a sizeable upward bias in peer effect estimates. 
Keywords:  social networks, 2SLS, missing links 
Date:  2022–12–20 
URL:  http://d.repec.org/n?u=RePEc:boc:bocoec:1056&r=ecm 
By:  Mboya, Mwasi; Sibbertsen, Philipp 
Abstract:  We develop methods to obtain optimal forecasts under long memory in the presence of a discrete structural break based on different weighting schemes for the observations. We observe significant changes in the forecasts when long-range dependence is taken into account. Using Monte Carlo simulations, we confirm that our methods substantially improve the forecasting performance under long memory. We further present an empirical application to inflation rates that emphasizes the importance of our methods.
Keywords:  Long memory; Forecasting; Structural break; Optimal weight; ARFIMA model 
JEL:  C12 C22 
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:han:dpaper:dp705&r=ecm 
By:  Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Gabriele Fiorentini (Università di Firenze); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros)
Abstract:  Arellano (1989a) showed that valid equality restrictions on covariance matrices could result in efficiency losses for Gaussian PMLEs in simultaneous equations models. We revisit his two-equation example using finite normal mixtures PMLEs instead, which are also consistent for mean and variance parameters regardless of the true distribution of the shocks. Because such mixtures provide good approximations to many distributions, we relate the asymptotic variance of our estimators to the relevant semiparametric efficiency bound. Our Monte Carlo results indicate that they systematically dominate MD, and that the version that imposes the valid covariance restriction is more efficient than the unrestricted one.
Keywords:  Covariance restrictions, distributional misspecification, efficiency bound, finite normal mixtures, partial adaptivity. 
JEL:  C30 C36 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2022_2210&r=ecm 
By:  Ruonan Xu; Jeffrey M. Wooldridge 
Abstract:  When observing spatial data, what standard errors should we report? With the finite population framework, we identify three channels of spatial correlation: sampling scheme, assignment design, and model specification. The Eicker-Huber-White standard error, the cluster-robust standard error, and the spatial heteroskedasticity and autocorrelation consistent standard error are compared under different combinations of the three channels. Then, we provide guidelines for whether standard errors should be adjusted for spatial correlation for both linear and nonlinear estimators. As it turns out, the answer to this question also depends on the magnitude of the sampling probability.
Date:  2022–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2211.14354&r=ecm 
By:  Eva Biswas (Department of Statistics, Iowa State University); Farzad Sabzikar (Department of Statistics, Iowa State University); Peter C. B. Phillips (Cowles Foundation, Yale University) 
Abstract:  This paper extends recent asymptotic theory developed for the Hodrick-Prescott (HP) filter and boosted HP (bHP) filter to long range dependent time series that have fractional Brownian motion (fBM) limit processes after suitable standardization. Under general conditions it is shown that the asymptotic form of the HP filter is a smooth curve, analogous to the finding in Phillips and Jin (2021) for integrated time series and series with deterministic drifts. Boosting the filter using the iterative procedure suggested in Phillips and Shi (2021) leads under well-defined rate conditions to a consistent estimate of the fBM limit process or the fBM limit process with an accompanying deterministic drift when that is present. A stopping criterion is used to automate the boosting algorithm, giving a data-determined method for practical implementation. The theory is illustrated in simulations and two real data examples that highlight the differences between simple HP filtering and the use of boosting. The analysis is assisted by employing a uniformly and almost surely convergent trigonometric series representation of fBM.
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:cwl:cwldpp:2347&r=ecm 
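The difference between simple HP filtering and boosting mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming the standard matrix form of the HP smoother and a fixed number of boosting passes; the function names and the fixed `iterations` argument are assumptions made here, and the data-determined stopping criterion described in the abstract is not implemented.

```python
import numpy as np

def hp_smoother(n, lam):
    """HP smoother matrix S = (I + lam * D'D)^(-1), with D the second-difference operator."""
    d = np.diff(np.eye(n), n=2, axis=0)  # (n - 2) x n second-difference matrix
    return np.linalg.inv(np.eye(n) + lam * d.T @ d)

def boosted_hp(y, lam=1600.0, iterations=5):
    """Refit the HP filter to the remaining cycle `iterations` times.

    One iteration reproduces the ordinary HP trend; further iterations move
    whatever trend is left in the cycle back into the trend estimate.
    """
    y = np.asarray(y, dtype=float)
    s = hp_smoother(len(y), lam)
    cycle = y.copy()
    for _ in range(iterations):
        cycle = cycle - s @ cycle  # cycle after m passes is (I - S)^m y
    return y - cycle
```

A linear series has zero second differences, so both the simple and the boosted filter return it unchanged; differences between the two appear only for series with curvature or stochastic trends.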
By:  Zadrozny, Peter A. 
Abstract:  Linear rational-expectations models (LREMs) are conventionally "forwardly" estimated as follows. Structural coefficients are restricted by economic restrictions in terms of deep parameters. For given deep parameters, structural equations are solved for "rational-expectations solution" (RES) equations that determine endogenous variables. For given vector autoregressive (VAR) equations that determine exogenous variables, RES equations reduce to reduced-form VAR equations for endogenous variables with exogenous variables (VARX). The combined endogenous-VARX and exogenous-VAR equations comprise the reduced-form overall VAR (OVAR) equations of all variables in a LREM. The sequence of specified, solved, and combined equations defines a mapping from deep parameters to OVAR coefficients that is used to forwardly estimate a LREM in terms of deep parameters. Forwardly-estimated deep parameters determine forwardly-estimated RES equations that Lucas (1976) advocated for making policy predictions in his critique of policy predictions made with reduced-form equations. Sims (1980) called economic identifying restrictions on deep parameters of forwardly-estimated LREMs "incredible", because he considered in-sample fits of forwardly-estimated OVAR equations inadequate and out-of-sample policy predictions of forwardly-estimated RES equations inaccurate. Sims (1980, 1986) instead advocated directly estimating OVAR equations restricted by statistical shrinkage restrictions and directly using the directly-estimated OVAR equations to make policy predictions. However, if assumed or predicted out-of-sample policy variables in directly-made policy predictions differ significantly from in-sample values, then the out-of-sample policy predictions won't satisfy Lucas's critique.
If directly-estimated OVAR equations are reduced-form equations of underlying RES and LREM-structural equations, then identification 2 derived in the paper can linearly "inversely" estimate the underlying RES equations from the directly-estimated OVAR equations, and the inversely-estimated RES equations can be used to make policy predictions that satisfy Lucas's critique. If Sims considered directly-estimated OVAR equations to fit in-sample data adequately (credibly) and their inversely-estimated RES equations to make accurate (credible) out-of-sample policy predictions, then he should consider the inversely-estimated RES equations to be credible. Thus, inversely-estimated RES equations by identification 2 can reconcile Lucas's advocacy for making policy predictions with RES equations and Sims's advocacy for directly estimating OVAR equations. The paper also derives identification 1 of structural coefficients from RES coefficients that contributes mainly by showing that directly estimated reduced-form OVAR equations can have underlying LREM-structural equations.
Keywords:  cross-equation restrictions of rational expectations, factorization of matrix polynomials, reconciliation of Lucas's advocacy of rational-expectations modelling and policy predictions and Sims's advocacy of VAR modelling
JEL:  C32 C43 C53 C63 
Date:  2022 
URL:  http://d.repec.org/n?u=RePEc:zbw:cfswop:682&r=ecm 
By:  Jiafeng Chen; Jonathan Roth 
Abstract:  Researchers frequently estimate the average treatment effect (ATE) in logs, which has the desirable property that its units approximate percentages. When the outcome takes on zero values, researchers often use alternative transformations (e.g., $\log(1+Y)$, $\mathrm{arcsinh}(Y)$) that behave like $\log(Y)$ for large values of $Y$, and interpret the units as percentages. In this paper, we show that ATEs for transformations other than $\log(Y)$ cannot be interpreted as percentages, at least if one imposes the seemingly reasonable requirement that a percentage does not depend on the original scaling of the outcome (e.g. dollars versus cents). We first show that if $m(y)$ is a function that behaves like $\log(y)$ for large values of $y$ and the treatment affects the probability that $Y=0$, then the ATE for $m(Y)$ can be made arbitrarily large or small in magnitude by rescaling the units of $Y$. Moreover, we show that any parameter of the form $\theta_g = E[ g(Y(1),Y(0)) ]$ is necessarily scale dependent if it is point-identified and defined with zero-valued outcomes. We conclude by outlining a variety of options available to empirical researchers dealing with zero-valued outcomes, including (i) estimating ATEs for normalized outcomes, (ii) explicitly calibrating the value placed on the extensive versus intensive margins, or (iii) estimating separate effects for the intensive and extensive margins.
Date:  2022–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2212.06080&r=ecm 
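The scale dependence described in the abstract above can be seen in a small numerical sketch. The potential outcomes below are invented for illustration and are not from the paper; the helper `ate_log1p` is a hypothetical name. Treatment moves one unit from zero to a positive outcome, so it changes the probability that $Y=0$, and the ATE for $\log(1+Y)$ then grows as the outcome is expressed in smaller units.

```python
import math

# Hypothetical potential outcomes (illustrative only, not data from the paper).
# Treatment changes the share of zeros: the first unit moves from 0 to a positive value.
y0 = [0.0, 10.0, 20.0, 40.0]  # untreated potential outcomes
y1 = [5.0, 10.0, 20.0, 40.0]  # treated potential outcomes

def ate_log1p(y_treated, y_control, scale):
    """ATE for m(Y) = log(1 + scale * Y), averaged over the same units."""
    n = len(y_treated)
    m1 = sum(math.log1p(scale * y) for y in y_treated) / n
    m0 = sum(math.log1p(scale * y) for y in y_control) / n
    return m1 - m0

# Rescale the outcome, e.g. dollars to cents to hundredths of a cent.
for scale in (1, 100, 10_000):
    print(scale, round(ate_log1p(y1, y0, scale), 3))
```

Because only the unit that leaves zero contributes to the difference, the estimate here equals $\log(1 + 5\cdot\text{scale})/4$, which increases without bound as the scale grows.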