By: | Max Cytrynbaum |
Abstract: | This paper studies covariate adjusted estimation of the average treatment effect (ATE) in stratified experiments. We work in the stratified randomization framework of Cytrynbaum (2021), which includes matched tuples designs (e.g. matched pairs), coarse stratification, and complete randomization as special cases. Interestingly, we show that the Lin (2013) interacted regression is generically asymptotically inefficient, with efficiency only in the edge case of complete randomization. Motivated by this finding, we derive the optimal linear covariate adjustment for a given stratified design, constructing several new estimators that achieve the minimal variance. Conceptually, we show that optimal linear adjustment of a stratified design is equivalent in large samples to doubly-robust semiparametric adjustment of an independent design. We also develop novel asymptotically exact inference for the ATE over a general family of adjusted estimators, showing in simulations that the usual Eicker-Huber-White confidence intervals can significantly overcover. Our inference methods produce shorter confidence intervals by fully accounting for the precision gains from both covariate adjustment and stratified randomization. Simulation experiments and an empirical application to the Oregon Health Insurance Experiment data (Finkelstein et al. (2012)) demonstrate the value of our proposed methods. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.03687&r=ecm |
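For readers who want a concrete reference point, below is a minimal sketch of the Lin (2013) fully interacted regression that the paper above takes as its starting point, run on simulated data under complete randomization. All data and parameter values are illustrative; the paper's optimal adjustments for stratified designs are not reproduced here.

```python
# Sketch of the Lin (2013) fully interacted regression adjustment for the ATE.
# Illustrative only, on simulated completely randomized data.
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 3
X = rng.normal(size=(n, k))                 # baseline covariates
D = rng.binomial(1, 0.5, size=n)            # completely randomized treatment
Y = 1.0 + 2.0 * D + X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)

Xc = X - X.mean(axis=0)                     # demean covariates
Z = np.column_stack([np.ones(n), D, Xc, D[:, None] * Xc])
beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
ate_hat = beta[1]                           # coefficient on D is the adjusted ATE
print(f"interacted-regression ATE estimate: {ate_hat:.3f}")
```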
By: | Zongwu Cai (Department of Economics, The University of Kansas, Lawrence, KS 66045, USA); Ying Fang (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, Fujian 361005, China and Department of Statistics & Data Science, School of Economics, Xiamen University, Xiamen, Fujian 361005, China); Ming Lin (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, Fujian 361005, China and Department of Statistics and Data Science, School of Economics, Xiamen University, Xiamen, Fujian 361005, China); Zixuan Wu (Department of Statistics and Data Science, School of Economics, Xiamen University, Xiamen, Fujian 361005, China) |
Abstract: | To relax the convex hull assumption of the conventional synthetic control method for estimating the average treatment effect, this article proposes a quasi synthetic control method for nonlinear models under an index model framework, together with a suggestion to use the minimum average variance estimation method to estimate parameters and a LASSO-type procedure to choose covariates. We also derive the asymptotic distribution of the proposed estimators. A properly designed bootstrap method is proposed to obtain confidence intervals, and its theoretical justification is provided. Finally, Monte Carlo simulation studies illustrate the finite sample performance, and an empirical application reanalyzing the data from the National Supported Work Demonstration demonstrates that the proposed model is practically useful. |
Keywords: | Average treatment effect; Bootstrap inference; Index model; Minimum average variance estimation method; Semiparametric estimation; Synthetic control method |
JEL: | C01 C14 C54 |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:kan:wpaper:202305&r=ecm |
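The authors' estimator combines an index model fitted by minimum average variance estimation with LASSO-type covariate selection; none of that machinery is reproduced below. The sketch only conveys the flavour of relaxing the convex-hull (non-negative, sum-to-one) weight constraint by fitting an unconstrained, LASSO-penalized donor regression on simulated pre-treatment data.

```python
# Hedged sketch: LASSO-type selection for a synthetic-control-style counterfactual,
# relaxing the convex-hull weight constraint. NOT the paper's MAVE-based estimator.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
T0, T1, J = 40, 10, 20                       # pre/post periods, donor units
donors = rng.normal(size=(T0 + T1, J))       # donor outcomes
w_true = np.zeros(J)
w_true[:3] = [0.6, 0.3, 0.4]                 # sparse weights outside the convex hull
treated = donors @ w_true + rng.normal(scale=0.1, size=T0 + T1)
treated[T0:] += 2.0                          # treatment effect after period T0

lasso = LassoCV(cv=5).fit(donors[:T0], treated[:T0])   # fit on pre-period only
counterfactual = lasso.predict(donors[T0:])
att_hat = np.mean(treated[T0:] - counterfactual)
print(f"estimated average effect on the treated: {att_hat:.3f}")
```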
By: | Joshua C. C. Chan; Aubrey Poon; Dan Zhu |
Abstract: | We develop an efficient sampling approach for handling complex missing data patterns and a large number of missing observations in conditionally Gaussian state space models. Two important examples are dynamic factor models with unbalanced datasets and large Bayesian VARs with variables in multiple frequencies. A key insight underlying the proposed approach is that the joint distribution of the missing data conditional on the observed data is Gaussian. Moreover, the inverse covariance or precision matrix of this conditional distribution is sparse, and this special structure can be exploited to substantially speed up computations. We illustrate the methodology using two empirical applications. The first application combines quarterly, monthly and weekly data using a large Bayesian VAR to produce weekly GDP estimates. In the second application, we extract latent factors from unbalanced datasets involving over a hundred monthly variables via a dynamic factor model with stochastic volatility. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.03172&r=ecm |
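A minimal sketch of the core conditional-Gaussian step described above: drawing the missing observations given the observed ones directly from the precision (inverse covariance) matrix. The toy example uses dense linear algebra; the paper's speed gains come from exploiting the sparsity of this precision matrix, e.g. via a sparse Cholesky factorization.

```python
# Draw the missing entries of a jointly Gaussian vector given the observed ones,
# working with the precision matrix H. Dense algebra for clarity; in practice a
# sparse/banded H from the state space structure makes this fast.
import numpy as np

def sample_missing(mu, H, y, miss, rng):
    """y ~ N(mu, H^{-1}); entries where miss is True are unobserved."""
    obs = ~miss
    H_mm = H[np.ix_(miss, miss)]
    H_mo = H[np.ix_(miss, obs)]
    # Conditional mean: mu_m - H_mm^{-1} H_mo (y_o - mu_o)
    rhs = H_mo @ (y[obs] - mu[obs])
    cond_mean = mu[miss] - np.linalg.solve(H_mm, rhs)
    # Draw: cond_mean + L^{-T} z has covariance H_mm^{-1}, where H_mm = L L'
    L = np.linalg.cholesky(H_mm)             # sparse Cholesky in real applications
    z = rng.standard_normal(miss.sum())
    return cond_mean + np.linalg.solve(L.T, z)

rng = np.random.default_rng(2)
n = 5
A = rng.normal(size=(n, n))
H = A @ A.T + n * np.eye(n)                  # toy precision matrix
mu = np.zeros(n)
y = rng.normal(size=n)
miss = np.array([True, False, True, False, False])
print(sample_missing(mu, H, y, miss, rng))
```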
By: | Heino Bohn Nielsen; Anders Rahbek |
Abstract: | We extend the theory of Fan and Li (2001) on penalized likelihood-based estimation and model selection to statistical and econometric models that allow for non-negativity constraints on some or all of the parameters, as well as time-series dependence. This differs from classic non-penalized likelihood estimation, where the limiting distributions of likelihood-based estimators and test statistics are non-standard and depend on the unknown number of parameters on the boundary of the parameter space. Specifically, we establish that joint model selection and estimation results in standard asymptotically Gaussian estimators. The results are applied to the rich class of autoregressive conditional heteroskedastic (ARCH) models for modelling time-varying volatility. We find in simulations that the penalized estimation and model selection works surprisingly well even for a large number of parameters. A simple empirical illustration for stock-market returns data confirms the ability of the penalized estimation to select ARCH models that fit the autocorrelation function nicely, and confirms the stylized fact of long memory in financial time series data. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.02867&r=ecm |
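As a rough illustration of penalized likelihood estimation with non-negativity constraints, here is a hedged sketch of a SCAD-penalized Gaussian quasi-likelihood for an over-parameterized ARCH(q) model. The penalty form follows Fan and Li (2001); the tuning constants, sample size and optimizer are illustrative choices, not the paper's.

```python
# SCAD-penalized Gaussian quasi-likelihood for an ARCH(q) model with
# non-negativity bounds. Illustrative sketch only.
import numpy as np
from scipy.optimize import minimize

def scad(theta, lam, a=3.7):
    """Fan-Li (2001) SCAD penalty, summed over non-negative parameters."""
    theta = np.abs(theta)
    small = lam * theta
    mid = (2 * a * lam * theta - theta**2 - lam**2) / (2 * (a - 1))
    large = lam**2 * (a + 1) / 2 * np.ones_like(theta)
    return np.where(theta <= lam, small, np.where(theta <= a * lam, mid, large)).sum()

def penalized_negloglik(params, eps, q, lam):
    omega, alpha = params[0], params[1:]
    n = len(eps)
    sig2 = np.full(n, eps.var())
    for t in range(q, n):
        sig2[t] = omega + alpha @ (eps[t - q:t][::-1] ** 2)   # sigma2_t = w + sum a_i eps2_{t-i}
    nll = 0.5 * np.sum(np.log(sig2[q:]) + eps[q:] ** 2 / sig2[q:])
    return nll + n * scad(alpha, lam)

rng = np.random.default_rng(3)
n, q = 2000, 5                               # fit ARCH(5); true model is ARCH(1)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = np.sqrt(0.2 + 0.5 * eps[t - 1] ** 2) * rng.standard_normal()

x0 = np.r_[0.1, np.full(q, 0.05)]
bounds = [(1e-6, None)] + [(0.0, None)] * q  # parameters allowed on the boundary
res = minimize(penalized_negloglik, x0, args=(eps, q, 0.05), bounds=bounds,
               method="L-BFGS-B")
print("omega, alpha estimates:", np.round(res.x, 3))
```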
By: | Marín Díazaraque, Juan Miguel; Lopes Moreira Da Veiga, María Helena |
Abstract: | In this paper, we propose a new asymmetric stochastic volatility model whose asymmetry parameter can change depending on the intensity of the shock and is modeled as a threshold function whose threshold depends on past returns. We study the model in terms of leverage and propagation using a new concept that has recently appeared in the literature. We find that the new model can generate more leverage and propagation than a well-known asymmetric volatility model. We also propose to estimate the parameters of the model by data cloning, a general technique for computing maximum likelihood estimators and their asymptotic variances via Markov chain Monte Carlo (MCMC). We compare the finite-sample estimates from data cloning and from a Bayesian approach and find that data cloning is often more accurate. The empirical application shows that the new model often improves the fit compared to the benchmark model. Finally, the new proposal together with data cloning estimation often leads to more accurate 1-day and 10-day volatility forecasts, especially for return series with high volatility. |
Keywords: | Asymmetric Stochastic Volatility; Data Cloning; Leverage Effect; Propagation; Volatility Forecasting |
Date: | 2023–02–14 |
URL: | http://d.repec.org/n?u=RePEc:cte:wsrepe:36569&r=ecm |
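Data cloning, mentioned above, amounts to running MCMC on the likelihood raised to the K-th power: the posterior mean then approximates the MLE and K times the posterior variance its asymptotic variance. The toy sketch below applies the idea to a simple normal model rather than the paper's asymmetric stochastic volatility model; K, the sampler and the tuning are illustrative.

```python
# Toy illustration of data cloning: MCMC on the K-th power of the likelihood.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
y = rng.normal(loc=1.5, scale=2.0, size=200)           # observed data
K = 20                                                  # number of clones

def log_target(theta):
    mu, log_sig = theta
    loglik = stats.norm.logpdf(y, mu, np.exp(log_sig)).sum()
    return K * loglik                                   # flat prior, K clones

# Random-walk Metropolis sampler
draws = np.zeros((5000, 2))
theta = np.array([0.0, 0.0])
lp = log_target(theta)
for i in range(len(draws)):
    prop = theta + 0.05 * rng.standard_normal(2)
    lp_prop = log_target(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws[i] = theta

post = draws[1000:]
mle_approx = post.mean(axis=0)                          # ~ MLE of (mu, log sigma)
asy_var = K * post.var(axis=0)                          # ~ asymptotic variance
print("MLE approx:", np.round(mle_approx, 3), " asy var:", np.round(asy_var, 4))
```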
By: | Thomas-Agnan, Christine; Dargel, Lukas |
Abstract: | In the framework of spatial econometric interaction models for origin-destination flows, we develop an estimation method for the case in which the list of origins may be distinct from the list of destinations, and the origin-destination matrix may be sparse. The proposed model resembles a weighted version of that of LeSage (2008), and we are able to retain most of the efficiency gains associated with the matrix-form estimation, which we illustrate for the maximum likelihood estimator. We also derive computationally feasible tests for the coherence of the estimation results and present an efficient approximation of the conditional expectation of the flows, marginal effects and predictions. |
Keywords: | Spatial Econometric Interaction Models; Zero Flow Problem; OD Matrices; Networks |
JEL: | C21 C51 |
Date: | 2023–02–08 |
URL: | http://d.repec.org/n?u=RePEc:tse:wpaper:127843&r=ecm |
By: | Juan Carlos Escanciano; Telmo Pérez-Izquierdo |
Abstract: | Many economic and causal parameters of interest depend on generated regressors, including structural parameters in models with endogenous variables estimated by control functions and in models with sample selection. Inference with generated regressors is complicated by the very complex expressions for influence functions and asymptotic variances. To address this problem, we propose automatic Locally Robust/debiased GMM estimators in a general setting with generated regressors. Importantly, we allow the generated regressors to be produced by machine learners, such as Random Forests, Neural Nets, Boosting, and many others. We use our results to construct novel Doubly Robust estimators for the Counterfactual Average Structural Function and Average Partial Effects in models with endogeneity and sample selection, respectively. |
Date: | 2023–01 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2301.10643&r=ecm |
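The sketch below shows a familiar special case for intuition only: a cross-fitted doubly robust (AIPW) estimate of an average treatment effect with Random Forest nuisance estimates. The paper's contribution is the automatic locally robust construction for settings with generated regressors, which is not reproduced here; the DGP and tuning choices are illustrative.

```python
# Cross-fitted doubly robust (AIPW) ATE estimate with Random Forest nuisances.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.model_selection import KFold

rng = np.random.default_rng(5)
n = 2000
X = rng.normal(size=(n, 5))
p = 1 / (1 + np.exp(-X[:, 0]))                          # propensity depends on X1
D = rng.binomial(1, p)
Y = 1.0 + 2.0 * D + np.sin(X[:, 0]) + X[:, 1] + rng.normal(size=n)

psi = np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    m1 = RandomForestRegressor(n_estimators=100, random_state=0).fit(
        X[train][D[train] == 1], Y[train][D[train] == 1])
    m0 = RandomForestRegressor(n_estimators=100, random_state=0).fit(
        X[train][D[train] == 0], Y[train][D[train] == 0])
    ps = RandomForestClassifier(n_estimators=100, random_state=0).fit(
        X[train], D[train]).predict_proba(X[test])[:, 1]
    ps = np.clip(ps, 0.05, 0.95)
    mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
    psi[test] = (mu1 - mu0
                 + D[test] * (Y[test] - mu1) / ps
                 - (1 - D[test]) * (Y[test] - mu0) / (1 - ps))

ate, se = psi.mean(), psi.std(ddof=1) / np.sqrt(n)
print(f"doubly robust ATE: {ate:.3f} (se {se:.3f})")
```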
By: | Yuehao Bai; Liang Jiang; Joseph P. Romano; Azeem M. Shaikh; Yichong Zhang |
Abstract: | This paper studies inference on the average treatment effect in experiments in which treatment status is determined according to "matched pairs" and it is additionally desired to adjust for observed, baseline covariates to gain further precision. By a "matched pairs" design, we mean that units are sampled i.i.d. from the population of interest, paired according to observed, baseline covariates and finally, within each pair, one unit is selected at random for treatment. Importantly, we presume that not all observed, baseline covariates are used in determining treatment assignment. We study a broad class of estimators based on a "doubly robust" moment condition that permits us to study estimators with both finite-dimensional and high-dimensional forms of covariate adjustment. We find that estimators with finite-dimensional, linear adjustments need not lead to improvements in precision relative to the unadjusted difference-in-means estimator. This phenomenon persists even if the adjustments are interacted with treatment; in fact, doing so leads to no changes in precision. However, gains in precision can be ensured by including fixed effects for each of the pairs. Indeed, we show that this adjustment is the "optimal" finite-dimensional, linear adjustment. We additionally study two estimators with high-dimensional forms of covariate adjustment based on the LASSO. For each such estimator, we show that it leads to improvements in precision relative to the unadjusted difference-in-means estimator and also provide conditions under which it leads to the "optimal" nonparametric, covariate adjustment. A simulation study confirms the practical relevance of our theoretical analysis, and the methods are employed to reanalyze data from an experiment using a "matched pairs" design to study the effect of macroinsurance on microenterprise. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.04380&r=ecm |
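A hedged sketch of the adjustment the paper singles out, covariate adjustment with pair fixed effects in a matched-pairs experiment, implemented via within-pair demeaning on simulated data. The matching procedure and data generating process are illustrative stand-ins.

```python
# Pair-fixed-effects linear covariate adjustment in a matched-pairs experiment,
# via within-pair demeaning. Illustrative simulated design.
import numpy as np

rng = np.random.default_rng(6)
n_pairs, k = 500, 2
Xp = rng.normal(size=(n_pairs, k))                # pair-level matching covariates
X = np.repeat(Xp, 2, axis=0) + 0.1 * rng.normal(size=(2 * n_pairs, k))
D = np.zeros(2 * n_pairs)
D[::2] = rng.binomial(1, 0.5, n_pairs)
D[1::2] = 1 - D[::2]                              # exactly one treated unit per pair
Y = 0.5 + 1.0 * D + X @ np.array([1.0, -0.5]) + rng.normal(size=2 * n_pairs)

def within(v):                                    # subtract pair means
    v = v.reshape(n_pairs, 2, -1) if v.ndim > 1 else v.reshape(n_pairs, 2)
    return (v - v.mean(axis=1, keepdims=True)).reshape(2 * n_pairs, -1)

Z = np.column_stack([within(D), within(X)])
beta, *_ = np.linalg.lstsq(Z, within(Y).ravel(), rcond=None)
print(f"pair-fixed-effects adjusted ATE estimate: {beta[0]:.3f}")
```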
By: | Patrick Aschermayr; Konstantinos Kalogeropoulos |
Abstract: | In this paper, we explore the class of Hidden Semi-Markov Models (HSMMs), a flexible extension of the popular Hidden Markov Model (HMM) that allows the underlying stochastic process to be a semi-Markov chain. HSMMs are used less frequently than their basic HMM counterpart due to the increased computational challenge of evaluating the likelihood function. Moreover, while both models are sequential in nature, parameter estimation is mainly conducted via batch estimation methods. Thus, a major motivation of this paper is to provide methods to estimate HSMMs (1) in a computationally feasible time, (2) in an exact manner, i.e. subject only to Monte Carlo error, and (3) in a sequential setting. We provide and verify an efficient computational scheme for Bayesian parameter estimation of HSMMs. Additionally, we explore the performance of HSMMs on the VIX time series using Autoregressive (AR) models with hidden semi-Markov states and demonstrate how this algorithm can be used for regime switching, model selection and clustering purposes. |
Date: | 2023–01 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2301.10494&r=ecm |
By: | Jack Fosten; Daniel Gutknecht; Marc-Oliver Pohle |
Abstract: | Quantile forecasts made across multiple horizons have become an important output of many financial institutions, central banks and international organisations. This paper proposes misspecification tests for such quantile forecasts that assess optimality over a set of multiple forecast horizons and/or quantiles. The tests build on multiple Mincer-Zarnowitz quantile regressions cast in a moment equality framework. Our main test is for the null hypothesis of autocalibration, a concept which assesses optimality with respect to the information contained in the forecasts themselves. We provide an extension that allows testing for optimality with respect to larger information sets, as well as a multivariate extension. Importantly, our tests do not just inform about general violations of optimality, but may also provide useful insights into specific forms of sub-optimality. A simulation study investigates the finite sample performance of our tests, and two empirical applications to financial returns and U.S. macroeconomic series illustrate that our tests can yield interesting insights into quantile forecast sub-optimality and its causes. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.02747&r=ecm |
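For intuition, the sketch below runs a single Mincer-Zarnowitz quantile regression of the realization on its issued tau-quantile forecast: under autocalibration the intercept is 0 and the slope is 1. This informal single-equation Wald check is not the paper's joint moment-equality test; the data are simulated and the forecast is constructed to be calibrated by design.

```python
# One Mincer-Zarnowitz quantile regression with an informal Wald check.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(7)
T, tau = 500, 0.25
mu = rng.normal(size=T)                               # conditional mean of y
y = mu + rng.standard_normal(T)
q_forecast = mu + stats.norm.ppf(tau)                 # a correctly calibrated forecast

X = sm.add_constant(q_forecast)
res = sm.QuantReg(y, X).fit(q=tau)                    # quantile regression at level tau
b, V = res.params, res.cov_params()
wald = (b - [0, 1]) @ np.linalg.solve(V, (b - [0, 1]))
print("intercept, slope:", np.round(b, 3),
      " Wald p-value:", round(1 - stats.chi2.cdf(wald, 2), 3))
```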
By: | Mathur, Maya B; Shpitser, Ilya; VanderWeele, Tyler |
Abstract: | Average treatment effects (ATEs) may be subject to selection bias when they are estimated among only a non-representative subset of the target population. Selection bias can sometimes be eliminated by conditioning on a “sufficient adjustment set” of covariates, even for some forms of missingness not at random (MNAR). Without requiring full specification of the causal structure, we consider sufficient adjustment sets to allow nonparametric identification of conditional ATEs in the target population. Covariates in the sufficient set may be collected among only the selected sample. We establish that if a sufficient set exists, then the set consisting of common causes of the outcome and selection, excluding the exposure and its descendants, also suffices. We establish simple graphical criteria for when a sufficient set will not exist, which could help indicate whether this is plausible for a given study. Simulations considering selection due to missing data indicated that sufficiently-adjusted complete-case analysis (CCA) can considerably outperform multiple imputation under MNAR and, if the sample size is not large, sometimes even under missingness at random. Analogous to the common-cause principle for confounding, these sufficiency results clarify when and how selection bias can be eliminated through covariate adjustment. |
Date: | 2023–01–31 |
URL: | http://d.repec.org/n?u=RePEc:osf:osfxxx:ths4e&r=ecm |
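A small simulation in the spirit of the sufficiency results above, under an assumed toy DGP: selection into the complete cases depends on a covariate X and on the treatment, and X also causes the outcome. Adjusting the complete-case analysis for X removes the bias that the unadjusted complete-case difference in means suffers from; the paper's MNAR settings and graphical criteria are considerably more general.

```python
# Toy simulation: adjusting the complete-case analysis for a covariate that
# causes both the outcome and selection removes the selection bias.
import numpy as np

rng = np.random.default_rng(8)
n = 100_000
X = rng.normal(size=n)
A = rng.binomial(1, 0.5, n)                           # randomized treatment
Y = 1.0 * A + 2.0 * X + rng.normal(size=n)            # true ATE = 1
p_obs = 1 / (1 + np.exp(-(X + A)))                    # selection depends on X and A
S = rng.binomial(1, p_obs).astype(bool)

naive = Y[S & (A == 1)].mean() - Y[S & (A == 0)].mean()
Z = np.column_stack([np.ones(S.sum()), A[S], X[S]])   # adjust for X among complete cases
adj = np.linalg.lstsq(Z, Y[S], rcond=None)[0][1]
print(f"unadjusted CCA: {naive:.3f}   X-adjusted CCA: {adj:.3f}   truth: 1.000")
```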
By: | Niels Gillmann (ifo Institute Dresden; Technische Universität Dresden); Ostap Okhrin (Technische Universität Dresden) |
Abstract: | The availability of data on economic uncertainty has sparked a lot of interest in models that can quantify episodes of international spillovers of uncertainty in a timely manner. This challenging task involves trading off estimation accuracy against more timely quantification. This paper develops a local vector autoregressive (VAR) model that allows for adaptive estimation of the time-varying multivariate dependency. By local, we mean that for each point in time we simultaneously estimate the model parameters and the longest interval on which the model is constant. The simulation study shows that the model can handle one or multiple sudden breaks as well as a smooth break in the data. The empirical application uses monthly Economic Policy Uncertainty data. The local model highlights that the empirical data primarily consist of long homogeneous episodes, interrupted by a small number of heterogeneous ones that correspond to crises. Based on this observation, we create a crisis index, which reflects the homogeneity of the sample over time. Furthermore, the local model shows superiority over rolling window estimation. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.02808&r=ecm |
By: | Yi-Chun Chen; Dmitry Mitrofanov |
Abstract: | The identification of choice models is crucial for understanding consumer behavior and informing marketing or operational strategies, policy design, and product development. The identification of parametric choice-based demand models is typically straightforward. However, nonparametric models, which are highly effective and flexible in explaining customer choice, may suffer from the curse of dimensionality, hindering their identification. A prominent example of a nonparametric model is the ranking-based model, which mirrors the random utility maximization (RUM) class and is known to be nonidentifiable from the collection of choice probabilities alone. Our objective in this paper is to develop a new class of nonparametric models that is not subject to the problem of nonidentifiability. Our model assumes bounded rationality of consumers, which results in symmetric demand cannibalization and, intriguingly, enables full identification. Our choice model also demonstrates competitive prediction accuracy compared to state-of-the-art benchmarks in a real-world case study, despite incorporating the assumption of bounded rationality, which could, in theory, limit the representation power of the model. In addition, we tackle the important problem of finding the optimal assortment under the proposed choice model. We demonstrate the NP-hardness of this problem and provide a fully polynomial-time approximation scheme through dynamic programming. Finally, we propose an efficient estimation framework using a combination of column generation and expectation-maximization algorithms, which proves to be more tractable than the estimation algorithm of the aforementioned ranking-based model. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.04354&r=ecm |
By: | Joao, Igor Custodio; Lucas, André; Schaumburg, Julia; Schwaab, Bernd |
Abstract: | We introduce a new dynamic clustering method for multivariate panel data characterized by time-variation in cluster locations and shapes, cluster compositions, and, possibly, the number of clusters. To avoid overly frequent cluster switching (flickering), we extend standard cross-sectional clustering techniques with a penalty that shrinks observations towards the current center of their previous cluster assignment. This links consecutive cross-sections in the panel together, substantially reduces flickering, and enhances the economic interpretability of the outcome. We choose the shrinkage parameter in a data-driven way and study its misclassification properties theoretically as well as in several challenging simulation settings. The method is illustrated using a multivariate panel of four accounting ratios for 28 large European insurance firms between 2010 and 2020. JEL Classification: C33, C38, G22 |
Keywords: | cluster membership persistence, dynamic clustering, insurance industry, shrinkage, silhouette index |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:ecb:ecbwps:20232780&r=ecm |
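The sketch below is one simple reading of the penalized clustering idea described above: before clustering each cross-section, every unit is shrunk towards the current center of the cluster it belonged to in the previous period, which discourages flickering. The exact penalty, its data-driven tuning and the theoretical analysis in the paper differ; gamma and the simulated panel are illustrative.

```python
# One simple reading of penalized dynamic clustering: shrink each unit towards
# the center of its previous cluster before re-clustering the cross-section.
import numpy as np
from sklearn.cluster import KMeans

def dynamic_clusters(panel, n_clusters=3, gamma=0.5, seed=0):
    """panel: array of shape (T, n_units, n_features)."""
    T, n, _ = panel.shape
    labels = np.zeros((T, n), dtype=int)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(panel[0])
    labels[0], centers = km.labels_, km.cluster_centers_
    for t in range(1, T):
        prev_centers = centers[labels[t - 1]]            # center of previous assignment
        shrunk = (panel[t] + gamma * prev_centers) / (1 + gamma)
        km = KMeans(n_clusters=n_clusters, n_init=1,
                    init=centers, random_state=seed).fit(shrunk)
        labels[t], centers = km.labels_, km.cluster_centers_
    return labels

rng = np.random.default_rng(9)
base = rng.normal(size=(28, 4))                          # 28 firms, 4 accounting ratios
panel = base + 0.2 * rng.normal(size=(10, 28, 4))        # 10 periods of noisy ratios
print(dynamic_clusters(panel)[:, :8])                    # labels for the first 8 firms
```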
By: | Christis Katsouris |
Abstract: | This Appendix (dated: July 2021) includes supplementary derivations related to the main limit results of the econometric framework for structural break testing in predictive regression models based on the OLS-Wald and IVX-Wald test statistics, developed by Katsouris C (2021). In particular, we derive the asymptotic distributions of the test statistics when the predictive regression model includes either mildly integrated or persistent regressors. Moreover, we consider the case in which a model intercept is included in the predictive regression vis-à-vis the case in which it is not. In a subsequent version of this study we reexamine these aspects in more depth with respect to the demeaned versions of the variables of the predictive regression. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.02370&r=ecm |
By: | Alicia Curth; Mihaela van der Schaar |
Abstract: | Personalized treatment effect estimates are often of interest in high-stakes applications -- thus, before deploying a model estimating such effects in practice, one needs to be sure that the best candidate from the ever-growing machine learning toolbox for this task was chosen. Unfortunately, due to the absence of counterfactual information in practice, it is usually not possible to rely on standard validation metrics for doing so, leading to a well-known model selection dilemma in the treatment effect estimation literature. While some solutions have recently been investigated, systematic understanding of the strengths and weaknesses of different model selection criteria is still lacking. In this paper, instead of attempting to declare a global 'winner', we therefore empirically investigate the success and failure modes of different selection criteria. We highlight that there is a complex interplay between selection strategies, candidate estimators and the DGP used for testing, and provide interesting insights into the relative (dis)advantages of different criteria alongside desiderata for the design of further illuminating empirical studies in this context. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2302.02923&r=ecm |
By: | Alho, Juha M.; Rendtel, Ulrich; Khan, Mursala |
Abstract: | High nonresponse rates have become the rule in survey sampling. In panel surveys, additional sample losses occur due to panel attrition, which is thought to worsen the bias resulting from initial nonresponse. However, under certain conditions an initial-wave nonresponse bias may vanish in later panel waves. We study such a "fade away" of an initial nonresponse bias in the context of regression analysis. Using a time series approach for the covariate and the error terms, we derive the bias of cross-sectional OLS estimates of the slope coefficient. In the case of no subsequent attrition and only serial correlation, an initial bias converges to zero. If the nonresponse affects permanent components, the initial bias decreases to a limit determined by the size of the permanent components. Attrition is discussed here in a worst-case scenario, where there is a steady selective drift in the same direction as in the initial panel wave. It is shown that the fade-away effect dampens the attrition effect to a large extent, depending on the temporal stability of the covariate and the dependent variable. The attrition effect may be further reduced by a weighted regression analysis, where the weights are estimated attrition probabilities based on the lagged dependent variable. The results are discussed with respect to surveys with unsure selection procedures which are used in a longitudinal fashion, like access panels. |
Keywords: | Regression Analysis, Nonresponse Bias, Panel Attrition, Inverse Probability Weighting |
Date: | 2023 |
URL: | http://d.repec.org/n?u=RePEc:zbw:fubsbe:20232&r=ecm |
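A hedged sketch of the weighting idea mentioned above: attrition probabilities are estimated from the lagged dependent variable and used as inverse probability weights in the later-wave regression. The two-wave DGP, the selection model and all parameter values are illustrative, not the paper's.

```python
# Inverse probability weighting for panel attrition, with weights estimated
# from the lagged dependent variable. Toy two-wave panel.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
n, beta = 50_000, 1.0
x1 = rng.normal(size=n)
y1 = beta * x1 + rng.normal(size=n)                   # wave 1
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)         # persistent covariate
y2 = beta * x2 + 0.8 * (y1 - beta * x1) + rng.normal(scale=0.6, size=n)

keep = rng.uniform(size=n) < 1 / (1 + np.exp(-(0.5 + 1.5 * y1)))   # selective attrition

logit = LogisticRegression(C=1e6).fit(y1.reshape(-1, 1), keep.astype(int))
w = 1 / logit.predict_proba(y1[keep].reshape(-1, 1))[:, 1]          # inverse prob. weights

def wls_slope(x, y, w):
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.solve((X.T * w) @ X, (X.T * w) @ y)[1]

print("unweighted wave-2 slope:",
      round(wls_slope(x2[keep], y2[keep], np.ones(keep.sum())), 3),
      " IPW slope:", round(wls_slope(x2[keep], y2[keep], w), 3),
      " truth:", beta)
```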
By: | Morris, Tim P (MRC Clinical Trials Unit at UCL); White, Ian R; Pham, Tra My; Quartagno, Matteo |
Abstract: | Simulation studies are a powerful tool in epidemiology and biostatistics, but they can be hard to conduct successfully. Sometimes unexpected results are obtained. We offer advice on how to check a simulation study when this occurs, and how to design and conduct the study to give results that are easier to check. Simulation studies should be designed to include some settings where the answers are already known. They should be coded sequentially, with data generating mechanisms checked before simulated data are analysed. Results should be explored carefully, with scatterplots of standard error estimates against point estimates being a powerful tool. Failed estimation and outlying estimates should be identified and avoided by changing data generating mechanisms or coding realistic hybrid analysis procedures. Finally, surprising results should be investigated by methods including considering whether sources of variation are correctly included. Following our advice may help to prevent errors and to improve the quality of published simulation studies. |
Date: | 2023–02–03 |
URL: | http://d.repec.org/n?u=RePEc:osf:osfxxx:cbr72&r=ecm |
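As a minimal version of the checking plot recommended above, the sketch scatters estimated standard errors against point estimates across simulation repetitions, which makes failed or outlying estimates easy to spot. The data generating mechanism is a toy example.

```python
# Simulation check: scatter SE estimates against point estimates across repetitions.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(11)
n_sim, n = 500, 30
ests, ses = [], []
for _ in range(n_sim):
    y = rng.exponential(scale=2.0, size=n)            # skewed data, true mean = 2
    ests.append(y.mean())
    ses.append(y.std(ddof=1) / np.sqrt(n))

plt.scatter(ests, ses, s=8, alpha=0.5)
plt.xlabel("point estimate")
plt.ylabel("estimated standard error")
plt.title("Simulation check: SE estimates vs point estimates")
plt.show()
```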
By: | Lanier, Joshua; Large, Jeremy; Quah, John |
Abstract: | We present a discrete choice, random utility model and a new estimation technique for analyzing consumer demand for large numbers of products. We allow the consumer to purchase multiple units of any product and to purchase multiple products at once (think of a consumer selecting a bundle of goods in a supermarket). In our model each product has an associated unobservable vector of attributes from which the consumer derives utility. Our model allows for heterogeneous utility functions across consumers, complex patterns of substitution and complementarity across products, and nonlinear price effects. The dimension of the attribute space is, by assumption, much smaller than the number of products, which effectively reduces the size of the consumption space and simplifies estimation. Nonetheless, because the number of bundles available is massive, a new estimation technique, which is based on the practice of negative sampling in machine learning, is needed to sidestep an intractable likelihood function. We prove consistency of our estimator, validate the consistency result through simulation exercises, and estimate our model using supermarket scanner data. |
Keywords: | discrete choice, demand estimation, negative sampling, machine learning, scanner data |
JEL: | C13 C34 D12 L20 L66 |
Date: | 2022–06 |
URL: | http://d.repec.org/n?u=RePEc:amz:wpaper:2023-01&r=ecm |
By: | Ye Lu; Adrian Pagan |
Abstract: | Phillips and Shi (2021) have argued that there may be some leakage from the estimate of the permanent component into what is meant to be the transitory component when one uses the Hodrick-Prescott filter. They argue that this can be eliminated by boosting the filter. We show that there is no leakage from the filter per se, so boosting is not needed for that. They also argue that there are DGPs for the components for which the boosted filter tracks them more accurately. We show that there are other plausible DGPs where the boosted filter tracks less accurately, and that what is crucial to tracking performance is how important permanent shocks are to growth in the series being filtered. In particular, the DGPs used in Phillips and Shi (2021) have a very high contribution from permanent shocks. |
Keywords: | Boosting, Hodrick-Prescott filter, Component models |
JEL: | E32 E37 C10 |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:een:camaaa:2023-12&r=ecm |
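For reference, a compact sketch of the HP filter and its boosted version as discussed above: boosting re-applies the HP smoother to the current cycle and adds the result back to the trend. The simulated permanent and transitory components are illustrative and are not the DGPs considered by either paper.

```python
# HP filter and boosted HP filter: re-smooth the remaining cycle n_boost-1 times.
import numpy as np

def hp_smoother(T, lam=1600.0):
    """Return the T x T HP smoother matrix S, with trend = S @ y."""
    D = np.zeros((T - 2, T))
    for i in range(T - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]              # second differences
    return np.linalg.inv(np.eye(T) + lam * (D.T @ D))

def boosted_hp(y, lam=1600.0, n_boost=1):
    S = hp_smoother(len(y), lam)
    trend = S @ y
    for _ in range(n_boost - 1):                      # n_boost = 1 is the plain HP filter
        trend = trend + S @ (y - trend)               # smooth the remaining cycle
    return trend

rng = np.random.default_rng(12)
T = 200
perm = np.cumsum(0.2 * rng.standard_normal(T))        # permanent (random walk) component
trans = np.convolve(rng.standard_normal(T), [1, 0.7, 0.4], mode="same")
y = perm + trans
for b in (1, 5, 20):
    c = y - boosted_hp(y, n_boost=b)
    print(f"boosts={b:2d}  corr(cycle, true transitory) = {np.corrcoef(c, trans)[0, 1]:.3f}")
```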
By: | Shuzhen Yang; Wenqing Zhang |
Abstract: | The stochastic volatility inspired (SVI) model is widely used to fit the implied variance smile. Presently, most optimizer algorithms for the SVI model have a strong dependence on the input starting point. In this study, we develop an efficient iterative algorithm for the SVI model based on a fixed-point and least-square optimizer. Furthermore, we present the convergence results in certain situations for this novel iterative algorithm. Compared with the quasi-explicit SVI method, we demonstrate the advantages of the fixed-point iterative algorithm using simulation and market data. |
Date: | 2023–01 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2301.07830&r=ecm |
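A hedged sketch of the least-squares building block behind quasi-explicit SVI calibration, which the paper uses as a benchmark: for fixed (m, sigma), the raw SVI total variance w(k) = a + b(rho(k - m) + sqrt((k - m)^2 + sigma^2)) is linear in (a, b*rho, b), so those parameters solve an ordinary least-squares problem. The paper's new fixed-point iteration is not reproduced; the data and starting values are illustrative.

```python
# Quasi-explicit SVI calibration: inner least squares for (a, b, rho) given (m, sigma),
# with an outer numerical search over (m, sigma). Illustrative sketch only.
import numpy as np
from scipy.optimize import minimize

def svi_w(k, a, b, rho, m, sig):
    return a + b * (rho * (k - m) + np.sqrt((k - m) ** 2 + sig ** 2))

def inner_ls(k, w, m, sig):
    """Given (m, sigma), recover (a, b, rho) by linear least squares."""
    X = np.column_stack([np.ones_like(k), k - m, np.sqrt((k - m) ** 2 + sig ** 2)])
    a, d, c = np.linalg.lstsq(X, w, rcond=None)[0]    # d = b*rho, c = b
    return a, c, d / c

def outer_obj(params, k, w):
    m, sig = params
    a, b, rho = inner_ls(k, w, m, sig)
    return np.sum((svi_w(k, a, b, rho, m, sig) - w) ** 2)

rng = np.random.default_rng(13)
k = np.linspace(-1.5, 1.5, 60)                        # log-moneyness grid
w_true = svi_w(k, a=0.04, b=0.4, rho=-0.4, m=0.05, sig=0.3)
w_obs = w_true + 0.001 * rng.standard_normal(len(k))  # noisy implied total variance

res = minimize(outer_obj, x0=[0.0, 0.5], args=(k, w_obs),
               bounds=[(-1.0, 1.0), (1e-3, 2.0)])
m_hat, sig_hat = res.x
print("m, sigma:", np.round(res.x, 3), " a, b, rho:",
      np.round(inner_ls(k, w_obs, m_hat, sig_hat), 3))
```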
By: | Paul Labonne |
Abstract: | This paper presents a new way to account for downside and upside risks when producing density nowcasts of GDP growth. The approach relies on modelling location, scale and shape common factors in real-time macroeconomic data. While movements in the location generate shifts in the central part of the predictive density, the scale controls its dispersion (akin to general uncertainty) and the shape its asymmetry, or skewness (akin to downside and upside risks). The empirical application is centred on US GDP growth and the real-time data come from Fred-MD. The results show that there is more to real-time data than their levels or means: their dispersion and asymmetry provide valuable information for nowcasting economic activity. Scale and shape common factors (i) yield more reliable measures of uncertainty and (ii) improve precision when macroeconomic uncertainty is at its peak. |
Keywords: | density nowcasting, downside risk, fred-md, nowcasting uncertainty, score driven models |
JEL: | C32 C53 E66 |
Date: | 2022–10 |
URL: | http://d.repec.org/n?u=RePEc:nsr:escoed:escoe-dp-2022-23&r=ecm |