nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒10‒02
sixteen papers chosen by
Sune Karlsson, Örebro universitet

  1. Instrumental variable estimation of the proportional hazards model by presmoothing By Lorenzo Tedesco; Jad Beyhum; Ingrid Van Keilegom
  2. Cluster-Robust Inference Robust to Large Clusters By Harold D. Chiang; Yuya Sasaki; Yulong Wang
  3. Double Robust, Flexible Adjustment Methods for Causal Inference: An Overview and an Evaluation By Hoffmann, Nathan Isaac
  4. Precision-based sampling for state space models that have no measurement error By Mertens, Elmar
  5. Asymmetric AdaBoost for High-dimensional Maximum Score Regression By Jianghao Chu; Tae-Hwy Lee; Aman Ullah
  6. Another Look at the Linear Probability Model and Nonlinear Index Models By Kaicheng Chen; Robert S. Martin; Jeffrey M. Wooldridge
  7. Impulse Response Functions for Self-Exciting Nonlinear Models By Neville Francis; Michael T. Owyang; Daniel Soques
  8. Econometrics of Machine Learning Methods in Economic Forecasting By Andrii Babii; Eric Ghysels; Jonas Striaukas
  9. Stochastic Variational Inference for GARCH Models By Hanwen Xuan; Luca Maestrini; Feng Chen; Clara Grazian
  10. Incorporating Micro Data into Differentiated Products Demand Estimation with PyBLP By Christopher Conlon; Jeff Gortmaker
  11. The roots of inequality: estimating inequality of opportunity from regression trees and forests By Brunori, Paolo
  12. New general dependence measures: construction, estimation and application to high-frequency stock returns By Aleksy Leeuwenkamp; Wentao Hu
  13. Hedging Forecast Combinations With an Application to the Random Forest By Elliot Beck; Damian Kozbur; Michael Wolf
  14. Reconciling estimates of the long-term earnings effect of fertility By Simon Bensnes; Ingrid Huitfeldt; Edwin Leuven
  15. Towards seasonal adjustment of infra-monthly time series with JDemetra+ By Webel, Karsten; Smyk, Anna
  16. Splash! Robustifying Donor Pools for Policy Studies By Jared Amani Greathouse; Mani Bayani; Jason Coupet

  1. By: Lorenzo Tedesco; Jad Beyhum; Ingrid Van Keilegom
    Abstract: We consider instrumental variable estimation of the proportional hazards model of Cox (1972). The instrument and the endogenous variable are discrete but there can be (possibly continuous) exogenous covariables. By making a rank invariance assumption, we can reformulate the proportional hazards model into a semiparametric version of the instrumental variable quantile regression model of Chernozhukov and Hansen (2005). A naïve estimation approach based on conditional moment conditions generated by the model would lead to a highly nonconvex and nonsmooth objective function. To overcome this problem, we propose a new presmoothing methodology. First, we estimate the model nonparametrically, and show that this nonparametric estimator has a closed-form solution in the leading case of interest of randomized experiments with one-sided noncompliance. Second, we use the nonparametric estimator to generate "proxy" observations for which exogeneity holds. Third, we apply the usual partial likelihood estimator to the "proxy" data. While the paper focuses on the proportional hazards model, our presmoothing approach could be applied to estimate other semiparametric formulations of the instrumental variable quantile regression model. Our estimation procedure allows for random right-censoring. We show asymptotic normality of the resulting estimator. The approach is illustrated via simulation studies and an empirical application to the Illinois
    Date: 2023–09
  2. By: Harold D. Chiang; Yuya Sasaki; Yulong Wang
    Abstract: Recent work by Sasaki and Wang (2022) points out that the conventional cluster-robust standard errors fail in the presence of large clusters. We propose a novel method of cluster-robust inference that is valid even in the presence of large clusters. Specifically, we derive the asymptotic distribution for the t-statistics based on the common cluster-robust variance estimators when the distribution of cluster sizes follows a power law with an exponent less than two. We then propose an inference procedure based on subsampling and show its validity. Our proposed method does not require tail index estimation and remains valid under the usual thin-tailed scenarios as well.
    Date: 2023–08
  3. By: Hoffmann, Nathan Isaac
    Abstract: Double robust methods for flexible covariate adjustment in causal inference have proliferated in recent years. Despite their apparent advantages, these methods remain underutilized by social scientists. It is also unclear whether these methods actually outperform more traditional methods in finite samples. This paper has two aims: It is a guide to some of the latest methods in double robust, flexible covariate adjustment for causal inference, and it compares these methods to more traditional statistical methods. It does this by using both simulated data where the treatment effect estimate is known, and then using comparisons of experimental and observational data from the National Supported Work Demonstration. Methods covered include Augmented Inverse Propensity Weighting, Targeted Maximum Likelihood Estimation, and Double/Debiased Machine Learning. Results suggest that these methods do not necessarily outperform OLS regression or matching on propensity score estimated by logistic regression, even in cases where the data generating process is not linear.
    Date: 2023–08–29
  4. By: Mertens, Elmar
    Abstract: This article presents a computationally efficient approach to sample from Gaussian state space models. The method is an instance of precision-based sampling methods that operate on the inverse variance-covariance matrix of the states (also known as precision). The novelty is to handle cases where the observables are modeled as a linear combination of the states without measurement error. In this case, the posterior variance of the states is singular and precision is ill-defined. As in other instances of precision-based sampling, computational gains are considerable. Relevant applications include trend-cycle decompositions, (mixed-frequency) VARs with missing variables and DSGE models.
    Keywords: State space models, signal extraction, Kalman filter and smoother, precision-based sampling, band matrix
    JEL: C11 C32 C51
    Date: 2023
  5. By: Jianghao Chu (JPMorgan Chase & Co); Tae-Hwy Lee (Department of Economics, University of California Riverside); Aman Ullah (Department of Economics, University of California Riverside)
    Abstract: Carter Hill’s numerous contributions (books and articles) in econometrics stand out especially in pedagogy. An important aspect of his pedagogy is to integrate “theory and practice” of econometrics, as coined into the titles of his popular books. The new methodology we propose in this paper is consistent with these contributions of Carter Hill. In particular, we bring the maximum score regression of Manski (1975, 1985) to high dimension in theory and show that the “Asymmetric AdaBoost” provides the algorithmic implementation of the high dimensional maximum score regression in practice. Recent advances in machine learning research have not only expanded the horizon of econometrics by providing new methods but also provided the algorithmic aspects of many traditional econometric methods. For example, Adaptive Boosting (AdaBoost) introduced by Freund and Schapire (1996) has gained enormous success in binary/discrete classification/prediction. In this paper, we introduce the “Asymmetric AdaBoost” and relate it to the maximum score regression in the algorithmic perspective. The Asymmetric AdaBoost solves high-dimensional binary classification/prediction problems with state-dependent loss functions. Asymmetric AdaBoost produces a nonparametric classifier via minimizing the “asymmetric exponential risk”, which is a convex surrogate of the non-convex 0-1 risk. The convex risk function gives a huge computational advantage over non-convex risk functions of Manski (1975, 1985), especially when the data is high-dimensional. The resulting nonparametric classifier is more robust than the parametric classifiers whose performance depends on the correct specification of the model. We show that the risk of the classifier that Asymmetric AdaBoost produces approaches the Bayes risk, which is the infimum of risk that can be achieved by all classifiers. Monte Carlo experiments show that the Asymmetric AdaBoost performs better than the commonly used LASSO-regularized logistic regression when the parametric assumption is violated and the sample size is large. We apply the Asymmetric AdaBoost to predict business cycle turning points as in Ng (2014).
    Keywords: Maximum Score Regression; High Dimension; Asymmetric AdaBoost; Convex Relaxation; Exponential Risk.
    JEL: C25 C44 C53 C55
    Date: 2023–08
  6. By: Kaicheng Chen; Robert S. Martin; Jeffrey M. Wooldridge
    Abstract: We reconsider the pros and cons of using a linear model to approximate partial effects on a response probability for a binary outcome. In particular, we study the ramp model in Horrace and Oaxaca (2006), but focus on average partial effects (APE) rather than the parameters of the underlying linear index. We use existing theoretical results to verify that the linear projection parameters (which are always consistently estimated by ordinary least squares (OLS)) may differ from the index parameters, yet still be identical to the APEs in some cases. Using simulations, we describe other cases where OLS either does or does not approximate the APEs, and we find that having a large fraction of fitted values in [0, 1] is neither necessary nor sufficient. A practical approach to reduce the finite sample bias of OLS is to iteratively trim the observations with fitted values outside the unit interval, which we find produces estimates numerically equivalent to nonlinear least squares (NLS) estimation of the ramp model. We show that under the ramp model, NLS is consistent and asymptotically normal. Based on the theory and simulations, we provide some suggestions for empirical practice.
    Date: 2023–08
  7. By: Neville Francis; Michael T. Owyang; Daniel Soques
    Abstract: We calculate impulse response functions from regime-switching models where the driving variable can respond to the shock. Two methods used to estimate the impulse responses in these models are generalized impulse response functions and local projections. Local projections depend on the observed switches in the data, while generalized impulse response functions rely on correctly specifying the regime process. Using Monte Carlo experiments with different misspecifications, we determine under what conditions either method is preferred. We then extend model-average impulse responses to this nonlinear environment and show that they generally perform better than either generalized impulse response functions or local projections. Finally, we apply these findings to the empirical estimation of regime-dependent fiscal multipliers and find multipliers less than one and generally small differences across different states of slack.
    Keywords: generalized impulse response functions; local projections; threshold models; model averaging
    JEL: C22 C24 E62
    Date: 2023–08–29
  8. By: Andrii Babii; Eric Ghysels; Jonas Striaukas
    Abstract: This paper surveys recent advances in machine learning methods for economic forecasting. The survey covers the following topics: nowcasting, textual data, panel and tensor data, high-dimensional Granger causality tests, time series cross-validation, and classification with economic losses.
    Date: 2023–08
  9. By: Hanwen Xuan; Luca Maestrini; Feng Chen; Clara Grazian
    Abstract: Stochastic variational inference algorithms are derived for fitting various heteroskedastic time series models. We examine Gaussian, t, and skew-t response GARCH models and fit these using Gaussian variational approximating densities. We implement efficient stochastic gradient ascent procedures based on the use of control variates or the reparameterization trick and demonstrate that the proposed implementations provide a fast and accurate alternative to Markov chain Monte Carlo sampling. Additionally, we present sequential updating versions of our variational algorithms, which are suitable for efficient portfolio construction and dynamic asset allocation.
    Date: 2023–08
  10. By: Christopher Conlon; Jeff Gortmaker
    Abstract: We provide a general framework for incorporating many types of micro data from summary statistics to full surveys of selected consumers into Berry, Levinsohn, and Pakes (1995)-style estimates of differentiated products demand systems. We extend best practices for BLP estimation in Conlon and Gortmaker (2020) to the case with micro data and implement them in our open-source package PyBLP. Monte Carlo experiments and empirical examples suggest that incorporating micro data can substantially improve the finite sample performance of the BLP estimator, particularly when using well-targeted summary statistics or "optimal micro moments" that we derive and show how to compute.
    JEL: C13 C18 C30 D12 L0 L66
    Date: 2023–08
  11. By: Brunori, Paolo
    Abstract: We propose the use of machine learning methods to estimate inequality of opportunity and to illustrate that regression trees and forests represent a substantial improvement over existing approaches: they reduce the risk of ad hoc model selection and trade off upward and downward bias in inequality of opportunity estimates. The advantages of regression trees and forests are illustrated by an empirical application for a cross-section of 31 European countries. We show that arbitrary model selection might lead to significant biases in inequality of opportunity estimates relative to our preferred method. These biases are reflected in both point estimates and country rankings.
    Keywords: equality of opportunity; machine learning; random forests
    JEL: J1
    Date: 2023–02–20
  12. By: Aleksy Leeuwenkamp; Wentao Hu
    Abstract: We propose a set of dependence measures that are non-linear, local, invariant to a wide range of transformations on the marginals, can show tail and risk asymmetries, are always well-defined, are easy to estimate and can be used on any dataset. We propose a nonparametric estimator and prove its consistency and asymptotic normality. Thereby we significantly improve on existing (extreme) dependence measures used in asset pricing and statistics. To show practical utility, we use these measures on high-frequency stock return data around market distress events such as the 2010 Flash Crash and during the GFC. Contrary to ubiquitously used correlations we find that our measures clearly show tail asymmetry, non-linearity, lack of diversification and endogenous buildup of risks present during these distress events. Additionally, our measures anticipate large (joint) losses during the Flash Crash while also anticipating the bounce back and flagging the subsequent market fragility. Our findings have implications for risk management, portfolio construction and hedging at any frequency.
    Date: 2023–08
  13. By: Elliot Beck; Damian Kozbur; Michael Wolf
    Abstract: This paper proposes a generic, high-level methodology for generating forecast combinations that would deliver the optimal linearly combined forecast in terms of the mean-squared forecast error if one had access to two population quantities: the mean vector and the covariance matrix of the vector of individual forecast errors. We point out that this problem is identical to a mean-variance portfolio construction problem, in which portfolio weights correspond to forecast combination weights. We allow negative forecast weights and interpret such weights as hedging over- and underestimation risks across estimators. This interpretation follows directly as an implication of the portfolio analogy. We demonstrate our method's improved out-of-sample performance relative to standard methods in combining tree forecasts to form weighted random forests in 14 data sets.
    Date: 2023–08
  14. By: Simon Bensnes; Ingrid Huitfeldt; Edwin Leuven (Statistics Norway)
    Abstract: This paper presents novel methodological and empirical contributions to the child penalty literature. We propose a new estimator that combines elements from standard event study and instrumental variable estimators and demonstrate their relatedness. Our analysis shows that all three approaches yield substantial estimates of the long-term impact of children on the earnings gap between mothers and their partners, commonly known as the child penalty, ranging from 11 to 18 percent. However, the models not only estimate different magnitudes of the child penalty, they also lead to very different conclusions as to whether it is mothers or partners who drive this penalty – the key policy concern. While the event study attributes the entire impact to mothers, our results suggest that maternal responses account for only around one fourth of the penalty. Our paper also has broader implications for event-study designs. In particular, we assess the validity of the event-study assumptions using external information and characterize biases arising from selection in treatment timing. We find that women time fertility as their earnings profile flattens. The implication of this is that the event-study overestimates women’s earnings penalty as it relies on estimates of counterfactual wage profiles that are too high. These new insights in the nature of selection into fertility show that common intuitions regarding parallel trend assumptions may be misleading, and that pre-trends may be uninformative about the sign of the selection bias in the treatment period.
    Keywords: Child penalty; female labor supply; event study; instrumental variable
    JEL: C36 J13 J16 J21 J22 J31
    Date: 2023–08
  15. By: Webel, Karsten; Smyk, Anna
    Abstract: Infra-monthly economic time series have become increasingly popular in official statistics in recent years. This evolution has been largely fostered by official statistics' digital transformation during the last decade. The COVID-19 pandemic outbreak in 2020 has added fuel to the fire as many data users immediately asked for timely weekly and even daily data on economic developments. Such infra-monthly data often display seasonal behavior that calls for adjustment. For that reason, JDemetra+, the official software for harmonized seasonal adjustment of monthly and quarterly data in the European Statistical System and the European System of Central Banks, has been augmented recently with a regARIMA-esque pretreatment model and extended versions of the ARIMA model-based, STL and X-11 seasonal adjustment approaches that are tailored to the specifics of infra-monthly data and accessible through an ecosystem of R packages. This ecosystem also provides easy access to structural time series modeling. We give a comprehensive overview of the packages' current developmental stage and illustrate selected capabilities, including code snippets, using daily births in France, hourly electricity consumption in Germany, and weekly initial claims for unemployment insurance in the United States.
    Keywords: extended Airline model, high-frequency data, official statistics, signal extraction, unobserved-components decomposition
    JEL: C01 C02 C14 C18 C22 C40 C50
    Date: 2023
  16. By: Jared Amani Greathouse; Mani Bayani; Jason Coupet
    Abstract: Policy researchers using synthetic control methods typically choose a donor pool in part by using policy domain expertise so the untreated units are most like the treated unit in the pre-intervention period. This potentially leaves estimation open to biases, especially when researchers have many potential donors. We compare how functional principal component analysis synthetic control, forward-selection, and the original synthetic control method select donors. To do this, we use Gaussian Process simulations as well as policy case studies from West German Reunification, a hotel moratorium in Barcelona, and a sugar-sweetened beverage tax in San Francisco. We then summarize the implications for policy research and provide avenues for future work.
    Date: 2023–08
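
Illustration for item 6 (linear probability model). The abstract describes iteratively trimming observations whose OLS fitted values fall outside the unit interval. The following is a minimal sketch of one possible reading of that procedure, not the authors' code: the function names `ols` and `trimmed_ols`, the stopping rule, and the simulated ramp-model data are all assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    # Least-squares coefficients for y on the columns of X.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def trimmed_ols(X, y, max_iter=100):
    # Hypothetical helper: refit OLS after dropping observations whose
    # fitted values lie outside [0, 1], until the retained set is stable.
    keep = np.ones(len(y), dtype=bool)
    beta = ols(X, y)
    for _ in range(max_iter):
        fitted = X @ beta
        new_keep = keep & (fitted >= 0.0) & (fitted <= 1.0)
        if new_keep.sum() == keep.sum():
            break  # nothing more to trim
        if new_keep.sum() <= X.shape[1]:
            raise ValueError("too few observations left after trimming")
        keep = new_keep
        beta = ols(X[keep], y[keep])
    return beta, keep

# Simulated ramp model (assumed design): P(y = 1 | x) = clip(0.5 + 0.4 x, 0, 1).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
p = np.clip(0.5 + 0.4 * x, 0.0, 1.0)
y = (rng.uniform(size=n) < p).astype(float)
X = np.column_stack([np.ones(n), x])

beta_ols = ols(X, y)
beta_trim, keep = trimmed_ols(X, y)
print("full-sample OLS:", beta_ols)
print("trimmed OLS:    ", beta_trim, "retained:", int(keep.sum()))
```

In this assumed design the full-sample slope is attenuated because the response probability is flat in the tails, while the trimmed fit uses only observations in the region the linear fit can represent.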

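Illustration for item 13 (forecast combinations). The abstract notes that, given the mean vector and covariance matrix of the individual forecast errors, the optimal linear combination is a mean-variance portfolio problem in which negative weights are allowed. The sketch below uses the textbook closed form for weights that sum to one and minimize the mean-squared error; it is not the authors' estimator, and the function names and the numerical inputs are assumptions chosen for illustration.

```python
import numpy as np

def combination_weights(mu, sigma):
    # Minimize w' M w subject to sum(w) = 1, where M = sigma + mu mu'
    # is the second-moment matrix of the forecast errors.
    # Closed form: w = M^{-1} 1 / (1' M^{-1} 1). Weights may be negative.
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    m = sigma + np.outer(mu, mu)
    ones = np.ones(len(mu))
    x = np.linalg.solve(m, ones)
    return x / (ones @ x)

def combined_mse(w, mu, sigma):
    # Mean-squared error of the combined forecast with weights w.
    m = np.asarray(sigma, dtype=float) + np.outer(mu, mu)
    return float(w @ m @ w)

# Assumed population quantities: two unbiased forecasts whose errors are
# strongly positively correlated, the second one being noisier.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 1.2],
                  [1.2, 2.0]])

w = combination_weights(mu, sigma)
equal = np.array([0.5, 0.5])
print("weights:", w)
print("MSE optimal:", combined_mse(w, mu, sigma))
print("MSE equal:  ", combined_mse(equal, mu, sigma))
```

With these assumed inputs the noisier forecast receives a negative weight: because its error is highly correlated with that of the better forecast, shorting it offsets part of the shared error, which is the hedging interpretation the abstract describes.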
This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.