nep-ecm New Economics Papers
on Econometrics
Issue of 2021‒12‒20
twenty-two papers chosen by
Sune Karlsson
Örebro universitet

  1. Zero-inflated regression for unobserved effects panel data models and difference-in-differences estimation By Hervé Cardot; Antonio Musolesi
  2. Optimized Inference in Regression Kink Designs By Majed Dodin
  3. Dynamic treatment effects: high-dimensional inference under model misspecification By Yuqian Zhang; Jelena Bradic; Weijie Ji
  4. Nonparametric estimator of the tail dependence coefficient: balancing bias and variance By Matthieu Garcin; Maxime L. D. Nicolas
  5. Dynamic Network Quantile Regression Model By Xiu Xu; Weining Wang; Yongcheol Shin; Chaowen Zheng
  6. Identifying Dynamic Discrete Choice Models with Hyperbolic Discounting By Taiga Tsubota
  7. Large Order-Invariant Bayesian VARs with Stochastic Volatility By Joshua C. C. Chan; Gary Koop; Xuewen Yu
  8. Designing Representative and Balanced Experiments by Local Randomization By Max Cytrynbaum
  9. Why Synthetic Control estimators are biased and what to do about it: Introducing Relaxed and Penalized Synthetic Controls By Oscar Engelbrektson
  10. Network regression and supervised centrality estimation By Junhui Cai; Dan Yang; Wu Zhu; Haipeng Shen; Linda Zhao
  11. Policy Learning Under Ambiguity By Riccardo D'Adamo
  12. Peer Groups and Bias Detection in Least Squares Regression By Blankmeyer, Eric
  13. Decoding Causality by Fictitious VAR Modeling By Xingwei Hu
  14. On the Assumptions of Synthetic Control Methods By Claudia Shi; Dhanya Sridhar; Vishal Misra; David M. Blei
  15. Fan charts 2.0: flexible forecast distributions with expert judgement By Sokol, Andrej
  16. A simple linear alternative to multiplicative error models with an application to trading volume By Clements, Adam; Hurn, Stan; Volkov, Vladimir
  17. Difference in Differences and Ratio in Ratios for Limited Dependent Variables By Myoung-jae Lee; Sanghyeok Lee
  18. Cleaning the covariance matrix of strongly nonstationary systems with time-independent eigenvalues By Christian Bongiorno; Damien Challet; Gr\'egoire Loeper
  19. UnFEAR: Unsupervised Feature Extraction Clustering with an Application to Crisis Regimes Classification By Mr. Jorge A Chan-Lau
  20. Bayesian inference for time varying partial adjustment model with application to intraday price discovery By Kenji Hatakenaka; Kosuke Oya
  21. An Outcome Test of Discrimination for Ranked Lists By Jonathan Roth; Guillaume Saint-Jacques; YinYin Yu
  22. Time Series Forecasting with Ensembled Stochastic Differential Equations Driven by L\'evy Noise By Luxuan Yang; Ting Gao; Yubin Lu; Jinqiao Duan; Tao Liu

  1. By: Hervé Cardot (Université de Bourgogne Franche-Comté); Antonio Musolesi (Università degli Studi di Ferrara)
    Abstract: We introduce a statistical model combining a continuous response regression model, which can take either positive or negative values, and a mass at zero. The proposed zero-inflated regression model may be appropriate in many empirical circumstances such as unobserved effects panel data models, difference-in-differences treatment effect estimation and, more generally, when the dependent variable is expressed in terms of variation over time. We provide a mathematical formalization by means of conditional mixtures, and we first show that in this context the classical ordinary least squares estimator is generally biased. We then propose a subset estimator based on the subsample of units for which the dependent variable has non-null values and derive its asymptotic properties under a conditional independence assumption. Such an estimator can be used, along with a binary response model for the conditional probability of facing a mass at zero, to compute the partial effects arising from zero-inflated regression models. We prove the asymptotic normality of the estimator as well as the consistency of the empirical bootstrap. We then focus on unobserved effects panel data models and on difference-in-differences estimation under zero inflation, and propose an estimator of the average treatment effect that is proven to be consistent. We finally provide a Monte Carlo simulation study as well as empirical illustrations showing the usefulness of the proposed approach and bringing new insights into the size of the bias in commonly used regression models, which are based on the assumption that the response variable is continuous.
    Keywords: Mixture of Distributions; Zero Inflation; Bootstrap; Panel Data; Policy Evaluation.
    JEL: C21 C23 C25
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:srt:wpaper:1121&r=
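The subset estimator at the heart of the paper can be illustrated with a stylized simulation (not from the paper; the data-generating process and all names below are illustrative assumptions): when the probability of a zero outcome depends on a covariate, pooled OLS is biased for the slope of the continuous component, while OLS on the non-zero subsample recovers it under conditional independence.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
p_zero = 1 / (1 + np.exp(-(0.5 - 0.5 * x)))    # P(Y = 0 | x), depends on x
at_zero = rng.random(n) < p_zero
y_star = 1.0 + 2.0 * x + rng.normal(size=n)    # continuous latent response
y = np.where(at_zero, 0.0, y_star)             # observed zero-inflated outcome

X = np.column_stack([np.ones(n), x])

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

beta_pooled = ols(X, y)          # pooled OLS over all units: slope far from 2
nz = y != 0
beta_subset = ols(X[nz], y[nz])  # subset estimator on non-zero units: slope near 2
```

A binary response model (e.g. logit of `at_zero` on `x`) would then supply the zero-probability component needed for partial effects, as the abstract describes.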
  2. By: Majed Dodin
    Abstract: We propose a method to remedy finite sample coverage problems and improve upon the efficiency of commonly employed procedures for the construction of nonparametric confidence intervals in regression kink designs. The proposed interval is centered at the half-length optimal, numerically obtained linear minimax estimator over distributions with Lipschitz constrained conditional mean function. Its construction ensures excellent finite sample coverage and length properties which are demonstrated in a simulation study and an empirical illustration. Given the Lipschitz constant that governs how much curvature one plausibly allows for, the procedure is fully data driven, computationally inexpensive, incorporates shape constraints and is valid irrespective of the distribution of the assignment variable.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.10713&r=
  3. By: Yuqian Zhang; Jelena Bradic; Weijie Ji
    Abstract: This paper considers inference for heterogeneous treatment effects in dynamic settings where covariates and treatments are longitudinal. We focus on high-dimensional cases in which the covariate vector's dimension, $d$, is potentially much larger than the sample size, $N$, and consider marginal structural mean models. We propose a "sequential model doubly robust" estimator constructed from "moment targeted" nuisance estimators. These nuisance estimators are carefully designed through non-standard loss functions that reduce the bias resulting from potential model misspecification. We achieve $\sqrt N$-inference even when model misspecification occurs, requiring only one nuisance model to be correctly specified at each time point. These model-correctness conditions are weaker than those in all existing work, including the literature on low dimensions.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.06818&r=
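The double-robustness idea can be illustrated in a stylized single-period setting (an illustrative simplification; the paper's estimator is sequential and high-dimensional, and the data-generating process below is an assumption): an augmented inverse-propensity-weighted (AIPW) estimate stays close to the true effect even when the outcome model is deliberately misspecified, provided the propensity model is correct.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20000
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-x))                 # true propensity, taken as known here
a = (rng.random(n) < p).astype(float)
y = 2.0 * a + x + rng.normal(size=n)     # true average treatment effect = 2

# deliberately misspecified outcome models: intercept only, ignoring x
mu1 = np.full(n, y[a == 1].mean())
mu0 = np.full(n, y[a == 0].mean())

# AIPW correction term repairs the outcome-model bias via the propensity
aipw = np.mean(mu1 - mu0 + a * (y - mu1) / p - (1 - a) * (y - mu0) / (1 - p))
naive = y[a == 1].mean() - y[a == 0].mean()   # confounded comparison
```

The naive difference in means is biased upward because treated units have systematically larger `x`, while the AIPW estimate remains near 2.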
  4. By: Matthieu Garcin; Maxime L. D. Nicolas
    Abstract: A theoretical expression is derived for the mean squared error of a nonparametric estimator of the tail dependence coefficient, depending on a threshold that defines which rank delimits the tails of a distribution. We propose a new method to optimally select this threshold. It combines the theoretical mean squared error of the estimator with a parametric estimation of the copula linking observations in the tails. Using simulations, we compare this semiparametric method with other approaches proposed in the literature, including the plateau-finding algorithm.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.11128&r=
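The nonparametric estimator under discussion has a simple empirical form; a minimal sketch (the threshold choice and data below are illustrative assumptions, not the paper's optimal selection rule):

```python
import numpy as np

def upper_tail_dependence(x, y, k):
    """Empirical upper-tail dependence at rank threshold k: the share of the
    k largest x-observations whose paired y is also among the k largest."""
    n = len(x)
    rx = np.argsort(np.argsort(x))   # ranks 0 .. n-1
    ry = np.argsort(np.argsort(y))
    return np.sum((rx >= n - k) & (ry >= n - k)) / k

rng = np.random.default_rng(1)
n, k = 20000, 200
z = rng.normal(size=n)
lam_comonotone = upper_tail_dependence(z, z, k)    # true coefficient is 1
lam_independent = upper_tail_dependence(rng.normal(size=n),
                                        rng.normal(size=n), k)   # true value 0
```

The paper's contribution is choosing `k` optimally: too small a `k` inflates variance, too large a `k` pulls non-tail observations into the estimate and inflates bias.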
  5. By: Xiu Xu; Weining Wang; Yongcheol Shin; Chaowen Zheng
    Abstract: We propose a dynamic network quantile regression model to investigate quantile connectedness using predetermined network information. We extend the existing network quantile autoregression model of Zhu et al. (2019b) by explicitly allowing for contemporaneous network effects and controlling for common factors across quantiles. To cope with the endogeneity arising from simultaneous network spillovers, we adopt instrumental variable quantile regression (IVQR) estimation and derive the consistency and asymptotic normality of the IVQR estimator using the near-epoch dependence property of the network process. Via Monte Carlo simulations, we confirm the satisfactory performance of the IVQR estimator across quantiles under different network structures. Finally, we demonstrate the usefulness of our proposed approach with an application to a dataset of stocks traded on the NYSE and NASDAQ in 2016.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.07633&r=
  6. By: Taiga Tsubota
    Abstract: We study identification of dynamic discrete choice models with hyperbolic discounting. We show that the standard discount factor, present-bias factor, instantaneous utility functions, and the sophisticated agent's perceived conditional choice probabilities are point-identified in a finite-horizon model. The main idea behind the identification is to exploit variation in the observed conditional choice probabilities over time. We further show that, if the data contain an additional state variable, the identification result remains valid under less severe requirements on the number of time periods in the data. Finally, we present an estimation method and demonstrate the good performance of the estimator in simulations.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.10721&r=
  7. By: Joshua C. C. Chan; Gary Koop; Xuewen Yu
    Abstract: Many popular specifications for Vector Autoregressions (VARs) with multivariate stochastic volatility are not invariant to the way the variables are ordered due to the use of a Cholesky decomposition for the error covariance matrix. We show that the order invariance problem in existing approaches is likely to become more serious in large VARs. We propose the use of a specification which avoids the use of this Cholesky decomposition. We show that the presence of multivariate stochastic volatility allows for identification of the proposed model and prove that it is invariant to ordering. We develop a Markov Chain Monte Carlo algorithm which allows for Bayesian estimation and prediction. In exercises involving artificial and real macroeconomic data, we demonstrate that the choice of variable ordering can have non-negligible effects on empirical results. In a macroeconomic forecasting exercise involving VARs with 20 variables we find that our order-invariant approach leads to the best forecasts and that some choices of variable ordering can lead to poor forecasts using a conventional, non-order invariant, approach.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.07225&r=
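The ordering problem caused by the Cholesky decomposition can be seen directly in a two-variable example (a generic illustration, not the paper's model): both orderings factor the same error covariance matrix, yet they imply different structural impact matrices.

```python
import numpy as np

Sigma = np.array([[1.0, 0.8],
                  [0.8, 2.0]])
L = np.linalg.cholesky(Sigma)               # ordering (1, 2)

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                  # permutation: swap the variables
L21 = np.linalg.cholesky(P @ Sigma @ P.T)   # ordering (2, 1)
B = P.T @ L21 @ P                           # map the factor back to ordering (1, 2)

# L and B both satisfy Sigma = F @ F.T, but they are different matrices,
# so the implied "structural" shocks depend on how the variables are ordered.
```

This is exactly the non-invariance the authors avoid by dropping the Cholesky-based parameterization.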
  8. By: Max Cytrynbaum
    Abstract: This paper studies treatment effect estimation in a novel two-stage model of experimentation. In the first stage, using baseline covariates, the researcher selects units to participate in the experiment from a sample of eligible units. Next, they assign each selected unit to one of two treatment arms. We relate estimator efficiency to representative selection of participants and balanced assignment of treatments. We define a new family of local randomization procedures, which can be used for both selection and assignment. This family nests stratified block randomization and matched pairs, the most commonly used designs in practice in development economics, but also produces many useful new designs, embedding them in a unified framework. When used to select representative units into the experiment, local randomization boosts effective sample size, making estimators behave as if they were estimated using a larger experiment. When used for treatment assignment, local randomization performs model-free non-parametric regression adjustment by design. We give novel asymptotically exact inference methods for locally randomized selection and assignment, allowing experimenters to report smaller confidence intervals if they designed a representative experiment. We apply our methods to a two-wave design, where the researcher has access to a pilot study when designing the main experiment. We use local randomization methods to give the first fully efficient solution to this problem.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.08157&r=
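The variance gains from balanced assignment can be illustrated with matched pairs, one member of the design family discussed above (a stylized simulation; the design and all numbers are illustrative assumptions): randomizing treatment within pairs matched on a predictive covariate shrinks the sampling variance of the difference-in-means estimator relative to complete randomization.

```python
import numpy as np

rng = np.random.default_rng(7)
reps, n, tau = 1000, 100, 1.0

def diff_in_means(matched):
    x = np.sort(rng.normal(size=n))          # covariate; sorting makes adjacent
    if matched:                              # units natural pair partners
        a = np.concatenate([rng.permutation([0, 1]) for _ in range(n // 2)])
    else:                                    # complete randomization
        a = rng.permutation(np.repeat([0, 1], n // 2))
    y = tau * a + 2.0 * x + rng.normal(size=n, scale=0.5)
    return y[a == 1].mean() - y[a == 0].mean()

sd_matched = np.std([diff_in_means(True) for _ in range(reps)])
sd_complete = np.std([diff_in_means(False) for _ in range(reps)])
```

Both designs are unbiased for `tau`; matching removes the between-pair covariate variation from the estimator, which is the "regression adjustment by design" effect.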
  9. By: Oscar Engelbrektson
    Abstract: This paper extends the literature on the theoretical properties of synthetic controls to the case of non-linear generative models, showing that the synthetic control estimator is generally biased in such settings. I derive a lower bound for the bias, showing that the only component of it that is affected by the choice of synthetic control is the weighted sum of pairwise differences between the treated unit and the untreated units in the synthetic control. To address this bias, I propose a novel synthetic control estimator that allows for a constant difference between the synthetic control and the treated unit in the pre-treatment period, and that penalizes the pairwise discrepancies. Allowing for a constant offset makes the model more flexible, thus creating a larger set of potential synthetic controls, and the penalization term allows for selection of the candidate solution that minimizes bias. I study the properties of this estimator and propose a data-driven process for parameterizing the penalization term.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.10784&r=
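A minimal sketch of the relaxed-and-penalized idea (the data-generating process, penalty form and tuning value are illustrative assumptions, not the paper's exact estimator): with two donors, the estimator fits a convex weight plus a free constant offset, while the penalty discourages weight on donors whose paths are far from the treated unit.

```python
import numpy as np

rng = np.random.default_rng(2)
T0 = 50                                          # pre-treatment periods
y_d1 = rng.normal(size=T0).cumsum()              # donor 1
y_d2 = y_d1 + np.linspace(0.0, 20.0, T0)         # donor 2: drifts away over time
y_tr = y_d1 + 5.0 + 0.1 * rng.normal(size=T0)    # treated: donor 1 plus offset

def objective(w, lam):
    synth = w * y_d1 + (1 - w) * y_d2
    c = np.mean(y_tr - synth)                    # "relaxed": free constant offset
    fit = np.mean((y_tr - c - synth) ** 2)
    # penalize weight on donors whose paths are far from the treated unit
    pen = w * np.mean((y_tr - y_d1) ** 2) + (1 - w) * np.mean((y_tr - y_d2) ** 2)
    return fit + lam * pen

grid = np.linspace(0.0, 1.0, 101)
w_hat = grid[np.argmin([objective(w, lam=0.1) for w in grid])]
c_hat = np.mean(y_tr - (w_hat * y_d1 + (1 - w_hat) * y_d2))
```

A classical synthetic control (no offset) could never fit the treated unit here, since every convex combination of donors misses it by roughly a constant; the relaxed estimator puts all weight on the nearby donor and absorbs the level difference into `c_hat`.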
  10. By: Junhui Cai; Dan Yang; Wu Zhu; Haipeng Shen; Linda Zhao
    Abstract: The centrality of an agent in a network is a popular metric of its network position and is often used in regression models to capture the network effect on an outcome variable of interest. In empirical studies, researchers often adopt a two-stage procedure that first estimates the centrality and then infers the network effect using the estimated centrality. Despite its prevalent adoption, this two-stage procedure lacks theoretical backing and can fail in both estimation and inference. We therefore propose a unified framework, under which we characterize the shortcomings of the two-stage procedure in centrality estimation and its undesirable consequences in the regression. We then propose a novel supervised network centrality estimation (SuperCENT) methodology that simultaneously yields superior estimates of both the centrality and the network effect, and that provides valid and narrower confidence intervals than the two-stage procedure. We showcase the superiority of SuperCENT in predicting the currency risk premium based on the global trade network.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.12921&r=
  11. By: Riccardo D'Adamo
    Abstract: This paper studies the problem of estimating individualized treatment rules when treatment effects are partially identified, as is often the case with observational data. We first study the population problem of assigning treatment under partial identification and derive the population-optimal policies using classic optimality criteria for decision under ambiguity. We then propose an algorithm for computing the estimated optimal treatment policy and provide statistical guarantees for its convergence to the population counterpart. Our estimation procedure leverages recent advances in the orthogonal machine learning literature, while our theoretical results account for the presence of non-differentiabilities in the problem. The proposed methods are illustrated using data from the Job Training Partnership Act study.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.10904&r=
  12. By: Blankmeyer, Eric
    Abstract: A correlation between regressors and disturbances presents challenging problems in linear regression. In the context of spatial econometrics, LeSage and Pace (2009) show that an autoregressive model estimated by maximum likelihood may be able to detect least-squares bias. I suggest that spatial neighbors can be replaced by “peer groups” as in Blankmeyer et al. (2011), thereby considerably extending the range of contexts where the autoregressive model can be utilized. The procedure is applied to two data sets and in a simulation.
    Keywords: peer groups, least-squares bias, spatial autoregression
    JEL: C4
    Date: 2021–11–15
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:110866&r=
  13. By: Xingwei Hu
    Abstract: In modeling multivariate time series for either forecasting or policy analysis, it would be beneficial to identify the cause-effect relations within the data. Regression analysis, however, generally captures only correlation, and little research has focused on variance analysis for causality discovery. We first set up an equilibrium for the cause-effect relations using a fictitious vector autoregressive model. In the equilibrium, long-run relations are identified from noise, and spurious ones are negligibly close to zero. The solution, called the causality distribution, measures the relative strength causing the movement of all series or of specific affected ones. If a group of exogenous data affects the others but not vice versa, then, in theory, the causality distribution for the other variables is necessarily zero. A hypothesis test of zero causality provides the rule for deciding whether a variable is endogenous. In simulation studies, our new approach identifies the true cause-effect relations among the data with high accuracy. We also apply the approach to estimating the causal factors' contribution to climate change.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.07465&r=
  14. By: Claudia Shi; Dhanya Sridhar; Vishal Misra; David M. Blei
    Abstract: Synthetic control (SC) methods have been widely applied to estimate the causal effect of large-scale interventions, e.g., the state-wide effect of a change in policy. The idea of synthetic controls is to approximate one unit's counterfactual outcomes using a weighted combination of some other units' observed outcomes. The motivating question of this paper is: how does the SC strategy lead to valid causal inferences? We address this question by re-formulating the causal inference problem targeted by SC with a more fine-grained model, where we change the unit of the analysis from "large units" (e.g., states) to "small units" (e.g., individuals in states). Under this re-formulation, we derive sufficient conditions for the non-parametric causal identification of the causal effect. We highlight two implications of the reformulation: (1) it clarifies where "linearity" comes from, and how it falls naturally out of the more fine-grained and flexible model, and (2) it suggests new ways of using available data with SC methods for valid causal inference, in particular, new ways of selecting observations from which to estimate the counterfactual.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.05671&r=
  15. By: Sokol, Andrej
    Abstract: I propose a new model, conditional quantile regression (CQR), that generates density forecasts consistent with a specific view of the future evolution of some variables. This addresses a shortcoming of existing quantile regression-based models, for example the at-risk framework popularised by Adrian et al. (2019), when used in settings, such as most forecasting processes within central banks and similar institutions, that require forecasts to be conditional on a set of technical assumptions. Through an application to house price inflation in the euro area, I show that CQR provides a viable alternative to existing approaches to conditional density forecasting, notably Bayesian VARs, with considerable advantages in terms of flexibility and additional insights that do not come at the cost of forecasting performance. JEL Classification: C22, C53, E37, R31
    Keywords: at-risk, conditional forecasting, density forecast evaluation, house prices, quantile regression
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:ecb:ecbwps:20212624&r=
  16. By: Clements, Adam (Queensland University of Technology, Australia); Hurn, Stan (Queensland University of Technology, Australia); Volkov, Vladimir (Tasmanian School of Business & Economics, University of Tasmania)
    Abstract: Forecasting intraday trading volume is an important problem in economics and finance. One influential approach is the non-linear Component Multiplicative Error Model (CMEM), which captures both time series dependence and intraday periodicity in volume. While the model is well suited to dealing with a non-negative time series, it is relatively cumbersome to implement. This paper proposes a system of linear equations, estimated by ordinary least squares, that provides forecasting performance at least as good as that of the CMEM. This linear specification can easily be applied to model any time series that exhibits diurnal behaviour.
    Keywords: Volume, forecasting, high-frequency data, CMEM, diurnal
    JEL: C22 G00
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:tas:wpaper:38716&r=
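One way such a linear specification can look (an illustrative sketch, not the authors' exact system; the data-generating process is an assumption): regress volume on a full set of intraday-bin dummies, which absorb the diurnal pattern, plus a lag, all estimated in a single OLS pass.

```python
import numpy as np

rng = np.random.default_rng(5)
bins, days = 78, 100                     # e.g. 5-minute bins over 100 days
periodic = 1.5 + np.sin(np.linspace(0.0, np.pi, bins))   # diurnal pattern
vol = np.empty(bins * days)
for t in range(len(vol)):
    prev = vol[t - 1] if t > 0 else periodic[0]
    vol[t] = 0.5 * periodic[t % bins] + 0.3 * prev + rng.exponential(0.2)

# linear specification: volume on intraday-bin dummies (absorbing the
# diurnal pattern) plus one lag, estimated by ordinary least squares
D = np.eye(bins)[np.arange(1, len(vol)) % bins]   # bin dummies for periods 1..T
X = np.column_stack([D, vol[:-1]])
beta = np.linalg.lstsq(X, vol[1:], rcond=None)[0]
phi_hat = beta[-1]                       # persistence (lag) coefficient, true 0.3
```

Unlike a multiplicative error model, nothing here requires iterative likelihood maximization, which is the practical appeal the abstract emphasizes.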
  17. By: Myoung-jae Lee; Sanghyeok Lee
    Abstract: Applying Difference in Differences (DD) to a limited dependent variable (LDV) Y has been problematic; this paper addresses the issue for binary, count, categorical, censored and fractional responses. The DD effect on a latent Y* can be found using a qualification dummy Q, a time dummy S and the treatment QS in the Y* model, which, however, does not satisfy the critical 'parallel trend' assumption for Y. We show that the assumption holds in different forms for LDV Y: 'ratio in ratios' or 'ratio in odds ratios'. Our simulation and empirical studies show that Poisson Quasi-MLE for non-negative Y and (multinomial) logit for binary, fractional and categorical Y perform well.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.12948&r=
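The ratio-in-ratios identification can be checked numerically in a stylized Poisson setting (the data-generating process and coefficient values are illustrative assumptions): with multiplicative group, period and treatment effects, the ratio of the treated group's post/pre ratio to the control group's recovers the exponentiated treatment effect.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40000
q = rng.integers(0, 2, n)   # qualification (treatment-group) dummy
s = rng.integers(0, 2, n)   # post-period dummy
tau = 0.3                   # log-scale (multiplicative) treatment effect
mu = np.exp(0.2 + 0.4 * q + 0.5 * s + tau * q * s)
y = rng.poisson(mu)

def cell_mean(a, b):
    return y[(q == a) & (s == b)].mean()

# ratio in ratios: relative change for treated over relative change for controls
rr = (cell_mean(1, 1) / cell_mean(1, 0)) / (cell_mean(0, 1) / cell_mean(0, 0))
# under the multiplicative model, rr estimates exp(tau)
```

Poisson Quasi-MLE of `y` on `q`, `s` and `q*s` would recover `tau` directly as the interaction coefficient; the cell-mean version above makes the identification transparent.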
  18. By: Christian Bongiorno; Damien Challet; Gr\'egoire Loeper
    Abstract: We propose a data-driven way to clean covariance matrices in strongly nonstationary systems. Our method rests on long-term averaging of optimal eigenvalues obtained from temporally contiguous covariance matrices, which encodes the average influence of the future on present eigenvalues. This zeroth-order approximation outperforms optimal methods designed for stationary systems.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.13109&r=
  19. By: Mr. Jorge A Chan-Lau
    Abstract: We introduce unFEAR, Unsupervised Feature Extraction Clustering, to identify economic crisis regimes. Given labeled crisis and non-crisis episodes and the corresponding feature values, unFEAR uses unsupervised representation learning and a novel mode contrastive autoencoder to group episodes into time-invariant non-overlapping clusters, each of which could be identified with a different regime. The likelihood that a country may experience an economic crisis could be set equal to its cluster's crisis frequency. Moreover, unFEAR could serve as a first step towards developing cluster-specific crisis prediction models tailored to each crisis regime.
    Keywords: clustering; unsupervised feature extraction; autoencoder; deep learning; biased label problem; crisis prediction; crisis frequency; crisis risk; machine learning; early warning systems
    Date: 2020–11–25
    URL: http://d.repec.org/n?u=RePEc:imf:imfwpa:2020/262&r=
  20. By: Kenji Hatakenaka (Graduate School of Economics, Osaka University); Kosuke Oya (Graduate School of Economics, Osaka University)
    Abstract: Price discovery is an important built-in function of financial markets and a central issue in market microstructure research. Market participants need to know whether price discovery has been achieved, or how much progress has been made, in order to trade at a price they consider appropriate. Since various economic events, such as earnings announcements, affect price discovery, its intraday transition varies from day to day. In this study, we propose a statistical method to see when and how fast intraday price discovery progresses, using high-frequency price series on a daily basis. The proposed method consists of estimating three candidate models that gauge different types of price discovery progress, i.e. no progress, smooth progress and abrupt progress, and selecting the most appropriate model using a Bayesian approach. We conduct a simulation analysis to assess the performance of the proposed method and confirm that it depicts the state of price discovery appropriately. An empirical study using the Japanese stock market index shows that the proposed method categorizes the intraday price discovery progress well on a daily basis.
    Keywords: pre-opening period, market microstructure, partial adjustment model
    JEL: C11 G14
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:osk:wpaper:2119&r=
  21. By: Jonathan Roth; Guillaume Saint-Jacques; YinYin Yu
    Abstract: This paper extends Becker (1957)'s outcome test of discrimination to settings where a (human or algorithmic) decision-maker produces a ranked list of candidates. Ranked lists are particularly relevant in the context of online platforms that produce search results or feeds, and they also arise when human decision-makers express ordinal preferences over a list of candidates. We show that non-discrimination implies a system of moment inequalities, which intuitively impose that one cannot permute the position of a lower-ranked candidate from one group with a higher-ranked candidate from a second group and systematically improve the objective. Moreover, we show that these moment inequalities are the only testable implications of non-discrimination when the auditor observes only outcomes and group membership by rank. We show how to statistically test the implied inequalities, and we validate our approach in an application using data from LinkedIn.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.07889&r=
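The flavor of the swap-based moment inequalities can be illustrated with simulated ranked lists (a stylized check of one adjacent-swap direction, not the authors' full test; the data-generating process is an illustrative assumption): when a ranker orders candidates purely by a quality signal, swapping a higher-ranked candidate of one group with the adjacent lower-ranked candidate of the other group does not raise the expected outcome.

```python
import numpy as np

rng = np.random.default_rng(4)
n_lists, depth = 2000, 10
stat, count = 0.0, 0
for _ in range(n_lists):
    group = rng.integers(0, 2, depth)
    score = rng.normal(size=depth)               # ranker's quality signal
    order = np.argsort(-score)                   # rank purely on quality:
    g, s = group[order], score[order]            # a non-discriminating ranker
    y = (rng.random(depth) < 1 / (1 + np.exp(-s))).astype(float)  # outcomes
    # adjacent cross-group pairs: group 0 ranked directly above group 1
    for r in range(depth - 1):
        if g[r] == 0 and g[r + 1] == 1:
            stat += y[r] - y[r + 1]              # gain from NOT swapping
            count += 1
swap_moment = stat / count
```

A non-discriminating ranker leaves `swap_moment` non-negative in expectation; a systematically negative value would indicate that demoting one group's candidates sacrificed the objective, the signature of discrimination the moment inequalities formalize.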
  22. By: Luxuan Yang; Ting Gao; Yubin Lu; Jinqiao Duan; Tao Liu
    Abstract: With the fast development of modern deep learning techniques, the study of dynamical systems and that of neural networks increasingly benefit each other. Since uncertainties often arise in real-world observations, stochastic differential equations (SDEs) play an important role. Specifically, in this paper we use a collection of SDEs equipped with neural networks to predict the long-term trend of noisy time series that exhibit large jumps and pronounced distribution shift. Our contributions are threefold. First, we use the phase space reconstruction method to extract the intrinsic dimension of the time series data and thereby determine the input structure of our forecasting model. Second, we explore SDEs driven by $\alpha$-stable L\'evy motion to model the time series data and solve the problem through neural network approximation. Third, we construct an attention mechanism to achieve multi-time-step prediction. Finally, we illustrate our method by applying it to stock market time series prediction and show that the results outperform several baseline deep learning models.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.13164&r=

This nep-ecm issue is ©2021 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.