
on Econometrics 
By:  Hervé Cardot (Université de Bourgogne Franche-Comté); Antonio Musolesi (Università degli Studi di Ferrara) 
Abstract:  We introduce a statistical model combining a continuous response regression model, which can take either positive or negative values, and a mass at zero. The proposed zero-inflated regression model may be appropriate in many empirical circumstances, such as unobserved effects panel data models, difference-in-differences treatment effect estimation and, more generally, when the dependent variable is expressed in terms of variation over time. We provide a mathematical formalization by means of conditional mixtures, and we first show that in this context the classical ordinary least squares estimator is generally biased. We then propose a subset estimator based on the subsample of units for which the dependent variable has non-null values and derive its asymptotic properties under a conditional independence assumption. Such an estimator can be used, along with a binary response model for the conditional probability of facing a mass at zero, to compute the partial effects arising from zero-inflated regression models. We prove the asymptotic normality of the estimator as well as consistency of the empirical bootstrap. Then, we focus on unobserved effects panel data models and on difference-in-differences estimation under zero inflation and propose an estimator of the average treatment effect that is proven to be consistent. We finally provide a Monte Carlo simulation study as well as empirical illustrations showing the usefulness of the proposed approach and bringing new insights into the size of the bias in commonly used regression models, which are based on the assumption that the response variable is continuous. 
Keywords:  Mixture of Distributions; Zero Inflation; Bootstrap; Panel Data; Policy Evaluation. 
JEL:  C21 C23 C25 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:srt:wpaper:1121&r= 
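As a minimal sketch of the mechanism described in the abstract above (the data-generating process, coefficients, and variable names here are purely illustrative assumptions, not taken from the paper): pooled OLS on a zero-inflated outcome is biased for the latent slope, while OLS on the non-zero subsample recovers it when the zero indicator is independent of the latent outcome given the covariates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)

# Latent continuous outcome; a covariate-dependent Bernoulli mass at zero
# mixes it with Y = 0 (illustrative DGP, not the paper's).
y_star = 1.0 + 2.0 * x + rng.normal(size=n)
p_zero = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))   # zero-probability depends on x
zero = rng.random(n) < p_zero
y = np.where(zero, 0.0, y_star)

def ols_slope(xv, yv):
    X = np.column_stack([np.ones_like(xv), xv])
    return np.linalg.lstsq(X, yv, rcond=None)[0][1]

naive = ols_slope(x, y)                  # pooled OLS on the mixed outcome
subset = ols_slope(x[~zero], y[~zero])   # subset estimator: non-zero units only

print(f"naive OLS slope:  {naive:.2f}")
print(f"subset OLS slope: {subset:.2f}  (true latent slope = 2)")
```

In this particular design the naive slope is badly attenuated because the zero mass distorts the conditional mean of the observed outcome, while the subset estimator is consistent under the conditional independence assumption; combined with a binary response model for the zero probability, it would feed the partial-effect computation the abstract mentions.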
By:  Majed Dodin 
Abstract:  We propose a method to remedy finite-sample coverage problems and improve upon the efficiency of commonly employed procedures for the construction of nonparametric confidence intervals in regression kink designs. The proposed interval is centered at the half-length optimal, numerically obtained linear minimax estimator over distributions with a Lipschitz-constrained conditional mean function. Its construction ensures excellent finite-sample coverage and length properties, which are demonstrated in a simulation study and an empirical illustration. Given the Lipschitz constant that governs how much curvature one plausibly allows for, the procedure is fully data-driven, computationally inexpensive, incorporates shape constraints and is valid irrespective of the distribution of the assignment variable. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.10713&r= 
By:  Yuqian Zhang; Jelena Bradic; Weijie Ji 
Abstract:  This paper considers inference for heterogeneous treatment effects in dynamic settings where covariates and treatments are longitudinal. We focus on high-dimensional cases where the covariate vector's dimension, $d$, is potentially much larger than the sample size, $N$. Marginal structural mean models are considered. We propose a "sequential model doubly robust" estimator constructed from "moment targeted" nuisance estimators. Such nuisance estimators are carefully designed through nonstandard loss functions, reducing the bias resulting from potential model misspecification. We achieve $\sqrt{N}$-inference even when model misspecification occurs, requiring only one nuisance model to be correctly specified at each time point. These model correctness conditions are weaker than those in the existing work, including the literature on low-dimensional settings. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.06818&r= 
By:  Matthieu Garcin; Maxime L. D. Nicolas 
Abstract:  A theoretical expression is derived for the mean squared error of a nonparametric estimator of the tail dependence coefficient, depending on a threshold that defines which rank delimits the tails of a distribution. We propose a new method to optimally select this threshold. It combines the theoretical mean squared error of the estimator with a parametric estimation of the copula linking observations in the tails. Using simulations, we compare this semiparametric method with other approaches proposed in the literature, including the plateau-finding algorithm. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.11128&r= 
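A standard nonparametric estimator of the (lower) tail dependence coefficient, and the role of the rank threshold the abstract refers to, can be sketched as follows. The mixture DGP below is an illustrative assumption chosen because its true coefficient is known in closed form; it is not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Illustrative DGP: with probability w both margins share the same uniform
# draw, otherwise they are independent. The true lower tail dependence
# coefficient of this mixture is w.
w = 0.4
u_common = rng.random(n)
u1, u2 = rng.random(n), rng.random(n)
shared = rng.random(n) < w
x = np.where(shared, u_common, u1)
y = np.where(shared, u_common, u2)

def lower_tdc(u, v, k):
    """Empirical lower tail dependence at rank threshold k:
    estimates P(V in its k lowest ranks | U in its k lowest ranks)."""
    n = len(u)
    ru = np.argsort(np.argsort(u)) + 1   # ranks 1..n
    rv = np.argsort(np.argsort(v)) + 1
    return np.mean((ru <= k) & (rv <= k)) / (k / n)

est = lower_tdc(x, y, k=n // 50)   # threshold = 2% empirical quantile
print(f"estimated lambda_L: {est:.2f}  (true value {w})")
```

The choice `k = n // 50` is arbitrary here; the bias-variance trade-off in that choice (a smaller k is closer to the limit but noisier) is exactly what the paper's threshold-selection method addresses.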
By:  Xiu Xu; Weining Wang; Yongcheol Shin; Chaowen Zheng 
Abstract:  We propose a dynamic network quantile regression model to investigate quantile connectedness using predetermined network information. We extend the existing network quantile autoregression model of Zhu et al. (2019b) by explicitly allowing for contemporaneous network effects and controlling for common factors across quantiles. To cope with the endogeneity issue due to simultaneous network spillovers, we adopt instrumental variable quantile regression (IVQR) estimation and derive the consistency and asymptotic normality of the IVQR estimator using the near epoch dependence property of the network process. Via Monte Carlo simulations, we confirm the satisfactory performance of the IVQR estimator across quantiles under different network structures. Finally, we demonstrate the usefulness of our proposed approach with an application to stocks traded on the NYSE and NASDAQ in 2016. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.07633&r= 
By:  Taiga Tsubota 
Abstract:  We study identification of dynamic discrete choice models with hyperbolic discounting. We show that the standard discount factor, the present bias factor, the instantaneous utility functions, and the sophisticated agent's perceived conditional choice probabilities are point-identified in a finite horizon model. The main idea for achieving identification is to exploit variation in the observed conditional choice probabilities over time. We also show that, if the data include an additional state variable, the identification result remains valid with less severe requirements on the number of time periods in the data. We also present an estimation method and demonstrate the good performance of the estimator by simulation. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.10721&r= 
By:  Joshua C. C. Chan; Gary Koop; Xuewen Yu 
Abstract:  Many popular specifications for Vector Autoregressions (VARs) with multivariate stochastic volatility are not invariant to the way the variables are ordered due to the use of a Cholesky decomposition for the error covariance matrix. We show that the order invariance problem in existing approaches is likely to become more serious in large VARs. We propose a specification which avoids this Cholesky decomposition. We show that the presence of multivariate stochastic volatility allows for identification of the proposed model and prove that it is invariant to ordering. We develop a Markov Chain Monte Carlo algorithm which allows for Bayesian estimation and prediction. In exercises involving artificial and real macroeconomic data, we demonstrate that the choice of variable ordering can have non-negligible effects on empirical results. In a macroeconomic forecasting exercise involving VARs with 20 variables, we find that our order-invariant approach leads to the best forecasts and that some choices of variable ordering can lead to poor forecasts under a conventional, non-order-invariant approach. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.07225&r= 
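The ordering problem the abstract describes stems from a basic fact about the Cholesky factorization, which a few lines of linear algebra illustrate (the covariance matrix below is an arbitrary example, not from the paper): reordering the variables, factorizing, and mapping back yields a different factor of the same covariance matrix.

```python
import numpy as np

# Reduced-form error covariance for a hypothetical 3-variable VAR.
sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 2.0, 0.4],
                  [0.3, 0.4, 1.5]])

# Cholesky factor under the original ordering (1, 2, 3).
B = np.linalg.cholesky(sigma)

# Reorder variables to (2, 1, 3), factorize, and map back to the original order.
perm = [1, 0, 2]
P = np.eye(3)[perm]
B_perm = P.T @ np.linalg.cholesky(P @ sigma @ P.T) @ P

# Both factorizations reproduce the same covariance matrix...
assert np.allclose(B @ B.T, sigma)
assert np.allclose(B_perm @ B_perm.T, sigma)

# ...but imply different contemporaneous impact matrices, so any object built
# from the factor (impulse responses, volatility decompositions) depends on
# the chosen ordering.
print("max |B - B_perm| =", np.abs(B - B_perm).max())
```

This is why a specification that avoids the Cholesky decomposition altogether, as the paper proposes, delivers order-invariant results.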
By:  Max Cytrynbaum 
Abstract:  This paper studies treatment effect estimation in a novel two-stage model of experimentation. In the first stage, using baseline covariates, the researcher selects units to participate in the experiment from a sample of eligible units. Next, they assign each selected unit to one of two treatment arms. We relate estimator efficiency to representative selection of participants and balanced assignment of treatments. We define a new family of local randomization procedures, which can be used for both selection and assignment. This family nests stratified block randomization and matched pairs, the most commonly used designs in development economics practice, but also produces many useful new designs, embedding them in a unified framework. When used to select representative units into the experiment, local randomization boosts effective sample size, making estimators behave as if they were estimated using a larger experiment. When used for treatment assignment, local randomization performs model-free nonparametric regression adjustment by design. We give novel asymptotically exact inference methods for locally randomized selection and assignment, allowing experimenters to report smaller confidence intervals if they designed a representative experiment. We apply our methods to the two-wave design setting, where the researcher has access to a pilot study when designing the main experiment. We use local randomization methods to give the first fully efficient solution to this problem. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.08157&r= 
By:  Oscar Engelbrektson 
Abstract:  This paper extends the literature on the theoretical properties of synthetic controls to the case of nonlinear generative models, showing that the synthetic control estimator is generally biased in such settings. I derive a lower bound for the bias, showing that the only component of it affected by the choice of synthetic control is the weighted sum of pairwise differences between the treated unit and the untreated units in the synthetic control. To address this bias, I propose a novel synthetic control estimator that allows for a constant difference between the synthetic control and the treated unit in the pre-treatment period, and that penalizes the pairwise discrepancies. Allowing for a constant offset makes the model more flexible, thus creating a larger set of potential synthetic controls, and the penalization term allows for the selection of the potential solution that minimizes bias. I study the properties of this estimator and propose a data-driven process for parameterizing the penalization term. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.10784&r= 
By:  Junhui Cai; Dan Yang; Wu Zhu; Haipeng Shen; Linda Zhao 
Abstract:  Centrality is a popular metric for an agent's position in a network and is often used in regression models to capture the network effect on an outcome variable of interest. In empirical studies, researchers often adopt a two-stage procedure: first estimate the centrality, then infer the network effect using the estimated centrality. Despite its prevalent adoption, this two-stage procedure lacks theoretical backing and can fail in both estimation and inference. We therefore propose a unified framework, under which we prove the shortcomings of the two-stage procedure in centrality estimation and its undesirable consequences for the regression. We then propose a novel supervised network centrality estimation (SuperCENT) methodology that simultaneously yields superior estimates of both the centrality and the network effect, and provides valid and narrower confidence intervals than those from the two-stage procedure. We showcase the superiority of SuperCENT in predicting the currency risk premium based on the global trade network. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.12921&r= 
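A stylized version of the two-stage procedure, and one way first-stage noise can distort the second stage, can be sketched as follows. The network model, the outcome equation, and the use of classical measurement error as a stand-in for first-stage estimation noise are all illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400

# Hypothetical network with heterogeneous expected degrees, so centrality varies.
p = rng.uniform(0.05, 0.5, size=n)
A = (rng.random((n, n)) < np.outer(p, p)).astype(float)
A = np.triu(A, 1)
A = A + A.T                                  # undirected, no self-loops

def eigen_centrality(M, iters=500):
    """Leading eigenvector by power iteration (Perron vector)."""
    v = np.ones(M.shape[0])
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)
    return v

c = eigen_centrality(A)
y = 3.0 * c + 0.01 * rng.normal(size=n)      # outcome with network effect beta = 3

def slope(xv, yv):
    X = np.column_stack([np.ones_like(xv), xv])
    return np.linalg.lstsq(X, yv, rcond=None)[0][1]

# Second stage with the exact centrality recovers beta...
b_true = slope(c, y)

# ...but noise in the estimated centrality (proxied here by classical
# measurement error of the same scale as the centrality spread) attenuates it.
c_noisy = c + np.std(c) * rng.normal(size=n)
b_noisy = slope(c_noisy, y)

print(f"beta with exact centrality: {b_true:.2f}")
print(f"beta with noisy centrality: {b_noisy:.2f}")
```

The attenuation in the second print line is the kind of two-stage failure the paper formalizes; SuperCENT's joint estimation of centrality and the regression is designed to avoid it.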
By:  Riccardo D'Adamo 
Abstract:  This paper studies the problem of estimating individualized treatment rules when treatment effects are partially identified, as is often the case with observational data. We first study the population problem of assigning treatment under partial identification and derive the population optimal policies using classic optimality criteria for decision under ambiguity. We then propose an algorithm for computing the estimated optimal treatment policy and provide statistical guarantees for its convergence to the population counterpart. Our estimation procedure leverages recent advances in the orthogonal machine learning literature, while our theoretical results account for the presence of non-differentiabilities in the problem. The proposed methods are illustrated using data from the Job Training Partnership Act study. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.10904&r= 
By:  Blankmeyer, Eric 
Abstract:  A correlation between regressors and disturbances presents challenging problems in linear regression. In the context of spatial econometrics, LeSage and Pace (2009) show that an autoregressive model estimated by maximum likelihood may be able to detect least-squares bias. I suggest that spatial neighbors can be replaced by “peer groups” as in Blankmeyer et al. (2011), thereby considerably extending the range of contexts in which the autoregressive model can be utilized. The procedure is applied to two data sets and in a simulation. 
Keywords:  peer groups, least-squares bias, spatial autoregression 
JEL:  C4 
Date:  2021–11–15 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:110866&r= 
By:  Xingwei Hu 
Abstract:  In modeling multivariate time series, for either forecasting or policy analysis, it is beneficial to identify the cause-effect relations within the data. Regression analysis, however, generally captures correlation, and little research has focused on variance analysis for causality discovery. We first set up an equilibrium for the cause-effect relations using a fictitious vector autoregressive model. In the equilibrium, long-run relations are identified from noise, and spurious ones are negligibly close to zero. The solution, called the causality distribution, measures the relative strength causing the movement of all series or of specific affected ones. If a group of exogenous data affects the others but not vice versa, then, in theory, the causality distribution for the other variables is necessarily zero. A hypothesis test of zero causality provides the rule for deciding whether a variable is endogenous. Our new approach shows high accuracy in identifying the true cause-effect relations among the data in simulation studies. We also apply the approach to estimating the causal factors' contribution to climate change. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.07465&r= 
By:  Claudia Shi; Dhanya Sridhar; Vishal Misra; David M. Blei 
Abstract:  Synthetic control (SC) methods have been widely applied to estimate the causal effect of large-scale interventions, e.g., the statewide effect of a change in policy. The idea of synthetic controls is to approximate one unit's counterfactual outcomes using a weighted combination of some other units' observed outcomes. The motivating question of this paper is: how does the SC strategy lead to valid causal inferences? We address this question by reformulating the causal inference problem targeted by SC with a more fine-grained model, where we change the unit of analysis from "large units" (e.g., states) to "small units" (e.g., individuals in states). Under this reformulation, we derive sufficient conditions for nonparametric identification of the causal effect. We highlight two implications of the reformulation: (1) it clarifies where "linearity" comes from, and how it falls naturally out of the more fine-grained and flexible model, and (2) it suggests new ways of using available data with SC methods for valid causal inference, in particular, new ways of selecting the observations from which to estimate the counterfactual. 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2112.05671&r= 
By:  Sokol, Andrej 
Abstract:  I propose a new model, conditional quantile regression (CQR), that generates density forecasts consistent with a specific view of the future evolution of some variables. This addresses a shortcoming of existing quantile regression-based models, for example the at-risk framework popularised by Adrian et al. (2019), when used in settings, such as most forecasting processes within central banks and similar institutions, that require forecasts to be conditional on a set of technical assumptions. Through an application to house price inflation in the euro area, I show that CQR provides a viable alternative to existing approaches to conditional density forecasting, notably Bayesian VARs, with considerable advantages in terms of flexibility and additional insights that do not come at the cost of forecasting performance. 
JEL:  C22 C53 E37 R31 
Keywords:  at-risk, conditional forecasting, density forecast evaluation, house prices, quantile regression 
Date:  2021–12 
URL:  http://d.repec.org/n?u=RePEc:ecb:ecbwps:20212624&r= 
By:  Clements, Adam (Queensland University of Technology, Australia); Hurn, Stan (Queensland University of Technology, Australia); Volkov, Vladimir (Tasmanian School of Business & Economics, University of Tasmania) 
Abstract:  Forecasting intraday trading volume is an important problem in economics and finance. One influential approach to this objective is the nonlinear Component Multiplicative Error Model (CMEM), which captures time series dependence and intraday periodicity in volume. While the model is well suited to dealing with a non-negative time series, it is relatively cumbersome to implement. This paper proposes a system of linear equations, estimated using ordinary least squares, that provides at least as good a forecasting performance as the CMEM. This linear specification can easily be applied to model any time series that exhibits diurnal behaviour. 
Keywords:  Volume, forecasting, high-frequency data, CMEM, diurnal 
JEL:  C22 G00 
Date:  2021 
URL:  http://d.repec.org/n?u=RePEc:tas:wpaper:38716&r= 
By:  Myoungjae Lee; Sanghyeok Lee 
Abstract:  Applying Difference in Differences (DD) to a limited dependent variable (LDV) Y has been problematic; this paper addresses the issue for binary, count, categorical, censored and fractional responses. The DD effect on a latent Y* can be found using a qualification dummy Q, a time dummy S and the treatment QS in the Y* model, which, however, does not satisfy the critical 'parallel trend' assumption for Y. We show that the assumption holds in different forms for LDV Y: 'ratio in ratios' or 'ratio in odds ratios'. Our simulation and empirical studies show that Poisson Quasi-MLE for non-negative Y and (multinomial) logit for binary, fractional and categorical Y work well. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.12948&r= 
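The 'ratio in ratios' form has a concrete sample counterpart: in a saturated specification, the Poisson QMLE coefficient on QS equals the log of the ratio of post/pre mean ratios across groups, since a saturated Poisson regression fits the four cell means exactly. A minimal illustration, with an assumed data-generating process:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40_000

q = rng.integers(0, 2, n)       # treated group indicator Q
s = rng.integers(0, 2, n)       # post-period indicator S
tau = 0.5                       # multiplicative treatment effect on the log mean

# Count outcome whose conditional mean is multiplicative in (Q, S, QS).
mu = np.exp(0.2 + 0.3 * q + 0.4 * s + tau * q * s)
y = rng.poisson(mu)

# 'Ratio in ratios': the saturated Poisson QMLE interaction coefficient
# equals the log of the ratio of post/pre cell-mean ratios across groups.
def cell_mean(qv, sv):
    return y[(q == qv) & (s == sv)].mean()

rr = (cell_mean(1, 1) / cell_mean(1, 0)) / (cell_mean(0, 1) / cell_mean(0, 0))
tau_hat = np.log(rr)
print(f"tau_hat = {tau_hat:.2f}  (true tau = {tau})")
```

With covariates beyond the saturated dummies, the closed form no longer applies and one would fit the Poisson QMLE directly, but the identifying moment is the same ratio-in-ratios condition.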
By:  Christian Bongiorno; Damien Challet; Grégoire Loeper 
Abstract:  We propose a data-driven way to clean covariance matrices in strongly non-stationary systems. Our method rests on long-term averaging of optimal eigenvalues obtained from temporally contiguous covariance matrices, which encodes the average influence of the future on present eigenvalues. This zeroth-order approximation outperforms optimal methods designed for stationary systems. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.13109&r= 
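A loose, simplified sketch of the idea — taking eigenvectors from a past window, scoring their 'optimal' (oracle) eigenvalues on the adjacent future window, and averaging those eigenvalues over time — might look as follows. The factor-model data and window sizes are illustrative assumptions, and this is not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(4)
T, N, win = 2000, 20, 60

# Returns with slowly time-varying single-factor loadings (non-stationary).
betas = 1.0 + 0.5 * np.sin(np.linspace(0, 6, T))[:, None] * rng.normal(size=(1, N))
factor = rng.normal(size=(T, 1))
X = betas * factor + rng.normal(size=(T, N))

oracle_eigs = []
for t in range(win, T - win, win):
    past, future = X[t - win:t], X[t:t + win]
    C_past = np.cov(past, rowvar=False)
    _, V = np.linalg.eigh(C_past)                   # eigenvectors from the past...
    C_future = np.cov(future, rowvar=False)
    oracle_eigs.append(np.diag(V.T @ C_future @ V)) # ...scored on the future

# Long-term average of the 'future-optimal' eigenvalues: the cleaned spectrum.
cleaned = np.mean(oracle_eigs, axis=0)
print(f"cleaned eigenvalue range: [{cleaned.min():.2f}, {cleaned.max():.2f}]")
```

The quantity `diag(V.T @ C_future @ V)` is the out-of-sample variance captured by each past eigenvector, which is what makes the averaged eigenvalues encode "the influence of the future on present eigenvalues" rather than in-sample noise.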
By:  Mr. Jorge A ChanLau 
Abstract:  We introduce unFEAR, Unsupervised Feature Extraction Clustering, to identify economic crisis regimes. Given labeled crisis and non-crisis episodes and the corresponding feature values, unFEAR uses unsupervised representation learning and a novel mode contrastive autoencoder to group episodes into time-invariant, non-overlapping clusters, each of which can be identified with a different regime. The likelihood that a country may experience an economic crisis can be set equal to its cluster's crisis frequency. Moreover, unFEAR can serve as a first step towards developing cluster-specific crisis prediction models tailored to each crisis regime. 
Keywords:  clustering; unsupervised feature extraction; autoencoder; deep learning; biased label problem; crisis prediction; WP; crisis frequency; crisis observation; crisis risk; crisis data points; machine learning; Early warning systems; Global 
Date:  2020–11–25 
URL:  http://d.repec.org/n?u=RePEc:imf:imfwpa:2020/262&r= 
By:  Kenji Hatakenaka (Graduate School of Economics, Osaka University); Kosuke Oya (Graduate School of Economics, Osaka University) 
Abstract:  Price discovery is an important built-in function of financial markets and a central issue in market microstructure research. Market participants need to know whether price discovery has been achieved, or how much progress has been made, in order to trade at a price they consider appropriate. Since various economic events, such as earnings announcements, affect price discovery, the intraday transition of price discovery varies from day to day. In this study, we propose a statistical method to determine when and how fast intraday price discovery progresses, using high-frequency price series on a daily basis. The proposed method consists of estimating three candidate models that gauge different types of price discovery progress, i.e., no progress, smooth progress and abrupt progress, and selecting the most appropriate model using a Bayesian approach. We conduct a simulation analysis to assess the performance of the proposed method and confirm that it depicts the state of price discovery appropriately. An empirical study using the Japanese stock market index shows that the proposed method categorizes the intraday price discovery process well on a daily basis. 
Keywords:  preopening period, market microstructure, partial adjustment model 
JEL:  C11 G14 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:osk:wpaper:2119&r= 
By:  Jonathan Roth; Guillaume SaintJacques; YinYin Yu 
Abstract:  This paper extends Becker (1957)'s outcome test of discrimination to settings where a (human or algorithmic) decision-maker produces a ranked list of candidates. Ranked lists are particularly relevant in the context of online platforms that produce search results or feeds, and also arise when human decision-makers express ordinal preferences over a list of candidates. We show that non-discrimination implies a system of moment inequalities, which intuitively impose that one cannot permute the position of a lower-ranked candidate from one group with a higher-ranked candidate from a second group and systematically improve the objective. Moreover, we show that these moment inequalities are the only testable implications of non-discrimination when the auditor observes only outcomes and group membership by rank. We show how to statistically test the implied inequalities and validate our approach in an application using data from LinkedIn. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.07889&r= 
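The flavor of the outcome test for ranked lists can be illustrated with a hypothetical biased ranker: if one group must clear a higher quality bar to reach a given rank, its realized outcomes at those ranks will be systematically better, which is the pattern the moment inequalities detect. All quantities below are assumptions for illustration, not the paper's data or test statistic.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000

# Hypothetical DGP: the ranker systematically under-ranks group B, so a
# group-B candidate needs higher latent quality to reach the same position.
group = rng.integers(0, 2, n)            # 0 = group A, 1 = group B
quality = rng.normal(size=n)             # latent outcome-relevant quality
score = quality - 0.5 * group            # biased ranking score
order = np.argsort(-score)               # ranked list, best first
outcome = quality[order]                 # realized outcome by rank
g = group[order]

# Becker-style outcome comparison within rank buckets: absent discrimination,
# group-B outcomes should not systematically exceed group-A outcomes at the
# same ranks; otherwise swapping candidates across groups improves the list.
buckets = np.array_split(np.arange(n), 10)
top = buckets[0]
out_a = outcome[top][g[top] == 0].mean()
out_b = outcome[top][g[top] == 1].mean()
print(f"top-decile mean outcome: A={out_a:.2f}  B={out_b:.2f}")
```

Here the gap in top-decile outcomes signals that a lower-ranked group-B candidate could be swapped with a higher-ranked group-A candidate to improve the objective, violating the non-discrimination inequalities; the paper additionally shows how to test such inequalities formally.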
By:  Luxuan Yang; Ting Gao; Yubin Lu; Jinqiao Duan; Tao Liu 
Abstract:  With the fast development of modern deep learning techniques, the study of dynamical systems and neural networks is increasingly benefiting each other in many ways. Since uncertainties often arise in real-world observations, SDEs (stochastic differential equations) come to play an important role. More specifically, in this paper we use a collection of SDEs equipped with neural networks to predict the long-term trend of noisy time series with large jumps and distribution shifts. Our contributions are: first, we use the phase space reconstruction method to extract the intrinsic dimension of the time series data so as to determine the input structure of our forecasting model. Second, we explore SDEs driven by $\alpha$-stable Lévy motion to model the time series data and solve the problem through neural network approximation. Third, we construct an attention mechanism to achieve multi-time-step prediction. Finally, we illustrate our method by applying it to stock market time series prediction and show that the results outperform several baseline deep learning models. 
Date:  2021–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2111.13164&r= 