nep-ecm New Economics Papers
on Econometrics
Issue of 2020‒03‒16
sixteen papers chosen by
Sune Karlsson
Örebro universitet

  1. Testing the existence of moments for GARCH processes By Francq, Christian; Zakoian, Jean-Michel
  2. Sampling properties of the Bayesian posterior mean with an application to WALS estimation By Giuseppe De Luca; Jan R. Magnus; Franco Peracchi
  3. Density Deconvolution with Laplace Errors and Unknown Variance By Jun Cai; William C. Horrace; Christopher F. Parmeter
  4. Complete Subset Averaging for Quantile Regressions By Ji Hyung Lee; Youngki Shin
  5. Identification of the average treatment effect when SUTVA is violated By Lafférs, Lukáš; Mellace, Giovanni
  6. A New Tool for Robust Estimation and Identification of Unusual Data Points By Christian Garciga; Randal Verbrugge
  7. Identification-Robust Nonparametric Inference in a Linear IV Model By Bertille Antoine; Pascal Lavergne
  8. Estimating the Effect of Central Bank Independence on Inflation Using Longitudinal Targeted Maximum Likelihood Estimation By Philipp Baumann; Michael Schomaker; Enzo Rossi
  9. How to Improve the Model Selection Procedure in a Stress-testing Framework By Jiri Panos; Petr Polak
  10. Embracing equifinality with efficiency : limits of acceptability sampling using the DREAM(LOA) algorithm By Vrugt, Jasper A.; Beven, Keith J.
  11. Real-Time Forecasting Using Mixed-Frequency VARs with Time-Varying Parameters By Markus Heinrich; Magnus Reif
  12. Estimation of Weak Factor Models By Yoshimasa Uematsu; Takashi Yamagata
  13. Causal mediation analysis with double machine learning By Helmut Farbmacher; Martin Huber; Henrika Langen; Martin Spindler
  14. Binary Classification Problems in Economics and 136 Different Ways to Solve Them By Anton Gerunov
  15. Double Machine Learning based Program Evaluation under Unconfoundedness By Knaus, Michael C.
  16. Determination of Latent Dimensionality in International Trade Flow By Duc P. Truong; Erik Skau; Vladimir I. Valtchinov; Boian S. Alexandrov

  1. By: Francq, Christian; Zakoian, Jean-Michel
    Abstract: It is generally admitted that many financial time series have heavy tailed marginal distributions. When time series models are fitted on such data, the non-existence of appropriate moments may invalidate standard statistical tools used for inference. Moreover, the existence of moments can be crucial for risk management, for instance when risk is measured through the expected shortfall. This paper considers testing the existence of moments in the framework of GARCH processes. While the second-order stationarity condition does not depend on the distribution of the innovation, higher-order moment conditions involve moments of the independent innovation process. We propose tests for the existence of high moments of the returns process which are based on the joint asymptotic distribution of the Quasi-Maximum Likelihood (QML) estimator of the volatility parameters and empirical moments of the residuals. A bootstrap procedure is proposed to improve the finite-sample performance of our test. To achieve efficiency gains we consider non Gaussian QML estimators founded on reparametrizations of the GARCH model, and we discuss optimality issues. Monte-Carlo experiments and an empirical study illustrate the asymptotic results.
    Keywords: Conditional heteroskedasticity, Efficiency comparisons, Non-Gaussian QMLE, Residual Bootstrap, Stationarity tests
    JEL: C12 C13 C22
    Date: 2019–12–02
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:98892&r=all
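    The moment condition at the core of this test can be illustrated with a plug-in check: for a GARCH(1,1) process with iid innovations z_t, E|r_t|^{2m} is finite if and only if E[(α z_t^2 + β)^m] < 1. The sketch below is a rough illustration, not the authors' procedure: it fits the model by Gaussian QML with the third-party `arch` package and evaluates the plug-in moment on standardized residuals; the placeholder data, the package choice, and the informal comparison to 1 without a critical value are all assumptions.
```python
# A minimal plug-in check of the GARCH(1,1) moment condition
# E[(alpha * z^2 + beta)^m] < 1  <=>  E|r_t|^{2m} < infinity.
# Illustrative only: the paper derives a formal (bootstrap-corrected) test from
# the joint asymptotic distribution of the QMLE and the residual moments.
import numpy as np
from arch import arch_model  # assumed third-party package, not the authors' code

rng = np.random.default_rng(0)
returns = 100 * rng.standard_t(df=7, size=2000)  # placeholder data; use real returns

res = arch_model(returns, vol="GARCH", p=1, q=1).fit(disp="off")  # Gaussian QMLE
alpha, beta = res.params["alpha[1]"], res.params["beta[1]"]
z = np.asarray(res.resid / res.conditional_volatility)            # standardized residuals

for m in (1, 2, 3):  # existence of the 2nd, 4th and 6th moments of returns
    stat = np.mean((alpha * z ** 2 + beta) ** m)
    verdict = "moment appears finite" if stat < 1 else "moment may not exist"
    print(f"m={m}: plug-in E[(a*z^2+b)^m] = {stat:.3f} -> {verdict}")
```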
  2. By: Giuseppe De Luca (University of Palermo); Jan R. Magnus (Vrije Universiteit Amsterdam); Franco Peracchi (Georgetown University)
    Abstract: Many statistical and econometric learning methods rely on Bayesian ideas, often applied or reinterpreted in a frequentist setting. Two leading examples are shrinkage estimators and model averaging estimators, such as weighted-average least squares (WALS). In many instances, the accuracy of these learning methods in repeated samples is assessed using the variance of the posterior distribution of the parameters of interest given the data. This may be permissible when the sample size is large because, under the conditions of the Bernstein--von Mises theorem, the posterior variance agrees asymptotically with the frequentist variance. In finite samples, however, things are less clear. In this paper we explore this issue by first considering the frequentist properties (bias and variance) of the posterior mean in the important case of the normal location model, which consists of a single observation on a univariate Gaussian distribution with unknown mean and known variance. Based on these results, we derive new estimators of the frequentist bias and variance of the WALS estimator in finite samples. We then study the finite-sample performance of the proposed estimators by a Monte Carlo experiment with design derived from a real data application about the effect of abortion on crime rates.
    Keywords: Normal location model, posterior moments and cumulants, higher-order delta method approximations, double-shrinkage estimators, WALS
    JEL: C11 C13 C15 C52 I21
    Date: 2020–03–09
    URL: http://d.repec.org/n?u=RePEc:tin:wpaper:20200015&r=all
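    The normal location model discussed above is easy to explore numerically. The sketch below is illustrative only: it uses a conjugate N(0, τ²) prior, which is an assumption of this example rather than the priors WALS actually employs, and compares the frequentist bias and variance of the posterior mean with the posterior variance.
```python
# Frequentist bias and variance of a posterior mean in the normal location model:
# a single observation x ~ N(theta, 1). The conjugate N(0, tau^2) prior is an
# assumption made only for this illustration; WALS relies on different priors.
import numpy as np

rng = np.random.default_rng(1)
theta, tau2, reps = 1.0, 1.0, 200_000

x = theta + rng.standard_normal(reps)        # repeated samples of the one observation
post_mean = x * tau2 / (1.0 + tau2)          # posterior mean shrinks x toward 0
post_var = tau2 / (1.0 + tau2)               # posterior variance (same in every sample)

print("frequentist bias of the posterior mean:     ", post_mean.mean() - theta)
print("frequentist variance of the posterior mean: ", post_mean.var())
print("posterior variance, often used in its place:", post_var)
# For theta != 0 the posterior mean is biased, and its frequentist variance,
# (tau^2 / (1 + tau^2))^2, differs from the posterior variance -- the finite-sample
# gap that the paper's bias and variance estimators address for WALS.
```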
  3. By: Jun Cai (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); William C. Horrace (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); Christopher F. Parmeter (Department of Economics, University of Miami)
    Abstract: We consider density deconvolution with zero-mean Laplace errors in the context of an error component regression model. We adapt the minimax deconvolution methods of Meister (2006) to allow for unknown variance of the Laplace errors. We propose a semi-uniformly consistent deconvolution estimator for an ordinary smooth target density and a modified "variance truncation device" for the unknown Laplace error variance. We provide practical guidance for the choice of smoothness parameters of the target density. A simulation study and applications to a stochastic frontier model of US banks and a statistical measurement error model of daily saturated fat intake are provided.
    Keywords: Efficiency Estimation, Laplace Distribution, Stochastic Frontier
    JEL: C12 C14 C44 D24
    Date: 2020–03
    URL: http://d.repec.org/n?u=RePEc:max:cprwps:225&r=all
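    To convey the deconvolution idea, the sketch below inverts the empirical characteristic function of Y = X + ε, exploiting the Laplace characteristic function 1/(1 + b²t²). The error scale b is treated as known and a simple spectral cutoff is used, so this is a naive illustration rather than the paper's minimax estimator with unknown variance.
```python
# Naive characteristic-function deconvolution for Y = X + eps with Laplace(0, b)
# errors, whose characteristic function is 1 / (1 + b^2 t^2). The cutoff T acts as
# the smoothing parameter; b is treated as known here, whereas the paper estimates
# the unknown error variance via its "variance truncation device".
import numpy as np

rng = np.random.default_rng(2)
n, b = 5000, 0.5
x = rng.normal(0.0, 1.0, n)                  # latent variable (unobserved in practice)
y = x + rng.laplace(0.0, b, n)               # observed data, error variance 2*b^2

def deconvolve(grid, y, b, T=4.0, m=400):
    """Estimate the density of X on `grid` by Fourier inversion with cutoff T."""
    t = np.linspace(-T, T, m)
    phi_y = np.exp(1j * np.outer(t, y)).mean(axis=1)   # empirical CF of Y
    phi_x = phi_y * (1.0 + (b * t) ** 2)               # divide out the Laplace CF
    kernel = np.exp(-1j * np.outer(grid, t))
    return np.real(kernel @ phi_x) * (t[1] - t[0]) / (2 * np.pi)

grid = np.linspace(-4, 4, 81)
est = deconvolve(grid, y, b)
truth = np.exp(-grid ** 2 / 2) / np.sqrt(2 * np.pi)    # here X is standard normal
print("max abs error on the grid:", np.abs(est - truth).max())
```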
  4. By: Ji Hyung Lee; Youngki Shin
    Abstract: We propose a novel conditional quantile prediction method based on complete subset averaging (CSA) for quantile regressions. All models under consideration are potentially misspecified and the dimension of regressors goes to infinity as the sample size increases. Since we average over the complete subsets, the number of models is much larger than in the usual model averaging methods, which adopt sophisticated weighting schemes. We propose to use equal weights but select the proper size of the complete subset based on the leave-one-out cross-validation method. Building upon the theory of Lu and Su (2015), we investigate the large sample properties of CSA and show its asymptotic optimality in the sense of Li (1987). We check the finite-sample performance via Monte Carlo simulations and empirical applications.
    Keywords: complete subset averaging; quantile regression; prediction; equal-weight; model averaging
    JEL: C21 C52 C53
    Date: 2020–03
    URL: http://d.repec.org/n?u=RePEc:mcm:deptwp:2020-03&r=all
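    A minimal version of equal-weight complete subset averaging for a fixed subset size is sketched below, using QuantReg from `statsmodels` as the quantile-regression routine; the toy data, the variable names, and the fixed subset size are illustrative assumptions, and the paper's leave-one-out choice of the subset size is omitted.
```python
# Equal-weight complete subset averaging (CSA) of quantile-regression forecasts for
# a fixed subset size k. The paper's leave-one-out choice of k is omitted, and the
# data, variable names and the use of statsmodels' QuantReg are illustrative.
from itertools import combinations
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, p, k, tau = 300, 6, 3, 0.5
X = rng.standard_normal((n, p))
y = X[:, 0] + X[:, 1] + rng.standard_normal(n)        # toy data-generating process
x_new = rng.standard_normal(p)                        # point at which to predict

preds = []
for subset in combinations(range(p), k):              # every complete subset of size k
    cols = list(subset)
    fit = sm.QuantReg(y, sm.add_constant(X[:, cols])).fit(q=tau)
    preds.append(fit.params @ np.r_[1.0, x_new[cols]])

csa_forecast = np.mean(preds)                         # equal weights across subsets
print(f"CSA {tau}-quantile forecast over {len(preds)} subsets: {csa_forecast:.3f}")
```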
  5. By: Lafférs, Lukáš (Department of Mathematics); Mellace, Giovanni (Department of Business and Economics)
    Abstract: The stable unit treatment value assumption (SUTVA) ensures that only two potential outcomes exist and that one of them is observed for each individual. After providing new insights on SUTVA validity, we derive sharp bounds on the average treatment effect (ATE) of a binary treatment on a binary outcome as a function of the share of units, a, for which SUTVA is potentially violated. Then we show how to compute the maximum value of a such that the sign of the ATE is still identified. After decomposing SUTVA into two separate assumptions, we provide weaker conditions that might help sharpening our bounds. Furthermore, we show how some of our results can be extended to continuous outcomes. Finally, we estimate our bounds in two well known experiments, the U.S. Job Corps training program and the Colombian PACES vouchers for private schooling.
    Keywords: SUTVA; Bounds; Average treatment effect; Sensitivity analysis
    JEL: C14 C21 C31
    Date: 2020–03–04
    URL: http://d.repec.org/n?u=RePEc:hhs:sdueko:2020_003&r=all
  6. By: Christian Garciga; Randal Verbrugge (Virginia Polytechnic Institute and State University)
    Abstract: Most consistent estimators are what Müller (2007) terms “highly fragile”: prone to total breakdown in the presence of a handful of unusual data points. This compromises inference. Robust estimation is a (seldom-used) solution, but commonly used methods have drawbacks. In this paper, building on methods that are relatively unknown in economics, we provide a new tool for robust estimates of mean and covariance, useful both for robust estimation and for detection of unusual data points. It is relatively fast and useful for large data sets. Our performance testing indicates that our baseline method performs on par with, or better than, two of the currently best available methods, and that it works well on benchmark data sets. We also demonstrate that the issues we discuss are not merely hypothetical, by re-examining a prominent economic study and demonstrating its central results are driven by a set of unusual points.
    Keywords: big data; machine learning; outlier identification; fragility; robust estimation; detMCD; RMVN
    JEL: C3 C4 C5
    Date: 2020–03–05
    URL: http://d.repec.org/n?u=RePEc:fip:fedcwq:87580&r=all
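    For readers who want an off-the-shelf starting point, the sketch below computes a robust mean and covariance with scikit-learn's Minimum Covariance Determinant estimator and flags unusual points by robust Mahalanobis distances. It is a stand-in for illustration, not the detMCD/RMVN-based tool developed in the paper.
```python
# Robust location/scatter via the Minimum Covariance Determinant (scikit-learn's
# MinCovDet) and flagging of unusual points by robust Mahalanobis distances.
# An off-the-shelf stand-in for illustration, not the detMCD/RMVN-based tool
# developed in the paper.
import numpy as np
from scipy import stats
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(4)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=500)
X[:10] += 8.0                                   # plant a handful of unusual points

mcd = MinCovDet(random_state=0).fit(X)
d2 = mcd.mahalanobis(X)                          # squared robust Mahalanobis distances
cutoff = stats.chi2.ppf(0.999, df=X.shape[1])    # flag points beyond a chi2 quantile
flagged = np.flatnonzero(d2 > cutoff)

print("robust mean estimate:", mcd.location_)
print("flagged observations:", flagged)
```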
  7. By: Bertille Antoine (Simon Fraser University); Pascal Lavergne (Toulouse School of Economics)
    Abstract: For a linear IV regression, we propose two new inference procedures on parameters of endogenous variables that are robust to any identification pattern, do not rely on a linear first-stage equation, and account for heteroskedasticity of unknown form. Building on Bierens (1982), we first propose an Integrated Conditional Moment (ICM) type statistic constructed by setting the parameters to the value under the null hypothesis. The ICM procedure tests the value of the coefficient and the specification of the model at the same time. We then adopt a conditionality principle to condition on a set of ICM statistics that informs on identification strength. Our two procedures uniformly control size irrespective of identification strength. They are powerful irrespective of the nonlinear form of the link between instruments and endogenous variables and are competitive with existing procedures in simulations and applications.
    Keywords: Weak Instruments, Hypothesis Testing, Semiparametric Model
    JEL: C13 C21
    Date: 2020–03
    URL: http://d.repec.org/n?u=RePEc:sfu:sfudps:dp20-03&r=all
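    The sketch below computes a Bierens-type ICM statistic with the coefficient fixed at its hypothesized value, using the closed form exp(-(z_i - z_j)²/2) that arises from a standard-normal integrating measure. It is only meant to show the shape of the statistic and does not reproduce the paper's identification-robust critical values or the conditioning step; the simulated data and function names are illustrative.
```python
# A Bierens-type ICM statistic with the structural coefficient fixed at its
# hypothesized value beta0, using the closed form exp(-(z_i - z_j)^2 / 2) implied
# by a standard-normal integrating measure. This only shows the shape of the
# statistic; the paper's robust critical values and conditioning are not included.
import numpy as np

def icm_stat(y, x, z, beta0):
    """ICM statistic for H0: beta = beta0 in y = x * beta + u with E[u | z] = 0."""
    u = y - x * beta0                                    # residuals under the null
    zs = (z - z.mean()) / z.std()                        # standardized instrument
    w = np.exp(-0.5 * (zs[:, None] - zs[None, :]) ** 2)  # integrated weight kernel
    return (u @ w @ u) / len(y)

rng = np.random.default_rng(5)
n = 400
z = rng.standard_normal(n)
x = 0.3 * z + rng.standard_normal(n)                     # possibly weak first stage
y = 1.0 * x + rng.standard_normal(n)
print("ICM statistic at the true value:", icm_stat(y, x, z, beta0=1.0))
print("ICM statistic at a wrong value :", icm_stat(y, x, z, beta0=0.0))
```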
  8. By: Philipp Baumann; Michael Schomaker; Enzo Rossi
    Abstract: Whether a country's central bank independence (CBI) status has a lowering effect on inflation is a controversial hypothesis. To date, this question could not be answered satisfactorily because the complex macroeconomic structure that gives rise to the data has not been adequately incorporated into statistical analyses. We have developed a causal model that summarizes the economic process of inflation. Based on this causal model and recent data, we discuss and identify the assumptions under which the effect of CBI on inflation can be identified and estimated. Given these and alternative assumptions, we estimate this effect using modern doubly robust effect estimators, i.e., longitudinal targeted maximum likelihood estimators. The estimation procedure incorporates machine learning algorithms and is tailored to address the challenges that come with complex longitudinal macroeconomic data. We could not find strong support for the hypothesis that a central bank that is independent over a long period of time necessarily lowers inflation. Simulation studies evaluate the sensitivity of the proposed methods in complex settings when assumptions are violated, and highlight the importance of working with appropriate learning algorithms for estimation.
    Date: 2020–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2003.02208&r=all
  9. By: Jiri Panos; Petr Polak
    Abstract: This paper aims to introduce a contemporary, computing-power-driven approach to econometric modeling in a stress-testing framework. The presented approach explicitly takes into account the model uncertainty of the satellite models used for projecting forward paths of financial variables, employing the constrained Bayesian model averaging (BMA) technique. The constrained BMA technique allows for selecting models with reasonably severe but plausible trajectories conditional on given macro-financial scenarios. It also ensures that the modeling is conducted in a sufficiently robust and prudential manner despite the limited time-series length of the explained and/or explanatory variables.
    Keywords: Bayesian model averaging, model selection, model uncertainty, probability of default, stress testing
    JEL: C11 C22 C51 C52 E58 G21
    Date: 2019–12
    URL: http://d.repec.org/n?u=RePEc:cnb:wpaper:2019/9&r=all
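    The unconstrained core of BMA can be illustrated with BIC-based approximate posterior model weights, w_m ∝ exp(-BIC_m/2). The sketch below does this for a few toy OLS satellite models and leaves out the paper's plausibility constraints on stressed trajectories; the data and candidate models are illustrative assumptions.
```python
# BIC-approximated Bayesian model averaging weights, w_m proportional to
# exp(-BIC_m / 2), over a few toy OLS "satellite" models. The paper's constrained
# BMA additionally screens models for plausible stressed trajectories; that
# constraint is not implemented here.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 120
X = rng.standard_normal((n, 3))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.3 * rng.standard_normal(n)

candidates = [[0], [0, 1], [0, 1, 2]]                 # candidate regressor sets
fits = [sm.OLS(y, sm.add_constant(X[:, cols])).fit() for cols in candidates]
bic = np.array([f.bic for f in fits])

w = np.exp(-0.5 * (bic - bic.min()))                  # subtract min to avoid overflow
w /= w.sum()
print("approximate posterior model weights:", np.round(w, 3))

bma_fitted = sum(wi * f.fittedvalues for wi, f in zip(w, fits))  # model-averaged fit
print("first few model-averaged fitted values:", np.round(bma_fitted[:3], 3))
```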
  10. By: Vrugt, Jasper A.; Beven, Keith J.
    Abstract: This essay illustrates some recent developments to the DiffeRential Evolution Adaptive Metropolis (DREAM) MATLAB toolbox of Vrugt (2016) to delineate and sample the behavioural solution space of set-theoretic likelihood functions used within the GLUE (Limits of Acceptability) framework (Beven and Binley, 1992; Beven and Freer, 2001; Beven, 2006; Beven et al., 2014). This work builds on the DREAM(ABC) algorithm of Sadegh and Vrugt (2014) and significantly enhances the accuracy and CPU efficiency of Bayesian inference with GLUE. In particular, it is shown how a lack of adequate sampling in the model space might lead to unjustified model rejection.
    Keywords: GLUE; Limits of Acceptability; Markov Chain Monte Carlo; Posterior Sampling; DREAM; DREAM(LOA); Sufficiency; Hydrological modelling
    JEL: C1
    Date: 2018–04–01
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:87291&r=all
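    The limits-of-acceptability acceptance rule itself is simple to state: keep a parameter draw only if its simulated output lies within the acceptability bounds at every observation. The brute-force prior sampler below, a toy rather than the adaptive DREAM(LOA) machinery that makes this efficient, demonstrates the rule on a made-up exponential-decay model; all names and bounds are assumptions of this example.
```python
# A brute-force GLUE limits-of-acceptability sampler: keep a parameter draw only if
# its simulated output lies inside the acceptability bounds at every observation.
# DREAM(LOA) replaces this naive prior sampling with adaptive MCMC; the toy
# exponential-decay model and the bounds used here are purely illustrative.
import numpy as np

rng = np.random.default_rng(7)
t = np.linspace(0.0, 1.0, 20)
obs = 2.0 * np.exp(-0.5 * t) + rng.normal(0.0, 0.05, t.size)   # synthetic observations
lower, upper = obs - 0.15, obs + 0.15                          # limits of acceptability

def model(theta):
    """Toy exponential-decay model with amplitude a and decay rate k."""
    a, k = theta
    return a * np.exp(-k * t)

draws = rng.uniform([0.0, 0.0], [5.0, 2.0], size=(50_000, 2))  # samples from the prior
sims = np.array([model(th) for th in draws])
behavioural = draws[np.all((sims >= lower) & (sims <= upper), axis=1)]
print("behavioural fraction of the prior sample:", len(behavioural) / len(draws))
```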
  11. By: Markus Heinrich; Magnus Reif
    Abstract: This paper provides a detailed assessment of the real-time forecast accuracy of a wide range of vector autoregressive models (VARs) that allow for both structural change and indicators sampled at different frequencies. We extend the literature by evaluating a mixed-frequency time-varying parameter VAR with stochastic volatility (MF-TVP-SV-VAR). Overall, the MF-TVP-SV-VAR delivers accurate nowcasts and forecasts and, on average, outperforms its competitors. We assess the models’ accuracy relative to expert forecasts and show that the MF-TVP-SV-VAR delivers better inflation nowcasts in this regard. Using an optimal prediction pool, we moreover demonstrate that the MF-TVP-SV-VAR has gained importance since the Great Recession.
    Keywords: time-varying parameters, forecasting, nowcasting, mixed-frequency models, Bayesian methods
    JEL: C11 C53 C55 E32
    Date: 2020
    URL: http://d.repec.org/n?u=RePEc:ces:ceswps:_8054&r=all
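    The optimal prediction pool mentioned in the abstract can be sketched in a few lines: choose the weight that maximizes the historical log score of the mixed predictive density, in the spirit of Geweke and Amisano. The example below uses two made-up Gaussian predictive densities and is not the paper's forecasting exercise.
```python
# Optimal prediction pool in the spirit of Geweke and Amisano: pick the weight on
# model A that maximizes the historical log score of the mixed predictive density.
# The two Gaussian predictive densities below are made up for illustration only.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(8)
y = rng.normal(0.5, 1.0, 200)                 # realized values
dens_a = norm.pdf(y, loc=0.4, scale=1.0)      # model A's predictive densities
dens_b = norm.pdf(y, loc=0.0, scale=2.0)      # model B's predictive densities

def neg_log_score(w):
    return -np.sum(np.log(w * dens_a + (1.0 - w) * dens_b))

w_opt = minimize_scalar(neg_log_score, bounds=(0.0, 1.0), method="bounded").x
print("optimal pool weight on model A:", round(w_opt, 3))
```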
  12. By: Yoshimasa Uematsu; Takashi Yamagata
    Abstract: This paper proposes a novel estimation method for weak factor models, a slightly stronger version of the approximate factor models of Chamberlain and Rothschild (1983), with large cross-sectional and time-series dimensions (N and T, respectively). It assumes that the kth largest eigenvalue of the data covariance matrix grows proportionally to N^{α_k} with unknown exponents 0 < α_k ≤ 1.
    Date: 2020–03
    URL: http://d.repec.org/n?u=RePEc:toh:dssraa:108&r=all
  13. By: Helmut Farbmacher; Martin Huber; Henrika Langen; Martin Spindler
    Abstract: This paper combines causal mediation analysis with double machine learning to control for observed confounders in a data-driven way under a selection-on-observables assumption in a high-dimensional setting. We consider the average indirect effect of a binary treatment operating through an intermediate variable (or mediator) on the causal path between the treatment and the outcome, as well as the unmediated direct effect. Estimation is based on efficient score functions, which possess a multiple robustness property w.r.t. misspecifications of the outcome, mediator, and treatment models. This property is key for selecting these models by double machine learning, which is combined with data splitting to prevent overfitting in the estimation of the effects of interest. We demonstrate that the direct and indirect effect estimators are asymptotically normal and root-n consistent under specific regularity conditions and investigate the finite sample properties of the suggested methods in a simulation study when considering lasso as machine learner. We also provide an empirical application to the U.S. National Longitudinal Survey of Youth, assessing the indirect effect of health insurance coverage on general health operating via routine checkups as mediator, as well as the direct effect. We find a moderate short term effect of health insurance coverage on general health which is, however, not mediated by routine checkups.
    Date: 2020–02
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2002.12710&r=all
  14. By: Anton Gerunov (Faculty of Economics and Business Administration, Sofia University "St. Kliment Ohridski")
    Abstract: This article investigates the performance of 136 different classification algorithms for economic problems of binary choice. They are applied to model five different choice situations: consumer acceptance during a direct marketing campaign, predicting default on credit card debt, credit scoring, forecasting firm insolvency, and modelling online consumer purchases. Algorithms are trained to generate class predictions of a given binary target variable, which are then used to measure their forecast accuracy via the area under the ROC curve. Results show that algorithms of the Random Forest family consistently outperform alternative methods and may thus be suitable for modelling a wide range of discrete choice situations.
    Keywords: discrete choice, classification, machine learning algorithms, modelling decisions
    JEL: C35 C44 C45 D81
    Date: 2020–03
    URL: http://d.repec.org/n?u=RePEc:sko:wpaper:bep-2020-02&r=all
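    A drastically reduced version of such a horse race, comparing three scikit-learn classifiers by cross-validated area under the ROC curve on simulated data, is sketched below; the dataset, learners, and settings are illustrative assumptions, not the article's.
```python
# A drastically reduced version of the article's horse race: three scikit-learn
# classifiers compared by 5-fold cross-validated ROC AUC on simulated data. The
# dataset, learners and settings are illustrative assumptions, not the article's.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
models = {
    "logit": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name:14s} mean AUC = {auc.mean():.3f}")
```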
  15. By: Knaus, Michael C.
    Abstract: This paper consolidates recent methodological developments based on Double Machine Learning (DML) with a focus on program evaluation under unconfoundedness. DML-based methods leverage flexible prediction methods to control for confounding in the estimation of (i) standard average effects, (ii) different forms of heterogeneous effects, and (iii) optimal treatment assignment rules. We emphasize that these estimators all build on the same doubly robust score, which makes it possible to exploit computational synergies. An evaluation of multiple programs of the Swiss Active Labor Market Policy shows how DML-based methods enable a comprehensive policy analysis. However, we find evidence that estimates of individualized heterogeneous effects can become unstable.
    Keywords: Causal machine learning, conditional average treatment effects, optimal policy learning, individualized treatment rules, multiple treatments
    JEL: C21
    Date: 2020–03
    URL: http://d.repec.org/n?u=RePEc:usg:econwp:2020:04&r=all
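    The shared doubly robust score referred to above is the AIPW score; the sketch below cross-fits it with random forests as nuisance learners to estimate an average treatment effect. It covers only a single binary treatment on made-up data with off-the-shelf scikit-learn learners, not the multiple-treatment, heterogeneous-effect, or policy-learning estimators surveyed in the paper.
```python
# Cross-fitted AIPW (doubly robust) estimation of an average treatment effect under
# unconfoundedness, with random forests as nuisance learners. A single binary
# treatment on simulated data with off-the-shelf scikit-learn learners; the
# heterogeneous-effect and policy-learning estimators in the paper are not sketched.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(9)
n = 2000
X = rng.standard_normal((n, 5))
D = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0])))    # treatment, confounded by X
Y = 1.0 * D + X[:, 0] + rng.standard_normal(n)         # outcome, true ATE = 1

scores = np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    ps = RandomForestClassifier(n_estimators=200, random_state=0).fit(X[train], D[train])
    e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)        # propensity score
    mu1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(
        X[train][D[train] == 1], Y[train][D[train] == 1])
    mu0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(
        X[train][D[train] == 0], Y[train][D[train] == 0])
    m1, m0 = mu1.predict(X[test]), mu0.predict(X[test])
    scores[test] = (m1 - m0
                    + D[test] * (Y[test] - m1) / e
                    - (1 - D[test]) * (Y[test] - m0) / (1.0 - e))    # doubly robust score

ate, se = scores.mean(), scores.std() / np.sqrt(n)
print(f"AIPW estimate of the ATE: {ate:.3f} (standard error {se:.3f})")
```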
  16. By: Duc P. Truong; Erik Skau; Vladimir I. Valtchinov; Boian S. Alexandrov
    Abstract: Currently, high-dimensional data is ubiquitous in data science, which necessitates the development of techniques to decompose and interpret such multidimensional (aka tensor) datasets. Finding a low-dimensional representation of the data, that is, its inherent structure, is one of the approaches that can serve to understand the dynamics of low-dimensional latent features hidden in the data. Nonnegative RESCAL is one such technique, particularly well suited to analyze self-relational data, such as the dynamic networks found in international trade flows. Nonnegative RESCAL computes a low-dimensional tensor representation by finding the latent space containing multiple modalities. Estimating the dimensionality of this latent space is crucial for extracting meaningful latent features. Here, to determine the dimensionality of the latent space with nonnegative RESCAL, we propose a latent dimension determination method which is based on clustering of the solutions of multiple realizations of nonnegative RESCAL decompositions. We demonstrate the performance of our model selection method on synthetic data, and then we apply our method to decompose a network of international trade flow data from the International Monetary Fund and validate the resulting features against empirical facts from the economic literature.
    Date: 2020–02
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2003.00129&r=all

This nep-ecm issue is ©2020 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.