on Econometrics
By: | Francq, Christian; Zakoian, Jean-Michel |
Abstract: | It is generally admitted that many financial time series have heavy-tailed marginal distributions. When time series models are fitted to such data, the non-existence of appropriate moments may invalidate standard statistical tools used for inference. Moreover, the existence of moments can be crucial for risk management, for instance when risk is measured through the expected shortfall. This paper considers testing the existence of moments in the framework of GARCH processes. While the second-order stationarity condition does not depend on the distribution of the innovation, higher-order moment conditions involve moments of the independent innovation process. We propose tests for the existence of higher-order moments of the returns process which are based on the joint asymptotic distribution of the Quasi-Maximum Likelihood (QML) estimator of the volatility parameters and empirical moments of the residuals. A bootstrap procedure is proposed to improve the finite-sample performance of our test. To achieve efficiency gains we consider non-Gaussian QML estimators based on reparametrizations of the GARCH model, and we discuss optimality issues. Monte Carlo experiments and an empirical study illustrate the asymptotic results. |
Keywords: | Conditional heteroskedasticity, Efficiency comparisons, Non-Gaussian QMLE, Residual Bootstrap, Stationarity tests |
JEL: | C12 C13 C22 |
Date: | 2019–12–02 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:98892&r=all |
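For the GARCH(1,1) case, the 2m-th unconditional moment of the returns exists (for a positive integer m) exactly when E[(α ε² + β)^m] < 1, which is the kind of condition the test above combines with the QML asymptotics. A minimal plug-in sketch, assuming a QML fit has already produced parameter estimates and standardized residuals, and leaving out the paper's joint asymptotic distribution and bootstrap refinement:

```python
# Plug QML estimates of (alpha, beta) and the empirical moments of the standardized
# residuals into the GARCH(1,1) moment condition E[(alpha*eps^2 + beta)^m] < 1,
# which governs existence of the 2m-th unconditional moment of the returns.
import numpy as np

def garch11_moment_condition(alpha_hat, beta_hat, residuals, m):
    """Return the plug-in estimate of E[(alpha*eps^2 + beta)^m]; values < 1
    point to existence of the 2m-th moment of the returns."""
    eps = np.asarray(residuals)          # standardized residuals from a QML fit
    return np.mean((alpha_hat * eps**2 + beta_hat) ** m)

# illustrative use with simulated Student-t innovations (hypothetical parameter values)
rng = np.random.default_rng(0)
eps = rng.standard_t(df=7, size=5_000)
eps = eps / eps.std()                    # rescale to unit variance
for m in (1, 2, 3):
    c = garch11_moment_condition(0.08, 0.90, eps, m)
    print(f"m={m}: plug-in condition = {c:.3f} ({'<' if c < 1 else '>='} 1)")
```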
By: | Giuseppe De Luca (University of Palermo); Jan R. Magnus (Vrije Universiteit Amsterdam); Franco Peracchi (Georgetown University) |
Abstract: | Many statistical and econometric learning methods rely on Bayesian ideas, often applied or reinterpreted in a frequentist setting. Two leading examples are shrinkage estimators and model averaging estimators, such as weighted-average least squares (WALS). In many instances, the accuracy of these learning methods in repeated samples is assessed using the variance of the posterior distribution of the parameters of interest given the data. This may be permissible when the sample size is large because, under the conditions of the Bernstein-von Mises theorem, the posterior variance agrees asymptotically with the frequentist variance. In finite samples, however, things are less clear. In this paper we explore this issue by first considering the frequentist properties (bias and variance) of the posterior mean in the important case of the normal location model, which consists of a single observation on a univariate Gaussian distribution with unknown mean and known variance. Based on these results, we derive new estimators of the frequentist bias and variance of the WALS estimator in finite samples. We then study the finite-sample performance of the proposed estimators by a Monte Carlo experiment with a design derived from a real data application about the effect of abortion on crime rates. |
Keywords: | Normal location model, posterior moments and cumulants, higher-order delta method approximations, double-shrinkage estimators, WALS |
JEL: | C11 C13 C15 C52 I21 |
Date: | 2020–03–09 |
URL: | http://d.repec.org/n?u=RePEc:tin:wpaper:20200015&r=all |
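As a small illustration of why posterior and frequentist accuracy measures can disagree in finite samples, the sketch below works through the normal location model with a conjugate normal prior (an assumption made here for tractability; WALS itself rests on a different prior). The posterior mean is a shrinkage estimator whose frequentist bias and variance have simple closed forms that a Monte Carlo check reproduces:

```python
# Normal location model: x | theta ~ N(theta, 1), prior theta ~ N(0, tau2).
# Compare the frequentist bias and variance of the posterior mean (a shrinkage
# estimator) with their closed forms; note the posterior variance equals the
# shrinkage weight w, while the frequentist variance of the estimator is w^2.
import numpy as np

def posterior_mean(x, tau2):
    """Posterior mean of theta under the conjugate normal prior."""
    w = tau2 / (1.0 + tau2)              # shrinkage weight
    return w * x

rng = np.random.default_rng(1)
theta, tau2, reps = 1.5, 2.0, 200_000    # hypothetical true mean and prior variance
x = rng.normal(theta, 1.0, size=reps)    # repeated samples of the single observation
est = posterior_mean(x, tau2)

w = tau2 / (1.0 + tau2)
print("bias:     MC =", round(est.mean() - theta, 4), " exact =", round((w - 1.0) * theta, 4))
print("variance: MC =", round(est.var(), 4), " exact =", round(w**2, 4), " posterior variance =", round(w, 4))
```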
By: | Jun Cai (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); William C. Horrace (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); Christopher F. Parmeter (Department of Economics, University of Miami) |
Abstract: | We consider density deconvolution with zero-mean Laplace errors in the context of an error component regression model. We adapt the minimax deconvolution methods of Meister (2006) to allow for unknown variance of the Laplace errors. We propose a semi-uniformly consistent deconvolution estimator for an ordinary smooth target density and a modified "variance truncation device" for the unknown Laplace error variance. We provide practical guidance for the choice of smoothness parameters of the target density. A simulation study and applications to a stochastic frontier model of US banks and a statistical measurement error model of daily saturated fat intake are provided. |
Keywords: | Efficiency Estimation, Laplace Distribution, Stochastic Frontier |
JEL: | C12 C14 C44 D24 |
Date: | 2020–03 |
URL: | http://d.repec.org/n?u=RePEc:max:cprwps:225&r=all |
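A minimal sketch of the deconvolution step described above, treating the Laplace scale as known (the paper's variance truncation device for the unknown error variance is not reproduced): the empirical characteristic function of the contaminated observations is divided by the Laplace characteristic function and inverted with a spectral cutoff acting as the smoothing parameter. The data-generating choices below are hypothetical.

```python
# Spectral-cutoff deconvolution: recover the density of X from Y = X + U,
# where U ~ Laplace(0, b) has characteristic function 1 / (1 + b^2 t^2).
import numpy as np

def deconvolve_laplace(y, b, x_grid, cutoff):
    """Deconvolution density estimate of X on x_grid, Laplace scale b assumed known."""
    t = np.linspace(-cutoff, cutoff, 1001)                  # frequency grid
    ecf_y = np.exp(1j * np.outer(t, y)).mean(axis=1)        # empirical cf of Y
    ecf_x = ecf_y * (1.0 + (b * t) ** 2)                    # divide by the Laplace cf
    integrand = np.exp(-1j * np.outer(x_grid, t)) * ecf_x   # inverse Fourier integrand
    dens = np.trapz(integrand, t, axis=1).real / (2 * np.pi)
    return np.clip(dens, 0.0, None)                         # truncate negative values

# illustrative use: X ~ N(0, 1) contaminated by Laplace noise (hypothetical setup)
rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 2_000)
u = rng.laplace(0.0, 0.4, 2_000)
grid = np.linspace(-4, 4, 161)
f_hat = deconvolve_laplace(x + u, b=0.4, x_grid=grid, cutoff=3.0)
print(f_hat[::20].round(3))
```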
By: | Ji Hyung Lee; Youngki Shin |
Abstract: | We propose a novel conditional quantile prediction method based on complete subset averaging (CSA) for quantile regressions. All models under consideration are potentially misspecified, and the dimension of the regressors goes to infinity as the sample size increases. Since we average over the complete subsets, the number of models is much larger than in the usual model averaging methods, which adopt sophisticated weighting schemes. We propose to use equal weights but to select the proper size of the complete subsets by the leave-one-out cross-validation method. Building upon the theory of Lu and Su (2015), we investigate the large sample properties of CSA and show its asymptotic optimality in the sense of Li (1987). We check the finite sample performance via Monte Carlo simulations and empirical applications. |
Keywords: | complete subset averaging; quantile regression; prediction; equal-weight; model averaging |
JEL: | C21 C52 C53 |
Date: | 2020–03 |
URL: | http://d.repec.org/n?u=RePEc:mcm:deptwp:2020-03&r=all |
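A minimal sketch of equal-weight complete subset averaging for quantile prediction. The subset size is chosen here by check loss on a simple validation split rather than the paper's leave-one-out cross-validation, the design is simulated, and statsmodels' QuantReg does the individual fits:

```python
# Average quantile-regression predictions, with equal weights, over all subsets
# of regressors of a given size k; pick k by out-of-sample check loss.
from itertools import combinations
import numpy as np
import statsmodels.api as sm

def csa_quantile_predict(X_train, y_train, X_new, k, tau=0.5):
    """Equal-weight average of size-k subset quantile regressions."""
    p = X_train.shape[1]
    preds = []
    for subset in combinations(range(p), k):
        Xs = sm.add_constant(X_train[:, subset])
        fit = sm.QuantReg(y_train, Xs).fit(q=tau)
        preds.append(fit.predict(sm.add_constant(X_new[:, subset], has_constant="add")))
    return np.mean(preds, axis=0)

def check_loss(y, q, tau=0.5):
    u = y - q
    return np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))

# illustrative use on simulated data (hypothetical design)
rng = np.random.default_rng(3)
n, p = 300, 6
X = rng.normal(size=(n, p))
y = X[:, :3].sum(axis=1) + rng.standard_t(df=5, size=n)
X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]
for k in range(1, p + 1):
    q_hat = csa_quantile_predict(X_tr, y_tr, X_va, k)
    print(f"k={k}: validation check loss = {check_loss(y_va, q_hat):.3f}")
```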
By: | Lafférs, Lukáš (Department of Mathematics); Mellace, Giovanni (Department of Business and Economics) |
Abstract: | The stable unit treatment value assumption (SUTVA) ensures that only two potential outcomes exist and that one of them is observed for each individual. After providing new insights on SUTVA validity, we derive sharp bounds on the average treatment effect (ATE) of a binary treatment on a binary outcome as a function of the share of units, a, for which SUTVA is potentially violated. Then we show how to compute the maximum value of a such that the sign of the ATE is still identified. After decomposing SUTVA into two separate assumptions, we provide weaker conditions that might help sharpen our bounds. Furthermore, we show how some of our results can be extended to continuous outcomes. Finally, we estimate our bounds in two well-known experiments, the U.S. Job Corps training program and the Colombian PACES vouchers for private schooling. |
Keywords: | SUTVA; Bounds; Average treatment effect; Sensitivity analysis |
JEL: | C14 C21 C31 |
Date: | 2020–03–04 |
URL: | http://d.repec.org/n?u=RePEc:hhs:sdueko:2020_003&r=all |
By: | Christian Garciga; Randal Verbrugge (Virginia Polytechnic Institute and State University) |
Abstract: | Most consistent estimators are what Müller (2007) terms “highly fragile”: prone to total breakdown in the presence of a handful of unusual data points. This compromises inference. Robust estimation is a (seldom-used) solution, but commonly used methods have drawbacks. In this paper, building on methods that are relatively unknown in economics, we provide a new tool for robust estimates of the mean and covariance, useful both for robust estimation and for detection of unusual data points. It is relatively fast and suitable for large data sets. Our performance testing indicates that our baseline method performs on par with, or better than, two of the currently best available methods, and that it works well on benchmark data sets. We also demonstrate that the issues we discuss are not merely hypothetical, by re-examining a prominent economic study and demonstrating that its central results are driven by a set of unusual points. |
Keywords: | big data; machine learning; outlier identification; fragility; robust estimation; detMCD; RMVN |
JEL: | C3 C4 C5 |
Date: | 2020–03–05 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedcwq:87580&r=all |
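A minimal sketch of the kind of robust location/scatter estimation and outlier flagging discussed above, using scikit-learn's MinCovDet (FastMCD) as a stand-in for the detMCD/RMVN methods the paper builds on, and the usual chi-square cutoff on robust Mahalanobis distances as the flagging rule:

```python
# Robust mean/covariance via the Minimum Covariance Determinant, then flag
# observations whose squared robust Mahalanobis distance exceeds a chi-square quantile.
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(4)
n, d = 500, 3
X = rng.normal(size=(n, d))
X[:15] += 8.0                                   # a handful of gross outliers

mcd = MinCovDet(random_state=0).fit(X)
robust_mean, robust_cov = mcd.location_, mcd.covariance_
d2 = mcd.mahalanobis(X)                         # squared robust Mahalanobis distances
flag = d2 > chi2.ppf(0.975, df=d)               # flag points beyond the 97.5% quantile

print("robust mean:", robust_mean.round(2))
print("flagged points:", int(flag.sum()), "of", n)
```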
By: | Bertille Antoine (Simon Fraser University); Pascal Lavergne (Toulouse School of Economics) |
Abstract: | For a linear IV regression, we propose two new inference procedures on parameters of endogenous variables that are robust to any identification pattern, do not rely on a linear first-stage equation, and account for heteroskedasticity of unknown form. Building on Bierens (1982), we first propose an Integrated Conditional Moment (ICM) type statistic constructed by setting the parameters to the value under the null hypothesis. The ICM procedure simultaneously tests the value of the coefficient and the specification of the model. We then adopt a conditionality principle to condition on a set of ICM statistics that informs on identification strength. Our two procedures uniformly control size irrespective of identification strength. They are powerful irrespective of the nonlinear form of the link between instruments and endogenous variables and are competitive with existing procedures in simulations and applications. |
Keywords: | Weak Instruments, Hypothesis Testing, Semiparametric Model |
JEL: | C13 C21 |
Date: | 2020–03 |
URL: | http://d.repec.org/n?u=RePEc:sfu:sfudps:dp20-03&r=all |
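A minimal sketch of a Bierens-type ICM statistic for testing a hypothesized coefficient value in a linear IV regression, with an exponential weight in the instruments. The p-value here comes from a simple multiplier (wild) bootstrap rather than the paper's conditional procedure, and the data-generating process with a nonlinear first stage is hypothetical:

```python
# ICM statistic for H0: beta = beta0, built from residuals imposing the null
# and an exponential kernel in the instruments.
import numpy as np

def icm_statistic(u, Z):
    """(1/n) * sum_ij u_i u_j exp(-0.5 ||z_i - z_j||^2)."""
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return (u @ np.exp(-0.5 * sq) @ u) / len(u)

def icm_test(y, X, Z, beta0, n_boot=499, seed=0):
    rng = np.random.default_rng(seed)
    u = y - X @ beta0                         # residuals imposing the null value
    stat = icm_statistic(u, Z)
    boot = np.array([icm_statistic(u * rng.choice([-1.0, 1.0], size=len(u)), Z)
                     for _ in range(n_boot)])
    return stat, (boot >= stat).mean()        # statistic and multiplier-bootstrap p-value

# illustrative use with a nonlinear first stage (hypothetical design)
rng = np.random.default_rng(5)
n = 300
z = rng.normal(size=(n, 1))
v = rng.normal(size=n)
x = np.sin(z[:, 0]) + 0.5 * v                 # nonlinear link, endogenous through v
y = 1.0 * x + v + rng.normal(size=n)
stat, pval = icm_test(y, x[:, None], z, beta0=np.array([1.0]))
print(f"ICM statistic = {stat:.3f}, bootstrap p-value = {pval:.3f}")
```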
By: | Philipp Baumann; Michael Schomaker; Enzo Rossi |
Abstract: | Whether a country's central bank independence (CBI) status has a lowering effect on inflation is a controversial hypothesis. To date, this question could not be answered satisfactorily because the complex macroeconomic structure that gives rise to the data has not been adequately incorporated into statistical analyses. We have developed a causal model that summarizes the economic process of inflation. Based on this causal model and recent data, we discuss and identify the assumptions under which the effect of CBI on inflation can be identified and estimated. Given these and alternative assumptions, we estimate this effect using modern doubly robust effect estimators, i.e. longitudinal targeted maximum likelihood estimators. The estimation procedure incorporates machine learning algorithms and is tailored to address the challenges that come with complex longitudinal macroeconomic data. We could not find strong support for the hypothesis that a central bank that is independent over a long period of time necessarily lowers inflation. Simulation studies evaluate the sensitivity of the proposed methods in complex settings when assumptions are violated, and highlight the importance of working with appropriate learning algorithms for estimation. |
Date: | 2020–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2003.02208&r=all |
By: | Jiri Panos; Petr Polak |
Abstract: | This paper aims to introduce a contemporary, computing-power-driven approach to econometric modeling in a stress-testing framework. The presented approach explicitly takes into account the model uncertainty of the satellite models used to project forward paths of financial variables, employing a constrained Bayesian model averaging (BMA) technique. The constrained BMA technique allows for selecting models with reasonably severe but plausible trajectories conditional on given macro-financial scenarios. It also ensures that the modeling is conducted in a sufficiently robust and prudential manner despite the limited time-series length of the explained and/or explanatory variables. |
Keywords: | Bayesian model averaging, model selection, model uncertainty, probability of default, stress testing |
JEL: | C11 C22 C51 C52 E58 G21 |
Date: | 2019–12 |
URL: | http://d.repec.org/n?u=RePEc:cnb:wpaper:2019/9&r=all |
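A minimal sketch of the constrained model-averaging idea: enumerate candidate satellite regressions, approximate posterior model probabilities by BIC weights, and discard models violating an economically motivated restriction. The sign constraint, variable names, and data below are hypothetical, and BIC weighting is used as a convenient stand-in for a full Bayesian treatment:

```python
# Constrained model averaging: keep only "plausible" models, weight the rest
# by exp(-BIC/2) normalized to sum to one, and average their coefficients.
from itertools import combinations
import numpy as np
import statsmodels.api as sm

def bic_weight_average(y, X, names, must_include, sign_constraint):
    models = []
    for k in range(1, X.shape[1] + 1):
        for subset in combinations(range(X.shape[1]), k):
            if must_include not in subset:
                continue
            fit = sm.OLS(y, sm.add_constant(X[:, subset])).fit()
            coef = dict(zip([names[j] for j in subset], fit.params[1:]))
            if sign_constraint(coef):                     # drop implausible models
                models.append((fit.bic, coef))
    bics = np.array([b for b, _ in models])
    w = np.exp(-0.5 * (bics - bics.min()))
    w /= w.sum()                                          # BIC-approximated posterior weights
    avg = {nm: sum(wi * c.get(nm, 0.0) for wi, (_, c) in zip(w, models)) for nm in names}
    return avg, w

# illustrative use (hypothetical data: default rate driven by GDP growth and rates)
rng = np.random.default_rng(6)
n = 80
X = rng.normal(size=(n, 3))                               # [gdp_growth, rate, spread]
y = -0.8 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(scale=0.5, size=n)
names = ["gdp_growth", "rate", "spread"]
avg, w = bic_weight_average(y, X, names, must_include=0,
                            sign_constraint=lambda c: c.get("gdp_growth", -1.0) < 0)
print("averaged coefficients:", {k: round(v, 3) for k, v in avg.items()})
```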
By: | Vrugt, Jasper A.; Beven, Keith J. |
Abstract: | This essay illustrates some recent developments to the DiffeRential Evolution Adaptive Metropolis (DREAM) MATLAB toolbox of Vrugt (2016) to delineate and sample the behavioural solution space of set-theoretic likelihood functions used within the GLUE (Limits of Acceptability) framework (Beven and Binley, 1992; Beven and Freer, 2001; Beven, 2006; Beven et al., 2014). This work builds on the DREAM(ABC) algorithm of Sadegh and Vrugt (2014) and significantly enhances the accuracy and CPU-efficiency of Bayesian inference with GLUE. In particular, it is shown how a lack of adequate sampling in the model space might lead to unjustified model rejection. |
Keywords: | GLUE; Limits of Acceptability; Markov Chain Monte Carlo; Posterior Sampling; DREAM; DREAM(LOA); Sufficiency; Hydrological modelling |
JEL: | C1 |
Date: | 2018–04–01 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:87291&r=all |
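A minimal sketch of sampling a behavioural (limits-of-acceptability) solution space with a random-walk proposal and an indicator, set-theoretic acceptance rule; the adaptive DREAM proposals of the MATLAB toolbox are not reproduced, and the toy simulator, limits, and tuning constants are hypothetical:

```python
# Random-walk Metropolis targeting a uniform distribution over the behavioural
# set: a proposal is accepted whenever the simulation stays within the limits
# of acceptability at every observation.
import numpy as np

def model(theta, t):
    """Toy simulator: exponential recession curve with two parameters."""
    return theta[0] * np.exp(-theta[1] * t)

def within_limits(theta, t, obs, tol):
    """Set-theoretic 'likelihood': inside the limits of acceptability everywhere?"""
    return np.all(np.abs(model(theta, t) - obs) <= tol)

rng = np.random.default_rng(7)
t = np.linspace(0.0, 5.0, 20)
obs = model(np.array([2.0, 0.7]), t) + rng.normal(scale=0.05, size=t.size)
tol = 0.2                                          # observational error bounds

theta = np.array([2.0, 0.7])                       # start inside the behavioural set
draws = []
for _ in range(20_000):
    prop = theta + rng.normal(scale=0.05, size=2)  # symmetric random-walk proposal
    if within_limits(prop, t, obs, tol):           # accept any behavioural proposal
        theta = prop
    draws.append(theta.copy())

samples = np.array(draws)
print("behavioural ranges:", samples.min(axis=0).round(2), samples.max(axis=0).round(2))
```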
By: | Markus Heinrich; Magnus Reif |
Abstract: | This paper provides a detailed assessment of the real-time forecast accuracy of a wide range of vector autoregressive models (VARs) that allow for both structural change and indicators sampled at different frequencies. We extend the literature by evaluating a mixed-frequency time-varying parameter VAR with stochastic volatility (MF-TVP-SV-VAR). Overall, the MF-TVP-SV-VAR delivers accurate nowcasts and forecasts and, on average, outperforms its competitors. We assess the models’ accuracy relative to expert forecasts and show that the MF-TVP-SV-VAR delivers better inflation nowcasts in this regard. Moreover, using an optimal prediction pool, we demonstrate that the MF-TVP-SV-VAR has gained importance since the Great Recession. |
Keywords: | time-varying parameters, forecasting, nowcasting, mixed-frequency models, Bayesian methods |
JEL: | C11 C53 C55 E32 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:ces:ceswps:_8054&r=all |
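A minimal sketch of the optimal prediction pool mentioned above, in the spirit of Geweke and Amisano: choose simplex weights over models to maximize the sum of log pooled predictive densities at the realizations. The predictive density values below are simulated placeholders rather than output from the paper's VARs:

```python
# Optimal (log-score) prediction pool: maximize sum_t log( sum_m w_m * p_mt )
# over weights on the unit simplex.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(8)
T = 200
y = rng.normal(size=T)
# hypothetical one-step-ahead predictive densities from two competing models
p = np.column_stack([norm.pdf(y, loc=0.0, scale=1.0),      # well-calibrated model
                     norm.pdf(y, loc=0.3, scale=1.5)])     # biased, overdispersed model

def neg_log_score(w):
    return -np.sum(np.log(p @ w))

res = minimize(neg_log_score, x0=np.array([0.5, 0.5]),
               bounds=[(0.0, 1.0)] * 2,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
print("optimal pool weights:", res.x.round(3))
```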
By: | Yoshimasa Uematsu; Takashi Yamagata
Abstract: | This paper proposes a novel estimation method for weak factor models, a slightly stronger version of the approximate factor models of Chamberlain and Rothschild (1983), with large cross-sectional and time-series dimensions (N and T, respectively). It assumes that the kth largest eigenvalue of the data covariance matrix grows proportionally to N^αk with unknown exponents 0 < αk ≤ 1. |
Date: | 2020–03 |
URL: | http://d.repec.org/n?u=RePEc:toh:dssraa:108&r=all |
By: | Helmut Farbmacher; Martin Huber; Henrika Langen; Martin Spindler |
Abstract: | This paper combines causal mediation analysis with double machine learning to control for observed confounders in a data-driven way under a selection-on-observables assumption in a high-dimensional setting. We consider the average indirect effect of a binary treatment operating through an intermediate variable (or mediator) on the causal path between the treatment and the outcome, as well as the unmediated direct effect. Estimation is based on efficient score functions, which possess a multiple robustness property w.r.t. misspecifications of the outcome, mediator, and treatment models. This property is key for selecting these models by double machine learning, which is combined with data splitting to prevent overfitting in the estimation of the effects of interest. We demonstrate that the direct and indirect effect estimators are asymptotically normal and root-n consistent under specific regularity conditions and investigate the finite sample properties of the suggested methods in a simulation study when considering lasso as machine learner. We also provide an empirical application to the U.S. National Longitudinal Survey of Youth, assessing the indirect effect of health insurance coverage on general health operating via routine checkups as mediator, as well as the direct effect. We find a moderate short term effect of health insurance coverage on general health which is, however, not mediated by routine checkups. |
Date: | 2020–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2002.12710&r=all |
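A minimal sketch of a cross-fitted doubly robust (AIPW) score for the total effect of a binary treatment, with generic scikit-learn learners for the nuisance models. The paper's efficient scores additionally involve mediator models to separate direct and indirect effects, which is not reproduced here, and the simulated data are hypothetical:

```python
# Double machine learning for the average treatment effect: fit nuisance models
# on one fold, evaluate the doubly robust score on the other, and average.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV, LassoCV
from sklearn.model_selection import KFold

def dml_ate(y, d, X, n_folds=2, seed=0):
    scores = np.zeros(len(y))
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        ps = LogisticRegressionCV(max_iter=2000).fit(X[train], d[train])
        p = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)    # propensity score
        mu1 = LassoCV().fit(X[train][d[train] == 1], y[train][d[train] == 1]).predict(X[test])
        mu0 = LassoCV().fit(X[train][d[train] == 0], y[train][d[train] == 0]).predict(X[test])
        scores[test] = (mu1 - mu0
                        + d[test] * (y[test] - mu1) / p
                        - (1 - d[test]) * (y[test] - mu0) / (1 - p))  # doubly robust score
    return scores.mean(), scores.std() / np.sqrt(len(y))

# illustrative use on simulated data (hypothetical DGP with confounding)
rng = np.random.default_rng(9)
n, k = 1000, 20
X = rng.normal(size=(n, k))
d = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
y = 0.5 * d + X[:, 0] + X[:, 1] + rng.normal(size=n)
ate, se = dml_ate(y, d, X)
print(f"ATE estimate = {ate:.3f} (se approx. {se:.3f})")
```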
By: | Anton Gerunov (Faculty of Economics and Business Administration, Sofia University "St. Kliment Ohridski")
Abstract: | This article investigates the performance of 136 different classification algorithms for economic problems of binary choice. They are applied to model five different choice situations: consumer acceptance during a direct marketing campaign, predicting default on credit card debt, credit scoring, forecasting firm insolvency, and modelling online consumer purchases. Algorithms are trained to generate class predictions of a given binary target variable, which are then used to measure their forecast accuracy using the area under a ROC curve. Results show that algorithms of the Random Forest family consistently outperform alternative methods and may thus be suitable for modelling a wide range of discrete choice situations. |
Keywords: | Discrete choice, classification, machine learning algorithms, modelling decisions
JEL: | C35 C44 C45 D81 |
Date: | 2020–03 |
URL: | http://d.repec.org/n?u=RePEc:sko:wpaper:bep-2020-02&r=all |
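A minimal sketch of the comparison exercise on synthetic data: a few off-the-shelf classifiers are evaluated by cross-validated area under the ROC curve, the accuracy measure used in the article (which benchmarks 136 algorithms on five real choice datasets):

```python
# Compare several binary classifiers on a choice-type problem by cross-validated ROC AUC.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)   # imbalanced binary choice
models = {
    "logit": LogisticRegression(max_iter=2000),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
for name, clf in models.items():
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name:18s} mean AUC = {auc.mean():.3f} (+/- {auc.std():.3f})")
```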
By: | Knaus, Michael C. |
Abstract: | This paper consolidates recent methodological developments based on Double Machine Learning (DML) with a focus on program evaluation under unconfoundedness. DML-based methods leverage flexible prediction methods to control for confounding in the estimation of (i) standard average effects, (ii) different forms of heterogeneous effects, and (iii) optimal treatment assignment rules. We emphasize that these estimators all build on the same doubly robust score, which allows computational synergies to be exploited. An evaluation of multiple programs of the Swiss Active Labor Market Policy shows how DML-based methods enable a comprehensive policy analysis. However, we find evidence that estimates of individualized heterogeneous effects can become unstable. |
Keywords: | Causal machine learning, conditional average treatment effects, optimal policy learning, individualized treatment rules, multiple treatments |
JEL: | C21 |
Date: | 2020–03 |
URL: | http://d.repec.org/n?u=RePEc:usg:econwp:2020:04&r=all |
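A minimal sketch of how one doubly robust score can feed the three layers above: cross-fitted AIPW scores give the average effect, regressing the scores on covariates approximates conditional average treatment effects, and the sign of the predicted effect yields a simple assignment rule. Learners and data are illustrative, not the paper's estimators:

```python
# Reuse one cross-fitted doubly robust score for (i) the average effect,
# (ii) heterogeneous effects via a regression of the scores on covariates,
# and (iii) a simple treatment rule based on the predicted effect sign.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import KFold

def aipw_scores(y, d, X, n_folds=2, seed=0):
    psi = np.zeros(len(y))
    for tr, te in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        p = np.clip(LogisticRegressionCV(max_iter=2000).fit(X[tr], d[tr])
                    .predict_proba(X[te])[:, 1], 0.01, 0.99)
        mu1 = RandomForestRegressor(random_state=0).fit(X[tr][d[tr] == 1], y[tr][d[tr] == 1]).predict(X[te])
        mu0 = RandomForestRegressor(random_state=0).fit(X[tr][d[tr] == 0], y[tr][d[tr] == 0]).predict(X[te])
        psi[te] = mu1 - mu0 + d[te] * (y[te] - mu1) / p - (1 - d[te]) * (y[te] - mu0) / (1 - p)
    return psi

# illustrative use: effect varies with the second covariate (hypothetical DGP)
rng = np.random.default_rng(10)
n = 2000
X = rng.normal(size=(n, 5))
d = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
y = (1.0 + X[:, 1]) * d + X[:, 0] + rng.normal(size=n)

psi = aipw_scores(y, d, X)
print("average effect:", round(psi.mean(), 3))
cate_model = RandomForestRegressor(random_state=0).fit(X, psi)   # regress scores on covariates
rule = cate_model.predict(X) > 0                                  # treat if predicted effect positive
print("share assigned to treatment:", round(rule.mean(), 3))
```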
By: | Duc P. Truong; Erik Skau; Vladimir I. Valtchinov; Boian S. Alexandrov |
Abstract: | Currently, high-dimensional data is ubiquitous in data science, which necessitates the development of techniques to decompose and interpret such multidimensional (aka tensor) datasets. Finding a low-dimensional representation of the data, that is, its inherent structure, is one approach to understanding the dynamics of low-dimensional latent features hidden in the data. Nonnegative RESCAL is one such technique, particularly well suited to analyzing self-relational data, such as dynamic networks found in international trade flows. Nonnegative RESCAL computes a low-dimensional tensor representation by finding the latent space containing multiple modalities. Estimating the dimensionality of this latent space is crucial for extracting meaningful latent features. Here, to determine the dimensionality of the latent space with nonnegative RESCAL, we propose a latent dimension determination method which is based on clustering of the solutions of multiple realizations of nonnegative RESCAL decompositions. We demonstrate the performance of our model selection method on synthetic data, and then we apply our method to decompose a network of international trade flows data from the International Monetary Fund and validate the resulting features against empirical facts from the economic literature. |
Date: | 2020–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2003.00129&r=all |
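A minimal sketch of the dimension-selection idea on a matrix rather than a tensor: scikit-learn's NMF stands in for nonnegative RESCAL, multiple random restarts are run for each candidate rank, the resulting latent factors are clustered, and the silhouette score serves as a generic stability criterion (a stand-in for the paper's clustering-based procedure):

```python
# Choose the latent dimension by the stability of repeated nonnegative
# factorizations: for each candidate rank, cluster the normalized factors from
# several random restarts and score the clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(11)
W_true = rng.random((100, 3))
H_true = rng.random((3, 40))
X = W_true @ H_true + 0.01 * rng.random((100, 40))       # synthetic rank-3 nonnegative data

for rank in (2, 3, 4, 5):
    factors = []
    for seed in range(8):                                 # multiple random restarts
        model = NMF(n_components=rank, init="random", random_state=seed, max_iter=500)
        W = model.fit_transform(X)
        W = W / (np.linalg.norm(W, axis=0, keepdims=True) + 1e-12)
        factors.append(W.T)                               # collect normalized factor columns
    cols = np.vstack(factors)                             # all latent factors, all restarts
    labels = KMeans(n_clusters=rank, n_init=10, random_state=0).fit_predict(cols)
    print(f"rank {rank}: silhouette = {silhouette_score(cols, labels):.3f}")
```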