Econometrics
http://lists.repec.org/mailman/listinfo/nep-ecm
Econometrics
2019-06-24
Saddlepoint Approximations for Spatial Panel Data Models
http://d.repec.org/n?u=RePEc:chf:rpseri:rp1918&r=ecm
We develop new higher-order asymptotic techniques for the Gaussian maximum likelihood estimator of the parameters in a spatial panel data model, with fixed effects, time-varying covariates, and spatially correlated errors. We introduce a new saddlepoint density and tail area approximation to improve on the accuracy of the extant asymptotics. It features relative error of order O(m^-1) for m = n(T - 1), with n being the cross-sectional dimension and T the time-series dimension. The main theoretical tool is the tilted-Edgeworth technique. It yields a density approximation that is always non-negative, does not need resampling, and is accurate in the tails. We provide an algorithm to implement our saddlepoint approximation and we illustrate the good performance of our method via numerical examples. Monte Carlo experiments show that, for the spatial panel data model with fixed effects and T = 2, the saddlepoint approximation yields accuracy improvements over the routinely applied first-order asymptotics and Edgeworth expansions, in small to moderate sample sizes, while preserving analytical tractability. An empirical application on the investment-saving relationship in OECD countries shows disagreement between testing results based on first-order asymptotics and saddlepoint techniques, which questions some implications based on the former.
Chaonan Jiang
Davide La Vecchia
Elvezio Ronchetti
O. Scaillet
Spatial statistics, Panel data, Small samples, Saddlepoint approximation
2019-03
The Confidence Interval Method for Selecting Valid Instrumental Variables
http://d.repec.org/n?u=RePEc:bri:uobdis:19/715&r=ecm
We propose a new method, the confidence interval (CI) method, to select valid instruments from a set of potential instruments that may contain invalid ones, for instrumental variables estimation of the causal effect of an exposure on an outcome. Invalid instruments are those that fail the exclusion restriction and enter the model as explanatory variables. The CI method is based on the confidence intervals of the per-instrument causal effect estimates. Each instrument-specific causal effect estimate is obtained whilst treating all other instruments as invalid. The CI method selects the largest group with all confidence intervals overlapping with each other as the set of valid instruments. Under a plurality rule, we show that the resulting IV, or two-stage least squares (2SLS), estimator has oracle properties, meaning that it has the same limiting distribution as the oracle 2SLS estimator with the set of invalid instruments known. This result is the same as for the hard thresholding with voting (HT) method of Guo et al. (2018). Unlike the HT method, the number of instruments selected as valid by the CI method is guaranteed to be monotonically decreasing for decreasing values of the tuning parameter, which determines the width of the confidence intervals. For the CI method, we can therefore use a downward testing procedure based on the Sargan test for overidentifying restrictions. In a simulation design similar to that of Guo et al. (2018), we find better properties for CI-method-based estimation and inference than for the HT method, and in an application of the effect of BMI on blood pressure we find that the CI method is better able to detect invalid instruments.
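The interval-overlap selection step lends itself to a compact sketch. The following is a stylized illustration (the function name and the endpoint scan are ours, not the authors' implementation): given per-instrument estimates and standard errors, it returns the largest group of instruments whose confidence intervals all mutually overlap. In one dimension, pairwise-overlapping intervals always share a common point (Helly's theorem), so checking the left endpoints as candidate common points suffices.

```python
def select_valid(estimates, ses, z=1.96):
    """Largest set of instruments with mutually overlapping CIs.

    Pairwise-overlapping 1-D intervals share a common point, so the
    largest mutually overlapping group is the largest set of intervals
    covering some left endpoint.
    """
    intervals = [(b - z * s, b + z * s) for b, s in zip(estimates, ses)]
    best = []
    for lo, _ in intervals:  # each left endpoint is a candidate common point
        group = [j for j, (l, u) in enumerate(intervals) if l <= lo <= u]
        if len(group) > len(best):
            best = group
    return best
```

With three per-instrument estimates near 1.0 and one near 3.0, the first three intervals overlap and are selected as the valid set.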
Frank Windmeijer
Xiaoran Liang
Fernando P Hartwig
Jack Bowden
Causal inference; Instrumental variables; Invalid instruments
2019-06-17
Estimation and Inference for Multi-dimensional Heterogeneous Panel Datasets with Hierarchical Multi-factor Error Structure
http://d.repec.org/n?u=RePEc:bai:series:series_wp_03-2019&r=ecm
Given the growing availability of large datasets and following recent research trends on multi-dimensional modelling, we develop three dimensional (3D) panel data models with hierarchical error components that allow for strong cross-sectional dependence through unobserved heterogeneous global and local factors. We propose consistent estimation procedures by extending the common correlated effects (CCE) estimation approach proposed by Pesaran (2006). The standard CCE approach needs to be modified in order to account for the hierarchical factor structure in 3D panels. Further, we provide the associated asymptotic theory, including new nonparametric variance estimators. The validity of the proposed approach is confirmed by Monte Carlo simulation studies. We also demonstrate the empirical usefulness of the proposed approach through an application to a 3D panel gravity model of bilateral export flows.
George Kapetanios
Laura Serlenga
Yongcheol Shin
Multi-dimensional Panel Data Models, Cross-sectional Error Dependence, Unobserved Heterogeneous Global and Local Factors, Multilateral Resistance, The Gravity Model of Bilateral Export Flows
2019-06
The multivariate simultaneous unobserved components model and identification via heteroskedasticity
http://d.repec.org/n?u=RePEc:uts:ecowps:2019/08&r=ecm
We propose a multivariate simultaneous unobserved components framework to determine the two-sided interactions between structural trend and cycle innovations. We relax the standard assumption in unobserved components models that trends are only driven by permanent shocks and cycles are only driven by transitory shocks by considering the possible spillover effects between structural innovations. The direction of spillover has a structural interpretation, whose identification is achieved via heteroskedasticity. We provide identifiability conditions and develop an efficient Bayesian MCMC procedure for estimation. Empirical implementations for both Okun’s law and the Phillips curve show evidence of significant spillovers between trend and cycle components.
Mengheng Li
Ivan Mendieta-Munoz
Unobserved components; identification via heteroskedasticity; trends and cycles; permanent and transitory shocks; state space models; spillover structural effects
2019-06-04
Posterior Average Effects
http://d.repec.org/n?u=RePEc:arx:papers:1906.06360&r=ecm
Economists are often interested in computing averages with respect to a distribution of unobservables. Examples are moments or distributions of individual fixed-effects, average partial effects in discrete choice models, or counterfactual policy simulations based on a structural model. We consider posterior estimators of such effects, where the average is computed conditional on the observation sample. While in various settings it is common to "shrink" individual estimates -- e.g., of teacher value-added or hospital quality -- toward a common mean to reduce estimation noise, a study of the frequentist properties of posterior average estimators is lacking. We establish two robustness properties of posterior estimators under misspecification of the assumed distribution of unobservables: they are optimal in terms of local worst-case bias, and their global bias is no larger than twice the minimum worst-case bias that can be achieved within a large class of estimators. These results provide a theoretical foundation for the use of posterior average estimators. In addition, our theory suggests a simple measure of the information contained in the posterior conditioning. For illustration, we consider two empirical settings: the estimation of the distribution of neighborhood effects in the US, and the estimation of the densities of permanent and transitory components in a model of income dynamics.
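The "shrinking individual estimates toward a common mean" that the abstract refers to can be illustrated with the textbook normal-normal posterior mean. This is a generic empirical-Bayes sketch, not the paper's estimators; the prior mean `mu` and prior spread `tau` are hypothetical inputs.

```python
def posterior_means(estimates, ses, mu, tau):
    """Normal-normal posterior means: each noisy estimate y_i with
    standard error s_i is pulled toward the prior mean mu, with more
    shrinkage when s_i is large relative to the prior spread tau."""
    return [(tau**2 * y + s**2 * mu) / (tau**2 + s**2)
            for y, s in zip(estimates, ses)]
```

For example, `posterior_means([0.0, 2.0], [1.0, 1.0], mu=1.0, tau=1.0)` pulls both estimates halfway toward the common mean 1.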
Stéphane Bonhomme
Martin Weidner
2019-06
A Correction for Regression Discontinuity Designs with Group-Specific Mismeasurement of the Running Variable
http://d.repec.org/n?u=RePEc:iza:izadps:dp12366&r=ecm
When the running variable in a regression discontinuity (RD) design is measured with error, identification of the local average treatment effect of interest will typically fail. While the form of this measurement error varies across applications, in many cases the measurement error structure is heterogeneous across different groups of observations. We develop a novel measurement error correction procedure capable of addressing heterogeneous mismeasurement structures by leveraging auxiliary information. We also provide adjusted asymptotic variance and standard errors that take into consideration the variability introduced by the estimation of nuisance parameters, and honest confidence intervals that account for potential misspecification. Simulations provide evidence that the proposed procedure corrects the bias introduced by heterogeneous measurement error and achieves empirical coverage closer to nominal test size than "naïve" alternatives. Two empirical illustrations demonstrate that correcting for measurement error can either reinforce the results of a study or provide a new empirical perspective on the data.
Bartalotti, Otávio
Brummet, Quentin
Dieterle, Steven G.
nonclassical measurement error, regression discontinuity, heterogeneous measurement error
2019-05
High-Dimensional Functional Factor Models
http://d.repec.org/n?u=RePEc:eca:wpaper:2013/288340&r=ecm
In this paper, we set up the theoretical foundations for a high-dimensional functional factor model approach in the analysis of large panels of functional time series (FTS). We first establish a representation result stating that if the first r eigenvalues of the covariance operator of a cross-section of N FTS are unbounded as N diverges and if the (r + 1)th one is bounded, then we can represent each FTS as a sum of a common component driven by r factors, common to (almost) all the series, and a weakly cross-correlated idiosyncratic component (all the eigenvalues of the idiosyncratic covariance operator are bounded as N → ∞). Our model and theory are developed in a general Hilbert space setting that allows for panels mixing functional and scalar time series. We then turn to the estimation of the factors, their loadings, and the common components. We derive consistency results in the asymptotic regime where the number N of series and the number T of time observations diverge, thus exemplifying the “blessing of dimensionality” that explains the success of factor models in the context of high-dimensional (scalar) time series. Our results encompass the scalar case, for which they reproduce and extend, under weaker conditions, well-established results (Bai & Ng 2002). We provide numerical illustrations that corroborate the convergence rates predicted by the theory, and provide finer understanding of the interplay between N and T for estimation purposes. We conclude with an empirical illustration on a dataset of intraday S&P100 and Eurostoxx 50 stock returns, along with their scalar overnight returns.
Marc Hallin
Gilles Nisol
Shahin Tavakoli
Functional time series, High-dimensional time series, Factor model, Panel data, Functional data analysis
2019-06
Uniform Consistency of Marked and Weighted Empirical Distributions of Residuals
http://d.repec.org/n?u=RePEc:aah:create:2019-12&r=ecm
A uniform weak consistency theory is presented for the marked and weighted empirical distribution function of residuals. New and weaker sufficient conditions for uniform consistency are derived. The theory allows for a wide variety of regressors and error distributions. We apply the theory to 1-step Huber-skip estimators. These estimators describe the widespread practice of removing outlying observations from an initial estimation of the model of interest and updating the estimation in a second step by applying least squares to the selected observations. Two results are presented. First, we give new and weaker conditions for consistency of the estimators. Second, we analyze the gauge, which is the rate of false detection of outliers, and which can be used to decide the cut-off in the rule for selecting outliers.
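A minimal pure-Python sketch of the two-step practice described above, for a simple regression. The cutoff constant and the crude residual-scale estimate are simplified choices for illustration, not the paper's recommendations.

```python
import statistics

def ols(x, y):
    """Closed-form OLS for a simple regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def huber_skip(x, y, c=2.0):
    """1-step Huber-skip: fit OLS, drop observations whose residuals
    exceed c times the residual scale, then re-fit OLS on the rest."""
    a0, b0 = ols(x, y)
    res = [yi - (a0 + b0 * xi) for xi, yi in zip(x, y)]
    s = statistics.stdev(res)                    # crude scale estimate
    keep = [i for i, r in enumerate(res) if abs(r) <= c * s]
    return ols([x[i] for i in keep], [y[i] for i in keep]), keep
```

On data generated from y = 1 + 2x with one large outlier, the second step recovers the clean line once the outlier is skipped.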
Vanessa Berenguer-Rico
Søren Johansen
Bent Nielsen
1-step Huber skip, Asymptotic theory, Empirical processes, Gauge, Marked and Weighted Empirical processes, Non-stationarity, Robust Statistics, Stationarity
2019-05-24
Detecting p-hacking
http://d.repec.org/n?u=RePEc:arx:papers:1906.06711&r=ecm
We analyze what can be learned from tests for p-hacking based on distributions of t-statistics and p-values across multiple studies. We analytically characterize restrictions on these distributions that conform with the absence of p-hacking. This forms a testable null hypothesis and suggests statistical tests for p-hacking. We extend our results to p-hacking when there is also publication bias, and also consider what types of distributions arise under the alternative hypothesis that researchers engage in p-hacking. We show that the power of statistical tests for detecting p-hacking is low even if p-hacking is quite prevalent.
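One simple test in this spirit is the "caliper" comparison around a significance threshold — our illustration of the general idea, not the specific tests derived in the paper. If the p-value density is smooth near 0.05, counts just below and just above the threshold should be roughly equal; a significant excess just below suggests p-hacking.

```python
from math import comb

def caliper_test(pvals, threshold=0.05, width=0.005):
    """Count p-values just below vs. just above the threshold and return
    the one-sided exact binomial p-value for an excess just below.

    Under no p-hacking (locally flat p-value density), the count just
    below is Binomial(n, 1/2) given the total n in the caliper."""
    below = sum(threshold - width <= p < threshold for p in pvals)
    above = sum(threshold <= p < threshold + width for p in pvals)
    n = below + above
    pval = sum(comb(n, k) for k in range(below, n + 1)) / 2 ** n
    return below, above, pval
```

Nine studies at p = 0.046 against one at p = 0.052 gives an exact binomial p-value of 11/1024 ≈ 0.011, flagging bunching below 0.05.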
Graham Elliott
Nikolay Kudrin
Kaspar Wuthrich
2019-06
On the Properties of the Synthetic Control Estimator with Many Periods and Many Controls
http://d.repec.org/n?u=RePEc:arx:papers:1906.06665&r=ecm
We consider the asymptotic properties of the Synthetic Control (SC) estimator when both the number of pre-treatment periods and control units are large. If potential outcomes follow a linear factor model, we provide conditions under which the factor loadings of the SC unit converge in probability to the factor loadings of the treated unit. This happens when the weights are diluted among many control units, so that a weighted average of the factor loadings of the control units reconstructs the factor loadings of the treated unit. In this case, the SC estimator is asymptotically unbiased even when treatment assignment is correlated with time-varying unobservables. This result can be valid even when the number of control units is larger than the number of pre-treatment periods.
Bruno Ferman
2019-06
lpdensity: Local Polynomial Density Estimation and Inference
http://d.repec.org/n?u=RePEc:arx:papers:1906.06529&r=ecm
Density estimation and inference methods are widely used in empirical work. When the data has compact support, as all empirical applications de facto do, conventional kernel-based density estimators are inapplicable near or at the boundary because of their well-known boundary bias. Alternative smoothing methods are available to handle boundary points in density estimation, but they all require additional tuning parameter choices or other typically ad hoc modifications depending on the evaluation point and/or approach considered. This article discusses the R and Stata package lpdensity implementing a novel local polynomial density estimator proposed in Cattaneo, Jansson and Ma (2019), which is boundary adaptive, fully data-driven and automatic, and requires only the choice of one tuning parameter. The methods implemented also cover local polynomial estimation of the cumulative distribution function and density derivatives, as well as several other theoretical and methodological results. In addition to point estimation and graphical procedures, the package offers consistent variance estimators, mean squared error optimal bandwidth selection, and robust bias-corrected inference. A Monte Carlo comparison with several other density estimation packages and functions available in R is provided.
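The core idea — local polynomial regression of the empirical distribution function, whose slope estimates the density — can be sketched in a few lines. This is a stylized interior-point version of that idea for intuition only, not the package's actual boundary-adaptive, bias-corrected implementation; the function name and defaults are ours.

```python
import random

def lpdens(data, x, h):
    """Local-linear fit of the empirical CDF around x; the slope of the
    fitted line estimates the density f(x)."""
    xs = sorted(data)
    n = len(xs)
    S0 = S1 = S2 = T0 = T1 = 0.0
    for i, xi in enumerate(xs):
        u = (xi - x) / h
        if abs(u) >= 1.0:
            continue                      # outside the kernel window
        w = 0.75 * (1.0 - u * u)          # Epanechnikov kernel weight
        d = xi - x
        Fi = (i + 1) / n                  # empirical CDF at xi
        S0 += w
        S1 += w * d
        S2 += w * d * d
        T0 += w * Fi
        T1 += w * d * Fi
    # weighted least-squares slope of Fi on d
    return (S0 * T1 - S1 * T0) / (S0 * S2 - S1 * S1)

# demo on a Uniform(0, 1) sample: the true density at 0.5 is 1
random.seed(1)
sample = [random.random() for _ in range(20000)]
est = lpdens(sample, 0.5, h=0.1)
```

On a large uniform sample the estimate at an interior point lands close to the true density of 1.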
Matias D. Cattaneo
Michael Jansson
Xinwei Ma
2019-06
Statistical Tests for Cross-Validation of Kriging Models
http://d.repec.org/n?u=RePEc:tiu:tiucen:35fba511-2931-47d5-a9ba-30b1229e9093&r=ecm
We derive new statistical tests for leave-one-out cross-validation of Kriging models. Graphically, we present these tests as scatterplots augmented with confidence intervals. We may wish to avoid extrapolation, which we define as prediction of the output for a point that is a vertex of the convex hull of the given input combinations. Moreover, we may use bootstrapping to estimate the true variance of the Kriging predictor. The resulting tests (with or without extrapolation or bootstrapping) have type-I and type-II error probabilities, which we estimate through Monte Carlo experiments. To illustrate the application of our tests, we use an example with two inputs and the popular borehole example with eight inputs.
Kleijnen, Jack
van Beers, W.C.M.
validation; cross-validation; Kriging; Gaussian process; extrapolation; convex hull; Monte Carlo Technique
2019
Online Block Layer Decomposition schemes for training Deep Neural Networks
http://d.repec.org/n?u=RePEc:aeg:report:2019-06&r=ecm
Estimating the weights of Deep Feedforward Neural Networks (DFNNs) requires solving a very large nonconvex optimization problem that may have many local (non-global) minimizers, saddle points, and large plateaus. Furthermore, the time needed to find good solutions to the training problem depends heavily on both the number of samples and the number of weights (variables). In this work, we show how Block Coordinate Descent (BCD) methods can be applied to improve the performance of state-of-the-art algorithms by avoiding bad stationary points and flat regions. We first describe a batch BCD method able to effectively tackle difficulties due to the network's depth; we then extend the algorithm by proposing an online BCD scheme able to scale with respect to both the number of variables and the number of samples. We report extensive numerical results on standard datasets using different deep networks, and we show how applying (online) BCD methods to the training phase of DFNNs makes it possible to outperform standard batch/online algorithms, improving both the training phase and the generalization performance of the networks.
Laura Palagi
Ruggiero Seccia
Deep Feedforward Neural Networks ; Block coordinate decomposition ; Online Optimization ; Large scale optimization
2019
Nonparametric estimation in a regression model with additive and multiplicative noise
http://d.repec.org/n?u=RePEc:arx:papers:1906.07695&r=ecm
In this paper, we consider an unknown functional estimation problem in a general nonparametric regression model with the characteristic of having both multiplicative and additive noise. We propose two wavelet estimators, which, to our knowledge, are new in this general context. We prove that they achieve fast convergence rates under the mean integrated square error over Besov spaces. The rates obtained have the particularity of being established under weak conditions on the model. A numerical study in a context comparable to stochastic frontier estimation (with the difference that the boundary is not necessarily a production function) supports the theory.
Christophe Chesneau
Salima El Kolei
Junke Kou
Fabien Navarro
2019-06
Pareto Models for Top Incomes
http://d.repec.org/n?u=RePEc:hal:cesptp:hal-02145024&r=ecm
Top incomes are often modelled with a Pareto distribution. To date, economists have mostly used the Pareto Type I distribution to model the upper tail of the income and wealth distribution. It is a parametric distribution with attractive properties that can easily be linked to economic theory. In this paper, we first show that modelling top incomes with the Pareto Type I distribution can lead to severe over-estimation of inequality, even with millions of observations. We then show that the Generalized Pareto distribution and, even more so, the Extended Pareto distribution are much less sensitive to the choice of the threshold and thus provide more reliable results. We discuss different types of bias that could be encountered in empirical studies and provide some guidance for practice. To illustrate, two applications are investigated: the distribution of income in South Africa in 2012 and the distribution of wealth in the United States in 2013.
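A minimal illustration of the Pareto Type I building block the abstract refers to: the maximum-likelihood (Hill) estimate of the tail index above a threshold, and the Gini coefficient it implies. This is only the Type I benchmark; the Generalized and Extended Pareto fits that the paper recommends are not shown.

```python
import math
import random

def pareto_alpha_mle(x, u):
    """MLE (Hill) estimate of the Pareto Type I tail index above threshold u."""
    tail = [xi for xi in x if xi >= u]
    return len(tail) / sum(math.log(xi / u) for xi in tail)

def pareto_gini(alpha):
    """Gini coefficient implied by a Pareto Type I distribution (alpha > 1)."""
    return 1.0 / (2.0 * alpha - 1.0)

# demo: simulate Pareto I incomes by inverse transform, x = u * U**(-1/alpha)
random.seed(0)
alpha_true, u = 2.0, 1.0
incomes = [u * random.random() ** (-1.0 / alpha_true) for _ in range(50000)]
alpha_hat = pareto_alpha_mle(incomes, u)
```

With 50,000 simulated draws the estimated tail index is close to the true value of 2, implying a Gini of about 1/3 for the tail.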
Arthur Charpentier
Emmanuel Flachaire
Pareto distribution, top incomes, inequality measures
2019-05-31
A Flexible Regime Switching Model for Asset Returns
http://d.repec.org/n?u=RePEc:chf:rpseri:rp1927&r=ecm
A non-Gaussian multivariate regime switching dynamic correlation model for financial asset returns is proposed. It incorporates the multivariate generalized hyperbolic law for the conditional distribution of returns. All model parameters are estimated consistently using a new two-stage expectation-maximization algorithm that also allows for incorporation of shrinkage estimation via quasi-Bayesian priors. It is shown that use of Markov switching correlation dynamics not only leads to highly accurate risk forecasts, but also potentially reduces the regulatory capital requirements during periods of distress. In terms of portfolio performance, the new regime switching model delivers consistently higher Sharpe ratios and smaller losses than the equally weighted portfolio and all competing models. Finally, the regime forecasts are employed in a dynamic risk control strategy that avoids most losses during the financial crisis and vastly improves risk-adjusted returns.
Marc S. Paolella
Pawel Polak
Patrick S. Walker
sGARCH; Markov Switching; Multivariate Generalized Hyperbolic Distribution; Portfolio Optimization; Value-at-Risk
2019-05
Partial Identification of Population Average and Quantile Treatment Effects in Observational Data under Sample Selection
http://d.repec.org/n?u=RePEc:idb:brikps:9520&r=ecm
We partially identify population treatment effects in observational data under sample selection, without the benefit of random treatment assignment. We provide bounds both for the average and the quantile population treatment effects, combining assumptions for the selected and the non-selected subsamples. We show how different assumptions help narrow identification regions, and illustrate our methods by partially identifying the effect of maternal education on the 2015 PISA math test scores in Brazil. We find that while sample selection increases considerably the uncertainty around the effect of maternal education, it is still possible to calculate informative identification regions.
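For intuition, the simplest worst-case (Manski-style) bounds for a population mean under sample selection can be computed as below. The paper's bounds combine additional assumptions for the selected and non-selected subsamples to narrow such regions, so this is only a generic illustration of partial identification, not the authors' method.

```python
def manski_bounds(y_selected, p_selected, y_min, y_max):
    """Worst-case bounds on E[Y] under sample selection: the selected
    subsample identifies E[Y | selected]; the mean of the non-selected
    share can sit anywhere in [y_min, y_max]."""
    m = sum(y_selected) / len(y_selected)
    return (p_selected * m + (1 - p_selected) * y_min,
            p_selected * m + (1 - p_selected) * y_max)
```

With an 80% selection rate, a selected-sample mean of 0.5, and outcomes bounded in [0, 1], the population mean is only bounded to [0.4, 0.6] — the width reflects the missing 20%.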
Christelis, Dimitris
Messina, Julián
2019-06
Sentiment-Driven Stochastic Volatility Model: A High-Frequency Textual Tool for Economists
http://d.repec.org/n?u=RePEc:arx:papers:1906.00059&r=ecm
We propose how to quantify high-frequency market sentiment using high-frequency news from NASDAQ news platform and support vector machine classifiers. News arrive at markets randomly and the resulting news sentiment behaves like a stochastic process. To characterize the joint evolution of sentiment, price, and volatility, we introduce a unified continuous-time sentiment-driven stochastic volatility model. We provide closed-form formulas for moments of the volatility and news sentiment processes and study the news impact. Further, we implement a simulation-based method to calibrate the parameters. Empirically, we document that news sentiment raises the threshold of volatility reversion, sustaining high market volatility.
Jozef Barunik
Cathy Yi-Hsuan Chen
Jan Vecer
2019-05
Export sophistication: A dynamic panel data approach
http://d.repec.org/n?u=RePEc:ara:wpaper:001&r=ecm
In this paper we analyze export sophistication based on a large panel dataset (2001–2015; 101 countries) using various estimation algorithms. Using Monte Carlo simulations, we evaluate the bias properties of the estimators and show that GMM-type estimators outperform instrumental-variable and fixed-effects estimators. Based on our analysis, we document that GDP per capita and the size of the economy exhibit significant positive effects on export sophistication, while weak institutional quality exhibits a negative effect. We also show that export sophistication is path-dependent and stable even during a major economic crisis, which is especially important for emerging and developing economies.
Evzen Kocenda
Karen Poghosyan
international trade; export sophistication; emerging and developing economies; specialization; dynamic panel data; Monte-Carlo simulation; panel data estimators
2017-11
Score estimation of monotone partially linear index model
http://d.repec.org/n?u=RePEc:cep:stiecm:603&r=ecm
Taisuke Otsu
Mengshan Xu
2019-05