nep-ecm 2016-07-02 papers

on Econometrics

Issue of 2016‒07‒02
sixteen papers chosen by
Sune Karlsson
Örebro universitet

Alternative HAC Covariance Matrix Estimators with Improved Finite Sample Properties By Luke Hartigan
Practical Kolmogorov-Smirnov Testing by Minimum Distance Applied to Measure Top Income Shares in Korea By JIN SEO CHO; MYUNG-HO PARK
Generalized State-Dependent Models: A Multivariate Approach By S. Heravi; J. Easaw; R. Golinelli
Multi-class vector autoregressive models for multi-store sales data By Ines Wilms; Luca Barbaglia; Christophe Croux
Using geographically weighted choice models to account for spatial heterogeneity of preferences By Wiktor Budziński; Danny Campbell; Mikołaj Czajkowski; Urška Demšar; Nick Hanley
Bias-corrected confidence intervals in a class of linear inverse problems By Jean-Pierre Florens; Joel Horowitz; Ingrid van Keilegom
Bounding average treatment effects using linear programming By Lukáš Lafférs
Prediction in a Generalized Spatial Panel Data Model with Serial Correlation By Badi Baltagi; Long Liu
Second-order corrected likelihood for nonlinear models with fixed effects By Yutao Sun
Testing for Speculative Bubbles in Large-Dimensional Financial Panel Data Sets By HORIE, Tetsushi; YAMAMOTO, Yohei
Quantile selection models: with an application to understanding changes in wage inequality By Manuel Arellano; Stéphane Bonhomme
Exact Smooth Term Structure Estimation By Damir Filipović; Sander Willems
Using String Invariants for Prediction Searching for Optimal Parameters By Marek Bundzel; Tomas Kasanicky; Richard Pincak
Revisiting the synthetic control estimator By Ferman, Bruno; Pinto, Cristine Campos de Xavier
Time-Varying Persistence of Inflation: Evidence from a Wavelet-Based Approach By Heni Boubaker; Giorgio Canarella; Rangan Gupta; Stephen M. Miller
The Null Distribution of the Empirical AUC for Classifiers with Estimated Parameters: a Special Case By Robert P. Lieli; Yu-Chin Hsu

Alternative HAC Covariance Matrix Estimators with Improved Finite Sample Properties

By:	Luke Hartigan (School of Economics, UNSW Business School, UNSW)
Abstract:	HAC estimators are known to produce test statistics that reject too frequently in finite samples. One neglected reason comes from using the OLS residuals when constructing the HAC estimator. If the regression matrix contains high leverage points, such as from outliers, then the OLS residuals will be negatively biased. This reduces the variance of the OLS residuals and the HAC estimator takes this to signal a more accurate coefficient estimate. Transformations to reflate the OLS residuals and offset the bias have been used in the related HC literature for many years, but these have been overlooked in the HAC literature. Using a suite of simulations I provide strong evidence in favour of replacing the OLS residual-based HAC estimator with estimators related to extensions of either of the two main HC alternatives. In an empirical application I show how different inference from using the alternative HAC estimators can be important, not only from a statistical perspective, but also from an economic one as well.
Keywords:	Covariance matrix estimation, Finite sample analysis, Leverage points, Autocorrelation, Hypothesis testing, Monte Carlo simulation, Inference
JEL:	C12 C13 C15 C22
Date:	2016–05
URL:	http://d.repec.org/n?u=RePEc:swe:wpaper:2016-06&r=ecm
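
Illustration (not taken from the paper): the idea of reflating OLS residuals before building a HAC estimator can be sketched as below. This is a minimal sketch under stated assumptions: a Bartlett (Newey-West) kernel, and the HC3-style adjustment that divides each residual by (1 - h_t), where h_t is the leverage of observation t. The function name hac_covariance, its arguments, and the simulated data are hypothetical, and the estimators studied in the paper may differ.

```python
import numpy as np

def hac_covariance(X, y, max_lag, leverage_adjust=True):
    """Bartlett-kernel HAC covariance of the OLS coefficients.

    With leverage_adjust=True the residuals are divided by (1 - h_t),
    an HC3-style reflation (an illustrative assumption, not necessarily
    the paper's estimator).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    if leverage_adjust:
        # h_t = x_t' (X'X)^{-1} x_t, the diagonal of the hat matrix
        h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
        resid = resid / (1.0 - h)
    scores = X * resid[:, None]          # row t is x_t * e_t
    S = scores.T @ scores                # lag-0 term
    for lag in range(1, max_lag + 1):
        w = 1.0 - lag / (max_lag + 1.0)  # Bartlett weight
        G = scores[lag:].T @ scores[:-lag]
        S += w * (G + G.T)
    return XtX_inv @ S @ XtX_inv

# Simulated data with one high-leverage observation (hypothetical example).
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
x[0] = 8.0
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + rng.normal(size=n)

V_plain = hac_covariance(X, y, max_lag=0, leverage_adjust=False)
V_adj = hac_covariance(X, y, max_lag=0, leverage_adjust=True)
V_lagged = hac_covariance(X, y, max_lag=4)
```

With max_lag=0 the adjusted variances on the diagonal cannot be smaller than the unadjusted ones, because every squared residual is scaled by 1/(1 - h_t)^2 >= 1. With positive lags the cross terms can have either sign, so no such ordering is guaranteed.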

Practical Kolmogorov-Smirnov Testing by Minimum Distance Applied to Measure Top Income Shares in Korea

By:	JIN SEO CHO (Yonsei University); MYUNG-HO PARK (Korea Institute of Public Finance)
Abstract:	We study Kolmogorov-Smirnov goodness of fit tests for evaluating distributional hypotheses where unknown parameters need to be fitted. Following work of Pollard (1980), our approach uses a Cramér-von Mises minimum distance estimator for parameter estimation. The asymptotic null distribution of the resulting test statistic is represented by invariance principle arguments as a functional of a Brownian bridge in a simple regression format for which asymptotic critical values are readily delivered by simulations. Asymptotic power is examined under fixed and local alternatives and finite sample performance of the test is evaluated in simulations. The test is applied to measure top income shares using Korean income tax return data over 2007 to 2012. When the data relate to estimating the upper 0.1% or higher income shares, the conventional assumption of a Pareto tail distribution cannot be rejected. But the Pareto tail hypothesis is rejected for estimating the top 1.0% or 0.5% income shares at the 5% significance level. A Supplement containing proofs and data descriptions is available online.
Keywords:	Distribution-free asymptotics, null distribution, minimum distance estimator, Cramér-von Mises distance, top income shares, Pareto interpolation
JEL:	C12 C13 D31 E01 O15
Date:	2016–06
URL:	http://d.repec.org/n?u=RePEc:yon:wpaper:2016rwp-88&r=ecm
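
Illustration (not taken from the paper): the Kolmogorov-Smirnov distance between a sample and a Pareto distribution can be computed as below. This minimal sketch assumes the Pareto parameters are known. The paper instead fits them by minimum distance, which changes the null distribution of the statistic. The names pareto_cdf, ks_statistic and the toy incomes are hypothetical.

```python
def pareto_cdf(x, x_min, alpha):
    """CDF of a Pareto distribution with scale x_min and tail index alpha."""
    return 0.0 if x < x_min else 1.0 - (x_min / x) ** alpha

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and a hypothesised CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Toy incomes (hypothetical), tested against Pareto(x_min=1, alpha=1).
incomes = [1.0, 2.0, 4.0]
d_n = ks_statistic(incomes, lambda x: pareto_cdf(x, 1.0, 1.0))
```

Here d_n is 1/3, attained at the smallest observation, where the empirical CDF is 1/3 and the hypothesised CDF is 0.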

Generalized State-Dependent Models: A Multivariate Approach

By:	S. Heravi; J. Easaw; R. Golinelli
Abstract:	The main purpose of this paper is to develop generalized ‘State Dependent Models’ (SDM) in a multivariate framework for empirical analysis. This significantly extends the existing SDM, which allow only univariate analysis following a simple AR process. The extended model broadens the scope for empirical analysis of economic relationships. The principal advantage of SDM is that it allows for a general form of non-linearity and can be fitted without any specific prior assumption about the form of non-linearity. We describe the general structure of the SDM and the problem of its identification is also considered. Finally, we apply the algorithm to show the impact of sentiment and income when modelling US consumption.
JEL:	C32 C51 E32
Date:	2016–05
URL:	http://d.repec.org/n?u=RePEc:bol:bodewp:wp1067&r=ecm

Multi-class vector autoregressive models for multi-store sales data

By:	Ines Wilms; Luca Barbaglia; Christophe Croux
Abstract:	Retailers use the Vector AutoRegressive (VAR) model as a standard tool to estimate the effects of prices, promotions and sales in one product category on the sales of another product category. Besides, these price, promotion and sales data are available for not just one store, but a whole chain of stores. We propose to study cross-category effects using a multi-class VAR model: we jointly estimate cross-category effects for several distinct but related VAR models, one for each store. Our methodology encourages effects to be similar across stores, while still allowing for small differences between stores to account for store heterogeneity. Moreover, our estimator is sparse: unimportant effects are estimated as exactly zero, which facilitates the interpretation of the results. A simulation study shows that the proposed multi-class estimator improves estimation accuracy by borrowing strength across classes. Finally, we provide three visual tools showing (i) the clustering of stores on identical cross-category effects, (ii) the networks of product categories and (iii) the similarity matrices of shared cross-category effects across stores.
Keywords:	Fused Lasso, Multi-class estimation, Multi-store sales application, Sparse estimation, Vector AutoRegressive model
Date:	2016–05
URL:	http://d.repec.org/n?u=RePEc:ete:kbiper:540947&r=ecm

Using geographically weighted choice models to account for spatial heterogeneity of preferences

By:	Wiktor Budziński (Faculty of Economic Sciences, University of Warsaw); Danny Campbell (University of Stirling, Stirling Management School); Mikołaj Czajkowski (Faculty of Economic Sciences, University of Warsaw); Urška Demšar (University of St Andrews, School of Geography and Geosciences); Nick Hanley (University of St Andrews, School of Geography and Geosciences)
Abstract:	In this paper we investigate the prospects of using geographically weighted choice models for modelling of spatially clustered preferences. The data used in this study come from a discrete choice experiment survey regarding public preferences for the implementation of a new country-wide forest management and protection program in Poland. We combine it with high-resolution geographical information system data related to local forest characteristics. Using locally estimated discrete choice models we obtain location-specific estimates of willingness to pay (WTP). Variation in these estimates is explained by the socio-demographic characteristics of respondents and characteristics of the forests in their place of residence. The results are compared with those obtained from a more typical, two stage procedure which uses Bayesian posterior means of the mixed logit model random parameters to calculate individual-specific estimates of WTP. The latter approach, although easier to implement and more common in the literature, does not explicitly assume any spatial relationship between individuals. In contrast, the geographically weighted approach differs in this aspect and can provide additional insight on spatial patterns of individuals’ preferences. Our study shows that although the geographically weighted discrete choice models have some advantages, they are not without drawbacks, such as the difficulty and subjectivity in choosing an appropriate bandwidth. We find a number of notable differences in WTP estimates and their spatial distributions. At the current level of development of the two techniques, we find mixed evidence on which approach gives the better results.
Keywords:	discrete choice experiment, contingent valuation, willingness to pay, spatial heterogeneity of preferences, forest management, passive protection, litter, tourist infrastructure, mixed logit, geographically weighted model, weighted maximum likelihood, local maximum likelihood
JEL:	Q23 Q28 I38 Q51 Q57 Q58
Date:	2016
URL:	http://d.repec.org/n?u=RePEc:war:wpaper:2016-17&r=ecm

Bias-corrected confidence intervals in a class of linear inverse problems

By:	Jean-Pierre Florens (Institute for Fiscal Studies); Joel Horowitz (Institute for Fiscal Studies and Northwestern University); Ingrid van Keilegom (Institute for Fiscal Studies)
Abstract:	In this paper we propose a novel method to construct confidence intervals in a class of linear inverse problems. First, point estimators are obtained via a spectral cut-off method depending on a regularisation parameter that determines the bias of the estimator. Next, the proposed confidence interval corrects for this bias by explicitly estimating it based on a second regularisation parameter, which is asymptotically smaller than the first. The coverage error of the interval is shown to converge to zero. The proposed method is illustrated via two simulation studies, one in the context of functional linear regression, and the second one in the context of instrumental regression.
Keywords:	Bias-correction; functional linear regression; instrumental regression; inverse problem; regularisation; spectral cut-off
Date:	2016–05–09
URL:	http://d.repec.org/n?u=RePEc:ifs:cemmap:19/16&r=ecm

Bounding average treatment effects using linear programming

By:	Lukáš Lafférs (Institute for Fiscal Studies)
Abstract:	This paper presents a method of calculating sharp bounds on the average treatment effect using linear programming under identifying assumptions commonly used in the literature. This new method provides a sensitivity analysis of the identifying assumptions and missing data in an application regarding the effect of parent’s schooling on children’s schooling. Even a mild departure from identifying assumptions may substantially widen the bounds on average treatment effects. Allowing for a small fraction of the data to be missing also has a large impact on the results.
Date:	2015–11–13
URL:	http://d.repec.org/n?u=RePEc:ifs:cemmap:70/15&r=ecm

Prediction in a Generalized Spatial Panel Data Model with Serial Correlation

By:	Badi Baltagi (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); Long Liu (College of Business, University of Texas at San Antonio, UTSA Circle, Texas 78249)
Abstract:	This paper considers the generalized spatial panel data model with serial correlation proposed by Lee and Yu (2012), which encompasses many of the spatial panel data models considered in the literature, and derives the best linear unbiased predictor (BLUP) for that model. This in turn provides valuable BLUP for several spatial panel models as special cases.
Keywords:	Prediction; Panel Data; Fixed Effects; Random Effects; Serial Correlation; Spatial Error Correlation
JEL:	C33
Date:	2016–02
URL:	http://d.repec.org/n?u=RePEc:max:cprwps:188&r=ecm

Second-order corrected likelihood for nonlinear models with fixed effects

By:	Yutao Sun
Abstract:	We introduce a second-order correction technique for nonlinear fixed-effect models exposed to the incidental parameter problem. This technique produces a bias-corrected log-likelihood function that possesses a bias only to the order (in expectation) of O(T^{-3}), where T is the number of time periods. As a consequence, the maximizer of the corrected log-likelihood, the corrected estimator, is also only biased to the order of O(T^{-3}). The technique applies to static nonlinear fixed-effect models in which N, the number of individuals, is allowed to grow rapidly and T is assumed to grow at a rate satisfying N/T^5 converging to 0. The proposed technique is general in the sense that it does not depend on a specific functional form of the log-likelihood function.
Date:	2016–05
URL:	http://d.repec.org/n?u=RePEc:ete:ceswps:541931&r=ecm

Testing for Speculative Bubbles in Large-Dimensional Financial Panel Data Sets

By:	HORIE, Tetsushi; YAMAMOTO, Yohei
Abstract:	In the run-up to the financial crisis of 2007 to 2008, speculative bubbles prevailed in various financial assets. Whether these bubbles are an economy-wide phenomenon or market-specific events is an important question. This study develops a testing approach to investigate whether the bubbles lie in the common or in the idiosyncratic components of large-dimensional financial panel data sets. To this end, we extend the right-tailed unit root tests to common factor models, benchmarking the panel analysis of nonstationarity in idiosyncratic and common components (PANIC) proposed by Bai and Ng (2004). We find that when the PANIC test is applied to the explosive alternative hypothesis as opposed to the stationary alternative hypothesis, the test for the idiosyncratic component may suffer from the nonmonotonic power problem. In this paper, we propose a cross-sectional (CS) approach to disentangle the common and the idiosyncratic components in a relatively short explosive window. This method first estimates the factor loadings in the training sample and then uses them in cross-sectional regressions to extract the common factors in the explosive window. A Monte Carlo simulation shows that the CS approach is robust to the nonmonotonic power problem. We apply this method to 24 exchange rates against the U.S. dollar to identify the currency values that were explosive during the financial crisis period.
Keywords:	speculative bubbles, explosive behaviors, factor model, moderate deviations, local asymptotic power, nonmonotonic power
JEL:	C12 C38 F31
Date:	2016–06–17
URL:	http://d.repec.org/n?u=RePEc:hit:econdp:2016-04&r=ecm

Quantile selection models: with an application to understanding changes in wage inequality

By:	Manuel Arellano (Institute for Fiscal Studies and CEMFI); Stéphane Bonhomme (Institute for Fiscal Studies and University of Chicago)
Abstract:	We propose a method to correct for sample selection in quantile regression models. Selection is modelled via the cumulative distribution function, or copula, of the percentile error in the outcome equation and the error in the participation decision. Copula parameters are estimated by minimizing a method-of-moments criterion. Given these parameter estimates, the percentile levels of the outcome are re-adjusted to correct for selection, and quantile parameters are estimated by minimizing a rotated “check” function. We apply the method to correct wage percentiles for selection into employment, using data for the UK for the period 1978-2000. We also extend the method to account for the presence of equilibrium effects when performing counterfactual exercises.
Date:	2015–12–21
URL:	http://d.repec.org/n?u=RePEc:ifs:cemmap:75/15&r=ecm

Exact Smooth Term Structure Estimation

By:	Damir Filipović; Sander Willems
Abstract:	We introduce a novel method to estimate the discount curve from market quotes based on the Moore-Penrose pseudoinverse such that 1) the market quotes are exactly replicated, 2) the curve has maximal smoothness, 3) no ad hoc interpolation is needed, and 4) no numerical root-finding algorithms are required. We provide a full theoretical framework as well as practical applications for both single-curve and multi-curve estimation.
Date:	2016–06
URL:	http://d.repec.org/n?u=RePEc:arx:papers:1606.03899&r=ecm
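
Illustration (not taken from the paper): the property of the Moore-Penrose pseudoinverse that the abstract relies on can be seen in a finite-dimensional toy problem. For an underdetermined system C x = p with full row rank, x = pinv(C) p reproduces p exactly and has the smallest Euclidean norm among all solutions. The paper works with a function-space norm that measures smoothness of the discount curve, so the matrix C, vector p, and the Euclidean norm below are purely illustrative assumptions.

```python
import numpy as np

# Two constraints ("quotes") on three unknowns: infinitely many exact fits.
C = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
p = np.array([2.0, 2.0])

# Minimum-norm exact solution via the Moore-Penrose pseudoinverse.
x_min_norm = np.linalg.pinv(C) @ p

# Another exact solution, for comparison; its norm is larger.
x_other = np.array([2.0, 0.0, 2.0])
```

Both vectors satisfy the constraints exactly, but x_min_norm equals (2/3, 4/3, 2/3) and has the smaller norm. No interpolation rule or root-finding step is involved, only linear algebra.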

Using String Invariants for Prediction Searching for Optimal Parameters

By:	Marek Bundzel; Tomas Kasanicky; Richard Pincak
Abstract:	We have developed a novel prediction method based on string invariants. The method does not require learning but a small set of parameters must be set to achieve optimal performance. We have implemented an evolutionary algorithm for the parametric optimization. We have tested the performance of the method on artificial and real world data and compared the performance to statistical methods and to a number of artificial intelligence methods. We have used data and the results of a prediction competition as a benchmark. The results show that the method performs well in single step prediction but the method's performance for multiple step prediction needs to be improved. The method works well for a wide range of parameters.
Date:	2016–06
URL:	http://d.repec.org/n?u=RePEc:arx:papers:1606.06003&r=ecm

Revisiting the synthetic control estimator

By:	Ferman, Bruno; Pinto, Cristine Campos de Xavier
Abstract:	The synthetic control (SC) method has been recently proposed as an alternative method to estimate treatment effects in comparative case studies. The SC relies on the assumption that there is a weighted average of the control units that reconstructs the factor loadings of the treated unit. If these weights were known, then one could estimate the counterfactual for the treated unit in the absence of treatment using a weighted average of the control units. With these weights, the SC would provide an unbiased estimator for the treatment effect even if selection into treatment is correlated with the unobserved heterogeneity. In this paper, we revisit the SC method in a linear factors model where the SC weights are considered nuisance parameters that are estimated to construct the SC estimator. We show that, when the number of control units is fixed, the estimated SC weights will not converge to the weights that reconstruct the factor loadings of the treated unit even when the number of pre-intervention periods goes to infinity. As a consequence, the SC estimator will be asymptotically biased if the treatment assignment is correlated with the unobserved heterogeneity. The asymptotic bias only vanishes when the variance of the idiosyncratic error goes to zero.
Date:	2016–06–16
URL:	http://d.repec.org/n?u=RePEc:fgv:eesptd:421&r=ecm

Time-Varying Persistence of Inflation: Evidence from a Wavelet-Based Approach

By:	Heni Boubaker (IPAG LAB, IPAG Business School, France); Giorgio Canarella (University of Nevada, Las Vegas, USA); Rangan Gupta (Department of Economics, University of Pretoria); Stephen M. Miller (University of Nevada, Las Vegas, USA)
Abstract:	We propose a new long-memory model with a time-varying fractional integration parameter, evolving non-linearly according to a Logistic Smooth Transition Autoregressive (LSTAR) specification. To estimate the time-varying fractional integration parameter, we implement a method based on the wavelet approach, using the instantaneous least squares estimator (ILSE). The empirical results show the relevance of the modeling approach and provide evidence of regime change in inflation persistence that contributes to a better understanding of the inflationary process in the US. Most importantly, these empirical findings remind us that a "one-size-fits-all" monetary policy is unlikely to work in all circumstances.
Keywords:	Time-varying long-memory, LSTAR model, MODWT algorithm, ILSE estimator
JEL:	C13 C22 C32 C54 E31
Date:	2016–06
URL:	http://d.repec.org/n?u=RePEc:pre:wpaper:201647&r=ecm

The Null Distribution of the Empirical AUC for Classifiers with Estimated Parameters: a Special Case

By:	Robert P. Lieli (Department of Economics, Central European University); Yu-Chin Hsu (Institute of Economics, Academia Sinica, Taipei, Taiwan)
Abstract:	We study the distribution of the area under an empirical receiver operating characteristic (ROC) curve constructed from a first stage regression model with parameters estimated on the same data set. We provide a general, but somewhat intrinsic, characterization of the limit distribution of this area, denoted AUC, when the regressors are Bernoulli random variables jointly independent of the outcome. Using the general theory, we further analyze the limit distribution in the two regressor case. It is non-normal and right-skewed. Though the theory applies, explicit expressions for the limit distribution are cumbersome to write down for a larger number of regressors. We provide a trivariate example as further illustration.
Keywords:	binary classification, ROC curve, area under the ROC curve, overfitting, hypothesis testing, model selection
Date:	2016–06
URL:	http://d.repec.org/n?u=RePEc:sin:wpaper:16-a007&r=ecm
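
Illustration (not taken from the paper): the empirical AUC is the share of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The sketch below computes it for given scores. It does not reproduce the paper's setting, in which the scores come from a first-stage model estimated on the same data. The function name empirical_auc and the toy data are hypothetical.

```python
def empirical_auc(scores, labels):
    """Area under the empirical ROC curve via pairwise comparisons."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    total = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                total += 1.0   # correctly ranked pair
            elif p == q:
                total += 0.5   # tie counts as half
    return total / (len(pos) * len(neg))

# Toy scores and outcomes (hypothetical data).
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = empirical_auc(scores, labels)
```

Three of the four positive-negative pairs are ranked correctly here, so auc is 0.75. A score that is unrelated to the outcome gives a value near 0.5 on average.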

This nep-ecm issue is ©2016 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.

General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.