nep-ecm New Economics Papers
on Econometrics
Issue of 2018‒08‒20
sixteen papers chosen by
Sune Karlsson
Örebro universitet

  1. A scoring rule for factor and autoregressive models under misspecification By Roberto Casarin; Fausto Corradin; Francesco Ravazzolo; Domenico Sartore
  2. Testing the fractionally integrated hypothesis using M estimation: With an application to stock market volatility By Matei Demetrescu; Antonio Rubia; Paulo M.M. Rodrigues
  3. A Causal Bootstrap By Guido Imbens; Konrad Menzel
  4. Quantile-Regression Inference With Adaptive Control of Size By Juan Carlos Escanciano; Chuan Goh
  5. Estimation of Dynamic Stochastic Frontier Model using Likelihood-based Approaches By Lai, Hung-pin; Kumbhakar, Subal C.
  6. PC COMPLEX: PC ALGORITHM FOR COMPLEX SURVEY DATA By Daniela Marella
  7. Machine Learning for Dynamic Models of Imperfect Information and Semiparametric Moment Inequalities By Vira Semenova
  8. Coverage Error Optimal Confidence Intervals By Sebastian Calonico; Matias D. Cattaneo; Max H. Farrell
  9. Cross Validation Based Model Selection via Generalized Method of Moments By Junpei Komiyama; Hajime Shimao
  10. Modeling joint probability distribution of yield curve parameters By Jarek Duda; Małgorzata Snarska
  11. On the Unbiased Asymptotic Normality of Quantile Regression with Fixed Effects By Antonio F. Galvao; Jiaying Gu; Stanislav Volgushev
  12. Modeling Systemic Risk with Markov Switching Graphical SUR Models By Daniele Bianchi; Monica Billio; Roberto Casarin; Massimo Guidolin
  13. Gaussian stochastic volatility models: Large deviation, moderate deviation, and central limit scaling regimes By Archil Gulisashvili
  14. Multi-Dimensional Sorting in the Data By Ilse Lindenlaub; Fabien Postel-Vinay
  15. Learning to Average Predictively over Good and Bad: Comment on: Using Stacking to Average Bayesian Predictive Distributions By Lennart (L.F.) Hoogerheide; Herman (H.K.) van Dijk
  16. Limit Theorems for Factor Models By Stanislav Anatolyev; Anna Mikusheva

  1. By: Roberto Casarin (Department of Economics, University of Venice Cà Foscari); Fausto Corradin (Department of Economics, University of Venice Cà Foscari); Francesco Ravazzolo (Free University of Bozen-Bolzano); Domenico Sartore (Department of Economics, University of Venice Cà Foscari)
    Abstract: Factor models (FM) are now widely used for forecasting with large sets of time series. Another class of models that can be easily estimated and used in a large-dimensional setting is multivariate autoregressive (MAR) models, where independent autoregressive processes are assumed for the series in the panel. We compare the forecasting abilities of FM and MAR models when both models are misspecified and the data generating process is a vector autoregressive model. We establish which conditions need to be satisfied for a FM to outperform MAR in terms of mean square forecasting error. The condition indicates that, in the presence of misspecification, FM does not always outperform MAR and that the FM predictive performance depends crucially on the parameter values of the data generating process. Building on the theoretical relationship between FM and MAR predictive performances, we provide a scoring rule that can be evaluated on the data either to select a model or to combine the models in forecasting exercises. Some numerical illustrations are provided both on simulated data and on well-known large economic datasets. The empirical results show that the frequency of true positive signals is larger when FM and MAR forecasting performances differ substantially and that it decreases as the horizon increases.
    Keywords: Factor models, Large datasets, Multivariate autoregressive models, Forecasting, Scoring rules, VAR models.
    JEL: C32 C52 C53
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:ven:wpaper:2018:18&r=ecm
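    A minimal simulation sketch of the FM-versus-MAR comparison that the paper formalizes (not the paper's scoring rule): it contrasts one-step-ahead mean square forecast errors of a principal-component factor forecast and of independent AR(1) forecasts when the data generating process is a VAR(1). All names and parameter values are illustrative.

```python
# Illustrative only: compare one-step-ahead MSFE of a PCA factor-model forecast
# versus independent AR(1) forecasts when the true DGP is a VAR(1).
import numpy as np

rng = np.random.default_rng(0)
N, T, r = 10, 500, 2                       # panel size, sample length, number of factors used

# Simulate a stationary VAR(1): y_t = A y_{t-1} + e_t
A = 0.5 * rng.uniform(-1, 1, (N, N)) / N + 0.4 * np.eye(N)
Y = np.zeros((T, N))
for t in range(1, T):
    Y[t] = Y[t - 1] @ A.T + rng.standard_normal(N)

train, test = Y[:-100], Y[-100:]

# Factor model: extract r principal components, fit AR(1) on each factor,
# forecast the factors, map back through the loadings.
Yc = train - train.mean(0)
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
F, L = U[:, :r] * s[:r], Vt[:r].T          # factors and loadings

def ar1_coef(x):
    return (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])

phi_f = np.array([ar1_coef(F[:, j]) for j in range(r)])   # AR(1) on each factor
phi_y = np.array([ar1_coef(train[:, i]) for i in range(N)])  # AR(1) on each series (MAR)

err_fm, err_mar = [], []
hist = train.copy()
for y_next in test:
    Fc = (hist - train.mean(0)) @ L @ np.linalg.inv(L.T @ L)   # current factor estimates
    fm_fc = train.mean(0) + (phi_f * Fc[-1]) @ L.T             # factor-model forecast
    mar_fc = phi_y * hist[-1]                                  # independent AR(1) forecasts
    err_fm.append(np.mean((y_next - fm_fc) ** 2))
    err_mar.append(np.mean((y_next - mar_fc) ** 2))
    hist = np.vstack([hist, y_next])

print("MSFE factor model:", np.mean(err_fm), " MSFE MAR:", np.mean(err_mar))
```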
  2. By: Matei Demetrescu; Antonio Rubia; Paulo M.M. Rodrigues
    Abstract: A new class of tests for fractional integration in the time domain based on M estimation is developed. This approach offers more robustness against non-Gaussian errors than least squares or other estimation principles. The asymptotic properties of the tests are discussed under fairly general assumptions, and for different estimation approaches based on direct optimization of the M loss function and on iterated k-step and reweighted LS numerical algorithms. Monte Carlo simulations illustrate the good finite-sample performance of the new tests, and an application to the daily volatility of several stock market indices shows their empirical relevance.
    JEL: C12 C22
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:ptu:wpaper:w201817&r=ecm
  3. By: Guido Imbens; Konrad Menzel
    Abstract: The bootstrap, introduced by Efron (1982), has become a very popular method for estimating variances and constructing confidence intervals. A key insight is that one can approximate the properties of estimators by using the empirical distribution function of the sample as an approximation for the true distribution function. This approach views the uncertainty in the estimator as coming exclusively from sampling uncertainty. We argue that for causal estimands the uncertainty arises entirely, or partially, from a different source, corresponding to the stochastic nature of the treatment received. We develop a bootstrap procedure that accounts for this uncertainty, and compare its properties to those of the classical bootstrap.
    JEL: C01 C31
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:24833&r=ecm
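    For contrast with the design-based uncertainty discussed above, a minimal sketch of the classical Efron bootstrap applied to a difference-in-means estimate of a treatment effect; it resamples units with replacement and therefore captures only sampling uncertainty. This is not the authors' causal bootstrap, and all names are illustrative.

```python
# Classical (Efron) nonparametric bootstrap for a difference-in-means estimator.
# It treats all uncertainty as sampling uncertainty -- the contrast the paper draws
# with the uncertainty coming from the random treatment assignment itself.
import numpy as np

rng = np.random.default_rng(1)
n = 200
d = rng.binomial(1, 0.5, n)                    # treatment indicator
y = 1.0 * d + rng.standard_normal(n)           # outcome with true effect 1.0

def diff_in_means(y, d):
    return y[d == 1].mean() - y[d == 0].mean()

theta_hat = diff_in_means(y, d)

B = 2000
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)                # resample (y_i, d_i) pairs with replacement
    boot[b] = diff_in_means(y[idx], d[idx])

se = boot.std(ddof=1)
print(f"estimate {theta_hat:.3f}, bootstrap SE {se:.3f}, "
      f"95% CI [{theta_hat - 1.96 * se:.3f}, {theta_hat + 1.96 * se:.3f}]")
```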
  4. By: Juan Carlos Escanciano; Chuan Goh
    Abstract: Regression quantiles have asymptotic variances that depend on the conditional densities of the response variable given regressors. This paper develops a new estimate of the asymptotic variance of regression quantiles that leads any resulting Wald-type test or confidence region to behave as well in large samples as its infeasible counterpart in which the true conditional response densities are embedded. We give explicit guidance on implementing the new variance estimator to control adaptively the size of any resulting Wald-type test. Monte Carlo evidence indicates the potential of our approach to deliver powerful tests of heterogeneity of quantile treatment effects in covariates with good size performance over different quantile levels, data-generating processes and sample sizes. We also include an empirical example. Supplementary material is available online.
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1807.06977&r=ecm
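    The paper's adaptive variance estimator is not reproduced here; as background, a minimal sketch of a standard kernel-based (Powell-type) sandwich estimator of the asymptotic variance of regression quantiles, which plugs in an estimate of the conditional density at the quantile. The bandwidth rule and all names are illustrative.

```python
# Powell-type kernel sandwich estimator of the asymptotic variance of a regression
# quantile -- a standard benchmark, not the adaptive estimator proposed in the paper.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, tau = 1000, 0.5
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = x @ np.array([1.0, 2.0]) + rng.standard_normal(n)

fit = sm.QuantReg(y, x).fit(q=tau)
u = y - x @ fit.params                                  # quantile regression residuals

h = 1.06 * u.std() * n ** (-1 / 5)                      # crude rule-of-thumb bandwidth
k = np.exp(-0.5 * (u / h) ** 2) / np.sqrt(2 * np.pi)    # Gaussian kernel weights

D = (x * (k / h)[:, None]).T @ x / n                    # estimates E[f_{u|x}(0) x x']
Omega = x.T @ x / n                                     # estimates E[x x']
Dinv = np.linalg.inv(D)
V = tau * (1 - tau) * Dinv @ Omega @ Dinv / n           # sandwich covariance of beta_hat

print("coef:", fit.params, "kernel-based SE:", np.sqrt(np.diag(V)))
```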
  5. By: Lai, Hung-pin; Kumbhakar, Subal C.
    Abstract: Almost all the existing panel stochastic frontier models treat technical efficiency as static. Consequently there is no mechanism by which an inefficient producer can improve its efficiency over time. The main objective of this paper is to propose a panel stochastic frontier model that allows the dynamic adjustment of persistent technical inefficiency. The model also includes transient inefficiency which is assumed to be heteroscedastic. We consider three likelihood-based approaches to estimate the model: the full maximum likelihood (FML), pairwise composite likelihood (PCL) and quasi-maximum likelihood (QML) approaches. Moreover, we provide Monte Carlo simulation results to examine and compare the finite sample performances of the three above-mentioned likelihood-based estimators. Finally, we provide an empirical application to the dynamic model.
    Keywords: Technical inefficiency, panel data, copula, full maximum likelihood estimation, pairwise composite likelihood estimation, quasi-maximum likelihood estimation
    JEL: C23 C24 C51
    Date: 2018–04–10
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:87830&r=ecm
  6. By: Daniela Marella
    Abstract: The PC algorithm is one of the best-known procedures for structural learning of Bayesian networks. The structure is inferred by carrying out several independence tests on a database and building a Bayesian network in agreement with the test results. The PC algorithm is based on the assumption of independent and identically distributed observations. In practice, sample selection in surveys involves more complex sampling designs, so the standard test procedure is not valid even asymptotically. In order to avoid misleading results about the true causal structure, the sample selection process must be taken into account in the structural learning process. In this paper, a modified version of the PC algorithm is proposed for inferring causal structure from complex survey data. It is based on resampling techniques for finite populations. A simulation experiment is carried out, showing the robustness of the proposed algorithm with respect to departures from the assumptions and its good performance.
    Keywords: Bayesian network; complex survey data; pseudo-population; structural learning.
    JEL: C10 C12 C18 C83
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:rtr:wpaper:0240&r=ecm
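    A minimal sketch of the standard PC-algorithm skeleton that the paper modifies: edge removal through conditional-independence tests, here Fisher-z tests of zero partial correlation under an i.i.d. Gaussian assumption. The survey-weighted, resampling-based version proposed in the paper is not implemented; all names are illustrative.

```python
# Skeleton of the classical PC algorithm (edge-deletion stage only), using
# Fisher-z tests of zero partial correlation under an i.i.d. Gaussian assumption.
from itertools import combinations
import numpy as np
from scipy import stats

def partial_corr(R, i, j, S):
    """Partial correlation of variables i and j given the set S, from correlation matrix R."""
    idx = [i, j] + list(S)
    P = np.linalg.inv(R[np.ix_(idx, idx)])          # precision matrix of the subvector
    return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

def ci_test(R, n, i, j, S, alpha=0.05):
    """Fisher-z test of H0: corr(i, j | S) = 0; returns True if independence is accepted."""
    r = np.clip(partial_corr(R, i, j, S), -0.999999, 0.999999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(S) - 3)
    return 2 * (1 - stats.norm.cdf(abs(z))) > alpha

def pc_skeleton(X, alpha=0.05):
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    adj = {i: set(range(p)) - {i} for i in range(p)}  # start from the complete graph
    size = 0
    while any(len(adj[i]) - 1 >= size for i in adj):
        for i, j in combinations(range(p), 2):
            if j not in adj[i]:
                continue
            for S in combinations(adj[i] - {j}, size):  # condition on neighbours of i
                if ci_test(R, n, i, j, S, alpha):
                    adj[i].discard(j); adj[j].discard(i)
                    break
        size += 1
    return adj

# Tiny example: a chain X0 -> X1 -> X2 should lose the X0--X2 edge.
rng = np.random.default_rng(3)
x0 = rng.standard_normal(2000)
x1 = x0 + rng.standard_normal(2000)
x2 = x1 + rng.standard_normal(2000)
print(pc_skeleton(np.column_stack([x0, x1, x2])))
```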
  7. By: Vira Semenova
    Abstract: This paper develops estimation and inference tools for the structural parameter in a dynamic game with a high-dimensional state space, under the assumption that the data are generated by a single Markov perfect equilibrium. The equilibrium assumption implies that the expected value function evaluated at the equilibrium strategy profile is not smaller than the expected value function evaluated at a feasible suboptimal alternative. The target identified set is defined as the set of parameters obeying this inequality restriction. I estimate the expected value function in a two-stage procedure. At the first stage, I estimate the law of motion of the state variable and the equilibrium policy function using modern machine learning methods. At the second stage, I construct the estimator of the expected value function as the sum of a naive plug-in estimator and a bias-correction term which removes the bias of the naive estimator. The proposed estimator of the identified set converges at the root-N rate to the true identified set and can be used to construct its confidence regions.
    Date: 2018–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1808.02569&r=ecm
  8. By: Sebastian Calonico; Matias D. Cattaneo; Max H. Farrell
    Abstract: We propose a framework for ranking confidence interval estimators in terms of their uniform coverage accuracy. The key ingredient is the (existence and) quantification of the error in coverage of competing confidence intervals, uniformly over some empirically-relevant class of data generating processes. The framework employs the "check" function to quantify coverage error loss, which allows researchers to incorporate their preference in terms of over- and under-coverage, where confidence intervals attaining the best-possible uniform coverage error are minimax optimal. We demonstrate the usefulness of our framework with three distinct applications. First, we establish novel uniformly valid Edgeworth expansions for nonparametric local polynomial regression, offering some technical results that may be of independent interest, and use them to characterize the coverage error of and rank confidence interval estimators for the regression function and its derivatives. As a second application we consider inference in least squares linear regression under potential misspecification, ranking interval estimators utilizing uniformly valid expansions already established in the literature. Third, we study heteroskedasticity-autocorrelation robust inference to showcase how our framework can unify existing conclusions. Several other potential applications are mentioned.
    Date: 2018–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1808.01398&r=ecm
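    As a hedged illustration of the loss mentioned above, a sketch of how a check function can encode asymmetric preferences over under- and over-coverage; the paper's exact objective, norming and uniformity requirements may differ.

```latex
% Check-function loss applied to coverage error (sketch only).  For a confidence
% interval $I$ with nominal level $1-\alpha$ and asymmetry parameter $\tau \in (0,1)$:
\rho_\tau(u) = u\bigl(\tau - \mathbf{1}\{u < 0\}\bigr),
\qquad
L_\tau(I) = \rho_\tau\bigl(\mathbb{P}(\theta \in I) - (1 - \alpha)\bigr),
% so over-coverage (u > 0) is weighted by \tau and under-coverage (u < 0) by 1 - \tau,
% letting the researcher express a preference between the two.
```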
  9. By: Junpei Komiyama; Hajime Shimao
    Abstract: Structural estimation is an important methodology in empirical economics, and a large class of structural models is estimated through the generalized method of moments (GMM). Traditionally, the selection of structural models has been performed based on model fit upon estimation, using the entire observed sample. In this paper, we propose a model selection procedure based on cross-validation (CV), which uses a sample-splitting technique to avoid issues such as over-fitting. While CV is widely used in the machine learning community, we are the first to prove its consistency for model selection in the GMM framework. Its empirical properties are compared to those of existing methods through simulations of IV regressions and an oligopoly market model. In addition, we propose a way to apply our method to the Mathematical Programming with Equilibrium Constraints (MPEC) approach. Finally, we apply our method to online-retail sales data to compare a dynamic market model to a static model.
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1807.06993&r=ecm
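    A generic sketch of cross-validated model selection for GMM in the spirit described above: estimate each candidate moment model on the training folds and evaluate its moment criterion on the held-out fold. It is not the authors' procedure; the competing IV specifications and all names are illustrative.

```python
# Generic sketch of K-fold cross-validation for GMM model selection:
# fit each candidate moment model on the training folds, evaluate its moment
# criterion on the held-out fold, and pick the model with the smallest
# average out-of-sample criterion.  Illustrative only.
import numpy as np

rng = np.random.default_rng(4)
n = 1000
z1 = rng.standard_normal(n)                    # valid instrument
u = rng.standard_normal(n)
z2 = 0.8 * u + 0.6 * rng.standard_normal(n)    # invalid instrument (correlated with the error)
x = z1 + 0.5 * u                               # endogenous regressor
y = 2.0 * x + u

def gmm_fit(y, x, Z):
    """Linear IV/GMM with identity weighting: minimize || Z'(y - x b) / n ||^2 over b."""
    zx, zy = Z.T @ x, Z.T @ y
    return (zx @ zy) / (zx @ zx)

def gmm_criterion(y, x, Z, b):
    g = Z.T @ (y - x * b) / len(y)              # sample moments on this fold
    return g @ g

models = {"valid instrument z1": np.column_stack([z1]),
          "z1 plus invalid z2": np.column_stack([z1, z2])}

K = 5
folds = np.array_split(rng.permutation(n), K)
for name, Z in models.items():
    scores = []
    for k in range(K):
        test = folds[k]
        train = np.setdiff1d(np.arange(n), test)
        b = gmm_fit(y[train], x[train], Z[train])
        scores.append(gmm_criterion(y[test], x[test], Z[test], b))
    # The model using only the valid instrument should attain the smaller criterion.
    print(f"{name}: CV criterion {np.mean(scores):.4f}")
```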
  10. By: Jarek Duda; Małgorzata Snarska
    Abstract: The US yield curve has recently flattened to a degree not seen since the subprime crisis and is close to inversion. This has attracted the attention of investors around the world and revived the discussion of how to properly model and forecast the yield curve, since changes in the interest rate structure are believed to represent investors' expectations about the future state of the economy and have foreshadowed recessions in the United States. While changes in the term structure of interest rates are relatively easy to interpret, they are very difficult to model and forecast because no proper economic theory underlies such events. Yield curves are usually represented by multivariate sparse time series: at any point in time an infinite-dimensional curve is portrayed via relatively few points in a multivariate data space, and as a consequence the multimodal statistical dependencies behind these curves are relatively hard to extract and forecast with typical multivariate statistical methods. We propose to model yield curves by reconstructing the joint probability distribution of parameters in a functional space as a high-degree polynomial. Thanks to the adoption of an orthonormal basis, the MSE estimate of each coefficient of a given function is an average over the data sample in the space of functions. Since such polynomial coefficients are independent and have a cumulant-like interpretation, i.e. they describe the corresponding perturbation from a uniform joint distribution, our approach can also be extended to any d-dimensional space of yield curve parameters (also in neighboring times) with controllable accuracy. We believe that this approach to modeling the local behavior of a sparse multivariate curved time series can complement predictions from standard models like ARIMA, which use long-range dependencies but provide only an inaccurate prediction of the probability distribution, often just a Gaussian with constant width.
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1807.11743&r=ecm
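    A minimal sketch of the coefficient-estimation step described above: with an orthonormal polynomial basis, each coefficient of the density is estimated by a sample average of the corresponding basis function. The basis choice (rescaled Legendre polynomials on [0,1]), the one-dimensional setting and all names are illustrative.

```python
# Density reconstruction in an orthonormal polynomial basis: the coefficient on each
# basis function is simply the sample mean of that basis function, as the abstract
# describes.  One-dimensional here for readability; the same formula extends to
# products of basis functions in d dimensions.
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre(j, x):
    """j-th Legendre polynomial rescaled to be orthonormal on [0, 1]."""
    c = np.zeros(j + 1); c[j] = 1.0
    return np.sqrt(2 * j + 1) * legendre.legval(2 * x - 1, c)

def fit_density(sample, degree):
    """Coefficients a_j = mean_i f_j(x_i); density estimate rho(x) = sum_j a_j f_j(x)."""
    return np.array([orthonormal_legendre(j, sample).mean() for j in range(degree + 1)])

def eval_density(coeffs, x):
    return sum(a * orthonormal_legendre(j, x) for j, a in enumerate(coeffs))

# Example: data on [0,1] with a non-uniform shape (e.g. a normalized curve parameter).
rng = np.random.default_rng(5)
sample = rng.beta(2, 5, size=5000)
coeffs = fit_density(sample, degree=6)
grid = np.linspace(0, 1, 5)
print(np.round(eval_density(coeffs, grid), 3))
```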
  11. By: Antonio F. Galvao; Jiaying Gu; Stanislav Volgushev
    Abstract: Nonlinear panel data models with fixed individual effects provide an important set of tools for describing microeconometric data. In a large class of such models (including probit, proportional hazard and quantile regression to name just a few) it is impossible to difference out individual effects, and inference is usually justified in a `large n large T' asymptotic framework. However, there is a considerable gap in the type of assumptions that are currently imposed in models with smooth score functions (such as probit, and proportional hazard) and quantile regression. In the present paper we show that this gap can be bridged and establish asymptotic unbiased normality for quantile regression panels under conditions on n,T that are very close to what is typically assumed in standard nonlinear panels. Our results considerably improve upon existing theory and show that quantile regression is applicable to the same type of panel data (in terms of n,T) as other commonly used nonlinear panel data models. Thorough numerical experiments confirm our theoretical findings.
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1807.11863&r=ecm
  12. By: Daniele Bianchi; Monica Billio; Roberto Casarin; Massimo Guidolin
    Abstract: We propose a Markov Switching Graphical Seemingly Unrelated Regression (MS-GSUR) model to investigate time-varying systemic risk based on a range of multi-factor asset pricing models. Methodologically, we develop a Markov Chain Monte Carlo (MCMC) scheme in which latent states are identified on the basis of a novel weighted eigenvector centrality measure. An empirical application to the constituents of the S&P100 index shows that cross-firm connectivity significantly increased over the period 1999-2003 and during the financial crisis in 2008-2009. Finally, we provide evidence that firm-level centrality does not correlate with market values and it is instead positively linked to realized financial losses.
    Keywords: Markov Regime-Switching, Weighted Eigenvector Centrality, Graphical Models, MCMC, Systemic Risk, Network Connectivity
    JEL: C11 C15 C32 C58
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:igi:igierp:626&r=ecm
  13. By: Archil Gulisashvili
    Abstract: In this paper, we provide a unified approach to various scaling regimes associated with Gaussian stochastic volatility models. The evolution of volatility in such a model is described by a stochastic process that is a nonnegative continuous function of a continuous Gaussian process. If the process in the previous description exhibits fractional features, then the model is called a Gaussian fractional stochastic volatility model. Important examples of fractional volatility processes are fractional Brownian motion, the Riemann-Liouville fractional Brownian motion, and the fractional Ornstein-Uhlenbeck process. If the volatility process admits a Volterra type representation, then the model is called a Volterra type Gaussian stochastic volatility model. The scaling regimes associated with a Gaussian stochastic volatility model are split into three groups: the large deviation group, the moderate deviation group, and the central limit group. We prove a sample path large deviation principle for the log-price process in a Volterra type Gaussian stochastic volatility model, and a sample path moderate deviation principle for the same process in a Gaussian stochastic volatility model. We also study the asymptotic behavior of the distribution function of the log-price, call pricing functions, and the implied volatility in mixed scaling regimes. It is shown that the asymptotic formulas for the above-mentioned quantities exhibit discontinuities on the boundaries, where the moderate deviation regime becomes the large deviation or the central limit regime. It is also shown that the large deviation tail estimates are locally uniform.
    Date: 2018–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1808.00421&r=ecm
  14. By: Ilse Lindenlaub (Yale University); Fabien Postel-Vinay (University College London)
    Abstract: When heterogeneous workers sort into heterogeneous jobs on the labor market, 'how many' and especially 'which' skills and job attributes matter for this choice? Based on our theory of multi-dimensional sorting under random search (Lindenlaub, Postel-Vinay 2017), this paper first develops an empirical test of how many heterogeneity dimensions matter for sorting: if the data are well approximated by N-dimensional worker and job types, any two workers with the same type should face the same job ladder and therefore the same job acceptance sets. Conversely, if the assumption of N-dimensional heterogeneity is not justified but instead M>N dimensions matter for sorting, then approximating different M-dimensional worker types by the same N-dimensional worker type will produce job ladder heterogeneity within those N-dimensional worker types, revealing misspecification of worker attributes. To assess the accuracy of this test, we first implement it via simulations, where we simulate data from a large class of multi-dimensional models that comply with our theory. We then apply model selection methods to regressions of employment-to-employment indicators (as a proxy for job acceptance sets) on a large set of potential skills and job attributes in order to recover the true dimensionality of worker and job characteristics. We show that the model selection methods quite accurately reveal the 'true' worker and job heterogeneity that matters for sorting. We then implement this test on US data at different points in time, which at each given point delivers a set of worker and job attributes that matters for labor market sorting and allows us to construct the multivariate distributions of skills and job attributes in the data. Second, we propose an application of multi-dimensional sorting to the observed slow-down in US labor market dynamics. We estimate the search model with multi-dimensional types developed in Lindenlaub and Postel-Vinay (2017) at different points in time, using our constructed skill and job attribute distributions as inputs for the estimation. We then use the estimated model to decompose the slow-down in UE and EE flows into the parts driven by (i) changes in the multivariate skill and job distributions, (ii) changes in technology, and (iii) changes in search frictions.
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:red:sed018:1239&r=ecm
  15. By: Lennart (L.F.) Hoogerheide (VU University Amsterdam); Herman (H.K.) van Dijk (Erasmus University, Norges Bank)
    Abstract: We suggest extending the stacking procedure for combining predictive densities, proposed by Yao et al. in the journal Bayesian Analysis, to a setting where dynamic learning occurs about features of the predictive densities of possibly misspecified models. This improves the process of averaging over good and bad model forecasts. We summarise how this learning is done in economics and finance using mixtures. We also show that our proposal can be extended to combining forecasts and policies. The technical tools necessary for the implementation refer to filtering methods from nonlinear time series analysis, and we show their connection with machine learning. We illustrate our suggestion using results from Basturk et al. based on financial data about US portfolios from 1928 until 2015.
    Keywords: Bayesian learning; predictive density combinations
    JEL: C11 C15
    Date: 2018–08–08
    URL: http://d.repec.org/n?u=RePEc:tin:wpaper:20180063&r=ecm
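    A minimal sketch of the stacking step being extended in the comment: choosing non-negative weights summing to one that maximize the combined log predictive score of several predictive densities, in the spirit of Yao et al. The dynamic-learning extension discussed above is not implemented; all names are illustrative.

```python
# Stacking of predictive densities: pick simplex weights w maximizing
# sum_t log( sum_k w_k p_k(y_t) ), where p_k(y_t) is model k's predictive
# density evaluated at the realized y_t.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(6)
T = 300
y = 0.3 + rng.standard_normal(T)                   # realized outcomes

# Predictive densities of three illustrative (misspecified) models at each y_t.
p = np.column_stack([
    norm.pdf(y, loc=0.0, scale=1.0),               # model 1: mean too low
    norm.pdf(y, loc=0.5, scale=1.2),               # model 2: mean too high, too wide
    norm.pdf(y, loc=0.3, scale=0.8),               # model 3: mean right, too narrow
])

def neg_log_score(v):
    w = np.exp(v) / np.exp(v).sum()                # softmax keeps the weights on the simplex
    return -np.log(p @ w).sum()

res = minimize(neg_log_score, x0=np.zeros(p.shape[1]), method="BFGS")
w = np.exp(res.x) / np.exp(res.x).sum()
print("stacking weights:", np.round(w, 3))
```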
  16. By: Stanislav Anatolyev; Anna Mikusheva
    Abstract: This paper establishes asymptotic results such as central limit theorems and consistency of variance estimation in factor models. We consider a setting common to modern macroeconomic and financial models, where many countries/regions/macro-variables/assets are observed for many time periods, and where estimation of a global parameter includes aggregation of a cross-section of heterogeneous micro-parameters estimated separately for each entity. We establish a central limit theorem for quantities involving both cross-sectional and time series aggregation, as well as for quadratic forms in time-aggregated errors. We also study sufficient conditions under which one can consistently estimate the asymptotic variance. These results are useful for making inferences in two-step estimation procedures related to factor models. We avoid structural modeling of cross-sectional dependence but impose time-series independence.
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1807.06338&r=ecm

This nep-ecm issue is ©2018 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.