
NEP: New Economics Papers on Econometrics
By:  Roberto Casarin (Department of Economics, University of Venice Ca' Foscari); Fausto Corradin (Department of Economics, University of Venice Ca' Foscari); Francesco Ravazzolo (Free University of Bozen-Bolzano); Domenico Sartore (Department of Economics, University of Venice Ca' Foscari) 
Abstract:  Factor models (FM) are now widely used for forecasting with large sets of time series. Another class of models, which can be easily estimated and used in a large-dimensional setting, is multivariate autoregressive (MAR) models, where independent autoregressive processes are assumed for the series in the panel. We compare the forecasting abilities of FM and MAR models under the assumption that both models are misspecified and the data generating process is a vector autoregressive model. We establish which conditions need to be satisfied for an FM to outperform an MAR model in terms of mean square forecasting error. The condition indicates that, in the presence of misspecification, an FM does not always outperform an MAR model and that the FM's predictive performance depends crucially on the parameter values of the data generating process. Building on the theoretical relationship between FM and MAR predictive performances, we provide a scoring rule which can be evaluated on the data either to select the model or to combine the models in forecasting exercises. Some numerical illustrations are provided, both on simulated data and on well-known large economic datasets. The empirical results show that the frequency of true positive signals is larger when the FM and MAR forecasting performances differ substantially, and that it decreases as the horizon increases. 
Keywords:  Factor models, Large datasets, Multivariate autoregressive models, Forecasting, Scoring rules, VAR models. 
JEL:  C32 C52 C53 
Date:  2018 
URL:  http://d.repec.org/n?u=RePEc:ven:wpaper:2018:18&r=ecm 
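To make the comparison concrete, here is a minimal Python sketch; the VAR(1) coefficient matrix, the one-factor specification and all parameter values are illustrative assumptions, not taken from the paper. It simulates a bivariate VAR(1) data generating process and contrasts the one-step mean square forecasting error of a principal-component factor forecast with that of independent AR(1) forecasts:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.5, 0.3], [0.2, 0.4]])     # assumed VAR(1) coefficient matrix
T, n = 500, 2
y = np.zeros((T, n))
for t in range(1, T):
    y[t] = y[t-1] @ A.T + rng.standard_normal(n)
train, test = y[:400], y[400:]

def ar1_coef(x):
    """OLS slope of x_t on x_{t-1} (series are mean zero by construction)."""
    return (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])

phi = np.array([ar1_coef(train[:, i]) for i in range(n)])   # MAR: AR(1) per series

# FM: first principal component as the single factor, AR(1) law for the factor
_, _, Vt = np.linalg.svd(train - train.mean(0), full_matrices=False)
load = Vt[0]
rho = ar1_coef(train @ load)

mse_mar = mse_fm = 0.0
for t in range(len(test) - 1):
    mse_mar += np.mean((test[t+1] - phi * test[t])**2)
    mse_fm += np.mean((test[t+1] - rho * (test[t] @ load) * load)**2)
print(f"one-step MSFE  MAR: {mse_mar:.2f}  FM: {mse_fm:.2f}")
```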
By:  Matei Demetrescu; Antonio Rubia; Paulo M.M. Rodrigues 
Abstract:  A new class of tests for fractional integration in the time domain based on M-estimation is developed. This approach offers more robust properties against non-Gaussian errors than least squares or other estimation principles. The asymptotic properties of the tests are discussed under fairly general assumptions, and for different estimation approaches based on direct optimization of the M loss function and on iterated k-step and reweighted LS numerical algorithms. Monte Carlo simulations illustrate the good finite sample performance of the new tests, and an application to the daily volatility of several stock market indices shows their empirical relevance. 
JEL:  C12 C22 
Date:  2018 
URL:  http://d.repec.org/n?u=RePEc:ptu:wpaper:w201817&r=ecm 
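The reweighted-LS route to M-estimation mentioned in the abstract can be sketched as follows; the Huber loss, the tuning constant k = 1.345 and the function name huber_irls are illustrative assumptions, not the authors' test statistics:

```python
import numpy as np

def huber_irls(X, y, k=1.345, n_iter=25):
    """Huber M-estimate of a linear regression via iteratively reweighted LS."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # LS starting value
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12        # robust scale (MAD)
        u = r / s
        w = np.where(np.abs(u) <= k, 1.0, k / np.abs(u)) # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return beta

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = X @ np.array([1.0, 2.0]) + rng.standard_t(df=2, size=200)  # heavy-tailed errors
print(huber_irls(X, y))
```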
By:  Guido Imbens; Konrad Menzel 
Abstract:  The bootstrap, introduced by Efron (1979), has become a very popular method for estimating variances and constructing confidence intervals. A key insight is that one can approximate the properties of estimators by using the empirical distribution function of the sample as an approximation for the true distribution function. This approach views the uncertainty in the estimator as coming exclusively from sampling uncertainty. We argue that for causal estimands the uncertainty arises entirely, or partially, from a different source, corresponding to the stochastic nature of the treatment received. We develop a bootstrap procedure that accounts for this uncertainty, and compare its properties to those of the classical bootstrap. 
JEL:  C01 C31 
Date:  2018–07 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:24833&r=ecm 
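A stylized sketch of the idea that uncertainty comes from the treatment assignment rather than from sampling: potential outcomes are imputed under an assumed constant effect and the assignment vector is re-randomized. This only illustrates design-based uncertainty; it is not the authors' coupling-based procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
d = rng.integers(0, 2, n)                    # observed binary treatment
y_obs = 1.0 * d + rng.standard_normal(n)     # observed outcome

tau_hat = y_obs[d == 1].mean() - y_obs[d == 0].mean()
# impute both potential outcomes under an assumed constant effect tau_hat
y0 = np.where(d == 1, y_obs - tau_hat, y_obs)
y1 = y0 + tau_hat

draws = []
for _ in range(999):
    d_b = rng.permutation(d)                 # re-randomize the assignment
    y_b = np.where(d_b == 1, y1, y0)
    draws.append(y_b[d_b == 1].mean() - y_b[d_b == 0].mean())
print(f"tau_hat = {tau_hat:.3f}, design-based SE = {np.std(draws):.3f}")
```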
By:  Juan Carlos Escanciano; Chuan Goh 
Abstract:  Regression quantiles have asymptotic variances that depend on the conditional densities of the response variable given regressors. This paper develops a new estimate of the asymptotic variance of regression quantiles that leads any resulting Wald-type test or confidence region to behave as well in large samples as its infeasible counterpart in which the true conditional response densities are embedded. We give explicit guidance on implementing the new variance estimator to control adaptively the size of any resulting Wald-type test. Monte Carlo evidence indicates the potential of our approach to deliver powerful tests of heterogeneity of quantile treatment effects in covariates, with good size performance over different quantile levels, data-generating processes and sample sizes. We also include an empirical example. Supplementary material is available online. 
Date:  2018–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1807.06977&r=ecm 
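For context, here is a minimal sketch of a standard Hendricks-Koenker-type plug-in for this asymptotic variance, which estimates the conditional densities by a difference quotient of fitted quantiles. The bandwidth h and the data generating process are assumed for illustration; this is the kind of feasible estimator being improved upon, not the paper's new estimator.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, tau, h = 500, 0.5, 0.1                    # h: bandwidth in quantile units (assumed)
X = sm.add_constant(rng.standard_normal(n))
y = X @ np.array([1.0, 0.5]) + rng.standard_normal(n)

b_lo = sm.QuantReg(y, X).fit(q=tau - h).params
b_hi = sm.QuantReg(y, X).fit(q=tau + h).params
# difference quotient of fitted quantiles estimates the conditional density
f_hat = 2 * h / np.maximum(X @ (b_hi - b_lo), 1e-6)
J = (X * f_hat[:, None]).T @ X / n           # density-weighted design matrix
S = tau * (1 - tau) * (X.T @ X) / n
V = np.linalg.inv(J) @ S @ np.linalg.inv(J) / n
print("standard errors:", np.sqrt(np.diag(V)))
```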
By:  Lai, Hung-pin; Kumbhakar, Subal C. 
Abstract:  Almost all the existing panel stochastic frontier models treat technical efficiency as static. Consequently, there is no mechanism by which an inefficient producer can improve its efficiency over time. The main objective of this paper is to propose a panel stochastic frontier model that allows dynamic adjustment of persistent technical inefficiency. The model also includes transient inefficiency, which is assumed to be heteroscedastic. We consider three likelihood-based approaches to estimate the model: the full maximum likelihood (FML), pairwise composite likelihood (PCL) and quasi-maximum likelihood (QML) approaches. Moreover, we provide Monte Carlo simulation results to examine and compare the finite sample performances of the three above-mentioned likelihood-based estimators. Finally, we provide an empirical application of the dynamic model. 
Keywords:  Technical inefficiency, panel data, copula, full maximum likelihood estimation, pairwise composite likelihood estimation, quasi-maximum likelihood estimation 
JEL:  C23 C24 C51 
Date:  2018–04–10 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:87830&r=ecm 
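The pairwise composite likelihood (PCL) idea can be sketched in a stripped-down setting: the full T-dimensional likelihood is replaced by a sum of bivariate Gaussian log densities over time pairs. A stationary Gaussian AR(1) path stands in for the paper's dynamic inefficiency process, and pcl_ar1 is a hypothetical name:

```python
import numpy as np
from scipy.stats import multivariate_normal

def pcl_ar1(x, rho, sigma):
    """Pairwise composite log likelihood of a stationary Gaussian AR(1) path."""
    v = sigma**2 / (1 - rho**2)                  # stationary variance
    ll = 0.0
    for s in range(len(x) - 1):
        for t in range(s + 1, len(x)):
            c = v * rho**(t - s)                 # stationary autocovariance
            ll += multivariate_normal.logpdf([x[s], x[t]], mean=[0, 0],
                                             cov=[[v, c], [c, v]])
    return ll

rng = np.random.default_rng(0)
x = np.zeros(50)
for t in range(1, 50):
    x[t] = 0.6 * x[t-1] + rng.standard_normal()
rhos = np.linspace(-0.9, 0.9, 19)
print("PCL estimate of rho:", max(rhos, key=lambda r: pcl_ar1(x, r, 1.0)))
```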
By:  Daniela Marella 
Abstract:  The PC algorithm is one of the best-known procedures for Bayesian network structural learning. The structure is inferred by carrying out several independence tests on a database and building a Bayesian network in agreement with the test results. The PC algorithm is based on the assumption of independent and identically distributed observations. In practice, sample selection in surveys involves more complex sampling designs, so the standard test procedure is not valid even asymptotically. In order to avoid misleading results about the true causal structure, the sample selection process must be taken into account in the structural learning process. In this paper, a modified version of the PC algorithm is proposed for inferring the causal structure from complex survey data. It is based on resampling techniques for finite populations. A simulation experiment is carried out, showing the robustness of the proposed algorithm to departures from the assumptions and its good performance. 
Keywords:  Bayesian network; complex survey data; pseudo-population; structural learning. 
JEL:  C10 C12 C18 C83 
Date:  2018–07 
URL:  http://d.repec.org/n?u=RePEc:rtr:wpaper:0240&r=ecm 
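A schematic sketch of the pseudo-population idea: units are replicated in proportion to their (assumed integer) survey weights, and the null distribution of an independence test statistic is rebuilt by resampling from the replicated population. The PC algorithm would invoke such a test for each conditional independence query; all details here are illustrative only.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 300
w = rng.integers(1, 6, n)                      # integer survey weights (assumed)
x = rng.standard_normal(n)
y = 0.2 * x + rng.standard_normal(n)

px, py = np.repeat(x, w), np.repeat(y, w)      # pseudo-population by replication
N = len(px)
r_obs = pearsonr(px, py)[0]                    # weighted correlation via replication

null = []
for _ in range(999):
    ix, iy = rng.integers(0, N, n), rng.integers(0, N, n)   # break the pairing
    null.append(pearsonr(px[ix], py[iy])[0])
pval = np.mean(np.abs(null) >= abs(r_obs))
print(f"r = {r_obs:.3f}, resampling p-value = {pval:.3f}")
```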
By:  Vira Semenova 
Abstract:  This paper develops estimation and inference tools for the structural parameter in a dynamic game with a high-dimensional state space, under the assumption that the data are generated by a single Markov perfect equilibrium. The equilibrium assumption implies that the expected value function evaluated at the equilibrium strategy profile is not smaller than the expected value function evaluated at a feasible suboptimal alternative. The target identified set is defined as the set of parameters obeying this inequality restriction. I estimate the expected value function in a two-stage procedure. At the first stage, I estimate the law of motion of the state variable and the equilibrium policy function using modern machine learning methods. At the second stage, I construct the estimator of the expected value function as the sum of the naive plug-in estimator and a bias correction term which removes the bias of the naive estimator. The proposed estimator of the identified set converges at the root-N rate to the true identified set and can be used to construct confidence regions for it. 
Date:  2018–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1808.02569&r=ecm 
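The generic first-stage-ML-plus-bias-correction logic is easiest to see in a simpler setting. The sketch below does cross-fitted partialling-out in a partially linear model, in the spirit of double/debiased machine learning; it is an assumed simplification, not the paper's dynamic game estimator.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def dml_plm(y, d, X, K=2, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    folds = np.random.default_rng(seed).integers(0, K, len(y))
    ry, rd = np.empty(len(y)), np.empty(len(y))
    for k in range(K):
        tr, te = folds != k, folds == k
        ry[te] = y[te] - RandomForestRegressor().fit(X[tr], y[tr]).predict(X[te])
        rd[te] = d[te] - RandomForestRegressor().fit(X[tr], d[tr]).predict(X[te])
    return (rd @ ry) / (rd @ rd)     # solves the Neyman-orthogonal moment

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 5))
d = X[:, 0] + rng.standard_normal(500)
y = 0.5 * d + X[:, 1] + rng.standard_normal(500)
print(dml_plm(y, d, X))
```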
By:  Sebastian Calonico; Matias D. Cattaneo; Max H. Farrell 
Abstract:  We propose a framework for ranking confidence interval estimators in terms of their uniform coverage accuracy. The key ingredient is the (existence and) quantification of the error in coverage of competing confidence intervals, uniformly over some empirically relevant class of data generating processes. The framework employs the "check" function to quantify coverage error loss, which allows researchers to incorporate their preferences over under- and over-coverage; confidence intervals attaining the best possible uniform coverage error are minimax optimal. We demonstrate the usefulness of our framework with three distinct applications. First, we establish novel uniformly valid Edgeworth expansions for nonparametric local polynomial regression, offering some technical results that may be of independent interest, and use them to characterize the coverage error of, and rank, confidence interval estimators for the regression function and its derivatives. As a second application, we consider inference in least squares linear regression under potential misspecification, ranking interval estimators using uniformly valid expansions already established in the literature. Third, we study heteroskedasticity-autocorrelation robust inference to showcase how our framework can unify existing conclusions. Several other potential applications are mentioned. 
Date:  2018–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1808.01398&r=ecm 
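A tiny sketch of how the "check" function encodes asymmetric preferences over coverage errors; the two coverage errors and the weight tau are hypothetical numbers.

```python
import numpy as np

def check_loss(u, tau):
    """rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

# hypothetical coverage errors (empirical coverage minus nominal level)
errors = {'CI_A': -0.02, 'CI_B': 0.03}
tau = 0.3                                    # tau < 1/2 penalizes undercoverage more
for name, e in errors.items():
    print(name, check_loss(np.array(e), tau))
```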
By:  Junpei Komiyama; Hajime Shimao 
Abstract:  Structural estimation is an important methodology in empirical economics, and a large class of structural models is estimated through the generalized method of moments (GMM). Traditionally, selection among structural models has been performed based on model fit upon estimation, using the entire observed sample. In this paper, we propose a model selection procedure based on cross-validation (CV), which uses a sample-splitting technique to avoid issues such as overfitting. While CV is widely used in machine learning communities, we are the first to prove its consistency for model selection in the GMM framework. Its empirical performance is compared to existing methods through simulations of IV regressions and an oligopoly market model. In addition, we propose a way to apply our method within the Mathematical Programming with Equilibrium Constraints (MPEC) approach. Finally, we apply our method to online-retail sales data to compare a dynamic market model with a static one. 
Date:  2018–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1807.06993&r=ecm 
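A minimal sketch of cross-validated GMM model selection: parameters are estimated on training folds and an identity-weighted moment criterion is evaluated on the held-out fold. Linear IV moments and the name cv_gmm_iv are illustrative assumptions; weighting details differ from the paper.

```python
import numpy as np

def cv_gmm_iv(y, X, Z, K=5, seed=0):
    """Held-out moment criterion for a linear IV model, summed over folds."""
    folds = np.random.default_rng(seed).integers(0, K, len(y))
    crit = 0.0
    for k in range(K):
        tr, te = folds != k, folds == k
        Pz = Z[tr] @ np.linalg.pinv(Z[tr].T @ Z[tr]) @ Z[tr].T     # projection on Z
        b = np.linalg.solve(X[tr].T @ Pz @ X[tr], X[tr].T @ Pz @ y[tr])
        g = Z[te].T @ (y[te] - X[te] @ b) / te.sum()               # held-out moments
        crit += g @ g                                              # identity weighting
    return crit      # select the model with the smallest criterion

rng = np.random.default_rng(1)
n = 400
Z = rng.standard_normal((n, 3))
X = (Z @ np.array([1.0, 0.5, 0.0]))[:, None] + rng.standard_normal((n, 1))
y = X[:, 0] + rng.standard_normal(n)
print(cv_gmm_iv(y, X, Z))
```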
By:  Jarek Duda; Małgorzata Snarska 
Abstract:  The US yield curve has recently flattened to its lowest level since the subprime crisis and is close to inversion. This fact has attracted the attention of investors around the world and revived the discussion of how to properly model and forecast the yield curve, since changes in the interest rate structure are believed to represent investors' expectations about the future state of the economy and have foreshadowed recessions in the United States. While changes in the term structure of interest rates are relatively easy to interpret, they are very difficult to model and forecast, as no proper economic theory underlies such events. Yield curves are usually represented by multivariate sparse time series: at any point in time an infinite-dimensional curve is portrayed via relatively few points in a multivariate data space, and as a consequence the multimodal statistical dependencies behind these curves are relatively hard to extract and forecast via typical multivariate statistical methods. We propose to model yield curves via reconstruction of the joint probability distribution of parameters in functional space as a high-degree polynomial. Thanks to the adoption of an orthonormal basis, the MSE estimate of each coefficient of a given function is an average over the data sample in the space of functions. Since such polynomial coefficients are independent and have a cumulant-like interpretation, i.e. they describe the corresponding perturbation from a uniform joint distribution, our approach can also be extended to any d-dimensional space of yield curve parameters (also in neighboring times) with controllable accuracy. We believe that this approach to modeling the local behavior of a sparse multivariate curved time series can complement predictions from standard models like ARIMA, which use long-range dependencies but provide only an inaccurate prediction of the probability distribution, often just a Gaussian with constant width. 
Date:  2018–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1807.11743&r=ecm 
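The coefficient-as-sample-average property the abstract appeals to can be sketched directly: for an orthonormal basis {f_j} on [0,1], the MSE-optimal coefficient on f_j is the sample mean of f_j(x). Rescaled Legendre polynomials and a Beta-distributed sample are assumed for illustration.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_01(j, x):
    """Orthonormal Legendre polynomial of degree j, rescaled to [0, 1]."""
    c = np.zeros(j + 1); c[j] = 1.0
    return np.sqrt(2 * j + 1) * legendre.legval(2 * x - 1, c)

rng = np.random.default_rng(0)
x = rng.beta(2, 5, 2000)                     # sample from an "unknown" density
deg = 6
a = np.array([legendre_01(j, x).mean() for j in range(deg + 1)])  # coefficients

grid = np.linspace(0, 1, 201)
density = sum(a[j] * legendre_01(j, grid) for j in range(deg + 1))  # reconstruction
```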
By:  Antonio F. Galvao; Jiaying Gu; Stanislav Volgushev 
Abstract:  Nonlinear panel data models with fixed individual effects provide an important set of tools for describing microeconometric data. In a large class of such models (including probit, proportional hazard and quantile regression, to name just a few) it is impossible to difference out individual effects, and inference is usually justified in a `large n, large T' asymptotic framework. However, there is a considerable gap between the types of assumptions currently imposed in models with smooth score functions (such as probit and proportional hazard) and those imposed in quantile regression. In the present paper we show that this gap can be bridged, and we establish asymptotically unbiased normality for quantile regression panels under conditions on n and T that are very close to what is typically assumed in standard nonlinear panels. Our results considerably improve upon existing theory and show that quantile regression is applicable to the same type of panel data (in terms of n and T) as other commonly used nonlinear panel data models. Thorough numerical experiments confirm our theoretical findings. 
Date:  2018–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1807.11863&r=ecm 
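A minimal sketch of the object the theory covers: a fixed-effects quantile regression panel, estimated here by including individual dummies (the incidental parameters). The data generating process is assumed for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
N, T = 50, 40
alpha = rng.standard_normal(N)               # individual fixed effects
x = rng.standard_normal((N, T))
y = alpha[:, None] + 1.0 * x + rng.standard_normal((N, T))

# stack the panel; individual dummies are the incidental parameters
dummies = np.kron(np.eye(N), np.ones((T, 1)))
X = np.column_stack([x.ravel(), dummies])
fit = sm.QuantReg(y.ravel(), X).fit(q=0.5)
print("median-regression slope:", fit.params[0])
```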
By:  Daniele Bianchi; Monica Billio; Roberto Casarin; Massimo Guidolin 
Abstract:  We propose a Markov Switching Graphical Seemingly Unrelated Regression (MS-GSUR) model to investigate time-varying systemic risk based on a range of multi-factor asset pricing models. Methodologically, we develop a Markov Chain Monte Carlo (MCMC) scheme in which latent states are identified on the basis of a novel weighted eigenvector centrality measure. An empirical application to the constituents of the S&P100 index shows that cross-firm connectivity significantly increased over the period 1999-2003 and during the financial crisis of 2008-2009. Finally, we provide evidence that firm-level centrality does not correlate with market values and is instead positively linked to realized financial losses. 
Keywords:  Markov Regime-Switching, Weighted Eigenvector Centrality, Graphical Models, MCMC, Systemic Risk, Network Connectivity 
JEL:  C11 C15 C32 C58 
Date:  2018 
URL:  http://d.repec.org/n?u=RePEc:igi:igierp:626&r=ecm 
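Weighted eigenvector centrality itself is easy to sketch: it is the leading eigenvector of a nonnegative weighted adjacency matrix, computed here by power iteration. How the MCMC scheme uses it to identify latent states is beyond this snippet; the matrix below is random and purely illustrative.

```python
import numpy as np

def eigen_centrality(W, n_iter=200):
    """Leading eigenvector of a nonnegative weighted adjacency matrix."""
    x = np.ones(W.shape[0])
    for _ in range(n_iter):
        x = W @ x
        x /= np.linalg.norm(x)               # normalize each power-iteration step
    return x

W = np.abs(np.random.default_rng(0).standard_normal((5, 5)))
W = (W + W.T) / 2                            # symmetric weighted connectivity matrix
print(eigen_centrality(W))
```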
By:  Archil Gulisashvili 
Abstract:  In this paper, we provide a unified approach to various scaling regimes associated with Gaussian stochastic volatility models. The evolution of volatility in such a model is described by a stochastic process that is a nonnegative continuous function of a continuous Gaussian process. If the process in the previous description exhibits fractional features, then the model is called a Gaussian fractional stochastic volatility model. Important examples of fractional volatility processes are fractional Brownian motion, the Riemann-Liouville fractional Brownian motion, and the fractional Ornstein-Uhlenbeck process. If the volatility process admits a Volterra type representation, then the model is called a Volterra type Gaussian stochastic volatility model. The scaling regimes associated with a Gaussian stochastic volatility model are split into three groups: the large deviation group, the moderate deviation group, and the central limit group. We prove a sample path large deviation principle for the log-price process in a Volterra type Gaussian stochastic volatility model, and a sample path moderate deviation principle for the same process in a Gaussian stochastic volatility model. We also study the asymptotic behavior of the distribution function of the log-price, call pricing functions, and the implied volatility in mixed scaling regimes. It is shown that the asymptotic formulas for the above-mentioned quantities exhibit discontinuities on the boundaries, where the moderate deviation regime becomes the large deviation or the central limit regime. It is also shown that the large deviation tail estimates are locally uniform. 
Date:  2018–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1808.00421&r=ecm 
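A minimal sketch of a Gaussian fractional stochastic volatility model: fractional Brownian motion is simulated exactly from a Cholesky factor of its covariance, and volatility is a positive function of the Gaussian path. The Hurst index and the exponential volatility function are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
H, T, n = 0.1, 1.0, 500                      # Hurst index and horizon (assumed)
t = np.linspace(T / n, T, n)
S, U = np.meshgrid(t, t)
cov = 0.5 * (S**(2 * H) + U**(2 * H) - np.abs(S - U)**(2 * H))  # fBm covariance
L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))
B = L @ rng.standard_normal(n)               # exact fBm sample path

sigma = np.exp(B)                            # nonnegative function of the Gaussian path
dt = T / n
dW = np.sqrt(dt) * rng.standard_normal(n)
log_price = np.cumsum(sigma * dW - 0.5 * sigma**2 * dt)   # Euler log-price path
```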
By:  Ilse Lindenlaub (Yale University); Fabien Postel-Vinay (University College London) 
Abstract:  When heterogeneous workers sort into heterogeneous jobs on the labor market, `how many' and especially `which' skills and job attributes matter for this choice? Based on our theory of multidimensional sorting under random search (Lindenlaub and Postel-Vinay, 2017), this paper first develops an empirical test of how many heterogeneity dimensions matter for sorting: if the data are well-approximated by N-dimensional worker and job types, any two workers with the same type should face the same job ladder and therefore the same job acceptance sets. Conversely, if the assumption of N-dimensional heterogeneity is not justified but instead M > N dimensions matter for sorting, then approximating different M-dimensional worker types by the same N-dimensional worker type will produce job ladder heterogeneity within those N-dimensional worker types, revealing misspecification of the workers' attributes. To assess the accuracy of this test, we first implement it via simulations, where we simulate data from a large class of multidimensional models that comply with our theory. We then apply model selection methods to regressions of employment-to-employment indicators (as a proxy for job acceptance sets) on a large set of potential skills and job attributes in order to recover the true dimensionality of worker and job characteristics. We show that the model selection methods quite accurately reveal the `true' worker and job heterogeneity that matters for sorting. We then implement this test on US data at different points in time, which at each given point delivers a set of worker and job attributes that matter for labor market sorting and allows us to construct the multivariate distributions of skills and job attributes in the data. Second, we propose an application of multidimensional sorting to the observed slowdown in US labor market dynamics. We estimate the search model with multidimensional types developed in Lindenlaub and Postel-Vinay (2017) at different points in time, using our constructed skill and job attribute distributions as inputs for the estimation. We then use the estimated model to decompose the slowdown in UE and EE flows into the parts driven by (i) changes in the multivariate skill and job distributions, (ii) changes in technology, and (iii) changes in search frictions. 
Date:  2018 
URL:  http://d.repec.org/n?u=RePEc:red:sed018:1239&r=ecm 
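The model-selection step of the test can be sketched as an L1-penalized regression of an employment-to-employment indicator on many candidate attributes, reading off which dimensions survive. The simulated data generating process (only the first k attributes matter) is purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(0)
n, p, k = 2000, 20, 3                        # only the first k attributes matter (assumed)
X = rng.standard_normal((n, p))
prob = 1 / (1 + np.exp(-(X[:, :k] @ np.ones(k))))
ee = (rng.random(n) < prob).astype(int)      # employment-to-employment move indicator

sel = LogisticRegressionCV(penalty='l1', solver='saga', Cs=10, max_iter=5000)
sel.fit(X, ee)
print("selected attributes:", np.flatnonzero(np.abs(sel.coef_[0]) > 1e-6))
```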
By:  Lennart (L.F.) Hoogerheide (VU University Amsterdam); Herman (H.K.) van Dijk (Erasmus University, Norges Bank) 
Abstract:  We suggest extending the stacking procedure for the combination of predictive densities, proposed by Yao et al. in the journal Bayesian Analysis, to a setting where dynamic learning occurs about features of the predictive densities of possibly misspecified models. This improves the averaging process of good and bad model forecasts. We summarise how this learning is done in economics and finance using mixtures. We also show that our proposal can be extended to combining forecasts and policies. The technical tools necessary for the implementation refer to filtering methods from nonlinear time series analysis, and we show their connection with machine learning. We illustrate our suggestion using results from Basturk et al. based on financial data about US portfolios from 1928 until 2015. 
Keywords:  Bayesian learning; predictive density combinations 
JEL:  C11 C15 
Date:  2018–08–08 
URL:  http://d.repec.org/n?u=RePEc:tin:wpaper:20180063&r=ecm 
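The stacking step being extended can be sketched as follows: weights on the simplex maximize the average log score of the combined predictive density over held-out observations, in the spirit of Yao et al.; the function name stack_weights is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

def stack_weights(P):
    """P[t, m]: model m's predictive density evaluated at realized y_t."""
    M = P.shape[1]
    obj = lambda w: -np.mean(np.log(P @ w + 1e-300))   # negative average log score
    res = minimize(obj, np.full(M, 1.0 / M), bounds=[(0, 1)] * M,
                   constraints=({'type': 'eq', 'fun': lambda w: w.sum() - 1},))
    return res.x

# toy usage: model 0 is better calibrated than model 1
rng = np.random.default_rng(0)
y = rng.standard_normal(200)
dens = lambda s: np.exp(-0.5 * (y / s)**2) / (s * np.sqrt(2 * np.pi))
print(stack_weights(np.column_stack([dens(1.0), dens(3.0)])))
```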
By:  Stanislav Anatolyev; Anna Mikusheva 
Abstract:  This paper establishes asymptotic results, such as central limit theorems and consistency of variance estimation, in factor models. We consider a setting common to modern macroeconomic and financial models where many countries/regions/macro variables/assets are observed for many time periods, and where estimation of a global parameter includes aggregation of a cross-section of heterogeneous micro-parameters estimated separately for each entity. We establish a central limit theorem for quantities involving both cross-sectional and time series aggregation, as well as for quadratic forms in time-aggregated errors. We also study sufficient conditions under which one can consistently estimate the asymptotic variance. These results are useful for making inferences in two-step estimation procedures related to factor models. We avoid structural modeling of cross-sectional dependence but impose time-series independence. 
Date:  2018–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1807.06338&r=ecm 
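A minimal sketch of the two-step structure the results cover: a micro-parameter (here an AR(1) coefficient) is estimated separately for each entity from its own time series and then averaged across the cross-section. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 100
rho = 0.5 + 0.1 * rng.standard_normal(N)     # heterogeneous micro-parameters

rho_hat = np.empty(N)
for i in range(N):
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = rho[i] * x[t-1] + rng.standard_normal()
    rho_hat[i] = (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])   # per-entity AR(1) estimate

theta_hat = rho_hat.mean()                   # global parameter: cross-sectional average
se = rho_hat.std(ddof=1) / np.sqrt(N)        # mixes estimation noise and heterogeneity
print(f"theta_hat = {theta_hat:.3f}, naive SE = {se:.3f}")
```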