
New Economics Papers on Econometrics 
By:  Maria Kyriacou (University of Southampton); Peter C.B. Phillips (University of Auckland; Yale University); Francesca Rossi (Department of Economics, University of Verona) 
Abstract:  Spatial units typically vary over many of their characteristics, introducing potential unobserved heterogeneity which invalidates commonly used homoskedasticity conditions. In the presence of unobserved heteroskedasticity, standard methods based on the (quasi)likelihood function generally produce inconsistent estimates of both the spatial parameter and the coefficients of the exogenous regressors. A robust generalized method of moments estimator as well as a modified likelihood method have been proposed in the literature to address this issue. The present paper constructs an alternative indirect inference approach which relies on a simple ordinary least squares procedure as its starting point. Heteroskedasticity is accommodated by utilizing a new version of continuous updating that is applied within the indirect inference procedure to take account of the parametrization of the variance-covariance matrix of the disturbances. Finite sample performance of the new estimator is assessed in a Monte Carlo study and found to offer advantages over existing methods. The approach is implemented in an empirical application to house price data in the Boston area, where it is found that spatial effects in house price determination are much more significant under robustification to heterogeneity in the equation errors. 
Keywords:  Spatial autoregression; Unknown heteroskedasticity; Indirect inference; Robust methods; Weights matrix. 
JEL:  C13 C15 C21 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:ver:wpaper:15/2019&r=all 
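The indirect inference idea in the abstract can be illustrated with a minimal sketch: fit the (inconsistent) OLS auxiliary regression on the data, then pick the spatial parameter whose simulated binding function reproduces that auxiliary estimate. Everything below (the ring weight matrix, the pure-SAR design, the omission of the paper's continuous-updating correction for heteroskedasticity) is an assumption for illustration, not the authors' procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Row-normalized ring weight matrix: each unit has two neighbours.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

def simulate_sar(lam, u):
    """Solve (I - lam*W) y = u for the SAR outcome y."""
    return np.linalg.solve(np.eye(n) - lam * W, u)

def ols_aux(y):
    """Auxiliary statistic: OLS slope from regressing y on Wy (inconsistent for lam)."""
    x = W @ y
    return float(x @ y / (x @ x))

# "Observed" data at true lam = 0.5 with heteroskedastic disturbances.
true_lam = 0.5
sigma = 0.5 + rng.uniform(size=n)                 # unknown heteroskedasticity
y_obs = simulate_sar(true_lam, sigma * rng.standard_normal(n))
aux_obs = ols_aux(y_obs)

# Binding function approximated by simulation on a common set of shocks.
H = 30
shocks = rng.standard_normal((H, n))

def binding(lam):
    ys = np.linalg.solve(np.eye(n) - lam * W, shocks.T)   # n x H simulated panels
    xs = W @ ys
    return float(np.mean((xs * ys).sum(0) / (xs * xs).sum(0)))

# Indirect inference estimate: match the simulated binding function to aux_obs.
grid = np.linspace(-0.9, 0.9, 91)
lam_ii = grid[np.argmin([(binding(l) - aux_obs) ** 2 for l in grid])]
```

Matching through the binding function corrects the simultaneity bias of the OLS auxiliary estimator, which is the core of the indirect inference step the paper starts from.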
By:  Didier Nibbering 
Abstract:  The number of parameters in a standard multinomial choice model increases linearly with the number of choice alternatives and number of explanatory variables. Since many modern applications involve large choice sets with categorical explanatory variables, which enter the model as large sets of binary dummies, the number of parameters easily approaches the sample size. This paper proposes a new method for data-driven parameter clustering over outcome categories and explanatory dummy categories in a multinomial probit setting. A Dirichlet process mixture encourages parameters to cluster over the categories, which favours a parsimonious model specification without a priori imposing model restrictions. An application to a dataset of holiday destinations shows a decrease in parameter uncertainty, an enhancement of the parameter interpretability, and an increase in predictive performance, relative to a standard multinomial choice model. 
Keywords:  large choice sets, Dirichlet process prior, multinomial probit model, high-dimensional models 
JEL:  C11 C14 C25 C35 C51 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201919&r=all 
By:  Patrick Leung; Catherine S. Forbes; Gael M Martin; Brendan McCabe 
Abstract:  We investigate the impact of filter choice on forecast accuracy in state space models. The filters are used both to estimate the posterior distribution of the parameters, via a particle marginal Metropolis-Hastings (PMMH) algorithm, and to produce draws from the filtered distribution of the final state. Multiple filters are entertained, including two new data-driven methods. Simulation exercises are used to document the performance of each PMMH algorithm, in terms of computation time and the efficiency of the chain. We then produce the forecast distributions for the one-step-ahead value of the observed variable, using a fixed number of particles and Markov chain draws. Despite distinct differences in efficiency, the filters yield virtually identical forecasting accuracy, with this result holding under both correct and incorrect specification of the model. This invariance of forecast performance to the specification of the filter also characterizes an empirical analysis of S&P500 daily returns. 
Keywords:  Bayesian prediction, particle MCMC, non-Gaussian time series, state space models, unbiased likelihood estimation, sequential Monte Carlo. 
JEL:  C11 C22 C58 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201922&r=all 
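The role of the filter inside PMMH is to deliver an unbiased estimate of the likelihood that the Metropolis-Hastings step plugs into its acceptance ratio. A minimal sketch of the simplest such filter, the bootstrap particle filter, on an assumed local-level model (all parameter values are illustrative, and this is not one of the paper's data-driven filters):

```python
import numpy as np

rng = np.random.default_rng(1)
T, sig_x, sig_y = 100, 0.5, 1.0

# Simulate a local-level model: x_t = x_{t-1} + sig_x*e_t,  y_t = x_t + sig_y*v_t.
x = np.cumsum(sig_x * rng.standard_normal(T))
y = x + sig_y * rng.standard_normal(T)

def bootstrap_pf_loglik(y, sig_x, sig_y, n_particles=500, rng=rng):
    """Unbiased estimate of the log-likelihood via a bootstrap particle filter."""
    particles = np.zeros(n_particles)
    loglik = 0.0
    for t in range(len(y)):
        # Propagate particles through the state transition (the "bootstrap" proposal).
        particles = particles + sig_x * rng.standard_normal(n_particles)
        # Weight by the measurement density, working in logs for stability.
        logw = -0.5 * ((y[t] - particles) / sig_y) ** 2 \
               - 0.5 * np.log(2 * np.pi * sig_y ** 2)
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())
        # Multinomial resampling.
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        particles = particles[idx]
    return loglik

ll_pf = bootstrap_pf_loglik(y, sig_x, sig_y)
```

In an actual PMMH run this estimate would be recomputed at each proposed parameter value; the paper's point is that swapping in more efficient filters changes chain efficiency but barely moves forecast accuracy.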
By:  Shi, Chengchun; Lu, Wenbin; Song, Rui 
Abstract:  Statistical relational learning is primarily concerned with learning and inferring relationships between entities in large-scale knowledge graphs. Nickel et al. (2011) proposed a RESCAL tensor factorization model for statistical relational learning, which achieves better or at least comparable results on common benchmark data sets when compared to other state-of-the-art methods. Given a positive integer s, RESCAL computes an s-dimensional latent vector for each entity. The latent factors can be further used for solving relational learning tasks, such as collective classification, collective entity resolution and link-based clustering. The focus of this paper is to determine the number of latent factors in the RESCAL model. Due to the structure of the RESCAL model, its log-likelihood function is not concave. As a result, the corresponding maximum likelihood estimators (MLEs) may not be consistent. Nonetheless, we design a specific pseudometric, prove the consistency of the MLEs under this pseudometric and establish its rate of convergence. Based on these results, we propose a general class of information criteria and prove their model selection consistencies when the number of relations is either bounded or diverges at a proper rate relative to the number of entities. Simulations and real data examples show that our proposed information criteria have good finite sample properties. 
Keywords:  information criteria; knowledge graph; model selection consistency; RESCAL model; statistical relational learning; tensor factorization 
JEL:  C1 
Date:  2019–02–01 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:102110&r=all 
By:  Licht, Adrian; Escribano, Álvaro; Blazsek, Szabolcs 
Abstract:  In this paper, new Seasonal-QVAR (quasi-vector autoregressive) and Markov switching (MS) Seasonal-QVAR (MS-Seasonal-QVAR) models are introduced. Seasonal-QVAR is an outlier-robust score-driven state space model, which is an alternative to classical multivariate Gaussian models (e.g. the basic structural model; Seasonal-VARMA). Conditions for the maximum likelihood estimator and impulse response functions are shown. Dynamic relationships between world crude oil production and US industrial production are studied for the period 1973 to 2019. Statistical performances of alternative models are analyzed. MS-Seasonal-QVAR identifies structural changes and extreme observations in the dataset. MS-Seasonal-QVAR is superior to Seasonal-QVAR, and both are superior to the Gaussian alternatives. 
Keywords:  Markov Regime-Switching Models; Score-Driven Multivariate Stochastic Location and Stochastic Seasonality Models; Dynamic Conditional Score Models; United States Industrial Production; World Crude Oil Production 
JEL:  C52 C51 C32 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:cte:werepe:29030&r=all 
By:  Ruoxuan Xiong; Markus Pelger 
Abstract:  This paper develops the inferential theory for latent factor models estimated from large-dimensional panel data with missing observations. We estimate a latent factor model by applying principal component analysis to an adjusted covariance matrix estimated from partially observed panel data. We derive the asymptotic distribution for the estimated factors, loadings and the imputed values under a general approximate factor model. The key application is to estimate counterfactual outcomes in causal inference from panel data. The unobserved control group is modeled as missing values, which are inferred from the latent factor model. The inferential theory for the imputed values allows us to test for individual treatment effects at any time. We apply our method to portfolio investment strategies and find that around 14% of their average returns are significantly reduced by the academic publication of these strategies. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.08273&r=all 
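A stylized sketch of the estimation idea, under an assumed one-factor DGP with data missing at random: build a covariance matrix from pairwise-complete observations, extract loadings by PCA, recover factors by least squares on each period's observed cells, and impute the missing entries from the estimated common component. This illustrates the mechanics only, not the paper's specific adjustments or inferential theory.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, r = 200, 50, 1

F = rng.standard_normal((T, r))            # latent factors
L = rng.standard_normal((N, r))            # loadings
X = F @ L.T + 0.1 * rng.standard_normal((T, N))
obs = rng.uniform(size=(T, N)) < 0.7       # ~30% of cells missing at random
Xo = np.where(obs, X, np.nan)

# Covariance from pairwise-complete rows (a simple "adjusted" covariance).
S = np.empty((N, N))
for i in range(N):
    for j in range(N):
        both = obs[:, i] & obs[:, j]
        S[i, j] = np.mean(Xo[both, i] * Xo[both, j])

# Loadings from the top-r eigenvectors; factors by OLS on each period's observed cells.
_, vecs = np.linalg.eigh(S)
Lhat = vecs[:, -r:] * np.sqrt(N)
Fhat = np.empty((T, r))
for t in range(T):
    o = obs[t]
    Fhat[t] = np.linalg.lstsq(Lhat[o], Xo[t, o], rcond=None)[0]

# Impute missing cells from the estimated common component.
X_imputed = Fhat @ Lhat.T
corr = np.corrcoef(X_imputed[~obs], X[~obs])[0, 1]
```

The per-period regression step is what lets the same machinery fill in counterfactual (never-observed) cells in the causal-inference application.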
By:  Jushan Bai; Sung Hoon Choi; Yuan Liao 
Abstract:  This paper considers generalized least squares (GLS) estimation for linear panel data models. By estimating the large error covariance matrix consistently, the proposed feasible GLS (FGLS) estimator is robust to heteroskedasticity, serial correlation, and cross-sectional correlation, and is more efficient than the OLS estimator. To control serial correlation, we employ the banding method; to control cross-sectional correlation without knowing the clusters, we suggest the thresholding method. We establish the consistency of the proposed estimator, assess its finite sample performance in a Monte Carlo study, and apply the method in an empirical application. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.09004&r=all 
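The two covariance regularizers named in the abstract can be sketched directly: banding keeps entries near the diagonal (serial correlation), hard thresholding zeroes out small entries wherever they sit (cross-sectional correlation with unknown clusters). The sample matrix and tuning constants below are arbitrary illustrations, not the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
A = rng.standard_normal((n, 200))
S = A @ A.T / 200                      # a sample covariance matrix

def band(S, k):
    """Banding: keep S[i, j] only when |i - j| <= k."""
    i, j = np.indices(S.shape)
    return np.where(np.abs(i - j) <= k, S, 0.0)

def hard_threshold(S, tau):
    """Thresholding: zero out off-diagonal entries below tau in absolute value."""
    out = np.where(np.abs(S) >= tau, S, 0.0)
    np.fill_diagonal(out, np.diag(S))  # never threshold the diagonal
    return out

Sb = band(S, 1)                        # bandwidth 1: keep only adjacent lags
St = hard_threshold(S, 0.3)            # tau = 0.3: an arbitrary cut-off
```

In the FGLS context, a regularized covariance estimate like `Sb` or `St` would replace the infeasible true error covariance when reweighting the regression.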
By:  Fei Liu; Jiti Gao; Yanrong Yang 
Abstract:  Panel data subject to heterogeneity in both the cross-sectional and time-serial directions are commonly encountered across the social and scientific fields. To address this problem, we propose a class of time-varying panel data models with individual-specific regression coefficients and interactive common factors. The resulting model can describe heterogeneous panel data through time variation in the time-serial direction and individual-specific coefficients across cross-sections. Another notable generality of the proposed model is its compatibility with endogeneity arising through the interactive common factors. Model estimation is achieved through a novel duple least-squares (DLS) iteration algorithm, which applies two least-squares estimations recursively. Its flexibility in estimation is illustrated through applications to various cases with exogenous or endogenous common factors. The asymptotic theory established for the DLS estimators benefits practitioners by demonstrating how iteration gradually eliminates estimation bias over the iterative steps. We further show that our model and estimation perform well on simulated data in various scenarios, as well as on an OECD healthcare expenditure dataset; the analysis confirms time variation and heterogeneity across cross-sections. 
Keywords:  cross-sectional dependence, duple LS iteration, endogeneity, nonparametric kernel estimation. 
JEL:  C14 C23 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201924&r=all 
By:  Anastasios Panagiotelis; Puwasala Gamakumara; George Athanasopoulos; Rob J Hyndman 
Abstract:  A geometric interpretation is developed for so-called reconciliation methodologies used to forecast time series that adhere to known linear constraints. In particular, a general framework is established nesting many existing popular reconciliation methods within the class of projections. This interpretation facilitates the derivation of novel results that explain why and how reconciliation via projection is guaranteed to improve forecast accuracy with respect to a specific class of loss functions. The result is also demonstrated empirically. The geometric interpretation is further used to provide a new proof that forecast reconciliation results in unbiased forecasts provided the initial base forecasts are also unbiased. Approaches for dealing with biased base forecasts are proposed and explored in an extensive empirical study on Australian tourism flows. Overall, the method of bias-correcting before carrying out reconciliation is shown to outperform alternatives that only bias-correct or only reconcile forecasts. 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201918&r=all 
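The projection view can be sketched on the smallest possible hierarchy (Total = A + B): incoherent base forecasts are projected onto the subspace of forecasts that satisfy the aggregation constraint. The orthogonal projection below is the OLS special case of the projection class discussed in the paper, with made-up base forecasts.

```python
import numpy as np

# Summing matrix S maps the bottom-level series (A, B) to all series (Total, A, B).
S = np.array([[1.0, 1.0],    # Total = A + B
              [1.0, 0.0],    # A
              [0.0, 1.0]])   # B

# Base forecasts for [Total, A, B] that do not add up (100 != 55 + 40).
y_base = np.array([100.0, 55.0, 40.0])

# Orthogonal projection onto the column space of S (OLS reconciliation).
P = S @ np.linalg.solve(S.T @ S, S.T)
y_tilde = P @ y_base
```

Because `P` is a projection onto the coherent subspace, the reconciled vector `y_tilde` satisfies the constraint exactly and is never further (in Euclidean norm) from any coherent target than `y_base` is, which is the intuition behind the accuracy guarantee in the paper.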
By:  Niko Hauzenberger; Florian Huber; Gary Koop; Luca Onorante 
Abstract:  In this paper, we write the time-varying parameter regression model involving K explanatory variables and T observations as a constant coefficient regression model with TK explanatory variables. In contrast with much of the existing literature, which assumes coefficients to evolve according to a random walk, this specification does not restrict the form that the time variation in coefficients can take. We develop computationally efficient Bayesian econometric methods based on the singular value decomposition of the TK regressors. In artificial data, we find our methods to be accurate and much faster than standard approaches in terms of computation time. In an empirical exercise involving inflation forecasting using a large number of predictors, we find our methods to forecast better than alternative approaches and document different patterns of parameter change than are found with approaches which assume random walk evolution of parameters. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.10779&r=all 
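The reparametrization in the first sentence can be sketched directly: stack the T coefficient vectors and regress y on a T x TK block-diagonal design, one block per observation. The ridge-via-SVD step below merely stands in for the paper's Bayesian shrinkage priors and is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
T, K = 60, 2

X = rng.standard_normal((T, K))
beta_path = np.cumsum(0.1 * rng.standard_normal((T, K)), axis=0)  # drifting coefficients
y = (X * beta_path).sum(axis=1) + 0.1 * rng.standard_normal(T)

# Block-diagonal TK-regressor matrix Z: row t holds x_t' in columns t*K .. t*K+K.
Z = np.zeros((T, T * K))
for t in range(T):
    Z[t, t * K:(t + 1) * K] = X[t]

# Shrinkage estimate of the stacked coefficients via the SVD of Z
# (a ridge penalty stands in for the prior here).
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
lam = 1.0
beta_flat = (Vt.T * (s / (s ** 2 + lam))) @ (U.T @ y)
beta_hat = beta_flat.reshape(T, K)        # one coefficient vector per period
```

The point of the SVD is computational: all shrinkage solutions along a path of penalties reuse the same factorization, which is what makes the TK-dimensional regression tractable.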
By:  Yinchu Zhu 
Abstract:  In this paper, we consider the problem of learning models with a latent factor structure. The focus is to find what is possible and what is impossible if the usual strong factor condition is not imposed. We study the minimax rate and adaptivity issues in two problems: pure factor models and panel regression with interactive fixed effects. For pure factor models, if the number of factors is known, we develop adaptive estimation and inference procedures that attain the minimax rate. However, when the number of factors is not specified a priori, we show that there is a trade-off between validity and efficiency: any confidence interval that has uniform validity for arbitrary factor strength has to be conservative; in particular its width is bounded away from zero even when the factors are strong. Conversely, any data-driven confidence interval that does not require as an input the exact number of factors (including weak ones) and has shrinking width under strong factors does not have uniform coverage and the worst-case coverage probability is at most 1/2. For panel regressions with interactive fixed effects, the trade-off is much better. We find that the minimax rate for learning the regression coefficient does not depend on the factor strength and propose a simple estimator that achieves this rate. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.10382&r=all 
By:  Florian Gunsilius 
Abstract:  Partial identification approaches have seen a sharp increase in interest in econometrics due to improved flexibility and robustness compared to point-identification approaches. However, formidable computational requirements of existing approaches often offset these undeniable advantages, particularly in general instrumental variable models with continuous variables. This article introduces a computationally tractable method for estimating bounds on functionals of counterfactual distributions in continuous instrumental variable models. Its potential applications include randomized trials with imperfect compliance, the evaluation of social programs and, more generally, simultaneous equations models. The method does not require functional form restrictions a priori, but can incorporate parametric or nonparametric assumptions into the estimation process. It proceeds by solving an infinite-dimensional program on the paths of a system of counterfactual stochastic processes in order to obtain the counterfactual bounds. A novel "sampling of paths" approach provides the practical solution concept and probabilistic approximation guarantees. As a demonstration of its capabilities, the method provides informative nonparametric bounds on household expenditures under the sole assumption that expenditure is continuous, showing that partial identification approaches can yield informative bounds under minimal assumptions. Moreover, it shows that additional monotonicity assumptions lead to considerably tighter bounds, which constitutes a novel assessment of the identificatory strength of such nonparametric assumptions in a unified framework. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.09502&r=all 
By:  Thiyanga S. Talagala; Feng Li; Yanfei Kang 
Abstract:  This paper introduces a novel meta-learning algorithm for time series forecasting. The efficient Bayesian multivariate surface regression approach is used to model forecast error as a function of features calculated from the time series. The minimum predicted forecast error is then used to identify an individual model or combination of models to produce forecasts. In general, the performance of any meta-learner strongly depends on the reference dataset used to train the model. We further examine the feasibility of using GRATIS (a feature-based time series simulation approach) to generate a diverse, realistic collection of time series for our reference set. The proposed framework is tested using the M4 competition data and is compared against several benchmarks and other commonly used forecasting approaches. The new approach obtains performance comparable to the second and the third rankings of the M4 competition. 
Keywords:  time series, meta-learning, mixture autoregressive models, surface regression, M4 competition 
JEL:  C10 C14 C22 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201921&r=all 
By:  Paul Levine (University of Surrey and CIMS); Joseph Pearlman (City University); Stephen Wright (Birkbeck College); Bo Yang (Swansea University) 
Abstract:  How informative is a time series representation of a given vector of observables about the structural shocks and impulse response functions in a DSGE model? In this paper we refer to this econometrician's problem as “E-invertibility” and consider the corresponding information problem of the agents in the assumed DGP, the DSGE model, which we refer to as “A-invertibility”. We consider how the general nature of the agents' signal extraction problem under imperfect information impacts the econometrician's problem of attempting to infer the nature of structural shocks and associated impulse responses from the data. We also examine a weaker condition of recoverability. A general conclusion is that validating a DSGE model by comparing its impulse response functions with those of a data VAR is more problematic when we drop the common assumption in the literature that agents are endowed with perfect information. We develop measures of approximate fundamentalness for both the perfect and imperfect information cases and illustrate our results using analytical and numerical examples. 
JEL:  C11 C18 C32 E32 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:sur:surrec:1619&r=all 
By:  Matteo Barigozzi; Matteo Luciani 
Abstract:  This paper considers estimation of large dynamic factor models with common and idiosyncratic trends by means of the Expectation Maximization algorithm, implemented jointly with the Kalman smoother. We show that, as the cross-sectional dimension $n$ and the sample size $T$ diverge to infinity, the common component for a given unit estimated at a given point in time is $\min(\sqrt n, \sqrt T)$-consistent. The case of local levels and/or local linear trends is also considered. By means of a Monte Carlo simulation exercise, we compare our approach with estimators based on principal component analysis. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.09841&r=all 
By:  Milan Kumar Das; Anindya Goswami; Sharan Rajani 
Abstract:  We have developed a statistical technique to test the model assumption of the binary regime-switching extension of the geometric L\'{e}vy process (GLP) by proposing a new discriminating statistic. The statistic is sensitive to the transition kernel of the regime-switching model. With this statistic, given time series data, one can test hypotheses on the nature of the regime switching. Furthermore, we have implemented this statistic to test the regime-switching hypothesis on Indian sectoral indices and report the results here. The results show a clear indication of the presence of multiple regimes in the data. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.10606&r=all 
By:  Weilun Zhou; Jiti Gao; David Harris; Hsein Kew 
Abstract:  This paper studies a semiparametric single-index predictive regression model with multiple nonstationary predictors that exhibit co-movement behaviour. Orthogonal series expansion is employed to approximate the unknown link function in the model, and the estimator is derived from an optimization under constraint. The main findings include two types of super-consistency rates for the estimators of the index parameter. A central limit theorem is established for a plug-in estimator of the unknown link function. In the empirical studies, we provide ample evidence in favor of nonlinear predictability of stock returns using four pairs of nonstationary predictors. 
Keywords:  predictive regression, single-index model, Hermite orthogonal estimation, dual super-consistency rates, co-moving predictors. 
JEL:  C13 C14 C32 C51 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201925&r=all 
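The orthogonal-series step can be sketched in isolation: with the index treated as known, the unknown link is approximated by a Hermite polynomial basis and fitted by least squares. The link function, index parameter, and stationary regressors below are assumptions for illustration (in the paper the predictors are nonstationary and co-moving, and the index parameter is estimated jointly).

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(5)
n = 500
theta = np.array([0.8, 0.6])                 # index parameter, ||theta|| = 1
Xm = rng.standard_normal((n, 2))
index = Xm @ theta                            # the single index

link = lambda u: np.sin(u) + 0.5 * u          # "unknown" link, assumed here
y = link(index) + 0.1 * rng.standard_normal(n)

# Least-squares fit of the link on a degree-6 Hermite basis of the index.
B = hermevander(index, 6)                     # n x 7 basis matrix
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ coef
r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
```

In the full procedure the basis regression is nested inside the constrained optimization over the index parameter; the series truncation order plays the role of a smoothing parameter.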
By:  Nathalie Gimenes; Emmanuel Guerre 
Abstract:  This paper introduces a version of the interdependent value model of Milgrom and Weber (1982), where the signals are given by an index gathering signal shifters observed by the econometrician and private ones specific to each bidder. The model primitives are shown to be nonparametrically identified from first-price auction bids under a testable mild rank condition. Identification holds for all possible signal values. This allows one to consider a wide range of counterfactuals where this matters, such as expected revenue in a second-price auction. An estimation procedure is briefly discussed. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.10646&r=all 
By:  Arthur Charpentier; Stephane Mussard; Tea Ouraga 
Abstract:  A principal component analysis based on the generalized Gini correlation index is proposed (Gini PCA). The Gini PCA generalizes the standard PCA based on the variance. It is shown, in the Gaussian case, that the standard PCA is equivalent to the Gini PCA. It is also proven that the dimensionality reduction based on the generalized Gini correlation matrix, which relies on city-block distances, is robust to outliers. Monte Carlo simulations and an application to cars data (with outliers) show the robustness of the Gini PCA and provide different interpretations of the results compared with the variance PCA. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.10133&r=all 
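A rough sketch of the idea: replace the covariance matrix with a Gini-type covariance built from co-movements between each variable and the ranks (empirical CDF values) of the others, then eigendecompose. The particular estimator below is one simple variant chosen for illustration, not necessarily the authors' generalized Gini index; on Gaussian data it reproduces the abstract's claim that Gini PCA and variance PCA agree.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 2000, 3
C = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
X = rng.multivariate_normal(np.zeros(p), C, size=n)

def gini_cov(X):
    """Symmetrized Gini-type covariance: cov of each variable with the others' ranks."""
    n, p = X.shape
    R = np.argsort(np.argsort(X, axis=0), axis=0) / (n - 1)  # empirical CDF values
    Xc, Rc = X - X.mean(0), R - R.mean(0)
    G = Xc.T @ Rc / n
    return (G + G.T) / 2

# Leading principal direction from each matrix.
v_gini = np.linalg.eigh(gini_cov(X))[1][:, -1]
v_var = np.linalg.eigh(np.cov(X.T))[1][:, -1]
alignment = abs(v_gini @ v_var)               # |cosine| between the two directions
```

The robustness argument is that ranks are bounded, so a single outlier moves the Gini matrix far less than it moves the raw covariance.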
By:  Amit Gandhi; JeanFrançois Houde 
Abstract:  We study the estimation of substitution patterns within the discrete choice framework developed by Berry (1994) and Berry, Levinsohn, and Pakes (1995). Our objective is to illustrate the consequences of using weak instruments in this nonlinear GMM context, and to propose a new class of instruments that can be used to estimate a large family of models with aggregate data. We argue that relevant instruments should reflect the (exogenous) degree of differentiation of each product in a market (Differentiation IVs), and we provide a series of examples to illustrate the performance of simple instrument functions. 
JEL:  C35 C36 L13 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:26375&r=all 
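The construction of the instruments can be sketched for a single exogenous characteristic: for each product, measure how far its characteristic sits from those of its rivals in the same market. The quadratic (sum of squared distances) and "local" (count within a bandwidth) forms below follow the differentiation-IV idea; the simulated data and the one-standard-deviation bandwidth are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n_markets, n_products = 20, 10
x = rng.standard_normal((n_markets, n_products))   # one exogenous characteristic

quad_iv = np.zeros_like(x)     # sum of squared distances to rivals
local_iv = np.zeros_like(x)    # number of rivals within one std. dev.
sd = x.std()
for t in range(n_markets):
    for j in range(n_products):
        d = x[t, j] - np.delete(x[t], j)           # distances to rivals in market t
        quad_iv[t, j] = np.sum(d ** 2)
        local_iv[t, j] = np.sum(np.abs(d) < sd)
```

Because the instruments summarize how crowded a product's neighbourhood in characteristic space is, they shift markups and shares without relying on excluded cost shifters.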
By:  Yaya, OlaOluwa S; Ogbonna, Ephraim A; Furuoka, Fumitaka; GilAlana, Luis A. 
Abstract:  This paper proposes a nonlinear unit root test based on the artificial neural network-augmented Dickey-Fuller (ANN-ADF) test for testing hysteresis in unemployment. In this new unit root test, the linear, quadratic and cubic components of the neural network process are used to capture the nonlinearity in the time-series data. Fractional integration methods based on linear and nonlinear trends are also used in the paper. Considering five European countries (France, Italy, the Netherlands, Sweden, and the United Kingdom), the empirical findings indicate that there is still hysteresis in these countries. Among the battery of unit root tests applied, both the ANN-ADF and fractional integration tests fail to reject the hypothesis of unemployment hysteresis in all the countries. 
Keywords:  Unit root process; Nonlinearity; Neural network; Time series; Hysteresis; Unemployment; Europe; Labour market. 
JEL:  C22 
Date:  2019–10–19 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:96621&r=all 
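The test equation suggested by the abstract can be sketched as an ADF-type regression augmented with quadratic and cubic powers of the lagged level. The statistic below is the plain t-ratio on the linear term, computed on a simulated random walk (the hysteresis-like null); the paper's actual statistic, augmentation lags, and critical values may differ, so this is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(9)
T = 300
y = np.cumsum(rng.standard_normal(T))     # a random walk: unit root under the null

# Test regression: dy_t = a + rho*y_{t-1} + g2*y_{t-1}^2 + g3*y_{t-1}^3 + e_t.
dy = np.diff(y)
z = y[:-1]
Xr = np.column_stack([np.ones(T - 1), z, z ** 2, z ** 3])
beta, *_ = np.linalg.lstsq(Xr, dy, rcond=None)

# t-ratio on rho (its null distribution is nonstandard, like the DF statistic).
resid = dy - Xr @ beta
s2 = resid @ resid / (len(dy) - Xr.shape[1])
se = np.sqrt(s2 * np.linalg.inv(Xr.T @ Xr)[1, 1])
t_rho = beta[1] / se
```

Failing to reject the unit root in such a regression is what the abstract reports as evidence of unemployment hysteresis.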
By:  Lu Bai; Lixin Cui; Lixiang Xu; Yue Wang; Zhihong Zhang; Edwin R. Hancock 
Abstract:  In this work, we develop a novel framework to measure the similarity between dynamic financial networks, i.e., time-varying financial networks. In particular, we explore whether the proposed similarity measure can be employed to understand the structural evolution of financial networks over time. For a set of time-varying financial networks, with each vertex representing the time series of a different stock and each edge between a pair of time series weighted by the absolute value of their Pearson correlation, our starting point is to compute the commute time matrix associated with the weighted adjacency matrix of the network structure, where each element of the matrix can be seen as an enhanced correlation value between pairwise stocks. For each network, we show how the commute time matrix allows us to identify a reliable set of dominant correlated time series, together with an associated dominant probability distribution of the stocks belonging to this set. Furthermore, we represent each original network as a discrete dominant Shannon entropy time series computed from the dominant probability distribution. With the dominant entropy time series for each pair of financial networks to hand, we develop a similarity measure based on the classical dynamic time warping framework for analyzing time-varying financial networks. We show that the proposed similarity measure is positive definite and thus corresponds to a kernel measure on graphs. The proposed kernel bridges the gap between graph kernels and the classical dynamic time warping framework for multiple financial time series analysis. Experiments on time-varying networks extracted from the New York Stock Exchange (NYSE) database demonstrate the effectiveness of the proposed approach. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.09153&r=all 
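The first step of the pipeline can be sketched directly: form the absolute-correlation adjacency matrix of the return series and compute the commute-time matrix from the pseudo-inverse of the graph Laplacian. The simulated one-factor returns below are an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(8)
n_stocks, T = 6, 250
common = rng.standard_normal(T)
R = 0.5 * common[None, :] + rng.standard_normal((n_stocks, T))  # correlated returns

# Weighted adjacency: absolute Pearson correlations, no self-loops.
A = np.abs(np.corrcoef(R))
np.fill_diagonal(A, 0.0)

# Commute time from the graph Laplacian: CT(i,j) = vol * (L+_ii + L+_jj - 2*L+_ij).
D = np.diag(A.sum(axis=1))
L = D - A                            # graph Laplacian
Lp = np.linalg.pinv(L)               # Moore-Penrose pseudo-inverse
vol = A.sum()                        # graph volume (total edge weight)

d = np.diag(Lp)
CT = vol * (d[:, None] + d[None, :] - 2 * Lp)   # commute-time matrix
```

Small commute times flag pairs of stocks that are strongly connected both directly and through intermediate paths, which is the "enhanced correlation" reading in the abstract.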