
on Econometrics 
By:  Dogan, Osman; Taspinar, Suleyman 
Abstract:  In this study, we consider Bayesian methods for the estimation of a sample selection model with spatially correlated disturbance terms. We design a set of Markov chain Monte Carlo (MCMC) algorithms based on the method of data augmentation. The natural parameterization for the covariance structure of our model involves an unidentified parameter that complicates posterior analysis. The unidentified parameter, the variance of the disturbance term in the selection equation, is handled in different ways in these algorithms to achieve identification for the other parameters. The Bayesian estimator based on these algorithms can account for the selection bias and the full covariance structure implied by the spatial correlation. We illustrate the implementation of these algorithms through a simulation study. 
Keywords:  Spatial dependence, Spatial sample selection model, Bayesian analysis, Data augmentation 
JEL:  C13 C21 C31 
Date:  2016–12–16 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:82829&r=ecm 
By:  Slawa Rokicki; Jessica Cohen; Gunther Fink; Joshua Salomon; Mary Beth Landrum 
Abstract:  Difference-in-differences (DID) estimation has become increasingly popular as an approach to evaluate the effect of a group-level policy on individual-level outcomes. Several statistical methodologies have been proposed to correct for the within-group correlation of model errors resulting from the clustering of data. Little is known about how well these corrections perform with the often small number of groups observed in health research using longitudinal data. First, we review the most commonly used modelling solutions in DID estimation for panel data, including generalized estimating equations (GEE), permutation tests, clustered standard errors (CSE), wild cluster bootstrapping, and aggregation. Second, we compare the empirical coverage rates and power of these methods using a Monte Carlo simulation study in scenarios in which we vary the degree of error correlation, the group size balance, and the proportion of treated groups. Third, we provide an empirical example using the Survey of Health, Ageing and Retirement in Europe (SHARE). When the number of groups is small, CSE are systematically biased downwards in scenarios in which data are unbalanced or the proportion of treated groups is low. This can result in over-rejection of the null even when data are composed of up to 50 groups. Aggregation, permutation tests, bias-adjusted GEE, and the wild cluster bootstrap produce coverage rates close to the nominal rate for almost all scenarios, though GEE may suffer from low power. In DID estimation with a small number of groups, analysis using aggregation, permutation tests, the wild cluster bootstrap, or bias-adjusted GEE is recommended. 
Keywords:  Difference-in-differences; Clustered standard errors; Inference; Monte Carlo simulation; GEE 
JEL:  C18 C52 I10 
Date:  2018–01 
URL:  http://d.repec.org/n?u=RePEc:qub:charms:1801&r=ecm 
By:  Taspinar, Suleyman; Dogan, Osman; Bera, Anil K. 
Abstract:  In this study, we formulate adjusted gradient tests for cases in which the alternative model used to construct the tests deviates from the true data generating process for a spatial dynamic panel data (SDPD) model. Following Bera et al. (2010), we introduce these adjusted gradient tests along with the standard ones within a GMM framework. These tests can be used to detect the presence of (i) the contemporaneous spatial lag terms, (ii) the time lag term, and (iii) the spatial time lag terms in a higher-order SDPD model. These adjusted tests have two advantages: (i) their null asymptotic distribution is a central chi-squared distribution irrespective of the misspecified alternative model, and (ii) their test statistics are computationally simple and require only the ordinary least-squares (OLS) estimates from a non-spatial two-way panel data model. We investigate the finite sample size and power properties of these tests through Monte Carlo studies. Our results indicate that the adjusted gradient tests have good finite sample properties. 
Keywords:  Spatial Dynamic Panel Data Model, SDPD, GMM, Robust LM Tests, GMM Gradient Tests, Inference 
JEL:  C13 C21 C31 
Date:  2017 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:82830&r=ecm 
By:  Harin, Alexander 
Abstract:  A forbidden zones theorem is deduced in the present article, and its consequences and applications are given a preliminary consideration. The following statement is proven: if some non-zero lower bound exists for the variance of a random variable that takes on values in a finite interval, then non-zero bounds, or forbidden zones, exist for its expectation near the boundaries of the interval. The article is motivated by the need for rigorous theoretical support for the practical analysis that has been performed of the influence of scattering and noise in behavioral economics, decision sciences, and utility and prospect theories. If noise is one of the possible causes of the above lower bound on the variance, then it can create or widen such forbidden zones, so the theorem provides new possibilities for the mathematical description of the influence of such noise. The considered forbidden zones can evidently lead to biases in measurements. 
Keywords:  probability; variance; noise; utility theory; prospect theory; behavioral economics; decision sciences; measurement 
JEL:  C02 C1 D8 D81 
Date:  2018–01–29 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:84248&r=ecm 
By:  Pavlo Mozharovskyi (CREST; ENSAI; Université Bretagne Loire); Julie Josse (CMAP; Ecole polytechnique); François Husson (IRMAR; Applied Mathematics Unit; Agrocampus Ouest) 
Abstract:  The presented methodology for single imputation of missing values borrows the idea of data depth, a measure of centrality defined for an arbitrary point of the space with respect to a probability distribution or a data cloud. The method consists of iteratively maximizing the depth of each observation with missing values, and can be employed with any properly defined statistical depth function. In each iteration, imputation reduces to the optimization of a quadratic, linear, or quasi-concave function, solved analytically, by linear programming, or by the Nelder-Mead method, respectively. Being able to grasp the underlying data topology, the procedure is distribution-free, imputes close to the data, preserves prediction possibilities, in contrast to local imputation methods (k-nearest neighbors, random forest), and has attractive robustness and asymptotic properties under elliptical symmetry. It is shown that a particular case, using the Mahalanobis depth, has direct connections to well-known treatments of the multivariate normal model, such as iterated regression and regularized PCA. The methodology is extended to multiple imputation for data stemming from an elliptically symmetric distribution. Simulation and real data studies positively contrast the procedure with existing popular alternatives. The method has been implemented as an R package. 
Keywords:  Elliptical symmetry, Outliers, Tukey depth, Zonoid depth, Nonparametric imputation, Convex optimization 
Date:  2017–12–14 
URL:  http://d.repec.org/n?u=RePEc:crs:wpaper:201772&r=ecm 
By:  Richard T. Carson (Department of Economics, University of California); Mikołaj Czajkowski (Faculty of Economic Sciences, University of Warsaw) 
Abstract:  We show that a substantive problem exists with the widely used ratio-of-coefficients approach to calculating willingness to pay (WTP) from choice models. The correctly calculated standard error for WTP using this approach is shown to always be infinite. A variant of this problem has long been recognized for mixed logit models; we show it occurs even in simple models, such as the conditional logit used as a baseline reference specification. It occurs because the standard error for the cost parameter implies some possibility that the true parameter value is arbitrarily close to zero. We propose a simple yet elegant way to overcome this problem: reparameterizing the coefficient of the (negative) cost variable to enforce the theoretically correct (and empirically almost always found) positive coefficient using an exponential transformation of the original parameter. This reparameterization enforces the desired restriction that no part of the confidence region for the original cost parameter spans zero. With it, the confidence interval for WTP is finite and well behaved. Our proposed model is straightforward to implement using readily available software. Its log-likelihood value is the same as that of the usual baseline discrete choice model, and we recommend its use as the new standard baseline reference model. 
Keywords:  conditional logit, confidence intervals, contingent valuation, delta method, discrete choice experiment, Krinsky-Robb, multinomial logit, probit, welfare measures 
JEL:  C01 C15 C18 Q0 
Date:  2018 
URL:  http://d.repec.org/n?u=RePEc:war:wpaper:201804&r=ecm 
By:  Federico Crudu 
Abstract:  This paper introduces a novel method to estimate linear models when explanatory variables are observed with error and many proxies are available. The empirical Euclidean likelihood principle is used to combine the information that comes from the various mismeasured variables. We show that the proposed estimator is consistent and asymptotically normal. In a Monte Carlo study we show that our method is able to efficiently use the information in the available proxies, both in terms of precision of the estimator and in terms of statistical power. An application to the effect of police on crime suggests that measurement errors in the police variable induce substantial attenuation bias. Our approach, on the other hand, yields large estimates in absolute value with high precision, in accordance with the results put forward by the recent literature. 
Keywords:  data combination, empirical Euclidean likelihood, errors-in-variables, instrumental variables 
JEL:  C13 C26 C30 C36 
Date:  2017–01 
URL:  http://d.repec.org/n?u=RePEc:usi:wpaper:774&r=ecm 
By:  Douglas Patterson; Melvin Hinich; Denisa Roberts 
Abstract:  This article develops a statistical test for the null hypothesis of strict stationarity of a discrete time stochastic process. When the null hypothesis is true, the second order cumulant spectrum is zero at all the discrete Fourier frequency pairs present in the principal domain of the cumulant spectrum. The test uses a frame (window) averaged sample estimate of the second order cumulant spectrum to build a test statistic that has an asymptotic complex standard normal distribution. We derive the test statistic, study the size and power properties of the test, and demonstrate its implementation with intraday stock market return data. The test has conservative size properties and good power to detect varying variance and unit root in the presence of varying variance. 
Date:  2018–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1801.06727&r=ecm 
By:  Emanuele Bacchiocchi; Andrea Bastianin; Alessandro Missale; Eduardo Rossi 
Abstract:  We develop a new VAR model for structural analysis with mixed-frequency data. The MIDAS-SVAR model makes it possible to identify structural dynamic links by exploiting the information contained in variables sampled at different frequencies. It also provides a general framework for testing homogeneous frequency-based representations against mixed-frequency data models. A set of Monte Carlo experiments suggests that the test performs well in terms of both size and power. The MIDAS-SVAR is then used to study how monetary policy and financial market volatility affect the dynamics of gross capital inflows to the US. While no relation is found when using standard quarterly data, exploiting the within-quarter variability of the series shows that the effect of an interest rate shock is greater the longer the time lag between the month of the shock and the end of the quarter. 
Date:  2018–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1802.00793&r=ecm 
By:  Hiroyuki Kasahara; Katsumi Shimotsu 
Abstract:  Markov regime switching models have been used in numerous empirical studies in economics and finance. However, the asymptotic distribution of the likelihood ratio test statistic for testing the number of regimes in Markov regime switching models has been an unresolved problem. This paper derives the asymptotic distribution of the likelihood ratio test statistic for testing the null hypothesis of $M_0$ regimes against the alternative hypothesis of $M_0 + 1$ regimes for any $M_0 \geq 1$, both under the null hypothesis and under local alternatives. We show that the contiguous alternatives converge to the null hypothesis at a rate of $n^{-1/8}$ in regime switching models with normal density. The asymptotic validity of the parametric bootstrap is also established. 
Date:  2018–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1801.06862&r=ecm 
By:  Olivier Collier (Modal'X; Université Paris-Nanterre; CREST; ENSAE); Arnak Dalalyan (Modal'X; Université Paris-Nanterre; CREST; ENSAE) 
Abstract:  Assume that we observe a sample of size n composed of p-dimensional signals, each signal having independent entries drawn from a scaled Poisson distribution with an unknown intensity. We are interested in estimating the sum of the n unknown intensity vectors, under the assumption that most of them coincide with a given "background" signal. The number s of p-dimensional signals different from the background signal plays the role of sparsity, and the goal is to leverage this sparsity assumption in order to improve the quality of estimation as compared to the naive estimator that computes the sum of the observed signals. We first introduce the group hard thresholding estimator and analyze its mean squared error, measured by the squared Euclidean norm. We establish a non-asymptotic upper bound showing that the risk is at most of the order of theta^2 (sp + s^2 sqrt(p)) log^{3/2}(np). We then establish lower bounds on the minimax risk over a properly defined class of collections of s-sparse signals. These lower bounds match the upper bound, up to logarithmic terms, when the dimension p is fixed or of larger order than s^2. In the case where the dimension p increases but remains of smaller order than s^2, our results show a gap between the lower and upper bounds, which can be up to order sqrt(p). 
Keywords:  Non-asymptotic minimax estimation, linear functional, group sparsity, thresholding, Poisson processes 
Date:  2017–12–05 
URL:  http://d.repec.org/n?u=RePEc:crs:wpaper:201719&r=ecm 
By:  Alexandra Carpentier (Institut für Mathematik, Universität Potsdam); Olga Klopp (ESSEC Business School ; CREST); Matthias Löffler (University of Cambridge, Statistical Laboratory, Centre for Mathematical Sciences) 
Abstract:  In the present note we consider the problem of constructing honest and adaptive confidence sets for the matrix completion problem. For the Bernoulli model with known variance of the noise, we provide a realizable method for constructing confidence sets that adapt to the unknown rank of the true matrix. 
Keywords:  low rank recovery, confidence sets, adaptivity, matrix completion 
Date:  2017–12–08 
URL:  http://d.repec.org/n?u=RePEc:crs:wpaper:201741&r=ecm 
By:  Fischer, Thomas; Krauss, Christopher; Treichel, Alex 
Abstract:  We present a comprehensive simulation study to assess and compare the performance of popular machine learning algorithms for time series prediction tasks. Specifically, we consider the following algorithms: multilayer perceptron (MLP), logistic regression, naïve Bayes, k-nearest neighbors, decision trees, random forests, and gradient-boosting trees. These models are applied to time series from eight data generating processes (DGPs), reflecting different linear and nonlinear dependencies (base case). Additional complexity is introduced by adding discontinuities and varying degrees of noise. Our findings reveal that advanced machine learning models are capable of approximating the optimal forecast very closely in the base case, with nonlinear models, particularly the MLP, in the lead across all DGPs. By contrast, logistic regression is remarkably robust in the presence of noise, and thus yields the most favorable accuracy metrics on raw data, prior to preprocessing. When adequate preprocessing techniques, such as first differencing and the local outlier factor, are introduced, the picture is reversed, and the MLP as well as other nonlinear techniques once again become the modeling techniques of choice. 
Date:  2018 
URL:  http://d.repec.org/n?u=RePEc:zbw:iwqwdp:022018&r=ecm 
By:  Iván Fernández-Val; Aico van Vuuren; Francis Vella 
Abstract:  We consider identification and estimation of nonseparable sample selection models with censored selection rules. We employ a control function approach and discuss different objects of interest based on (1) local effects conditional on the control function, and (2) global effects obtained from integration over ranges of values of the control function. We provide conditions under which these objects are appropriate for the total population. We also present results regarding the estimation of counterfactual distributions. We derive conditions for identification for these different objects and suggest strategies for estimation. We also provide the associated asymptotic theory. These strategies are illustrated in an empirical investigation of the determinants of female wages and wage growth in the United Kingdom. 
Date:  2018–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1801.08961&r=ecm 
By:  J. Eduardo Vera-Valdés 
Abstract:  The fractional difference operator remains the most popular mechanism for generating long memory, owing to the existence of efficient algorithms for its simulation and forecasting. Nonetheless, there is no theoretical argument linking the fractional difference operator to the presence of long memory in real data. In this regard, one of the predominant theoretical explanations for the presence of long memory is cross-sectional aggregation of persistent micro units. Yet the type of process obtained by cross-sectional aggregation differs from the one due to fractional differencing. Thus, this paper develops fast algorithms to generate and forecast long memory by cross-sectional aggregation. Moreover, it is shown that the anti-persistent phenomenon that arises for negative degrees of memory in the fractional difference literature is not present for cross-sectionally aggregated processes. In particular, while the autocorrelations for the fractional difference operator are negative for negative degrees of memory by construction, this restriction does not apply to the cross-sectional aggregation scheme. We show that this has implications for long memory tests in the frequency domain, which will be misspecified for cross-sectionally aggregated processes with negative degrees of memory. Finally, we assess the forecast performance of high-order AR and ARFIMA models when the long memory series are generated by cross-sectional aggregation. Our results are of interest to practitioners developing forecasts of long memory variables like inflation, volatility, and climate data, where aggregation may be the source of long memory. 
Date:  2018–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1801.06677&r=ecm 
By:  Martin Feldkircher (Oesterreichische Nationalbank (OeNB)); Florian Huber (Department of Economics, Vienna University of Economics and Business); Gregor Kastner (Department of Mathematics and Statistics, Vienna University of Economics and Business) 
Abstract:  We assess the relationship between model size and complexity in the time-varying parameter VAR framework via thorough predictive exercises for the Euro Area, the United Kingdom, and the United States. It turns out that sophisticated dynamics through drifting coefficients are important in small data sets, while simpler models tend to perform better in sizeable data sets. To combine the best of both worlds, novel shrinkage priors help to mitigate the curse of dimensionality, resulting in competitive forecasts for all scenarios considered. Furthermore, we discuss dynamic model selection to improve upon the best-performing individual model for each point in time. 
Keywords:  Global-local shrinkage priors, density predictions, hierarchical modeling, stochastic volatility, dynamic model selection 
JEL:  C11 C30 C53 E52 
Date:  2018–01 
URL:  http://d.repec.org/n?u=RePEc:wiw:wiwwuw:wuwp260&r=ecm 