
on Econometrics 
By:  Pua, Andrew Adrian Yu; Fritsch, Markus; Schnurbus, Joachim 
Abstract:  We propose an instrumental variables (IV) estimator based on nonlinear (in parameters) moment conditions for estimating linear dynamic panel data models and derive the large sample properties of the estimator. We assume that the only explanatory variable in the model is one lag of the dependent variable and consider the setting where the absolute value of the true lag parameter is smaller than or equal to one, the cross section dimension is large, and the time series dimension is either fixed or large. Estimation of the lag parameter involves solving a quadratic equation, and we find that the lag parameter is point identified in the unit root case; otherwise, two distinct roots (solutions) result. We propose a selection rule that identifies the consistent root asymptotically in the latter case and derive the asymptotic distribution of the estimator both for the unit root case and for the case when the absolute value of the lag parameter is smaller than one. 
Keywords:  panel data, linear dynamic model, quadratic moment conditions, instrumental variables, large sample properties 
JEL:  C23 C26 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:zbw:upadbr:b3719&r=all 
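The abstract above hinges on solving a quadratic estimating equation and then selecting among its two roots. The following is a minimal sketch of that generic structure only; the coefficients and the distance-to-pilot-estimate selection rule are hypothetical illustrations, not the paper's actual moment conditions or its asymptotic selection rule.

```python
import math

def solve_quadratic_moment(a, b, c):
    # A quadratic estimating equation a*rho^2 + b*rho + c = 0 generically
    # yields two candidate roots for the lag parameter.
    disc = b**2 - 4 * a * c
    if disc < 0:
        raise ValueError("no real root: the moment condition has no solution")
    r1 = (-b + math.sqrt(disc)) / (2 * a)
    r2 = (-b - math.sqrt(disc)) / (2 * a)
    return r1, r2

def select_root(roots, aux_criterion):
    # One simple selection rule: pick the root minimizing an auxiliary
    # criterion, e.g. distance to a first-stage (pilot) estimate.
    return min(roots, key=aux_criterion)

roots = solve_quadratic_moment(1.0, -1.3, 0.36)   # roots 0.9 and 0.4
first_stage = 0.85                                # hypothetical pilot estimate
rho_hat = select_root(roots, lambda r: abs(r - first_stage))
print(rho_hat)  # 0.9
```

In the unit root case the two roots coincide, which is why the parameter is point identified there.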
By:  Zheng Fang; Juwon Seo 
Abstract:  This paper presents a general and uniformly valid procedure for conducting inference on shape restrictions. The key insight we exploit is that common shape restrictions in economics often form convex cones, a simple yet elegant structure that has barely been harnessed in the literature. Based on a monotonicity property afforded by this geometric structure, we develop a bootstrap test that is computationally convenient to implement. In particular, unlike many studies in similar nonstandard settings, the procedure dispenses with set estimation, and the critical values are obtained as simply as computing the test statistic. Moreover, by appealing to the machinery of strong approximations, we accommodate nonparametric settings where estimators of the parameter of interest may not admit asymptotic distributions. We establish asymptotic uniform size control of our test and characterize classes of alternatives against which it has nontrivial power. Since the test entails a tuning parameter (due to the inherent nonstandard nature of the problem), we propose a data-driven choice and prove its validity. Monte Carlo simulations confirm that our test works well. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.07689&r=all 
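A concrete instance of a convex-cone shape restriction is monotonicity: the Euclidean projection of a vector onto the cone of nondecreasing sequences is computed by the pool adjacent violators algorithm (PAVA). This sketch merely illustrates the cone geometry the paper exploits; it is not the paper's bootstrap test.

```python
def project_monotone_cone(y):
    # Pool Adjacent Violators: Euclidean projection of y onto the convex
    # cone of nondecreasing sequences (isotonic regression).
    out = []  # stack of [block mean, block size]
    for v in y:
        out.append([float(v), 1])
        # merge adjacent blocks while monotonicity is violated
        while len(out) > 1 and out[-2][0] > out[-1][0]:
            m2, s2 = out.pop()
            m1, s1 = out.pop()
            out.append([(m1 * s1 + m2 * s2) / (s1 + s2), s1 + s2])
    fitted = []
    for mean, size in out:
        fitted.extend([mean] * size)
    return fitted

print(project_monotone_cone([1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]
```

The violating pair (3, 2) is pooled to its mean 2.5, which is exactly the projection onto the cone.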
By:  Beaumont, Paul; Smallwood, Aaron 
Abstract:  Despite a recent proliferation of research using cyclical long memory, surprisingly little is known regarding the asymptotic properties of likelihood-based methods. Estimators have been studied in both the time and frequency domains for the Gegenbauer autoregressive moving average (GARMA) process. However, a full set of asymptotic results for all parameters has only been proposed by Chung (1996a,b), who presents somewhat tenuous results without an initial consistency proof. In this paper, we review the GARMA process and the properties of frequency- and time-domain likelihood-based estimators using Monte Carlo analysis. The results demonstrate the strong efficacy of both estimators and generally support the proposed theory of Chung for the parameter governing the cycle length. Important caveats remain, however. The results show that asymptotic confidence bands can be unreliable in very small samples under weak long memory, and the distribution theory under the null of an infinitely long cycle appears to be unusable. Possible solutions are proposed, including the use of narrower confidence bands and the application of theory under the alternative of finite cycles. 
Keywords:  long memory, GARMA, CSS estimator, Whittle estimator 
JEL:  C22 C4 C40 C5 
Date:  2019–09–30 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:96313&r=all 
By:  Hecq, Alain; Issler, João Victor; Telg, Sean 
Abstract:  The mixed causal-noncausal autoregressive (MAR) model has been proposed to estimate time series processes involving explosive roots in the autoregressive part, as it allows for stationary forward and backward solutions. Possible exogenous variables are substituted into the error term to ensure the univariate MAR structure of the variable of interest. To study the impact of fundamental exogenous variables directly, we instead consider a MARX representation, which allows for the inclusion of exogenous regressors. We argue that, contrary to MAR models, MARX models might be identified using second-order properties. The asymptotic distribution of the MARX parameters is derived assuming a class of non-Gaussian densities. We assume a Student's t likelihood to derive closed-form solutions for the corresponding standard errors. By means of Monte Carlo simulations, we evaluate the accuracy of MARX model selection based on information criteria. We examine the influence of the U.S. exchange rate and industrial production index on several commodity prices. 
Date:  2019–10–10 
URL:  http://d.repec.org/n?u=RePEc:fgv:epgewp:810&r=all 
By:  Fritsch, Markus 
Abstract:  The linear dynamic panel data model provides a possible avenue to deal with unobservable individual-specific heterogeneity and dynamic relationships in panel data. The model structure renders standard estimation techniques inconsistent. Estimation and inference can, however, be carried out with the generalized method of moments (GMM) by suitably aggregating population orthogonality conditions deduced directly from the underlying modeling assumptions. Different variations of these assumptions are proposed in the literature, often without a thorough discussion of the implications for estimation and inference. This paper aims to enhance the understanding of the assumptions and their interplay by connecting the assumptions with the conditions required to establish identification and consistency, derive the asymptotic properties, and carry out inference for the GMM estimator. 
Keywords:  GMM, linear dynamic panel data model, identification, large sample properties, inference 
JEL:  C10 C23 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:zbw:upadbr:b3619&r=all 
By:  Jason Anastasopoulos 
Abstract:  The regression discontinuity design (RDD) has become the "gold standard" for causal inference with observational data. Local average treatment effects (LATE) in RDDs are often estimated using local linear regressions, with pretreatment covariates typically added to increase the efficiency of treatment effect estimates; their inclusion, however, can have large impacts on LATE point estimates and standard errors, particularly in small samples. In this paper, I propose a principled, efficiency-maximizing approach to covariate adjustment of LATE in RDDs. This approach allows researchers to combine context-specific, substantive insights with automated model selection via a novel adaptive lasso algorithm. When combined with currently existing robust estimation methods, this approach improves the efficiency of LATE estimation in RDDs with pretreatment covariates. The approach will be implemented in a forthcoming R package, AdaptiveRDD, which can be used to estimate and compare treatment effects generated by this approach with extant approaches. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.06381&r=all 
By:  Pua, Andrew Adrian Yu; Fritsch, Markus; Schnurbus, Joachim 
Abstract:  We study the estimation of the lag parameter of linear dynamic panel data models with first-order dynamics based on the quadratic Ahn and Schmidt (1995) moment conditions. Our contribution is twofold: First, we show that extending the standard assumptions by mean stationarity and time series homoscedasticity and employing these assumptions in estimation restores standard asymptotics and mitigates the nonstandard distributions found in the literature. Second, we consider an IV estimator based on the quadratic moment conditions that consistently identifies the true population parameter under standard assumptions. Standard asymptotics hold for the estimator when the cross section dimension is large and the time series dimension is finite. We also suggest a data-driven approach to obtain standard errors and confidence intervals that preserves the time series dependence structure in the data. 
Keywords:  panel data, linear dynamic model, quadratic moment conditions, root selection, standard asymptotics, inference 
JEL:  C18 C23 C26 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:zbw:upadbr:b3819&r=all 
By:  Atefeh Zamani; Hossein Haghbin; Maryam Hashemi; Rob J Hyndman 
Abstract:  Functional autoregressive models are popular for functional time series analysis, but the standard formulation fails to address seasonal behaviour in functional time series data. To overcome this shortcoming, we introduce seasonal functional autoregressive time series models. For the model of order one, we derive sufficient stationarity conditions and limiting behaviour, and provide estimation and prediction methods. Some properties of the general order-P model are also presented. The merits of these models are demonstrated using simulation studies and via an application to real data. 
Keywords:  functional time series analysis, seasonal functional autoregressive model, central limit theorem, prediction, estimation 
JEL:  C32 C14 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201916&r=all 
By:  Jushan Bai; Sung Hoon Choi; Yuan Liao 
Abstract:  This paper develops a new standard-error estimator for linear panel data models. The proposed estimator is robust to heteroskedasticity, serial correlation, and cross-sectional correlation of unknown form. Serial correlation is controlled by the Newey-West method. To control cross-sectional correlations, we propose to use the thresholding method, without assuming the clusters to be known. We establish the consistency of the proposed estimator. Monte Carlo simulations show that the method works well. An empirical application is considered. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.07406&r=all 
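The Newey-West step mentioned above downweights sample autocovariances with a Bartlett kernel. A minimal scalar sketch of that long-run variance calculation (the paper's cross-sectional thresholding step is not shown; the series is assumed mean-zero):

```python
import numpy as np

def newey_west_lrv(u, L):
    # Newey-West long-run variance of a mean-zero scalar series u,
    # with Bartlett-kernel weights w_l = 1 - l / (L + 1).
    T = len(u)
    lrv = u @ u / T                      # lag-0 autocovariance
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1)            # Bartlett weight, guarantees lrv >= 0
        gamma_l = u[l:] @ u[:-l] / T     # lag-l sample autocovariance
        lrv += 2.0 * w * gamma_l
    return lrv

rng = np.random.default_rng(0)
u = rng.standard_normal(500)
print(newey_west_lrv(u, L=4))
```

With `L=0` the formula collapses to the plain sample variance, recovering the heteroskedasticity-only case.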
By:  David M. Kaplan (Department of Economics, University of Missouri) 
Abstract:  Bias and variance help measure how bad (or good) an estimator is. When considering a single estimate, minimizing variance plus squared bias (i.e., mean squared error) is optimal in a certain sense. Sometimes a smoothing parameter is explicitly chosen to produce such an optimal estimator. However, important parameters in economics are often estimated multiple times, in many studies over many years, collectively contributing to a public body of evidence. From this perspective, the bias of each single estimate is relatively more important, even if mean squared error minimization remains the goal. This suggests some tension between the single best estimate a paper can report and the estimate that contributes most to the public good. Simulations compare instrumental variables and linear regression, as well as different levels of smoothing for instrumental variables quantile regression. 
Keywords:  bias, mean squared error, meta-analysis, optimal estimation, science 
JEL:  C44 C52 
Date:  2019–09–27 
URL:  http://d.repec.org/n?u=RePEc:umc:wpaper:1911&r=all 
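The trade-off discussed above rests on the decomposition MSE = bias² + variance. A quick Monte Carlo check using a deliberately biased shrinkage estimator of a mean (the shrinkage factor and sample sizes are illustrative, not from the paper):

```python
import numpy as np

# Shrinkage estimator 0.8 * sample mean: lower variance, nonzero bias.
rng = np.random.default_rng(1)
theta = 2.0
samples = theta + rng.standard_normal((20000, 50))   # 20000 studies of n = 50
estimates = 0.8 * samples.mean(axis=1)

bias = estimates.mean() - theta            # approx -0.2 * theta = -0.4
variance = estimates.var()                 # approx 0.64 / 50
mse = ((estimates - theta) ** 2).mean()
print(bias**2 + variance, mse)             # the two quantities agree
```

The identity holds exactly in-sample (with the population-style variance), which is why minimizing MSE means trading squared bias against variance.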
By:  Florian Ziel; Kevin Berk 
Abstract:  In recent years, probabilistic forecasting has become an important topic, creating a growing need for suitable methods to evaluate multivariate predictions. We analyze the sensitivity of the most common scoring rules, especially regarding the quality of the forecast dependency structures. Additionally, we propose scoring rules based on the copula, which uniquely describes the dependency structure for every probability distribution with continuous marginal distributions. Efficient estimation of the considered scoring rules and evaluation methods such as the Diebold-Mariano test are discussed. In detailed simulation studies, we compare the performance of the established scoring rules and the ones we propose. Besides extended synthetic studies based on recently published results, we also consider a real data example. We find that the energy score, which is probably the most widely used multivariate scoring rule, performs comparably well in detecting forecast errors, also regarding dependencies. This contradicts other studies. The results also show that a proposed copula score provides a very strong distinction between models with correct and incorrect dependency structures. We close with a comprehensive discussion of the proposed methodology. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.07325&r=all 
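The energy score discussed above can be estimated directly from an ensemble of forecast draws via ES(F, y) = E‖X − y‖ − ½ E‖X − X′‖. A minimal sketch (using the simple all-pairs average, a slight variant of the m(m−1)-pair estimator; distributions and seeds are illustrative):

```python
import numpy as np

def energy_score(ens, y):
    # ens: (m, d) draws from the forecast distribution; y: (d,) realized outcome.
    term1 = np.mean(np.linalg.norm(ens - y, axis=1))          # E||X - y||
    diffs = ens[:, None, :] - ens[None, :, :]                 # all pairs
    term2 = np.linalg.norm(diffs, axis=2).mean()              # E||X - X'||
    return term1 - 0.5 * term2

rng = np.random.default_rng(2)
good = rng.standard_normal((500, 2))        # forecast matching the truth N(0, I)
bad = 3.0 * rng.standard_normal((500, 2))   # overdispersed forecast
y = rng.standard_normal(2)                  # outcome drawn from N(0, I)
print(energy_score(good, y), energy_score(bad, y))
```

Lower is better: the well-calibrated ensemble receives the smaller score.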
By:  Joshua C.C. Chan 
Abstract:  Time-varying parameter VARs with stochastic volatility are routinely used for structural analysis and forecasting in settings involving a few macroeconomic variables. Applying these models to high-dimensional datasets has proved to be challenging due to intensive computations and overparameterization concerns. We develop an efficient Bayesian sparsification method for a class of models we call hybrid TVP-VARs: VARs with time-varying parameters in some equations but constant coefficients in others. Specifically, for each equation, the new method automatically decides (i) whether the VAR coefficients are constant or time-varying, and (ii) whether the error variance is constant or has a stochastic volatility specification. Using US datasets of various dimensions, we find evidence that the VAR coefficients and error variances in some, but not all, equations are time-varying. These large hybrid TVP-VARs also forecast better than standard benchmarks. 
Keywords:  large vector autoregression, time-varying parameter, stochastic volatility, trend output growth, macroeconomic forecasting 
JEL:  C11 C52 E37 E47 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:een:camaaa:201977&r=all 
By:  Tetsuya Kaji 
Abstract:  This paper develops asymptotic theory for integrals of empirical quantile functions with respect to random weight functions, an extension of classical $L$-statistics. Such integrals appear when sample trimming or Winsorization is applied to asymptotically linear estimators. The key idea is to consider empirical processes in spaces appropriate for integration. First, we characterize weak convergence of empirical distribution functions and random weight functions in the space of bounded integrable functions. Second, we establish the delta method for empirical quantile functions as integrable functions. Third, we derive the delta method for $L$-statistics. Finally, we prove weak convergence of their bootstrap processes, showing the validity of nonparametric bootstrap. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.07572&r=all 
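The trimming mentioned above turns a sample mean into a simple $L$-statistic: an average of order statistics with indicator weights over the middle quantiles. A toy sketch (the trimming fraction and data are illustrative):

```python
def trimmed_mean(x, alpha):
    # L-statistic view: weight the order statistics by an indicator over
    # the quantile range [alpha, 1 - alpha], dropping the extremes.
    xs = sorted(x)
    k = int(len(xs) * alpha)          # number trimmed from each tail
    kept = xs[k: len(xs) - k]
    return sum(kept) / len(kept)

data = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
print(trimmed_mean(data, 0.1))  # 4.5: the outlier 100.0 is trimmed away
```

Because the trimming point depends on the sample, the effective weight function is random, which is exactly the complication the paper's theory addresses.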
By:  Imma Valentina Curato; Simona Sanfelici 
Abstract:  We study the finite sample properties of the Fourier estimator of the integrated leverage effect in the presence of microstructure noise contamination. Our estimation strategy is related to a measure of the contemporaneous correlation between financial returns and their volatility increments. We do not assume a priori that the aforementioned correlation is constant, as is mainly done in the literature. We instead consider it as a stochastic process. In this framework, we show that the Fourier estimator is asymptotically unbiased but that its mean squared error diverges when noisy high-frequency data are in use. This drawback of the estimator is further analyzed in a simulation study, where a feasible estimation strategy is developed to tackle the problem. The paper concludes with an empirical study of the leverage effect patterns estimated using high-frequency data for the S&P 500 futures between January 2007 and December 2008. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.06660&r=all 
By:  Shanika L Wickramasuriya; Berwin A Turlach; Rob J Hyndman 
Abstract:  The sum of forecasts of a disaggregated time series is often required to equal the forecast of the aggregate. The least squares solution for finding coherent forecasts uses a reconciliation approach known as MinT, proposed by Wickramasuriya, Athanasopoulos and Hyndman (2019). The MinT approach and its variants do not guarantee that the coherent forecasts are nonnegative, even when all of the original forecasts are nonnegative. This has become a serious issue in applications that are inherently nonnegative, such as sales data or tourism numbers. To overcome this difficulty, we consider the analytical solution of MinT as a least squares minimization problem. Nonnegativity constraints are then imposed on the minimization problem to ensure that the coherent forecasts are nonnegative. Considering the dimension and sparsity of the matrices involved, and an alternative representation of MinT, this constrained quadratic programming problem is solved using three algorithms: the block principal pivoting algorithm, the projected conjugate gradient algorithm, and the scaled gradient projection algorithm. A Monte Carlo simulation is performed to evaluate the computational performance of these algorithms. The results demonstrate that the block principal pivoting algorithm clearly outperforms the rest, with projected conjugate gradient second best. The superior performance of the block principal pivoting algorithm can be partially attributed to the alternative representation of the weight matrix in the MinT approach. An empirical investigation is carried out to assess the impact of imposing nonnegativity constraints on forecast reconciliation. We observe slight gains in forecast accuracy at the most disaggregated level and slight losses at the aggregated levels. Although the gains or losses are negligible, the procedure plays an important role in decision and policy implementation processes. 
Keywords:  aggregation, Australian tourism, coherent forecasts, contemporaneous error correlation, forecast combinations, least squares, nonnegative, spatial correlations, reconciliation 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201915&r=all 
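The constrained problem described above can be made concrete on a tiny two-level hierarchy. This sketch solves min‖ŷ − Sb‖² subject to b ≥ 0 by projected gradient descent, which is not one of the paper's three algorithms (and omits the MinT weight matrix); it only illustrates the quadratic program being solved.

```python
import numpy as np

def reconcile_nonneg(S, y_hat, steps=5000):
    # Projected gradient descent on the least squares objective, with
    # projection onto the nonnegative orthant after each step.
    b = np.zeros(S.shape[1])
    step = 1.0 / np.linalg.norm(S.T @ S, 2)      # 1 / Lipschitz constant
    for _ in range(steps):
        b = b - step * (S.T @ (S @ b - y_hat))   # gradient step
        b = np.maximum(b, 0.0)                   # enforce b >= 0
    return b

# Hierarchy: total = b1 + b2, so rows of S are [total, series 1, series 2].
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
y_hat = np.array([10.0, 11.0, -2.0])   # incoherent base forecasts, one negative
b = reconcile_nonneg(S, y_hat)
print(b, S @ b)                         # bottom level and coherent forecasts
```

The unconstrained least squares solution here would set the second bottom-level value negative; the constraint pins it at zero and re-optimizes the rest.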
By:  Jushan Bai; Serena Ng 
Abstract:  This paper suggests a factor-based imputation procedure that uses the factors estimated from a TALL block along with the re-rotated loadings estimated from a WIDE block to impute missing values in a panel of data. Under a strong factor assumption, it is shown that the common component can be consistently estimated, but there will be four different convergence rates. Re-estimation of the factors from the imputed data matrix can accelerate convergence. A complete characterization of the sampling error is obtained without requiring regularization or imposing the missing-at-random assumption. Under the assumption that the potential outcome has a factor structure, we provide a distribution theory for the estimated treatment effect on the treated. The relation between the incoherence conditions used in matrix completion and the strong factor assumption is also discussed. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.06677&r=all 
By:  D.S.G. Pollock 
Abstract:  An autoregressive moving-average model in discrete time is driven by a forcing function that is necessarily limited in frequency to the Nyquist value of π radians per sampling interval. The linear stochastic model that is commonly regarded as the counterpart in continuous time of the autoregressive moving-average model is driven by a forcing function that consists of the increments of a Wiener process. This function is unbounded in frequency. The disparity in the frequency contents of the two forcing functions creates difficulties in defining a correspondence between the discrete-time and continuous-time models. These difficulties are alleviated when the continuous-time forcing function is limited in frequency by the Nyquist value. Then, there is an immediate one-to-one correspondence between the discrete-time autoregressive moving-average model and its continuous-time counterpart, the parameters of which can be readily inferred from those of the discrete-time model. These parameters can also serve as the starting point of an algorithm that seeks the parameters of the continuous-time model that is driven by a forcing function of unbounded frequencies. 
URL:  http://d.repec.org/n?u=RePEc:lec:leecon:19/10&r=all 
By:  Bharat Chandar; Ali Hortacsu; John List; Ian Muir; Jeffrey Wooldridge 
Abstract:  Field experiments conducted with the village, city, state, region, or even country as the unit of randomization are becoming commonplace in the social sciences. While convenient, subsequent data analysis may be complicated by the constraint on the number of clusters in treatment and control. Through a battery of Monte Carlo simulations, we examine best practices for estimating unit-level treatment effects in cluster-randomized field experiments, particularly in settings that generate short panel data. In most settings we consider, unit-level estimation with unit fixed effects and cluster-level estimation weighted by the number of units per cluster tend to be robust to potentially problematic features in the data while giving greater statistical power. Using insights from our analysis, we evaluate the effect of a unique field experiment: a nationwide tipping field experiment across markets on the Uber app. Beyond the import of showing how tipping affects aggregate outcomes, we provide several insights on generating and analyzing cluster-randomized experimental data when there are constraints on the number of experimental units in treatment and control. 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:feb:natura:00681&r=all 
By:  D.S.G. Pollock 
Abstract:  The relationship between autoregressive moving-average (ARMA) models in discrete time and the corresponding models in continuous time is examined in this paper. The linear stochastic models that are commonly regarded as the counterparts of the ARMA models are driven by a forcing function that consists of the increments of a Wiener process. This function is unbounded in frequency. In cases where the periodogram of the data indicates that there is a clear upper bound to its frequency content, we propose an alternative frequency-limited white-noise forcing function. Then, there is a straightforward translation from the ARMA model to a differential equation, which is based on the principle of impulse invariance. Whenever there is no perceptible limit to the frequency content, the translation must be based on a principle of autocovariance equivalence. On the website of the author, there is a computer program that effects both of these discrete-to-continuous translations. 
URL:  http://d.repec.org/n?u=RePEc:lec:leecon:19/07&r=all 
By:  Evan Munro; Serena Ng 
Abstract:  Data from surveys are increasingly available as the internet provides a new medium for conducting them. A typical survey consists of multiple questions, each with a menu of responses that are often categorical and qualitative in nature, and respondents are heterogeneous in both observed and unobserved ways. Existing methods that construct summary indices often ignore discreteness and do not adequately capture heterogeneity among individuals. We capture these features in a set of low-dimensional latent variables using a Bayesian hierarchical latent class model that is adapted from probabilistic topic modeling of text data. An algorithm based on stochastic optimization is proposed to estimate the model for repeated surveys when conjugate priors are no longer available. Guidance on selecting the number of classes is also provided. The methodology is used in three applications: one shows how wealth indices can be constructed for developing countries where continuous data tend to be unreliable; another shows that there is information in Michigan survey responses beyond the officially published consumer sentiment index; and a third, on returns to education, shows how indices constructed from survey responses can be used to control for unobserved heterogeneity in individuals when good instruments are not available. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.04883&r=all 
By:  Kilian, Lutz 
Abstract:  Baumeister and Hamilton (2019a) assert that every critique of their work on oil markets by Kilian and Zhou (2019a) is without merit. In addition, they make the case that key aspects of the economic and econometric analysis in the widely used oil market model of Kilian and Murphy (2014) and its precursors are incorrect. Their critiques are also directed at other researchers who have worked in this area and, more generally, extend to research using structural VAR models outside of energy economics. The purpose of this paper is to help the reader understand what the real issues are in this debate. The focus is not only on correcting important misunderstandings in the recent literature, but also on the substantive and methodological insights generated by this exchange, which are of broader interest to applied researchers. 
Keywords:  Bayesian inference; global real activity; IV estimation; Oil demand elasticity; oil price; oil supply elasticity; structural VAR 
JEL:  C36 C52 Q41 Q43 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:cpr:ceprdp:14047&r=all 
By:  Aureo de Paula 
Abstract:  This article provides a selective review of the recent literature on econometric models of network formation. The survey starts with a brief exposition of basic concepts and tools for the statistical description of networks. I then offer a review of dyadic models, focusing on statistical models of pairs of nodes, and describe several developments of interest to the econometrics literature. The article also presents a discussion of non-dyadic models, where link formation might be influenced by the presence or absence of additional links, which are themselves subject to similar influences. This is related to the statistical literature on conditionally specified models and the econometrics of game-theoretic models. I close with a (nonexhaustive) discussion of potential areas for further development. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.07781&r=all 
By:  Jangho Yang; Torsten Heinrich; Julian Winkler; Fran\c{c}ois Lafond; Pantelis Koutroumpis; J. Doyne Farmer 
Abstract:  Productivity levels and growth are extremely heterogeneous among firms. A vast literature has developed to explain the origins of productivity shocks, their dispersion, their evolution, and their relationship to the business cycle. We examine in detail the distributions of labor productivity levels and growth, and observe that they exhibit heavy tails. We propose to model these distributions using the four-parameter Lévy stable distribution, a natural candidate deriving from the generalised central limit theorem. We show that it is a better fit than several standard alternatives and is remarkably consistent over time, countries, and sectors. In all samples considered, the tail parameter is such that the theoretical variance of the distribution is infinite, so that the sample standard deviation increases with sample size. We find a consistent positive skewness, a markedly different behaviour between the left and right tails, and a positive relationship between productivity and size. The distributional approach also allows us to test different measures of dispersion, and we find that productivity dispersion has slightly decreased over the past decade. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.05219&r=all 
By:  Aureo de Paula; Imran Rasul; Pedro Souza 
Abstract:  Social interactions determine many economic behaviors, but information on social ties does not exist in most publicly available and widely used datasets. We present results on the identification of social networks from observational panel data that contain no information on social ties between agents. In the context of a canonical social interactions model, we provide sufficient conditions under which the social interactions matrix and the endogenous and exogenous social effect parameters are all globally identified. While this result is relevant across different estimation strategies, we then describe how high-dimensional estimation techniques can be used to estimate the interactions model based on the Adaptive Elastic Net GMM method. We employ the method to study tax competition across US states. We find that the identified social interactions matrix implies tax competition that differs markedly from the common assumption of competition between geographically neighboring states, providing further insights for the long-standing debate on the relative roles of factor mobility and yardstick competition in driving tax-setting behavior across states. Most broadly, our identification result and application show that the analysis of social interactions can be extended to economic realms where no network data exist. 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1910.07452&r=all 
By:  Mirela Miescu 
Abstract:  The paper investigates the effects of uncertainty shocks in emerging economies (EMEs). We construct a global uncertainty indicator as well as country uncertainty measures for fifteen relatively small emerging economies. We adopt an instrumental variable approach to identify exogenous uncertainty shocks in the EMEs. To deal with the data limitations specific to emerging countries, we develop a new Bayesian algorithm to estimate a proxy panel structural vector autoregressive (SVAR) model. We find that uncertainty shocks in EMEs cause severe falls in GDP and stock price indexes, generate inflation, depreciate the currency and are not followed by a subsequent overshoot in activity. Estimation implies considerable heterogeneity across economies in the response to uncertainty shocks which can be (in part) explained by country characteristics. 
Keywords:  Uncertainty shocks, proxy SVAR, Emerging economies, Panel data 
JEL:  C3 C11 E3 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:lan:wpaper:277077821&r=all 
By:  Florian Eckert; Rob J Hyndman; Anastasios Panagiotelis 
Abstract:  This paper conducts an extensive forecasting study on 13,118 time series measuring Swiss goods exports, grouped hierarchically by export destination and product category. We apply existing state-of-the-art methods in forecast reconciliation and introduce a novel Bayesian reconciliation framework. This approach allows for explicit estimation of reconciliation biases, leading to several innovations: prior judgment can be used to assign weights to specific forecasts, and the occurrence of negative reconciled forecasts can be ruled out. Overall, we find strong evidence that, in addition to producing coherent forecasts, reconciliation also leads to improvements in forecast accuracy. 
Keywords:  hierarchical forecasting, Bayesian forecast reconciliation, Swiss exports, optimal forecast combination 
JEL:  C32 C53 E17 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201914&r=all 
By:  Smith, Gary (Pomona College) 
Abstract:  Data mining is often used to discover patterns in Big Data. It is tempting to believe that because an unearthed pattern is unusual it must be meaningful, but patterns are inevitable in Big Data and usually meaningless. The paradox of Big Data is that data mining is most seductive when there are a large number of variables, but a large number of variables exacerbates the perils of data mining. 
Keywords:  data mining, big data, machine learning 
Date:  2019–01–01 
URL:  http://d.repec.org/n?u=RePEc:clm:pomwps:1003&r=all 
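The inevitability of chance patterns is easy to demonstrate numerically: correlate a pure-noise target with ever larger sets of unrelated noise variables and watch the best in-sample correlation climb (the sample sizes and seed here are arbitrary illustrations).

```python
import numpy as np

# All variables are independent noise, so every correlation with y is
# spurious; yet the best one found grows with the number of candidates.
rng = np.random.default_rng(3)
n = 50                                  # observations
y = rng.standard_normal(n)              # "outcome" to be explained

results = {}
for k in (10, 1000):                    # number of candidate variables
    X = rng.standard_normal((k, n))
    results[k] = max(abs(np.corrcoef(x, y)[0, 1]) for x in X)
    print(k, "variables -> best |correlation|:", round(results[k], 2))
```

With 1000 candidates the best spurious correlation is typically sizable, which is exactly the seduction and the peril the abstract describes.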
By:  Sevvandi Kandanaarachchi; Rob J Hyndman 
Abstract:  This paper introduces DOBIN, a new approach to select a set of basis vectors tailored for outlier detection. DOBIN has a solid mathematical foundation and can be used as a dimension reduction tool for outlier detection tasks. We demonstrate the effectiveness of DOBIN on an extensive data repository, by comparing the performance of outlier detection methods using DOBIN and other bases. We further illustrate the utility of DOBIN as an outlier visualization tool. The R package dobin implements this basis construction. 
Keywords:  outlier detection, dimension reduction, outlier visualization, basis vectors 
JEL:  C14 C38 C88 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:201917&r=all 