
on Econometrics 
By:  Hong, Shengjie (School of Economics and Management, Tsinghua University); Su, Liangjun (School of Economics, Singapore Management University); Wang, Yaqi (School of Finance, Central University of Finance and Economics) 
Abstract:  This paper develops methods for statistical inference in a partially identified nonparametric panel data model with endogeneity and interactive fixed effects. We consider the case where the number of cross-sectional units (N) is large and the number of time series periods (T), as well as the number of unobserved common factors (R), are fixed. Under some normalization rules, we can concentrate out the large-dimensional parameter vector of factor loadings and specify a set of conditional moment restrictions that involve only the finite-dimensional factor parameters along with the infinite-dimensional nonparametric component. For a conjectured restriction on the parameter, we consider testing the null hypothesis that the restriction is satisfied by at least one element in the identified set and propose a test statistic based on a novel martingale difference divergence (MDD) measure for the distance between a conditional expectation object and zero. We derive the limiting distribution of the resultant test statistic under the null and show that it diverges at rate N under the global alternative, based on U-process theory. To obtain the critical values for our test, we propose a version of the multiplier bootstrap and establish its asymptotic validity. Simulations demonstrate the finite sample properties of our inference procedure. We apply our method to study Engel curves for major nondurable expenditures in China by using a panel dataset from the China Family Panel Studies (CFPS). 
Keywords:  Endogeneity; Gaussian chaos process; martingale difference divergence; multiplier bootstrap; nonparametric IV; partial identification; U-processes 
JEL:  C12 C14 C23 C26 
Date:  2019–03–27 
URL:  http://d.repec.org/n?u=RePEc:ris:smuesw:2019_014&r=all 
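A sample version of an MDD-type statistic is easy to compute for scalar variables; the following is a generic illustration of the V-statistic form, not the paper's exact statistic (the function name and scalar-x restriction are assumptions for the sketch):

```python
import numpy as np

def sample_mdd_sq(y, x):
    """Sample (squared) martingale difference divergence of y given x.

    MDD measures the distance between E[y | x] and E[y]: it is zero in
    population iff y is conditionally mean independent of x.  Scalar x
    for simplicity; the V-statistic form is
        -1/n^2 * sum_{i,j} (y_i - ybar)(y_j - ybar) |x_i - x_j|.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    yc = y - y.mean()                       # centre y
    d = np.abs(x[:, None] - x[None, :])     # pairwise distances |x_i - x_j|
    return -np.mean(np.outer(yc, yc) * d)
```

A constant y yields exactly zero, while strong mean dependence (e.g. y = x) yields a positive value.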
By:  Frédérique Bec (THEMA - Théorie économique, modélisation et applications - UCP - Université de Cergy-Pontoise - Université Paris-Seine - CNRS - Centre National de la Recherche Scientifique); Heino Bohn Nielsen (Department of Economics - University of Copenhagen - KU - University of Copenhagen = Københavns Universitet); Sarra Saïdi (THEMA - Théorie économique, modélisation et applications - UCP - Université de Cergy-Pontoise - Université Paris-Seine - CNRS - Centre National de la Recherche Scientifique) 
Abstract:  This paper stresses the bimodality of the widely used Student's t likelihood function applied in modelling mixed causal-noncausal autoregressions (MAR). It first shows that a local maximum is very often found in addition to the global Maximum Likelihood Estimator (MLE), and that standard estimation algorithms may end up in this local maximum. It then shows that the issue becomes more salient as the causal root of the process approaches unity from below. The consequences are important, as the local-maximum estimated roots are typically interchanged, attributing the noncausal one to the causal component and vice versa, which severely changes the interpretation of the results. The properties of unit root tests based on this Student's t MLE of the backward root are obviously affected as well. To circumvent these issues, this paper proposes an estimation strategy which i) noticeably increases the probability of ending up in the global MLE and ii) retains the maximum relevant for the unit root test against a MAR stationary alternative. An application to Brent crude oil prices illustrates the relevance of the proposed approach. 
Keywords:  Mixed autoregression, noncausal autoregression, maximum likelihood estimation, unit root test, Brent crude oil price 
Date:  2019–07–06 
URL:  http://d.repec.org/n?u=RePEc:hal:wpaper:hal02175760&r=all 
By:  Herwartz, Helmut; Lange, Alexander; Maxand, Simone 
Abstract:  Structural vector autoregressive analysis aims to trace the contemporaneous linkages among (macroeconomic) variables back to underlying orthogonal structural shocks. In homoskedastic Gaussian models the identification of these linkages requires external and typically not data-based information. Statistical data characteristics (e.g., heteroskedasticity or non-Gaussian independent components) allow for unique identification. Studying distinct covariance changes and distributional frameworks, we compare alternative data-driven identification procedures and identification by means of sign restrictions. The application of sign restrictions results in estimation biases as a reflection of censored sampling from a space of covariance decompositions. Statistical identification schemes are robust under distinct data structures to some extent. The detection of independent components appears most flexible unless the underlying shocks are (close to) Gaussian. For analyzing the linkages among the US business cycle and distinct sources of uncertainty, we draw on simulation-based evidence to point at the two most suitable identification schemes. We detect a unidirectional effect of financial uncertainty on real economic activity and mutual causality between macroeconomic uncertainty and business cycles. 
Keywords:  independent components, heteroskedasticity, model selection, non-Gaussianity, structural shocks 
JEL:  C32 E00 E32 E44 G01 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:zbw:cegedp:375&r=all 
By:  Escribano, Álvaro; Blazsek, Szabolcs; Ayala, Astrid 
Abstract:  Dynamic conditional score (DCS) models with time-varying shape parameters provide a flexible method for volatility measurement. The new models are estimated by using the maximum likelihood (ML) method, conditions for consistency and asymptotic normality of ML are presented, and Monte Carlo simulation experiments are used to study the precision of ML. Daily data from the Standard & Poor's 500 (S&P 500) for the period 1950 to 2017 are used. The performances of DCS models with constant and dynamic shape parameters are compared. In-sample statistical performance metrics and out-of-sample value-at-risk backtesting support the use of DCS models with dynamic shape. 
Keywords:  Outliers; Value-at-Risk; Score-Driven Shape Parameters; Dynamic Conditional Score Models 
JEL:  C58 C52 C22 
Date:  2019–07–19 
URL:  http://d.repec.org/n?u=RePEc:cte:werepe:28638&r=all 
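The score-driven idea behind DCS models can be illustrated with a basic volatility filter with Student-t errors. This is a generic Beta-t-EGARCH-style recursion for the log-scale, shown only to fix ideas; the paper's models additionally make shape parameters such as the degrees of freedom time-varying, which this sketch omits:

```python
import numpy as np

def beta_t_egarch(y, omega, phi, kappa, nu):
    """Score-driven (DCS-type) log-volatility filter with Student-t errors.

    lam[t] is the log-scale; the update uses the conditional score of the
    t-density, which is a bounded, martingale-difference innovation.
    """
    lam = np.empty(len(y))
    lam[0] = omega / (1.0 - phi)            # start at the unconditional mean
    for t in range(len(y) - 1):
        eps2 = y[t] ** 2 * np.exp(-2.0 * lam[t])
        u = (nu + 1.0) * eps2 / (nu + eps2) - 1.0   # scaled score in [-1, nu]
        lam[t + 1] = omega + phi * lam[t] + kappa * u
    return lam
```

Because the score is bounded, a single outlier moves the filtered volatility far less than in a squared-return (GARCH) update, which is one motivation for DCS models.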
By:  Huanjun Zhu; Vasilis Sarafidis; Mervyn Silvapulle 
Abstract:  This paper develops new tests against a structural break in panel data models with common factors when T is fixed, where T denotes the number of observations over time. For this class of models, the available tests against a structural break are valid only under the assumption that T is ‘large’. However, this may be a stringent requirement; more commonly so in datasets with annual time frequency, in which case the sample may cover a relatively long period even if T is not large. The proposed approach builds upon existing GMM methodology and develops Distance-type and LM-type tests for detecting a structural break, both when the breakpoint is known and when it is unknown. The proposed methodology permits weak exogeneity and/or endogeneity of the regressors. In a simulation study, the method performed well, both in terms of size and power, and in terms of successfully locating the time of the structural break. The method is illustrated by testing the so-called ‘Gibrat's Law’, using a dataset from 4,128 financial institutions, each one observed for the period 2002-2014. 
Keywords:  Method of moments, unobserved heterogeneity, breakpoint detection, fixed T asymptotics. 
Date:  2019–07–09 
URL:  http://d.repec.org/n?u=RePEc:wyi:wpaper:002481&r=all 
By:  Stefano Tonellato (Department of Economics, University Of Venice Cà Foscari) 
Abstract:  It is well known that a wide class of Bayesian nonparametric priors leads to the representation of the distribution of the observable variables as a mixture density with an infinite number of components, and that such a representation induces a clustering structure in the observations. However, cluster identification is not straightforward a posteriori, and some post-processing is usually required. In order to circumvent label switching, pairwise posterior similarity has been introduced and used either to apply classical clustering algorithms or to estimate the underlying partition by minimising a suitable loss function. This paper proposes to map observations onto a weighted undirected graph, where each node represents a sample item and edge weights are given by the posterior pairwise similarities. It will be shown how, after building a particular random walk on such a graph, it is possible to apply a community detection algorithm, known as the map equation method, by optimising the description length of the partition. A relevant feature of this method is that it allows for both the quantification of the posterior uncertainty of the classification and the selection of variables to be used for classification purposes. 
Keywords:  Dirichlet process priors, mixture models, community detection, entropy, variable selection 
JEL:  C11 C38 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:ven:wpaper:2019:20&r=all 
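The posterior pairwise similarity matrix that supplies the edge weights can be computed directly from MCMC draws of cluster labels; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def posterior_similarity(labels):
    """Posterior pairwise similarity from MCMC draws of cluster labels.

    labels: (n_draws, n_items) array; entry [s, i] is the cluster of
    item i in draw s.  Returns the n x n matrix whose (i, j) entry is
    the fraction of draws in which items i and j share a cluster.
    """
    draws = np.asarray(labels)
    n_draws, n_items = draws.shape
    sim = np.zeros((n_items, n_items))
    for s in range(n_draws):
        sim += (draws[s][:, None] == draws[s][None, :])  # co-clustering indicator
    return sim / n_draws
```

This matrix is invariant to label switching, which is precisely why it is the preferred post-processing object.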
By:  Valentina Corradi; Daniel Gutknecht 
Abstract:  This paper provides a unified approach for detecting sample selection in nonparametric conditional mean and quantile functions. In fact, as sample selection leads to a loss of point identification in the nonparametric quantile case, our tests are of particular relevance when interest lies in the conditional distribution. Our testing strategy consists of a two-step procedure: the first test is an omitted predictor test, where the omitted variable is the propensity score. This test has power against generic $\sqrt{n}$-alternatives, and failure to reject the null implies no selection. By contrast, as with any omnibus test, we cannot distinguish between a rejection due to genuine selection and one due to generic misspecification, when the omitted variable is correlated with the propensity score. Under the maintained assumption of no selection, our second test is therefore designed to detect misspecification. This is achieved by a localized version of the first test, using only individuals with propensity score close to one. Although the second step requires `identification at infinity', we can allow for cases of irregular identification. Finally, our testing procedure does not require any parametric assumptions on either the outcome or the selection equation(s), and all our results in the conditional quantile case hold uniformly across quantile ranks in a compact set. We apply our procedure to test for selection in log hourly wages of females and males in the UK using the UK Family Expenditure Survey. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.07412&r=all 
By:  Michael Cai (Northwestern University); Marco Del Negro (FRB New York); Edward Herbst (Federal Reserve Board); Ethan Matlin (FRB New York); Reca Sarfati (FRB New York); Frank Schorfheide (Department of Economics, University of Pennsylvania) 
Abstract:  This paper illustrates the usefulness of sequential Monte Carlo (SMC) methods in approximating DSGE model posterior distributions. We show how the tempering schedule can be chosen adaptively, explore the benefits of an SMC variant we call generalized tempering for "online" estimation, and provide examples of multimodal posteriors that are well captured by SMC methods. We then use the online estimation of the DSGE model to compute pseudo-out-of-sample density forecasts of DSGE models with and without financial frictions and document the benefits of conditioning DSGE model forecasts on nowcasts of macroeconomic variables and interest rate expectations. We also study whether the predictive ability of DSGE models changes when we use priors that are substantially looser than those that are commonly adopted in the literature. 
Keywords:  Adaptive algorithms, Bayesian inference, density forecasts, online estimation, sequential Monte Carlo methods 
JEL:  C11 C32 C53 E32 E37 E52 
Date:  2019–07–22 
URL:  http://d.repec.org/n?u=RePEc:pen:papers:19014&r=all 
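The adaptive choice of the tempering schedule can be sketched as a bisection on the effective sample size (ESS) of the incremental particle weights. This is a generic SMC device under assumed conventions, not the paper's exact algorithm:

```python
import numpy as np

def next_phi(loglik, w, phi_old, target_frac=0.5, tol=1e-8):
    """Pick the next tempering exponent phi in (phi_old, 1] so that the
    ESS of the reweighted particles is roughly target_frac * N.

    loglik: per-particle log-likelihoods; w: current (positive) weights.
    """
    n = len(w)

    def ess(phi):
        logw = np.log(w) + (phi - phi_old) * loglik
        logw -= logw.max()                 # stabilise before exponentiating
        v = np.exp(logw)
        v /= v.sum()
        return 1.0 / np.sum(v ** 2)

    if ess(1.0) >= target_frac * n:        # can jump straight to phi = 1
        return 1.0
    lo, hi = phi_old, 1.0                  # ESS decreases as phi grows
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess(mid) >= target_frac * n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When the likelihood is flat across particles the ESS never degrades and the schedule terminates immediately at phi = 1.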
By:  Victor H. Aguiar; Nail Kashaev 
Abstract:  We propose a framework for doing sharp nonparametric welfare analysis in discrete choice models with unobserved variation in choice sets. We recover jointly the distribution of choice sets and the distribution of preferences. To achieve this we use panel data on choices and assume nestedness of the latent choice sets. Nestedness means that choice sets of different decision makers are ordered by inclusion. It may be satisfied, for instance, when it is the result of either a search process or unobserved feasibility. Using variation of the uncovered choice sets we show how to do ordinal (nonparametric) welfare comparisons. When one is willing to make additional assumptions about preferences, we show how to nonparametrically identify the ranking over average utilities in the standard multinomial choice setting. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.04853&r=all 
By:  Angela Bitto-Nemling; Annalisa Cadonna; Sylvia Frühwirth-Schnatter; Peter Knaus 
Abstract:  Time-varying parameter (TVP) models are widely used in time series analysis to flexibly deal with processes which gradually change over time. However, the risk of overfitting in TVP models is well known. This issue can be dealt with using appropriate global-local shrinkage priors, which pull time-varying parameters towards static ones. In this paper, we introduce the R package shrinkTVP (Knaus, Bitto-Nemling, Cadonna, and Frühwirth-Schnatter 2019), which provides a fully Bayesian implementation of shrinkage priors for TVP models, taking advantage of recent developments in the literature, in particular that of Bitto and Frühwirth-Schnatter (2019). The package shrinkTVP allows for posterior simulation of the parameters through an efficient Markov chain Monte Carlo (MCMC) scheme. Moreover, summary and visualization methods, as well as the possibility of assessing predictive performance through log predictive density scores (LPDSs), are provided. The computationally intensive tasks have been implemented in C++ and interfaced with R. The paper includes a brief overview of the models and shrinkage priors implemented in the package. Furthermore, core functionalities are illustrated, both with simulated and real data. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.07065&r=all 
By:  Marcel Bräutigam (LabEx MME-DII - UCP - Université de Cergy-Pontoise - Université Paris-Seine; ESSEC Business School; LPSM UMR 8001 - Laboratoire de Probabilités, Statistique et Modélisation - UPD7 - Université Paris Diderot - Paris 7 - SU - Sorbonne Université - CNRS - Centre National de la Recherche Scientifique); Marie Kratz (SID - Information Systems, Decision Sciences and Statistics Department - ESSEC Business School; LabEx MME-DII - UCP - Université de Cergy-Pontoise - Université Paris-Seine) 
Abstract:  In this note, we build upon the asymptotic theory for GARCH processes, considering the general class of augmented GARCH(p, q) processes. Our contribution is to complement the well-known univariate asymptotics by providing a bivariate functional central limit theorem between the sample quantile and the r-th absolute centred sample moment. This extends existing results for the case of independent and identically distributed random variables. We show that the conditions for the convergence of the estimators in the univariate case suffice even for the joint bivariate asymptotics. We illustrate the general results with various specific examples from the class of augmented GARCH(p, q) processes and show explicitly under which conditions on the moments and parameters of the process the joint asymptotics hold. 
Keywords:  asymptotic distribution, (sample) variance, functional central limit theorem, (augmented) GARCH, correlation, (sample) quantile, measure of dispersion, (sample) mean absolute deviation 
Date:  2019–06–29 
URL:  http://d.repec.org/n?u=RePEc:hal:wpaper:hal02176276&r=all 
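The two statistics whose joint limit the note establishes are straightforward to compute from a sample; a plain sketch (the function name is illustrative):

```python
import numpy as np

def quantile_and_abs_moment(x, p, r):
    """Return the sample p-quantile and the r-th absolute centred
    sample moment, i.e. mean(|x_i - xbar|^r)."""
    x = np.asarray(x, dtype=float)
    q = np.quantile(x, p)
    m = np.mean(np.abs(x - x.mean()) ** r)
    return q, m
```

For r = 1 the second statistic is the sample mean absolute deviation, and for r = 2 it is the (uncorrected) sample variance, two of the dispersion measures listed in the keywords.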
By:  Ke, Miao (School of Economics, Singapore Management University); Su, Liangjun (School of Economics, Singapore Management University); Wang, Wendun (Econometric Institute, Erasmus University Rotterdam and Tinbergen Institute) 
Abstract:  In this paper, we consider the least squares estimation of a panel structure threshold regression (PSTR) model where both the slope coefficients and threshold parameters may exhibit latent group structures. We study the asymptotic properties of the estimators of the latent group structure and the slope and threshold coefficients. We show that we can estimate the latent group structure correctly with probability approaching 1 and that the estimators of the slope and threshold coefficients are asymptotically equivalent to the infeasible estimators that are obtained as if the true group structures were known. We study likelihood-ratio-based inference on the group-specific threshold parameters under the shrinking-threshold-effect framework. We also propose two specification tests: one tests whether the threshold parameters are homogeneous across groups, and the other tests whether the threshold effects are present. When the number of latent groups is unknown, we propose a BIC-type information criterion to determine the number of groups in the data. Simulations demonstrate that our estimators and tests perform reasonably well in finite samples. We apply our model to revisit the relationship between capital market imperfection and the investment behavior of firms and to examine the impact of bank deregulation on income inequality. We document a large degree of heterogeneous effects in both applications that cannot be captured by conventional panel threshold regressions. 
Keywords:  Classification; Dynamic panel; Latent group structures; Panel structure model; Panel threshold regression 
JEL:  C23 C24 C33 
Date:  2019–07–11 
URL:  http://d.repec.org/n?u=RePEc:ris:smuesw:2019_013&r=all 
By:  Joshua C. C. Chan 
Abstract:  Large Bayesian VARs are now widely used in empirical macroeconomics. One popular shrinkage prior in this setting is the natural conjugate prior, as it facilitates posterior simulation and leads to a range of useful analytical results. This is, however, at the expense of modelling flexibility, as it rules out cross-variable shrinkage, i.e. shrinking coefficients on lags of other variables more aggressively than those on own lags. We develop a prior that has the best of both worlds: it can accommodate cross-variable shrinkage while maintaining many useful analytical results, such as a closed-form expression for the marginal likelihood. This new prior also leads to fast posterior simulation: for a BVAR with 100 variables and 4 lags, obtaining 10,000 posterior draws takes less than half a minute on a standard desktop. In a forecasting exercise, we show that a data-driven asymmetric prior outperforms two useful benchmarks: a data-driven symmetric prior and a subjective asymmetric prior. 
Keywords:  shrinkage prior, forecasting, marginal likelihood, optimal hyperparameters, structural VAR 
JEL:  C11 C52 E37 E47 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:een:camaaa:201951&r=all 
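Cross-variable shrinkage of the Minnesota type can be written down explicitly. The sketch below follows the generic Minnesota convention with separate own-lag and cross-lag tightness; the parameter names kappa_own and kappa_cross are hypothetical labels, and this is not necessarily the paper's exact parameterization:

```python
import numpy as np

def minnesota_prior_var(sigma, p, kappa_own, kappa_cross):
    """Minnesota-style prior variances for VAR coefficients.

    sigma: (n,) residual standard deviations; p: number of lags.
    Returns V with V[l-1, i, j] = prior variance of the coefficient on
    lag l of variable j in equation i; cross-lag coefficients (i != j)
    get their own tightness, shrinking them more aggressively.
    """
    n = len(sigma)
    V = np.empty((p, n, n))
    for l in range(1, p + 1):
        for i in range(n):
            for j in range(n):
                k = kappa_own if i == j else kappa_cross
                V[l - 1, i, j] = k / l ** 2 * (sigma[i] ** 2 / sigma[j] ** 2)
    return V
```

Setting kappa_cross = kappa_own recovers the symmetric case compatible with the natural conjugate prior; kappa_cross < kappa_own gives the asymmetric shrinkage discussed above.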
By:  Nguyen, T.H.A; Thomas-Agnan, Christine; Laurent, Thibault; Ruiz-Gazen, Anne 
Abstract:  In an election, the vote shares by party on a given subdivision of a territory form a vector with positive components adding up to 1, called a composition. Using a conventional multiple linear regression model to explain this vector by some factors is not suitable for at least two reasons. The first is the existence of the constraint on the sum of the components, and the second is the assumption of statistical independence across territorial units, which may be questionable due to potential spatial autocorrelation. We develop a simultaneous spatial autoregressive model for compositional data which allows for both spatial correlation and correlations across equations. We propose an estimation method based on two-stage and three-stage least squares. We illustrate the method with simulations and with a data set from the 2015 French departmental election. 
Keywords:  multivariate spatial autocorrelation; spatial weight matrix; three-stage least squares; two-stage least squares; simplex; electoral data 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:tse:wpaper:123213&r=all 
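Regression with compositions typically starts from a log-ratio transform that removes the sum constraint. An additive log-ratio (ALR) sketch is shown below purely to illustrate the constraint; it is a standard device in compositional data analysis, not necessarily the transformation used in the paper's model:

```python
import numpy as np

def alr(shares, ref=-1):
    """Additive log-ratio transform: maps a composition (positive
    shares summing to 1) to an unconstrained real vector by taking
    log-ratios against a reference component (the last, by default)."""
    s = np.asarray(shares, dtype=float)
    s = s / s.sum()                       # enforce the simplex constraint
    return np.log(np.delete(s, ref) / s[ref])
```

A D-part composition maps to a (D-1)-dimensional unconstrained vector, on which multivariate (spatial) regression models can then be specified.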
By:  Gregory Cox; Xiaoxia Shi 
Abstract:  We propose a new test for inequalities that is simple and uniformly valid. The test compares the likelihood ratio statistic to a chi-squared critical value, where the degrees of freedom equal the rank of the active inequalities. This test requires no tuning parameters or simulations and is therefore computationally fast, even with many inequalities. Further, it does not require an estimate of the number of binding or close-to-binding inequalities. To show that this test is uniformly valid, we establish a new bound on the probability of translations of cones under the multivariate normal distribution that may be of independent interest. The leading application of our test is inference in moment inequality models. We also consider testing affine inequalities in the multivariate normal model and testing nonlinear inequalities in general asymptotically normal models. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.06317&r=all 
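In the special case of an identity covariance, the comparison of the LR statistic to a chi-squared critical value with data-dependent degrees of freedom can be sketched as follows. This is an illustrative special case under assumed conventions, not the authors' general procedure:

```python
import numpy as np
from scipy.stats import chi2

def lr_inequality_test(x, alpha=0.05):
    """Test H0: mu >= 0 (componentwise) from one draw x ~ N(mu, I).

    The LR statistic is the squared distance from x to the nonnegative
    orthant, and the degrees of freedom are the number of active
    (binding) inequalities at the projection.  Returns (lr, df, reject).
    """
    x = np.asarray(x, dtype=float)
    proj = np.maximum(x, 0.0)          # projection onto the cone {mu >= 0}
    lr = np.sum((x - proj) ** 2)
    df = int(np.sum(proj == 0.0))      # inequalities binding at the projection
    if df == 0:
        return lr, df, False           # interior point: LR = 0, never reject
    return lr, df, lr > chi2.ppf(1 - alpha, df)
```

The attraction highlighted in the abstract is visible here: no simulation, no tuning parameter, just a projection and a chi-squared quantile.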
By:  Andrew J. Patton; Brian M. Weller 
Abstract:  Clustering methods such as k-means have found widespread use in a variety of applications. This paper proposes a formal testing procedure to determine whether a null hypothesis of a single cluster, indicating homogeneity of the data, can be rejected in favor of multiple clusters. The test is simple to implement, valid under relatively mild conditions (including non-normality and heterogeneity of the data in aspects beyond those in the clustering analysis), and applicable in a range of contexts (including clustering when the time series dimension is small, or clustering on parameters other than the mean). We verify that the test has good size control in finite samples, and we illustrate the test in applications to clustering vehicle manufacturers and U.S. mutual funds. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.07582&r=all 
By:  Lee, Dae-Jin; Durbán Reguera, María Luz; Carballo González, Alba 
Abstract:  Prediction of out-of-sample values is a problem of interest in any regression model. In the context of penalized smooth mixed-model regression, Carballo et al. (2017) have proposed a general framework for prediction in additive models without interaction terms. The aim of this paper is to extend this work, based on the methodology proposed in Currie et al. (2004), to models that include interaction terms, i.e. prediction is needed in a multidimensional setting. Our approach fits the data and predicts the new observations simultaneously, and uses constraints to ensure a coherent fit or to impose further restrictions on the predictions. We also develop this methodology for the so-called smooth-ANOVA models, which allow us to include interaction terms that can be decomposed as a sum of several smooth functions. To illustrate the methodology, two real data sets are used: one to predict log mortality rates in the Spanish population, and another to predict above-ground biomass in Populus trees as a smooth function of height and diameter. We examine the performance of the interaction models in comparison to the smooth-ANOVA models (both with and without the restriction that the fit has to be maintained) through a simulation study. 
Keywords:  Mixed Models; P-Splines; Penalized Regression; Prediction 
Date:  2019–07–19 
URL:  http://d.repec.org/n?u=RePEc:cte:wsrepe:28630&r=all 
By:  Alexis Bogroff (University Paris 1 Panthéon-Sorbonne); Dominique Guégan (University Paris 1 Panthéon-Sorbonne; LabEx ReFi, France; University Ca’ Foscari Venice) 
Abstract:  An extensive list of risks relative to big data frameworks and their use through models of artificial intelligence is provided, along with measurements and implementable solutions. Bias, interpretability and ethics are studied in depth, with several interpretations from the points of view of developers, companies and regulators. Our reflections suggest that fragmented frameworks increase the risks of model misspecification, opacity and bias in the results. Domain experts and statisticians need to be involved in the whole process, as the business objective must drive each decision from the data extraction step to the final actionable prediction. We propose a holistic and original approach to take into account the risks encountered all along the implementation of systems using artificial intelligence, from the choice of the data and the selection of the algorithm to the decision making. 
Keywords:  Artificial Intelligence, Bias, Big Data, Ethics, Governance, Interpretability, Regulation, Risk 
JEL:  C4 C5 C6 C8 D8 G28 G38 K2 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:ven:wpaper:2019:19&r=all 
By:  Bucci, Andrea 
Abstract:  Accurately forecasting multivariate volatility plays a crucial role for the financial industry. The Cholesky-Artificial Neural Networks specification presented here provides a twofold advantage for this task. On the one hand, the use of the Cholesky decomposition ensures positive definite forecasts. On the other hand, the implementation of artificial neural networks makes it possible to specify nonlinear relations without any particular distributional assumption. Out-of-sample comparisons reveal that artificial neural networks are not able to strongly outperform the competing models. However, long-memory detecting networks, such as the nonlinear autoregressive model with exogenous input (NARX) and long short-term memory (LSTM), show improved forecast accuracy with respect to existing econometric models. 
Keywords:  Neural Networks; Machine Learning; Stock market volatility; Realized Volatility 
JEL:  C22 C45 C53 G17 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:95137&r=all 
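The role of the Cholesky decomposition in guaranteeing positive definite forecasts can be sketched as a round trip: vectorize the Cholesky factor, let the model forecast those elements freely, then reconstruct. Function names here are illustrative:

```python
import numpy as np

def chol_vech(cov):
    """Half-vectorize the lower-triangular Cholesky factor of a
    covariance matrix; these elements can be modelled/forecast
    without any positivity constraint."""
    L = np.linalg.cholesky(cov)
    return L[np.tril_indices_from(L)]

def vech_to_cov(v, n):
    """Inverse map: rebuild L from the vector and return L @ L.T,
    which is positive semidefinite by construction for ANY v."""
    L = np.zeros((n, n))
    L[np.tril_indices(n)] = v
    return L @ L.T
```

Because L @ L.T is positive semidefinite whatever values the network outputs for v, the forecast covariance matrix is always valid, which is the first advantage claimed in the abstract.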
By:  Fredrik S\"avje 
Abstract:  The paper demonstrates that the matching estimator is not generally consistent for the average treatment effect of the treated when the matching is done without replacement using propensity scores. To achieve consistency, practitioners must either assume that no unit exists with a propensity score greater than onehalf or assume that there is no confounding among such units. Illustrations suggest that the result applies also to matching using other metrics as long as it is done without replacement. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.07288&r=all 
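The estimator under scrutiny, nearest-neighbour propensity-score matching without replacement, can be sketched with a greedy pass over the treated units. This is an illustrative implementation of the matching mechanics only; the paper's result concerns the estimator's population behaviour, not this particular code:

```python
import numpy as np

def greedy_match(ps_treated, ps_control):
    """Greedy one-to-one propensity-score matching WITHOUT replacement:
    each treated unit takes the nearest control not yet used.
    Returns a list of (treated_index, control_index) pairs."""
    ps_control = np.asarray(ps_control, dtype=float)
    used = np.zeros(len(ps_control), dtype=bool)
    pairs = []
    for i, p in enumerate(np.asarray(ps_treated, dtype=float)):
        d = np.abs(ps_control - p)
        d[used] = np.inf               # matched controls are off-limits
        j = int(np.argmin(d))
        used[j] = True
        pairs.append((i, j))
    return pairs
```

The "without replacement" step is the crux: once good controls near high propensity scores are exhausted, later treated units are forced onto poor matches, which is the mechanism behind the inconsistency result.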
By:  Riccardo Marcaccioli; Giacomo Livan 
Abstract:  Countless natural and social multivariate systems are studied through sets of simultaneous and time-spaced measurements of the observables that drive their dynamics, i.e., through sets of time series. Typically, this is done via hypothesis testing: the statistical properties of the empirical time series are tested against those expected under a suitable null hypothesis. This is a very challenging task in complex interacting systems, where statistical stability is often poor due to lack of stationarity and ergodicity. Here, we describe an unsupervised, data-driven framework to perform hypothesis testing in such situations. This consists of a statistical mechanical theory, derived from first principles, for ensembles of time series designed to preserve, on average, some of the statistical properties observed on an empirical set of time series. We showcase its possible applications on a set of stock market returns from the NYSE. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.04925&r=all 
By:  Magnus Wiese; Robert Knobloch; Ralf Korn; Peter Kretschmer 
Abstract:  Modeling financial time series by stochastic processes is a challenging task and a central area of research in financial mathematics. In this paper, we break through this barrier and present Quant GANs, a data-driven model which is inspired by the recent success of generative adversarial networks (GANs). Quant GANs consist of a generator and a discriminator function, which utilize temporal convolutional networks (TCNs) and thereby capture long-range dependencies such as the presence of volatility clusters. Furthermore, the generator function is explicitly constructed such that the induced stochastic process allows a transition to its risk-neutral distribution. Our numerical results highlight that the distributional properties for small and large lags are in excellent agreement, and that dependence properties such as volatility clusters, leverage effects, and serial autocorrelations can be generated by the generator function of Quant GANs in high fidelity. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.06673&r=all 