
on Econometrics 
By:  Yicong Lin; Hanno Reuvers 
Abstract:  This paper develops the asymptotic theory of a Fully Modified Generalized Least Squares (FMGLS) estimator for multivariate cointegrating polynomial regressions. Such regressions allow for deterministic trends, stochastic trends and integer powers of stochastic trends to enter the cointegrating relations. Our fully modified estimator incorporates: (1) the direct estimation of the inverse autocovariance matrix of the multidimensional errors, and (2) second order bias corrections. The resulting estimator has the intuitive interpretation of applying a weighted least squares objective function to filtered data series. Moreover, the required second order bias corrections are convenient byproducts of our approach and lead to standard asymptotic inference. The FMGLS framework also provides two new KPSS tests for the null of cointegration. A comprehensive simulation study shows good performance of the FMGLS estimator and the related tests. As a practical illustration, we test the Environmental Kuznets Curve (EKC) hypothesis for six early industrialized countries. The more efficient and more powerful FMGLS approach raises important questions concerning the standard model specification for EKC analysis. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.02552&r=all 
By:  Mingli Chen; Kengo Kato; Chenlei Leng 
Abstract:  Data in the form of networks are increasingly available in a variety of areas, yet statistical models allowing for parameter estimates with desirable statistical properties for sparse networks remain scarce. To address this, we propose the Sparse $\beta$-Model (S$\beta$M), a new network model that interpolates between the celebrated Erd\H{o}s-R\'enyi model and the $\beta$-model that assigns one different parameter to each node. By a novel reparameterization of the $\beta$-model to distinguish global and local parameters, our S$\beta$M can drastically reduce the dimensionality of the $\beta$-model by requiring some of the local parameters to be zero. We derive the asymptotic distribution of the maximum likelihood estimator of the S$\beta$M when the support of the parameter vector is known. When the support is unknown, we formulate a penalized likelihood approach with the $\ell_0$-penalty. Remarkably, we show via a monotonicity lemma that the seemingly combinatorial computational problem due to the $\ell_0$-penalty can be overcome by assigning nonzero parameters to those nodes with the largest degrees. We further show that a $\beta$-min condition guarantees that our method identifies the true model, and we provide excess risk bounds for the estimated parameters. The estimation procedure enjoys good finite sample properties as shown by simulation studies. The usefulness of the S$\beta$M is further illustrated via the analysis of a microfinance take-up example. 
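The monotonicity result above has a simple computational consequence: with $k$ nonzero local parameters, the support of the $\ell_0$-penalized fit is just the $k$ highest-degree nodes. A minimal numpy sketch on a hypothetical toy network (not the authors' code):

```python
import numpy as np

# Toy undirected network: hypothetical adjacency matrix for 6 nodes.
A = np.array([
    [0, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

degrees = A.sum(axis=1)

def sbm_support(degrees, k):
    """Support of the l0-penalized fit with k nonzero local parameters:
    by the monotonicity lemma, these are the k highest-degree nodes."""
    return np.sort(np.argsort(degrees)[::-1][:k])

support = sbm_support(degrees, k=2)
print(support)  # nodes 0 and 1 have the largest degrees
```

Once the support is fixed, the remaining parameters can be fit by ordinary (low-dimensional) maximum likelihood.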
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.03152&r=all 
By:  Morten Ø. Nielsen (Queen's University and CREATES); Won-Ki Seo (Department of Economics, Queen's University); Dakyung Seong (University of California, Davis) 
Abstract:  We propose a statistical testing procedure to determine the number of stochastic trends of cointegrated functional time series taking values in the Hilbert space of square-integrable functions defined on a compact interval. Our test is based on a variance ratio statistic, adapted to a possibly infinite-dimensional setting. We derive the asymptotic null distribution and prove consistency of the test. Monte Carlo simulations show good performance of our test and provide some evidence that it outperforms the existing testing procedure. We apply our methodology to three empirical examples: age-specific US employment rates, Australian temperature curves, and Ontario electricity demand. 
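The variance ratio idea can be illustrated in the simplest scalar case: partial sums of an I(1) series grow much faster than those of a stationary series, so the ratio of their scaled variances separates the two. A hedged numpy sketch of a Breitung-type scalar analogue (the paper's functional, Hilbert-space version is substantially more involved):

```python
import numpy as np

rng = np.random.default_rng(8)

def variance_ratio(y):
    """Breitung-type variance ratio: scaled partial-sum variance over the
    sample variance. Large values point to a stochastic trend (I(1))."""
    T = len(y)
    y = y - y.mean()
    S = np.cumsum(y)
    return (S @ S) / (T**2 * (y @ y))

T = 500
rw = np.cumsum(rng.normal(size=T))   # random walk: stochastic trend
stat_i1 = variance_ratio(rw)
wn = rng.normal(size=T)              # stationary series
stat_i0 = variance_ratio(wn)
print(stat_i1 > stat_i0)  # True: the I(1) series has a larger ratio
```

Under stationarity the statistic vanishes at rate $1/T$, while under a unit root it converges to a nondegenerate limit, which is what makes a test possible.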
Keywords:  cointegration, functional data, nonstationary, stochastic trends, variance ratio 
JEL:  C32 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:qed:wpaper:1420&r=all 
By:  James G. MacKinnon (Queen's University); Matthew D. Webb (Carleton University) 
Abstract:  We discuss when and how to deal with possibly clustered errors in linear regression models. Specifically, we discuss situations in which a regression model may plausibly be treated as having error terms that are arbitrarily correlated within known clusters but uncorrelated across them. The methods we discuss include various covariance matrix estimators, possibly combined with various methods of obtaining critical values, several bootstrap procedures, and randomization inference. Special attention is given to models with few treated clusters and clusters that vary in size, where inference may be problematic. Two empirical examples and a simulation experiment illustrate the methods we discuss and the concerns we raise. 
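A minimal illustration of one of the methods discussed, the wild cluster bootstrap: cluster-level Rademacher weights flip the sign of each cluster's residuals jointly, preserving within-cluster correlation in the bootstrap samples. This numpy sketch uses a simplified, unstudentized variant on simulated data; MacKinnon and Webb discuss restricted and studentized versions with better properties:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: hypothetical setup with G clusters of unequal size.
G = 8
sizes = rng.integers(5, 15, size=G)
cluster = np.repeat(np.arange(G), sizes)
n = cluster.size
x = rng.normal(size=n)
# A cluster-level error component induces within-cluster correlation.
u = rng.normal(size=G)[cluster] + rng.normal(size=n)
y = 1.0 + 0.0 * x + u            # true slope is zero

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

# Wild cluster bootstrap: flip the sign of each cluster's residuals jointly.
B = 999
boot_slopes = np.empty(B)
for b in range(B):
    s = rng.choice([-1.0, 1.0], size=G)[cluster]   # Rademacher weights
    y_b = X @ beta + s * resid
    boot_slopes[b] = np.linalg.lstsq(X, y_b, rcond=None)[0][1]

# Symmetric bootstrap p-value for H0: slope = 0.
pval = np.mean(np.abs(boot_slopes - beta[1]) >= np.abs(beta[1]))
print(round(float(pval), 3))
```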
Keywords:  clustered data, cluster-robust variance estimator, CRVE, wild cluster bootstrap, robust inference 
JEL:  C15 C21 C23 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:qed:wpaper:1421&r=all 
By:  Yanqin Fan; Fang Han; Wei Li; XiaoHua Zhou 
Abstract:  The family of rank estimators, including Han's maximum rank correlation (Han, 1987) as a notable example, has been widely exploited in studying regression problems. For these estimators, although the linear index is introduced to alleviate the impact of dimensionality, the effect of large dimension on inference is rarely studied. This paper fills this gap by studying the statistical properties of a larger family of M-estimators, whose objective functions are formulated as U-processes and may be discontinuous, in an increasing-dimension setup where the number of parameters, $p_{n}$, in the model is allowed to increase with the sample size, $n$. First, we find that in estimation, as $p_{n}/n\rightarrow 0$, a $(p_{n}/n)^{1/2}$ rate of convergence is often obtainable. Second, we establish Bahadur-type bounds and study the validity of normal approximation, which we find often requires a much stronger scaling condition than $p_{n}^{2}/n\rightarrow 0.$ Third, we state conditions under which the numerical derivative estimator of the asymptotic covariance matrix is consistent, and show that the step size in implementing the covariance estimator has to be adjusted with respect to $p_{n}$. All theoretical results are further backed up by simulation studies. 
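Han's maximum rank correlation objective is a concrete example of the U-process objectives studied here: it counts concordant pairs between the outcome and the linear index. A toy numpy sketch with $p = 2$, where a grid over the angle of the normalized coefficient suffices:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: y = g(x'beta0) + noise with monotone g; beta0 is normalized.
n = 200
beta0 = np.array([np.cos(0.7), np.sin(0.7)])
X = rng.normal(size=(n, 2))
y = np.exp(X @ beta0) + 0.1 * rng.normal(size=n)

def mrc_objective(b, X, y):
    """Han's maximum rank correlation U-process objective:
    the fraction of pairs concordant in (y, x'b)."""
    idx = X @ b
    return np.mean((y[:, None] > y[None, :]) & (idx[:, None] > idx[None, :]))

# With p = 2 and ||b|| = 1, a grid over the angle is enough for a sketch.
angles = np.linspace(0, np.pi, 361)
vals = [mrc_objective(np.array([np.cos(a), np.sin(a)]), X, y) for a in angles]
a_hat = angles[int(np.argmax(vals))]
print(round(float(a_hat), 2))  # close to the true angle 0.7
```

The objective is a step function of $b$, which is exactly the kind of discontinuity the paper's theory accommodates.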
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.05255&r=all 
By:  Taras Bodnar; Holger Dette; Nestor Parolya; Erik Thors\'en 
Abstract:  Optimal portfolio selection problems are determined by the (unknown) parameters of the data generating process. If an investor wants to realise the position suggested by the optimal portfolios, he/she needs to estimate the unknown parameters and to account for the parameter uncertainty in the decision process. Most often, the parameters of interest are the population mean vector and the population covariance matrix of the asset return distribution. In this paper we characterise the exact sampling distribution of the estimated optimal portfolio weights and their characteristics by deriving a stochastic representation of this distribution. This approach possesses several advantages: (i) it determines the sampling distribution of the estimated optimal portfolio weights by expressions which can be used to draw samples from this distribution efficiently; (ii) the derived stochastic representation provides an easy way to obtain the asymptotic approximation of the sampling distribution. The latter property is used to show that the high-dimensional asymptotic distribution of optimal portfolio weights is multivariate normal and to determine its parameters. Moreover, a consistent estimator of optimal portfolio weights and their characteristics is derived under high-dimensional settings. Via an extensive simulation study, we investigate the finite-sample performance of the derived asymptotic approximation and study its robustness to violations of the model assumptions used in the derivation of the theoretical results. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.04243&r=all 
By:  Greene, W.H.; Harris, M.N.; Knott, R.; Rice, N. 
Abstract:  Anchoring vignettes have been proposed as a way to correct for differential item functioning when individuals self-assess their health, or other aspects of their circumstances, on an ordered categorical scale. The model relies on two key underlying assumptions of response consistency and vignette equivalence. Adopting a modified specification of the boundary equations in the compound hierarchical ordered probit model, this paper develops joint and separate tests of these assumptions based on a score approach. Monte Carlo simulations show that the tests have good size and power properties in finite samples. We provide an application of the tests to data from the Survey of Health, Ageing and Retirement in Europe (SHARE), using self-reported data on pain. The tests are easy to implement, requiring only estimation of the restricted model under the null hypothesis. 
Keywords:  ordered response models; anchoring vignettes; differential item functioning; self-assessments; score test; CHOPIT 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:yor:hectdg:19/18&r=all 
By:  Qingliang Fan; Yu-Chin Hsu; Robert P. Lieli; Yichong Zhang 
Abstract:  Given the unconfoundedness assumption, we propose new nonparametric estimators for the reduced dimensional conditional average treatment effect (CATE) function. In the first stage, the nuisance functions necessary for identifying CATE are estimated by machine learning methods, allowing the number of covariates to be comparable to or larger than the sample size. This is a key feature, since identification is generally more credible if the full vector of conditioning variables, including possible transformations, is high-dimensional. The second stage consists of a low-dimensional kernel regression, reducing CATE to a function of the covariate(s) of interest. We consider two variants of the estimator depending on whether the nuisance functions are estimated over the full sample or over a hold-out sample. Building on Belloni et al. (2017) and Chernozhukov et al. (2018), we derive functional limit theory for the estimators and provide an easy-to-implement procedure for uniform inference based on the multiplier bootstrap. The empirical application revisits the effect of maternal smoking on a baby's birth weight as a function of the mother's age. 
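The two-stage logic can be illustrated in a stripped-down setting where the nuisance functions are known rather than machine-learned, isolating the second stage: a kernel regression of doubly robust (AIPW) pseudo-outcomes on the covariate of interest. A hypothetical numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(9)

# Simulated data where the nuisance functions are known, so we can focus
# on the second stage of the procedure.
n = 2000
x1 = rng.uniform(0, 1, size=n)
p = 0.5                                 # known propensity score
d = rng.binomial(1, p, size=n)
tau = 1.0 + 2.0 * x1                    # true CATE along x1
y = tau * d + rng.normal(size=n)
mu1, mu0 = tau, np.zeros(n)             # known outcome regressions

# AIPW (doubly robust) pseudo-outcome; its conditional mean given x1 is CATE.
psi = mu1 - mu0 + d * (y - mu1) / p - (1 - d) * (y - mu0) / (1 - p)

def nw(xg, x, v, h):
    """Nadaraya-Watson kernel regression of v on x at the points xg."""
    K = np.exp(-0.5 * ((xg[:, None] - x[None, :]) / h) ** 2)
    return (K * v[None, :]).sum(axis=1) / K.sum(axis=1)

grid = np.array([0.25, 0.75])
cate_hat = nw(grid, x1, psi, h=0.1)
print(np.round(cate_hat, 1))  # close to the true values 1.5 and 2.5
```

In the paper the nuisances are estimated by machine learning in a first stage, and uniform inference on the CATE curve is done via a multiplier bootstrap.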
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.02399&r=all 
By:  Michael Griebel; Florian Heiss; Jens Oettershagen; Constantin Weiser 
Abstract:  Empirical economic research frequently applies maximum likelihood estimation in cases where the likelihood function is analytically intractable. Most of the theoretical literature focuses on maximum simulated likelihood (MSL) estimators, while empirical and simulation analyses often find that alternative approximation methods such as quasi-Monte Carlo simulation, Gaussian quadrature, and integration on sparse grids behave considerably better numerically. This paper generalizes the theoretical results widely known for MSL estimators to a general set of maximum approximated likelihood (MAL) estimators. We provide general conditions on both the model and the approximation approach that ensure consistency and asymptotic normality. We also show specific examples and finite-sample simulation results. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.04110&r=all 
By:  Harvey, A.; Hurn, S.; Thiele, S. 
Abstract:  Circular observations pose special problems for time series modeling. This article shows how the score-driven approach, developed primarily in econometrics, provides a natural solution to the difficulties and leads to a coherent and unified methodology for estimation, model selection and testing. The new methods are illustrated with hourly data on wind direction. 
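For the von Mises distribution, the score with respect to the location parameter is proportional to $\sin(y_t - \mu_t)$, so a score-driven filter updates the mean direction by a fraction of that term and automatically respects the circular geometry. A minimal numpy sketch on simulated directions (a simplified filter, not the authors' full model):

```python
import numpy as np

rng = np.random.default_rng(3)

def wrap(theta):
    """Wrap angles to [-pi, pi)."""
    return np.mod(theta + np.pi, 2 * np.pi) - np.pi

# Simulated wind directions drifting slowly around the circle.
T = 300
true_mu = wrap(np.cumsum(0.02 * np.ones(T)))
y = wrap(true_mu + 0.3 * rng.normal(size=T))   # crude circular noise

# Score-driven filter: for the von Mises density, the score with respect
# to the location is proportional to sin(y_t - mu_t), so the update
#   mu_{t+1} = mu_t + alpha * sin(y_t - mu_t)
# never "jumps" at the +/- pi boundary, since sin is periodic.
alpha = 0.3
mu = np.empty(T)
mu[0] = y[0]
for t in range(T - 1):
    mu[t + 1] = wrap(mu[t] + alpha * np.sin(y[t] - mu[t]))

err = wrap(mu - true_mu)
print(round(float(np.mean(np.abs(err))), 3))
```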
Keywords:  Autoregression, circular data, dynamic conditional score model, von Mises distribution, wind direction 
JEL:  C22 
Date:  2019–08–12 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:1971&r=all 
By:  Bryan S. Graham; Fengshi Niu; James L. Powell 
Abstract:  We study nonparametric estimation of density functions for undirected dyadic random variables (i.e., random variables defined for all $n\overset{\mathrm{def}}{\equiv}\tbinom{N}{2}$ unordered pairs of agents/nodes in a weighted network of order $N$). These random variables satisfy a local dependence property: any random variables in the network that share one or two indices may be dependent, while those sharing no indices in common are independent. In this setting, we show that density functions may be estimated by an application of the kernel estimation method of Rosenblatt (1956) and Parzen (1962). We suggest an estimate of their asymptotic variances inspired by a combination of (i) Newey's (1994) method of variance estimation for kernel estimators in the "monadic" setting and (ii) a variance estimator for the (estimated) density of a simple network first suggested by Holland and Leinhardt (1976). More unusual are the rates of convergence and asymptotic (normal) distributions of our dyadic density estimates. Specifically, we show that they converge at the same rate as the (unconditional) dyadic sample mean: the square root of the number, $N$, of nodes. This differs from the results for nonparametric estimation of densities and regression functions for monadic data, which generally have a slower rate of convergence than their corresponding sample mean. 
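Because the estimator is the standard Rosenblatt-Parzen kernel density estimate applied to the $\tbinom{N}{2}$ dyadic observations, the computation itself is simple; the novelty lies in the asymptotics. A hypothetical numpy sketch:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)

# Toy weighted network: dyadic outcome Y_ij = a_i + a_j + noise, so dyads
# sharing a node are dependent (a hypothetical data-generating process).
N = 30
a = rng.normal(size=N)
pairs = list(combinations(range(N), 2))
Y = np.array([a[i] + a[j] + rng.normal() for i, j in pairs])

def kde(y_grid, data, h):
    """Rosenblatt-Parzen estimator with a Gaussian kernel."""
    z = (y_grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

grid = np.linspace(-4, 4, 81)
f_hat = kde(grid, Y, h=0.5)
dx = grid[1] - grid[0]
mass = float(f_hat.sum() * dx)
print(round(mass, 2))  # close to 1
```

The paper's point is that, unlike in the monadic case, this estimate converges at the parametric $\sqrt{N}$ rate because of the dyadic dependence structure.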
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.13630&r=all 
By:  Angelo Mele; Lingxin Hao; Joshua Cape; Carey E. Priebe 
Abstract:  In many applications of network analysis, it is important to distinguish between observed and unobserved factors affecting network structure. To this end, we develop spectral estimators for both unobserved blocks and the effect of covariates in stochastic blockmodels. Our main strategy is to reformulate the stochastic blockmodel estimation problem as recovery of latent positions in a generalized random dot product graph. On the theoretical side, we establish asymptotic normality of our estimators for the subsequent purpose of performing inference. On the applied side, we show that computing our estimator is much faster than standard variational expectation-maximization algorithms and scales well for large networks. The results in this paper provide a foundation to estimate the effect of observed covariates as well as unobserved latent community structure on the probability of link formation in networks. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.06438&r=all 
By:  Tamás Krisztin; Philipp Piribauer (WIFO) 
Abstract:  In this paper we propose a Bayesian estimation approach for a spatial autoregressive logit specification. Our approach relies on recent advances in Bayesian computing, making use of Pólya-Gamma sampling for Bayesian Markov chain Monte Carlo algorithms. The proposed specification assumes that the involved log-odds of the model follow a spatial autoregressive process. Pólya-Gamma sampling allows for a computationally efficient treatment of the spatial autoregressive logit model and for extensions of the existing baseline specification in an elegant and straightforward way. In a Monte Carlo study we demonstrate that our proposed approach significantly outperforms existing spatial autoregressive probit specifications both in terms of parameter precision and computational time. The paper moreover illustrates the performance of the proposed spatial autoregressive logit specification using pan-European regional data on foreign direct investments. Our empirical results highlight the importance of accounting for spatial dependence when modelling European regional FDI flows. 
Keywords:  Spatial autoregressive logit, Bayesian MCMC estimation, FDI flows, European regions 
Date:  2019–08–19 
URL:  http://d.repec.org/n?u=RePEc:wfo:wpaper:y:2019:i:586&r=all 
By:  Daiki Maki; Yasushi Ota 
Abstract:  This study examines the statistical performance of tests for time-varying properties under a misspecified conditional mean and variance. When we test for time-varying properties of the conditional mean in the case in which the data have no time-varying mean but do have time-varying variance, asymptotic tests have size distortions. This is improved by the use of a bootstrap method. Similarly, when we test for time-varying properties of the conditional variance in the case in which the data have a time-varying mean but no time-varying variance, asymptotic tests have large size distortions. This is not improved even by the use of bootstrap methods. We show that bootstrap tests for time-varying properties of the conditional mean are robust regardless of the time-varying variance model, whereas tests for time-varying properties of the conditional variance do not perform well in the presence of a misspecified time-varying mean. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.12107&r=all 
By:  Nabil Kazi-Tani (SAF - Laboratoire de Sciences Actuarielle et Financière - UCBL - Université Claude Bernard Lyon 1 - Université de Lyon); Didier Rullière (SAF - Laboratoire de Sciences Actuarielle et Financière - UCBL - Université Claude Bernard Lyon 1 - Université de Lyon) 
Abstract:  In this paper, we investigate the link between the joint law of a $d$-dimensional random vector and the law of some of its multivariate marginals. We introduce and focus on a class of distributions, which we call projective, for which we give detailed properties. This allows us to obtain necessary conditions for a given construction to be projective. We illustrate our results by proposing some theoretical projective distributions, such as elliptical distributions or a new class of distributions having given bivariate margins. In the case where the data do not necessarily correspond to a projective distribution, we also explain how to build proper distributions while checking that the distance to the prescribed projections is small enough. 
Keywords:  Copulas, multidimensional marginals, elliptical distributions 
Date:  2019–08–07 
URL:  http://d.repec.org/n?u=RePEc:hal:journl:hal01575169&r=all 
By:  Zhentao Shi; Jingyi Huang 
Abstract:  Policy evaluation is central to economic data analysis, but economists mostly work with observational data in view of limited opportunities to carry out controlled experiments. In the potential outcome framework, the panel data approach (Hsiao, Ching and Wan, 2012) constructs the counterfactual by exploiting the correlation between cross-sectional units in panel data. The choice of cross-sectional control units, a key step in its implementation, is nevertheless unresolved in data-rich environments where many possible controls are at the researcher's disposal. We propose the forward selection method to choose control units, and establish the validity of post-selection inference. Our asymptotic framework allows the number of possible controls to grow much faster than the time dimension. The easy-to-implement algorithms and their theoretical guarantees extend the panel data approach to big data settings. Monte Carlo simulations are conducted to demonstrate the finite sample performance of the proposed method. Two empirical examples illustrate the usefulness of our procedure when many controls are available in real-world applications. 
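Forward selection of control units can be sketched in a few lines: greedily add, one at a time, the control whose inclusion most reduces the pre-treatment sum of squared residuals. A toy numpy illustration (the paper's post-selection inference is the substantive contribution and is not shown):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical panel: T0 pre-treatment periods, many candidate controls.
T0, J = 40, 60
controls = rng.normal(size=(T0, J))
# The treated unit loads on controls 0 and 1 only, plus noise.
y = 1.5 * controls[:, 0] - 1.0 * controls[:, 1] + 0.1 * rng.normal(size=T0)

def forward_select(y, X, k):
    """Greedy forward selection: add the control that most reduces the SSR."""
    selected = []
    for _ in range(k):
        best_j, best_ssr = None, np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            Z = X[:, selected + [j]]
            b = np.linalg.lstsq(Z, y, rcond=None)[0]
            ssr = np.sum((y - Z @ b) ** 2)
            if ssr < best_ssr:
                best_j, best_ssr = j, ssr
        selected.append(best_j)
    return selected

picked = forward_select(y, controls, k=2)
print(sorted(picked))  # recovers controls 0 and 1
```

The selected controls would then be used to predict the treated unit's counterfactual path in the post-treatment period.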
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.05894&r=all 
By:  Elena Krasnokutskaya; Kyungchul Song; Xun Tang 
Abstract:  We propose a new method for studying environments with unobserved individual heterogeneity. Based on model-implied pairwise inequalities, the method classifies individuals in the sample into groups defined by discrete unobserved heterogeneity with unknown support. We establish conditions under which the groups are identified and consistently estimated through our method. We show that the method performs well in finite samples through Monte Carlo simulation. We then apply the method to estimate a model of low-price procurement auctions with unobserved bidder heterogeneity, using data from the California highway procurement market. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.01272&r=all 
By:  Timo Dimitriadis; Julie Schnaitmann 
Abstract:  In this paper, we introduce new forecast encompassing tests for the risk measure Expected Shortfall (ES). Forecasting and forecast evaluation techniques for the ES are rapidly gaining attention through the recently introduced Basel III Accords, which stipulate the use of the ES as the primary market risk measure in international banking regulation. Encompassing tests generally rely on the existence of strictly consistent loss functions for the functionals under consideration, which do not exist for the ES. However, our encompassing tests are based on recently introduced loss functions and an associated regression framework which considers the ES jointly with the corresponding Value at Risk (VaR). This setup facilitates several testing specifications which allow for both joint tests for the ES and VaR and stand-alone tests for the ES. We present asymptotic theory for our encompassing tests and verify their finite sample properties through various simulation setups. In an empirical application, we utilize the encompassing tests to demonstrate the superiority of forecast combination methods for ES forecasts for the IBM stock. 
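For orientation, the two functionals being forecast can be computed historically in a couple of lines: VaR is a quantile of the loss distribution and ES is the mean loss beyond it. This numpy sketch shows only these simple empirical estimators, not the paper's joint regression framework or encompassing tests:

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulated daily losses (positive = loss); hypothetical heavy-tailed data.
losses = rng.standard_t(df=5, size=10000)

def var_es(losses, alpha=0.975):
    """Historical Value at Risk and Expected Shortfall at level alpha.
    VaR is the alpha-quantile of the loss distribution; ES is the
    average loss beyond the VaR."""
    var = np.quantile(losses, alpha)
    es = losses[losses >= var].mean()
    return var, es

var, es = var_es(losses)
print(round(float(var), 2), round(float(es), 2))
```

The difficulty the paper addresses is that the ES, unlike the quantile, admits no strictly consistent loss function on its own, which is why it is modeled jointly with the VaR.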
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.04569&r=all 
By:  Emanuele Russo; Neil Foster-McGregor; Bart Verspagen 
Abstract:  In this paper we investigate whether long run time series of income per capita are better described by a trend-stationary model with few structural changes or by unit root processes in which permanent stochastic shocks are responsible for the observed growth discontinuities. To this end, we develop a methodology to test the null of a generic I(1) process against a set of stationary alternatives with structural breaks. Unlike other tests in the literature, the number of structural breaks under the alternative hypothesis is treated as an unknown (up to some ex ante determined maximum). Critical values are obtained via Monte Carlo simulations, and finite sample size and power properties of the test are reported. An application is provided for a group of advanced and developing countries in the Maddison dataset, also using bootstrapped critical values. Compared to previous findings in the literature, less evidence is found against the unit root hypothesis. Failures to reject the I(1) null are particularly strong for the set of developing countries considered. Finally, even fewer rejections are found when relaxing the assumption of Gaussian shocks. 
Keywords:  Long-run growth; structural breaks; unit roots. 
Date:  2019–08–22 
URL:  http://d.repec.org/n?u=RePEc:ssa:lemwps:2019/29&r=all 
By:  Ying, Jiahui; Shonkwiler, J. Scott 
Keywords:  Environmental Economics and Policy 
Date:  2019–06–25 
URL:  http://d.repec.org/n?u=RePEc:ags:aaea19:290818&r=all 
By:  Emmanuel Guerre; Yao Luo 
Abstract:  We consider nonparametric identification of independent private value first-price auction models in which the analyst only observes winning bids. Our benchmark model assumes an exogenous number of bidders $N$. We show that, if the bidders observe $N$, the resulting discontinuities in the winning bid density can be used to identify the distribution of $N$. The private value distribution can be identified in a second step. A second class of models considers an endogenously determined $N$, due to a reserve price or an entry cost. If bidders observe $N$, these models are also identifiable using winning bid discontinuities. If bidders cannot observe $N$, however, identification is not possible unless the analyst observes an instrument which affects the reserve price or entry cost. Lastly, we derive some testable restrictions for whether bidders observe the number of competitors and whether endogenous participation is due to a reserve price or an entry cost. An application to USFS timber auction data illustrates the usefulness of our theoretical results for competition analysis, showing that nearly one bid out of three can be non-competitive. It also suggests that the risk aversion bias caused by mismeasured competition can be large. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.05476&r=all 
By:  Michael Pfarrhofer 
Abstract:  This paper investigates the time-varying impacts of international macroeconomic uncertainty shocks. We use a global vector autoregressive (GVAR) specification with drifting coefficients and factor stochastic volatility in the errors to model six economies jointly. The measure of uncertainty is constructed endogenously by estimating a scalar driving the innovation variances of the latent factors, and it is also included in the mean of the process. To achieve regularization, we use Bayesian techniques for estimation, and introduce a set of hierarchical global-local shrinkage priors. The adopted priors center the model on a constant parameter specification with homoscedastic errors, but allow for time variation if suggested by likelihood information. Moreover, we assume coefficients across economies to be similar, but provide sufficient flexibility via the hierarchical prior for country-specific idiosyncrasies. The results point towards pronounced real and financial effects of uncertainty shocks in all countries, with differences across economies and over time. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.06325&r=all 
By:  Ta-Hsin Li 
Abstract:  Nonlinear dynamic volatility has been observed in many financial time series. The recently proposed quantile periodogram offers an alternative way to examine this phenomenon in the frequency domain. The quantile periodogram is constructed from trigonometric quantile regression of time series data at different frequencies and quantile levels. It is a useful tool for quantile-frequency analysis (QFA) of nonlinear serial dependence. This paper introduces a number of spectral divergence metrics based on the quantile periodogram for diagnostic checks of financial time series models and model-based discriminant analysis. The parametric bootstrapping technique is employed to compute the $p$-values of the metrics. The usefulness of the proposed method is demonstrated empirically by a case study using the daily log returns of the S\&P 500 index over three periods of time, together with their GARCH-type models. The results show that the QFA method is able to provide additional insights into the goodness of fit of these financial time series models that may have been missed by conventional tests. The results also show that the QFA method offers a more informative way of performing discriminant analysis for detecting regime changes in time series. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.02545&r=all 
By:  Geng, Huayan; Zhou, Qiankun 
Keywords:  Research Methods/ Statistical Methods 
Date:  2019–06–25 
URL:  http://d.repec.org/n?u=RePEc:ags:aaea19:291212&r=all 
By:  Jean-Jacques Forneron 
Abstract:  This paper develops an approach to detect identification failures in a large class of moment condition models. This is achieved by introducing a quasi-Jacobian matrix which is asymptotically singular under higher-order local identification as well as weak/set identification; in these settings, standard asymptotics are not valid. Under (semi-)strong identification, where standard asymptotics are valid, this matrix is asymptotically equivalent to the usual Jacobian matrix. After rescaling, it is thus asymptotically nonsingular. Together, these results imply that the eigenvalues of the quasi-Jacobian can detect potential local and global identification failures. Furthermore, the quasi-Jacobian is informative about the span of the identification failure. This information permits two-step identification-robust subvector inference without any a priori knowledge of the underlying identification structure. Monte Carlo simulations and empirical applications illustrate the results. 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.13093&r=all 
By:  Adrien Auclert; Bence Bardóczy; Matthew Rognlie; Ludwig Straub 
Abstract:  We propose a general and highly efficient method for solving and estimating general equilibrium heterogeneous-agent models with aggregate shocks in discrete time. Our approach relies on the rapid computation and composition of sequence-space Jacobians, the derivatives of perfect-foresight equilibrium mappings between aggregate sequences around the steady state. We provide a fast algorithm for computing Jacobians for heterogeneous agents, a technique to substantially reduce dimensionality, a rapid procedure for likelihood-based estimation, a determinacy condition for the sequence space, and a method to solve nonlinear perfect-foresight transitions. We apply our methods to three canonical heterogeneous-agent models: a neoclassical model, a New Keynesian model with one asset, and a New Keynesian model with two assets. 
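A sequence-space Jacobian can be illustrated by brute force: perturb each date of an input sequence and record the response of the perfect-foresight output sequence. The toy mapping below is hypothetical, and the point of the paper is precisely to avoid this slow finite-difference computation for heterogeneous-agent blocks:

```python
import numpy as np

T = 20

def model_map(z):
    """Toy perfect-foresight mapping: y_t = 0.5 * y_{t+1} + z_t, with y_T = 0.
    Stands in for an equilibrium mapping between aggregate sequences."""
    y = np.zeros(T)
    y_next = 0.0
    for t in range(T - 1, -1, -1):
        y[t] = 0.5 * y_next + z[t]
        y_next = y[t]
    return y

def sequence_jacobian(f, T, eps=1e-6):
    """Finite-difference Jacobian of the sequence-space mapping at the
    steady state z = 0: column s is the response to a date-s perturbation."""
    base = f(np.zeros(T))
    J = np.zeros((T, T))
    for s in range(T):
        dz = np.zeros(T)
        dz[s] = eps
        J[:, s] = (f(dz) - base) / eps
    return J

J = sequence_jacobian(model_map, T)
print(round(float(J[0, 0]), 3), round(float(J[0, 1]), 3))  # 1.0 0.5
```

Here the Jacobian is upper triangular with entries $0.5^{s-t}$, reflecting how anticipated future shocks are discounted back to the present under perfect foresight.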
JEL:  C63 E21 E32 
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:26123&r=all 
By:  Daiki Maki; Yasushi Ota 
Abstract:  This study compares the statistical properties of ARCH tests that are robust to the presence of a misspecified conditional mean. The approaches employed in this study are based on two nonparametric regressions for the conditional mean. The first is an ARCH test using Nadaraya-Watson kernel regression; the second is an ARCH test using polynomial approximation regression. The two approaches do not require specification of the conditional mean and can adapt to various nonlinear models, which are unknown a priori. Accordingly, they are robust to misspecified conditional mean models. Simulation results show that ARCH tests based on the polynomial approximation regression approach have better statistical properties than ARCH tests using the Nadaraya-Watson kernel regression approach for various nonlinear models. 
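The first approach can be sketched directly: fit the conditional mean by Nadaraya-Watson kernel regression, then apply Engle's LM test for ARCH to the nonparametric residuals. A hedged numpy illustration on a hypothetical nonlinear DGP:

```python
import numpy as np

rng = np.random.default_rng(7)

# Nonlinear conditional mean with homoskedastic errors (hypothetical DGP).
n = 500
x = rng.uniform(-2, 2, size=n)
y = np.sin(np.pi * x) + rng.normal(scale=0.3, size=n)

def nw_fit(x, y, h):
    """Nadaraya-Watson kernel regression evaluated at the sample points."""
    z = (x[:, None] - x[None, :]) / h
    K = np.exp(-0.5 * z**2)
    return (K * y[None, :]).sum(axis=1) / K.sum(axis=1)

resid = y - nw_fit(x, y, h=0.2)

# Engle's LM test for ARCH(1) on the nonparametric residuals:
# regress e_t^2 on a constant and e_{t-1}^2; LM = n * R^2 ~ chi2(1) under H0.
e2 = resid**2
Z = np.column_stack([np.ones(n - 1), e2[:-1]])
target = e2[1:]
b = np.linalg.lstsq(Z, target, rcond=None)[0]
fitted = Z @ b
r2 = 1 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)
lm = (n - 1) * r2
print(round(float(lm), 2))  # compare with the chi2(1) 5% critical value 3.84
```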
Date:  2019–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1907.12752&r=all 
By:  Ningyuan Chen; Guillermo Gallego; Zhuodong Tang 
Abstract:  We show the equivalence of discrete choice models and the class of binary choice forests, which are random forests based on binary choice trees. This suggests that standard machine learning techniques based on random forests can serve to estimate discrete choice models with an interpretable output. This is confirmed by our data-driven result that random forests can accurately predict the choice probability of any discrete choice model. Our framework has unique advantages: it can capture behavioral patterns such as irrationality or sequential searches; it handles nonstandard formats of training data that result from aggregation; it can measure product importance based on how frequently a random customer would make decisions depending on the presence of the product; and it can incorporate price information. Our numerical results show that binary choice forests can outperform the best parametric models, with much better computational times. 
Date:  2019–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1908.01109&r=all 