nep-ecm New Economics Papers
on Econometrics
Issue of 2017‒07‒16
fourteen papers chosen by
Sune Karlsson
Örebro universitet

  2. Double/Debiased Machine Learning for Treatment and Structural Parameters By Victor Chernozhukov; Denis Chetverikov; Mert Demirer; Esther Duflo; Christian Hansen; Whitney Newey; James Robins
  3. Dynamic Quantile Function Models By Wilson Ye Chen; Gareth W. Peters; Richard H. Gerlach; Scott A. Sisson
  4. What Happens When Econometrics and Psychometrics Collide? An Example Using the PISA Data By Jerrim, John; Lopez-Agudo, Luis Alejandro; Marcenaro-Gutierrez, Oscar D.; Shure, Dominique
  5. Seasonal long memory in intraday volatility and trading volume of Dow Jones stocks By Voges, Michelle; Leschinski, Christian; Sibbertsen, Philipp
  6. Asymptotic Properties of the Maximum Likelihood Estimator in Regime Switching Econometric Models By Hiroyuki Kasahara; Katsumi Shimotsu
  7. Testing the Order of Multivariate Normal Mixture Models By Hiroyuki Kasahara; Katsumi Shimotsu
  8. Endogeneity in Semiparametric Threshold Regression By Andros Kourtellos; Thanasis Stengos; Yiguo Sun
  9. Pitfalls in the Development of Falsification Tests: An Illustration from the Recent Minimum Wage Literature By Clemens, Jeffrey
  10. Bayesian Realized-GARCH Models for Financial Tail Risk Forecasting Incorporating Two-sided Weibull Distribution By Chao Wang; Qian Chen; Richard Gerlach
  11. Semi-parametric Bayesian Forecasting with an Application to Stochastic Volatility By Fabian Goessling; Martina Danielova Zaharieva
  12. A Generalized Approach to Indeterminacy in Linear Rational Expectations Models By Bianchi, Francesco; Nicolò, Giovanni
  13. Forecasting football match results in national league competitions using score-driven time series models By Siem Jan S.J. Koopman; Rutger Lit
  14. Should We Combine Difference In Differences with Conditioning on Pre-Treatment Outcomes? By Chabé-Ferret, Sylvain

  1. By: Davide De Gaetano
    Abstract: The aim of this paper is to propose a bias correction of the estimation of the long run fourth order moment in the CUSUM of squares test proposed by Sansó et al. (2004) for the detection of structural breaks in financial data. The correction is made by using the stationary bootstrap proposed by Politis and Romano (1994). The choice of this resampling technique is justified by the stationarity and weak dependence of the time series under the assumptions which ensure the existence of the limiting distribution of the test statistic, under the null hypothesis. Some Monte Carlo experiments have been implemented in order to evaluate the effect of the proposed bias correction considering two particular data generating processes, the GARCH(1,1) and the log-normal stochastic volatility. The effectiveness of the bias correction has been evaluated also on real data sets.
    Keywords: CUSUM of squares test, Structural breaks, Bias correction.
    JEL: C12 C58 G17
    Date: 2017–07
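The resampling scheme named in the abstract above, the stationary bootstrap of Politis and Romano (1994), can be sketched as follows. This is only a minimal illustration of the resampling step, not the paper's bias correction of the long-run fourth-order moment; the function names, the restart probability `p` and the toy series are assumptions made here for illustration.

```python
import random

def stationary_bootstrap_indices(n, p, rng):
    """Return n resampled indices; block lengths are geometric with mean 1/p."""
    indices = []
    i = rng.randrange(n)              # random start of the first block
    for _ in range(n):
        indices.append(i)
        if rng.random() < p:          # with probability p, start a new block
            i = rng.randrange(n)
        else:                         # otherwise continue the block (wrapping around)
            i = (i + 1) % n
    return indices

def stationary_bootstrap(series, p, rng):
    """Resample a series by concatenating random-length blocks of consecutive values."""
    idx = stationary_bootstrap_indices(len(series), p, rng)
    return [series[i] for i in idx]

# Toy series (illustrative values, not data from the paper).
rng = random.Random(0)
returns = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0, 0.15, -0.25]
resampled = stationary_bootstrap(returns, p=0.25, rng=rng)
```

A smaller `p` gives longer blocks on average and so preserves more of the serial dependence in each resample.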
  2. By: Victor Chernozhukov; Denis Chetverikov; Mert Demirer; Esther Duflo; Christian Hansen; Whitney Newey; James Robins
    Abstract: We revisit the classic semiparametric problem of inference on a low dimensional parameter θ_0 in the presence of high-dimensional nuisance parameters η_0. We depart from the classical setting by allowing for η_0 to be so high-dimensional that the traditional assumptions, such as Donsker properties, that limit complexity of the parameter space for this object break down. To estimate η_0, we consider the use of statistical or machine learning (ML) methods which are particularly well-suited to estimation in modern, very high-dimensional cases. ML methods perform well by employing regularization to reduce variance and trading off regularization bias with overfitting in practice. However, both regularization bias and overfitting in estimating η_0 cause a heavy bias in estimators of θ_0 that are obtained by naively plugging ML estimators of η_0 into estimating equations for θ_0. This bias results in the naive estimator failing to be N^(-1/2) consistent, where N is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest θ_0 can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters to estimate θ_0, and (2) making use of cross-fitting which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in a N^(-1/2)-neighborhood of the true parameter values and are approximately unbiased and normally distributed, which allows construction of valid confidence statements. 
The generic statistical theory of DML is elementary and simultaneously relies on only weak theoretical requirements which will admit the use of a broad array of modern ML methods for estimating the nuisance parameters such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods. We illustrate the general theory by applying it to provide theoretical properties of DML applied to learn the main regression parameter in a partially linear regression model, DML applied to learn the coefficient on an endogenous variable in a partially linear instrumental variables model, DML applied to learn the average treatment effect and the average treatment effect on the treated under unconfoundedness, and DML applied to learn the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples.
    JEL: C01
    Date: 2017–06
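The two ingredients described in the abstract above (an orthogonal score and cross-fitting) can be illustrated for the partially linear regression model Y = D·θ_0 + g(X) + U. The sketch below assumes a scalar covariate, uses a simple k-nearest-neighbour average as a stand-in for the ML learner, and uses the partialling-out form of the score; the function names, fold scheme, tuning values and simulated data are all assumptions made here, not the authors' implementation.

```python
import math
import random

def knn_predict(x_train, y_train, x0, k):
    """Average y over the k training points nearest to x0 (a stand-in ML learner)."""
    nearest = sorted(range(len(x_train)), key=lambda j: abs(x_train[j] - x0))[:k]
    return sum(y_train[j] for j in nearest) / k

def dml_plr(y, d, x, n_folds=2, k=15):
    """Cross-fitted partialling-out estimate of theta in Y = D*theta + g(X) + U."""
    n = len(y)
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    num = den = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        x_tr = [x[i] for i in train]
        y_tr = [y[i] for i in train]
        d_tr = [d[i] for i in train]
        for i in held_out:
            # Nuisance functions are fitted on the other folds only (cross-fitting).
            y_res = y[i] - knn_predict(x_tr, y_tr, x[i], k)   # Y - E[Y|X]
            d_res = d[i] - knn_predict(x_tr, d_tr, x[i], k)   # D - E[D|X]
            num += d_res * y_res
            den += d_res * d_res
    # Regress the outcome residuals on the treatment residuals.
    return num / den

# Simulated data with a known coefficient (purely illustrative).
rng = random.Random(42)
n, theta_true = 500, 0.5
x = [rng.uniform(-2, 2) for _ in range(n)]
d = [math.sin(xi) + rng.gauss(0, 1) for xi in x]
y = [theta_true * di + math.cos(xi) + rng.gauss(0, 1) for di, xi in zip(d, x)]
theta_hat = dml_plr(y, d, x)
```

Because each residual is computed from nuisance fits that never saw the held-out observation, overfitting in the learner does not feed directly into the estimate of θ_0.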
  3. By: Wilson Ye Chen; Gareth W. Peters; Richard H. Gerlach; Scott A. Sisson
    Abstract: We offer a novel way of thinking about the modelling of the time-varying distributions of financial asset returns. Borrowing ideas from symbolic data analysis, we consider data representations beyond scalars and vectors. Specifically, we consider a quantile function as an observation, and develop a new class of dynamic models for quantile-function-valued (QF-valued) time series. In order to make statistical inferences and account for parameter uncertainty, we propose a method whereby a likelihood function can be constructed for QF-valued data, and develop an adaptive MCMC sampling algorithm for simulating from the posterior distribution. Compared to modelling realised measures, modelling the entire quantile functions of intra-daily returns allows one to gain more insight into the dynamic structure of price movements. Via simulations, we show that the proposed MCMC algorithm is effective in recovering the posterior distribution, and that the posterior means are reasonable point estimates of the model parameters. For empirical studies, the new model is applied to analysing one-minute returns of major international stock indices. Through quantile scaling, we further demonstrate the usefulness of our method by forecasting one-step-ahead the Value-at-Risk of daily returns.
    Date: 2017–07
  4. By: Jerrim, John (University College London); Lopez-Agudo, Luis Alejandro (University of Malaga); Marcenaro-Gutierrez, Oscar D. (University of Malaga); Shure, Dominique (University College London)
    Abstract: International large-scale assessments such as PISA are increasingly being used to benchmark the academic performance of young people across the world. Yet many of the technicalities underpinning these datasets are misunderstood by applied researchers, who sometimes fail to take their complex sample and test designs into account. The aim of this paper is to generate a better understanding amongst economists about how such databases are created, and what this implies for the empirical methodologies one should (or should not) apply. We explain how some of the modelling strategies preferred by economists seem to be at odds with the complex test design, and provide clear advice on the types of robustness tests that are therefore needed when analyzing these datasets. In doing so, we hope to generate a better understanding of international large-scale education databases, and promote better practice in their use.
    Keywords: sample design, test design, PISA, weights, replicate weights, plausible values
    JEL: I20 C18 C10 C55
    Date: 2017–06
  5. By: Voges, Michelle; Leschinski, Christian; Sibbertsen, Philipp
    Abstract: It is well known that intraday volatilities and trading volumes exhibit strong seasonal features. These seasonalities are usually modeled using dummy variables or deterministic functions. Here, we propose a test for seasonal long memory with a known frequency. Using this test, we show that deterministic seasonality is an accurate model for the DJIA index but not for the component stocks. These still exhibit significant and persistent periodicity after seasonal de-meaning so that more evolved seasonal long memory models are required to model their behavior.
    Keywords: Intraday Volatility; Trading Volume; Seasonality; Long Memory
    JEL: C12 C22 C58 G12 G15
    Date: 2017–06
  6. By: Hiroyuki Kasahara (Vancouver School of Economics, University of British Columbia); Katsumi Shimotsu (Faculty of Economics, The University of Tokyo)
    Abstract: Markov regime switching models have been widely used in numerous empirical applications in economics and finance. However, the asymptotic distribution of the maximum likelihood estimator (MLE) has not been proven for some empirically popular Markov regime switching models. In particular, the asymptotic distribution of the MLE has been unknown for models in which the regime-specific density depends on both the current and the lagged regimes, which include the seminal model of Hamilton (1989) and the switching ARCH model of Hamilton and Susmel (1994). This paper shows the asymptotic normality of the MLE and the consistency of the asymptotic covariance matrix estimate of these models.
    Date: 2017–05
  7. By: Hiroyuki Kasahara (Vancouver School of Economics, University of British Columbia); Katsumi Shimotsu (Faculty of Economics, The University of Tokyo)
    Abstract: Testing the number of components in multivariate normal mixture models is a long-standing challenge. This paper develops a likelihood-based test of the null hypothesis of M_0 components against the alternative hypothesis of M_0 + 1 components. We derive a local quadratic approximation of the likelihood ratio statistic in terms of the polynomials of the parameters. Based on this quadratic approximation, we propose an EM test of the null hypothesis of M_0 components against the alternative hypothesis of M_0 + 1 components, and derive the asymptotic distribution of the proposed test statistic. The simulations show that the proposed test has good finite sample size and power properties.
    Date: 2017–03
  8. By: Andros Kourtellos (Department of Economics, University of Cyprus, Cyprus; The Rimini Centre for Economic Analysis); Thanasis Stengos (Department of Economics and Finance, University of Guelph, Canada; The Rimini Centre for Economic Analysis); Yiguo Sun (Department of Economics and Finance, University of Guelph, Canada)
    Abstract: In this paper, we investigate semiparametric threshold regression models with endogenous threshold variables based on a nonparametric control function approach. Using a series approximation we propose a two-step estimation method for the threshold parameter. For the regression coefficients we consider least-squares estimation in the case of exogenous regressors and two-stage least-squares estimation in the case of endogenous regressors. We show that our estimators are consistent and derive their asymptotic distribution for weakly dependent data. Furthermore, we propose a test for the endogeneity of the threshold variable, which is valid regardless of whether the threshold effect is zero or not. Finally, we assess the performance of our methods using a Monte Carlo simulation.
    Keywords: control function, series estimation, threshold regression
    JEL: C14 C24 C51
    Date: 2017–07
  9. By: Clemens, Jeffrey
    Abstract: This paper examines a “falsification test” from the recent minimum wage literature. The analysis illustrates several pitfalls associated with developing and interpreting such exercises, which are increasingly common in applied empirical work. Clemens and Wither (2014) present evidence that minimum wage increases contributed to the magnitude of employment declines among low-skilled groups during the Great Recession. Zipperer (2016) presents regressions that he interprets as falsification tests for Clemens and Wither's baseline regression. He interprets his results as evidence that Clemens and Wither's estimates are biased. In this paper, I demonstrate that Zipperer's falsification tests are uninformative for their intended purpose. The properties of clustered robust standard errors do not carry over from Clemens and Wither's baseline specification (27 treatment states drawn from 50) to Zipperer's falsification tests (3 or 5 “placebo treatment” states drawn from 23). Confidence intervals calculated using a setting-appropriate permutation test extend well beyond the tests' point estimates. Further, I show that the sub-samples to which Zipperer's procedure assigns “placebo treatment status” were disproportionately affected by severe housing crises. His test's point estimates are highly sensitive to the exclusion of the most extreme housing crisis experiences from the sample. An inspection of data on the housing market, prime aged employment, overall unemployment rates, and aggregate income per capita reveals the test's premise that regional neighbors form reasonable counterfactuals to be incorrect in this setting.
    Keywords: Falsification Test; Program Evaluation; Minimum Wage
    JEL: C18 J2 J3
    Date: 2017–06–14
  10. By: Chao Wang (Discipline of Business Analytics, The University of Sydney); Qian Chen (HSBC Business School, Peking University); Richard Gerlach (Discipline of Business Analytics, The University of Sydney)
    Abstract: The realized GARCH framework is extended to incorporate the two-sided Weibull distribution, for the purpose of volatility and tail risk forecasting in a financial time series. Further, the realized range, as a competitor for realized variance or daily returns, is employed in the realized GARCH framework. Further, sub-sampling and scaling methods are applied to both the realized range and realized variance, to help deal with inherent micro-structure noise and inefficiency. An adaptive Bayesian Markov Chain Monte Carlo method is developed and employed for estimation and forecasting, whose properties are assessed and compared with maximum likelihood, via a simulation study. Compared to a range of well-known parametric GARCH, GARCH with two-sided Weibull distribution and realized GARCH models, tail risk forecasting results across 7 market index return series and 2 individual assets clearly favor the realized GARCH models incorporating two-sided Weibull distribution, especially models employing the sub-sampled realized variance and sub-sampled realized range, over a six year period that includes the global financial crisis.
    Date: 2017–07
  11. By: Fabian Goessling; Martina Danielova Zaharieva
    Abstract: We propose a new and highly flexible Bayesian sampling algorithm for non-linear state space models under non-parametric distributions. The estimation framework combines a particle filtering and smoothing algorithm for the latent process with a Dirichlet process mixture model for the error term of the observable variables. In particular, we overcome the problem of constraining the models by transformations or the need for conjugate distributions. We use the Chinese restaurant representation of the Dirichlet process mixture, which allows for a parsimonious and generally applicable sampling algorithm. Thus, our estimation algorithm combines a pseudo-marginal Metropolis-Hastings scheme with a marginalized hierarchical semi-parametric model. We test our approach for several nested model specifications using simulated data and provide density forecasts. Furthermore, we carry out a real data example using S&P 500 returns.
    Keywords: Bayesian Nonparametrics, Particle Filtering, Stochastic Volatility, MCMC, Forecasting
    Date: 2017–07
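The Chinese restaurant representation mentioned in the abstract above assigns each new observation to an existing mixture component with probability proportional to that component's size, or to a new component with probability proportional to a concentration parameter. A minimal sketch of that prior seating rule follows; it is not the authors' sampler (which also involves particle filtering and the likelihood), and the function name and the value of `alpha` are assumptions made here.

```python
import random

def chinese_restaurant_process(n_customers, alpha, rng):
    """Draw cluster assignments from a Chinese restaurant process prior."""
    table_sizes = []      # number of customers at each table (cluster)
    assignments = []      # table index chosen by each customer
    for n_seated in range(n_customers):
        # Existing table t has weight table_sizes[t]; a new table has weight alpha.
        u = rng.random() * (n_seated + alpha)
        cumulative = 0.0
        chosen = len(table_sizes)          # default: open a new table
        for t, size in enumerate(table_sizes):
            cumulative += size
            if u < cumulative:
                chosen = t
                break
        if chosen == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[chosen] += 1
        assignments.append(chosen)
    return assignments, table_sizes

rng = random.Random(7)
assignments, table_sizes = chinese_restaurant_process(50, alpha=1.0, rng=rng)
```

Larger values of `alpha` make new tables more likely, so the number of mixture components grows faster with the sample size.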
  12. By: Bianchi, Francesco; Nicolò, Giovanni
    Abstract: We propose a novel approach to deal with the problem of indeterminacy in Linear Rational Expectations models. The method consists of augmenting the original model with a set of auxiliary exogenous equations that are used to provide the adequate number of explosive roots in presence of indeterminacy. The solution in this expanded state space, if it exists, is always determinate, and is identical to the indeterminate solution of the original model. The proposed approach accommodates determinacy and any degree of indeterminacy, and it can be implemented even when the boundaries of the determinacy region are unknown. Thus, the researcher can estimate the model by using standard packages without restricting the estimates to a certain area of the parameter space. We apply our method to simulated and actual data from a prototypical New-Keynesian model for both regions of the parameter space. We show that our method successfully recovers the true parameter values independent of the initial values.
    Keywords: Bayesian methods; General Equilibrium; Indeterminacy; Solution method
    JEL: C19 C51 C62 C63
    Date: 2017–07
  13. By: Siem Jan S.J. Koopman (VU Amsterdam, The Netherlands; CREATES, Aarhus University, Denmark; Tinbergen Institute, The Netherlands); Rutger Lit (VU Amsterdam, The Netherlands)
    Abstract: We develop a new dynamic multivariate model for the analysis and the forecasting of football match results in national league competitions. The proposed dynamic model is based on the score of the predictive observation mass function for a high-dimensional panel of weekly match results. Our main interest is to forecast whether the match result is a win, a loss or a draw for each team. To deliver such forecasts, the dynamic model can be based on three different dependent variables: the pairwise count of the number of goals, the difference between the number of goals, or the category of the match result (win, loss, draw). The different dependent variables require different distributional assumptions. Furthermore, different dynamic model specifications can be considered for generating the forecasts. We empirically investigate which dependent variable and which dynamic model specification yield the best forecasting results. In an extensive forecasting study, we consider match results from six large European football competitions and we validate the precision of the forecasts for a period of seven years for each competition. We conclude that our preferred dynamic model for pairwise counts delivers the most precise forecasts and outperforms benchmark and other competing models.
    Keywords: Football; Forecasting; Score-driven models; Bivariate Poisson; Skellam; Ordered probit; Probabilistic loss function
    JEL: C32
    Date: 2017–07–05
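To illustrate how a model for pairwise goal counts, as in the abstract above, yields win, draw and loss probabilities, the sketch below sums the joint probabilities of all scorelines. It assumes static, independent Poisson goal counts, which is a simplification: the paper's model is dynamic, score-driven and bivariate. The intensities and the truncation point `max_goals` are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def match_outcome_probs(lam_home, lam_away, max_goals=25):
    """Home win, draw and home loss probabilities from independent Poisson counts."""
    win = draw = loss = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                win += p
            elif h == a:
                draw += p
            else:
                loss += p
    return win, draw, loss

# Hypothetical scoring intensities for the home and away team.
win, draw, loss = match_outcome_probs(1.6, 1.1)
```

The goal difference of two independent Poisson counts follows a Skellam distribution, so the same three probabilities could equally be read off that distribution.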
  14. By: Chabé-Ferret, Sylvain
    Abstract: Applied researchers often combine Difference In Differences (DID) with conditioning on pre-treatment outcomes when the Parallel Trend Assumption (PTA) fails. I examine both the theoretical and empirical basis for this approach. I show that the theoretical argument that both methods combine their strengths – DID differencing out the permanent confounders while conditioning on pre-treatment outcomes captures the transitory ones – is incorrect. Worse, conditioning on pre-treatment outcomes might increase the bias of DID. Simulations of a realistic model of earnings dynamics and selection in a Job Training Program (JTP) show that this bias can be sizable in practice. Revisiting empirical studies comparing DID with RCTs, I also find that conditioning on pre-treatment outcomes increases the bias of DID. Taken together, these results suggest that we should not combine DID with conditioning on pre-treatment outcomes but rather use DID conditioning on covariates that are fixed over time. When the PTA fails, DID applied symmetrically around the treatment date performs well in simulations and when compared with RCTs. Matching on several observations of pre-treatment outcomes also performs well in simulations, but evidence on its empirical performance is lacking.
    Keywords: Difference in Differences; Matching; Selection Model; Treatment Effects
    JEL: C21 C23
    Date: 2017–06
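One reading of "DID applied symmetrically around the treatment date" in the abstract above is to compare outcomes k periods after treatment with outcomes k periods before it, for treated and control groups alike. The sketch below follows that reading; the data layout (dictionaries keyed by period relative to the treatment date), the function names and the toy numbers are assumptions made here, not the author's code or data.

```python
def mean(values):
    return sum(values) / len(values)

def symmetric_did(treated, control, k):
    """DID comparing period +k with period -k, where 0 is the treatment date."""
    treated_change = mean(treated[k]) - mean(treated[-k])
    control_change = mean(control[k]) - mean(control[-k])
    return treated_change - control_change

# Toy outcomes by relative period (illustrative values only).
treated = {-1: [10.0, 12.0], 1: [15.0, 17.0]}
control = {-1: [9.0, 11.0], 1: [11.0, 13.0]}
effect = symmetric_did(treated, control, k=1)
```

In this toy example the treated group's mean rises by 5 and the control group's by 2, so the estimate is 3.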

This nep-ecm issue is ©2017 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.