nep-ecm New Economics Papers
on Econometrics
Issue of 2019‒09‒09
twenty-one papers chosen by
Sune Karlsson
Örebro universitet

  1. A Doubly Corrected Robust Variance Estimator for Linear GMM By Jungbin Hwang; Byunghoon Kang; Seojeong Lee
  2. Bayesian Inference for Markov-switching Skewed Autoregressive Models By Stéphane Lhuissier
  3. Theory of Weak Identification in Semiparametric Models By Tetsuya Kaji
  4. Direct and indirect effects under sample selection and outcome attrition By Huber, Martin; Solovyeva, Anna
  5. Linear Quantile Regression and Endogeneity Correction By Christophe Muller
  6. Subsampling Sequential Monte Carlo for Static Bayesian Models By Gunawan, David; Dang, Khue-Dung; Quiroz, Matias; Kohn, Robert; Tran, Minh-Ngoc
  7. Bias and Consistency in Three-way Gravity Models By Martin Weidner; Thomas Zylkin
  8. Analyzing Commodity Futures Using Factor State-Space Models with Wishart Stochastic Volatility By Tore Selland Kleppe; Roman Liesenfeld; Guilherme Valle Moura; Atle Oglend
  9. Fourier transform MCMC, heavy tailed distributions and geometric ergodicity By Denis Belomestny; Leonid Iosipoi
  10. Nonparametric Analysis of Random Utility Models: Computational Tools for Statistical Testing By Bram De Rock; Laurens Cherchye; Bart Smeulders
  11. An introduction to flexible methods for policy evaluation By Huber, Martin
  12. A New Proposal of Applications of Statistical Depth Functions in Causal Analysis of Socio-Economic Phenomena Based on Official Statistics -- A Study of EU Agricultural Subsidies and Digital Development in Poland By Kosiorowski Daniel; Jerzy P. Rydlewski
  13. QCNN: Quantile Convolutional Neural Network By Gábor Petneházi
  14. Rethinking travel behavior modeling representations through embeddings By Francisco C. Pereira
  15. Predicting Returns With Text Data By Zheng Tracy Ke; Bryan T. Kelly; Dacheng Xiu
  16. A Simple Solution to the Problem of Independence of Irrelevant Alternatives in Choo and Siow Marriage Market Model By Gutierrez, Federico H.
  17. Vector Autoregressive Moving Average Model with Scalar Moving Average By Du Nguyen
  18. Robust Inference about Conditional Tail Features: A Panel Data Approach By Yuya Sasaki; Yulong Wang
  19. Does the Estimation of the Propensity Score by Machine Learning Improve Matching Estimation? The Case of Germany's Programmes for Long Term Unemployed By Goller, Daniel; Lechner, Michael; Moczall, Andreas; Wolff, Joachim
  20. A review of causal mediation analysis for assessing direct and indirect treatment effects By Huber, Martin
  21. A Review of Changepoint Detection Models By Yixiao Li; Gloria Lin; Thomas Lau; Ruochen Zeng

  1. By: Jungbin Hwang; Byunghoon Kang; Seojeong Lee
    Abstract: We propose a new finite-sample corrected variance estimator for linear generalized method of moments (GMM) estimators, including the one-step, two-step, and iterated estimators. Our formula additionally corrects for the over-identification bias in variance estimation on top of the commonly used finite-sample correction of Windmeijer (2005), which corrects for the bias from estimating the efficient weight matrix, and is thus doubly corrected. Formal stochastic expansions are derived to show that the proposed double correction estimates the variance of certain higher-order terms in the expansion. In addition, the proposed double correction provides robustness to misspecification of the moment condition, whereas the conventional variance estimator and the Windmeijer correction are inconsistent under misspecification. The doubly corrected formula thus provides a convenient way to obtain improved inference under correct specification and robustness against misspecification at the same time.
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1908.07821&r=all
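    For orientation, a minimal numpy sketch of the object being corrected: one-step and two-step linear GMM with the conventional (uncorrected) sandwich variance. The simulated design and all names are illustrative assumptions, not the authors' code.

      import numpy as np

      rng = np.random.default_rng(0)
      n = 500
      z = rng.normal(size=(n, 3))                       # 3 instruments, 1 regressor: over-identified
      x = z @ np.array([[1.0], [0.5], [0.2]]) + rng.normal(size=(n, 1))
      y = 2.0 * x + rng.normal(size=(n, 1))

      def gmm(W):                                       # linear GMM for a given weight matrix
          A = x.T @ z @ W @ z.T @ x
          return np.linalg.solve(A, x.T @ z @ W @ z.T @ y)

      b1 = gmm(np.linalg.inv(z.T @ z / n))              # one-step (2SLS weight)
      u = y - x * b1                                    # first-step residuals
      S = (z * u).T @ (z * u) / n                       # estimated efficient weight
      b2 = gmm(np.linalg.inv(S))                        # two-step GMM

      G = z.T @ x / n
      V = np.linalg.inv(G.T @ np.linalg.inv(S) @ G) / n # conventional variance: no Windmeijer or
      print(b2.item(), np.sqrt(V).item())               # over-identification correction applied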
  2. By: Stéphane Lhuissier
    Abstract: We examine Markov-switching autoregressive models where the commonly used Gaussian assumption for disturbances is replaced with a skew-normal distribution. This allows us to detect regime changes not only in the mean and the variance of a specified time series, but also in its skewness. A Bayesian framework is developed based on Markov chain Monte Carlo sampling. Our informative prior distributions lead to closed-form full conditional posterior distributions, whose sampling can be efficiently conducted within a Gibbs sampling scheme. The usefulness of the methodology is illustrated with a real-data example from U.S. stock markets.
    Keywords: Regime switching, Skewness, Gibbs-sampler, time series analysis, upside and downside risks.
    JEL: C01 C11 C2 G11
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:bfr:banfra:726&r=all
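    A minimal simulation of the data-generating process studied in this paper (two regimes differing in persistence, scale, and skewness), assuming scipy's skew-normal; this illustrates the model, not the authors' Gibbs sampler.

      import numpy as np
      from scipy.stats import skewnorm

      rng = np.random.default_rng(1)
      T = 300
      P = np.array([[0.95, 0.05],                          # regime transition probabilities
                    [0.10, 0.90]])
      phi, scale, a = [0.6, 0.2], [1.0, 2.0], [0.0, -4.0]  # AR, scale, skewness per regime

      s = np.zeros(T, dtype=int)
      y = np.zeros(T)
      for t in range(1, T):
          s[t] = rng.choice(2, p=P[s[t - 1]])
          eps = skewnorm.rvs(a[s[t]], scale=scale[s[t]], random_state=rng)
          y[t] = phi[s[t]] * y[t - 1] + eps                # regime 1 has left-skewed shocks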
  3. By: Tetsuya Kaji
    Abstract: We provide a general formulation of weak identification in semiparametric models and a corresponding efficiency concept. Weak identification occurs when a parameter is weakly regular, i.e., when it is locally homogeneous of degree zero. When this happens, consistent or equivariant estimation is shown to be impossible. We then show that there exists an underlying regular parameter that fully characterizes the weakly regular parameter. While this parameter is not unique, the concepts of sufficiency and minimality help pin down a desirable one. If estimation of a minimal sufficient underlying parameter is inefficient, it introduces noise into the corresponding estimation of the weakly regular parameter, whence we can improve the estimators by local asymptotic Rao-Blackwellization. We call an estimator weakly efficient if it does not admit such improvement. New weakly efficient estimators are presented for linear IV and nonlinear regression models. A simulation of a linear IV model demonstrates how the 2SLS and optimal IV estimators are improved.
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1908.10478&r=all
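    To see the phenomenon the theory addresses, a toy Monte Carlo with a near-zero first-stage coefficient, under which the just-identified IV estimator is poorly centered and heavy-tailed; the design is an assumption for illustration only.

      import numpy as np

      rng = np.random.default_rng(2)
      n, reps, pi = 200, 2000, 0.05                        # pi near zero: weak instrument
      est = np.empty(reps)
      for r in range(reps):
          z = rng.normal(size=n)
          u = rng.normal(size=n)
          x = pi * z + 0.8 * u + 0.6 * rng.normal(size=n)  # endogenous regressor
          y = 1.0 * x + u                                  # true coefficient is 1
          est[r] = (z @ y) / (z @ x)                       # just-identified IV estimate
      print(np.median(est), np.percentile(est, [5, 95]))   # dispersed and biased toward OLS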
  4. By: Huber, Martin; Solovyeva, Anna
    Abstract: This paper considers the evaluation of direct and indirect treatment effects, also known as mediation analysis, when outcomes are only observed for a subpopulation due to sample selection or outcome attrition. For identification, we combine sequential conditional independence assumptions on the assignment of the treatment and the mediator, i.e. the variable through which the indirect effect operates, with either selection on observables/missing at random or instrumental variable assumptions on the outcome attrition process. We derive expressions for the effects of interest that are based on inverse probability weighting by specific treatment, mediator, and/or selection propensity scores. We also provide a brief simulation study and an empirical illustration based on U.S. Project STAR data that assesses the direct effect and indirect effect (via absenteeism) of smaller kindergarten classes on math test scores.
    Keywords: Causal mechanisms; direct effects; indirect effects; causal channels; mediation analysis; causal pathways; sample selection; attrition; outcome nonresponse; inverse probability weighting; propensity score
    JEL: C21 I21
    Date: 2018–10–22
    URL: http://d.repec.org/n?u=RePEc:fri:fribow:fribow00496&r=all
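    The weighting idea can be sketched under sequential conditional independence, ignoring the paper's additional selection/attrition layer. The expressions below follow the standard IPW representation of mediation effects; the simulated data and logit propensity scores are assumptions.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(3)
      n = 5000
      x = rng.normal(size=(n, 1))                          # observed confounder
      d = (x[:, 0] + rng.normal(size=n) > 0).astype(int)   # treatment
      m = 0.5 * d + x[:, 0] + rng.normal(size=n)           # mediator
      y = d + 0.8 * m + x[:, 0] + rng.normal(size=n)       # outcome

      px = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]     # Pr(D=1|X)
      mx = np.column_stack([m, x])
      pmx = LogisticRegression().fit(mx, d).predict_proba(mx)[:, 1]  # Pr(D=1|M,X)

      ey11 = np.mean(y * d / px)                           # E[Y(1, M(1))]
      ey00 = np.mean(y * (1 - d) / (1 - px))               # E[Y(0, M(0))]
      ey10 = np.mean(y * d * (1 - pmx) / (pmx * (1 - px))) # E[Y(1, M(0))]
      print(ey11 - ey10, ey10 - ey00)                      # indirect (true 0.4), direct (true 1)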
  5. By: Christophe Muller (AMSE - Aix-Marseille Sciences Economiques - EHESS - École des hautes études en sciences sociales - AMU - Aix Marseille Université - ECM - Ecole Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique)
    Abstract: The two main methods of endogeneity correction for linear quantile regressions are reviewed and compared, along with their advantages and drawbacks. We then discuss opportunities for alleviating the constant-effect restriction of the fitted-value approach by relaxing identification conditions.
    Keywords: Two-Stage Estimation,Quantile Regression,Fitted-Value Approach,Endogeneity
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:hal:wpaper:halshs-02272874&r=all
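    A minimal sketch of the fitted-value approach reviewed here: project the endogenous regressor on the instrument, then run a quantile regression on the fitted values. The simulated design is an assumption, and the sketch embodies exactly the constant-effect restriction the paper discusses relaxing.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(4)
      n = 1000
      z = rng.normal(size=n)                        # instrument
      v = rng.normal(size=n)
      x = 0.8 * z + v                               # endogenous regressor
      y = 1.0 + 2.0 * x + v + rng.normal(size=n)    # v enters y: OLS/QR on x is biased

      x_hat = sm.OLS(x, sm.add_constant(z)).fit().fittedvalues   # stage 1
      res = sm.QuantReg(y, sm.add_constant(x_hat)).fit(q=0.5)    # stage 2 at the median
      print(res.params)                             # slope close to 2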
  6. By: Gunawan, David (School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).); Dang, Khue-Dung (School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).); Quiroz, Matias (School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS) and Research Division.); Kohn, Robert (School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).); Tran, Minh-Ngoc (ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS) and Discipline of Business Analytics, University)
    Abstract: We show how to speed up Sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from such as the prior and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel and this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate the usefulness of the methodology for estimating three generalized linear models and a generalized additive model with large datasets.
    Keywords: Hamiltonian Monte Carlo; Large datasets; Likelihood annealing
    JEL: C11 C15
    Date: 2019–04–01
    URL: http://d.repec.org/n?u=RePEc:hhs:rbnkwp:0371&r=all
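    The reweight/resample/move recursion is easy to state for a toy normal-mean model with full-data likelihood annealing (no subsampling, no Hamiltonian moves); everything below is an illustrative assumption rather than the authors' algorithm.

      import numpy as np

      rng = np.random.default_rng(5)
      data = rng.normal(1.0, 1.0, size=200)          # toy data: unknown mean, known variance

      def loglik(theta):                             # vectorized over particles
          return -0.5 * ((data[None, :] - theta[:, None]) ** 2).sum(axis=1)

      N = 1000
      theta = rng.normal(0.0, 3.0, size=N)           # particles from the N(0, 9) prior
      logw = np.zeros(N)
      temps = np.linspace(0.0, 1.0, 21)              # annealing schedule

      for a, b in zip(temps[:-1], temps[1:]):
          logw += (b - a) * loglik(theta)            # reweight
          w = np.exp(logw - logw.max()); w /= w.sum()
          theta = theta[rng.choice(N, size=N, p=w)]  # resample
          logw = np.zeros(N)
          logp = lambda t: b * loglik(t) - t ** 2 / 18.0   # tempered posterior (N(0,9) prior)
          for _ in range(2):                         # move: random-walk Metropolis
              prop = theta + theta.std() * rng.normal(size=N)
              acc = np.log(rng.uniform(size=N)) < logp(prop) - logp(theta)
              theta = np.where(acc, prop, theta)
      print(theta.mean(), theta.std())               # posterior mean near 1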
  7. By: Martin Weidner; Thomas Zylkin
    Abstract: We study the incidental parameter problem in "three-way" Poisson Pseudo-Maximum Likelihood ("PPML") gravity models recently recommended for identifying the effects of trade policies. Despite the number and variety of fixed effects this model entails, we confirm it is consistent for small $T$ and we show it is in fact the only estimator among a wide range of PML gravity estimators that is generally consistent in this context when $T$ is small. At the same time, asymptotic confidence intervals in fixed-$T$ panels are not correctly centered at the true point estimates, and cluster-robust variance estimates used to construct standard errors are generally biased as well. We characterize each of these biases analytically and show both numerically and empirically that they are salient even for real-data settings with a large number of countries. We also offer practical remedies that can be used to obtain more reliable inferences of the effects of trade policies and other time-varying gravity variables.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.01327&r=all
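    A toy three-way PPML run, with dummy-encoded exporter-time, importer-time, and pair fixed effects and standard errors clustered by pair (the objects whose biases the paper characterizes). The simulated trade data are assumptions; dedicated routines such as ppmlhdfe absorb the fixed effects far more efficiently at realistic scale.

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(6)
      rows = []
      for i in range(10):                            # exporters
          for j in range(10):                        # importers
              if i == j:
                  continue
              for t in range(5):                     # years
                  pol = float(rng.random() < 0.3)
                  mu = np.exp(1.0 + 0.5 * pol + 0.1 * i - 0.05 * j + 0.05 * t)
                  rows.append(dict(it=f"{i}.{t}", jt=f"{j}.{t}", ij=f"{i}.{j}",
                                   policy=pol, trade=rng.poisson(mu)))
      df = pd.DataFrame(rows)

      m = smf.glm("trade ~ policy + C(it) + C(jt) + C(ij)", data=df,
                  family=sm.families.Poisson()).fit(
          cov_type="cluster", cov_kwds={"groups": df["ij"]})
      print(m.params["policy"], m.bse["policy"])     # point estimate and clustered SE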
  8. By: Tore Selland Kleppe; Roman Liesenfeld; Guilherme Valle Moura; Atle Oglend
    Abstract: We propose a factor state-space approach with stochastic volatility to model and forecast the term structure of futures contracts on commodities. Our approach builds upon the dynamic 3-factor Nelson-Siegel model and its 4-factor Svensson extension and assumes for the latent level, slope and curvature factors a Gaussian vector autoregression with a multivariate Wishart stochastic volatility process. Exploiting the conjugacy of the Wishart and the Gaussian distribution, we develop a computationally fast and easy-to-implement MCMC algorithm for the Bayesian posterior analysis. An empirical application to daily prices for contracts on crude oil with stipulated delivery dates ranging from one to 24 months ahead shows that the estimated 4-factor Svensson model with two curvature factors provides a good parsimonious representation of the serial correlation in the individual prices and their volatility, and that the model has a good out-of-sample forecast performance.
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1908.07798&r=all
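    For reference, the Nelson-Siegel loading structure underlying the latent factors, fitted cross-sectionally by least squares; the decay parameter and the simulated curve are assumptions, and the paper's Svensson variant adds a second curvature term.

      import numpy as np

      def ns_loadings(tau, lam=0.5):
          """Nelson-Siegel loadings for level, slope and curvature at maturities tau."""
          x = lam * tau
          slope = (1 - np.exp(-x)) / x
          return np.column_stack([np.ones_like(tau), slope, slope - np.exp(-x)])

      tau = np.arange(1, 25, dtype=float)            # delivery horizons: 1 to 24 months
      beta_true = np.array([4.0, -1.0, 0.5])         # level, slope, curvature
      y = ns_loadings(tau) @ beta_true \
          + 0.02 * np.random.default_rng(7).normal(size=tau.size)
      beta_hat, *_ = np.linalg.lstsq(ns_loadings(tau), y, rcond=None)
      print(beta_hat)                                # recovers the three factors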
  9. By: Denis Belomestny; Leonid Iosipoi
    Abstract: Markov chain Monte Carlo (MCMC) methods have become increasingly popular in applied mathematics as a tool for numerical integration with respect to complex and high-dimensional distributions. However, the application of MCMC methods to heavy-tailed distributions and distributions with analytically intractable densities turns out to be rather problematic. In this paper, we propose a novel approach to the use of MCMC algorithms for distributions with analytically known Fourier transforms and, in particular, heavy-tailed distributions. The main idea of the proposed approach is to use MCMC methods in the Fourier domain to sample from a density proportional to the absolute value of the underlying characteristic function. A subsequent application of Parseval's formula leads to an efficient algorithm for the computation of integrals with respect to the underlying density. We show that the resulting Markov chain in the Fourier domain may be geometrically ergodic even in the case of heavy-tailed original distributions. We illustrate our approach with several numerical examples, including multivariate elliptically contoured stable distributions.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.00698&r=all
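    The core idea in two steps for a symmetric stable law, whose characteristic function exp(-|u|^alpha) is known in closed form while its density is not: run Metropolis in the Fourier domain on a target proportional to |phi(u)|, then recover density values through the inversion/Parseval identity. Tuning constants are assumptions.

      import numpy as np
      from math import gamma, pi

      alpha = 1.5                                    # heavy-tailed symmetric stable
      phi = lambda u: np.exp(-np.abs(u) ** alpha)    # characteristic function (positive here)

      rng = np.random.default_rng(8)
      u, chain = 0.0, []
      for _ in range(50000):                         # random-walk Metropolis on |phi|
          prop = u + rng.normal()
          if rng.uniform() < phi(prop) / phi(u):
              u = prop
          chain.append(u)
      draws = np.array(chain[5000:])

      Z = 2 * gamma(1 + 1 / alpha)                   # integral of phi over the real line
      for x in (0.0, 1.0, 2.0):                      # inversion: p(x) = Z/(2 pi) E[cos(u x)]
          print(x, Z / (2 * pi) * np.cos(draws * x).mean())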
  10. By: Bram De Rock; Laurens Cherchye; Bart Smeulders
    Abstract: Kitamura and Stoye (2018) recently proposed a nonparametric statistical test for random utility models of consumer behavior. The test is formulated in terms of linear inequality constraints and a quadratic objective function. While the nonparametric test is conceptually appealing, its practical implementation is computationally challenging. In this note, we develop a column generation approach to operationalize the test. We show that these novel computational tools generate considerable computational gains in practice, which substantially increases the empirical usefulness of Kitamura and Stoye’s statistical test.
    Keywords: computational tools; statistical testing
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:eca:wpaper:2013/292215&r=all
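    The statistic being computed has the flavor of projecting estimated choice probabilities onto the cone spanned by rationalizable choice types. A toy version with a small enumerated (assumed) type matrix is below; the paper's column generation constructs such columns on demand instead of enumerating them.

      import numpy as np
      from scipy.optimize import nnls

      # columns of A: choice patterns of rationalizable "types" (toy example)
      A = np.array([[1, 0, 1],
                    [0, 1, 0],
                    [1, 1, 0],
                    [0, 0, 1]], dtype=float)
      pi_hat = np.array([0.6, 0.4, 0.7, 0.3])        # estimated choice probabilities

      lam, resid = nnls(A, pi_hat)                   # min ||pi_hat - A lam|| s.t. lam >= 0
      print(resid ** 2)                              # KS-type statistic, up to weighting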
  11. By: Huber, Martin
    Abstract: This chapter covers different approaches to policy evaluation for assessing the causal effect of a treatment or intervention on an outcome of interest. As an introduction to causal inference, the discussion starts with the experimental evaluation of a randomized treatment. It then reviews evaluation methods based on selection on observables (assuming a quasi-random treatment given observed covariates), instrumental variables (inducing a quasi-random shift in the treatment), difference-in-differences and changes-in-changes (exploiting changes in outcomes over time), as well as regression discontinuities and kinks (using changes in the treatment assignment at some threshold of a running variable). The chapter discusses methods particularly suited to data with many observations, permitting flexible (i.e. semi- or nonparametric) modeling of treatment effects, and/or many (i.e. high-dimensional) observed covariates, handled by applying machine learning to select and control for covariates in a data-driven way. This is useful not only for tackling confounding, by controlling for instance for factors jointly affecting the treatment and the outcome, but also for learning effect heterogeneities across subgroups defined by observable covariates and optimally targeting those groups for which the treatment is most effective.
    Keywords: Policy evaluation; treatment effects; machine learning; experiment; selection on observables; instrument; difference-in-differences; changes-in-changes; regression discontinuity design; regression kink design
    JEL: C21 C26 C29
    Date: 2019–08–12
    URL: http://d.repec.org/n?u=RePEc:fri:fribow:fribow00504&r=all
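    One of the approaches surveyed (selection on observables with machine-learning-based covariate control) in a minimal cross-fitted partialling-out sketch; the random-forest learners and the simulated data are illustrative assumptions.

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.model_selection import cross_val_predict

      rng = np.random.default_rng(9)
      n = 2000
      X = rng.normal(size=(n, 10))                               # observed covariates
      d = (X[:, 0] + rng.normal(size=n) > 0).astype(float)       # confounded treatment
      y = 1.0 * d + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) # true effect is 1

      d_hat = cross_val_predict(RandomForestRegressor(n_estimators=200), X, d, cv=5)
      y_hat = cross_val_predict(RandomForestRegressor(n_estimators=200), X, y, cv=5)

      dr, yr = d - d_hat, y - y_hat                  # residualize both on covariates
      print((dr @ yr) / (dr @ dr))                   # partialling-out effect estimate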
  12. By: Kosiorowski Daniel; Jerzy P. Rydlewski
    Abstract: Results of convincing causal statistical inference on socio-economic phenomena are treated as an especially desirable basis for conducting socio-economic programs or government interventions. Unfortunately, real socio-economic issues quite often do not fulfill the restrictive assumptions of the causal-analysis procedures proposed in the literature. This paper indicates certain empirical challenges and conceptual opportunities related to applying the data depth concept in causal inference on socio-economic phenomena. We show how to apply statistical functional depths in order to indicate the factual and counterfactual distributions commonly used within procedures of causal inference. The presented framework is especially useful for causal inference based on official statistics, i.e., on already existing databases. Methodological considerations related to the extremal depth, the modified band depth, the Fraiman-Muniz depth, and the multivariate Wilcoxon rank sum statistic are illustrated by means of an example studying the impact of EU direct agricultural subsidies on digital development in Poland over the period 2012-2019.
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1908.11099&r=all
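    One of the depths named above, the modified band depth (order 2), in a short self-contained implementation; the random-walk sample is an assumption for illustration.

      import numpy as np
      from itertools import combinations

      def mbd(curves):
          """Modified band depth (J=2) for the rows of curves (n curves x T points)."""
          n, _ = curves.shape
          depth = np.zeros(n)
          for j, k in combinations(range(n), 2):
              lo = np.minimum(curves[j], curves[k])
              hi = np.maximum(curves[j], curves[k])
              depth += ((curves >= lo) & (curves <= hi)).mean(axis=1)
          return depth / (n * (n - 1) / 2)

      rng = np.random.default_rng(10)
      sample = rng.normal(size=(20, 50)).cumsum(axis=1)   # 20 random-walk curves
      print(mbd(sample).round(2))                          # central curves score higher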
  13. By: Gábor Petneházi
    Abstract: A dilated causal one-dimensional convolutional neural network architecture is proposed for quantile regression. The model can forecast arbitrary quantiles and can be trained jointly on multiple similar time series. An application to Value at Risk forecasting shows that QCNN outperforms linear quantile regression and constant quantile estimates.
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1908.07978&r=all
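    A minimal dilated causal 1-D CNN trained with the pinball (quantile) loss, which is the essential ingredient here; the architecture sizes, toy target, and names are assumptions, not the paper's exact network.

      import torch
      import torch.nn as nn

      class QCNN(nn.Module):
          """Stacked dilated causal convolutions; left-padding preserves causality."""
          def __init__(self, channels=16, levels=4):
              super().__init__()
              layers, in_ch = [], 1
              for i in range(levels):
                  d = 2 ** i
                  layers += [nn.ConstantPad1d((d, 0), 0.0),
                             nn.Conv1d(in_ch, channels, kernel_size=2, dilation=d),
                             nn.ReLU()]
                  in_ch = channels
              self.body = nn.Sequential(*layers)
              self.head = nn.Conv1d(channels, 1, kernel_size=1)

          def forward(self, x):
              return self.head(self.body(x))

      def pinball(pred, target, q=0.05):             # quantile ("pinball") loss
          e = target - pred
          return torch.maximum(q * e, (q - 1) * e).mean()

      x = torch.randn(8, 1, 100)                     # batch of 8 univariate series
      y = torch.roll(x, -1, dims=-1)                 # toy target: next value
      model = QCNN()
      opt = torch.optim.Adam(model.parameters(), lr=1e-3)
      for _ in range(50):
          opt.zero_grad()
          loss = pinball(model(x), y)                # fits the 5th percentile
          loss.backward()
          opt.step()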
  14. By: Francisco C. Pereira
    Abstract: This paper introduces the concept of travel behavior embeddings, a method for re-representing discrete variables that are typically used in travel demand modeling, such as mode, trip purpose, education level, family type or occupation. This re-representation process essentially maps those variables into a latent space called the embedding space. The benefit of this is that such spaces allow for richer nuances than the typical transformations used for categorical variables (e.g. dummy encoding, contrast encoding, principal components analysis). While the use of latent variable representations is not new per se in travel demand modeling, the idea presented here brings several innovations: it is an entirely data-driven algorithm; it is informative and consistent, since the latent space can be visualized and interpreted based on distances between different categories; it preserves the interpretability of coefficients, despite being based on neural network principles; and it is transferable, in that embeddings learned from one dataset can be reused for others, as long as travel behavior remains consistent across the datasets. The idea is strongly inspired by natural language processing techniques, namely the word2vec algorithm, which underlies recent developments in automatic translation and next-word prediction. Our method is demonstrated using a mode choice model, and shows improvements of up to 60% with respect to the initial likelihood, and up to 20% with respect to the likelihood of the corresponding traditional model (i.e. using dummy variables) in out-of-sample evaluation. We provide a new Python package, called PyTre (PYthon TRavel Embeddings), that others can straightforwardly use to replicate our results or improve their own models. Our experiments are themselves based on an open dataset (swissmetro).
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.00154&r=all
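    The core mechanic, a learned low-dimensional embedding of a categorical travel variable feeding a choice model, in a minimal PyTorch sketch; the dimensions and the toy purpose-to-mode relation are assumptions, and the paper's PyTre package is the reference implementation.

      import torch
      import torch.nn as nn

      n_purpose, n_modes, dim = 10, 4, 3
      model = nn.Sequential(nn.Embedding(n_purpose, dim),  # category -> latent vector
                            nn.Linear(dim, n_modes))       # latent vector -> mode utilities
      opt = torch.optim.Adam(model.parameters(), lr=0.05)
      loss_fn = nn.CrossEntropyLoss()

      purpose = torch.randint(0, n_purpose, (512,))
      mode = purpose % n_modes                       # toy dependence of mode on purpose
      for _ in range(300):
          opt.zero_grad()
          loss = loss_fn(model(purpose), mode)
          loss.backward()
          opt.step()
      print(model[0].weight.detach())                # rows: interpretable embedding vectors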
  15. By: Zheng Tracy Ke; Bryan T. Kelly; Dacheng Xiu
    Abstract: We introduce a new text-mining methodology that extracts sentiment information from news articles to predict asset returns. Unlike more common sentiment scores used for stock return prediction (e.g., those sold by commercial vendors or built with dictionary-based methods), our supervised learning framework constructs a sentiment score that is specifically adapted to the problem of return prediction. Our method proceeds in three steps: 1) isolating a list of sentiment terms via predictive screening, 2) assigning sentiment weights to these words via topic modeling, and 3) aggregating terms into an article-level sentiment score via penalized likelihood. We derive theoretical guarantees on the accuracy of estimates from our model with minimal assumptions. In our empirical analysis, we text-mine one of the most actively monitored streams of news articles in the financial system—the Dow Jones Newswires—and show that our supervised sentiment model excels at extracting return-predictive signals in this context.
    JEL: C53 C58 G10 G11 G12 G14 G17
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:26186&r=all
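    Step 1 of the three-step procedure, predictive screening, in a toy version: for each word, compute the fraction of articles containing it that carry a positive return, and keep words whose fraction is far from 1/2. The simulated document-term matrix is an assumption.

      import numpy as np

      rng = np.random.default_rng(11)
      n_docs, vocab = 2000, 500
      counts = rng.poisson(0.2, size=(n_docs, vocab))      # toy document-term matrix
      signal = counts[:, 0] - counts[:, 1]                 # words 0 and 1 drive returns
      ret = signal + rng.normal(scale=2.0, size=n_docs)

      appears = counts > 0
      f = (appears & (ret > 0)[:, None]).sum(0) / np.maximum(appears.sum(0), 1)
      charged = np.argsort(np.abs(f - 0.5))[::-1][:20]     # screened sentiment terms
      print(charged[:5])                                   # words 0 and 1 rank near the top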
  16. By: Gutierrez, Federico H.
    Abstract: This paper proposes a simple solution to the independence of irrelevant alternatives (IIA) problem in the Choo and Siow (2006) model, overcoming what is probably the main limitation of this approach. The solution consists of assuming match-specific rather than choice-specific random preferences. The original marriage matching function is modified by an adjustment factor that improves its empirical properties. Using the American Community Survey, I show that the new approach yields significantly different results, affecting the qualitative conclusions of the analysis. The proposed solution to the IIA problem applies to other settings in which the relative "supply" of choices is observable.
    Keywords: Independence of irrelevant alternatives, marriage market, transferable utility
    JEL: J12 J16 J10
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:zbw:glodps:387&r=all
  17. By: Du Nguyen
    Abstract: We show that Vector Autoregressive Moving Average models with scalar moving average components can be estimated by generalized least squares (GLS) for each fixed moving average polynomial. The conditional variance of the GLS model is the concentrated covariance matrix of the moving average process. Under GLS, the likelihood function of these models has a format similar to that of their VAR counterparts. Maximum likelihood estimation can be carried out by gradient-based optimization over the moving average parameters. These models are inexpensive generalizations of Vector Autoregressive models. We discuss a relationship between this result and the Borodin-Okounkov formula in operator theory.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.00386&r=all
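    Why the scalar MA case is tractable: a scalar lag polynomial commutes with the VAR matrices, so filtering the data by (1 + theta L)^{-1} turns the model into a VAR that least squares handles, and theta can be profiled out. A grid-search sketch on simulated data (all constants assumed):

      import numpy as np
      from scipy.signal import lfilter

      rng = np.random.default_rng(12)
      T, k, theta_true = 400, 2, 0.6
      A = np.array([[0.5, 0.1], [0.0, 0.4]])
      eps = rng.normal(size=(T, k))
      y = np.zeros((T, k))
      for t in range(1, T):
          y[t] = y[t - 1] @ A.T + eps[t] + theta_true * eps[t - 1]

      def profile_rss(theta):
          yt = lfilter([1.0], [1.0, theta], y, axis=0)     # invert the scalar MA filter
          X, Y = yt[:-1], yt[1:]
          B, *_ = np.linalg.lstsq(X, Y, rcond=None)        # VAR(1) by OLS given theta
          return ((Y - X @ B) ** 2).sum()

      grid = np.linspace(-0.9, 0.9, 37)
      print(grid[np.argmin([profile_rss(th) for th in grid])])   # near 0.6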
  18. By: Yuya Sasaki; Yulong Wang
    Abstract: We develop a new extreme value theory for panel data and use it to construct asymptotically valid confidence intervals (CIs) for conditional tail features such as the conditional extreme quantile and the conditional tail index. As a by-product, we also construct CIs for tail features of the coefficients in the random coefficient regression model. The new CIs are robustly valid without parametric assumptions and have excellent small-sample coverage and length properties. Applying the proposed method, we study the tail risk of monthly U.S. stock returns and find that (i) the left-tail features of stock returns and those of the Fama-French regression residuals depend heavily on other stock characteristics, such as stock size; and (ii) the alphas and betas are strongly heterogeneous across stocks in the Fama-French regression. These findings suggest that the Fama-French model is insufficient to characterize the tail behavior of stock returns.
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.00294&r=all
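    For orientation, the classical Hill estimator of a tail index for a single series; the paper's panel-data CIs go well beyond this. The Pareto sample and the choice of k are assumptions.

      import numpy as np

      def hill(x, k):
          """Hill estimator of the tail index from the k largest order statistics."""
          xs = np.sort(x)
          return 1.0 / np.mean(np.log(xs[-k:] / xs[-k - 1]))

      rng = np.random.default_rng(13)
      x = rng.pareto(3.0, size=5000) + 1.0           # exact Pareto tail, index 3
      print(hill(x, 200))                            # close to 3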
  19. By: Goller, Daniel (University of St. Gallen); Lechner, Michael (University of St. Gallen); Moczall, Andreas (Institute for Employment Research (IAB), Nuremberg); Wolff, Joachim (Institute for Employment Research (IAB), Nuremberg)
    Abstract: Matching-type estimators using the propensity score are the major workhorse in active labour market policy evaluation. This work investigates whether machine learning algorithms for estimating the propensity score lead to more credible estimation of average treatment effects on the treated using a radius matching framework. Considering two popular methods, the results are ambiguous: we find that using LASSO-based logit models to estimate the propensity score delivers more credible results than conventional methods in small and medium-sized high-dimensional datasets. However, using Random Forests to estimate the propensity score may worsen performance in situations with a low treatment share. The application reveals a positive effect of the training programme on days in employment for the long-term unemployed. While the choice of the "first stage" is highly relevant in settings with a low number of observations and few treated, machine learning and conventional estimation become more similar in larger samples with higher treatment shares.
    Keywords: programme evaluation, active labour market policy, causal machine learning, treatment effects, radius matching, propensity score
    JEL: J68 C21
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp12526&r=all
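    A schematic of the pipeline studied: an L1-penalized logit "first stage" for the propensity score, followed by radius matching on it. The learners, radius, and simulated data are assumptions, and the paper's radius matching framework is considerably more refined.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(14)
      n = 3000
      X = rng.normal(size=(n, 20))                   # many covariates, few relevant
      d = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)
      y = 1.0 * d + X[:, 0] + rng.normal(size=n)     # true ATT is 1

      ps = LogisticRegression(penalty="l1", C=0.1, solver="liblinear").fit(X, d)
      p = ps.predict_proba(X)[:, 1]                  # LASSO-type propensity score

      radius, att = 0.02, []
      for i in np.where(d == 1)[0]:                  # radius matching on the score
          controls = np.where((d == 0) & (np.abs(p - p[i]) <= radius))[0]
          if controls.size:
              att.append(y[i] - y[controls].mean())
      print(np.mean(att))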
  20. By: Huber, Martin
    Abstract: Mediation analysis aims at evaluating the causal mechanisms through which a treatment or intervention affects an outcome of interest. The goal is to disentangle the total treatment effect into an indirect effect operating through one or several observed intermediate variables, the so-called mediators, as well as a direct effect reflecting any impact not captured by the observed mediator(s). This paper reviews methodological advancements with a particular focus on applications in economics. It defines the parameters of interest, covers various identification strategies, e.g. based on control variables or instruments, and presents sensitivity checks. Furthermore, it discusses several extensions of the standard mediation framework, such as multivalued treatments, mismeasured mediators, and outcome attrition.
    Keywords: Mediation; direct effect; indirect effect; sequential conditional independence; instrument
    JEL: C21
    Date: 2019–01–01
    URL: http://d.repec.org/n?u=RePEc:fri:fribow:fribow00500&r=all
  21. By: Yixiao Li; Gloria Lin; Thomas Lau; Ruochen Zeng
    Abstract: The objective of change-point detection is to discover the abrupt property changes lying behind time-series data. In this paper, we first summarize the definition and in-depth implications of change-point detection. We then describe traditional and some alternative model-based change-point detection algorithms. Finally, we go a bit further into the theory and point to future research directions.
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1908.07136&r=all
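    The simplest member of the reviewed family, a CUSUM statistic for a single mean shift, makes the setting concrete; the simulated series is an assumption.

      import numpy as np

      rng = np.random.default_rng(15)
      y = np.concatenate([rng.normal(0.0, 1.0, 150),  # mean shifts at t = 150
                          rng.normal(1.2, 1.0, 150)])

      n = y.size
      S = np.cumsum(y)
      k = np.arange(1, n)
      stat = np.abs(S[:-1] - k / n * S[-1]) / (y.std() * np.sqrt(n))
      print(k[np.argmax(stat)], stat.max())           # estimated change point and statistic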

This nep-ecm issue is ©2019 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.