nep-ecm New Economics Papers
on Econometrics
Issue of 2018‒01‒08
twenty-one papers chosen by
Sune Karlsson
Örebro universitet

  1. Estimation and Inference in Mixed Fixed and Random Coefficient Panel Data Models By Andrea Nocera
  2. Variational Bayes Estimation of Time Series Copulas for Multivariate Ordinal and Mixed Data By Ruben Loaiza-Maya; Michael Stanley Smith
  3. Asymptotically Distribution-Free Goodness-of-Fit Testing for Copulas By Can, S.U.; Einmahl, John; Laeven, R.J.A.
  4. New unit root tests with two smooth breaks and nonlinear adjustment By Hepsag, Aycan
  5. Normality Tests for Dependent Data By Zacharias Psaradakis; Marian Vavra
  6. Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments By Victor Chernozhukov; Mert Demirer; Esther Duflo; Ivan Fernandez-Val
  7. Markov-Switching Models with State-Dependent Time-Varying Transition Probabilities By Zacharias Psaradakis; Martin Sola
  8. Endogenous Variables in Binary Choice Models: Some Insights for Practitioners By Bontemps, Christophe; Nauges, Céline
  9. Transformation Models in High-Dimensions By Sven Klaassen; Jannis Kueck; Martin Spindler
  10. Simultaneous Confidence Intervals for High-dimensional Linear Models with Many Endogenous Variables By Alexandre Belloni; Victor Chernozhukov; Christian Hansen; Whitney Newey
  11. Causes and Effects of Negative Definite Covariance Matrices in Swamy Type Random Coefficient Models By Andrea Nocera
  12. An alternative single parameter functional form for Lorenz curve By Satya Paul; Sriram Shankar
  13. The Bunching Estimator Cannot Identify the Taxable Income Elasticity By Sören Blomquist; Whitney K. Newey
  14. Estimating Engel curves: A new way to improve the SILC-HBS matching process By Julio López-Laborda; Carmen Marín-González; Jorge Onrubia
  15. Machine Learning for Partial Identification: Example of Bracketed Data By Vira Semenova
  16. Learning Objectives for Treatment Effect Estimation By Xinkun Nie; Stefan Wager
  17. Non-discriminatory Trade Policies in Structural Gravity Models. Evidence from Monte Carlo Simulations By Sellner, Richard
  18. Semiparametric inference for non-LAN models By Zhou, Bo
  19. An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls By Victor Chernozhukov; Kaspar Wuthrich; Yinchu Zhu
  20. Large-scale portfolio allocation under transaction costs and model uncertainty By Hautsch, Nikolaus; Voigt, Stefan
  21. Relative efficiency of confidence interval methods around effect sizes By Doll, Monika

  1. By: Andrea Nocera (Birkbeck, University of London)
    Abstract: In this paper, we propose to implement the EM algorithm to compute restricted maximum likelihood estimates of the average effects, the unit-specific coefficients, and the variance components in a wide class of heterogeneous panel data models. Compared to existing methods, our approach leads to unbiased and more efficient estimation of the variance components of the model without running into the problem of negative definite covariance matrices typically encountered in random coefficient models. This in turn leads to more accurate estimated standard errors and hypothesis tests. Monte Carlo simulations reveal that the proposed estimator has relatively good finite-sample properties. In evaluating the merits of our method, we also provide an overview of the sampling and Bayesian methods commonly used to estimate heterogeneous panel data models. A novel approach to investigating heterogeneity in the sensitivity of sovereign spreads to government debt is presented.
    Keywords: EM algorithm, restricted maximum likelihood, correlated random coefficient models, heterogeneous panels, debt intolerance, sovereign credit spreads.
    JEL: C13 C23 C63 F34 G15 H63
    Date: 2017–06
  2. By: Ruben Loaiza-Maya; Michael Stanley Smith
    Abstract: We propose a new variational Bayes method for estimating high-dimensional copulas with discrete, or discrete and continuous, margins. The method is based on a variational approximation to a tractable augmented posterior, and is substantially faster than previous likelihood-based approaches. We use it to estimate drawable vine copulas for univariate and multivariate Markov ordinal and mixed time series. These have dimension $rT$, where $T$ is the number of observations and $r$ is the number of series, and are difficult to estimate using previous methods. The vine pair-copulas are carefully selected to allow for heteroskedasticity, which is a common feature of ordinal time series data. When combined with flexible margins, the resulting time series models also allow for other common features of ordinal data, such as zero inflation, multiple modes and under- or over-dispersion. Using data on homicides in New South Wales, and also U.S. bankruptcies, we illustrate both the flexibility of the time series copula models, and the efficacy of the variational Bayes estimator for copulas of up to 792 dimensions and 60 parameters. This far exceeds the size and complexity of copula models for discrete data that can be estimated using previous methods.
    Date: 2017–12
  3. By: Can, S.U.; Einmahl, John (Tilburg University, Center For Economic Research); Laeven, R.J.A.
    Abstract: Consider a random sample from a continuous multivariate distribution function F with copula C. In order to test the null hypothesis that C belongs to a certain parametric family, we construct a process that is asymptotically distribution-free under the null and serves as a test generator. The process is a transformation of the difference between a semi-parametric and a parametric estimator of C. This transformed empirical process converges weakly to a standard multivariate Wiener process, paving the way for a multitude of powerful asymptotically distribution-free goodness-of-fit tests for copula families. We investigate the finite-sample performance of our approach through a simulation study and illustrate its applicability with a data analysis.
    Keywords: Khmaladze transform; copula estimation; empirical process
    JEL: C12 C14
    Date: 2017
  4. By: Hepsag, Aycan
    Abstract: This paper proposes three new unit root testing procedures that jointly allow for two structural breaks and nonlinear adjustment. The structural breaks are modelled by means of two logistic smooth transition functions, and nonlinear adjustment is modelled by means of ESTAR models. Monte Carlo experiments show that the empirical sizes of the tests are quite close to the nominal ones and that, in terms of power, the three new unit root tests are superior to the alternative tests. An empirical application involving crude oil underlines the usefulness of the new unit root tests.
    Keywords: Smooth breaks, nonlinearity, unit root, ESTAR
    JEL: C12 C22
    Date: 2017–12–19
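    For intuition, the building blocks described in the abstract (two logistic smooth transition functions for the breaks, and a KSS/ESTAR-type nonlinear adjustment test on the detrended series) can be sketched as below. This is a hypothetical illustration, not the paper's procedure: the transition parameters are fixed rather than estimated, the auxiliary regression omits lag augmentation, and critical values must be taken from the paper.

```python
import numpy as np

def logistic_transition(T, gamma, tau):
    # Smooth break: S_t = 1 / (1 + exp(-gamma * (t - tau * T))), t = 1..T
    t = np.arange(1, T + 1)
    return 1.0 / (1.0 + np.exp(-gamma * (t - tau * T)))

def estar_unit_root_tstat(y, gamma1=2.0, tau1=0.3, gamma2=2.0, tau2=0.7):
    # Step 1: detrend on a constant plus two smooth-transition terms.
    y = np.asarray(y, float)
    T = len(y)
    X = np.column_stack([np.ones(T),
                         logistic_transition(T, gamma1, tau1),
                         logistic_transition(T, gamma2, tau2)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    v = y - X @ beta
    # Step 2: KSS-type auxiliary regression dv_t = delta * v_{t-1}^3 + error;
    # the t-statistic on delta is the unit root test statistic.
    dv, v3 = np.diff(v), v[:-1] ** 3
    delta = (v3 @ dv) / (v3 @ v3)
    resid = dv - delta * v3
    se = np.sqrt((resid @ resid) / (len(dv) - 1) / (v3 @ v3))
    return delta / se
```

A strongly negative t-statistic (relative to the paper's simulated critical values) would point towards nonlinear stationarity around the smoothly breaking trend.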
  5. By: Zacharias Psaradakis (University of London); Marian Vavra (National Bank of Slovakia)
    Abstract: The paper considers the problem of testing for normality of the one-dimensional marginal distribution of a strictly stationary and weakly dependent stochastic process. The possibility of using an autoregressive sieve bootstrap procedure to obtain critical values and P-values for normality tests is explored. The small-sample properties of a variety of tests are investigated in an extensive set of Monte Carlo experiments. The bootstrap version of the classical skewness–kurtosis test is shown to have the best overall performance in small samples.
    Keywords: Autoregressive sieve bootstrap, Normality test, Weak dependence
    JEL: C12 C15 C32
    Date: 2017–12
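    The autoregressive sieve bootstrap explored in the paper is straightforward to sketch: fit a finite-order AR approximation, resample its centred residuals, rebuild pseudo-series, and recompute the skewness–kurtosis statistic on each. A minimal illustration (the fixed AR order and the plain skewness–kurtosis statistic are simplifying assumptions; the paper studies data-driven orders and a variety of tests):

```python
import numpy as np

def sk_stat(x):
    # Classical skewness-kurtosis (Jarque-Bera type) normality statistic.
    z = np.asarray(x, float) - np.mean(x)
    s = z.std()
    skew = np.mean(z ** 3) / s ** 3
    kurt = np.mean(z ** 4) / s ** 4
    return len(z) * (skew ** 2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)

def ar_sieve_bootstrap_pvalue(x, p=2, B=199, burn=50, seed=None):
    # Fit AR(p) by least squares, resample the centred residuals,
    # regenerate bootstrap series, and return the bootstrap p-value.
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    n = len(x)
    X = np.column_stack([np.ones(n - p)] +
                        [x[p - k - 1:n - k - 1] for k in range(p)])
    beta, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    resid = x[p:] - X @ beta
    resid -= resid.mean()
    s_obs, hits = sk_stat(x), 0
    for _ in range(B):
        e = rng.choice(resid, size=n + burn, replace=True)
        xb = np.zeros(n + burn)
        for t in range(p, n + burn):
            xb[t] = beta[0] + beta[1:] @ xb[t - p:t][::-1] + e[t]
        if sk_stat(xb[burn:]) >= s_obs:
            hits += 1
    return (1 + hits) / (1 + B)
```

The bootstrap distribution of the statistic reflects the fitted dependence structure, which is why it yields better-calibrated critical values than the chi-squared limit under weak dependence.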
  6. By: Victor Chernozhukov; Mert Demirer; Esther Duflo; Ivan Fernandez-Val
    Abstract: We propose strategies to estimate and make inference on key features of heterogeneous effects in randomized experiments. These key features include best linear predictors of the effects using machine learning proxies, average effects sorted by impact groups, and average characteristics of most and least impacted units. The approach is valid in high-dimensional settings, where the effects are proxied by machine learning methods. We post-process these proxies into estimates of the key features. Our approach is agnostic about the properties of the machine learning estimators used to produce the proxies, and it completely avoids making strong assumptions about them. Estimation and inference rely on repeated data splitting to avoid overfitting and achieve validity. Our variational inference method is shown to be uniformly valid and quantifies the uncertainty coming from both parameter estimation and data splitting. In essence, we take medians of p-values and medians of confidence intervals, resulting from many different data splits, and then adjust their nominal level to guarantee uniform validity. The inference method could be of substantial independent interest in many machine learning applications. Empirical applications illustrate the use of the approach.
    Date: 2017–12
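    The split-aggregation step described in the abstract can be sketched in a few lines, assuming each of the S splits has already produced a p-value and a confidence interval computed at the adjusted level (the function name and interface below are illustrative, not the authors'):

```python
import numpy as np

def aggregate_splits(pvalues, intervals):
    # Medians across data splits; doubling the median p-value accounts
    # for the extra randomness of the splitting and restores validity.
    p_adj = min(1.0, 2.0 * float(np.median(pvalues)))
    lower = float(np.median([lo for lo, hi in intervals]))
    upper = float(np.median([hi for lo, hi in intervals]))
    return p_adj, (lower, upper)

p, ci = aggregate_splits([0.01, 0.02, 0.03],
                         [(0.0, 1.0), (0.1, 0.9), (0.2, 0.8)])
# p == 0.04, ci == (0.1, 0.9)
```

Because any single split could be unlucky, reporting the (level-adjusted) median across many splits is what makes the resulting inference uniformly valid.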
  7. By: Zacharias Psaradakis (Birkbeck, University of London); Martin Sola (Universidad Torcuato di Tella, Argentina)
    Abstract: This paper proposes a model which allows for discrete stochastic breaks in the time-varying transition probabilities of Markov-switching models with autoregressive dynamics. An extensive simulation study is undertaken to examine the properties of the maximum-likelihood estimator and related statistics, and to investigate the implications of misspecification due to unaccounted changes in the parameters of the Markov transition mechanism. An empirical application that examines the relationship between Argentinian sovereign bond spreads and output growth is also discussed.
    Keywords: Markov-switching models; Maximum likelihood; Monte Carlo experiments; Time-varying transition probabilities.
    JEL: C32
    Date: 2017–03
  8. By: Bontemps, Christophe; Nauges, Céline
    Abstract: The main purpose of this article is to offer practical insights to econometricians wanting to estimate binary choice models featuring a continuous endogenous regressor. We use simulated data to investigate the performance of Lewbel’s special regressor method, an estimator that is relatively easy to implement and that relies on different identification conditions than more common control function and Maximum Likelihood estimators. Our findings confirm that the large support condition is crucial for the special regressor method to perform well and that one should be very cautious when implementing heteroscedasticity corrections and trimming since these could severely bias the final estimates.
    Date: 2017–10
  9. By: Sven Klaassen; Jannis Kueck; Martin Spindler
    Abstract: Transformation models are a very important tool for applied statisticians and econometricians. In many applications, the dependent variable is transformed so that homogeneity or normality of the errors holds. In this paper, we analyze transformation models in a high-dimensional setting, where the set of potential covariates is large. We propose an estimator for the transformation parameter and show that it is asymptotically normally distributed, using an orthogonalized moment condition in which the nuisance functions depend on the target parameter. In a simulation study, we show that the proposed estimator works well in small samples. A common practice in labor economics is to transform wages with the log function. In this study, we test whether this transformation holds in CPS data from the United States.
    Date: 2017–12
  10. By: Alexandre Belloni; Victor Chernozhukov; Christian Hansen; Whitney Newey
    Abstract: High-dimensional linear models with endogenous variables play an increasingly important role in the recent econometric literature. In this work we allow for models with many endogenous variables and many instrumental variables to achieve identification. Because of the high dimensionality in the second stage, constructing honest confidence regions with asymptotically correct coverage is non-trivial. Our main contribution is to propose estimators and confidence regions that achieve this. The approach relies on moment conditions that have an additional orthogonality property with respect to nuisance parameters. Moreover, estimation of the high-dimensional nuisance parameters is carried out via new pivotal procedures. In order to achieve simultaneously valid confidence regions, we use a multiplier bootstrap procedure to compute critical values and establish its validity.
    Date: 2017–12
  11. By: Andrea Nocera (Birkbeck, University of London)
    Abstract: In this paper, we investigate the causes and the finite-sample consequences of negative definite covariance matrices in Swamy type random coefficient models. Monte Carlo experiments reveal that the negative definiteness problem is less severe when the degree of coefficient dispersion is substantial, and the precision of the regression disturbances is high. The sample size also plays a crucial role. We then demonstrate that relying on the asymptotic properties of a biased but consistent estimator of the random coefficient covariance may lead to poor inference.
    Keywords: Finite-sample inference, Monte Carlo analysis, negative definite covariance matrices, panel data, random coefficient models.
    JEL: C12 C15 C23
    Date: 2017–06
  12. By: Satya Paul; Sriram Shankar
    Abstract: This paper proposes a single-parameter functional form for the Lorenz curve and compares its performance with existing single-parameter functional forms using Australian income data for 10 years. The proposed parametric functional form performs better than the existing Lorenz functional forms. The Gini coefficient based on the proposed functional form is closest to the true Gini in each year.
    Keywords: Gini coefficient; Lorenz curve; Parametric functional form
    JEL: C80 D31
    Date: 2017–11
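    For context, any parametric Lorenz curve L(p) implies a Gini coefficient via G = 1 - 2 * integral of L(p) over [0, 1], so competing single-parameter forms can be ranked by how closely their implied Gini tracks the sample Gini. A quick numerical check, using the classical Pareto-type form L(p) = 1 - (1 - p)^a as a stand-in (the paper's own functional form is not reproduced here):

```python
import numpy as np

def gini_from_lorenz(L, n=200001):
    # G = 1 - 2 * integral_0^1 L(p) dp, via the trapezoid rule.
    p = np.linspace(0.0, 1.0, n)
    f = L(p)
    integral = np.sum((f[1:] + f[:-1]) * np.diff(p)) / 2.0
    return 1.0 - 2.0 * integral

# Stand-in single-parameter form (Pareto-type), not the paper's proposal:
def pareto_lorenz(p, a=0.5):
    return 1.0 - (1.0 - p) ** a

print(gini_from_lorenz(lambda p: p))   # perfect equality: Gini ~ 0
print(gini_from_lorenz(pareto_lorenz)) # a = 0.5 implies Gini ~ 1/3
```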
  13. By: Sören Blomquist; Whitney K. Newey
    Abstract: Saez (2010) introduced an influential estimator that has become known as the bunching estimator. Using this method one can obtain an estimate of the taxable income elasticity from the bunching pattern around a kink point. The bunching estimator has become popular, with a large number of papers applying the method. In this paper, we show that the bunching estimator cannot identify the taxable income elasticity when the functional form of the distribution of preference heterogeneity is unknown. We find that an observed distribution of taxable income around a kink point in a budget set can be consistent with any taxable income elasticity if the distribution of heterogeneity is unrestricted. If one is willing to assume restrictions on the heterogeneity density, some information about the taxable income elasticity can be obtained. We give bounds on the taxable income elasticity based on monotonicity of the heterogeneity density and apply these bounds to the data in Saez (2010). We also consider identification from budget set variation. We find that kinks alone are still not informative even when budget sets vary. However, if the taxable income specification is restricted to be of the parametric isoelastic form assumed in Saez (2010), the taxable income elasticity can be well identified from variation among linear segments of budget sets.
    Date: 2017
  14. By: Julio López-Laborda; Carmen Marín-González; Jorge Onrubia
    Abstract: There are several ways to match SILC-HBS surveys, with the most common technique involving the estimation of Engel curves using Ordinary Least Squares in logs with HBS data to impute household expenditure in the income dataset (SILC). The estimation in logs has certain advantages, as it can deal with skewness in data and reduce heteroskedasticity. However, the model needs to be corrected with a smearing estimate to retransform the results into levels. The presence of intrinsic heteroskedasticity in household expenditure therefore calls for another technique, as the smearing estimate produces a bias. Generalized Linear Models (GLMs) are presented as the best option.
    Date: 2017–12
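    The smearing retransformation that the log-OLS matching approach relies on, and whose bias under heteroskedastic log-scale errors motivates switching to a GLM, can be sketched as follows (a hypothetical helper, not the authors' code):

```python
import numpy as np

def duan_smearing(log_yhat, log_resid):
    # Duan's smearing estimate: retransform log-scale OLS predictions to
    # levels by scaling exp(fitted value) with the mean of the
    # exponentiated residuals. Consistent under i.i.d. log-scale errors;
    # biased when the error variance depends on covariates, which is why
    # a GLM (e.g. gamma family with log link) estimated directly in
    # levels is the preferred alternative.
    return np.exp(np.asarray(log_yhat, float)) * np.mean(np.exp(log_resid))
```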
  15. By: Vira Semenova
    Abstract: Partially identified models occur commonly in economic applications. A common problem in this literature is a regression problem with a bracketed (interval-censored) outcome variable Y, which creates a set-identified parameter of interest. Recent studies have considered only finite-dimensional linear regression in this context. To incorporate more complex controls into the problem, we consider a partially linear projection of Y on the set of functions that are linear in the treatment/policy variables and nonlinear in the controls. We characterize the identified set for the linear component of this projection and propose an estimator of its support function. Our estimator converges at the parametric rate and is asymptotically normal. It may be useful for labor economics applications that involve bracketed salaries and rich, high-dimensional demographic data about the subjects of the study.
    Date: 2017–12
  16. By: Xinkun Nie; Stefan Wager
    Abstract: We develop a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies. We first estimate marginal effects and treatment propensities to form an objective function that isolates the heterogeneous treatment effects, and then optimize the learned objective. This approach has several advantages over existing methods. From a practical perspective, our method is very flexible and easy to use: In both steps, we can use any method of our choice, e.g., penalized regression, a deep net, or boosting; moreover, these methods can be fine-tuned by cross-validating on the learned objective. Meanwhile, in the case of penalized kernel regression, we show that our method has a quasi-oracle property, whereby even if our pilot estimates for marginal effects and treatment propensities are not particularly accurate, we achieve the same regret bounds as an oracle who has a-priori knowledge of these nuisance components. We implement variants of our method based on both penalized regression and convolutional neural networks, and find promising performance relative to existing baselines.
    Date: 2017–12
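    The learned objective at the core of the two-step approach is a Robinson-style residual-on-residual loss; a minimal sketch (here `m_hat` proxies the conditional mean outcome E[Y|X] and `e_hat` the treatment propensity E[W|X], both from the first step):

```python
import numpy as np

def learned_objective(y, w, m_hat, e_hat, tau_x):
    # Mean of (Y - m_hat(X) - (W - e_hat(X)) * tau(X))^2: any flexible
    # learner for tau(.) can be fit, or cross-validated, against this loss.
    return float(np.mean((y - m_hat - (w - e_hat) * tau_x) ** 2))
```

Because the loss depends on the nuisances only through residuals, moderately inaccurate `m_hat` and `e_hat` still leave the minimizer close to the true effect function, which is the intuition behind the quasi-oracle property.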
  17. By: Sellner, Richard (Institute for Advanced Studies (IHS), Vienna)
    Abstract: This paper provides Monte Carlo simulation evidence on the performance of methods used for identifying the effects of non-discriminatory trade policy (NDTP) variables in structural gravity models (SGM). The benchmarked methods include the identification strategy of Heid, Larch & Yotov (2015), which utilizes data on intra-national trade flows, and three other methods that do not rely on these data. Results indicate that, under the assumption of a data generating process that conforms with SGM theory, data on intra-national trade flows are required for identification. The bias of the three methods that do not utilize these data is a result of the correlation between the NDTP variable and the collinear fixed effects. The MC results and an empirical application demonstrate the severity of this bias in methods that have been applied in previous empirical research.
    Keywords: Structural Gravity Model, Non-discriminatory Trade Policies, Monte Carlo Simulation
    JEL: C31 F10 F13
    Date: 2017–12
  18. By: Zhou, Bo (Tilburg University, School of Economics and Management)
    Abstract: This thesis consists of three essays in the theory of econometrics and statistics, focusing on the issue of semiparametric efficiency in non-LAN (Local Asymptotic Normality) models. The first essay starts with a univariate case of the unit root testing problem, whose limit experiment is of the LABF (Locally Asymptotically Brownian Functional) type. A novel approach is designed for developing the semiparametric power envelope, and a family of rank-based tests that are semiparametrically efficient is proposed. The second essay generalizes the approach to all LAQ (Locally Asymptotically Quadratic) models. Moreover, it extends the rank statistics in a novel way from the univariate case to the multivariate case. Using these results, the third essay develops the semiparametric power envelope of all invariant tests for stock return predictability and proposes a new family of tests that are more efficient than the existing ones.
    Date: 2017
  19. By: Victor Chernozhukov; Kaspar Wuthrich; Yinchu Zhu
    Abstract: This paper introduces new inference methods for counterfactual and synthetic control methods for evaluating policy effects. Our inference methods work in conjunction with many modern and classical methods for estimating the counterfactual mean outcome in the absence of a policy intervention. Specifically, our methods work together with the difference-in-differences, canonical synthetic control, constrained and penalized regression methods for synthetic control, factor/matrix completion models for panel data, interactive fixed effects panel models, time series models, as well as fused time series panel data models. The proposed method has a double justification. (i) If the residuals from estimating the counterfactuals are exchangeable as implied, for example, by i.i.d. data, our procedure achieves exact finite-sample size control without any assumption on the specific approach used to estimate the counterfactuals. (ii) If the data exhibit dynamics and serial dependence, our inference procedure achieves approximate uniform size control under weak and easy-to-verify conditions on the method used to estimate the counterfactual. We verify these conditions for representative methods from each group listed above. Simulation experiments demonstrate the usefulness of our approach in finite samples. We apply our method to re-evaluate the causal effect of election day registration (EDR) laws on voter turnout in the United States.
    Date: 2017–12
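    Under the exchangeability justification (i), the p-value for a hypothesized (e.g. zero) policy effect is simply the rank of a post-treatment residual statistic among its values under permutations of the residual sequence; a minimal sketch using cyclic shifts (the statistic and permutation scheme below are one simple choice among those the framework allows):

```python
import numpy as np

def conformal_pvalue(resid, T0):
    # resid: residuals from the counterfactual model fitted under the
    # null over all T periods; periods T0, ..., T-1 are post-treatment.
    resid = np.asarray(resid, float)
    T = len(resid)
    stat = lambda u: np.sum(np.abs(u[T0:]))
    s_obs = stat(resid)
    # Fraction of cyclic shifts whose post-period statistic is at least
    # as extreme as the observed one.
    return float(np.mean([stat(np.roll(resid, k)) >= s_obs
                          for k in range(T)]))
```

If the residuals are exchangeable, every shift is equally likely under the null, so the p-value is exact in finite samples regardless of how the counterfactual was estimated.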
  20. By: Hautsch, Nikolaus; Voigt, Stefan
    Abstract: We theoretically and empirically study large-scale portfolio allocation problems when transaction costs are taken into account in the optimization problem. We show that transaction costs act on the one hand as a turnover penalization and on the other hand as a regularization, which shrinks the covariance matrix. As an empirical framework, we propose a flexible econometric setting for portfolio optimization under transaction costs, which incorporates parameter uncertainty and combines predictive distributions of individual models using optimal prediction pooling. We consider predictive distributions resulting from high-frequency-based covariance matrix estimates, daily stochastic volatility factor models and regularized rolling window covariance estimates, among others. Using data capturing several hundred Nasdaq stocks over more than 10 years, we illustrate that transaction cost regularization (even to a small extent) is crucial in order to produce allocations with positive Sharpe ratios. We moreover show that performance differences between individual models decline when transaction costs are considered. Nevertheless, it turns out that adaptive mixtures based on high-frequency and low-frequency information yield the highest performance. A portfolio bootstrap reveals that naive 1/N allocations and global minimum variance allocations (with and without short sales constraints) are significantly outperformed in terms of Sharpe ratios and utility gains.
    Keywords: portfolio choice, transaction costs, model uncertainty, regularization, high frequency data
    JEL: C58 C52 C11 G11
    Date: 2017
  21. By: Doll, Monika
    Abstract: Reporting effect sizes and corresponding confidence intervals is increasingly demanded, which generates interest in analyzing the performance of confidence intervals around effect sizes. Since effect sizes by definition take the value zero when there is no effect, not only the inclusion of the population effect but also the exclusion of the value zero is a performance criterion for these intervals. This study is the first to compare the performance of confidence interval methods on these two criteria by determining their finite relative efficiency. Computing the quotient of two methods' minimum required sample sizes to achieve given levels of both criteria makes it possible to account for the problem of limited available observations, which often occurs in the educational, behavioral, or social sciences. Results indicate that confidence intervals based on a noncentral t-distribution around the robust effect size proposed by Algina et al. (2005) possess high relative efficiency.
    Keywords: Effect Size, Confidence Interval, Minimum Required Sample Size, Finite Relative Efficiency
    Date: 2017

This nep-ecm issue is ©2018 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at the NEP website. For comments, please write to the director of NEP, Marco Novarese. Put “NEP” in the subject line, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.