
on Econometrics 
By:  Ulrich Hounyo (University at Albany and CREATES); Kajal Lahiri (University at Albany) 
Abstract:  This paper considers bootstrap inference in model averaging for predictive regressions. We first consider two different types of bootstrap methods in predictive regressions: the standard pairwise bootstrap and the standard fixed-design residual-based bootstrap. We show that these procedures are not valid in the context of model averaging: both induce a bias-related term in the bootstrap variance of averaging estimators. We then propose and justify a fixed-design residual-based bootstrap resampling approach for model averaging. In a local asymptotic framework, we show the validity of the bootstrap in estimating the variance of a combined forecast and the asymptotic covariance matrix of a combined parameter vector with fixed weights. Our proposed method simultaneously preserves, nonparametrically, the cross-sectional dependence between different models and the time series dependence in the errors. The finite sample performance of these methods is assessed via Monte Carlo simulations. We illustrate our approach with an empirical study of the Taylor rule equation with 24 alternative specifications. 
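As a generic illustration of the two standard schemes the abstract contrasts, the sketch below applies a pairwise bootstrap and a fixed-design residual-based bootstrap to a single simulated regression. The data-generating process and sample sizes are arbitrary assumptions; this is not the paper's model-averaging correction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=n)

X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat

B = 500
pairs_betas = np.empty((B, 2))
fixed_betas = np.empty((B, 2))
for b in range(B):
    # Pairwise bootstrap: resample (x_t, y_t) pairs jointly.
    idx = rng.integers(0, n, size=n)
    pairs_betas[b] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    # Fixed-design residual bootstrap: keep X fixed, resample residuals.
    y_star = X @ beta_hat + rng.choice(resid, size=n, replace=True)
    fixed_betas[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]

se_pairs = pairs_betas.std(axis=0)   # bootstrap standard errors
se_fixed = fixed_betas.std(axis=0)
```

For a single well-specified regression both schemes give similar standard errors; the paper's point is that in model averaging both acquire an extra bias-related term.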
Keywords:  Bootstrap, Local asymptotic theory, Model average estimators, Wild bootstrap, Variance of consensus forecast 
JEL:  C33 C53 C80 
Date:  2021–09–28 
URL:  http://d.repec.org/n?u=RePEc:aah:create:202114&r= 
By:  Pengzhou Wu; Kenji Fukumizu 
Abstract:  As an important problem of causal inference, we discuss the estimation of treatment effects (TEs) under unobserved confounding. Representing the confounder as a latent variable, we propose Intact-VAE, a new variant of variational autoencoder (VAE), motivated by the prognostic score that is sufficient for identifying TEs. Our VAE also naturally yields a representation balanced across treatment groups, using its prior. Experiments on (semi-)synthetic datasets show state-of-the-art performance under diverse settings. Based on the identifiability of our model, further theoretical developments on identification and consistent estimation are also discussed. This paves the way towards principled causal effect estimation by deep neural networks. 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.15062&r= 
By:  Savi Virolainen 
Abstract:  A new mixture vector autoregressive model based on Gaussian and Student's $t$ distributions is introduced. The G-StMVAR model incorporates conditionally homoskedastic linear Gaussian vector autoregressions and conditionally heteroskedastic linear Student's $t$ vector autoregressions as its mixture components, with mixing weights that, for a $p$th-order model, depend on the full distribution of the preceding $p$ observations. A structural version of the model with a time-varying B-matrix and statistically identified shocks is also proposed. We derive the stationary distribution of $p+1$ consecutive observations and show that the process is ergodic. It is also shown that the maximum likelihood estimator is strongly consistent and thereby has the conventional limiting distribution under conventional high-level conditions. 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.13648&r= 
By:  Anish Agarwal; Munther Dahleh; Devavrat Shah; Dennis Shen 
Abstract:  Matrix completion is the study of recovering an underlying matrix from a sparse subset of noisy observations. Traditionally, it is assumed that the entries of the matrix are "missing completely at random" (MCAR), i.e., each entry is revealed at random, independent of everything else, with uniform probability. This is likely unrealistic due to the presence of "latent confounders", i.e., unobserved factors that determine both the entries of the underlying matrix and the missingness pattern in the observed matrix. For example, in the context of movie recommender systems (a canonical application for matrix completion), a user who vehemently dislikes horror films is unlikely to ever watch horror films. In general, these confounders yield "missing not at random" (MNAR) data, which can severely impact any inference procedure that does not correct for this bias. We develop a formal causal model for matrix completion through the language of potential outcomes, and provide novel identification arguments for a variety of causal estimands of interest. We design a procedure, which we call "synthetic nearest neighbors" (SNN), to estimate these causal estimands. We prove finite-sample consistency and asymptotic normality of our estimator. Our analysis also leads to new theoretical results for the matrix completion literature. In particular, we establish entrywise (i.e., max-norm) finite-sample consistency and asymptotic normality results for matrix completion with MNAR data. As a special case, this also provides entrywise bounds for matrix completion with MCAR data. Across simulated and real data, we demonstrate the efficacy of our proposed estimator. 
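For context, a minimal "hard-impute" baseline under the MCAR assumption the abstract criticizes can be sketched as follows; the low-rank setup and iteration count are illustrative assumptions, and this is not the authors' SNN estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 60, 40, 2
U = rng.normal(size=(n, r))
V = rng.normal(size=(m, r))
M = U @ V.T                      # low-rank ground truth
mask = rng.random((n, m)) < 0.5  # MCAR: each entry observed w.p. 0.5

# Hard-impute: alternate a rank-r SVD projection with re-imposing
# the observed entries until the imputed matrix stabilizes.
X = np.where(mask, M, 0.0)
for _ in range(200):
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    X_low = (u[:, :r] * s[:r]) @ vt[:r]
    X = np.where(mask, M, X_low)

err = np.abs(X_low - M).max()    # entrywise (max-norm) recovery error
```

Under MNAR missingness the mask depends on M itself, and this naive projection becomes biased, which is the gap the paper's causal framing addresses.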
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.15154&r= 
By:  Hugo Freeman; Martin Weidner 
Abstract:  This paper studies linear panel regression models in which the unobserved error term is an unknown smooth function of two-way unobserved fixed effects. In standard additive or interactive fixed effect models the individual-specific and time-specific effects are assumed to enter with a known functional form (additive or multiplicative), while we allow this functional form to be more general and unknown. We discuss two different estimation approaches that allow consistent estimation of the regression parameters in this setting as the number of individuals and the number of time periods grow to infinity. The first approach uses the interactive fixed effect estimator in Bai (2009), which is still applicable here, as long as the number of factors in the estimation grows asymptotically. The second approach first discretizes the two-way unobserved heterogeneity (similar to what Bonhomme, Lamadon and Manresa 2017 do for one-way heterogeneity) and then estimates a simple linear fixed effect model with additive two-way grouped fixed effects. For both estimation methods we obtain asymptotic convergence results, perform Monte Carlo simulations, and employ the estimators in an empirical application to UK house price data. 
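The second (discretization) approach can be sketched in a toy form: group units, then absorb group-by-time effects. Quantile grouping on outcome means is a crude stand-in for the k-means-style discretization, so a small discretization bias remains; the design, group count, and smooth function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
N, T, G = 200, 50, 5
alpha = rng.normal(size=N)                 # unit heterogeneity
gamma = rng.normal(size=T)                 # time heterogeneity
h = np.sin(alpha[:, None] + gamma[None, :])  # unknown smooth two-way effect
x = rng.normal(size=(N, T)) + 0.5 * h
y = 2.0 * x + h + rng.normal(size=(N, T))  # true beta = 2

# Discretize units into G groups by quantiles of their outcome means
# (a crude stand-in for k-means grouping), then demean within each
# (group, time) cell to absorb the grouped two-way heterogeneity.
edges = np.quantile(y.mean(axis=1), np.linspace(0, 1, G + 1)[1:-1])
groups = np.searchsorted(edges, y.mean(axis=1))

xd, yd = x.copy(), y.copy()
for g in range(G):
    msk = groups == g
    xd[msk] -= x[msk].mean(axis=0)
    yd[msk] -= y[msk].mean(axis=0)

beta = (xd.ravel() @ yd.ravel()) / (xd.ravel() @ xd.ravel())
```

With coarse groups the estimate sits near, but not exactly at, the true coefficient; the paper's asymptotics let the number of groups grow with the sample.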
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.11911&r= 
By:  Loh, Wen Wei; Ren, Dongning 
Abstract:  Valid inference of cause-and-effect relations in observational studies necessitates adjusting for common causes of the focal predictor (i.e., treatment) and the outcome. When such common causes, henceforth termed confounders, remain unadjusted for, they generate spurious correlations that lead to biased causal effect estimates. But routine adjustment for all available covariates, when only a subset are truly confounders, is known to yield potentially inefficient and unstable estimators. In this article, we introduce a data-driven confounder selection strategy that focuses on stable estimation of the treatment effect. The approach exploits the causal knowledge that after adjusting for confounders to eliminate all confounding biases, adding any remaining non-confounding covariates associated with only treatment or outcome, but not both, should not systematically change the effect estimator. The strategy proceeds in two steps. First, we prioritize covariates for adjustment by probing how strongly each covariate is associated with treatment and outcome. Next, we gauge the stability of the effect estimator by evaluating its trajectory adjusting for different covariate subsets. The smallest subset that yields a stable effect estimate is then selected. Thus, the strategy offers direct insight into the (in)sensitivity of the effect estimator to the chosen covariates for adjustment. The ability to correctly select confounders and yield valid causal inference following data-driven covariate selection is evaluated empirically using extensive simulation studies. Furthermore, we compare the proposed method empirically with routine variable selection methods. Finally, we demonstrate the procedure using two publicly available real-world datasets. 
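A stylized version of the two-step stability idea (prioritize covariates by their association with both treatment and outcome, then track the effect estimate along the adjustment path) can be sketched as follows; the scoring rule and data-generating process are simplifying assumptions, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
c = rng.normal(size=n)   # true confounder (affects treatment and outcome)
z = rng.normal(size=n)   # affects treatment only
w = rng.normal(size=n)   # affects outcome only
a = 0.8 * c + 0.5 * z + rng.normal(size=n)             # treatment
y = 1.0 * a + 1.2 * c + 0.7 * w + rng.normal(size=n)   # true effect = 1

def effect(covs):
    """OLS coefficient of a on y adjusting for the given covariates."""
    X = np.column_stack([np.ones(n), a] + covs)
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Step 1: prioritize by a crude joint association score with a and y.
cands = {"c": c, "z": z, "w": w}
score = {k: abs(np.corrcoef(v, a)[0, 1]) * abs(np.corrcoef(v, y)[0, 1])
         for k, v in cands.items()}
order = sorted(cands, key=score.get, reverse=True)

# Step 2: trajectory of the effect estimate along the adjustment path.
path, covs = [effect([])], []
for k in order:
    covs.append(cands[k])
    path.append(effect(covs))
```

The unadjusted estimate is confounded; once the true confounder enters, the trajectory stabilizes near the true effect and the remaining covariates barely move it.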
Date:  2021–09–24 
URL:  http://d.repec.org/n?u=RePEc:osf:osfxxx:yve6u&r= 
By:  David C. Mallinson 
Abstract:  Sibling fixed effects (FE) models are useful for estimating causal treatment effects while offsetting unobserved siblinginvariant confounding. However, treatment estimates are biased if an individual's outcome affects their sibling's outcome. We propose a robustness test for assessing the presence of outcometooutcome interference in linear twosibling FE models. We regress a gainscorethe difference between siblings' continuous outcomeson both siblings' treatments and on a pretreatment observed FE. Under certain restrictions, the observed FE's partial regression coefficient signals the presence of outcometooutcome interference. Monte Carlo simulations demonstrated the robustness test under several models. We found that an observed FE signaled outcometooutcome spillover if it was directly associated with an siblinginvariant confounder of treatments and outcomes, directly associated with a sibling's treatment, or directly and equally associated with both siblings' outcomes. However, the robustness test collapsed if the observed FE was directly but differentially associated with siblings' outcomes or if outcomes affected siblings' treatments. 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.13399&r= 
By:  Michael Keane (School of Economics); Timothy Neal (UNSW School of Economics) 
Abstract:  We provide a simple survey of the literature on weak instruments, aimed at giving practical advice to applied researchers. It is well-known that 2SLS has poor properties if instruments are exogenous but weak. We clarify these properties, explain weak instrument tests, and examine how the behavior of 2SLS depends on instrument strength. A common standard for acceptable instruments is a first-stage F-statistic of at least 10. But 2SLS has poor properties in that context: it has very little power, and it generates artificially low standard errors precisely in those samples where it generates estimates most contaminated by endogeneity. This causes t-tests to give misleading results. In fact, the distribution of t-statistics is highly non-normal unless F is in the thousands. Anderson-Rubin and conditional t-tests greatly alleviate this problem, and should be used even with strong instruments. A first-stage F well above 10 is necessary to give high confidence that 2SLS will outperform OLS. Otherwise, OLS combined with controls for sources of endogeneity may be a superior research strategy to IV. 
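With a single instrument, the first-stage F-statistic and the Anderson-Rubin statistic both reduce to squared t-statistics from auxiliary regressions, which can be sketched as follows; the simulated design and coefficients are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
z = rng.normal(size=n)                   # instrument
u = rng.normal(size=n)                   # unobserved confounder
x = 0.5 * z + u + rng.normal(size=n)     # endogenous regressor (first stage)
y = 1.0 * x + u + rng.normal(size=n)     # structural equation, true beta = 1

Z = np.column_stack([np.ones(n), z])

def tsq_on_z(dep):
    """Squared t-statistic on the z coefficient in an OLS of dep on (1, z)."""
    g = np.linalg.lstsq(Z, dep, rcond=None)[0]
    r = dep - Z @ g
    se = np.sqrt((r @ r / (n - 2)) * np.linalg.inv(Z.T @ Z)[1, 1])
    return (g[1] / se) ** 2

# First-stage F (one instrument, so F = t^2 on the instrument).
F_first = tsq_on_z(x)

def ar_stat(beta0):
    """Anderson-Rubin statistic for H0: beta = beta0 (one instrument)."""
    return tsq_on_z(y - beta0 * x)
```

Under the true value the AR statistic behaves like a chi-squared(1) draw regardless of instrument strength, which is why AR inference stays valid where the 2SLS t-test does not.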
Keywords:  Instrumental variables, weak instruments, 2SLS, endogeneity, F-test, size distortions of tests, Anderson-Rubin test, conditional t-test, Fuller, JIVE 
Date:  2021–08 
URL:  http://d.repec.org/n?u=RePEc:swe:wpaper:202105b&r= 
By:  Tim Janke; Mohamed Ghanmi; Florian Steinke 
Abstract:  Copulas are a powerful tool for modeling multivariate distributions, as they allow the univariate marginal distributions and the joint dependency structure to be estimated separately. However, known parametric copulas offer limited flexibility, especially in high dimensions, while commonly used nonparametric methods suffer from the curse of dimensionality. A popular remedy is to construct a tree-based hierarchy of conditional bivariate copulas. In this paper, we propose a flexible yet conceptually simple alternative based on implicit generative neural networks. The key challenge is to ensure the marginal uniformity of the estimated copula distribution. We achieve this by learning a multivariate latent distribution with unspecified marginals but the desired dependency structure. By applying the probability integral transform, we can then obtain samples from the high-dimensional copula distribution without relying on parametric assumptions or the need to find a suitable tree structure. Experiments on synthetic and real data from finance, physics, and image generation demonstrate the performance of this approach. 
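The marginal-uniformity requirement and the probability integral transform can be illustrated with empirical ranks (pseudo-observations); this generic construction is not the authors' generative-network estimator, and the simulated marginals are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
# Dependent data with non-Gaussian marginals: transform a correlated Gaussian.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
x = np.column_stack([np.exp(z[:, 0]), z[:, 1] ** 3])

# Probability integral transform via empirical ranks: each margin maps to (0,1).
ranks = x.argsort(axis=0).argsort(axis=0)
u = (ranks + 1) / (n + 1)          # pseudo-observations of the copula

# The monotone marginal transforms do not disturb the rank dependence.
rho = np.corrcoef(u.T)[0, 1]       # Spearman-type correlation of the copula
```

Each transformed margin is (discretely) uniform by construction, while the dependency structure survives, which is exactly the separation of marginals and dependence that a copula encodes.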
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.14567&r= 
By:  Milian Bachem; Lerby Ergun; Casper de Vries 
Abstract:  Scaling behavior measured in cross-sectional studies through the tail index of a power law is prone to bias. This hampers inference; in particular, time variation in estimated tail indices may be erroneous. In the case of a linear factor model, the factor biases the tail indices in the left and right tails in opposite directions. This fact can be exploited to reduce the bias. We show how the bias arises from the factor, how to remedy it, and how to apply our methods to financial data and geographic location data. 
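The tail index in question is typically estimated with the Hill estimator, which can be computed separately for the left and right tails as sketched below; the Student-t design and the choice of k are illustrative assumptions, and the factor-bias correction itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
# Student-t(3) returns: both tails follow a power law with tail index ~3.
r = rng.standard_t(df=3, size=n)

def hill(sample, k):
    """Hill estimator of the tail index from the k largest observations."""
    s = np.sort(sample)[-k:]           # top-k order statistics, s[0] = threshold
    return 1.0 / np.mean(np.log(s / s[0]))

k = 1000
alpha_right = hill(r[r > 0], k)        # right-tail index
alpha_left = hill(-r[r < 0], k)        # left-tail index (reflect the left tail)
```

In the paper's linear factor setting the two estimates would be pushed in opposite directions by the common factor, which is the asymmetry their correction exploits.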
Keywords:  Econometric and statistical methods 
JEL:  C01 C14 C58 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:bca:bocawp:2145&r= 
By:  Anik Burman; Sayantan Banerjee 
Abstract:  We consider the problem of optimizing a portfolio of financial assets, where the number of assets can be much larger than the number of observations. The optimal portfolio weights require estimating the inverse covariance matrix of excess asset returns, classical estimators of which behave badly in high-dimensional scenarios. We propose to use a regression-based joint shrinkage method for estimating the partial correlations among the assets. Extensive simulation studies illustrate the superior performance of the proposed method with respect to variance, weight, and risk estimation errors compared with competing methods, for both global minimum variance portfolios and Markowitz mean-variance portfolios. We also demonstrate the excellent empirical performance of our method on daily and monthly returns of the components of the S&P 500 index. 
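For reference, the global minimum variance weights that any covariance (or inverse-covariance) estimate feeds into are w = S^{-1} 1 / (1' S^{-1} 1). A low-dimensional sketch with the plain sample covariance follows; this is the classical plug-in the paper improves upon, not its shrinkage estimator, and the return-generating process is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(6)
T, p = 500, 10
# Simulated excess returns with heterogeneous variances.
returns = rng.multivariate_normal(
    np.zeros(p), np.diag(np.linspace(1.0, 4.0, p)), size=T)

# Global minimum variance weights: w = S^{-1} 1 / (1' S^{-1} 1).
S = np.cov(returns, rowvar=False)
ones = np.ones(p)
w = np.linalg.solve(S, ones)
w /= ones @ w

port_var = w @ S @ w    # in-sample portfolio variance
```

When p approaches or exceeds T, S becomes ill-conditioned or singular and `solve` amplifies estimation noise in the weights, which is the high-dimensional failure the abstract refers to.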
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.13633&r= 
By:  Sarun Kamolthip 
Abstract:  This paper demonstrates the potential of the long short-term memory (LSTM) network when applied to macroeconomic time series data sampled at different frequencies. We first present how the conventional LSTM model can be adapted to time series observed at mixed frequencies when the same mismatch ratio applies to all pairs of low-frequency output and higher-frequency variable. To generalize the LSTM to the case of multiple mismatch ratios, we adopt the unrestricted Mixed Data Sampling (U-MIDAS) scheme (Foroni et al., 2015) into the LSTM architecture. We assess the out-of-sample predictive performance via both Monte Carlo simulations and an empirical application. Our proposed models outperform the restricted MIDAS model even in a setup favorable to the MIDAS estimator. As a real-world application, we study forecasting the quarterly growth rate of Thai real GDP using a vast array of macroeconomic indicators, both quarterly and monthly. Our LSTM with the U-MIDAS scheme easily beats the simple benchmark AR(1) model at all horizons, but outperforms the strong benchmark univariate LSTM only at one and six months ahead. Nonetheless, we find that our proposed model can be very helpful for short-term forecasts in periods of large economic downturns. Simulation and empirical results support the use of our proposed LSTM with the U-MIDAS scheme in nowcasting applications. 
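The basic frequency alignment behind U-MIDAS-style models, stacking the m within-quarter monthly values as separate unrestricted regressors, can be sketched as follows; the toy series and mismatch ratio are illustrative assumptions, and the LSTM layer itself is omitted.

```python
import numpy as np

# Toy series: 8 quarters of a low-frequency target and the matching
# 24 monthly observations of one high-frequency indicator (ratio m = 3).
quarterly_target = np.arange(8, dtype=float)
monthly_indicator = np.arange(24, dtype=float)
m = 3

# U-MIDAS alignment: each row holds the m within-quarter monthly values,
# each of which receives its own unrestricted coefficient (no lag polynomial).
X = monthly_indicator.reshape(-1, m)   # shape (8, 3)
y = quarterly_target
```

A restricted MIDAS model would instead tie the m columns together through a low-dimensional lag polynomial; the unrestricted scheme leaves them free, which is what the paper carries into the LSTM input layer.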
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.13777&r= 
By:  Ioanna Arkoudi; Carlos Lima Azevedo; Francisco C. Pereira 
Abstract:  This study proposes a novel approach that combines theory- and data-driven choice models using Artificial Neural Networks (ANNs). In particular, we use continuous vector representations, called embeddings, to encode categorical or discrete explanatory variables, with a special focus on interpretability and model transparency. Although embedding representations within the logit framework were conceptualized by Camara (2019), their dimensions do not have an absolute definitive meaning and hence offer limited behavioral insight. The novelty of our work lies in enforcing interpretability of the embedding vectors by formally associating each of their dimensions with a choice alternative. Thus, our approach brings benefits well beyond a simple parsimonious improvement over dummy encoding, as it provides behaviorally meaningful outputs that can be used in travel demand analysis and policy decisions. Additionally, in contrast to previously suggested ANN-based Discrete Choice Models (DCMs) that either sacrifice interpretability for performance or are only partially interpretable, our models preserve the interpretability of the utility coefficients for all input variables despite being based on ANN principles. The proposed models were tested on two real-world datasets and evaluated against benchmark and baseline models that use dummy encoding. The results of the experiments indicate that our models deliver state-of-the-art predictive performance, outperforming existing ANN-based models while drastically reducing the number of required network parameters. 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.12042&r= 
By:  Anthony D. Hall; Annastiina Silvennoinen (NCER, Queensland University of Technology); Timo Teräsvirta (Aarhus University, CREATES, C.A.S.E, HumboldtUniversität zu Berlin) 
Abstract:  This paper examines changes in the correlations of daily returns between the four major banks in Australia. The findings are important to investors, and also to government, given the large proportion of the highly concentrated financial sector that relies on the stability of the Big Four. For this purpose, a methodology for building Multivariate Time-Varying STCC-GARCH models is developed. The novel contributions in this area are the specification tests related to the correlation component, the extension of the general model to allow for additional correlation regimes, and a detailed exposition of the systematic, improved modelling cycle required for such nonlinear models. An R package implements the steps of the modelling cycle. Simulations demonstrate the robustness of the recommended model-building approach. The empirical analysis reveals an increase in the correlations of Australia's four largest banks that coincides with the stagnation of the home loan market, technology changes, the mining boom, and Basel II alignment, increasing the exposure of the Australian financial sector to shocks. 
Keywords:  Unconditional correlation, modelling volatility, modelling correlations, multivariate autoregressive conditional heteroskedasticity 
JEL:  C32 C52 C58 
Date:  2021–09–28 
URL:  http://d.repec.org/n?u=RePEc:aah:create:202113&r= 
By:  Lena Janys 
Abstract:  It is widely accepted that women are underrepresented in academia in general, and in economics in particular. This paper introduces a test to detect an under-researched form of hiring bias: implicit quotas. Under the null of random hiring, I derive a test that requires no information about individual hires under some assumptions. I derive the asymptotic distribution of this test statistic and, as an alternative, propose a parametric bootstrap procedure that samples from the exact distribution. The test can be used to analyze a variety of other hiring settings. I analyze the distribution of female professors at German universities across 50 different disciplines. I show that the distribution of women, given the average number of women in the respective field, is highly unlikely to result from a random allocation of women across departments and is more likely to stem from an implicit quota of one or two women at the department level. I also show that a large part of the variation in the share of women across STEM and non-STEM disciplines could be explained by a two-women quota at the department level. These findings have important implications for the potential effectiveness of policies aimed at reducing underrepresentation, and they provide evidence on how stakeholders perceive and evaluate diversity. 
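The parametric bootstrap logic can be sketched with a stylized statistic: under a hard quota, counts of women barely vary across departments, while random hiring implies binomial variation. The variance statistic and the parameters below are illustrative assumptions, not the paper's exact test.

```python
import numpy as np

rng = np.random.default_rng(7)
D, hires, p = 50, 20, 0.2   # departments, positions each, share of women

# "Observed" data generated under an implicit two-women quota per department.
observed = np.full(D, 2)

def stat(counts):
    """Cross-department variance of the counts of women."""
    return counts.var()

# Parametric bootstrap under the null of random hiring: the number of women
# per department is Binomial(hires, p).
boot = np.array([stat(rng.binomial(hires, p, size=D)) for _ in range(2000)])

# A quota implies too little variation, so the one-sided p-value asks how
# often random hiring produces variance as small as the observed one.
pval = np.mean(boot <= stat(observed))
```

Here the observed variance is zero by construction, far below anything random hiring generates, so the null of random allocation is rejected.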
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.14343&r= 
By:  Evan Munro; Stefan Wager; Kuang Xu 
Abstract:  In evaluating social programs, it is important to measure treatment effects within a market economy, where interference arises due to individuals buying and selling various goods at the prevailing market price. We introduce a stochastic model of potential outcomes in market equilibrium, where the market price is an exposure mapping. We prove that average direct and indirect treatment effects converge to interpretable mean-field treatment effects, and provide estimators for these effects through a unit-level randomized experiment augmented with randomization in prices. We also provide a central limit theorem for the estimators that depends on the sensitivity of outcomes to prices. For a variant where treatments are continuous, we show that the sum of direct and indirect effects converges to the total effect of a marginal policy change. We illustrate the coverage and consistency properties of the estimators in simulations of different interventions in a two-sided market. 
Date:  2021–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2109.11647&r= 