nep-ecm New Economics Papers
on Econometrics
Issue of 2021‒04‒12
thirty-one papers chosen by
Sune Karlsson
Örebro universitet

  1. Efficient Estimation for Staggered Rollout Designs By Jonathan Roth; Pedro H. C. Sant'Anna
  2. Adaptive Random Bandwidth for Inference in CAViaR Models By Alain Hecq; Li Sun
  3. Structural and Predictive Analyses with a Mixed Copula-Based Vector Autoregression Model By Woraphon Yamaka; Rangan Gupta; Sukrit Thongkairat; Paravee Maneejuk
  4. Testing Identifying Assumptions in Bivariate Probit Models By Acerenza, Santiago; Bartalotti, Otávio; Kedagni, Desire
  5. Inference under Covariate-Adaptive Randomization with Imperfect Compliance By Federico A. Bugni; Mengsi Gao
  6. Bootstrap Inference for Hawkes and General Point Processes By Giuseppe Cavaliere; Ye Lu; Anders Rahbek; Jacob Stærk-Østergaard
  7. The Role of Score and Information Bias in Panel Data Likelihoods By Martin Schumann; Thomas A. Severini; Gautam Tripathi
  8. Singular conditional autoregressive Wishart model for realized covariance matrices By Alfelt, Gustav; Bodnar, Taras; Javed, Farrukh; Tyrcha, Joanna
  9. A note on global identification in structural vector autoregressions By Emanuele Bacchiocchi; Toru Kitagawa
  10. Bayesian Estimation of Epidemiological Models: Methods, Causality, and Policy Trade-Offs By Jonas E. Arias; Jesús Fernández-Villaverde; Juan F. Rubio-Ramírez; Minchul Shin
  11. A reality check on the GARCH-MIDAS volatility models By Virk, Nader; Javed, Farrukh; Awartani, Basel
  12. Minimax Kernel Machine Learning for a Class of Doubly Robust Functionals By AmirEmad Ghassami; Andrew Ying; Ilya Shpitser; Eric Tchetgen Tchetgen
  13. On optimal tests for rotational symmetry against new classes of hyperspherical distributions By Eduardo Garcia-Portugues; Davy Paindaveine; Thomas Verdebout
  14. High Frequency Income Dynamics By Jeppe Druedahl; Michael Graber; Thomas H. Jørgensen
  15. Average Treatment Effects in the Presence of Interference By Yuchen Hu; Shuangning Li; Stefan Wager
  16. Identifying structural shocks to volatility through a proxy-MGARCH model By Fengler, Matthias; Polivka, Jeannine
  17. Identification and Estimation in Many-to-one Two-sided Matching without Transfers By YingHua He; Shruti Sinha; Xiaoting Sun
  18. Scrutinizing the Monotonicity Assumption in IV and fuzzy RD designs By Fiorini, Mario; Stevens, Katrien
  19. CRPS Learning By Jonathan Berrisch; Florian Ziel
  20. The Proper Use of Google Trends in Forecasting Models By Marcelo C. Medeiros; Henrique F. Pires
  21. Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovations By Ricardo P. Masini; Marcelo C. Medeiros; Eduardo F. Mendes
  22. Identification of Peer Effects using Panel Data By Marisa Miraldo; Carol Propper; Christiern Rose
  23. Sharp Sensitivity Analysis for Inverse Propensity Weighting via Quantile Balancing By Jacob Dorn; Kevin Guo
  24. Hypothetical bias in stated choice experiments: Part II. Macro-scale analysis of literature and effectiveness of bias mitigation methods By Milad Haghani; Michiel C. J. Bliemer; John M. Rose; Harmen Oppewal; Emily Lancsar
  25. Cluster-Robust Inference: A Guide to Empirical Practice By James G. MacKinnon; Morten Ørregaard Nielsen; Matthew D. Webb
  26. Automated and Distributed Statistical Analysis of Economic Agent-Based Models By Andrea Vandin; Daniele Giachini; Francesco Lamperti; Francesca Chiaromonte
  27. E-values for effect heterogeneity and conservative approximations for causal interaction By Mathur, Maya B; Smith, Louisa; Yoshida, Kazuki; Ding, Peng; VanderWeele, Tyler
  28. A first-stage representation for instrumental variables quantile regression By Javier Alejo; Antonio F. Galvao; Gabriel Montes-Rojas
  29. Addressing spatial dependence in technical efficiency estimation: A Spatial DEA frontier approach By Julian Ramajo; Miguel A. Marquez; Geoffrey J. D. Hewings
  30. A Stochastic Time Series Model for Predicting Financial Trends using NLP By Pratyush Muthukumar; Jie Zhong
  31. Domain Specific Concept Drift Detectors for Predicting Financial Time Series By Filippo Neri

  1. By: Jonathan Roth; Pedro H. C. Sant'Anna
    Abstract: Researchers are often interested in the causal effect of treatments that are rolled out to different units at different points in time. This paper studies how to efficiently estimate a variety of causal parameters in such staggered rollout designs when treatment timing is (as-if) randomly assigned. We solve for the most efficient estimator in a class of estimators that nests two-way fixed effects models as well as several popular generalized difference-in-differences methods. The efficient estimator is not feasible in practice because it requires knowledge of the optimal weights to be placed on pre-treatment outcomes. However, the optimal weights can be estimated from the data, and in large datasets the plug-in estimator that uses the estimated weights has similar properties to the "oracle" efficient estimator. We illustrate the performance of the plug-in efficient estimator in simulations and in an application to Wood et al. (2020a)'s study of the staggered rollout of a procedural justice training program for police officers. We find that confidence intervals based on the plug-in efficient estimator have good coverage and can be as much as five times shorter than confidence intervals based on existing methods. As an empirical contribution of independent interest, our application provides the most precise estimates to date on the effectiveness of procedural justice training programs for police officers.
    Date: 2021–02
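The plug-in idea in this abstract — estimate the weights on pre-treatment outcomes from the data, then net them out — can be illustrated with a stylized two-cohort simulation. This is not the paper's estimator, only a single-pre-period sketch with made-up parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Stylized rollout: half the units treated, half not yet treated,
# with treatment timing as-if randomly assigned.
pre = rng.normal(0, 1, n)                  # pre-treatment outcome
d = rng.integers(0, 2, n)                  # treatment indicator
post = 0.8 * pre + 0.5 * d + rng.normal(0, 1, n)

# Simple difference in means (ignores the pre-period outcome).
dim = post[d == 1].mean() - post[d == 0].mean()

# Plug-in adjusted estimator: estimate the weight on the pre-period
# outcome from the untreated units, then net it out. This mimics in
# spirit the paper's plug-in use of estimated optimal weights.
beta = np.cov(post[d == 0], pre[d == 0])[0, 1] / pre[d == 0].var()
adj = post - beta * pre
plug_in = adj[d == 1].mean() - adj[d == 0].mean()
```

Both estimators are centered on the true effect of 0.5 here, but the adjusted one has a smaller variance, which is the source of the shorter confidence intervals the authors report.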
  2. By: Alain Hecq; Li Sun
    Abstract: This paper investigates the size performance of Wald tests for CAViaR models (Engle and Manganelli, 2004). We find that the usual estimation strategy for the test statistics yields inaccuracies. Indeed, we show that existing density estimation methods cannot adapt to the time variation in the conditional probability densities of CAViaR models. Consequently, we develop a method called adaptive random bandwidth that robustly approximates time-varying conditional probability densities for inference on CAViaR models based on the asymptotic normality of the model parameter estimator. The proposed method also avoids the problem of choosing an optimal bandwidth for density estimation and extends straightforwardly to multivariate quantile regressions.
    Date: 2021–02
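The quantity at stake in such Wald tests is the density of the quantile-regression residuals at zero, which enters the asymptotic variance. A stylized rendering of the random-bandwidth idea (not the paper's exact procedure; the bandwidth distribution and all parameters are invented for illustration) is to average a kernel density estimate over randomly drawn bandwidths rather than commit to a single one:

```python
import numpy as np

rng = np.random.default_rng(8)
u = rng.standard_t(df=4, size=5000)        # stand-in for regression residuals
tau = 0.05
q = np.quantile(u, tau)
resid = u - q                              # centered at the tau-quantile

def density_at_zero(resid, h):
    # Gaussian-kernel estimate of the residual density at zero.
    return np.mean(np.exp(-0.5 * (resid / h) ** 2) / (h * np.sqrt(2 * np.pi)))

# Fixed-bandwidth estimate vs. an average over random bandwidths
# (an illustrative gist of the adaptive-random-bandwidth idea).
fixed = density_at_zero(resid, 0.3)
random_bw = np.mean([density_at_zero(resid, h)
                     for h in rng.uniform(0.1, 0.5, 50)])
```

Averaging over bandwidths removes the need to select a single optimal one, which is the practical obstacle the abstract highlights.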
  3. By: Woraphon Yamaka (Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University; Chiang Mai 50200, Thailand); Rangan Gupta (Department of Economics, University of Pretoria, Pretoria 0002, South Africa); Sukrit Thongkairat (Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University; Chiang Mai 50200, Thailand); Paravee Maneejuk (Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University; Chiang Mai 50200, Thailand)
    Abstract: In this study, we introduce a mixed copula-based vector autoregressive (VAR) model for investigating the relationship between random variables. One-step maximum likelihood estimation is used to obtain point estimates of the autoregressive and mixed copula parameters. More specifically, we combine the likelihoods of the marginals and the mixed copula to construct the full likelihood function. A simulation study confirms the accuracy of the estimation and the reliability of the proposed model. Various mixed copula forms combining Gaussian, Student-t, Clayton, Frank, Gumbel, and Joe copulas are introduced. The proposed model is compared to the traditional VAR model and to single copula-based VAR models to assess its performance, and a real data study validates the proposed method. We find that one-step maximum likelihood provides accurate and reliable results, and we show that ignoring the complex, nonlinear correlation between the errors causes a significant efficiency loss in parameter estimation, in terms of bias and MSE. In the application, the mixed copula-based VAR provides the best fit.
    Keywords: Forecasting; Mixed copula; One step maximum likelihood estimation; Vector autoregressive
    Date: 2021–01
  4. By: Acerenza, Santiago; Bartalotti, Otávio; Kedagni, Desire
    Abstract: This paper focuses on the bivariate probit model's identifying assumptions: joint normality of errors, instrument exogeneity, and relevance conditions. First, we develop novel sharp testable equalities that can detect all possible observable violations of the assumptions. Second, we propose an easy-to-implement testing procedure for the model's validity based on feasible testable implications using existing inference methods for intersection bounds. The test achieves correct empirical size for moderately sized samples and performs well in detecting violations of the conditions in Monte Carlo simulations. Finally, we provide researchers with a road map on what to do when the bivariate probit model is rejected, including novel bounds for the average treatment effect that relax the normality assumption. Empirical examples illustrate the methodology's implementation.
    Date: 2021–03–29
  5. By: Federico A. Bugni; Mengsi Gao
    Abstract: This paper studies inference in a randomized controlled trial (RCT) with covariate-adaptive randomization (CAR) and imperfect compliance of a binary treatment. In this context, we study inference on the LATE. As in Bugni et al. (2018,2019), CAR refers to randomization schemes that first stratify according to baseline covariates and then assign treatment status so as to achieve ``balance'' within each stratum. In contrast to these papers, however, we allow participants of the RCT to endogenously decide to comply or not with the assigned treatment status. We study the properties of an estimator of the LATE derived from a ``fully saturated'' IV linear regression, i.e., a linear regression of the outcome on all indicators for all strata and their interaction with the treatment decision, with the latter instrumented with the treatment assignment. We show that the proposed LATE estimator is asymptotically normal, and we characterize its asymptotic variance in terms of primitives of the problem. We provide consistent estimators of the standard errors and asymptotically exact hypothesis tests. In the special case when the target proportion of units assigned to each treatment does not vary across strata, we can also consider two other estimators of the LATE, including the one based on the ``strata fixed effects'' IV linear regression, i.e., a linear regression of the outcome on indicators for all strata and the treatment decision, with the latter instrumented with the treatment assignment. Our characterization of the asymptotic variance of the LATE estimators allows us to understand the influence of the parameters of the RCT. We use this to propose strategies to minimize their asymptotic variance in a hypothetical RCT based on data from a pilot study. We illustrate the practical relevance of these results using a simulation study and an empirical application based on Dupas et al. (2018).
    Date: 2021–02
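The "fully saturated" IV regression described above is numerically a size-weighted combination of stratum-by-stratum Wald (IV) estimates. A stylized sketch with simulated data and one-sided non-compliance (all parameters here are invented, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
s = rng.integers(0, 4, n)                 # 4 strata from baseline covariates
z = rng.integers(0, 2, n)                 # randomized treatment assignment
# Imperfect compliance: compliers take treatment iff assigned.
complier = rng.random(n) < 0.6
d = np.where(complier, z, 0)              # one-sided non-compliance
y = 1.0 * d + 0.3 * s + rng.normal(0, 1, n)

# Stratum-by-stratum Wald (IV) estimates, weighted by stratum size.
# The fully saturated IV regression yields this same combination.
late = 0.0
for k in np.unique(s):
    m = s == k
    wald = (y[m][z[m] == 1].mean() - y[m][z[m] == 0].mean()) / \
           (d[m][z[m] == 1].mean() - d[m][z[m] == 0].mean())
    late += m.mean() * wald
```

With a constant treatment effect of 1.0 for compliers, the weighted Wald estimate recovers the LATE; the paper's contribution is the CAR-valid standard errors for this estimator, which the sketch omits.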
  6. By: Giuseppe Cavaliere; Ye Lu; Anders Rahbek; Jacob Stærk-Østergaard
    Abstract: Inference and testing in general point process models such as the Hawkes model is predominantly based on asymptotic approximations for likelihood-based estimators and tests, as originally developed in Ogata (1978). As an alternative, and to improve finite sample performance, this paper considers bootstrap-based inference for interval estimation and testing. Specifically, for a wide class of point process models we consider a novel bootstrap scheme labeled 'fixed intensity bootstrap' (FIB), where the conditional intensity is kept fixed across bootstrap repetitions. The FIB, which is very simple to implement and fast in practice, naturally extends previous ideas from the bootstrap literature on time series in discrete time, where the so-called 'fixed design' and 'fixed volatility' bootstrap schemes have been shown to be particularly useful and effective. We compare the FIB with the classic recursive bootstrap, which is here labeled 'recursive intensity bootstrap' (RIB). In RIB algorithms, the intensity is stochastic in the bootstrap world and implementation of the bootstrap is more involved, due to its sequential structure. For both bootstrap schemes, no asymptotic theory is available; we therefore provide a new bootstrap (asymptotic) theory, which allows bootstrap validity to be assessed. We also introduce novel 'non-parametric' FIB and RIB schemes, which are based on resampling time-changed transformations of the original waiting times. We show the effectiveness of the different bootstrap schemes in finite samples through a set of detailed Monte Carlo experiments. As far as we are aware, this is the first detailed Monte Carlo study of bootstrap implementations for Hawkes-type processes. Finally, in order to illustrate, we provide applications of the bootstrap to both financial data and social media data.
    Date: 2021–04
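The recursive (RIB-type) schemes discussed above require simulating event times from the conditional intensity. For an exponential-kernel Hawkes process, the standard tool is Ogata's thinning algorithm; the following is a minimal sketch with illustrative parameters, not the paper's bootstrap code:

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, T, rng):
    # Ogata's thinning algorithm for intensity
    # lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    # The intensity is non-increasing between events, so its value at
    # the current time is a valid upper bound for thinning.
    t, events = 0.0, []
    while True:
        lam_bar = mu + alpha * np.exp(-beta * (t - np.array(events))).sum()
        t += rng.exponential(1.0 / lam_bar)
        if t > T:
            return np.array(events)
        lam_t = mu + alpha * np.exp(-beta * (t - np.array(events))).sum()
        if rng.random() < lam_t / lam_bar:
            events.append(t)

rng = np.random.default_rng(2)
ts = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, T=500.0, rng=rng)
# Stationary mean intensity is mu / (1 - alpha/beta) = 2, so roughly
# 1000 events are expected on [0, 500].
```

In an RIB scheme this recursion would be re-run with estimated parameters in each bootstrap repetition, whereas the FIB keeps the fitted intensity path fixed.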
  7. By: Martin Schumann (Maastricht University, NL); Thomas A. Severini (Northwestern University, Evanston, USA); Gautam Tripathi (Department of Economics and Management, Université du Luxembourg)
    Abstract: Although it has long been believed that reducing information bias can improve the performance of likelihood-based estimators and confidence regions in small samples, the existing literature does not have a precise explanation of why this happens. We provide a theoretical argument to show why this improved performance can be attributed to first-order information unbiasedness, and why it seems to matter more for inference than for estimation. The insights obtained in this paper are helpful in explaining several simulation findings in the panel data literature. For example, we can explain the well-documented phenomenon that reducing the score bias alone often reduces the finite sample variance of estimators and improves the coverage of confidence regions in small samples, and why confidence regions based on conditional (on sufficient statistics) likelihoods can have excellent coverage even in very short panels. We can also explain the simulation results in Schumann, Severini, and Tripathi (2020), who find that, in panels of short duration, estimators and confidence regions based on pseudolikelihoods that are simultaneously first-order score and information unbiased perform much better than those based on panel data pseudolikelihoods that are only first-order score unbiased.
    Keywords: Fixed effects, Hessian bias, Information bias, Likelihood ratio, Panel data, Pseudolikelihood, Score bias.
    JEL: C23
    Date: 2021
  8. By: Alfelt, Gustav (Department of Mathematics, Stockholm University); Bodnar, Taras (Department of Mathematics, Stockholm University); Javed, Farrukh (Örebro University School of Business); Tyrcha, Joanna (Department of Mathematics, Stockholm University)
    Abstract: Realized covariance matrices are often constructed under the assumption that the number of intra-day returns exceeds the portfolio size, resulting in non-singular matrix measures. However, when, for example, the portfolio size is large, assets suffer from illiquidity issues, or market microstructure noise deters sampling on very high frequencies, this relation is not guaranteed. Under these common conditions, realized covariance matrices may be singular by construction. Motivated by this situation, we introduce the Singular Conditional Autoregressive Wishart (SCAW) model to capture the temporal dynamics of time series of singular realized covariance matrices, extending the rich literature on econometric Wishart time series models to the singular case. The model is further developed with covariance targeting adapted to matrices and a sectorwise BEKK specification, allowing excellent scalability to large and extremely large portfolio sizes. Finally, the model is estimated on a 20-year time series of 50 stocks and evaluated using out-of-sample forecast accuracy. It outperforms the benchmark multivariate GARCH model with high statistical significance, and the sectorwise specification outperforms the baseline model while using far fewer parameters.
    Keywords: Covariance targeting; High-dimensional data; Realized covariance matrix; Stock co-volatility; Time series matrix-variate model
    JEL: C32 C55 C58 G17
    Date: 2020–10–02
  9. By: Emanuele Bacchiocchi; Toru Kitagawa
    Abstract: In a landmark contribution to the structural vector autoregression (SVAR) literature, Rubio-Ramirez, Waggoner, and Zha (2010, `Structural Vector Autoregressions: Theory of Identification and Algorithms for Inference,' Review of Economic Studies) show a necessary and sufficient condition for equality restrictions to globally identify the structural parameters of an SVAR. The simplest form of the necessary and sufficient condition shown in Theorem 7 of Rubio-Ramirez et al. (2010) checks the number of zero restrictions and the ranks of particular matrices without requiring knowledge of the true value of the structural or reduced-form parameters. However, this note shows by counterexample that this condition is not sufficient for global identification. Analytical investigation of the counterexample clarifies why their sufficiency claim breaks down. The problem with the rank condition is that it allows for the possibility that restrictions are redundant, in the sense that one or more restrictions may be implied by other restrictions, in which case the implied restriction contains no identifying information. We derive a modified necessary and sufficient condition for SVAR global identification and clarify how it can be assessed in practice.
    Date: 2021–02
  10. By: Jonas E. Arias; Jesús Fernández-Villaverde; Juan F. Rubio-Ramírez; Minchul Shin
    Abstract: We present a general framework for Bayesian estimation and causality assessment in epidemiological models. The key to our approach is the use of sequential Monte Carlo methods to evaluate the likelihood of a generic epidemiological model. Once we have the likelihood, we specify priors and rely on a Markov chain Monte Carlo to sample from the posterior distribution. We show how to use the posterior simulation outputs as inputs for exercises in causality assessment. We apply our approach to Belgian data for the COVID-19 epidemic during 2020. Our estimated time-varying-parameters SIRD model captures the data dynamics very well, including the three waves of infections. We use the estimated (true) number of new cases and the time-varying effective reproduction number from the epidemiological model as information for structural vector autoregressions and local projections. We document how additional government-mandated mobility curtailments would have reduced deaths at zero cost or a very small cost in terms of output.
    Date: 2021–03
  11. By: Virk, Nader (Plymouth Business School); Javed, Farrukh (Örebro University School of Business); Awartani, Basel (Westminster Business School)
    Abstract: We employ a battery of model evaluation tests for a broad-set of GARCH-MIDAS models and account for data snooping bias. We document that inferences based on standard tests for GM variance components can be misleading. Our data mining free results show that the gains of macro-variables in forecasting total (long run) variance by GM models are overstated (understated). Estimation of different components of volatility is crucial for designing differentiated investing strategies, risk management plans and pricing of derivative securities. Therefore, researchers and practitioners should be wary of data mining bias, which may contaminate a forecast that may appear statistically validated using robust evaluation tests.
    Keywords: GARCH-MIDAS models; component variance forecasts; macro-variables; data snooping
    JEL: C32 C52 G11 G17
    Date: 2021–03–30
  12. By: AmirEmad Ghassami; Andrew Ying; Ilya Shpitser; Eric Tchetgen Tchetgen
    Abstract: A moment function is called doubly robust if it comprises two nuisance functions and the estimator based on it is a consistent estimator of the target parameter even if one of the nuisance functions is misspecified. In this paper, we consider a class of doubly robust moment functions originally introduced in Robins et al. (2008). We demonstrate that this moment function can be used to construct estimating equations for the nuisance functions. The main idea is to choose each nuisance function such that it minimizes the dependence of the expected value of the moment function on the other nuisance function. We implement this idea as a minimax optimization problem. We then provide conditions required for asymptotic linearity of the estimator of the parameter of interest, which are based on the convergence rate of the product of the errors of the nuisance functions, as well as the local ill-posedness of a conditional expectation operator. The convergence rates of the nuisance functions are analyzed using modern techniques from statistical learning theory based on the Rademacher complexity of the function spaces. We specifically focus on the case where the function spaces are reproducing kernel Hilbert spaces, which enables us to use their spectral properties to analyze the convergence rates. As an application of the proposed methodology, we consider the average causal effect both in the presence and in the absence of latent confounders. For the case with latent confounders, we use the recently proposed proximal causal inference framework of Miao et al. (2018) and Tchetgen Tchetgen et al. (2020), and hence our results lead to a robust non-parametric estimator of the average causal effect in this framework.
    Date: 2021–04
  13. By: Eduardo Garcia-Portugues (Carlos III University of Madrid); Davy Paindaveine (TSE - Toulouse School of Economics - UT1 - Université Toulouse 1 Capitole - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement); Thomas Verdebout (ECARES - European Center for Advanced Research in Economics and Statistics - ULB - Université libre de Bruxelles)
    Abstract: Motivated by the central role played by rotationally symmetric distributions in directional statistics, we consider the problem of testing rotational symmetry on the hypersphere. We adopt a semiparametric approach and tackle problems where the location of the symmetry axis is either specified or unspecified. For each problem, we define two tests and study their asymptotic properties under very mild conditions. We introduce two new classes of directional distributions that extend the rotationally symmetric class and are of independent interest. We prove that each test is locally asymptotically maximin, in the Le Cam sense, for one kind of the alternatives given by the new classes of distributions, both for specified and unspecified symmetry axis. The tests, aimed to detect location-like and scatter-like alternatives, are combined into convenient hybrid tests that are consistent against both alternatives. We perform Monte Carlo experiments that illustrate the finite-sample performances of the proposed tests and their agreement with the asymptotic results. Finally, the practical relevance of our tests is illustrated on a real data application from astronomy. The R package rotasym implements the proposed tests and allows practitioners to reproduce the data application.
    Keywords: Locally asymptotically maximin tests, Rotational symmetry, Local asymptotic normality, Hypothesis testing, Directional data
    Date: 2020–09–21
  14. By: Jeppe Druedahl (CEBI, Department of Economics, University of Copenhagen); Michael Graber (Department of Economics, University of Chicago); Thomas H. Jørgensen (CEBI, Department of Economics, University of Copenhagen)
    Abstract: We generalize the canonical permanent-transitory income process to allow for infrequent shocks. The distribution of income growth rates can then have a discrete mass point at zero and fat tails as observed in income data. We provide analytical formulas for the unconditional and conditional distributions of income growth rates and higher-order moments. We prove a set of identification results and numerically validate that we can simultaneously identify the frequency, variance, and persistence of income shocks. We estimate the income process on monthly panel data of 400,000 Danish males observed over 8 years. When allowing shocks to be infrequent, the proposed income process can closely match the central features of the data.
    Keywords: consumption-saving, income dynamics, panel data models
    JEL: C33 D31 J30
    Date: 2021–03–31
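The key feature of the income process above — infrequent shocks producing a mass point at zero and fat tails in growth rates — is easy to see in a simulation. This sketch uses a permanent-transitory process with invented arrival probabilities and variances, not the authors' estimates:

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 100_000, 13
p_perm, p_tran = 0.1, 0.2          # assumed monthly arrival probabilities
perm = np.zeros(n)
y = np.zeros((T, n))
for t in range(T):
    # Shocks arrive only occasionally; otherwise income is unchanged.
    perm += (rng.random(n) < p_perm) * rng.normal(0, 0.1, n)
    tran = (rng.random(n) < p_tran) * rng.normal(0, 0.2, n)
    y[t] = perm + tran

g = y[1:] - y[:-1]                  # income growth rates
share_zero = np.mean(np.abs(g) < 1e-12)   # discrete mass point at zero
kurt = ((g - g.mean()) ** 4).mean() / g.var() ** 2   # fat tails if >> 3
```

With no shock in adjacent months, growth is exactly zero, generating the mass point; the mixture of zeros and occasional large shocks produces kurtosis well above the Gaussian value of 3.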
  15. By: Yuchen Hu; Shuangning Li; Stefan Wager
    Abstract: We propose a definition for the average indirect effect of a binary treatment in the potential outcomes model for causal inference. Our definition is analogous to the standard definition of the average direct effect, and can be expressed without needing to compare outcomes across multiple randomized experiments. We show that the proposed indirect effect satisfies a universal decomposition theorem, whereby the sum of the average direct and indirect effects always corresponds to the average effect of a policy intervention. We also consider a number of parametric models for interference considered by applied researchers, and find that our (non-parametrically defined) indirect effect remains a natural estimand when re-expressed in the context of these models.
    Date: 2021–04
  16. By: Fengler, Matthias; Polivka, Jeannine
    Abstract: We extend the classical MGARCH specification for volatility modeling by developing a structural MGARCH model targeting identification of shocks and volatility spillovers in a speculative return system. Similar to the proxy-SVAR framework, we work with auxiliary proxy variables constructed from news-related measures to identify the underlying shock system. We achieve full identification with multiple proxies by chaining Givens rotations. In an empirical application, we identify an equity, bond and currency shock. We study the volatility spillovers implied by these labelled structural shocks. Our analysis shows that symmetric spillover regimes are rejected.
    Keywords: Givens rotations, identification, news-based measures, proxy-MGARCH, shock labelling, structural innovations, volatility spillovers
    JEL: C32 C51 C58 G12
    Date: 2021–04
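Chaining Givens rotations, as in the identification step above, parameterizes an orthogonal matrix by a set of plane-rotation angles. A minimal sketch (the angles here are arbitrary placeholders, not estimated values):

```python
import numpy as np

def givens(n, i, j, theta):
    # n x n rotation by angle theta in the (i, j) coordinate plane.
    g = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i] = g[j, j] = c
    g[i, j], g[j, i] = -s, s
    return g

# Any 3x3 rotation can be written as a chain of plane rotations over
# the three coordinate planes; q is orthogonal by construction.
q = givens(3, 0, 1, 0.3) @ givens(3, 0, 2, -0.7) @ givens(3, 1, 2, 1.1)
```

In a structural model, such a matrix rotates the reduced-form shocks, and the proxies pin down the rotation angles.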
  17. By: YingHua He; Shruti Sinha; Xiaoting Sun
    Abstract: In a setting of many-to-one two-sided matching with non-transferable utilities, e.g., college admissions, we study conditions under which preferences of both sides are identified with data on one single market. The main challenge is that every agent's actual choice set is unobservable to the researcher. Assuming that the observed matching is stable, we show nonparametric and semiparametric identification of preferences of both sides under appropriate exclusion restrictions. Our identification arguments are constructive and thus directly provide a semiparametric estimator. In Monte Carlo simulations, the estimator can perform well but suffers from the curse of dimensionality. We thus adopt a parametric model and estimate it by a Bayesian approach with a Gibbs sampler, which works well in simulations. Finally, we apply our method to school admissions in Chile and conduct a counterfactual analysis of an affirmative action policy.
    Date: 2021–04
  18. By: Fiorini, Mario; Stevens, Katrien
    Abstract: Whenever treatment effects are heterogeneous, and there is sorting into treatment based on the gain, monotonicity is a condition that both Instrumental Variable and fuzzy Regression Discontinuity designs must satisfy for their estimate to be interpretable as a LATE. However, applied economic work often omits a discussion of this important assumption. A possible explanation for this missing step is the lack of a clear framework to think about monotonicity in practice. In this paper, we use an extended Roy model to provide insights into the interpretation of IV and fuzzy RD estimates under various degrees of treatment effect heterogeneity, sorting on gain and violation of monotonicity. We then extend our analysis to two applied settings to illustrate how monotonicity can be investigated using a mix of economic insights, data patterns and formal tests. For both settings, we use a Roy model to interpret the estimate even in the absence of monotonicity. We conclude with a set of recommendations for the applied researcher.
    Keywords: essential heterogeneity; monotonicity assumption; LATE; average causal response; instrumental variable; regression discontinuity; education; health.
    Date: 2021–02
  19. By: Jonathan Berrisch; Florian Ziel
    Abstract: Combination and aggregation techniques can improve forecast accuracy substantially. This also holds for probabilistic forecasting methods where full predictive distributions are combined. There are several time-varying and adaptive weighting schemes, such as Bayesian model averaging (BMA). However, the performance of different forecasters may vary not only over time but also across parts of the distribution: one forecaster may be more accurate in the center of the distribution, while another performs better in predicting the tails. Consequently, we introduce a new weighting procedure that considers varying performance both across time and across the distribution. We discuss pointwise online aggregation algorithms that optimize with respect to the continuous ranked probability score (CRPS). After analyzing the theoretical properties of a fully adaptive Bernstein online aggregation (BOA) method, we introduce smoothing procedures for pointwise CRPS learning. The properties are confirmed and discussed using simulation studies. Additionally, we illustrate the performance in a forecasting study for carbon markets. In detail, we predict the distribution of European emission allowance prices.
    Date: 2021–02
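The CRPS that drives the learning procedure above has a convenient sample-based form, CRPS(F, y) = E|X − y| − ½ E|X − X′| for independent draws X, X′ from the forecast distribution F. A self-contained sketch:

```python
import numpy as np

def crps_ensemble(samples, y):
    # Sample-based CRPS estimate:
    # CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|,  X, X' ~ F independent.
    s = np.asarray(samples, dtype=float)
    term1 = np.abs(s - y).mean()
    term2 = np.abs(s[:, None] - s[None, :]).mean()
    return term1 - 0.5 * term2

rng = np.random.default_rng(4)
draws = rng.normal(0.0, 1.0, 2000)
# For a standard normal forecast evaluated at y = 0, the closed form
# gives 2*phi(0) - 1/sqrt(pi) ~ 0.2337, so the estimate should be close.
val = crps_ensemble(draws, 0.0)
```

Lower CRPS is better; pointwise CRPS learning assigns combination weights separately across quantile levels rather than one weight per forecaster.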
  20. By: Marcelo C. Medeiros; Henrique F. Pires
    Abstract: Google Trends has become one of the most popular free tools used by forecasters, both in academia and in the private and public sectors. Many papers, from several different fields, conclude that Google Trends data improve forecast accuracy. What seems to be widely unknown, however, is that each sample of Google search data differs from the others, even for the same search term, dates, and location. This means that it is possible to reach arbitrary conclusions merely by chance. This paper aims to show why and when this can become a problem and how to overcome the obstacle.
    Date: 2021–04
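The sampling issue flagged above can be mimicked in a toy simulation: treat each "download" of a Trends series as the latent search-interest path plus independent sampling noise, and note how averaging repeated draws stabilizes the series. This is only an illustrative model of the phenomenon, not the paper's analysis:

```python
import numpy as np

rng = np.random.default_rng(5)
T = 100
true_index = np.cumsum(rng.normal(0, 1, T))   # latent search interest

def draw_sample(rng):
    # Each "download" returns the latent series plus independent
    # sampling noise, mimicking how identical queries return
    # different samples of the underlying search data.
    return true_index + rng.normal(0, 2.0, T)

single = draw_sample(rng)
averaged = np.mean([draw_sample(rng) for _ in range(30)], axis=0)

err_single = np.abs(single - true_index).mean()
err_avg = np.abs(averaged - true_index).mean()
```

A forecast built on one noisy draw can look spuriously good or bad; averaging many draws (or otherwise accounting for the sampling variability) guards against chance conclusions.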
  21. By: Ricardo P. Masini; Marcelo C. Medeiros; Eduardo F. Mendes
    Abstract: There has been considerable advance in understanding the properties of sparse regularization procedures in high-dimensional models. In the time series context, the literature is mostly restricted to Gaussian autoregressions or mixing sequences. We study oracle properties of LASSO estimation of weakly sparse vector autoregressive models with heavy-tailed, weakly dependent innovations, under virtually no assumptions on the conditional heteroskedasticity. In contrast to the current literature, our innovation process satisfies an $L^1$ mixingale-type condition on the centered conditional covariance matrices. This condition covers $L^1$-NED sequences and strong ($\alpha$-) mixing sequences as particular examples. From a modeling perspective, it covers several multivariate GARCH specifications, such as the BEKK model, and other factor stochastic volatility specifications that were ruled out by assumption in previous studies.
    Date: 2019–12
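To fix ideas, equation-by-equation LASSO estimation of a VAR(1) can be written with a plain coordinate-descent solver. This is a textbook sketch for illustration, not the authors' estimator or their weak-sparsity, heavy-tail setting; the penalty level `lam` is an arbitrary placeholder.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            beta[j] = soft_threshold(X[:, j] @ r, n * lam) / col_ss[j]
    return beta

def lasso_var1(Y, lam=0.05):
    # equation-by-equation LASSO for a VAR(1);
    # column i of the result holds the coefficients of equation i
    X, Z = Y[:-1], Y[1:]
    return np.column_stack([lasso_cd(X, Z[:, i], lam) for i in range(Y.shape[1])])
```

On a simulated bivariate VAR(1) with a diagonal coefficient matrix, the estimator recovers the own-lag coefficients and shrinks the spurious cross-lag terms toward zero.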
  22. By: Marisa Miraldo (Imperial College Business School); Carol Propper (Imperial College Business School); Christiern Rose (School of Economics, University of Queensland, Brisbane, Australia)
    Abstract: This paper provides new identification results for panel data models with contextual and endogenous peer effects. Contextual effects operate through individuals’ time-invariant unobserved heterogeneity. Identification hinges on a conditional mean restriction requiring exogenous mobility of individuals between groups over time. Some networks governing peer interactions preclude identification. For these cases we propose additional conditional variance restrictions. We conduct a Monte Carlo experiment to evaluate the performance of our method and apply it to surgeon-hospital-year data to study take-up of minimally invasive surgery. A one-standard-deviation increase in the average time-invariant unobserved heterogeneity of other surgeons in the same hospital leads to a 0.12 standard deviation increase in take-up. The effect is equally due to endogenous and contextual effects.
    Keywords: Peer effects, panel data, networks, identification, innovation, healthcare
    Date: 2020–12–03
  23. By: Jacob Dorn; Kevin Guo
    Abstract: Inverse propensity weighting (IPW) is a popular method for estimating treatment effects from observational data. However, its correctness relies on the untestable (and frequently implausible) assumption that all confounders have been measured. This paper introduces a robust sensitivity analysis for IPW that estimates the range of treatment effects compatible with a given amount of unobserved confounding. The estimated range converges to the narrowest possible interval (under the given assumptions) that must contain the true treatment effect. Our proposal is a refinement of the influential sensitivity analysis by Zhao, Small, and Bhattacharya (2019), which we show gives bounds that are too wide even asymptotically. This analysis is based on new partial identification results for Tan (2006)'s marginal sensitivity model.
    Date: 2021–02
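To illustrate the basic object of such a sensitivity analysis, the sketch below computes conservative Horvitz-Thompson bounds on E[Y(1)] when the true inverse-propensity weights may deviate from 1/e(x) by an odds factor gamma, in the spirit of the marginal sensitivity model. This is an illustrative, non-sharp version: the paper's refinement (which tightens exactly this kind of interval) is not implemented here.

```python
import numpy as np

def ipw_bounds(y, t, e, gamma):
    # Conservative Horvitz-Thompson bounds on E[Y(1)] when the true weight
    # for each treated unit lies between lo_w and hi_w (odds factor gamma
    # around the nominal inverse propensity 1/e); gamma = 1 recovers plain IPW
    y = np.asarray(y, float)
    t = np.asarray(t, float)
    e = np.asarray(e, float)
    lo_w = 1.0 + (1.0 / gamma) * (1.0 / e - 1.0)   # smallest admissible weight
    hi_w = 1.0 + gamma * (1.0 / e - 1.0)           # largest admissible weight
    yt = y * t
    upper = np.mean(np.where(yt >= 0, yt * hi_w, yt * lo_w))
    lower = np.mean(np.where(yt >= 0, yt * lo_w, yt * hi_w))
    return lower, upper
```

With gamma = 1 the interval collapses to the standard IPW point estimate; larger gamma widens it around that point.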
  24. By: Milad Haghani; Michiel C. J. Bliemer; John M. Rose; Harmen Oppewal; Emily Lancsar
    Abstract: This paper reviews methods of hypothetical bias (HB) mitigation in choice experiments (CEs). It presents a bibliometric analysis and a summary of empirical evidence on their effectiveness. The paper follows the review of empirical evidence on the existence of HB presented in Part I of this study. While the number of CE studies has increased rapidly since 2010, the critical issue of HB has been studied in only a small fraction of them. The present review covers both ex-ante and ex-post bias mitigation methods. Ex-ante methods include cheap talk, real talk, consequentiality scripts, solemn oath scripts, opt-out reminders, budget reminders, honesty priming, induced truth telling, indirect questioning, time to think, and pivot designs. Ex-post methods include follow-up certainty calibration scales, respondent perceived consequentiality scales, and revealed-preference-assisted estimation. The use of mitigation methods varies markedly across different sectors of applied economics. The existing empirical evidence points to their overall effectiveness in reducing HB, although there is some variation. The paper further discusses how each mitigation method can counter a certain subset of HB sources. Given the prevalence of HB in CEs and the effectiveness of bias mitigation methods, it is recommended that implementing at least one bias mitigation method (or a suitable combination where possible) become standard practice in conducting CEs. Mitigation methods suited to the particular application should be implemented to ensure that inferences and subsequent policy decisions are as free of HB as possible.
    Date: 2021–02
  25. By: James G. MacKinnon (Queen's University); Morten Ørregaard Nielsen (Queen's University and CREATES); Matthew D. Webb (Carleton University)
    Abstract: Methods for cluster-robust inference are routinely used in economics and many other disciplines. However, it is only recently that theoretical foundations for the use of these methods in many empirically relevant situations have been developed. In this paper, we use these theoretical results to provide a guide to empirical practice. We do not attempt to present a comprehensive survey of the (very large) literature. Instead, we bridge theory and practice by providing a thorough guide on what to do and why, based on recently available econometric theory and simulation evidence. The paper includes an empirical analysis of the effects of the minimum wage on teenagers using individual data, in which we practice what we preach.
    JEL: C12 C15 C21 C23
    Date: 2021–04
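As a pocket reference for the starting point of such guides, the basic Liang-Zeger (CR0) cluster-robust sandwich estimator for OLS can be written as follows. This bare-bones sketch omits the small-sample corrections and bootstrap refinements that recent theory and simulation evidence recommend in practice.

```python
import numpy as np

def ols_cluster_se(X, y, cluster):
    # OLS coefficients with CR0 cluster-robust standard errors
    # (Liang-Zeger sandwich: bread = (X'X)^{-1}, meat = sum over clusters
    # of score outer products)
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    u = y - X @ beta                         # residuals
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(cluster):
        Xg, ug = X[cluster == g], u[cluster == g]
        s = Xg.T @ ug                        # cluster score
        meat += np.outer(s, s)
    V = bread @ meat @ bread
    return beta, np.sqrt(np.diag(V))
```

On data that the regression fits exactly, the residuals and hence the estimated standard errors are zero, which makes for a simple correctness check.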
  26. By: Andrea Vandin; Daniele Giachini; Francesco Lamperti; Francesca Chiaromonte
    Abstract: We propose a novel approach to the statistical analysis of simulation models and, especially, agent-based models (ABMs). Our main goal is to provide a fully automated and model-independent tool-kit to inspect simulations and perform counterfactual analysis. Our approach: (i) is easy for the modeller to use, (ii) improves reproducibility of results, (iii) optimizes running time given the modeller's machine, (iv) automatically chooses the number of required simulations and simulation steps to reach user-specified statistical confidence, and (v) automatically performs a variety of statistical tests. In particular, our framework is designed to distinguish the transient dynamics of the model from its steady-state behaviour (if any), estimate properties of the model in both "phases", and provide indications on the ergodic (or non-ergodic) nature of the simulated processes, which in turn allows one to gauge the reliability of a steady-state analysis. Estimates are equipped with statistical guarantees, allowing for robust comparisons across computational experiments. To demonstrate the effectiveness of our approach, we apply it to two models from the literature: a large-scale macro-financial ABM and a small-scale prediction market model. Compared to prior analyses of these models, we obtain new insights and are able to identify and correct some erroneous conclusions.
    Date: 2021–02
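One classical building block behind transient/steady-state analyses of this kind is to discard a warm-up period and then form a batch-means confidence interval for the steady-state mean. The sketch below is a minimal illustration with placeholder settings (`n_batches`, `warmup_frac`), not the authors' automated tool-kit.

```python
import numpy as np

def batch_means_ci(x, n_batches=10, warmup_frac=0.2, z=1.96):
    # Discard an initial transient, split the rest into equal batches,
    # and build a normal-approximation CI from the batch means
    x = np.asarray(x, float)
    x = x[int(len(x) * warmup_frac):]          # drop warm-up observations
    b = len(x) // n_batches                    # batch length
    means = x[:b * n_batches].reshape(n_batches, b).mean(axis=1)
    m = means.mean()
    s = means.std(ddof=1) / np.sqrt(n_batches)
    return m - z * s, m + z * s
```

Batching is what makes the interval honest under serial dependence: batch means are far closer to independent than the raw simulation steps.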
  27. By: Mathur, Maya B; Smith, Louisa; Yoshida, Kazuki; Ding, Peng; VanderWeele, Tyler
    Abstract: We provide sensitivity analyses for unmeasured confounding in estimates of effect heterogeneity and causal interaction.
    Date: 2021–04–08
  28. By: Javier Alejo; Antonio F. Galvao; Gabriel Montes-Rojas
    Abstract: This paper develops a first-stage linear regression representation for the instrumental variables (IV) quantile regression (QR) model. The first stage is analogous to the least squares case, i.e., a conditional mean regression of the endogenous variables on the instruments, with the difference that in the QR case it is a weighted regression. The weights are given by the conditional density function of the innovation term in the QR structural model, conditional on the endogenous and exogenous covariates as well as the instruments, at a given quantile. The first-stage regression is a natural framework for evaluating the validity of instruments. Thus, we use the first-stage result to suggest testing procedures that evaluate the adequacy of instruments in IVQR models through their statistical significance. In the QR case, the instruments may be relevant at some quantiles but not at others, or not at the mean. Monte Carlo experiments provide numerical evidence that the proposed tests work as expected in terms of empirical size and power in finite samples. An empirical application illustrates that checking the statistical significance of the instruments at different quantiles is important.
    Date: 2021–02
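Mechanically, the weighted first stage is a weighted least squares projection of the endogenous variable on the instruments. The sketch below is illustrative only: the weights `w` are left generic placeholders, whereas in the IVQR setting they would be the estimated conditional density at the quantile of interest. With uniform weights the formula reduces to ordinary least squares.

```python
import numpy as np

def weighted_first_stage(D, Z, w):
    # WLS projection of the endogenous variable D on the instruments Z:
    # solves (Z' W Z) g = Z' W D with W = diag(w)
    D = np.asarray(D, float)
    Z = np.asarray(Z, float)
    w = np.asarray(w, float)
    Zw = Z * w[:, None]                  # each row of Z scaled by its weight
    return np.linalg.solve(Z.T @ Zw, Zw.T @ D)
```

When the projection fits exactly, the recovered coefficients are invariant to the (positive) weights, which provides a quick check of the formula.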
  29. By: Julian Ramajo; Miguel A. Marquez; Geoffrey J. D. Hewings
    Abstract: This paper introduces a new specification for the nonparametric, Data Envelopment Analysis (DEA)-based production frontier for decision-making units whose economic performance is correlated with that of their neighbours (spatial dependence). To illustrate the bias reduction that this spatial DEA (SpDEA) provides with respect to standard DEA methods, we analyse the regional production frontiers of the NUTS-2 European regions over the period 2000-2014. The estimated SpDEA scores show a bimodal distribution not detected by the standard DEA estimates. The results confirm the crucial role of space, offering important new insights on both the causes of regional disparities in labour productivity and the observed polarization of the European distribution of per capita income.
    Date: 2021–03
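For intuition about what a DEA score measures, consider the simplest special case: one input, one output, and constant returns to scale, where the CCR-DEA efficiency of each unit reduces to its productivity ratio relative to the best performer. This toy sketch is only that special case; the paper's spatial DEA is far more general.

```python
import numpy as np

def ccr_efficiency_1in_1out(inputs, outputs):
    # With a single input and single output under constant returns to scale,
    # the CCR-DEA score equals each unit's output/input ratio divided by the
    # best ratio in the sample (the frontier unit scores exactly 1)
    ratio = np.asarray(outputs, float) / np.asarray(inputs, float)
    return ratio / ratio.max()
```

The general multi-input, multi-output case replaces this ratio comparison with a linear program per decision-making unit.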
  30. By: Pratyush Muthukumar; Jie Zhong
    Abstract: Stock price forecasting is a highly complex and vitally important field of research. Recent advances in deep neural network technology allow researchers to develop highly accurate models for predicting financial trends. We propose a novel deep learning model called ST-GAN, or Stochastic Time-series Generative Adversarial Network, that analyzes both financial news texts and financial numerical data to predict stock trends. We use a Generative Adversarial Network (GAN) to learn the correlations between textual and numerical data over time. We develop a new method of training a time-series GAN directly on the learned representations of Naive Bayes sentiment analysis of financial text data alongside technical indicators from numerical data. Our experimental results show significant improvement over various existing models and prior research on deep neural networks for stock price forecasting.
    Date: 2021–02
  31. By: Filippo Neri
    Abstract: Concept drift detectors allow learning systems to maintain good accuracy on non-stationary data streams. Financial time series are an instance of non-stationary data streams whose concept drifts (market phases) are important enough to affect investment decisions worldwide. This paper studies how concept drift detectors behave when applied to financial time series. The general results are: a) concept drift detectors usually improve the runtime over continuous learning; b) their computational cost is usually a fraction of that of the learning and prediction steps of even basic learners; c) it is important to study concept drift detectors in combination with the learning systems they will operate with; and d) concept drift detectors can be applied directly to the time series of raw financial data, not only to the series of the model's accuracy. Moreover, the study introduces three simple concept drift detectors tailored to financial time series and shows that two of them can be at least as effective as the most sophisticated state-of-the-art detectors when applied to financial time series.
    Date: 2021–03
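A standard member of this detector family is the Page-Hinkley test, which flags a mean shift when a cumulative deviation statistic rises too far above its running minimum. The sketch below is an illustrative generic implementation (with placeholder `delta` and `lam` thresholds), not one of the paper's three tailored detectors.

```python
def page_hinkley(x, delta=0.005, lam=5.0):
    # Page-Hinkley test for an upward mean shift in a stream x:
    # returns the index at which drift is first detected, or -1 if none
    mean = cum = cum_min = 0.0
    for i, v in enumerate(x, start=1):
        mean += (v - mean) / i          # running mean of the stream
        cum += v - mean - delta         # cumulative deviation statistic
        cum_min = min(cum_min, cum)     # running minimum of the statistic
        if cum - cum_min > lam:
            return i - 1                # drift flagged at this index
    return -1
```

On a stream that jumps from level 0 to level 2 at index 50, the detector fires shortly after the shift; on a constant stream it stays silent.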

This nep-ecm issue is ©2021 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.