nep-ecm New Economics Papers
on Econometrics
Issue of 2024‒06‒24
twenty-one papers chosen by
Sune Karlsson, Örebro universitet


  1. Conditional Choice Probability Estimation of Dynamic Discrete Choice Models with 2-period Finite Dependence By Yu Hao; Hiroyuki Kasahara
  2. Transfer Learning for Spatial Autoregressive Models By Hao Zeng; Wei Zhong; Xingbai Xu
  3. Instrumented Difference-in-Differences with heterogeneous treatment effects By Sho Miyaji
  4. Random effects panel data models with known heteroskedasticity By Julius Schäper; Rainer Winkelmann
  5. Estimating Idea Production: A Methodological Survey By Ege Erdil; Tamay Besiroglu; Anson Ho
  6. Edge differentially private estimation in the β-model via jittering and method of moments By Chang, Jinyuan; Hu, Qiao; Kolaczyk, Eric D.; Yao, Qiwei; Yi, Fengting
  7. A Sharp Test for the Judge Leniency Design By Mohamed Coulibaly; Yu-Chin Hsu; Ismael Mourifié; Yuanyuan Wan
  8. Evaluating dynamic conditional quantile treatment effects with applications in ridesharing By Li, Ting; Shi, Chengchun; Lu, Zhaohua; Li, Yi; Zhu, Hongtu
  9. Double Robustness of Local Projections and Some Unpleasant VARithmetic By José Luis Montiel Olea; Mikkel Plagborg-Møller; Eric Qian; Christian K. Wolf
  10. Testing Sign Congruence By Douglas L. Miller; Francesca Molinari; Jörg Stoye
  11. Comprehensive Causal Machine Learning By Michael Lechner; Jana Mareckova
  12. Fitting complex stochastic volatility models using Laplace approximation By Marín Díazaraque, Juan Miguel; Romero, Eva; Lopes Moreira Da Veiga, María Helena
  13. Generating density nowcasts for U.S. GDP growth with deep learning: Bayes by Backprop and Monte Carlo dropout By Kristóf Németh; Dániel Hadházi
  14. Synthetic Controls with spillover effects: A comparative study By Andrii Melnychuk
  15. Sequential Validation of Treatment Heterogeneity By Stefan Wager
  16. Forecasting Tail Risk via Neural Networks with Asymptotic Expansions By Yuji Sakurai; Zhuohui Chen
  17. fabOF: A Novel Tree Ensemble Method for Ordinal Prediction By Buczak, Philip
  18. Comparing predictive ability in presence of instability over a very short time By Fabrizio Iacone; Luca Rossini; Andrea Viselli
  19. Predictive Decision Synthesis for Portfolios: Betting on Better Models By Emily Tallman; Mike West
  20. Optimal Text-Based Time-Series Indices By David Ardia; Keven Bluteau
  21. More Reasons Why Replication Is A Difficult Issue By Wilcox, Rand R.; Rousselet, Guillaume A.

  1. By: Yu Hao; Hiroyuki Kasahara
    Abstract: This paper extends the work of Arcidiacono and Miller (2011, 2019) by introducing a novel characterization of finite dependence within dynamic discrete choice models, demonstrating that numerous models display 2-period finite dependence. We recast finite dependence as a problem of sequentially searching for weights and introduce a computationally efficient method for determining these weights by exploiting the Kronecker product structure embedded in state transitions. With the estimated weights, we develop a computationally attractive Conditional Choice Probability estimator with 2-period finite dependence. The computational efficiency of the proposed estimator is demonstrated through Monte Carlo simulations.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.12467&r=
  2. By: Hao Zeng; Wei Zhong; Xingbai Xu
    Abstract: The spatial autoregressive (SAR) model has been widely applied in empirical economic studies to characterize spatial dependence among subjects. However, the precision of estimating the SAR model diminishes when the sample size of the target data is limited. In this paper, we propose a new transfer learning framework for the SAR model that borrows information from similar source data to improve both estimation and prediction. When the informative source data sets are known, we introduce a two-stage algorithm, comprising a transferring stage and a debiasing stage, to estimate the unknown parameters, and we establish the theoretical convergence rates for the resulting estimators. When it is unknown which sources to transfer from, we propose a transferable source detection algorithm that identifies informative source data via a spatial residual bootstrap, which retains the necessary spatial dependence; its detection consistency is also derived. Simulation studies demonstrate that, using informative source data, our transfer learning algorithm significantly enhances the performance of the classical two-stage least squares estimator. In the empirical application, we apply our method to election prediction in swing states in the 2020 U.S. presidential election, utilizing polling data from the 2016 U.S. presidential election along with other demographic and geographical data. The empirical results show that our method outperforms traditional estimation methods.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.15600&r=
  3. By: Sho Miyaji
    Abstract: Many studies exploit variation in the timing of policy adoption across units as an instrument for treatment and apply instrumental variable techniques. This paper formalizes the underlying identification strategy as an instrumented difference-in-differences (DID-IV). In a simple setting with two periods and two groups, our DID-IV design consists mainly of a monotonicity assumption and parallel trends assumptions in the treatment and the outcome. In this design, a Wald-DID estimand, which scales the DID estimand of the outcome by the DID estimand of the treatment, captures the local average treatment effect on the treated (LATET). In contrast to the fuzzy DID design considered in de Chaisemartin and D'Haultfœuille (2018), our DID-IV design does not ex ante require strong restrictions on treatment adoption behavior across units, and our target parameter, the LATET, is policy-relevant if the instrument is based on the policy change of interest to the researcher. We extend the canonical DID-IV design to multiple-period settings with staggered adoption of the instrument across units, which we call staggered DID-IV designs, and propose an estimation method in these designs that is robust to treatment effect heterogeneity. We illustrate our findings in the setting of Oreopoulos (2006), estimating returns to schooling in the United Kingdom. In this application, the two-way fixed effects instrumental variable regression, the conventional approach to implementing staggered DID-IV designs, yields a negative estimate, whereas our estimation method indicates a substantial gain from schooling.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.12083&r=
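    Editorial code sketch: the Wald-DID estimand above has a very compact form in the two-period, two-group case. The sketch below is illustrative only; the column names (y, d, g, t) and the simulated data are assumptions, not the authors' code.
```python
import numpy as np
import pandas as pd

def wald_did(df, y="y", d="d", g="g", t="t"):
    """Wald-DID: DID estimand of the outcome scaled by the DID estimand of
    the treatment. Columns (illustrative): outcome y, treatment d,
    instrument group g (1 = exposed to the policy change), post-period t."""
    def did(col):
        m = df.groupby([g, t])[col].mean()
        return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])
    return did(y) / did(d)

# Toy usage: the instrument shifts treatment take-up only for g=1 in t=1
rng = np.random.default_rng(0)
n = 4000
df = pd.DataFrame({"g": rng.integers(0, 2, n), "t": rng.integers(0, 2, n)})
df["d"] = ((df["g"] * df["t"]) * (rng.random(n) < 0.6)).astype(int)
df["y"] = 2.0 * df["d"] + df["g"] + df["t"] + rng.normal(size=n)
print(wald_did(df))   # approx 2.0, the true LATET
```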
  4. By: Julius Schäper; Rainer Winkelmann
    Abstract: The paper introduces two estimators for the linear random effects panel data model with known heteroskedasticity. Examples where heteroskedasticity can be treated as given include panel regressions with averaged data, meta regressions and the linear probability model. While one estimator builds on the additive random effects assumption, the other, which is simpler to implement in standard software, assumes that the random effect is multiplied by the heteroskedastic standard deviation. Simulation results show that substantial efficiency gains can be realized with either of the two estimators, that they are robust against deviations from the assumed specification, and that the confidence interval coverage equals the nominal level if clustered standard errors are used. Efficiency gains are also evident in an illustrative meta-regression application estimating the effect of study design features on loss aversion coefficients.
    Keywords: Generalized least squares, linear probability model, meta regression
    JEL: C23
    Date: 2024–05
    URL: https://d.repec.org/n?u=RePEc:zur:econwp:445&r=
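    Editorial code sketch: the multiplicative specification is attractive precisely because it reduces to a standard random effects model after scaling by the known standard deviation. A hedged sketch of that reading (the simulation and variable names are illustrative, not the authors' code):
```python
import numpy as np
import statsmodels.api as sm

# Simulated panel: y_it = a + b*x_it + sigma_it*(u_i + e_it), sigma_it known
rng = np.random.default_rng(1)
G, T = 200, 5
groups = np.repeat(np.arange(G), T)
x = rng.normal(size=G * T)
sigma = rng.uniform(0.5, 2.0, size=G * T)       # known heteroskedasticity
u = np.repeat(rng.normal(size=G), T)
y = 1.0 + 2.0 * x + sigma * (u + rng.normal(size=G * T))

# Dividing through by sigma yields a homoskedastic additive RE model:
#   y/sigma = a*(1/sigma) + b*(x/sigma) + u_i + e_it
X = np.column_stack([1 / sigma, x / sigma])
res = sm.MixedLM(y / sigma, X, groups=groups).fit()
print(res.fe_params)                             # approx (1.0, 2.0)
```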
  5. By: Ege Erdil; Tamay Besiroglu; Anson Ho
    Abstract: Accurately modeling the production of new ideas is crucial for innovation theory and endogenous growth models. This paper provides a comprehensive methodological survey of strategies for estimating idea production functions. We explore various methods, including naive approaches, linear regression, maximum likelihood estimation, and Bayesian inference, each suited to different data availability settings. Through case studies ranging from total factor productivity to software R&D, we show how to apply these methodologies in practice. Our synthesis provides researchers with guidance on strategies for characterizing idea production functions and highlights obstacles that must be addressed through further empirical validation.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.10494&r=
  6. By: Chang, Jinyuan; Hu, Qiao; Kolaczyk, Eric D.; Yao, Qiwei; Yi, Fengting
    Abstract: A standing challenge in data privacy is the trade-off between the level of privacy and the efficiency of statistical inference. Here we conduct an in-depth study of this trade-off for parameter estimation in the β-model (Chatterjee, Diaconis and Sly, 2011) for edge differentially private network data released via jittering (Karwa, Krivitsky and Slavković, 2017). Unlike most previous approaches based on maximum likelihood estimation for this network model, we proceed via the method of moments. This choice facilitates our exploration of a substantially broader range of privacy levels, corresponding to stricter privacy, than has been considered to date. Over this new range we discover that our proposed estimator for the parameters exhibits an interesting phase transition, with both its convergence rate and asymptotic variance following one of three different regimes of behavior depending on the level of privacy. Because identifying the operative regime is difficult, if not impossible, in practice, we devise a novel adaptive bootstrap procedure to construct uniform inference across the different phases. In fact, leveraging this bootstrap, we are able to provide simultaneous inference for all parameters in the β-model (i.e., as many parameters as vertices), which appears to be the first result of its kind. Numerical experiments confirm the competitive and reliable finite-sample performance of the proposed inference methods relative to a comparable maximum likelihood method, as well as significant advantages in computational speed and memory.
    JEL: C1
    Date: 2024–04–01
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:122099&r=
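    Editorial code sketch: the jittering mechanism and the moment-based degree debiasing can be illustrated compactly. This is an editorial reconstruction under standard randomized-response edge DP, not the authors' implementation; the fixed-point update is the usual degree-matching iteration for the β-model.
```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(A, eps):
    """Randomized response on each edge indicator: flip with probability
    p = 1/(1+e^eps), which yields eps-edge differential privacy."""
    p = 1.0 / (1.0 + np.exp(eps))
    F = rng.random(A.shape) < p
    F = np.triu(F, 1); F = F + F.T                  # symmetric flips
    return np.where(F, 1 - A, A), p

def beta_mom(A_tilde, p, iters=500):
    """Method of moments: debias the jittered degrees, then solve the
    degree-matching equations d_i = sum_j sigmoid(b_i + b_j) by iteration."""
    n = A_tilde.shape[0]
    d_hat = (A_tilde.sum(1) - p * (n - 1)) / (1 - 2 * p)   # E[deg] inverted
    d_hat = np.clip(d_hat, 0.1, n - 1.1)
    b = np.zeros(n)
    for _ in range(iters):
        S = np.exp(b)[None, :] / (1 + np.exp(b[:, None] + b[None, :]))
        np.fill_diagonal(S, 0.0)
        b = np.log(d_hat) - np.log(S.sum(1))
    return b

# Toy usage: simulate a beta-model graph, jitter it, recover beta
n = 300
beta_true = rng.uniform(-1, 0, n)
P = 1 / (1 + np.exp(-(beta_true[:, None] + beta_true[None, :])))
A = (rng.random((n, n)) < P).astype(int)
A = np.triu(A, 1); A = A + A.T
A_tilde, p = jitter(A, eps=2.0)
print(np.corrcoef(beta_true, beta_mom(A_tilde, p))[0, 1])
```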
  7. By: Mohamed Coulibaly; Yu-Chin Hsu; Ismael Mourifié; Yuanyuan Wan
    Abstract: We propose a new specification test to assess the validity of the judge leniency design. We characterize a set of sharp testable implications, which exploit all the relevant information in the observed data distribution to detect violations of the judge leniency design assumptions. The proposed sharp test is asymptotically valid and consistent and will not make discordant recommendations. When the judge leniency design assumptions are rejected, we propose a way to salvage the model using partial monotonicity and exclusion assumptions, under which a variant of the Local Instrumental Variable (LIV) estimand can recover the Marginal Treatment Effect. Simulation studies show our test outperforms existing non-sharp tests by significant margins. We apply our test to assess the validity of the judge leniency design using data from Stevenson (2018), and it rejects validity for three crime categories: robbery, drug selling, and drug possession.
    JEL: C1 C12 C18 C26
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:32456&r=
  8. By: Li, Ting; Shi, Chengchun; Lu, Zhaohua; Li, Yi; Zhu, Hongtu
    Abstract: Many modern tech companies, such as Google, Uber, and Didi, use online experiments (also known as A/B testing) to evaluate new policies against existing ones. While most studies concentrate on average treatment effects, situations with skewed and heavy-tailed outcome distributions may benefit from alternative criteria, such as quantiles. However, assessing dynamic quantile treatment effects (QTE) remains a challenge, particularly when dealing with data from ride-sourcing platforms that involve sequential decision-making across time and space. In this article, we establish a formal framework to calculate QTE conditional on characteristics independent of the treatment. Under specific model assumptions, we demonstrate that the dynamic conditional QTE (CQTE) equals the sum of individual CQTEs across time, even though the conditional quantile of cumulative rewards may not necessarily equate to the sum of conditional quantiles of individual rewards. This crucial insight significantly streamlines the estimation and inference processes for our target causal estimand. We then introduce two varying coefficient decision process (VCDP) models and devise an innovative method to test the dynamic CQTE. Moreover, we expand our approach to accommodate data from spatiotemporal dependent experiments and examine both conditional quantile direct and indirect effects. To showcase the practical utility of our method, we apply it to three real-world datasets from a ride-sourcing platform. Theoretical findings and comprehensive simulation studies further substantiate our proposal. Supplementary materials for this article are available online. Code implementing the proposed method is available at: https://github.com/BIG-S2/CQSTVCM.
    Keywords: varying coefficient models; A/B testing; policy evaluation; quantile treatment effect; ride-sourcing platform; spatiotemporal experiments
    JEL: C1
    Date: 2024
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:122488&r=
  9. By: José Luis Montiel Olea; Mikkel Plagborg-Møller; Eric Qian; Christian K. Wolf
    Abstract: We consider impulse response inference in a locally misspecified stationary vector autoregression (VAR) model. The conventional local projection (LP) confidence interval has correct coverage even when the misspecification is so large that it can be detected with probability approaching 1. This follows from a "double robustness" property analogous to that of modern estimators for partially linear regressions. In contrast, VAR confidence intervals dramatically undercover even for misspecification so small that it is difficult to detect statistically and cannot be ruled out based on economic theory. This is because of a "no free lunch" result for VARs: the worst-case bias and coverage distortion are small if, and only if, the variance is close to that of LP. While VAR coverage can be restored by using a bias-aware critical value or a large lag length, the resulting confidence interval tends to be at least as wide as the LP interval.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.09509&r=
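    Editorial code sketch: the LP side of the comparison is easy to illustrate. A minimal, hedged sketch of local projection impulse responses with Newey-West standard errors (variable names, lag length, and truncation choices are illustrative, not the paper's specification):
```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def lp_irf(y, shock, H=12, lags=4):
    """Local projections: for each horizon h, regress y_{t+h} on the shock
    at t plus lagged controls; HAC (Newey-West) standard errors."""
    df = pd.DataFrame({"y": y, "s": shock})
    for l in range(1, lags + 1):
        df[f"y_l{l}"] = df["y"].shift(l)
        df[f"s_l{l}"] = df["s"].shift(l)
    irf, se = [], []
    for h in range(H + 1):
        d = df.assign(lead=df["y"].shift(-h)).dropna()
        X = sm.add_constant(d.drop(columns=["lead", "y"]))
        r = sm.OLS(d["lead"], X).fit(cov_type="HAC",
                                     cov_kwds={"maxlags": h + 1})
        irf.append(r.params["s"]); se.append(r.bse["s"])
    return np.array(irf), np.array(se)

# Toy usage: AR(1) driven by an observed shock, true IRF_h = 0.8**h
rng = np.random.default_rng(0)
T, rho = 600, 0.8
s = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + s[t]
irf, se = lp_irf(y, s, H=8)
print(np.round(irf, 2))
```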
  10. By: Douglas L. Miller; Francesca Molinari; Jörg Stoye
    Abstract: We consider testing the null hypothesis that two parameters $(\mu_1, \mu_2)$ have the same sign, assuming that (asymptotically) normal estimators are available. Examples of this problem include the analysis of heterogeneous treatment effects, causal interpretation of reduced-form estimands, meta-studies, and mediation analysis. A number of tests were recently proposed. We recommend a test that is simple and rejects more often than many of these recent proposals. Like all other tests in the literature, it is conservative if the truth is near $(0, 0)$ and therefore also biased. To clarify whether these features are avoidable, we also provide a test that is unbiased and has exact size control on the boundary of the null hypothesis, but which has counterintuitive properties and hence we do not recommend. The method that we recommend can be used to revisit existing findings using information typically reported in empirical research papers.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.11759&r=
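    Editorial code sketch: without reproducing the authors' exact construction, the flavor of a simple level-alpha test of sign congruence can be conveyed: reject only when the two estimates disagree in sign and each is individually significant one-sided. This is an editorial, conservative benchmark and not necessarily the test the paper recommends.
```python
from scipy.stats import norm

def sign_congruence_reject(est1, se1, est2, se2, alpha=0.05):
    """Reject H0: sign(mu1) == sign(mu2) iff the point estimates disagree
    in sign and both t-statistics clear the one-sided critical value.
    On the boundary of the null (one parameter at 0), the rejection
    probability is at most alpha."""
    z = norm.ppf(1 - alpha)
    t1, t2 = est1 / se1, est2 / se2
    return (t1 * t2 < 0) and (min(abs(t1), abs(t2)) > z)

# Example: opposite-signed estimates, each individually significant
print(sign_congruence_reject(0.9, 0.3, -0.8, 0.3))   # True
```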
  11. By: Michael Lechner; Jana Mareckova
    Abstract: Uncovering causal effects at various levels of granularity provides substantial value to decision makers. Comprehensive machine learning approaches to causal effect estimation make it possible to use a single method for estimation and inference of causal mean effects at all levels of granularity. Focusing on selection-on-observables, this paper compares three such approaches: the modified causal forest (mcf), the generalized random forest (grf), and double machine learning (dml). It also provides proven theoretical guarantees for the mcf and compares the theoretical properties of the approaches. The findings indicate that dml-based methods excel for average treatment effects at the population level (ATE) and group level (GATE) with few groups, when selection into treatment is not too strong. However, for finer causal heterogeneity, explicitly outcome-centred forest-based approaches are superior. The mcf has three additional benefits: (i) it is the most robust estimator when dml-based approaches underperform because of substantial selectivity; (ii) it is the best estimator for GATEs when the number of groups gets larger; and (iii) it is the only estimator that is internally consistent, in the sense that low-dimensional causal ATEs and GATEs are obtained as aggregates of finer-grained causal parameters.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.10198&r=
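    Editorial code sketch: of the three estimators compared, dml is the easiest to sketch generically. A hedged cross-fitted AIPW implementation of the ATE under selection-on-observables (the nuisance learners and tuning are illustrative; this is not the mcf/grf/dml code used in the paper):
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

def dml_ate(X, d, y, n_folds=5, seed=0):
    """Cross-fitted AIPW/DML score for the ATE; returns estimate and SE."""
    psi = np.zeros(len(y))
    for tr, te in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        m1 = RandomForestRegressor(random_state=seed)
        m0 = RandomForestRegressor(random_state=seed)
        m1.fit(X[tr][d[tr] == 1], y[tr][d[tr] == 1])    # outcome models
        m0.fit(X[tr][d[tr] == 0], y[tr][d[tr] == 0])
        ps = RandomForestClassifier(random_state=seed).fit(X[tr], d[tr])
        e = np.clip(ps.predict_proba(X[te])[:, 1], 0.01, 0.99)  # propensity
        mu1, mu0 = m1.predict(X[te]), m0.predict(X[te])
        psi[te] = (mu1 - mu0
                   + d[te] * (y[te] - mu1) / e
                   - (1 - d[te]) * (y[te] - mu0) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(y))

# Toy usage: true ATE = 1 under confounded assignment
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
d = (rng.random(n) < 1 / (1 + np.exp(-X[:, 0]))).astype(int)
y = 1.0 * d + X[:, 0] + rng.normal(size=n)
print(dml_ate(X, d, y))
```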
  12. By: Marín Díazaraque, Juan Miguel; Romero, Eva; Lopes Moreira Da Veiga, María Helena
    Abstract: The paper proposes the use of Laplace approximation (LA) to estimate complex univariate symmetric and asymmetric stochastic volatility (SV) models with flexible distributions for standardized returns. LA is a method for approximating integrals, especially in Bayesian statistics, and is often used to approximate the posterior distribution of the model parameters. This method simplifies complex problems by focusing on the most critical areas and using a well-understood approximation. We show how easily complex SV models can be estimated and analyzed using LA, with changes to specifications, priors, and sampling error distributions requiring only minor changes to the code. The simulation study shows that the LA estimates of the model parameters are close to the true values in finite samples and that the proposed estimator is computationally efficient and fast. It is an effective alternative to existing estimation methods for SV models. Finally, we evaluate the in-sample and out-of-sample performance of the models by forecasting one-day-ahead volatility. We use four well-known energy index series: two for clean energy and two for conventional (brown) energy. In the out-of-sample analysis, we also examine the impact of climate policy uncertainty and energy prices on the volatility forecasts. The results support the use of asymmetric SV models for clean energy series and symmetric SV models for brown energy indices conditional on these state variables.
    Keywords: Asymmetric Volatility; Laplace Approximation; Stochastic Volatility
    Date: 2024–06–06
    URL: https://d.repec.org/n?u=RePEc:cte:wsrepe:43947&r=
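    Editorial code sketch: the generic building block is easy to state. A minimal sketch of a Laplace approximation to a posterior, on a toy target; the paper's SV application additionally integrates out latent volatilities, which this sketch does not attempt, and the BFGS inverse-Hessian is only a rough curvature estimate.
```python
import numpy as np
from scipy.optimize import minimize

def laplace_approx(neg_log_post, theta0):
    """Approximate a posterior by N(mode, H^{-1}): locate the posterior
    mode, then use the inverse Hessian there as the covariance (taken
    here from BFGS's built-in inverse-Hessian estimate)."""
    res = minimize(neg_log_post, theta0, method="BFGS")
    return res.x, res.hess_inv

# Toy posterior: Gaussian likelihood with unknown mu and log-sigma, flat prior
rng = np.random.default_rng(0)
data = rng.normal(loc=1.5, scale=2.0, size=500)

def nlp(theta):
    mu, log_s = theta
    return 0.5 * np.sum((data - mu) ** 2) / np.exp(2 * log_s) \
           + len(data) * log_s

mode, cov = laplace_approx(nlp, np.zeros(2))
print(mode, np.sqrt(np.diag(cov)))   # mode approx (1.5, log 2)
```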
  13. By: Kristóf Németh; Dániel Hadházi
    Abstract: Recent results in the literature indicate that artificial neural networks (ANNs) can outperform the dynamic factor model (DFM) in terms of the accuracy of GDP nowcasts. Compared to the DFM, the performance advantage of these highly flexible, nonlinear estimators is particularly evident in periods of recessions and structural breaks. From the perspective of policy-makers, however, nowcasts are most useful when they are conveyed with the uncertainty attached to them. While the DFM and other classical time series approaches analytically derive the predictive (conditional) distribution for GDP growth, ANNs can only produce point nowcasts under their default training procedure (backpropagation). To fill this gap, we adapt, for the first time in the literature, two deep learning algorithms that enable ANNs to generate density nowcasts for U.S. GDP growth: Bayes by Backprop and Monte Carlo dropout. The accuracy of the point nowcasts, defined as the mean of the empirical predictive distribution, is evaluated relative to a naive constant-growth model for GDP and a benchmark DFM specification. Using a 1D CNN as the underlying ANN architecture, both algorithms outperform those benchmarks during the evaluation period (2012:Q1–2022:Q4). Furthermore, both algorithms are able to dynamically adjust the location (mean), scale (variance), and shape (skew) of the empirical predictive distribution. The results indicate that both Bayes by Backprop and Monte Carlo dropout can effectively augment the scope and functionality of ANNs, rendering them fully compatible and competitive alternatives to classical time series approaches.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.15579&r=
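    Editorial code sketch: Monte Carlo dropout is the simpler of the two algorithms to demonstrate. A hedged sketch in which a small MLP stands in for the paper's 1D CNN; the architecture and hyperparameters are illustrative assumptions.
```python
import torch
import torch.nn as nn

class NowcastNet(nn.Module):
    """Small MLP with dropout; the paper's underlying network is a 1D CNN."""
    def __init__(self, n_in, hidden=32, p=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, 1))
    def forward(self, x):
        return self.net(x)

def mc_dropout_draws(model, x, n_draws=1000):
    """Keep dropout active at prediction time and repeat forward passes:
    the draws form an empirical predictive distribution whose mean is the
    point nowcast and whose quantiles give density/interval nowcasts."""
    model.train()              # keeps dropout on; no weights are updated
    with torch.no_grad():
        return torch.stack([model(x) for _ in range(n_draws)]).squeeze()

# Usage on dummy features (in practice, after the model has been trained)
model = NowcastNet(n_in=10)
draws = mc_dropout_draws(model, torch.randn(1, 10))
print(draws.mean().item(), draws.std().item())
```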
  14. By: Andrii Melnychuk
    Abstract: This study introduces the Iterative Synthetic Control Method, a modification of the Synthetic Control Method (SCM) designed to improve its predictive performance by utilizing control units affected by the treatment in question. The method is compared with other SCM variants: SCM without any modifications, SCM after removing all spillover-affected units, Inclusive SCM, and the SP SCM model. For the comparison, Monte Carlo simulations are utilized, generating artificial datasets with known counterfactuals and comparing the predictive performance of the methods. Generally, the Inclusive SCM performed best in all settings and is relatively simple to implement. The Iterative SCM, introduced in this paper, came in a close second, with a small difference in performance and a simpler implementation.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.01645&r=
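    Editorial code sketch: every SCM variant in this comparison shares the same inner optimization. A minimal sketch of the standard weight problem (the iterative and inclusive variants then differ in which donors enter X0 and how spillover-affected donors are re-used; this is not the paper's code):
```python
import numpy as np
from scipy.optimize import minimize

def scm_weights(X1, X0):
    """Standard synthetic control weights: minimize ||X1 - X0 @ w||^2
    subject to w >= 0 and sum(w) = 1.
    X1: (k,) treated-unit predictors; X0: (k, J) donor predictors."""
    J = X0.shape[1]
    res = minimize(lambda w: np.sum((X1 - X0 @ w) ** 2),
                   np.full(J, 1.0 / J),
                   bounds=[(0.0, 1.0)] * J,
                   constraints=({"type": "eq",
                                 "fun": lambda w: w.sum() - 1.0},),
                   method="SLSQP")
    return res.x

# Toy usage: the treated unit is an equal mix of donors 0 and 1
rng = np.random.default_rng(0)
X0 = rng.normal(size=(6, 4))
X1 = 0.5 * X0[:, 0] + 0.5 * X0[:, 1]
print(np.round(scm_weights(X1, X0), 2))   # approx [0.5, 0.5, 0, 0]
```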
  15. By: Stefan Wager
    Abstract: We use the martingale construction of Luedtke and van der Laan (2016) to develop tests for the presence of treatment heterogeneity. The resulting sequential validation approach can be instantiated using various validation metrics, such as BLPs, GATES, and QINI curves, and provides an alternative to cross-validation-style cross-fold application of these metrics.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.05534&r=
  16. By: Yuji Sakurai; Zhuohui Chen
    Abstract: We propose a new machine-learning-based approach for forecasting Value-at-Risk (VaR), named CoFiE-NN, in which a neural network (NN) is combined with Cornish-Fisher expansions (CoFiE). CoFiE-NN can capture the nonlinear dynamics of high-order statistical moments thanks to the flexibility of a NN, while maintaining interpretability of the outputs by using CoFiE, a well-known statistical formula. First, we explain CoFiE-NN. Second, we compare the forecasting performance of CoFiE-NN with three conventional models using both Monte Carlo simulation and real data, employing Long Short-Term Memory (LSTM) as our main NN specification. We then apply CoFiE-NN to different asset classes, with a focus on foreign exchange markets. We report that CoFiE-NN outperforms the conventional EGARCH-t model and the Extreme Value Theory model on several statistical criteria for both the simulated and the real data. Finally, we introduce a new empirical proxy for tail risk, named the tail risk ratio, under CoFiE-NN. We discover that only 20 percent of tail risk dynamics across 22 currencies is explained by one common factor, in contrast to the fact that 60 percent of volatility dynamics across the same currencies is explained by one common factor.
    Keywords: Machine learning; Value-at-Risk; Neural Network
    Date: 2024–05–10
    URL: http://d.repec.org/n?u=RePEc:imf:imfwpa:2024/099&r=
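    Editorial code sketch: the CoFiE half of CoFiE-NN is a closed-form adjustment. A sketch of the fourth-order Cornish-Fisher VaR given the first four moments; in CoFiE-NN those moments would come from the LSTM, here they are plain inputs.
```python
from scipy.stats import norm

def cornish_fisher_var(mu, sigma, skew, exkurt, alpha=0.01):
    """VaR at level alpha from a Cornish-Fisher adjusted quantile.
    skew: skewness; exkurt: excess kurtosis. Returns a positive loss."""
    z = norm.ppf(alpha)
    z_cf = (z
            + (z ** 2 - 1) * skew / 6
            + (z ** 3 - 3 * z) * exkurt / 24
            - (2 * z ** 3 - 5 * z) * skew ** 2 / 36)
    return -(mu + sigma * z_cf)

# With zero skew/kurtosis this reduces to the Gaussian VaR
print(cornish_fisher_var(0.0, 0.01, 0.0, 0.0))    # approx 0.0233
print(cornish_fisher_var(0.0, 0.01, -0.5, 1.0))   # fatter left tail, larger
```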
  17. By: Buczak, Philip
    Abstract: Ordinal responses commonly occur in the life sciences, e.g., through school grades or rating scales. Where traditionally parametric statistical models have been used, machine learning (ML) methods such as random forest (RF) are increasingly employed for ordinal prediction. As RF does not account for ordinality, several extensions have been proposed. A promising approach lies in assigning optimized numeric scores to the ordinal response categories and using regression RF. However, these optimization procedures are computationally expensive and have been shown to yield only situational benefit. In this work, I propose Frequency Adjusted Borders Ordinal Forest (fabOF), a novel tree ensemble method for ordinal prediction forgoing extensive optimization while offering improved predictive performance in simulation and an illustrative example of student performance.
    Date: 2024–05–15
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:h8t4p&r=
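    Editorial code sketch: the general recipe (numeric scores, regression RF, borders) is simple to sketch. The border rule below, quantiles of the fitted predictions at cumulative class frequencies, is an editorial, plausible reading of "frequency adjusted borders"; fabOF proper uses out-of-bag predictions and may differ in detail.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def ordinal_rf(X, y_ord, X_new, seed=0):
    """Fit a regression RF on naive numeric scores 1..K for the ordinal
    categories, then map numeric predictions back to categories via
    borders placed at the cumulative training class frequencies."""
    cats = np.sort(np.unique(y_ord))
    score = {c: i + 1.0 for i, c in enumerate(cats)}
    rf = RandomForestRegressor(random_state=seed)
    rf.fit(X, np.array([score[c] for c in y_ord]))
    cum = np.cumsum([np.mean(y_ord == c) for c in cats])[:-1]
    borders = np.quantile(rf.predict(X), cum)   # fabOF uses OOB predictions
    return cats[np.digitize(rf.predict(X_new), borders)]

# Toy usage: a latent-variable data generating process with 4 grades
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
latent = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000)
y = np.digitize(latent, [-1.0, 0.0, 1.0])       # ordinal categories 0..3
print(ordinal_rf(X[:800], y[:800], X[800:])[:10])
```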
  18. By: Fabrizio Iacone; Luca Rossini; Andrea Viselli
    Abstract: We consider forecast comparison in the presence of instability that affects only a short period of time. We demonstrate that global tests do not perform well in this case, as they were not designed to capture very short-lived instabilities, and their power vanishes altogether when the magnitude of the shock is very large. We then discuss and propose approaches that are better suited to detecting such situations, such as nonparametric methods (the S test or the MAX procedure). We illustrate these results in different Monte Carlo exercises and in evaluating the nowcast of quarterly US nominal GDP from the Survey of Professional Forecasters (SPF) against a naive benchmark of no growth, over a period that includes the GDP instability brought about by the Covid-19 crisis. We recommend that the forecaster should not pool the sample but instead exclude the short periods of high local instability from the evaluation exercise.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.11954&r=
  19. By: Emily Tallman; Mike West
    Abstract: We discuss and develop Bayesian dynamic modelling and predictive decision synthesis for portfolio analysis. The context involves model uncertainty with a set of candidate models for financial time series with main foci in sequential learning, forecasting, and recursive decisions for portfolio reinvestments. The foundational perspective of Bayesian predictive decision synthesis (BPDS) defines novel, operational analysis and resulting predictive and decision outcomes. A detailed case study of BPDS in financial forecasting of international exchange rate time series and portfolio rebalancing, with resulting BPDS-based decision outcomes compared to traditional Bayesian analysis, exemplifies and highlights the practical advances achievable under the expanded, subjective Bayesian approach that BPDS defines.
    Date: 2024–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.01598&r=
  20. By: David Ardia; Keven Bluteau
    Abstract: We propose an approach to construct text-based time-series indices in an optimal way: typically, indices that maximize the contemporaneous relation or the predictive performance with respect to a target variable, such as inflation. We illustrate our methodology with a corpus of news articles from the Wall Street Journal by optimizing text-based indices focused on tracking the VIX index and inflation expectations. Our results highlight the superior performance of our approach compared to existing indices.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.10449&r=
  21. By: Wilcox, Rand R.; Rousselet, Guillaume A. (University of Glasgow)
    Abstract: Many issues complicate efforts to replicate studies, including concerns about models. Hundreds of papers published over the last sixty years make it clear that the models underlying the conventional statistical methods that are routinely taught and used can lead to low power, inflated false positive rates and inaccurate confidence intervals. In this chapter, we summarize these issues and how they affect replication assessment. We conclude that instead of trying to replicate poorly characterized effects, our efforts would be better spent on developing and discussing more detailed models.
    Date: 2024–05–22
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:9amhe&r=

This nep-ecm issue is ©2024 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.