nep-ecm New Economics Papers
on Econometrics
Issue of 2024‒09‒23
nineteen papers chosen by
Sune Karlsson, Örebro universitet


  1. Estimation and Inference of Average Treatment Effect in Percentage Points under Heterogeneity By Ying Zeng
  2. Deep Learning for the Estimation of Heterogeneous Parameters in Discrete Choice Models By Stephan Hetzenecker; Maximilian Osterhaus
  3. Gradient Wild Bootstrap for Instrumental Variable Quantile Regressions with Weak and Few Clusters By Wenjie Wang; Yichong Zhang
  4. Endogeneity Corrections in Binary Outcome Models with Nonlinear Transformations: Identification and Inference By Alexander Mayer; Dominik Wied
  5. Counterfactual and Synthetic Control Method: Causal Inference with Instrumented Principal Component Analysis By Cong Wang
  6. Anytime-Valid Inference for Double/Debiased Machine Learning of Causal Parameters By Abhinandan Dalal; Patrick Bl\"obaum; Shiva Kasiviswanathan; Aaditya Ramdas
  7. Continuous difference-in-differences with double/debiased machine learning By Lucas Zhang
  8. Methodological Foundations of Modern Causal Inference in Social Science Research By Guanghui Pan
  9. Engle-Granger Representation in Spatial and Spatio-Temporal Models By Bhattacharjee, A.; Ditzen, J.; Holly, S.
  10. Hidden Threshold Models with applications to asymmetric cycles By Harvey, A.; Simons, J.
  11. Beyond Pearson’s correlation: modern nonparametric independence tests for psychological research By Karch, Julian D.; Perez-Alonso, Andres F.; Bergsma, Wicher P.
  12. Posterior sampling from truncated Ferguson-Klass representation of normalised completely random measure mixtures By Zhang, Junyi; Dassios, Angelos
  13. Striking the Right Balance: Why Standard Balance Tests Over-Reject the Null, and How to Fix It By Kerwin, Jason; Rostom, Nada; Sterck, Olivier
  14. Learning Firm Conduct: Pass-Through as a Foundation for Instrument Relevance By Adam Dearing; Lorenzo Magnolfi; Daniel Quint; Christopher J. Sullivan; Sarah B. Waldfogel
  15. Solving and analyzing DSGE models in the frequency domain By Meyer-Gohde, Alexander
  16. Enhancing Startup Success Predictions in Venture Capital: A GraphRAG Augmented Multivariate Time Series Method By Zitian Gao; Yihao Xiao
  17. EX-DRL: Hedging Against Heavy Losses with EXtreme Distributional Reinforcement Learning By Parvin Malekzadeh; Zissis Poulos; Jacky Chen; Zeyu Wang; Konstantinos N. Plataniotis
  18. An Integrated Approach to Importance Sampling and Machine Learning for Efficient Monte Carlo Estimation of Distortion Risk Measures in Black Box Models By S\"oren Bettels; Stefan Weber
  19. Enhancing Causal Discovery in Financial Networks with Piecewise Quantile Regression By Cameron Cornell; Lewis Mitchell; Matthew Roughan

  1. By: Ying Zeng
    Abstract: In semi-log regression models with heterogeneous treatment effects, the average treatment effect (ATE) in log points and its exponential transformation minus one underestimate the ATE in percentage points. I propose new estimation and inference methods for the ATE in percentage points, with inference utilizing the Fenton-Wilkinson approximation. These methods are particularly relevant for staggered difference-in-differences designs, where treatment effects often vary across groups and periods. I prove the methods' large-sample properties and demonstrate their finite-sample performance through simulations, revealing substantial discrepancies between conventional and proposed measures. Two empirical applications further underscore the practical importance of these methods.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.06624
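    As a quick illustration of the point in item 1, the following Python sketch is a toy simulation only (not the paper's Fenton-Wilkinson-based procedure); the normal distribution of unit-level effects is an arbitrary assumption:

        # Toy simulation, not the paper's estimator: why exp(ATE in log points) - 1
        # understates the ATE in percentage points when effects are heterogeneous.
        import numpy as np

        rng = np.random.default_rng(0)
        # Assumed unit-level effects in log points (placeholder mean 0.10, sd 0.30).
        beta_i = rng.normal(loc=0.10, scale=0.30, size=100_000)

        ate_log = beta_i.mean()                    # ATE in log points
        naive_pct = np.exp(ate_log) - 1.0          # conventional transformation
        avg_pct = np.mean(np.exp(beta_i) - 1.0)    # average effect in percentage points

        print(f"ATE in log points:       {ate_log:.4f}")
        print(f"exp(ATE) - 1 (naive):    {naive_pct:.2%}")
        print(f"mean(exp(beta_i) - 1):   {avg_pct:.2%}")
        # By Jensen's inequality the naive transformation is smaller; the gap
        # grows with the dispersion of the individual effects.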
  2. By: Stephan Hetzenecker; Maximilian Osterhaus
    Abstract: This paper studies the finite sample performance of the flexible estimation approach of Farrell, Liang, and Misra (2021a), who propose to use deep learning for the estimation of heterogeneous parameters in economic models, in the context of discrete choice models. The approach combines the structure imposed by economic models with the flexibility of deep learning, which assures the interpretability of results on the one hand, and allows estimating flexible functional forms of observed heterogeneity on the other hand. For inference after the estimation with deep learning, Farrell et al. (2021a) derive an influence function that can be applied to many quantities of interest. We conduct a series of Monte Carlo experiments that investigate the impact of regularization on the proposed estimation and inference procedure in the context of discrete choice models. The results show that the deep learning approach generally leads to precise estimates of the true average parameters and that regular robust standard errors lead to invalid inference results, showing the need for the influence function approach for inference. Without regularization, the influence function approach can lead to substantial bias and large estimated standard errors caused by extreme outliers. Regularization mitigates this problem and stabilizes the estimation procedure, but at the expense of inducing an additional bias. The bias in combination with decreasing variance associated with increasing regularization leads to the construction of invalid inferential statements in our experiments. Repeated sample splitting, unlike regularization, stabilizes the estimation approach without introducing an additional bias, thereby allowing for the construction of valid inferential statements.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.09560
  3. By: Wenjie Wang; Yichong Zhang
    Abstract: We study the gradient wild bootstrap-based inference for instrumental variable quantile regressions in the framework of a small number of large clusters in which the number of clusters is viewed as fixed, and the number of observations for each cluster diverges to infinity. For the Wald inference, we show that our wild bootstrap Wald test, with or without studentization using the cluster-robust covariance estimator (CRVE), controls size asymptotically up to a small error as long as the parameter of the endogenous variable is strongly identified in at least one of the clusters. We further show that the wild bootstrap Wald test with CRVE studentization is more powerful for distant local alternatives than that without. Last, we develop a wild bootstrap Anderson-Rubin (AR) test for the weak-identification-robust inference. We show it controls size asymptotically up to a small error, even under weak or partial identification for all clusters. We illustrate the good finite-sample performance of the new inference methods using simulations and provide an empirical application to a well-known dataset about US local labor markets.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.10686
  4. By: Alexander Mayer; Dominik Wied
    Abstract: For binary outcome models, an endogeneity correction based on nonlinear rank-based transformations is proposed. Identification without external instruments is achieved under one of two assumptions: either the endogenous regressor is a nonlinear function of one component of the error term conditional on the exogenous regressors, or the dependence between the endogenous and the exogenous regressors is nonlinear. Under these conditions, we prove consistency and asymptotic normality. Monte Carlo simulations and an application to German insolvency data illustrate the usefulness of the method.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.06977
  5. By: Cong Wang
    Abstract: The fundamental problem of causal inference lies in the absence of counterfactuals. Traditional methodologies impute the missing counterfactuals implicitly or explicitly based on untestable or overly stringent assumptions. Synthetic control method (SCM) utilizes a weighted average of control units to impute the missing counterfactual for the treated unit. Although SCM relaxes some strict assumptions, it still requires the treated unit to be inside the convex hull formed by the controls, avoiding extrapolation. In recent advances, researchers have modeled the entire data generating process (DGP) to explicitly impute the missing counterfactual. This paper expands the interactive fixed effect (IFE) model by instrumenting covariates into factor loadings, providing additional robustness. This methodology offers multiple benefits: firstly, it incorporates the strengths of previous SCM approaches, such as the relaxation of the untestable parallel trends assumption (PTA). Secondly, it does not require the targeted outcomes to be inside the convex hull formed by the controls. Thirdly, it eliminates the need for correct model specification required by the IFE model. Finally, it inherits the ability of principal component analysis (PCA) to effectively handle high-dimensional data and enhances the value extracted from numerous covariates.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.09271
  6. By: Abhinandan Dalal; Patrick Bl\"obaum; Shiva Kasiviswanathan; Aaditya Ramdas
    Abstract: Double (debiased) machine learning (DML) has seen widespread use in recent years for learning causal/structural parameters, in part due to its flexibility and adaptability to high-dimensional nuisance functions as well as its ability to avoid bias from regularization or overfitting. However, the classic double-debiased framework is only valid asymptotically for a predetermined sample size, thus lacking the flexibility of collecting more data if sharper inference is needed, or stopping data collection early if useful inferences can be made earlier than expected. This can be of particular concern in large-scale experimental studies with huge financial costs or human lives at stake, as well as in observational studies where the lengths of confidence intervals do not shrink to zero even with increasing sample size due to partial identifiability of a structural parameter. In this paper, we present time-uniform counterparts to the asymptotic DML results, enabling valid inference and confidence intervals for structural parameters to be constructed at any arbitrary (possibly data-dependent) stopping time. We provide conditions which are only slightly stronger than the standard DML conditions, but offer the stronger guarantee for anytime-valid inference. This facilitates the transformation of any existing DML method to provide anytime-valid guarantees with minimal modifications, making it highly adaptable and easy to use. We illustrate our procedure using two instances: a) local average treatment effect in online experiments with non-compliance, and b) partial identification of average treatment effect in observational studies with potential unmeasured confounding.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.09598
  7. By: Lucas Zhang
    Abstract: This paper extends difference-in-differences to settings involving continuous treatments. Specifically, the average treatment effect on the treated (ATT) at any level of continuous treatment intensity is identified using a conditional parallel trends assumption. In this framework, estimating the ATTs requires first estimating infinite-dimensional nuisance parameters, especially the conditional density of the continuous treatment, which can introduce significant biases. To address this challenge, estimators for the causal parameters are proposed under the double/debiased machine learning framework. We show that these estimators are asymptotically normal and provide consistent variance estimators. To illustrate the effectiveness of our methods, we re-examine the study by Acemoglu and Finkelstein (2008), which assessed the effects of the 1983 Medicare Prospective Payment System (PPS) reform. By reinterpreting their research design using a difference-in-differences approach with continuous treatment, we nonparametrically estimate the treatment effects of the 1983 PPS reform, thereby providing a more detailed understanding of its impact.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.10509
  8. By: Guanghui Pan
    Abstract: This paper is a methodological literature review of modern causal inference methods for addressing causal estimands with observational/survey data that have been or will be used in social science research. It is divided into two parts. The first covers inference from the statistical estimand to the causal estimand, reviewing the assumptions required for causal identification and the methodological strategies for addressing violations of those assumptions. The second discusses the asymptotic analysis linking measures computed from observational data to their theoretical counterparts, and replicates the derivation of the efficient/doubly robust average treatment effect estimator commonly used in current social science analysis.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.00032
  9. By: Bhattacharjee, A.; Ditzen, J.; Holly, S.
    Abstract: The literature on panel models has made considerable progress in the last few decades, integrating non-stationary data both in the time and spatial domain. However, there remains a gap in the literature that simultaneously models non-stationarity and cointegration in both the time and spatial dimensions. This paper develops Granger representation theorems for spatial and spatio-temporal dynamics. In a panel setting, this provides a way to represent both spatial and temporal equilibria and dynamics as error correction models. This requires potentially two different processes for modelling spatial (or network) dynamics, both of which can be expressed in terms of spatial weights matrices. The first captures strong cross-sectional dependence, so that a spatial difference, suitably defined, is weakly cross-section dependent (granular) but can be nonstationary. The second is a conventional weights matrix that captures short-run spatio-temporal dynamics as stationary and granular processes. In large samples, cross-section averages serve the first purpose and we propose the mean group, common correlated effects estimator together with multiple testing of cross-correlations to provide the short-run spatial weights. We apply this model to house prices in the 375 MSAs of the US. We show that our approach is useful for capturing both weak and strong cross-section dependence, and partial adjustment to two long-run equilibrium relationships in terms of time and space.
    Keywords: Spatio-temporal dynamics, Error Correction Models, Weak and strong cross sectional dependence, US house prices, Spatial Weight matrices, Common Correlated Effects Estimator.
    JEL: C21 C22 C23 R3
    Date: 2024–08–19
    URL: https://d.repec.org/n?u=RePEc:cam:camdae:2447
  10. By: Harvey, A.; Simons, J.
    Abstract: Threshold models are set up so that there is a switch between regimes for the parameters of an unobserved components model. When Gaussianity is assumed, the model is handled by the Kalman filter. The switching depends on a component crossing a boundary, and, because the component is not observed directly, the error in its estimation leads naturally to a smooth transition mechanism. A prominent example motivating thresholds is that of a cyclical time series characterized by a downturn that is more, or less, rapid than the upturn. The situation is illustrated by fitting a model with three potentially asymmetric cycles, each with its own threshold, to observations on ice volume in Antarctica since 799,000 BCE. The model is able to produce multi-step forecasts with associated prediction intervals. A second example shows how a hidden threshold model is able to deal with the asymmetric cycle in monthly US unemployment.
    Keywords: Conditionally Gaussian state space model, Kalman filter, nonlinear time series model, regimes, smooth transition autoregressive model, unobserved components
    JEL: C22
    Date: 2024–08–21
    URL: https://d.repec.org/n?u=RePEc:cam:camdae:2448
  11. By: Karch, Julian D.; Perez-Alonso, Andres F.; Bergsma, Wicher P.
    Abstract: When examining whether two continuous variables are associated, tests based on Pearson’s, Kendall’s, and Spearman’s correlation coefficients are typically used. This paper explores modern nonparametric independence tests as an alternative, which, unlike traditional tests, have the ability to potentially detect any type of relationship. In addition to existing modern nonparametric independence tests, we developed and considered two novel variants of existing tests, most notably the Heller-Heller-Gorfine-Pearson (HHG-Pearson) test. We conducted a simulation study to compare traditional independence tests, such as Pearson’s correlation, and the modern nonparametric independence tests in situations commonly encountered in psychological research. As expected, no test had the highest power across all relationships. However, the distance correlation and the HHG-Pearson tests were found to have substantially greater power than all traditional tests for many relationships and only slightly less power in the worst case. A similar pattern was found in favor of the HHG-Pearson test compared to the distance correlation test. However, given that distance correlation performed better for linear relationships and is more widely accepted, we suggest considering its use in place of or in addition to traditional methods when there is no prior knowledge of the relationship type, as is often the case in psychological research.
    Keywords: correlation; hypothesis test; independence; nonparametric; relationship
    JEL: C1
    Date: 2024–08–04
    URL: https://d.repec.org/n?u=RePEc:ehl:lserod:124587
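    For readers who want to try one of the modern tests compared in item 11, the following self-contained Python sketch computes the distance correlation from its standard double-centering definition together with a permutation p-value; the simulated non-monotonic relationship is illustrative only, and this is not the authors' code or their HHG-Pearson variant:

        # Self-contained sketch (not the authors' code): distance correlation with a
        # permutation p-value, computed from the double-centered distance matrices.
        import numpy as np

        def _double_center(d):
            # Subtract row and column means, add back the grand mean.
            return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()

        def distance_correlation(x, y):
            a = _double_center(np.abs(x[:, None] - x[None, :]))
            b = _double_center(np.abs(y[:, None] - y[None, :]))
            dcov2 = (a * b).mean()
            dvar_x, dvar_y = (a * a).mean(), (b * b).mean()
            return np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y))

        def dcor_permutation_test(x, y, n_perm=999, seed=0):
            rng = np.random.default_rng(seed)
            observed = distance_correlation(x, y)
            perm = np.array([distance_correlation(x, rng.permutation(y))
                             for _ in range(n_perm)])
            p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
            return observed, p_value

        # Illustrative non-monotonic relationship that Pearson's r tends to miss.
        rng = np.random.default_rng(1)
        x = rng.normal(size=200)
        y = x ** 2 + 0.5 * rng.normal(size=200)
        print(dcor_permutation_test(x, y))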
  12. By: Zhang, Junyi; Dassios, Angelos
    Abstract: In this paper, we study the finite approximation of the completely random measure (CRM) by truncating its Ferguson-Klass representation. The approximation is obtained by keeping the N largest atom weights of the CRM unchanged and combining the smaller atom weights into a single term. We develop the simulation algorithms for the approximation and characterise its posterior distribution, for which a blocked Gibbs sampler is devised. We demonstrate the usage of the approximation in two models. The first assumes such an approximation as the mixing distribution of a Bayesian nonparametric mixture model and leads to a finite approximation to the model posterior. The second concerns the finite approximation to the Caron-Fox model. Examples and numerical implementations are given based on the gamma, stable and generalised gamma processes.
    Keywords: Bayesian nonparametric statistics; completely random measures; blocked Gibbs sampler; approximate inference; generalised gamma process
    JEL: C1
    Date: 2024–03–19
    URL: https://d.repec.org/n?u=RePEc:ehl:lserod:122228
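    A small Python sketch of the Ferguson-Klass idea behind item 12, for the gamma process only: the N largest atom weights are obtained by numerically inverting the Lévy tail mass at the arrival times of a unit-rate Poisson process. The parameters, truncation level, and bracketing interval are placeholders, and the paper's blocked Gibbs sampler is not shown:

        # Gamma-process sketch only (not the paper's blocked Gibbs sampler): the N
        # largest atom weights in the Ferguson-Klass representation solve
        # N(J_k) = Gamma_k, where N(x) = a * E_1(x) is the Levy tail mass.
        import numpy as np
        from scipy.optimize import brentq
        from scipy.special import exp1

        def ferguson_klass_gamma(a=1.0, n_atoms=100, seed=0):
            rng = np.random.default_rng(seed)
            arrivals = np.cumsum(rng.exponential(size=n_atoms))  # Poisson arrival times
            tail = lambda x, t: a * exp1(x) - t                  # decreasing in x
            # Invert the tail mass numerically for each arrival time.
            return np.array([brentq(tail, 1e-300, 1e3, args=(t,)) for t in arrivals])

        weights = ferguson_klass_gamma(a=2.0, n_atoms=100)
        print(weights[:5], weights.sum())
        # The paper's approximation keeps these N atoms unchanged and combines the
        # remaining (smaller) jumps into a single extra term, which is not shown here.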
  13. By: Kerwin, Jason (University of Washington); Rostom, Nada (University of Antwerp); Sterck, Olivier (University of Oxford)
    Abstract: Economists often use balance tests to demonstrate that the treatment and control groups are comparable prior to an intervention. We show that typical implementations of balance tests have poor statistical properties. Pairwise t-tests leave it unclear how many rejections indicate overall imbalance. Omnibus tests of joint orthogonality, in which the treatment is regressed on all the baseline covariates, address this ambiguity but substantially over-reject the null hypothesis using the sampling-based p-values that are typical in the literature. This problem is exacerbated when the number of covariates is high compared to the number of observations. We examine the performance of alternative tests, and show that omnibus F-tests of joint orthogonality with randomization inference p-values have the correct size and reasonable power. We apply these tests to data from two prominent recent articles, where standard F-tests indicate imbalance, and show that the study arms are actually balanced when appropriate tests are used.
    Keywords: balance tests, power, size, randomization inference
    JEL: C1 C9 O12
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:iza:izadps:dp17217
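    The recommended procedure in item 13 is easy to emulate. The sketch below uses simulated data, and a simple permutation of assignments stands in for the study's actual randomization scheme; it contrasts the sampling-based p-value of the omnibus F-test with a randomization-inference p-value:

        # Simulated data, not the authors' code: omnibus F-test of joint
        # orthogonality with a randomization-inference p-value.
        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(0)
        n, k = 200, 30                                  # few observations, many covariates
        X = sm.add_constant(rng.normal(size=(n, k)))    # baseline covariates
        treat = rng.binomial(1, 0.5, size=n)            # randomly assigned treatment

        def f_stat(assignment):
            # Regress treatment on all covariates; joint F on the slope coefficients.
            return sm.OLS(assignment, X).fit().fvalue

        fit = sm.OLS(treat, X).fit()
        f_obs = fit.fvalue

        # Randomization inference: re-draw assignments (here a simple permutation,
        # standing in for the actual randomization) and recompute the F-statistic.
        n_draws = 999
        f_perm = np.array([f_stat(rng.permutation(treat)) for _ in range(n_draws)])
        p_ri = (1 + np.sum(f_perm >= f_obs)) / (n_draws + 1)

        print(f"F = {f_obs:.2f}, sampling-based p = {fit.f_pvalue:.3f}, RI p = {p_ri:.3f}")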
  14. By: Adam Dearing; Lorenzo Magnolfi; Daniel Quint; Christopher J. Sullivan; Sarah B. Waldfogel
    Abstract: Researchers often test firm conduct models using pass-through regressions or instrumental variables (IV) methods. The former has limited applicability; the latter relies on potentially irrelevant instruments. We show the falsifiable restriction underlying the IV method generalizes the pass-through regression, and cost pass-through differences are the economic determinants of instrument relevance. We analyze standard instruments' relevance and link instrument selection to target counterfactuals. We illustrate our findings via simulations and an application to the Washington marijuana market. Testing conduct using targeted instruments, we find the optimal ad valorem tax closely matches the actual rate.
    JEL: L0
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:32863
  15. By: Meyer-Gohde, Alexander
    Abstract: I provide a solution method in the frequency domain for multivariate linear rational expectations models. The method works with the generalized Schur decomposition, providing a numerical implementation of the underlying analytic function solution methods suitable for standard DSGE estimation and analysis procedures. This approach generalizes the time-domain restriction of autoregressive-moving average exogenous driving forces to arbitrary covariance stationary processes. Applied to the standard New Keynesian model, I find that a Bayesian analysis favors a single-parameter log-harmonic function of the lag operator over the usual AR(1) assumption as it generates hump-shaped autocorrelation patterns more consistent with the data.
    Keywords: DSGE, solution methods, spectral methods, Bayesian estimation, general exogenous processes
    JEL: C32 C62 C63 E17 E47
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:zbw:imfswp:302176
  16. By: Zitian Gao; Yihao Xiao
    Abstract: In the Venture Capital (VC) industry, predicting the success of startups is challenging due to limited financial data and the need for subjective revenue forecasts. Previous methods based on time series analysis or deep learning often fall short because they fail to incorporate crucial inter-company relationships such as competition and collaboration. To address these issues, we propose a novel approach using a GraphRAG-augmented time series model. With GraphRAG, time series predictive methods are enhanced by integrating these vital relationships into the analysis framework, allowing for a more dynamic understanding of the startup ecosystem in venture capital. Our experimental results demonstrate that our model significantly outperforms previous models in startup success predictions. To the best of our knowledge, our work is the first applied work using GraphRAG.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.09420
  17. By: Parvin Malekzadeh; Zissis Poulos; Jacky Chen; Zeyu Wang; Konstantinos N. Plataniotis
    Abstract: Recent advancements in Distributional Reinforcement Learning (DRL) for modeling loss distributions have shown promise in developing hedging strategies in derivatives markets. A common approach in DRL involves learning the quantiles of loss distributions at specified levels using Quantile Regression (QR). This method is particularly effective in option hedging due to its direct quantile-based risk assessment, such as Value at Risk (VaR) and Conditional Value at Risk (CVaR). However, these risk measures depend on the accurate estimation of extreme quantiles in the loss distribution's tail, which can be imprecise in QR-based DRL due to the rarity and extremity of tail data, as highlighted in the literature. To address this issue, we propose EXtreme DRL (EX-DRL), which enhances extreme quantile prediction by modeling the tail of the loss distribution with a Generalized Pareto Distribution (GPD). This method introduces supplementary data to mitigate the scarcity of extreme quantile observations, thereby improving estimation accuracy through QR. Comprehensive experiments on gamma hedging options demonstrate that EX-DRL improves existing QR-based models by providing more precise estimates of extreme quantiles, thereby improving the computation and reliability of risk metrics for complex financial risk management.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.12446
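    The tail-modeling idea described in item 17 can be illustrated outside the reinforcement-learning setting. The Python sketch below fits a Generalized Pareto Distribution to losses above a threshold and extrapolates an extreme quantile (VaR); the simulated losses and the 95% threshold are arbitrary choices, and this is not the EX-DRL training loop:

        # Illustrative sketch, not the EX-DRL algorithm: fit a Generalized Pareto
        # Distribution to tail exceedances and extrapolate an extreme VaR.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        losses = stats.t.rvs(df=3, size=5_000, random_state=rng)  # heavy-tailed placeholder

        u = np.quantile(losses, 0.95)           # tail threshold (arbitrary choice here)
        exceedances = losses[losses > u] - u    # peaks over threshold
        shape, _, scale = stats.genpareto.fit(exceedances, floc=0.0)

        def var_gpd(alpha):
            # Peaks-over-threshold quantile formula.
            p_tail = np.mean(losses > u)        # empirical probability of exceeding u
            return u + stats.genpareto.ppf(1 - (1 - alpha) / p_tail, shape, loc=0.0, scale=scale)

        alpha = 0.999
        print(f"empirical VaR_{alpha}: {np.quantile(losses, alpha):.3f}")
        print(f"GPD-based VaR_{alpha}: {var_gpd(alpha):.3f}")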
  18. By: S\"oren Bettels; Stefan Weber
    Abstract: Distortion risk measures play a critical role in quantifying risks associated with uncertain outcomes. Accurately estimating these risk measures in the context of computationally expensive simulation models that lack analytical tractability is fundamental to effective risk management and decision making. In this paper, we propose an efficient importance sampling method for distortion risk measures in such models that reduces the computational cost through machine learning. We demonstrate the applicability and efficiency of the Monte Carlo method in numerical experiments on various distortion risk measures and models.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.02401
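    A plain Python sketch of the quantities in item 18, using the CVaR distortion, standard-normal losses, and a fixed mean-shifted proposal; the paper's machine-learning step for constructing the proposal is not shown, and the estimator below is the standard weighted L-statistic form of a distortion risk measure:

        # Sketch of plain Monte Carlo vs. importance sampling for a distortion risk
        # measure (CVaR distortion); placeholder loss model and proposal shift.
        import numpy as np
        from scipy import stats

        alpha = 0.99
        g = lambda u: np.minimum(u / (1 - alpha), 1.0)       # CVaR distortion function

        def distortion_estimate(samples, weights):
            # rho_g = sum_i x_(i) * [g(S_{i-1}) - g(S_i)], with S_i the (weighted)
            # estimated probability of exceeding the i-th order statistic.
            order = np.argsort(samples)
            x, w = samples[order], weights[order] / weights.sum()
            s = np.concatenate(([1.0], 1.0 - np.cumsum(w)))  # S_0 = 1, ..., S_n = 0
            return np.sum(x * (g(s[:-1]) - g(s[1:])))

        rng = np.random.default_rng(0)
        n = 20_000

        # Plain Monte Carlo from the loss distribution (standard normal placeholder).
        x_mc = rng.normal(size=n)
        rho_mc = distortion_estimate(x_mc, np.ones(n))

        # Importance sampling: shift the proposal into the tail and reweight by f/q.
        shift = 2.5
        x_is = rng.normal(loc=shift, size=n)
        w_is = stats.norm.pdf(x_is) / stats.norm.pdf(x_is, loc=shift)
        rho_is = distortion_estimate(x_is, w_is)

        print(f"closed-form CVaR_{alpha}: {stats.norm.pdf(stats.norm.ppf(alpha)) / (1 - alpha):.4f}")
        print(f"plain Monte Carlo:        {rho_mc:.4f}")
        print(f"importance sampling:      {rho_is:.4f}")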
  19. By: Cameron Cornell; Lewis Mitchell; Matthew Roughan
    Abstract: Financial networks can be constructed using statistical dependencies found within the price series of speculative assets. Across the various methods used to infer these networks, there is a general reliance on predictive modelling to capture cross-correlation effects. These methods usually model the flow of mean-response information, or the propagation of volatility and risk within the market. Such techniques, though insightful, do not fully capture the broader distribution-level causality that is possible within speculative markets. This paper introduces a novel approach, combining quantile regression with a piecewise linear embedding scheme, allowing us to construct causality networks that identify the complex tail interactions inherent to financial markets. Applying this method to 260 cryptocurrency return series, we uncover significant tail-tail causal effects and substantial causal asymmetry. We identify a propensity for coins to be self-influencing, with comparatively sparse cross-variable effects. Assessing all link types in conjunction, Bitcoin stands out as the primary influencer, a nuance that is missed in conventional linear mean-response analyses. Our findings introduce a comprehensive framework for modelling distributional causality, paving the way towards more holistic representations of causality in financial markets.
    Date: 2024–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2408.12210
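    The basic building block in item 19 — tail-specific regression effects — can be illustrated with a plain quantile regression. The Python sketch below uses simulated returns in which asset A's lower tail reacts to asset B's lagged losses; it is not the authors' piecewise-linear embedding, and the data-generating process is an arbitrary assumption:

        # Simulated returns, not the authors' method: a quantile-regression check of
        # tail (Granger-type) influence from asset B to asset A.
        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(0)
        T = 2_000
        ret_b = rng.standard_t(df=4, size=T)
        lag_b = ret_b[:-1]
        # Assumed DGP: A's lower tail reacts to B's lagged losses.
        ret_a = 0.05 * lag_b + np.where(lag_b < -1.5, -0.5, 0.0) + rng.standard_t(df=4, size=T - 1)

        df = pd.DataFrame({"ret_a": ret_a, "lag_b": lag_b})

        # Compare the lagged-B coefficient at the median and deep in the lower tail.
        for q in (0.50, 0.05):
            fit = smf.quantreg("ret_a ~ lag_b", df).fit(q=q)
            print(f"tau = {q:.2f}: coef = {fit.params['lag_b']:.3f}, p = {fit.pvalues['lag_b']:.4f}")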

This nep-ecm issue is ©2024 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.