nep-ecm New Economics Papers
on Econometrics
Issue of 2025–03–24
twenty-two papers chosen by
Sune Karlsson, Örebro universitet


  1. Self-Normalized Inference in (Quantile, Expected Shortfall) Regressions for Time Series By Yannick Hoga; Christian Schulz
  2. Residualised Treatment Intensity and the Estimation of Average Partial Effects By Julius Schäper
  3. Empirical likelihood approach for high-dimensional moment restrictions with dependent data By Jinyuan Chang; Qiao Hu; Zhentao Shi; Jia Zhang
  4. Regression Modeling of the Count Relational Data with Exchangeable Dependencies By Wenqin Du; Bailey K. Fosdick; Wen Zhou
  5. Semiparametric Triple Difference Estimators By Sina Akbari; Negar Kiyavash; AmirEmad Ghassami
  6. Balancing Flexibility and Interpretability: A Conditional Linear Model Estimation via Random Forest By Ricardo Masini; Marcelo Medeiros
  7. Causal Inference for Qualitative Outcomes By Riccardo Di Francesco; Giovanni Mellace
  8. Binary Outcome Models with Extreme Covariates: Estimation and Prediction By Laura Liu; Yulong Wang
  9. Time-Varying Identification of Structural Vector Autoregressions By Annika Camehl; Tomasz Woźniak
  10. Robust Inference for the Direct Average Treatment Effect with Treatment Assignment Interference By Matias D. Cattaneo; Yihan He; Ruiqi Yu
  11. Conditional Triple Difference-in-Differences By Dor Leventer
  12. Triple Difference Designs with Heterogeneous Treatment Effects By Laura Caron
  13. Functional Network Autoregressive Models for Panel Data By Tomohiro Ando; Tadao Hoshino
  14. A Supervised Screening and Regularized Factor-Based Method for Time Series Forecasting By Sihan Tu; Zhaoxing Gao
  15. On (in)consistency of M-estimators under contamination By Jens Klooster; Bent Nielsen
  16. Gradients can train reward models: An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model By Enoch H. Kang; Hema Yoganarasimhan; Lalit Jain
  17. Enhancing External Validity of Experiments with Ongoing Sampling By Chen Wang; Shichao Han; Shan Huang
  18. Tensor dynamic conditional correlation model: A new way to pursuit "Holy Grail of investing" By Cheng Yu; Zhoufan Zhu; Ke Zhu
  19. Modifying Final Splits of Classification Tree for Fine-tuning Subpopulation Target in Policy Making By Lei Bill Wang; Zhenbang Jiao; Fangyi Wang
  20. Generalized Factor Neural Network Model for High-dimensional Regression By Zichuan Guo; Mihai Cucuringu; Alexander Y. Shestopaloff
  21. Clustered Network Connectedness: A New Measurement Framework with Application to Global Equity Markets By Bastien Buchwalter; Francis X. Diebold; Kamil Yilmaz
  22. Machine Learning for Propensity Score Estimation: A Systematic Review and Reporting Guidelines By Leite, Walter; Zhang, Huibin; Collier, Zachary; Chawla, Kamal; l.kong@ufl.edu; Lee, Yongseok; Quan, Jia; Soyoye, Olushola

  1. By: Yannick Hoga; Christian Schulz
    Abstract: This paper is the first to propose valid inference tools, based on self-normalization, in time series expected shortfall regressions. In doing so, we propose a novel two-step estimator for expected shortfall regressions that is based on convex optimization in both steps (rendering computation easy) and only requires minimizing quantile losses and squared error losses (both of which are implemented in every standard statistical computing package). As a corollary, we also derive self-normalized inference tools in time series quantile regressions. Extant methods, based on a bootstrap or direct estimation of the long-run variance, are computationally more involved, require the choice of tuning parameters and have serious size distortions when the regression errors are strongly serially dependent. In contrast, our inference tools only require estimates of the quantile regression parameters that are computed on an expanding window and are correctly sized. Simulations show the advantageous finite-sample properties of our methods. Finally, two applications to stock return predictability and to Growth-at-Risk demonstrate the practical usefulness of the developed inference tools.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.10065
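    Illustration: A minimal Python sketch of a generic two-step (quantile, expected shortfall) regression of the kind the abstract describes: a quantile regression in the first step and a squared-error regression of a generated ES response in the second. The function name, the generated-response construction, and the simulated data are illustrative assumptions, not taken from the paper, which additionally develops self-normalized inference on expanding windows.
      import numpy as np
      import statsmodels.api as sm

      def two_step_es_regression(y, X, alpha=0.05):
          """Illustrative two-step (quantile, expected shortfall) regression.

          Step 1: quantile regression of y on X at level alpha (quantile loss).
          Step 2: OLS of a generated ES response on X (squared error loss),
          one common construction consistent with the abstract's description.
          """
          Xc = sm.add_constant(X)
          # Step 1: quantile regression at level alpha
          q_fit = sm.QuantReg(y, Xc).fit(q=alpha)
          q_hat = q_fit.predict(Xc)
          # Step 2: generated response whose conditional mean is the lower-tail alpha-ES
          z = q_hat + (y - q_hat) * (y <= q_hat) / alpha
          es_fit = sm.OLS(z, Xc).fit()
          return q_fit.params, es_fit.params

      # Example with simulated data
      rng = np.random.default_rng(0)
      X = rng.normal(size=(500, 1))
      y = 0.5 * X[:, 0] + rng.standard_t(df=5, size=500)
      beta_q, beta_es = two_step_es_regression(y, X, alpha=0.05)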
  2. By: Julius Schäper
    Abstract: This paper introduces R-OLS, an estimator for the average partial effect (APE) of a continuous treatment variable on an outcome variable in the presence of non-linear and non-additively separable confounding of unknown form. Identification of the APE is achieved by generalising Stein's Lemma (Stein, 1981), leveraging an exogenous error component in the treatment along with a flexible functional relationship between the treatment and the confounders. The identification results for R-OLS are used to characterize the properties of Double/Debiased Machine Learning (Chernozhukov et al., 2018), specifying the conditions under which the APE is estimated consistently. A novel decomposition of the ordinary least squares estimand provides intuition for these results. Monte Carlo simulations demonstrate that the proposed estimator outperforms existing methods, delivering accurate estimates of the true APE and exhibiting robustness to moderate violations of its underlying assumptions. The methodology is further illustrated through an empirical application to Fetzer (2019).
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.10301
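    Illustration: A hedged Python sketch of a residualised-treatment estimator in the spirit of R-OLS: partial out the confounders from the treatment with a flexible learner, then regress the outcome on the treatment residual by OLS. The learner, tuning values, and simulated data are assumptions for illustration, not the paper's exact estimator.
      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      def residualised_ols(y, d, Z):
          """Partial out confounders Z from the treatment d with a flexible
          learner, then regress y on the treatment residual by OLS.
          Illustrative sketch only, not the paper's exact R-OLS construction."""
          forest = RandomForestRegressor(n_estimators=200, random_state=0)
          forest.fit(Z, d)
          d_res = d - forest.predict(Z)                    # exogenous error component of the treatment
          beta = np.dot(d_res, y) / np.dot(d_res, d_res)   # OLS slope on the residual
          return beta

      rng = np.random.default_rng(1)
      Z = rng.normal(size=(1000, 3))
      d = np.sin(Z[:, 0]) + Z[:, 1] ** 2 + rng.normal(size=1000)   # non-linear confounding
      y = 2.0 * d + np.exp(Z[:, 0]) + rng.normal(size=1000)
      print(residualised_ols(y, d, Z))                              # close to the true APE of 2.0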
  3. By: Jinyuan Chang; Qiao Hu; Zhentao Shi; Jia Zhang
    Abstract: Economic and financial models -- such as vector autoregressions, local projections, and multivariate volatility models -- feature complex dynamic interactions and spillovers across many time series. These models can be integrated into a unified framework, with high-dimensional parameters identified by moment conditions. As the number of parameters and moment conditions may surpass the sample size, we propose adding a double penalty to the empirical likelihood criterion to induce sparsity and facilitate dimension reduction. Notably, we utilize a marginal empirical likelihood approach despite temporal dependence in the data. Under regularity conditions, we provide asymptotic guarantees for our method, making it an attractive option for estimating large-scale multivariate time series models. We demonstrate the versatility of our procedure through extensive Monte Carlo simulations and three empirical applications, including analyses of US sectoral inflation rates, fiscal multipliers, and volatility spillover in China's banking sector.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.18970
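    Illustration: The kind of doubly penalized criterion the abstract describes can be pictured with a generic dual-form penalized empirical likelihood objective; the display below is a standard construction from the penalized-EL literature and is only a hedged illustration, not necessarily the authors' exact marginal-EL criterion.
      $$ \hat{\boldsymbol\theta} = \arg\min_{\boldsymbol\theta}\, \max_{\boldsymbol\lambda}\, \Big\{ \sum_{t=1}^{n} \log\big(1 + \boldsymbol\lambda^{\top} \mathbf{g}(\mathbf{z}_t; \boldsymbol\theta)\big) - n \sum_{j} P_{\nu}(|\lambda_j|) + n \sum_{k} P_{\tau}(|\theta_k|) \Big\}, $$
    where $\mathbf{g}(\cdot;\boldsymbol\theta)$ stacks the moment functions and $P_{\nu}$, $P_{\tau}$ are sparsity-inducing penalties (e.g., lasso or SCAD): the penalty on the Lagrange multipliers $\lambda_j$ selects informative moment conditions, while the penalty on the parameters $\theta_k$ induces sparsity in the high-dimensional parameter vector.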
  4. By: Wenqin Du; Bailey K. Fosdick; Wen Zhou
    Abstract: Relational data characterized by directed edges with count measurements are common in social science. Most existing methods either assume the count edges are derived from continuous random variables or model the edge dependency by parametric distributions. In this paper, we develop a latent multiplicative Poisson model for relational data with count edges. Our approach directly models the edge dependency of count data by the pairwise dependence of latent errors, which are assumed to be weakly exchangeable. This assumption not only covers a variety of common network effects, but also leads to a concise representation of the error covariance. In addition, the identification and inference of the mean structure, as well as the regression coefficients, depend on the errors only through their covariance. Such a formulation provides substantial flexibility for our model. Based on this, we propose a pseudo-likelihood based estimator for the regression coefficients, demonstrating its consistency and asymptotic normality. The newly suggested method is applied to a food-sharing network, revealing interesting network effects in gift exchange behaviors.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.11255
  5. By: Sina Akbari; Negar Kiyavash; AmirEmad Ghassami
    Abstract: The triple difference causal inference framework is an extension of the well-known difference-in-differences framework. It relaxes the parallel trends assumption of the difference-in-differences framework by leveraging data from an auxiliary domain. Despite being commonly applied in empirical research, the triple difference framework has received relatively limited attention in the statistics literature. Specifically, investigating the intricacies of identification and the design of robust and efficient estimators for this framework has remained largely unexplored. This work aims to address these gaps in the literature. From the identification standpoint, we present outcome regression and weighting methods to identify the average treatment effect on the treated in both panel data and repeated cross-section settings. For the latter, we relax the commonly made assumption of time-invariant covariates. From the estimation perspective, we consider semiparametric estimators for the triple difference framework in both panel data and repeated cross-section settings. We demonstrate that our proposed estimators are doubly robust.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.19788
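    Illustration: For orientation, a minimal Python sketch of the textbook regression version of the triple difference estimand, where the coefficient on the three-way interaction of treated domain, eligible subgroup, and post period is the DDD estimate. The simulated data are illustrative; the paper's semiparametric doubly robust estimators go well beyond this specification.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      # Simulated repeated cross-section: treated domain (g), eligible subgroup (e), post period (t)
      rng = np.random.default_rng(2)
      n = 4000
      df = pd.DataFrame({
          "g": rng.integers(0, 2, n),   # domain exposed to the policy
          "e": rng.integers(0, 2, n),   # eligible subgroup within the domain
          "t": rng.integers(0, 2, n),   # post-treatment period
      })
      effect = 1.5
      df["y"] = (0.5 * df.g + 0.3 * df.e + 0.2 * df.t
                 + 0.4 * df.g * df.t + 0.1 * df.e * df.t
                 + effect * df.g * df.e * df.t
                 + rng.normal(size=n))

      # The coefficient on g:e:t is the triple-difference estimate of the ATT
      ddd = smf.ols("y ~ g * e * t", data=df).fit(cov_type="HC1")
      print(ddd.params["g:e:t"])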
  6. By: Ricardo Masini; Marcelo Medeiros
    Abstract: Traditional parametric econometric models often rely on rigid functional forms, while nonparametric techniques, despite their flexibility, frequently lack interpretability. This paper proposes a parsimonious alternative by modeling the outcome $Y$ as a linear function of a vector of variables of interest $\boldsymbol{X}$, conditional on additional covariates $\boldsymbol{Z}$. Specifically, the conditional expectation is expressed as $\mathbb{E}[Y|\boldsymbol{X}, \boldsymbol{Z}]=\boldsymbol{X}^{T}\boldsymbol{\beta}(\boldsymbol{Z})$, where $\boldsymbol{\beta}(\cdot)$ is an unknown Lipschitz-continuous function. We introduce an adaptation of the Random Forest (RF) algorithm to estimate this model, balancing the flexibility of machine learning methods with the interpretability of traditional linear models. This approach addresses a key challenge in applied econometrics by accommodating heterogeneity in the relationship between covariates and outcomes. Furthermore, the heterogeneous partial effects of $\boldsymbol{X}$ on $Y$ are represented by $\boldsymbol{\beta}(\cdot)$ and can be directly estimated using our proposed method. Our framework effectively unifies established parametric and nonparametric models, including varying-coefficient, switching regression, and additive models. We provide theoretical guarantees, such as pointwise and $L^p$-norm rates of convergence for the estimator, and establish a pointwise central limit theorem through subsampling, aiding inference on the function $\boldsymbol\beta(\cdot)$. We present Monte Carlo simulation results to assess the finite-sample performance of the method.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.13438
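    Illustration: A hedged Python sketch of one way to estimate $\boldsymbol\beta(z_0)$ in $\mathbb{E}[Y|X, Z] = X^{T}\boldsymbol\beta(Z)$ with a forest: grow a forest on $Z$, use leaf co-membership with the target point as kernel weights, and run weighted least squares of $Y$ on $X$. This is an illustrative adaptation under assumed tuning choices, not the authors' exact algorithm or its inference theory.
      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      def forest_weighted_beta(y, X, Z, z0, n_trees=200):
          """Estimate beta(z0) by locally weighted least squares, with kernel
          weights from leaf co-membership of a forest grown on Z (illustrative)."""
          forest = RandomForestRegressor(n_estimators=n_trees, min_samples_leaf=20,
                                         random_state=0)
          forest.fit(Z, y)
          leaves = forest.apply(Z)                        # (n, n_trees) leaf ids of training points
          leaves0 = forest.apply(np.atleast_2d(z0))       # leaf ids of the target point z0
          w = (leaves == leaves0).mean(axis=1)            # co-membership frequency as weights
          Xc = np.column_stack([np.ones(len(y)), X])      # intercept plus X
          Xw = Xc * w[:, None]
          beta = np.linalg.solve(Xw.T @ Xc, Xw.T @ y)     # weighted least squares
          return beta                                     # beta[1:] are the local partial effects at z0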
  7. By: Riccardo Di Francesco; Giovanni Mellace
    Abstract: Causal inference methods such as instrumental variables, regression discontinuity, and difference-in-differences are widely used to estimate treatment effects. However, their application to qualitative outcomes poses fundamental challenges, as standard causal estimands are ill-defined in this context. This paper highlights these issues and introduces an alternative framework that focuses on well-defined and interpretable estimands that quantify how treatment affects the probability distribution over outcome categories. We show that standard identification assumptions suffice to identify these estimands and propose simple, intuitive estimation strategies that remain fully compatible with conventional econometric methods. To facilitate implementation, we provide an open-source R package, $\texttt{causalQual}$, which is publicly available on GitHub.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.11691
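    Illustration: A minimal Python sketch of the kind of estimand the abstract targets in the simplest randomized-assignment case: the treatment effect on the probability of each outcome category, estimated by differences in category frequencies. Function name and simulated data are illustrative; the paper's causalQual package also covers instrumental variables, regression discontinuity, and difference-in-differences designs.
      import numpy as np
      import pandas as pd

      def category_probability_effects(outcome, treated):
          """Difference in the probability of each outcome category between
          treated and control units (randomized-assignment case only)."""
          df = pd.DataFrame({"y": outcome, "d": treated})
          p_treat = df.loc[df.d == 1, "y"].value_counts(normalize=True)
          p_ctrl = df.loc[df.d == 0, "y"].value_counts(normalize=True)
          return (p_treat - p_ctrl).fillna(0.0).sort_index()

      # Illustrative data: treatment shifts probability mass from "low" toward "high"
      rng = np.random.default_rng(3)
      d = rng.integers(0, 2, 1000)
      cats = np.array(["low", "medium", "high"])
      y = np.array([rng.choice(cats, p=([0.2, 0.3, 0.5] if di == 1 else [0.4, 0.4, 0.2]))
                    for di in d])
      print(category_probability_effects(y, d))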
  8. By: Laura Liu; Yulong Wang
    Abstract: This paper presents a novel semiparametric method to study the effects of extreme events on binary outcomes and subsequently forecast future outcomes. Our approach, based on Bayes' theorem and regularly varying (RV) functions, facilitates a Pareto approximation in the tail without imposing parametric assumptions beyond the tail. We analyze cross-sectional as well as static and dynamic panel data models, incorporate additional covariates, and accommodate the unobserved unit-specific tail thickness and RV functions in panel data. We establish consistency and asymptotic normality of our tail estimator, and show that our objective function converges to that of a panel Logit regression on tail observations with the log extreme covariate as a regressor, thereby simplifying implementation. The empirical application assesses whether small banks become riskier when local housing prices sharply decline, a crucial channel in the 2007--2008 financial crisis.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.16041
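    Illustration: The abstract notes that the objective function converges to that of a logit regression on tail observations with the log extreme covariate as a regressor. A hedged Python sketch of that simplified implementation in the cross-sectional case is below; the threshold choice and function name are illustrative assumptions, and the paper's panel extensions and tail-index theory are not reproduced here.
      import numpy as np
      import statsmodels.api as sm

      def tail_logit(y, x, covariates=None, tail_quantile=0.95):
          """Logit on tail observations with log(x) as the regressor, the
          simplified implementation suggested by the abstract (cross-sectional
          case; x must be positive in the selected tail; threshold is illustrative)."""
          thresh = np.quantile(x, tail_quantile)
          tail = x > thresh
          regressors = np.log(x[tail]).reshape(-1, 1)
          if covariates is not None:
              regressors = np.column_stack([regressors, covariates[tail]])
          return sm.Logit(y[tail], sm.add_constant(regressors)).fit(disp=0)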
  9. By: Annika Camehl (Erasmus University Rotterdam); Tomasz Woźniak (University of Melbourne)
    Abstract: We propose a novel Bayesian heteroskedastic Markov-switching structural vector autoregression with data-driven time-varying identification. The model selects among alternative patterns of exclusion restrictions to identify structural shocks within the Markov process regimes. We implement the selection through a multinomial prior distribution over these patterns, which amounts to a spike-and-slab prior for the individual parameters. By combining a Markov-switching structural matrix with heteroskedastic structural shocks following a stochastic volatility process, the model enables shock identification through time-varying volatility within a regime. As a result, the exclusion restrictions become over-identifying, and their selection is driven by the signal from the data. Our empirical application shows that the data support time variation in the US monetary policy shock identification. We also verify that time-varying volatility identifies the monetary policy shock within the regimes.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.19659
  10. By: Matias D. Cattaneo; Yihan He; Ruiqi Yu
    Abstract: Uncertainty quantification in causal inference settings with random network interference is a challenging open problem. We study the large sample distributional properties of the classical difference-in-means Hajek treatment effect estimator, and propose a robust inference procedure for the (conditional) direct average treatment effect, allowing for cross-unit interference in both the outcome and treatment equations. Leveraging ideas from statistical physics, we introduce a novel Ising model capturing interference in the treatment assignment, and then obtain three main results. First, we establish a Berry-Esseen distributional approximation pointwise in the degree of interference generated by the Ising model. Our distributional approximation recovers known results in the literature under no-interference in treatment assignment, and also highlights a fundamental fragility of inference procedures developed using such a pointwise approximation. Second, we establish a uniform distributional approximation for the Hajek estimator, and develop robust inference procedures that remain valid regardless of the unknown degree of interference in the Ising model. Third, we propose a novel resampling method for implementing the robust inference procedures. A key technical innovation underlying our work is a new De-Finetti Machine that facilitates conditional i.i.d. Gaussianization, a technique that may be of independent interest in other settings.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.13238
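    Illustration: For reference, a minimal Python sketch of the classical Hajek (self-normalized inverse-probability) point estimator of the direct average treatment effect studied in the paper; the paper's contribution is distribution theory and robust inference under treatment-assignment interference, which this sketch does not implement.
      import numpy as np

      def hajek_estimator(y, d, pi):
          """Hajek estimator with known assignment probabilities pi; reduces to a
          simple difference in means when pi is constant across units."""
          w1 = d / pi
          w0 = (1 - d) / (1 - pi)
          return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)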
  11. By: Dor Leventer
    Abstract: Triple difference-in-differences (TDID) designs are widely used in empirical research to estimate causal effects. In practice, most implementations rely on a specification with controls. However, we show that such approaches introduce bias due to differences in covariate distributions across groups. To address this issue, we propose a re-weighted estimator that correctly identifies a causal estimand of interest by aligning covariate distributions across groups. For estimation, we develop a doubly robust approach. An R package is provided for general use.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.16126
  12. By: Laura Caron
    Abstract: Triple difference designs have become increasingly popular in empirical economics. The advantage of a triple difference design is that, within the treatment group, it allows another subgroup of the population -- potentially less affected by the treatment -- to serve as a control for the subgroup of interest. While the literature on difference-in-differences has discussed heterogeneity in treatment effects between treated and control groups or over time, little attention has been given to the implications of heterogeneity in treatment effects between subgroups. In this paper, I show that the interpretation of the usual triple difference parameter of interest, the difference in average treatment effects on the treated between subgroups, may be affected by this kind of heterogeneity. I propose a new parameter of interest, the causal difference in average treatment effects on the treated, which makes causal comparisons between subgroups. I discuss assumptions for identification and derive the semiparametric efficiency bounds for this parameter. I then propose doubly robust, efficient estimators for this parameter. I use a simulation study to highlight the desirable finite-sample properties of these estimators, as well as to show the difference between this parameter and the usual triple difference parameter of interest. An empirical application shows the importance of considering treatment effect heterogeneity in practical applications.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.19620
  13. By: Tomohiro Ando; Tadao Hoshino
    Abstract: This study proposes a novel functional vector autoregressive framework for analyzing network interactions of functional outcomes in panel data settings. In this framework, an individual's outcome function is influenced by the outcomes of others through a simultaneous equation system. To estimate the functional parameters of interest, we need to address the endogeneity issue arising from these simultaneous interactions among outcome functions. This issue is carefully handled by developing a novel functional moment-based estimator. We establish the consistency, convergence rate, and pointwise asymptotic normality of the proposed estimator. Additionally, we discuss the estimation of marginal effects and impulse response analysis. As an empirical illustration, we analyze the demand for a bike-sharing service in the U.S. The results reveal statistically significant spatial interactions in bike availability across stations, with interaction patterns varying over the time of day.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.13431
  14. By: Sihan Tu; Zhaoxing Gao
    Abstract: Factor-based forecasting using Principal Component Analysis (PCA) is an effective machine learning tool for dimension reduction with many applications in statistics, economics, and finance. This paper introduces a Supervised Screening and Regularized Factor-based (SSRF) framework that systematically addresses high-dimensional predictor sets through a structured four-step procedure integrating both static and dynamic forecasting mechanisms. The static approach selects predictors via marginal correlation screening and scales them using univariate predictive slopes, while the dynamic method screens and scales predictors based on time series regression incorporating lagged predictors. PCA then extracts latent factors from the scaled predictors, followed by LASSO regularization to refine predictive accuracy. In the simulation study, we validate the effectiveness of SSRF and identify its parameter adjustment strategies in high-dimensional data settings. An empirical analysis of macroeconomic indices in China demonstrates that the SSRF method generally outperforms several commonly used forecasting techniques in out-of-sample predictions.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.15275
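    Illustration: A hedged Python sketch of the static variant of the four-step pipeline the abstract describes: marginal-correlation screening, scaling by univariate predictive slopes, PCA factor extraction, and LASSO on the factors. The function name and all tuning choices (number of predictors kept, number of factors, cross-validation) are assumptions for illustration, not the paper's settings.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.linear_model import LassoCV

      def ssrf_static_forecast(X, y, X_new, n_keep=50, n_factors=5):
          """Sketch of a static SSRF-style pipeline: screen, scale, factorize, regularize."""
          # (1) marginal correlation screening
          corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
          keep = np.argsort(-np.abs(corrs))[:n_keep]
          # (2) scale kept predictors by univariate OLS slopes
          slopes = np.array([np.cov(X[:, j], y)[0, 1] / np.var(X[:, j], ddof=1) for j in keep])
          Xs = X[:, keep] * slopes
          # (3) extract principal-component factors from the scaled predictors
          pca = PCA(n_components=n_factors).fit(Xs)
          F = pca.transform(Xs)
          # (4) LASSO regression of y on the factors, then forecast
          lasso = LassoCV(cv=5).fit(F, y)
          F_new = pca.transform(X_new[:, keep] * slopes)
          return lasso.predict(F_new)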
  15. By: Jens Klooster; Bent Nielsen
    Abstract: We consider robust location-scale estimators under contamination. We show that commonly used robust estimators such as the median and the Huber estimator are inconsistent under asymmetric contamination, while the Tukey estimator is consistent. In order to make nuisance-parameter-free inference based on the Tukey estimator, a consistent scale estimator is required. However, standard robust scale estimators such as the interquartile range and the median absolute deviation are inconsistent under contamination.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.09145
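    Illustration: A quick Python simulation illustrating the phenomenon the abstract describes: under asymmetric contamination, the median and the Huber location estimate drift away from the target, while the redescending Tukey biweight estimate stays close to it. The contamination fractions and distributions are illustrative assumptions, not the paper's formal setting.
      import numpy as np
      import statsmodels.api as sm
      from statsmodels.robust import norms

      rng = np.random.default_rng(4)
      n, eps = 100_000, 0.10
      # Asymmetric contamination: 10% of observations pushed far to the right
      clean = rng.normal(0.0, 1.0, n)
      contam = np.where(rng.random(n) < eps, rng.normal(8.0, 0.5, n), clean)

      ones = np.ones((n, 1))
      median = np.median(contam)
      huber = sm.RLM(contam, ones, M=norms.HuberT()).fit().params[0]
      tukey = sm.RLM(contam, ones, M=norms.TukeyBiweight()).fit().params[0]
      print(f"median={median:.3f}  Huber={huber:.3f}  Tukey={tukey:.3f}")
      # median and Huber are biased away from 0; the Tukey estimate remains close to 0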
  16. By: Enoch H. Kang; Hema Yoganarasimhan; Lalit Jain
    Abstract: We study the problem of estimating Dynamic Discrete Choice (DDC) models, also known as offline Maximum Entropy-Regularized Inverse Reinforcement Learning (offline MaxEnt-IRL) in machine learning. The objective is to recover reward or $Q^*$ functions that govern agent behavior from offline behavior data. In this paper, we propose a globally convergent gradient-based method for solving these problems without the restrictive assumption of linearly parameterized rewards. The novelty of our approach lies in introducing the Empirical Risk Minimization (ERM) based IRL/DDC framework, which circumvents the need for explicit state transition probability estimation in the Bellman equation. Furthermore, our method is compatible with non-parametric estimation techniques such as neural networks. Therefore, the proposed method has the potential to be scaled to high-dimensional, infinite state spaces. A key theoretical insight underlying our approach is that the Bellman residual satisfies the Polyak-Lojasiewicz (PL) condition -- a property that, while weaker than strong convexity, is sufficient to ensure fast global convergence guarantees. Through a series of synthetic experiments, we demonstrate that our approach consistently outperforms benchmark methods and state-of-the-art alternatives.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.14131
  17. By: Chen Wang; Shichao Han; Shan Huang
    Abstract: Participants in online experiments often enroll over time, which can compromise sample representativeness due to temporal shifts in covariates. This issue is particularly critical in A/B tests, online controlled experiments extensively used to evaluate product updates, since these tests are cost-sensitive and typically short in duration. We propose a novel framework that dynamically assesses sample representativeness by dividing the ongoing sampling process into three stages. We then develop stage-specific estimators for Population Average Treatment Effects (PATE), ensuring that experimental results remain generalizable across varying experiment durations. Leveraging survival analysis, we develop a heuristic function that identifies these stages without requiring prior knowledge of population or sample characteristics, thereby keeping implementation costs low. Our approach bridges the gap between experimental findings and real-world applicability, enabling product decisions to be based on evidence that accurately represents the broader target population. We validate the effectiveness of our framework on three levels: (1) through a real-world online experiment conducted on WeChat; (2) via a synthetic experiment; and (3) by applying it to 600 A/B tests on WeChat in a platform-wide application. Additionally, we provide practical guidelines for practitioners to implement our method in real-world settings.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.18253
  18. By: Cheng Yu; Zhoufan Zhu; Ke Zhu
    Abstract: Style investing creates asset classes (or the so-called "styles") with low correlations, aligning well with the "Holy Grail of investing" principle of portfolio selection. The returns of styles naturally form a tensor-valued time series, which calls for new tools to study the dynamics of the conditional correlation matrix in support of this principle. Towards this goal, we introduce a new tensor dynamic conditional correlation (TDCC) model, which is based on two novel treatments: trace-normalization and dimension-normalization. These two normalizations adapt to the tensor nature of the data, and they are necessary except when the tensor data reduce to vector data. Moreover, we provide an easy-to-implement estimation procedure for the TDCC model, and examine its finite sample performance by simulations. Finally, we assess the usefulness of the TDCC model in international portfolio selection across ten global markets and in large portfolio selection for 1800 stocks from the Chinese stock market.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.13461
  19. By: Lei Bill Wang; Zhenbang Jiao; Fangyi Wang
    Abstract: Policymakers often use Classification and Regression Trees (CART) to partition populations based on binary outcomes and target subpopulations whose probability of the binary event exceeds a threshold. However, classic CART and the knowledge distillation method whose student model is a CART (referred to as KD-CART) do not minimize the misclassification risk associated with classifying the latent probabilities of these binary events. To reduce the misclassification risk, we propose two methods, Penalized Final Split (PFS) and Maximizing Distance Final Split (MDFS). PFS incorporates a tunable penalty into the standard CART splitting criterion function. MDFS maximizes a weighted sum of distances between node means and the threshold. It can point-identify the optimal split under the unique intersect latent probability assumption. In addition, we develop a theoretical result for MDFS splitting-rule estimation, which attains zero asymptotic risk. Through extensive simulation studies, we demonstrate that these methods predominantly outperform classic CART and KD-CART in terms of misclassification error. Furthermore, in our empirical evaluations, these methods provide deeper insights than the two baseline methods.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.15072
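    Illustration: A hedged Python sketch of one possible reading of the "maximizing distance final split" idea: at the final split, choose the cut point on a feature so that the two child means are pushed far from the policy threshold. The weighting, the minimum leaf size, and the exact criterion here are illustrative assumptions, not the paper's definition.
      import numpy as np

      def mdfs_final_split(x, y, threshold, min_leaf=20):
          """Pick the final split on feature x that maximizes a size-weighted sum of
          distances between the child means and the policy threshold (illustrative)."""
          order = np.argsort(x)
          x_sorted, y_sorted = x[order], y[order]
          best_cut, best_score = None, -np.inf
          for i in range(min_leaf, len(x) - min_leaf):
              left, right = y_sorted[:i], y_sorted[i:]
              score = (len(left) * abs(left.mean() - threshold)
                       + len(right) * abs(right.mean() - threshold))
              if score > best_score:
                  best_cut, best_score = (x_sorted[i - 1] + x_sorted[i]) / 2, score
          return best_cut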
  20. By: Zichuan Guo; Mihai Cucuringu; Alexander Y. Shestopaloff
    Abstract: We tackle the challenges of modeling high-dimensional data sets, particularly those with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression. It introduces PCA and Soft PCA layers, which can be embedded at any stage of a neural network architecture, allowing the model to alternate between factor modeling and non-linear transformations. This flexibility makes our method especially effective for processing hierarchical compositional data. We explore our own and other techniques for imposing low-rank structure on neural networks and examine how architectural design impacts model performance. The effectiveness of our method is demonstrated through simulation studies, as well as applications to forecasting future price movements of equity ETF indices and nowcasting with macroeconomic data.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.11310
  21. By: Bastien Buchwalter; Francis X. Diebold; Kamil Yilmaz
    Abstract: Network connections, both across and within markets, are central in countless economic contexts. In recent decades, a large literature has developed and applied flexible methods for measuring network connectedness and its evolution, based on variance decompositions from vector autoregressions (VARs), as in Diebold and Yilmaz (2014). Those VARs are, however, typically identified using full orthogonalization (Sims, 1980), or no orthogonalization (Koop, Pesaran, and Potter, 1996; Pesaran and Shin, 1998), which, although useful, are special and extreme cases of a more general framework that we develop in this paper. In particular, we allow network nodes to be connected in "clusters", such as asset classes, industries, regions, etc., where shocks are orthogonal across clusters (Sims style orthogonalized identification) but correlated within clusters (Koop-Pesaran-Potter-Shin style generalized identification), so that the ordering of network nodes is relevant across clusters but irrelevant within clusters. After developing the clustered connectedness framework, we apply it in a detailed empirical exploration of sixteen country equity markets spanning three global regions.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.15458
  22. By: Leite, Walter; Zhang, Huibin; Collier, Zachary; Chawla, Kamal; l.kong@ufl.edu; Lee, Yongseok (University of Florida); Quan, Jia; Soyoye, Olushola
    Abstract: Machine learning has become a common approach for estimating propensity scores for quasi-experimental research using matching, weighting, or stratification on the propensity score. This systematic review examined machine learning applications for propensity score estimation across different fields, such as health, education, social sciences, and business, over 40 years. The results show that the gradient boosting machine (GBM) is the most frequently used method, followed by random forest. Classification and regression trees (CART), neural networks, and the super learner were also used in more than five percent of studies. The most frequently used packages for propensity score estimation were twang, gbm, and randomForest in the R statistical software. The review identified many hyperparameter configurations used for machine learning methods. However, it also shows that hyperparameters, as well as critical steps of the propensity score analysis such as covariate balance evaluation, are frequently under-reported. A set of guidelines for reporting the use of machine learning for propensity score estimation is provided.
    Date: 2024–10–09
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:gmrk7_v1
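    Illustration: A hedged Python analogue of the gradient-boosting workflow the review discusses (the reviewed studies typically use the R packages twang, gbm, and randomForest): estimate propensity scores with boosting, form inverse-probability weights, and check covariate balance with a standardized mean difference. Hyperparameters here are illustrative and, per the review's guidelines, should be reported in applications.
      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier

      def gbm_propensity_weights(X, treated):
          """Propensity scores from gradient boosting and ATT-style weights
          (treated units get weight 1; controls get ps/(1-ps))."""
          gbm = GradientBoostingClassifier(n_estimators=500, learning_rate=0.01,
                                           max_depth=3, random_state=0)
          ps = gbm.fit(X, treated).predict_proba(X)[:, 1]
          w = np.where(treated == 1, 1.0, ps / (1.0 - ps))
          return ps, w

      def standardized_mean_difference(x, treated, w):
          """Weighted standardized mean difference for one covariate, a common
          covariate balance diagnostic after propensity score weighting."""
          m1 = np.average(x[treated == 1], weights=w[treated == 1])
          m0 = np.average(x[treated == 0], weights=w[treated == 0])
          s = np.sqrt((x[treated == 1].var() + x[treated == 0].var()) / 2)
          return (m1 - m0) / s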

This nep-ecm issue is ©2025 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.