|
on Econometrics |
| By: | Felix Weinhardt |
| Abstract: | Applied researchers commonly interpret coefficient movements across OLS and difference-in-differences specifications with varying controls as evidence of bias reduction or improved identification. This note shows that such interpretations generally fail under heterogeneous treatment effects. In this case, each specification estimates a differently weighted average treatment effect, where the weights depend on the set of included controls. Consequently, adding or removing controls can change the estimand even when the controls are irrelevant for the outcome conditional on treatment. The note further derives an augmented version of the canonical omitted variables bias formula that remains valid under heterogeneity by explicitly accounting for induced changes in weighting. |
| Keywords: | ordinary least squares, heterogeneous treatment effects, model specification and testing |
| JEL: | C31 C51 |
| Date: | 2026–04–22 |
| URL: | https://d.repec.org/n?u=RePEc:bdp:dpaper:0095 |
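The weighting argument in the abstract above can be illustrated with a small numerical sketch. It is not the note's own derivation: it uses the textbook result that OLS with a binary treatment and a saturated binary control weights group-specific effects by group size times treatment variance. The population, treatment shares, and effect sizes are hypothetical, and the control is constructed so that it does not shift untreated outcomes.

```python
import numpy as np

def ols_treatment_coef(y, d, controls=None):
    """Return the OLS coefficient on d from regressing y on [1, d, controls]."""
    cols = [np.ones_like(d, dtype=float), d.astype(float)]
    if controls is not None:
        cols.append(controls.astype(float))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# Hypothetical population: two equally sized groups with different
# treatment shares and different (noise-free) treatment effects.
n_per_group = 1000
shares = {0: 0.5, 1: 0.9}    # share treated in each group (assumed)
effects = {0: 1.0, 1: 3.0}   # treatment effect in each group (assumed)

group_parts, d_parts = [], []
for g in (0, 1):
    n_treated = int(round(shares[g] * n_per_group))
    group_parts.append(np.full(n_per_group, g))
    d_parts.append(np.r_[np.ones(n_treated), np.zeros(n_per_group - n_treated)])
group = np.concatenate(group_parts)
d = np.concatenate(d_parts)

# Untreated outcomes are identical (zero) in both groups, so the group
# dummy does not shift untreated outcomes; it only indexes effect size.
y = np.where(group == 1, effects[1], effects[0]) * d

b_short = ols_treatment_coef(y, d)                  # no controls
b_long = ols_treatment_coef(y, d, controls=group)   # with group dummy

# Closed-form weighted averages implied by each specification.
att = (500 * 1.0 + 900 * 3.0) / 1400                # treated-weighted
var_weighted = (250 * 1.0 + 90 * 3.0) / 340         # n * p * (1 - p) weights
print(b_short, att)
print(b_long, var_weighted)
```

In this constructed example the coefficient moves from about 2.29 to about 1.53 when the group dummy is added, while the unweighted average effect is 2.0. Neither specification is biased by omitted confounding here; the two regressions simply average the same group effects with different weights.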
| By: | Ohyun Kwon (School of Economics, Drexel University); Mario Larch (University of Bayreuth); Jangsu Yoon (Department of Economics, University of Kentucky); Yoto Yotov (School of Economics, Drexel University) |
| Abstract: | We implement an instrumental-variable Poisson pseudo-maximum likelihood estimator with high-dimensional fixed effects (IV-PPML-HDFE). To correct for incidental parameter bias, we use a split-panel jackknife (SPJ) routine with bootstrapped standard errors. Monte Carlo simulations across the three most common fixed-effect structures confirm that SPJ reduces the mean absolute bias by 42% and raises mean bootstrap confidence-interval coverage from 69% to 92%. We provide a robust and user-friendly ‘ivppmlhdfe’ package, and deploy it in three empirical applications to establish the validity and usefulness of our methods. |
| Keywords: | Poisson pseudo-maximum likelihood, instrumental variables, high-dimensional fixed effects, incidental parameter problem, gravity model, split-panel jackknife. |
| JEL: | C13 C23 C26 F14 |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:drx:wpaper:202611 |
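The split-panel jackknife idea can be sketched on a toy problem. The example below does not use or reproduce the ‘ivppmlhdfe’ package; it applies the generic half-panel correction (twice the full-panel estimate minus the average of the two half-panel estimates) to the classic fixed-effects variance estimator, whose incidental parameter bias is known in closed form. All function names and simulation settings are assumptions made for illustration.

```python
import numpy as np

def within_variance(panel):
    """Error-variance MLE with unit fixed effects; its bias is -sigma^2 / T."""
    demeaned = panel - panel.mean(axis=1, keepdims=True)
    return float(np.mean(demeaned ** 2))

def split_panel_jackknife(panel, estimator):
    """Half-panel jackknife: 2 * theta_full - mean of the two half-panel estimates."""
    n_periods = panel.shape[1]
    if n_periods % 2 != 0:
        raise ValueError("this sketch assumes an even number of periods")
    half = n_periods // 2
    theta_full = estimator(panel)
    theta_halves = 0.5 * (estimator(panel[:, :half]) + estimator(panel[:, half:]))
    return 2.0 * theta_full - theta_halves

# Simulated panel (hypothetical settings): N units, T periods, unit effects.
rng = np.random.default_rng(0)
N, T, sigma2 = 5000, 6, 1.0
alpha = rng.normal(size=(N, 1))
panel = alpha + rng.normal(scale=np.sqrt(sigma2), size=(N, T))

naive = within_variance(panel)                            # centred near (T-1)/T
corrected = split_panel_jackknife(panel, within_variance) # centred near sigma2
print(naive, corrected)
```

With T = 6 the uncorrected estimator is centred on 5/6 of the true variance, and the half-panel correction removes the 1/T term exactly in this toy model. In nonlinear models such as PPML the correction removes only the leading bias term, which is consistent with the abstract reporting a reduction in bias rather than its elimination.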
| By: | André L. S. Chagas (Department of Economics, University of São Paulo) |
| Abstract: | This paper develops specification tests for irregular network panels with time-varying and asymmetric interaction matrices. We decompose each matrix into symmetric and antisymmetric components and show that standard residual quadratic diagnostics, including Moran’s I and the LM-error statistic, are exactly invariant to the antisymmetric component. In contrast, lag-type bilinear diagnostics retain directional information. Building on this dichotomy, we derive a joint score limit for the decomposed lag alternative and propose conditional score tests for directional and contextual relevance. The directional LM statistic tests whether the antisymmetric component contributes information beyond symmetric exposure, while a second LM statistic provides the corresponding contextual test. The tests have standard chi-square limits under primitive conditions for irregular panels. Monte Carlo evidence shows that the decomposed tests control size and correctly attribute network propagation across contextual and directional channels, whereas conventional quadratic diagnostics are exactly direction-blind and conventional lag diagnostics conflate the two channels. |
| Keywords: | spatial econometrics; network panels; specification testing; asymmetric weight matrix; directed propagation; contextual dependence |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:ris:nereus:022454 |
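The invariance claim follows from a short piece of linear algebra: for any antisymmetric matrix A and vector e, the quadratic form e'Ae is zero, so e'We depends only on the symmetric part of W. The sketch below checks this numerically for a single cross-section with a random asymmetric weight matrix. It is an illustration under assumed inputs (random weights, random residuals), not the paper's panel test statistics.

```python
import numpy as np

def morans_i(e, W):
    """Moran's I for residuals e and weight matrix W: (n / S0) * e'We / e'e."""
    n = e.shape[0]
    return float(n / W.sum() * (e @ W @ e) / (e @ e))

def lm_error(e, W):
    """LM-error statistic: (e'We / sigma2)^2 / tr(W'W + W^2)."""
    sigma2 = (e @ e) / e.shape[0]
    trace_term = np.trace(W.T @ W + W @ W)
    return float(((e @ W @ e) / sigma2) ** 2 / trace_term)

rng = np.random.default_rng(42)
n = 30
W = rng.uniform(size=(n, n))     # hypothetical asymmetric interaction matrix
np.fill_diagonal(W, 0.0)

S = 0.5 * (W + W.T)              # symmetric component
A = 0.5 * (W - W.T)              # antisymmetric component, W = S + A

e = rng.normal(size=n)           # stand-in for regression residuals
y = rng.normal(size=n)           # stand-in for the outcome vector

# Quadratic diagnostics are unchanged when W is replaced by S.
print(morans_i(e, W), morans_i(e, S))
print(lm_error(e, W), lm_error(e, S))

# The bilinear form used by lag-type diagnostics still sees A.
print(e @ A @ e, e @ A @ y)
```

The two Moran's I values and the two LM-error values agree up to floating-point error, while the bilinear form e'Ay is generally nonzero, which is the sense in which lag-type diagnostics retain directional information.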
| By: | Burkhard Raunig (Oesterreichische Nationalbank, Economic Studies Division) |
| Abstract: | Directed acyclic graphs (DAGs) provide a transparent framework for encoding causal structures and identifying causal effects. This paper demonstrates how DAGs help specify local projections (LPs) for estimating causal impulse responses. Examples illustrate how graphical rules can be used to select controls and instruments for identifying overall and path-specific effects. An empirical application to uncertainty shocks reveals substantial differences in the estimated responses of German industrial production across LP designs. The underlying DAGs help explain these differences and diagnose biases arising from violations of assumed causal structures. A DAG-based instrumental-variable LP reveals pronounced negative effects of U.S. uncertainty shocks. |
| Keywords: | Directed acyclic graph; Local projections; Impulse response; Instrumental variable; Uncertainty shocks |
| JEL: | C18 C22 C26 E27 |
| Date: | 2026–01–22 |
| URL: | https://d.repec.org/n?u=RePEc:onb:oenbwp:271 |
| By: | Francesco Vidoli (Department of Economics, Society & Politics, Università di Urbino Carlo Bo) |
| Abstract: | Standard policy evaluation methods typically assume that treatment effects are homogeneous within fixed administrative units. However, the true policy-relevant boundaries are usually unknown to the researcher, as latent territorial characteristics, such as institutional quality or local economic structure, generate unobserved spatial heterogeneity that does not align with administrative borders. To address this challenge, we propose a novel unsupervised learning algorithm that endogenously identifies geographic regimes that are heterogeneous in their causal impact. Unlike existing clustering methods that group units based on geometric density or outcome similarity, our approach partitions spatial units specifically on the basis of their causal response to treatment. By explicitly maximizing treatment effect variance subject to spatial coherence, we identify where policies have differential impacts, recovering latent economic boundaries while maintaining identification requirements. We validate the estimator through Monte Carlo simulations, demonstrating its robustness in recovering latent economic structures even in high-noise environments. Finally, we apply the method to analyse the local labour market effects of the 2001 Chinese import competition shock in the United States, revealing distinct latent spatial regimes of industrial resilience that cut across state lines. |
| Keywords: | Difference-in-Differences, Spatial Heterogeneity, Treatment Effect Heterogeneity, Clustering Algorithms, Place-Based Policies, Causal Inference |
| JEL: | C21 C23 H40 R10 |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:urb:wpaper:26_01 |
| By: | Tirthatanmoy Das; Solomon W. Polachek |
| Abstract: | This paper shows that incorporating what we call antidotal variables (AV) into a causal treatment effects analysis can, with a single cross-sectional regression, identify the causal effect, the spillover effect, and possible biases from selectivity. We apply the AV technique to analyze leave taking arising from the California Paid Family Leave (CPFL) program. Our analysis yields a treatment effect between 55% and 70% larger than that from traditional DID methods, which we attribute to confounding effects and spillovers, neither of which are found in traditional studies. |
| Keywords: | Bias, California Paid Family Leave, Causality, Nullifying Effect, Spillover Effect |
| JEL: | C18 C36 I38 J18 J38 |
| Date: | 2025–09 |
| URL: | https://d.repec.org/n?u=RePEc:crm:wpaper:2574 |
| By: | Dave Donaldson; Federico Huneeus; Vincent Rollet |
| Abstract: | A widespread threat to the validity of standard policy evaluation tools is the presence of spillovers between treated and untreated groups. Economic interactions across units of analysis—due to the flow of goods, factors, and payments to and from the government, for instance—result in bias in standard estimates of objects of interest such as the average treatment effect or the total effect of a program. In this paper, we develop a suite of approaches that can enable researchers to use theory and data about economic flows and distortions in order to overcome this bias. We apply this methodology to estimate the effects of a large earthquake that struck Chile in 2010. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:chb:bcchwp:1060 |
| By: | Michael Pfarrhofer (Vienna University of Economics and Business); Anna Stelzer (Oesterreichische Nationalbank) |
| Abstract: | We assess asymmetries, nonlinearities and state dependencies in dynamic responses of the euro area to monetary policy shocks. The dataset includes macroeconomic, financial, and survey-based variables measuring credit conditions and bank lending transmission channels. These data are observed at different frequencies. We propose a multivariate nonparametric mixed-frequency model, and discuss how to compute dynamic causal effects in a nonlinear context. The results suggest limited effects of expansionary policy shocks whereas contractionary shocks yield responses in line with theory. There is little variation over the business cycle and in distinct periods such as at the effective lower bound. |
| Keywords: | nonlinear structural inference, mixed frequency data, Bayesian nonparametrics, credit channel |
| JEL: | C32 E32 E52 |
| Date: | 2026–03–16 |
| URL: | https://d.repec.org/n?u=RePEc:onb:oenbwp:276 |
| By: | Martin Biewen; Stefan Glaisner; Simon Zeller |
| Abstract: | This paper explores distributional random forests as a flexible machine learning method for analysing income distributions. Distributional random forests avoid parametric assumptions, capture complex interactions among covariates, and, once trained, provide full estimates of conditional income distributions. From these, any type of distributional index such as measures of location, inequality and poverty risk can be readily computed. They can also efficiently process grouped income data and be used as inputs for distributional decomposition methods. We consider four types of applications: (i) estimating income distributions for granular population subgroups, (ii) analysing distributional change over time, (iii) small-area estimation of income distributions, and (iv) purging spatial income distributions of differences in spatial characteristics. Our application based on the German Microcensus provides new results on the socio-economic and spatial structure of the German income distribution. |
| Keywords: | inequality, poverty, small-area estimation, grouped income data |
| JEL: | D31 I32 |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:crm:wpaper:26051 |
| By: | Maximilian Göbel (Brain); Philippe Goulet Coulombe (Université du Québec à Montréal); Karin Klieber (Oesterreichische Nationalbank) |
| Abstract: | Machine learning predictions are typically interpreted as the sum of contributions of predictors. Yet, each out-of-sample prediction can also be expressed as a linear combination of in-sample values of the predicted variable, with weights corresponding to pairwise proximity scores between current and past economic events. While this dual route leads nowhere in some contexts (e.g., large cross-sectional datasets), it provides sparser interpretations in settings with many regressors and little training data—like macroeconomic forecasting. In this case, the sequence of contributions can be visualized as a time series, allowing analysts to explain predictions as quantifiable combinations of historical analogies. Moreover, the weights can be viewed as those of a data portfolio, inspiring new diagnostic measures such as forecast concentration, short position, and turnover. We show how weights can be retrieved seamlessly for (kernel) ridge regression, random forest, boosted trees, and neural networks. Then, we apply these tools to analyze postpandemic forecasts of inflation, GDP growth, and recession probabilities. In all cases, the approach opens the black box from a new angle and demonstrates how machine learning models leverage history partly repeating itself. |
| Date: | 2025–03–27 |
| URL: | https://d.repec.org/n?u=RePEc:onb:oenbwp:265 |
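For ridge regression, the dual representation described in the abstract can be written down directly: the prediction for a new observation equals a weighted sum of in-sample outcomes, with weights (XX' + λI)^{-1} X x_new. The sketch below verifies this on simulated data. It covers plain ridge without an intercept only; the data, the penalty value, and the "short position" summary (taken here to be the sum of negative weights) are assumptions for illustration and need not match the paper's definitions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_features, lam = 40, 15, 2.0   # hypothetical sizes and penalty

X = rng.normal(size=(n_train, n_features))
y = X @ rng.normal(size=n_features) + rng.normal(size=n_train)
x_new = rng.normal(size=n_features)

# Primal route: estimate coefficients, then predict from predictors.
beta = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)
pred_primal = float(x_new @ beta)

# Dual route: one weight per training observation, then predict as a
# linear combination of in-sample outcomes.
weights = np.linalg.solve(X @ X.T + lam * np.eye(n_train), X @ x_new)
pred_dual = float(weights @ y)

# Assumed diagnostic: total negative weight ("short position").
short_position = float(weights[weights < 0].sum())

print(pred_primal, pred_dual)
print(short_position)
```

Both routes return the same prediction, so the vector `weights` can be read as the contribution of each historical observation to the forecast. In a time-series setting those weights could be plotted against the training dates to see which past episodes the forecast leans on.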
| By: | Plüghan, Oliver; Rehfeld, Katharina-Maria |
| Abstract: | This paper investigates the methodological performance of Ordinary Least Squares (OLS) regression and Random Forest machine learning algorithms in measuring adjusted gender pay gaps. The research is motivated by the European Union's Pay Transparency Directive (2023/970), which mandates that employers report adjusted gender pay gaps. While Oaxaca-Blinder Decomposition and the underlying OLS regression have served as the industry standard for gap estimation, this paper examines whether machine learning approaches can better capture complex, nonlinear compensation relationships. Using synthetic datasets with controlled discrimination parameters, the study compares both methods across two sample sizes and multiple discrimination scenarios. Key findings demonstrate that both methods successfully distinguish between occupational segregation and direct wage discrimination at large sample sizes. However, at smaller sample sizes, Random Forest exhibits substantial instability whereas OLS remains slightly more stable. A methodological adjustment, training Random Forest on the larger population before applying predictions to subsets, substantially improves small-sample performance. The paper concludes that OLS regression remains preferable for formal regulatory compliance due to its interpretability and stability, while Random Forest can serve as a complementary validation tool for large-scale analysis. |
| Keywords: | Gender Pay Gap, Pay Transparency, OLS Regression, Random Forest, Wage Discrimination, Unexplained Wage Gap, Adjusted Gender Pay Gap |
| JEL: | J16 J31 J71 M52 C13 C45 |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:iubhhr:340172 |