nep-ecm New Economics Papers
on Econometrics
Issue of 2025–11–17
fourteen papers chosen by
Sune Karlsson, Örebro universitet


  1. Unlocking the Regression Space By Liudas Giraitis; George Kapetanios; Yufei Li; Alexia Ventouri
  2. Synthetic Parallel Trends By Yiqi Liu
  3. Cluster-robust inference with a single treated cluster using the t-test By Chun Pong Lau; Xinran Li
  4. The Exact Variance of the Average Treatment Effect Estimator in Cluster RCT By Yue Fang; Geert Ridder
  5. Training and Testing with Multiple Splits: A Central Limit Theorem for Split-Sample Estimators By Bruno Fava
  6. Residual Balancing for Non-Linear Outcome Models in High Dimensions By Isaac Meza
  7. Shrinkage Estimation and Identification of Latent Group Structures in Panel Data with Interactive Fixed Effects By Ali Mehrabani; Shahnaz Parsaeian
  8. Distributionally Robust Synthetic Control: Ensuring Robustness Against Highly Correlated Controls and Weight Shifts By Taehyeon Koo; Zijian Guo
  9. Boundary Discontinuity Designs: Theory and Practice By Matias D. Cattaneo; Rocio Titiunik; Ruiqi Rae Yu
  10. A sensitivity analysis for the average derivative effect By Jeffrey Zhang
  11. Multilevel non-linear interrupted time series analysis By RJ Waken; Fengxian Wang; Sarah A. Eisenstein; Tim McBride; Kim Johnson; Karen Joynt-Maddox
  12. Optimally-Transported Generalized Method of Moments By Susanne Schennach; Vincent Starck
  13. Multivariate AutoRegressive Smooth Liquidity (MARSLiQ) By Hafner, C. M.; Linton, O. B.; Wang, L.
  14. Fast and Slow Level Shifts in Intraday Stochastic Volatility By Martins, Igor F. B.; Virbickaitė, Audronė; Nguyen, Hoang; Lopes, Hedibert Freitas

  1. By: Liudas Giraitis; George Kapetanios; Yufei Li; Alexia Ventouri
    Abstract: This paper introduces and analyzes a framework that accommodates general heterogeneity in regression modeling. It demonstrates that regression models with fixed or time-varying parameters can be estimated using the OLS and time-varying OLS methods, respectively, across a broad class of regressors and noise processes not covered by existing theory. The proposed setting facilitates the development of asymptotic theory and the estimation of robust standard errors. The robust confidence interval estimators accommodate substantial heterogeneity in both regressors and noise. The resulting robust standard error estimates coincide with White's (1980) heteroskedasticity-consistent estimator but are applicable to a broader range of conditions, including models with missing data. They are computationally simple and perform well in Monte Carlo simulations. Their robustness, generality, and ease of implementation make them highly suitable for empirical applications. Finally, the paper provides a brief empirical illustration.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.07183
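As a rough illustration of the estimator this abstract builds on (not the paper's extended setting), White's (1980) heteroskedasticity-consistent sandwich estimator can be sketched in a few lines of numpy; the data-generating process below is a hypothetical example with variance growing in the regressor:

```python
import numpy as np

def ols_white_se(X, y):
    """OLS coefficients with White (1980) heteroskedasticity-consistent SEs.

    X: (n, k) regressor matrix (include a column of ones for an intercept).
    y: (n,) outcome vector.
    Returns (beta_hat, robust_se).
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum_i e_i^2 x_i x_i'
    meat = (X * resid[:, None] ** 2).T @ X
    cov = XtX_inv @ meat @ XtX_inv  # sandwich: bread-meat-bread
    return beta, np.sqrt(np.diag(cov))

# Hypothetical heteroskedastic design: Var(e | x) grows with x
rng = np.random.default_rng(0)
x = rng.uniform(1, 5, size=500)
y = 2.0 + 0.5 * x + rng.normal(0, 0.3 * x)
X = np.column_stack([np.ones_like(x), x])
beta, se = ols_white_se(X, y)
```

The same point estimates come from plain OLS; only the standard errors change, which is what makes the estimator robust to heteroskedasticity of unknown form.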
  2. By: Yiqi Liu
    Abstract: Popular empirical strategies for policy evaluation in the panel data literature -- including difference-in-differences (DID), synthetic control (SC) methods, and their variants -- rely on key identifying assumptions that can be expressed through a specific choice of weights $\omega$ relating pre-treatment trends to the counterfactual outcome. While each choice of $\omega$ may be defensible in empirical contexts that motivate a particular method, it relies on fundamentally untestable and often fragile assumptions. I develop an identification framework that allows for all weights satisfying a Synthetic Parallel Trends assumption: the treated unit's trend is parallel to a weighted combination of control units' trends for a general class of weights. The framework nests these existing methods as special cases and is by construction robust to violations of their respective assumptions. I construct a valid confidence set for the identified set of the treatment effect, which admits a linear programming representation with estimated coefficients and nuisance parameters that are profiled out. In simulations where the assumptions underlying DID or SC-based methods are violated, the proposed confidence set remains robust and attains nominal coverage, while existing methods suffer severe undercoverage.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.05870
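To fix ideas on the weights $\omega$ the abstract refers to: a sketch (my own toy example, not the paper's estimator) of a treatment-effect estimate under a weighted parallel-trends assumption, where uniform weights recover classical DID and data-driven weights mimic synthetic control:

```python
import numpy as np

def weighted_did(y_treat_pre, y_treat_post, Y_ctrl_pre, Y_ctrl_post, omega):
    """Effect estimate assuming the treated unit's trend is parallel to an
    omega-weighted combination of control trends.

    omega = uniform weights gives classical DID; other simplex weights
    correspond to synthetic-control-style counterfactuals.
    """
    trend_ctrl = omega @ (Y_ctrl_post - Y_ctrl_pre)
    return (y_treat_post - y_treat_pre) - trend_ctrl

# Hypothetical data: 1 treated unit, 3 controls, pre/post means
y_treat_pre, y_treat_post = 10.0, 15.0
Y_ctrl_pre = np.array([8.0, 9.0, 11.0])
Y_ctrl_post = np.array([10.0, 12.0, 14.0])
omega_did = np.full(3, 1 / 3)  # equal weights: classical DID
att = weighted_did(y_treat_pre, y_treat_post, Y_ctrl_pre, Y_ctrl_post, omega_did)
```

The paper's contribution is to avoid committing to any single omega, instead forming a confidence set over all weights satisfying the synthetic parallel trends assumption.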
  3. By: Chun Pong Lau; Xinran Li
    Abstract: This paper considers inference when there is a single treated cluster and a fixed number of control clusters, a setting that is common in empirical work, especially in difference-in-differences designs. We use the t-statistic and develop suitable critical values to conduct valid inference under weak assumptions allowing for unknown dependence within clusters. In particular, our inference procedure does not involve variance estimation. It only requires specifying the relative heterogeneity between the variances from the treated cluster and some, but not necessarily all, control clusters. Our proposed test works for any significance level when there are at least two control clusters. When the variance of the treated cluster is bounded by those of all control clusters up to some prespecified scaling factor, the critical values for our t-statistic can be easily computed without any optimization for many conventional significance levels and numbers of clusters. In other cases, one-dimensional numerical optimization is needed and is often computationally efficient. We have also tabulated common critical values in the paper so researchers can use our test readily. We illustrate our method in simulations and empirical applications.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.05710
  4. By: Yue Fang; Geert Ridder
    Abstract: In cluster randomized controlled trials (CRCT) with a finite population, the exact design-based variance of the Horvitz-Thompson (HT) estimator for the average treatment effect (ATE) depends on the joint distribution of unobserved cluster-aggregated potential outcomes and is therefore not point-identifiable. We study a common two-stage sampling design (random sampling of clusters followed by sampling of units within sampled clusters) with treatment assigned at the cluster level. First, we derive the exact (infeasible) design-based variance of the HT ATE estimator that accounts jointly for cluster- and unit-level sampling as well as random assignment. Second, extending Aronow et al. (2014), we provide a sharp, attainable upper bound on that variance and propose a consistent estimator of the bound using only observed outcomes and known sampling/assignment probabilities. In simulations and an empirical application, confidence intervals based on our bound are valid and typically narrower than those based on cluster-robust standard errors.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.05801
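The Horvitz-Thompson ATE estimator whose variance the paper studies has a simple form once assignment probabilities are known. A minimal sketch (unit-level, ignoring the paper's two-stage cluster sampling, with made-up numbers):

```python
import numpy as np

def ht_ate(y, d, p):
    """Horvitz-Thompson estimator of the ATE.

    y: observed outcomes; d: 0/1 treatment indicator;
    p: known assignment probability P(d_i = 1) for each unit.
    Inverse-probability weighting makes each arm's mean unbiased by design.
    """
    n = len(y)
    return np.sum(d * y / p) / n - np.sum((1 - d) * y / (1 - p)) / n

# Hypothetical data with p = 1/2 for every unit
y = np.array([2.0, 4.0, 6.0, 8.0])
d = np.array([1, 0, 1, 0])
p = np.full(4, 0.5)
est = ht_ate(y, d, p)  # → -2.0
```

The paper's point is that the exact design-based variance of this estimator involves covariances of potential outcomes that are never jointly observed, so only a sharp upper bound can be estimated.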
  5. By: Bruno Fava
    Abstract: As predictive algorithms grow in popularity, using the same dataset to both train and test a new model has become routine across research, policy, and industry. Sample-splitting attains valid inference on model properties by using separate subsamples to estimate the model and to evaluate it. However, this approach has two drawbacks, since each task uses only part of the data, and different splits can lead to widely different estimates. Averaging across multiple splits, I develop an inference approach that uses more data for training, uses the entire sample for testing, and improves reproducibility. I address the statistical dependence from reusing observations across splits by proving a new central limit theorem for a large class of split-sample estimators under arguably mild and general conditions. Importantly, I make no restrictions on model complexity or convergence rates. I show that confidence intervals based on the normal approximation are valid for many applications, but may undercover in important cases of interest, such as comparing the performance between two models. I develop a new inference approach for such cases, explicitly accounting for the dependence across splits. Moreover, I provide a measure of reproducibility for p-values obtained from split-sample estimators. Finally, I apply my results to two important problems in development and public economics: predicting poverty and learning heterogeneous treatment effects in randomized experiments. I show that my inference approach with repeated cross-fitting achieves better power than previous alternatives, often enough to find statistical significance that would otherwise be missed.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.04957
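The reproducibility problem the abstract describes — different random splits giving widely different estimates — motivates averaging over many splits. A sketch of that basic idea (my own toy setup with an OLS learner, not the paper's inference procedure):

```python
import numpy as np

def split_sample_mse(X, y, fit, n_splits=20, train_frac=0.8, seed=0):
    """Average out-of-sample MSE over many random train/test splits.

    Averaging reduces the split-to-split variability of a single-split
    estimate; the paper's CLT covers inference for such averaged estimators.
    fit(Xtr, ytr) must return a prediction function.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(train_frac * n)
    mses = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        tr, te = perm[:n_train], perm[n_train:]
        predict = fit(X[tr], y[tr])
        mses.append(np.mean((y[te] - predict(X[te])) ** 2))
    return float(np.mean(mses))

# Toy learner: OLS returning a prediction function
def ols_fit(Xtr, ytr):
    beta = np.linalg.lstsq(Xtr, ytr, rcond=None)[0]
    return lambda Xte: Xte @ beta

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(300), rng.normal(size=300)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=300)
mse = split_sample_mse(X, y, ols_fit)  # close to the noise variance of 1
```

The statistical subtlety the paper resolves is that observations are reused across splits, so the averaged terms are dependent and a new central limit theorem is needed.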
  6. By: Isaac Meza
    Abstract: We extend the approximate residual balancing (ARB) framework to nonlinear models, answering an open problem posed by Athey et al. (2018). Our approach addresses the challenge of estimating average treatment effects in high-dimensional settings where the outcome follows a generalized linear model. We derive a new bias decomposition for nonlinear models that reveals the need for a second-order correction to account for the curvature of the link function. Based on this insight, we construct balancing weights through an optimization problem that controls for both first and second-order sources of bias. We provide theoretical guarantees for our estimator, establishing its $\sqrt{n}$-consistency and asymptotic normality under standard high-dimensional assumptions.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.00324
  7. By: Ali Mehrabani (Department of Economics, University of Kansas, Lawrence, KS 66045); Shahnaz Parsaeian (Department of Economics, University of Kansas, Lawrence, KS 66045)
    Abstract: This paper provides a framework for joint shrinkage estimation and identification of latent group structures in panel data models with interactive fixed effects and a large number of explanatory variables. The latent structure of the model allows individuals to be classified into a number of groups where the number of groups and/or each individual’s group identity are unknown. A doubly penalized principal component estimation procedure using a pairwise fusion penalty and an adaptive LASSO (least absolute shrinkage and selection operator) penalty is introduced to detect the latent group structure and select the relevant regressors. To implement the proposed approach, an alternating direction method of multipliers algorithm is developed. The proposed method is further illustrated by simulation studies and an empirical application to economic growth across various countries, which demonstrate the good performance of the method.
    Keywords: ADMM algorithm, high dimensionality, interactive fixed effects, pairwise adaptive group fused LASSO, parameter heterogeneity, principal component analysis.
    JEL: C33 C38 C51
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:kan:wpaper:202516
  8. By: Taehyeon Koo; Zijian Guo
    Abstract: The synthetic control method estimates the causal effect by comparing the outcomes of a treated unit to a weighted average of control units that closely match the pre-treatment outcomes of the treated unit. This method presumes that the relationship between the potential outcomes of the treated and control units remains consistent before and after treatment. However, the estimator may become unreliable when these relationships shift or when control units are highly correlated. To address these challenges, we introduce the Distributionally Robust Synthetic Control (DRoSC) method by accommodating potential shifts in relationships and addressing high correlations among control units. The DRoSC method targets a new causal estimand defined as the optimizer of a worst-case optimization problem that searches over all synthetic weights consistent with the pre-treatment period. When the identification conditions for the classical synthetic control method hold, the DRoSC method targets the same causal effect as the synthetic control. When these conditions are violated, we show that this new causal estimand is a conservative proxy of the non-identifiable causal effect. We further show that the limiting distribution of the DRoSC estimator is non-normal and propose a novel inferential approach to characterize this non-normal limiting distribution. We demonstrate its finite-sample performance through numerical studies and an analysis of the economic impact of terrorism in the Basque Country.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.02632
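The classical synthetic-control weights that DRoSC generalizes solve a least-squares problem over the simplex (nonnegative weights summing to one). A minimal sketch of that classical step only — the exponentiated-gradient solver and toy data below are my own illustration, not the paper's DRoSC estimator:

```python
import numpy as np

def sc_weights(Y0_pre, y1_pre, n_iter=5000, eta=0.05):
    """Classical synthetic-control weights: minimize ||y1_pre - Y0_pre w||^2
    subject to w >= 0, sum(w) = 1, via exponentiated gradient descent
    (multiplicative updates keep w on the simplex at every step).

    Y0_pre: (T0, J) pre-treatment outcomes of J control units.
    y1_pre: (T0,) pre-treatment outcomes of the treated unit.
    """
    J = Y0_pre.shape[1]
    w = np.full(J, 1.0 / J)
    for _ in range(n_iter):
        grad = 2 * Y0_pre.T @ (Y0_pre @ w - y1_pre)
        w = w * np.exp(-eta * grad)
        w /= w.sum()
    return w

# Hypothetical data: treated unit is exactly 0.3*control1 + 0.7*control2
rng = np.random.default_rng(3)
Y0_pre = rng.normal(size=(20, 3))
y1_pre = Y0_pre @ np.array([0.3, 0.7, 0.0])
w = sc_weights(Y0_pre, y1_pre)
```

DRoSC replaces the single weight vector fitted this way with a worst case over all weights consistent with the pre-treatment fit, which is what yields robustness to weight shifts and highly correlated controls.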
  9. By: Matias D. Cattaneo; Rocio Titiunik; Ruiqi Rae Yu
    Abstract: We review the literature on boundary discontinuity (BD) designs, a powerful non-experimental research methodology that identifies causal effects by exploiting a thresholding treatment assignment rule based on a bivariate score and a boundary curve. This methodology generalizes standard regression discontinuity designs based on a univariate score and scalar cutoff, and has specific challenges and features related to its multi-dimensional nature. We synthesize the empirical literature by systematically reviewing over 80 empirical papers, tracing the method's application from its formative uses to its implementation in modern research. In addition to the empirical survey, we overview the latest methodological results on identification, estimation and inference for the analysis of BD designs, and offer recommendations for practice.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.06474
  10. By: Jeffrey Zhang
    Abstract: In observational studies, exposures are often continuous rather than binary or discrete. At the same time, sensitivity analysis is an important tool that can help determine the robustness of a causal conclusion to a certain level of unmeasured confounding, which can never be ruled out in an observational study. Sensitivity analysis approaches for continuous exposures have now been proposed for several causal estimands. In this article, we focus on the average derivative effect (ADE). We obtain closed-form bounds for the ADE under a sensitivity model that constrains the odds ratio (at any two dose levels) between the latent and observed generalized propensity score. We propose flexible, efficient estimators for the bounds, as well as point-wise and simultaneous (over the sensitivity parameter) confidence intervals. We examine the finite sample performance of the methods through simulations and illustrate the methods on a study assessing the effect of parental income on educational attainment and a study assessing the price elasticity of petrol.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.06243
  11. By: RJ Waken; Fengxian Wang; Sarah A. Eisenstein; Tim McBride; Kim Johnson; Karen Joynt-Maddox
    Abstract: Recent advances in interrupted time series analysis permit characterization of a typical non-linear interruption effect through use of generalized additive models. Concurrently, advances in latent time series modeling allow efficient Bayesian multilevel time series models. We propose to combine these concepts with a hierarchical model selection prior to characterize interruption effects with a multilevel structure, encouraging parsimony and partial pooling, incorporating meaningful variability in causal effects across subpopulations of interest, and allowing poststratification. These models are demonstrated with three applications: 1) the effect of the introduction of the prostate-specific antigen test on prostate cancer diagnosis rates by race and age group, 2) the change in stroke or transient ischemic attack hospitalization rates across Medicare beneficiaries by rurality in the months after the start of the COVID-19 pandemic, and 3) the effect of Medicaid expansion in Missouri on the proportion of inpatient hospitalizations discharged with Medicaid as a primary payer by key age groupings and sex.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.05725
  12. By: Susanne Schennach; Vincent Starck
    Abstract: We propose a novel optimal transport-based version of the Generalized Method of Moments (GMM). Instead of handling overidentification by reweighting the data to satisfy the moment conditions (as in Generalized Empirical Likelihood methods), this method proceeds by allowing for errors in the variables of the least mean-square magnitude necessary to simultaneously satisfy all moment conditions. This approach, based on the notions of optimal transport and Wasserstein metric, aims to address the problem of assigning a logical interpretation to GMM results even when overidentification tests reject the null, a situation that cannot always be avoided in applications. We illustrate the method by revisiting Duranton, Morrow and Turner's (2014) study of the relationship between a city's exports and the extent of its transportation infrastructure. Our results corroborate theirs under weaker assumptions and provide insight into the error structure of the variables.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.05712
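For reference, the classical overidentified GMM step that the paper's optimal-transport variant replaces minimizes a quadratic form in the sample moments. A minimal sketch with an identity weighting matrix and a hypothetical two-moment model (for an exponential with mean theta, E[x] = theta and E[x^2] = 2*theta^2):

```python
import numpy as np

def gmm_objective(theta, x, W=np.eye(2)):
    """Quadratic-form GMM criterion g(theta)' W g(theta) for two moments:
    E[x - theta] = 0 and E[x^2 - 2*theta^2] = 0 (overidentified: 2 moments,
    1 parameter)."""
    g = np.array([np.mean(x - theta), np.mean(x**2 - 2 * theta**2)])
    return g @ W @ g

# Hypothetical data consistent with both moments: exponential, mean 3
rng = np.random.default_rng(2)
x = rng.exponential(scale=3.0, size=5000)

# One-dimensional problem, so a grid search suffices for illustration
grid = np.linspace(0.1, 10.0, 2000)
theta_hat = grid[np.argmin([gmm_objective(t, x) for t in grid])]
```

In this setup no parameter value sets both sample moments exactly to zero; the paper's approach instead perturbs the data by the smallest (Wasserstein) amount needed to satisfy all moment conditions simultaneously.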
  13. By: Hafner, C. M.; Linton, O. B.; Wang, L.
    Abstract: We propose MARSLiQ (Multivariate AutoRegressive Smooth Liquidity), a new multivariate model for daily liquidity that combines slowly evolving trends with short-run dynamics to capture both persistent and transitory liquidity movements. In our framework, each asset's liquidity is decomposed into a smooth time-varying trend component and a stationary short-run component, allowing us to separate long-term liquidity levels from short-term fluctuations. The trend for each asset is estimated nonparametrically and further decomposed into a common market trend, idiosyncratic (asset-specific) trends, and seasonal trends, facilitating interpretation of market-wide liquidity shifts versus firm-level effects. We introduce a novel dynamic structure in which an asset's short-run liquidity is driven by its own past liquidity as well as by lagged liquidity of a broad liquidity index (constructed from all assets). This parsimonious specification, combining asset-specific autoregressive feedback with index-based spillovers, makes the model tractable even for high-dimensional systems, while capturing rich liquidity spillover effects across assets. Our model's structure enables a clear analysis of permanent vs. transitory liquidity shocks and their propagation throughout the market. Using the model's Vector MA representation, we perform forecast error variance decompositions to quantify how shocks to one asset's liquidity affect others over time, and we interpret these results through network connectedness measures that map out the web of liquidity interdependence across assets.
    Keywords: Forecast Error Decomposition, Liquidity Spillovers, Multiplicative Error Model, Network Connectedness, Nonparametric Trends
    JEL: C12 C14 C32 C53 C58
    Date: 2025–10–20
    URL: https://d.repec.org/n?u=RePEc:cam:camdae:2569
  14. By: Martins, Igor F. B. (Örebro University School of Business); Virbickaitė, Audronė (CUNEF University, Madrid, Spain); Nguyen, Hoang (Linköping University); Lopes, Hedibert Freitas (Insper Institute of Education and Research)
    Abstract: This paper proposes a mixed-frequency stochastic volatility model for intraday returns that captures fast and slow level shifts in the volatility level induced by news from both low-frequency variables and scheduled announcements. A MIDAS component describes slow-moving changes in volatility driven by daily variables, while an announcement component captures fast event-driven volatility bursts. Using 5-minute crude oil futures returns, we show that accounting for both fast and slow level shifts significantly improves volatility forecasts at intraday and daily horizons. The superior forecasts also translate into higher Sharpe ratios using the volatility-managed portfolio strategy.
    Keywords: Intraday volatility; high-frequency; announcements; MIDAS; oil; sparsity.
    JEL: C22 C52 C58 G32
    Date: 2025–11–07
    URL: https://d.repec.org/n?u=RePEc:hhs:oruesi:2025_012

This nep-ecm issue is ©2025 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.