on Econometrics
| By: | Yihong Liu; Gonzalo Vazquez-Bare |
| Abstract: | When estimating treatment effects with two-way fixed effects (2WFE) models, researchers often use matching as a pre-processing step when the parallel trends assumption is thought to hold conditionally on covariates. Specifically, in a first step, each treated unit is matched to one or more untreated units based on observed time-invariant covariates. In the second step, treatment effects are estimated with a 2WFE regression in the matched sample, reweighting the untreated units by the number of times they are matched. We formally analyze this common practice and highlight two problems. First, when different treatment cohorts enter treatment in different time periods, the post-matching 2WFE estimator that pools all treated cohorts has an asymptotic bias, even when the treatment effect is constant across units and over time. Second, failing to account for the variability introduced by the matching procedure yields invalid standard error estimators, which can be biased upwards or downwards depending on the data generating process. We propose simple post-matching difference-in-differences estimators that compare each treated cohort to the never-treated separately, instead of pooling all treated cohorts. We provide conditions under which these estimators are consistent for well-defined causal parameters, and derive valid standard errors that account for the matching step. We illustrate our results with simulations and with an empirical application. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.13453 |
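The cohort-by-cohort comparison the authors propose can be illustrated with a minimal sketch (an editor's illustration, not the paper's code): a weighted 2x2 difference-in-differences that contrasts a single treated cohort with its matched never-treated units, reweighting each control by the number of times it was matched.

```python
import numpy as np

def cohort_did(y_pre, y_post, d, w):
    """Weighted 2x2 difference-in-differences for one treatment cohort.

    y_pre, y_post: outcomes before/after the cohort's treatment date
    d: 1 for treated units, 0 for matched never-treated units
    w: match weights (1 for treated; times-matched for controls)
    """
    y_pre, y_post, d, w = map(np.asarray, (y_pre, y_post, d, w))
    dy = y_post - y_pre  # within-unit change over time
    treated = np.average(dy[d == 1], weights=w[d == 1])
    control = np.average(dy[d == 0], weights=w[d == 0])
    return treated - control
```

Under the paper's recommendation, one such estimate would be computed per treated cohort against the never-treated, rather than pooling all cohorts in a single 2WFE regression.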
| By: | Harold D. Chiang; Antonio F. Galvao; Chia-Min Wei |
| Abstract: | This paper develops an asymptotic and inferential theory for fixed-effects panel quantile regression (FEQR) that delivers inference robust to pervasive common shocks. Such shocks induce cross-sectional dependence that is central in many economic and financial panels but largely ignored in existing FEQR theory, which typically assumes cross-sectional independence and requires $T \gg N$. We show that the standard FEQR estimator remains asymptotically normal under the mild condition $(\log N)^2/T \to 0$, thereby accommodating empirically relevant regimes, including those with $T \ll N$. We further show that common shocks fundamentally alter the asymptotic covariance structure, rendering conventional covariance estimators inconsistent, and we propose a simple covariance estimator that remains consistent both in the presence and absence of common shocks. The proposed procedure therefore provides valid robust inference without requiring prior knowledge of the dependence structure, substantially expanding the applicability of FEQR methods in realistic panel data settings. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19201 |
| By: | Michal Kolesár; Pengjin Min; Wenjie Wang; Yichong Zhang |
| Abstract: | This paper studies inference for quadratic forms of linear regression coefficients with clustered data and many covariates. Our framework covers three important special cases: instrumental variables regression with many instruments and controls, inference on variance components, and testing multiple restrictions in a linear regression. Naïve plug-in estimators are known to be biased. We study a leave-one-cluster-out estimator that is unbiased, and provide sufficient conditions for its asymptotic normality. For inference, we establish the consistency of a leave-three-cluster-out variance estimator under primitive conditions. In addition, we develop a novel leave-two-cluster-out variance estimator that is computationally simpler and guaranteed to be conservative under weaker conditions. Our analysis allows cluster sizes to diverge with the sample size, accommodates strong within-cluster dependence, and permits the dimension of the covariates to diverge with the sample size, potentially at the same rate. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.13537 |
| By: | Federico A. Bugni; Joel L. Horowitz; Linqi Zhang |
| Abstract: | The BLP model is the workhorse framework in empirical IO and enables estimation of demand models for differentiated products using aggregate product shares. In practice, however, the share of the outside good is often unobserved. This paper studies identification and inference in the BLP model when the share of the outside good is unobserved. We show that the model is partially identified, and we derive sharp identified sets for structural parameters and equilibrium objects. We also develop inference procedures based on moment inequalities that deliver valid confidence sets for these structural parameters and equilibrium objects. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19154 |
| By: | Xiaojun Song; Jichao Yuan |
| Abstract: | This paper develops a novel nonparametric significance test based on a tailored nonparametric-type projected weighting function that exhibits appealing theoretical and numerical properties. We derive the asymptotic properties of the proposed test and show that it can detect local alternatives at the parametric rate. Using the nonparametric orthogonal projection, we construct a computationally convenient multiplier bootstrap to obtain critical values from the case-dependent asymptotic null distribution. Compared with the existing literature, our approach overcomes the need for a stronger compact support assumption on the density of covariates arising from random denominators. We also extend the tailor-made projection procedure to test the conditional independence assumption. The simulation experiments further illustrate the advantages of our proposed method in testing significance and conditional independence in finite samples. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.15289 |
| By: | Peter A. Zadrozny |
| Abstract: | The paper derives and proves results on Gaussian maximum likelihood estimation of the constant unknowns (coefficients, covariances) and time-varying unknowns (factors, disturbances) of static and dynamic factor models, thereby extending the statistics and econometrics literatures on estimating these unknowns and statistically evaluating the estimates. The paper presents a new, general, unified, one-step method for simultaneously estimating and statistically evaluating all constant and time-varying unknowns of static and dynamic factor models. |
| Keywords: | differential matrix, differentiations, vectorized to Hessians |
| JEL: | C13 C32 C55 |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:ces:ceswps:_12380 |
| By: | Donald W. K. Andrews; Ming Li; Yapeng Zheng |
| Abstract: | This paper considers confidence intervals (CIs) for the autoregressive (AR) parameter in an AR model with an AR parameter that may be close or equal to one. Existing CIs rely on the assumption of a stationary or fixed initial condition to obtain correct asymptotic coverage and good finite sample coverage. When this assumption fails, their coverage can be quite poor. In this paper, we introduce a new CI for the AR parameter whose coverage probability is completely robust to the initial condition, both asymptotically and in finite samples. This CI pays only a small price in terms of its length when the initial condition is stationary or fixed. The new CI is also robust to conditional heteroskedasticity of the errors. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.09382 |
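As a point of reference for the initial-condition issue, here is a minimal OLS AR(1) estimator (an editor's sketch, not the paper's CI construction; the paper's point is that the sampling behavior of such estimators, and hence CI coverage, depends on whether the initial condition is fixed, stationary, or large).

```python
import numpy as np

def ar1_ols(y):
    """OLS estimate of rho in y_t = rho * y_{t-1} + e_t (with intercept)."""
    y = np.asarray(y, float)
    y_lag, y_cur = y[:-1], y[1:]
    x = y_lag - y_lag.mean()  # demeaning absorbs the intercept
    return float(x @ (y_cur - y_cur.mean()) / (x @ x))

# A deterministic AR(1) path with rho = 0.5 is recovered exactly:
path = [2.0 * 0.5**t for t in range(6)]
rho_hat = ar1_ols(path)  # 0.5
```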
| By: | Ana Armendariz; Martin Huber |
| Abstract: | We propose a framework for testing the homogeneity of conditional average treatment effects (CATEs) across multiple experimental and observational studies. Our approach leverages multiple randomized trials to assess whether treatment effects vary with unobserved heterogeneity that differs across trials: if CATEs are homogeneous, this indicates the absence of interactions between treatment and unobservables in the mean effect. Comparing CATEs between experimental and observational data further allows evaluation of potential confounding: if the estimands coincide, there is no unobserved confounding; if they differ, deviations may arise from unobserved confounding, effect heterogeneity, or both. We extend the framework to settings with alternative identification strategies, namely instrumental variable settings and panel data with parallel trends assumptions based on differences in differences, where effects are identified only locally for subpopulations such as compliers or treated units. In these contexts, testing homogeneity is useful for assessing whether local effects can be extrapolated to the total population. We suggest a test based on double machine learning that accommodates high-dimensional covariates in a data-driven way and investigate its finite-sample performance through a simulation study. Finally, we apply the test to the International Stroke Trial (IST), a large multi-country randomized controlled trial in patients with acute ischaemic stroke that evaluated whether early treatment with aspirin altered subsequent clinical outcomes. Our methodology provides a flexible tool for both validating identification assumptions and understanding the generalizability of estimated treatment effects. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19703 |
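A minimal version of the homogeneity comparison (an illustrative sketch using simple difference-in-means estimates and a two-sided z-test, not the authors' double-machine-learning test with high-dimensional covariates) could look like:

```python
import numpy as np
from math import erfc, sqrt

def ate_and_se(y, d):
    """Difference-in-means ATE and its standard error in one randomized study."""
    y, d = np.asarray(y, float), np.asarray(d)
    y1, y0 = y[d == 1], y[d == 0]
    tau = y1.mean() - y0.mean()
    se = sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return tau, se

def homogeneity_pvalue(tau_a, se_a, tau_b, se_b):
    """Two-sided p-value for H0: effects coincide across two independent studies."""
    z = (tau_a - tau_b) / sqrt(se_a**2 + se_b**2)
    return erfc(abs(z) / sqrt(2))
```

A small p-value signals either unobserved confounding in one study or effect heterogeneity across study populations, which is the ambiguity the paper's framework is designed to disentangle.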
| By: | Xi Wang |
| Abstract: | Distributional effects, characterized by quantile frameworks, are well known to capture heterogeneous impacts of economic factors across unobserved relative ranks. Censored outcomes, endogenous regressors, and heteroskedastic errors are prevalent in empirical work, yet they challenge the consistency of existing quantile estimation methods. This paper develops a Sequential Control Function Censored Quantile (SCFCQ) estimator for distributional effects in censored quantile models with unbounded endogenous regressors. Our method combines sequential analysis with the control function approach, adapting in particular to conditional heteroskedasticity in the endogenous regressor. The estimation algorithm is a two-step procedure composed of series quantile regressions, providing applied researchers with a computationally tractable and practically feasible tool. We apply the SCFCQ method to estimate heterogeneous income elasticities over household preferences using data from the UK Family Expenditure Survey. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19279 |
| By: | George Kapetanios; Vasilis Sarafidis; Alexia Ventouri |
| Abstract: | High-dimensional regression specification and analysis is a complex and active area of research in statistics, machine learning, and econometrics. This paper proposes a new approach, Boosting with Multiple Testing (BMT), which combines forward stepwise variable selection with the multiple testing framework of Chudik et al. (2018). At each stage, the model is updated by adding only the most significant regressor conditional on those already included, while a family-wise multiple testing filter is applied to the remaining candidates. In this way, the method retains the strong screening properties of Chudik et al. (2018) while operating in a less greedy manner with respect to proxy and noise variables. Using sharp probability inequalities for heterogeneous strongly mixing processes from Dendramis et al. (2022), we show that BMT enjoys oracle-type properties relative to an approximating model that includes all true signals and excludes pure noise variables: this model is selected with probability tending to one, and the resulting estimator achieves standard parametric rates for prediction error and coefficient estimation. Additional results establish conditions under which BMT recovers the exact true model and avoids selection of proxy signals. Monte Carlo experiments indicate that BMT performs very well relative to OCMT and Lasso-type procedures, delivering higher model selection accuracy and smaller RMSE for the estimated coefficients, especially under strong multicollinearity of the regressors. Two empirical illustrations based on a large set of macro-financial indicators as covariates show that BMT yields sparse, interpretable specifications with favourable out-of-sample performance. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19705 |
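The forward-stepwise core of such a procedure can be sketched as follows (an editor's illustration of greedy selection by conditional t-statistics only; BMT additionally applies the family-wise multiple-testing filter to the remaining candidates, which this sketch omits):

```python
import numpy as np

def tstat_last(Z, y):
    """OLS t-statistic of the last column of the design matrix Z."""
    n, k = Z.shape
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sigma2 = resid @ resid / (n - k)
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    return beta[-1] / np.sqrt(sigma2 * ZtZ_inv[-1, -1])

def forward_select(X, y, t_crit=2.0):
    """Add, at each stage, the candidate with the largest |t| conditional on
    the variables already included; stop when none passes the threshold."""
    n, p = X.shape
    active = []
    ones = np.ones((n, 1))
    while True:
        best_j, best_t = None, 0.0
        for j in range(p):
            if j in active:
                continue
            Z = np.hstack([ones, X[:, active], X[:, [j]]])
            t = tstat_last(Z, y)
            if abs(t) > abs(best_t):
                best_j, best_t = j, t
        if best_j is None or abs(best_t) < t_crit:
            return active
        active.append(best_j)
```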
| By: | Rigato, Rodolfo Dinis |
| Abstract: | Sequence-space models are becoming increasingly popular in macroeconomics, especially in the heterogeneous-agent literature. However, the econometric toolkit for users of these models remains less developed than that available for traditional state-space methods. This note introduces an algorithm for efficiently filtering unobserved shocks in linear sequence-space models. The proposed filter solves a least-squares optimization problem in closed form and returns the expectation of unobserved shocks conditional on observed data. It handles heteroskedasticity, missing observations, measurement error, and non-Gaussian shock distributions. To illustrate its properties, I apply it to data simulated from a medium-scale heterogeneous-agent New Keynesian model and show that it accurately recovers the underlying structural shocks. |
| JEL: | C32 E27 E32 E37 |
| Keywords: | filtering, least squares, sequence space |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:ecb:ecbwps:20263191 |
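A closed-form least-squares filter of this kind can be sketched for a generic linear sequence-space mapping (notation mine, not the paper's: data y = M @ eps + u, with shock covariance Sigma_eps and measurement-error covariance Sigma_u both assumed known):

```python
import numpy as np

def filter_shocks(M, y, Sigma_eps, Sigma_u):
    """E[eps | y] in the linear model y = M @ eps + u,
    eps ~ (0, Sigma_eps), u ~ (0, Sigma_u): the closed-form
    generalized-least-squares (conditional expectation) solution."""
    S = M @ Sigma_eps @ M.T + Sigma_u
    return Sigma_eps @ M.T @ np.linalg.solve(S, y)
```

Heteroskedasticity enters through non-scalar covariances, and missing observations can be handled by dropping the corresponding rows of M and y before filtering.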
| By: | Kyle Schindl; Larry Wasserman |
| Abstract: | Regression discontinuity and kink designs are typically analyzed through mean effects, even when treatment changes the shape of the entire outcome distribution. To address this, we introduce distributional discontinuity designs, a framework for estimating causal effects for a scalar outcome at the boundary of a discontinuity in treatment assignment. Our estimand is the Wasserstein distance between limiting conditional outcome distributions, a single scale-interpretable measure of distribution shift. We show that this weakly bounds the average treatment effect, where equality holds if and only if the treatment effect is purely additive; thus, departure from equality measures effect heterogeneity. To further characterize effect heterogeneity, we show that the Wasserstein distance admits an orthogonal decomposition into squared differences in $L$-moments, thereby quantifying the contribution from location, scale, skewness, and higher-order shape components to the overall distributional distance. Next, we extend this framework to distributional kink designs by evaluating the Wasserstein derivative at a policy kink; this describes the flow of probability mass through the kink. In the case of fuzzy kink designs, we derive new identification results. Finally, we apply our methods to real data by re-analyzing two natural experiments to compare our distributional effects to traditional causal estimands. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19290 |
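The estimand can be illustrated with the 2-Wasserstein distance between two equal-sized empirical samples, which reduces to matching order statistics (an editor's sketch; the choice of W2 is my assumption, motivated by the squared L-moment decomposition mentioned in the abstract):

```python
import numpy as np

def w2_empirical(a, b):
    """2-Wasserstein distance between equal-sized empirical distributions:
    root-mean-square difference of matched order statistics."""
    a, b = np.sort(np.asarray(a, float)), np.sort(np.asarray(b, float))
    if a.size != b.size:
        raise ValueError("sketch assumes equal sample sizes")
    return float(np.sqrt(np.mean((a - b) ** 2)))
```

A purely additive shift by c gives exactly |c|, consistent with the result that the distance equals the average treatment effect only under additive effects; heterogeneous effects push the distance above the mean effect.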
| By: | Nikolaos Ignatiadis; Sid Kankanala |
| Abstract: | We study the Gaussian sequence compound decision problem and analyze a Bayesian nonparametric estimator from an empirical Bayes, regret-based perspective. Motivated by sharp results for the classical nonparametric maximum likelihood estimator (NPMLE), we ask whether an analogous guarantee can be obtained using a standard Bayesian nonparametric prior. We show that a Dirichlet-process-based Bayesian procedure achieves near-optimal regret bounds. Our main results are stated in the compound decision framework, where the mean vector is treated as fixed, while we also provide parallel guarantees under a hierarchical model in which the means are drawn from a true unknown prior distribution. The posterior mean Bayes rule is, a fortiori, admissible, whereas we show that the NPMLE plug-in rule is inadmissible. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.20115 |
| By: | Jiti Gao; Fei Liu; Bin Peng |
| Abstract: | This paper rigorously analyzes the properties of the local projection (LP) methodology within a high-dimensional (HD) framework, with a central focus on achieving robust long-horizon inference. We integrate a general dependence structure into h-step ahead forecasting models via a flexible specification of the residual terms. Additionally, we study the corresponding HD covariance matrix estimation, explicitly addressing the complexity arising from the long-horizon setting. Extensive Monte Carlo simulations are conducted to substantiate the derived theoretical findings. In the empirical study, we utilize the proposed HD LP framework to study the impact of business news attention on U.S. industry-level stock volatility. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.10415 |
| By: | Peter Caradonna; Christopher Turansick |
| Abstract: | We characterize the identified sets of a wide range of stochastic choice models, including random utility, various models of boundedly-rational behavior, and dynamic discrete choice. In each of these settings, we show that two distributions over choice rules are observationally equivalent if and only if they can be obtained from one another via a finite sequence of simple swapping transforms. We leverage this to obtain complete descriptions of both the defining inequalities and extreme points of these identified sets. In cases where choice frequencies vary smoothly with some parameters, we provide a novel global-inverse result for practically testing identification. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19950 |
| By: | Joel Persson; Jurriën Bakker; Dennis Bohle; Stefan Feuerriegel; Florian von Wangenheim |
| Abstract: | Heterogeneous treatment effects (HTEs) are increasingly estimated using machine learning models that produce highly personalized predictions of treatment effects. In practice, however, predicted treatment effects are rarely interpreted, reported, or audited at the individual level but, instead, are often aggregated to broader subgroups, such as demographic segments, risk strata, or markets. We show that such aggregation can induce systematic bias of the group-level causal effect: even when models for predicting the individual-level conditional average treatment effect (CATE) are correctly specified and trained on data from randomized experiments, aggregating the predicted CATEs up to the group level does not, in general, recover the corresponding group average treatment effect (GATE). We develop a unified statistical framework to detect and mitigate this form of group bias in randomized experiments. We first define group bias as the discrepancy between the model-implied and experimentally identified GATEs, derive an asymptotically normal estimator, and then provide a simple-to-implement statistical test. For mitigation, we propose a shrinkage-based bias-correction, and show that the theoretically optimal and empirically feasible solutions have closed-form expressions. The framework is fully general, imposes minimal assumptions, and only requires computing sample moments. We analyze the economic implications of mitigating detected group bias for profit-maximizing personalized targeting, thereby characterizing when bias correction alters targeting decisions and profits, and the trade-offs involved. Applications to large-scale experimental data at major digital platforms validate our theoretical results and demonstrate empirical performance. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.20383 |
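The group-bias diagnostic can be sketched as the gap between the average predicted CATE in a group and a simple experimental GATE estimate, i.e., the within-group difference in means (an editor's illustration only, without the paper's asymptotic test or shrinkage-based correction):

```python
import numpy as np

def group_bias(cate_pred, y, d, groups):
    """Per-group gap between the mean model-predicted CATE and the
    experimentally identified GATE (difference in means within the group)."""
    out = {}
    for g in np.unique(groups):
        m = groups == g
        gate = y[m & (d == 1)].mean() - y[m & (d == 0)].mean()
        out[g] = cate_pred[m].mean() - gate
    return out
```

A nonzero entry flags a segment where aggregating individual-level predictions would misstate the group-level causal effect, even if the CATE model itself is well specified.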
| By: | Zhiheng You |
| Abstract: | We evaluate how well state-dependent local projections recover true impulse responses in nonlinear environments. Using quadratic vector autoregressions as a laboratory, we show that linear local projections fail to capture any nonlinearities when shocks are symmetrically distributed. Popular state-dependent local projection specifications capture distinct aspects of nonlinearity: those interacting shocks with their signs capture higher-order effects, while those interacting shocks with lagged states capture state dependence. However, their gains over linear specifications are concentrated in tail shocks or tail states and, for lag-based specifications, hinge on how well the chosen observable proxies the latent state. Our proposed specification, which augments the linear specification with a squared shock term and an interaction between the shock and lagged observables, best approximates the true responses across the entire joint distribution of shocks and states. An application to monetary policy reveals economically meaningful state dependence, whereas higher-order effects, though statistically significant, prove economically modest. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.14455 |
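The augmented specification described can be sketched as a single local-projection regression (an editor's illustration with plain OLS and no additional controls or HAC standard errors, which an application would need):

```python
import numpy as np

def lp_nonlinear_irf(y, shock, state, h):
    """h-step local projection with a squared-shock term and a
    shock-by-lagged-state interaction:
    y_{t+h} = b0 + b1*e_t + b2*e_t^2 + b3*(e_t * s_{t-1}) + u_{t+h}."""
    e = shock[1:len(y) - h]        # e_t for t = 1, ..., T-h-1
    s = state[:len(y) - h - 1]     # s_{t-1}, aligned with e_t
    yh = y[1 + h:]                 # y_{t+h}
    Z = np.column_stack([np.ones_like(e), e, e**2, e * s])
    beta, *_ = np.linalg.lstsq(Z, yh, rcond=None)
    return beta  # [constant, linear, quadratic, state-interaction]
```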
| By: | Abhinandan Dalal; Iris Horng; Yang Feng; Dylan S. Small |
| Abstract: | Sensitivity analysis is widely used to assess the robustness of causal conclusions in observational studies, yet its interaction with the structure of measured covariates is often overlooked. When latent confounders cannot be directly adjusted for and are instead controlled using proxy variables, strong associations between exposure and measured proxies can amplify sensitivity to residual confounding. We formalize this phenomenon in linear regression settings by showing that a simple ratio involving the exposure model coefficient and residual exposure variance provides an observable measure of this increased sensitivity. Applying our framework to smoking and lung cancer, we document how growing socioeconomic stratification in smoking behavior over time leads to heightened sensitivity to unmeasured confounding in more recent data. These results highlight the importance of multicollinearity when interpreting sensitivity analyses based on proxy adjustment. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.14414 |
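The abstract does not state the exact functional form of the ratio, so the version below is a hypothetical stand-in (my assumption: the exposure-model coefficient on the measured proxies divided by the residual exposure variance after projecting on them), meant only to illustrate the kind of observable quantity involved:

```python
import numpy as np

def exposure_sensitivity_ratio(a, x):
    """HYPOTHETICAL form of an observable sensitivity measure: largest
    absolute exposure-model coefficient on the proxies, divided by the
    residual variance of the exposure a after projecting on proxies x.
    The paper's exact ratio may differ; this is illustrative only."""
    X = np.column_stack([np.ones(len(a)), x])
    gamma, *_ = np.linalg.lstsq(X, a, rcond=None)
    resid = a - X @ gamma
    return np.abs(gamma[1:]).max() / resid.var(ddof=X.shape[1])
```

The qualitative message carries over: the more strongly the proxies predict the exposure (large numerator, small denominator), the larger the ratio, and the more sensitive conclusions are to residual confounding.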
| By: | Webel, Karsten |
| Abstract: | The classical X-11 seasonal adjustment method for monthly and quarterly time series is equipped with routines for data-driven selections of both Henderson trend-cycle filters and 3 × k seasonal moving averages, currently involving up to three candidate filters in either case. Although these routines have a long-standing tradition that can be traced back at least to 1960, they have not been adopted in a recent JDemetra+ implementation of a modified X-11 method tailored to the specifics of infra-monthly time series, such as the coexistence of multiple seasonal patterns with potentially fractional periodicities. Focusing on seasonal moving averages, we seek to fill this gap by suggesting a generic redesign of the legacy selection concept based upon the so-called moving seasonality ratio. This blueprint utilises a broader set of candidate seasonal filters and, unlike the original setting, a set of common approaches for deriving the requisite asymmetric variants. Considering intersections of multiple approach-specific selection rules stabilises the final filter choice and, what is more, naturally provides the warranted thresholds controlling the potential recalculation of the moving seasonality ratio from suitably shortened detrended observations. Our proposed redesign is illustrated using one specific rule based upon threshold quartiles and real-time data for three German macroeconomic time series sampled at quarterly, monthly, and daily intervals. The last example also highlights the need for additional intermediate steps in the calculation of the moving seasonality ratio when the data contain complex seasonal dynamics. |
| Keywords: | asymmetric linear filters, concurrent revision policy, JDemetra+, moving seasonality ratio, nonparametric seasonal adjustment, real-time data |
| JEL: | C01 C02 C14 C22 C40 C50 |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:bubdps:337482 |
| By: | Brenda Prallon |
| Abstract: | Robustness checks are routine in empirical work, but there is no standard statistical procedure to formally measure what one can learn from them. I propose a "robustness radius" measure to quantify the amount by which the robustness-check estimands differ from the main specification estimand. I do so by framing robustness checks as explicitly biased regressions, clarifying what exactly the estimands are when comparing multiple regressions with slightly different samples, and applying a test from the moment inequalities literature. The robustness radius is easily interpretable and adapts to sampling uncertainty and correlation across regressions. An application shows that, although assessing overall robustness is context-specific, the robustness radius guides those judgments and improves transparency. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.19384 |