| on Econometrics |
| By: | Claudia Noack; Tomasz Olma; Christoph Rothe |
| Abstract: | Clustered sampling is prevalent in empirical regression discontinuity (RD) designs, but it has not received much attention in the theoretical literature. In this paper, we introduce a general model-based framework for such settings and derive high-level conditions under which the standard local linear RD estimator is asymptotically normal. We verify that our high-level assumptions hold across a wide range of empirical designs, including settings of growing cluster sizes. We further show that clustered standard errors that are currently used in practice can be either inconsistent or overly conservative in finite samples. To address these issues, we propose a novel nearest-neighbor-type variance estimator and illustrate its properties in a diverse set of empirical applications. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.18870 |
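For readers who want the baseline object the paper studies: the standard sharp-RD local linear estimator fits a kernel-weighted linear regression on each side of the cutoff and differences the fitted intercepts. A minimal numpy sketch, assuming a triangular kernel and a user-chosen bandwidth `h` (the paper's cluster-aware variance estimator is not reproduced here):

```python
import numpy as np

def local_linear_rd(x, y, cutoff=0.0, h=0.5):
    """Sharp-RD estimate: difference of local linear fits at the cutoff,
    using triangular kernel weights and an assumed bandwidth h."""
    fits = {}
    for side in ("left", "right"):
        mask = (x < cutoff) if side == "left" else (x >= cutoff)
        xs, ys = x[mask] - cutoff, y[mask]
        w = np.clip(1.0 - np.abs(xs) / h, 0.0, None)   # triangular kernel
        X = np.column_stack([np.ones_like(xs), xs])
        WX = X * w[:, None]                            # kernel-weighted design
        beta = np.linalg.solve(WX.T @ X, WX.T @ ys)
        fits[side] = beta[0]                           # intercept = fit at cutoff
    return fits["right"] - fits["left"]

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 5000)
y = 0.5 * x + 2.0 * (x >= 0) + rng.normal(0.0, 0.1, 5000)  # true jump = 2
est = local_linear_rd(x, y, h=0.5)
```

In practice the bandwidth choice and the variance estimator are exactly where the clustering issues discussed in the abstract bite; the point estimate itself is this simple.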
| By: | Edvard Bakhitov |
| Abstract: | This paper develops a penalized GMM (PGMM) framework for automatic debiased inference on functionals of nonparametric instrumental variable estimators. We derive convergence rates for the PGMM estimator and provide conditions for root-n consistency and asymptotic normality of debiased functional estimates, covering both linear and nonlinear functionals. Monte Carlo experiments on the average derivative show that the PGMM-based debiased estimator performs on par with the analytical debiased estimator that uses the known closed-form Riesz representer, achieving 90-96% coverage while the plug-in estimator falls below 5%. We apply our procedure to estimate mean own-price elasticities in a semiparametric demand model for differentiated products. Simulations confirm near-nominal coverage while the plug-in severely undercovers. Applied to IRI scanner data on carbonated beverages, debiased semiparametric estimates are approximately 20% more elastic than the logit benchmark, and debiasing corrections are heterogeneous across products, ranging from negligible to several times the standard error. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.29889 |
| By: | Pedro Picchetti |
| Abstract: | In this paper I develop a breakdown frontier approach to assess the sensitivity of Local Average Treatment Effects (LATE) estimates to violations of monotonicity and independence of the instrument. I parametrize violations of independence using the concept of $c$-dependence from Masten & Poirier (2018) and allow for the share of defiers to be greater than zero but smaller than the share of compliers. I derive identified sets for the LATE and the Average Treatment Effect (ATE) in which the bounds are functions of these two sensitivity parameters. Using these bounds, I derive the breakdown frontier for the LATE, which is the weakest set of assumptions such that a conclusion regarding the LATE holds. I derive consistent sample analogue estimators for the breakdown frontiers and provide a valid bootstrap procedure for inference. Monte Carlo simulations show the desirable finite-sample properties of the estimators and an empirical application shows that the conclusions regarding the effect of family size on unemployment from Angrist & Evans (1998) are highly sensitive to violations of independence and monotonicity. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.25529 |
| By: | Ting Ji; Laura Liu; Yulong Wang; Jiahe Xing |
| Abstract: | This paper proposes a specification test for the conventional distributional assumptions of error terms in binary choice models, focusing on their tail properties. Based on extreme value theory, we first establish that the tail index of the unobserved error can be recovered from that of the observed covariates. The null hypothesis of the index being zero essentially covers the widely used probit and logit models. We then construct a simple and powerful statistical test for both cross-sectional and panel data, requiring no model estimation and no parametric assumptions. Monte Carlo simulations demonstrate that our test performs well in size and power, and applications to three empirical examples on firm export and innovation decisions and female labor force participation illustrate its general applicability. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.27881 |
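The tail-index machinery this test builds on can be illustrated with the classic Hill estimator, the standard nonparametric estimator of a Pareto-type tail index (the paper's actual test statistic is different and is not reproduced here). A sketch, assuming a sample with a regularly varying right tail and a user-chosen number of order statistics `k`:

```python
import numpy as np

def hill_tail_index(sample, k):
    """Hill estimator of the extreme-value (tail) index using the k largest
    order statistics; for a Pareto(alpha) tail it estimates 1/alpha."""
    s = np.sort(sample)[::-1]                       # descending order statistics
    return float(np.mean(np.log(s[:k]) - np.log(s[k])))

rng = np.random.default_rng(1)
pareto = rng.pareto(a=2.0, size=100_000) + 1.0      # Pareto with alpha = 2
xi = hill_tail_index(pareto, k=2000)                # should be near 1/2
```

The choice of `k` trades bias against variance; the paper's contribution is to turn this kind of tail information about covariates into a formal specification test for the error distribution.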
| By: | Fernando Rios-Avila (Universidad Privada Boliviana); Andrey Ramos (Bank of Spain); Gustavo Canavire-Bacarreza (World Bank and Universidad Privada Boliviana); Leonardo Siles (Universidad de Chile) |
| Abstract: | This paper proposes a method to estimate quantile regression models with multiple fixed effects. We extend the quantile-via-moments estimator of Machado and Santos Silva (2019) and suggest a computationally efficient Frisch–Waugh–Lovell residualization to partial out additive fixed effects in both the location and scale equations. A unified influence-function inference framework is derived, accommodating heteroskedasticity-robust, clustered, and feasible GLS standard errors. Monte Carlo simulations provide strong support for the validity of the proposed procedure in applications with multi-way unobserved heterogeneity and intra-cluster correlated disturbances. An empirical application to Climate Growth-at-Risk illustrates how temperature shocks affect the conditional distribution of macroeconomic outcomes in a panel of 194 countries. Our findings suggest that in low-income countries, temperature shocks are more strongly linked to downside risks to growth than to the central tendency or upside risks. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:ays:ispwps:paper2615 |
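The quantile-via-moments idea being extended here can be sketched in its simplest location-scale form: estimate the location and scale equations by least-squares moments, then shift the location coefficients by a quantile of the standardized residuals. This is a stylized sketch of the Machado and Santos Silva (2019) approach, without the fixed effects or the FWL residualization that are the paper's contribution:

```python
import numpy as np

def mmqr(X, y, tau):
    """Quantile-via-moments sketch for the location-scale model
    y = X b + (X g) u, so that Q_y(tau|X) = X (b + q(tau) g)."""
    XtX = X.T @ X
    b = np.linalg.solve(XtX, X.T @ y)            # location: OLS
    r = y - X @ b
    g = np.linalg.solve(XtX, X.T @ np.abs(r))    # scale: OLS of |residuals| on X
    q = np.quantile(r / (X @ g), tau)            # quantile of standardized resid.
    return b + q * g                             # quantile-specific coefficients

rng = np.random.default_rng(2)
n = 100_000
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
u = rng.normal(size=n)
# location coefs (1, 1), scale coefs (1, 1): true tau-coefs are 1 + z_tau
y = X @ np.array([1.0, 1.0]) + (X @ np.array([1.0, 1.0])) * u
beta_90 = mmqr(X, y, 0.9)     # both entries near 1 + 1.2816
```

Because both auxiliary regressions are least squares, partialling out high-dimensional fixed effects reduces to standard within-transformations, which is what makes the paper's FWL extension computationally attractive.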
| By: | Zongwu Cai (Department of Economics, The University of Kansas, Lawrence, KS 66045, USA); Wei Long (Department of Economics, Tulane University, New Orleans, LA 70118, USA) |
| Abstract: | This paper develops a persistence-robust inferential framework for predictive expectile regression with highly persistent regressors. We combine expectile score equations with IVX instruments to construct an IVX-expectile estimator that preserves the distributional interpretation of expectile regression while regularizing the nonstandard effects of near-unit-root regressors, endogeneity, and conditional heteroscedasticity. For fixed expectile levels, we establish consistency and asymptotic normality of the estimator and show that the associated Wald statistic converges to a standard chi-square distribution. Simulation evidence indicates that the proposed procedure delivers accurate size for regressors with differential persistence, with only a modest local-power cost relative to conventional methods. In an application to monthly and quarterly U.S. stock return predictability, the method detects substantially asymmetric predictive ability across expectiles, showing that IVX-expectile regression provides a useful tool for studying heterogeneous predictive effects and downside tail risk when predictors are highly persistent. |
| Keywords: | IVX inference; Persistent predictors; Predictive expectile regression; Stock return |
| JEL: | C32 C51 C58 |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:kan:wpaper:202610 |
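Expectile regression itself, before any IVX instrumentation, is just asymmetric least squares and can be computed by iteratively reweighted least squares. A minimal sketch for a fixed expectile level `tau` (the paper's persistence-robust inference is not reproduced):

```python
import numpy as np

def expectile_reg(X, y, tau, iters=200):
    """Expectile regression via IRLS: minimizes sum w_i (y_i - X_i b)^2
    with w_i = tau if the residual is positive and 1 - tau otherwise."""
    b = np.linalg.solve(X.T @ X, X.T @ y)        # OLS start (= tau 0.5 solution)
    for _ in range(iters):
        w = np.where(y - X @ b > 0, tau, 1.0 - tau)
        WX = X * w[:, None]
        b_new = np.linalg.solve(WX.T @ X, WX.T @ y)
        if np.max(np.abs(b_new - b)) < 1e-10:
            break
        b = b_new
    return b

rng = np.random.default_rng(3)
y = rng.normal(3.0, 1.0, 50_000)
X = np.ones((len(y), 1))
e50 = expectile_reg(X, y, 0.5)[0]   # the 0.5-expectile is the mean
e90 = expectile_reg(X, y, 0.9)[0]   # lies above the mean, below the 0.9-quantile
```

The score equations of this weighted least-squares problem are exactly the expectile moment conditions that the paper combines with IVX instruments.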
| By: | Jacob Carlson; Neil Shephard |
| Abstract: | The potential system is a nonparametric time series model for assessing the causal impact of moving an assignment at time $t$ on an outcome at future time $t+h$, accounting for the presence of features. The potential system provides nonparametric content for, e.g., time series experiments, time series regression, local projection, impulse response functions and SVARs. It closes a gap between time series causality and nonparametric cross-sectional causal methods, and provides a foundation for many new methods which have causal content. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.20394 |
| By: | James G. MacKinnon |
| Abstract: | It is common when using cross-section or panel data to assign each observation to a cluster and allow for arbitrary patterns of heteroskedasticity and correlation within clusters. For regression models, there are many ways to make cluster-robust inferences. A number of different variance matrix estimators can be used. Hypothesis tests and confidence intervals can then be based on several alternative analytic or bootstrap distributions. Some methods typically perform much better than others, but no method yields reliable inferences in every case. Thus it can be hard to know which $P$ values and confidence intervals to trust. Nevertheless, by using a number of procedures to assess the reliability of various inferential methods for a specific model and dataset, we can often obtain results in which we may be reasonably confident. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.02000 |
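The basic object the survey is about is the cluster-robust (Liang-Zeger) sandwich variance matrix. A minimal numpy sketch of OLS with the common CV1 small-sample correction, one of the many estimators the paper compares:

```python
import numpy as np

def cluster_robust_se(X, y, cluster):
    """OLS with CV1 cluster-robust standard errors: sandwich variance with
    cluster score sums and the usual G/(G-1) * (n-1)/(n-k) correction."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    meat = np.zeros((k, k))
    for g in np.unique(cluster):
        s = X[cluster == g].T @ u[cluster == g]   # within-cluster score sum
        meat += np.outer(s, s)
    G = len(np.unique(cluster))
    c = G / (G - 1) * (n - 1) / (n - k)           # CV1 finite-sample factor
    V = c * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V))

rng = np.random.default_rng(4)
G, m = 50, 20
cl = np.repeat(np.arange(G), m)
x = rng.normal(size=G)[cl]                        # cluster-level regressor
shock = rng.normal(size=G)[cl]                    # common within-cluster shock
y = 1.0 + 2.0 * x + shock + rng.normal(0.0, 0.5, G * m)
X = np.column_stack([np.ones(G * m), x])
beta, se = cluster_robust_se(X, y, cl)
```

With a cluster-invariant regressor and common shocks, as here, these standard errors are several times larger than heteroskedasticity-only ones; the paper's point is that even this correction can be unreliable with few or unbalanced clusters.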
| By: | Stéphane Goutte (SOURCE - SOUtenabilité et RésilienCE - UVSQ - Université de Versailles Saint-Quentin-en-Yvelines - IRD [Ile-de-France] - Institut de Recherche pour le Développement); Konstantinos N. Konstantakis (University of Piraeus); Dimitris Konstantios (ALBA Graduate Business School [Athens, Greece]); Panayotis G. Michaelides (NTUA - National Technical University of Athens); Arsenios-Georgios N. Prelorentzos |
| Abstract: | This paper surveys quantile modelling from its theoretical origins to current advances. We organize the literature and present core econometric formulations and estimation methods for: (i) cross‐sectional quantile regression; (ii) quantile time series models and their time series properties; (iii) quantile vector autoregressions for multivariate data; (iv) quantile panel models for longitudinal data; and (v) quantile factor‐augmented models for information compression in data‐rich environments. Each section outlines theoretical foundations and developments, followed by representative empirical applications. Finally, the survey highlights open gaps in quantile modelling. By studying distributional dynamics beyond averages, quantile methods provide policymakers and regulators with tools to design interventions that are robust to risks and effective across the entire spectrum of possible outcomes. |
| Keywords: | Quantile, Quantile regression, Estimation, Econometric model, Multivariate statistics |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-05503058 |
| By: | Jieun Lee; Anil K. Bera |
| Abstract: | We study Cressie–Read power divergence (CRPD) estimation for moment-based models, focusing on finite-sample behavior. While generalized empirical likelihood estimators, dual to CRPD, are known to outperform generalized method of moments estimators in small to moderate samples, the power parameter is typically chosen arbitrarily by the researcher, serving mainly as an index. We interpret it as a hyperparameter that determines the loss function and governs the learning procedure, shaping the curvature of the objective and influencing finite-sample performance. Using second-order asymptotics, we show that it affects both the structural estimator and the associated Lagrange multipliers, governing robustness, bias, and sensitivity to sampling variation. Monte Carlo simulations illustrate how estimator performance varies with the choice of the power parameter and underlying distributional features, with implications for second-order bias and coverage distortion. An empirical illustration based on Owen's (2001) classical example highlights the practical relevance of tuning the power parameter. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.22599 |
| By: | Easton Huch; Michael Keane |
| Abstract: | Discrete choice models are fundamental tools in management science, economics, and marketing for understanding and predicting decision-making. Logit-based models are dominant in applied work, largely due to their convenient closed-form expressions for choice probabilities. However, these models entail restrictive assumptions on the stochastic utility component, constraining our ability to capture realistic and theoretically grounded choice behavior, most notably substitution patterns. In this work, we propose an amortized inference approach using a neural network emulator to approximate choice probabilities for general error distributions, including those with correlated errors. Our proposal includes a specialized neural network architecture and accompanying training procedures designed to respect the invariance properties of discrete choice models. We provide group-theoretic foundations for the architecture, including a proof of universal approximation given a minimal set of invariant features. Once trained, the emulator enables rapid likelihood evaluation and gradient computation. We use Sobolev training, augmenting the likelihood loss with a gradient-matching penalty so that the emulator learns both choice probabilities and their derivatives. We show that emulator-based maximum likelihood estimators are consistent and asymptotically normal under mild approximation conditions, and we provide sandwich standard errors that remain valid even with imperfect likelihood approximation. Simulations show significant gains over the GHK simulator in accuracy and speed. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.24705 |
| By: | Masahiro Kato |
| Abstract: | We propose a method for constructing distribution-free prediction intervals in nonparametric instrumental variable regression (NPIV), with finite-sample coverage guarantees. Building on the conditional guarantee framework in conformal inference, we reformulate conditional coverage as marginal coverage over a class of IV shifts $\mathcal{F}$. Our method can be combined with any NPIV estimator, including sieve 2SLS and other machine-learning-based NPIV methods such as neural-network minimax approaches. Our theoretical analysis establishes distribution-free, finite-sample coverage over a practitioner-chosen class of IV shifts. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.25509 |
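The conformal machinery underneath this proposal is easiest to see in the generic split-conformal recipe: calibrate an interval half-width from held-out residuals so that marginal coverage holds in finite samples under exchangeability. A sketch with a least-squares line standing in for an NPIV fit (the paper's IV-shift classes are not reproduced):

```python
import numpy as np

def split_conformal(fit, x_tr, y_tr, x_cal, y_cal, x_new, alpha=0.1):
    """Split conformal prediction: fit on a training split, rank absolute
    residuals on a calibration split, and use the ceil((n+1)(1-alpha))-th
    smallest as a symmetric interval half-width."""
    model = fit(x_tr, y_tr)
    scores = np.sort(np.abs(y_cal - model(x_cal)))
    n = len(scores)
    k = min(int(np.ceil((n + 1) * (1 - alpha))), n)   # conformal quantile rank
    q = scores[k - 1]
    pred = model(x_new)
    return pred - q, pred + q

def ls_fit(x, y):
    # simple least-squares line as a stand-in for an NPIV estimator
    b = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
    return lambda z: b[0] + b[1] * z

rng = np.random.default_rng(5)
x = rng.uniform(-1.0, 1.0, 4000)
y = 1.0 + x + rng.normal(0.0, 0.3, 4000)
lo, hi = split_conformal(ls_fit, x[:2000], y[:2000],
                         x[2000:3000], y[2000:3000], x[3000:])
coverage = np.mean((y[3000:] >= lo) & (y[3000:] <= hi))  # near 1 - alpha
```

The guarantee here is purely marginal; the paper's contribution is to make it hold over a chosen class of IV shifts, which is what gives the intervals conditional content in the NPIV setting.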
| By: | Xiangyu Song |
| Abstract: | Attrition in survey and field experiments presents a challenge for social science research. Common approaches to deal with this problem -- such as complete case analysis, multiple imputation, and weighting methods -- rely on strong assumptions that may not hold in practice. This paper introduces a new method that combines recent advances in statistical inference with established tools for handling missing data. The approach produces prediction intervals for treatment effects that are both robust and precise. Evidence from simulation studies shows that the method achieves better coverage and produces narrower intervals than common alternatives. The reanalysis of two recently published experimental studies illustrates how this framework allows researchers to compare treatment effects across participants who remain in the study, those who drop out, and the full sample. Taken together, these results highlight how the proposed approach provides a stronger foundation for causal inference in the presence of attrition. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.00504 |
| By: | Mehic, Adrian (Research Institute of Industrial Economics (IFN)); Nordström, Marcus (Department of Economics, Lund University, Sweden) |
| Abstract: | We propose a novel dynamic panel estimator. Different from the commonly used difference and system GMM, our proposed estimator requires only one of the cross-sectional dimension (N) or the time dimension (T) to grow large to be asymptotically unbiased. This improves reliability in panels with long time spans, where GMM suffers from weak instrument problems, and more generally in finite samples where results can be sensitive to instrument selection and implementation choices. Computationally simple, it extends readily to higher-order autoregressive and vector autoregressive settings. Monte Carlo simulations show that the estimator exhibits lower finite-sample bias than GMM in shorter panels, including for roots at and near unity. In three applications from political economy and macroeconomics—spanning diverse panels, outcomes, and persistence levels—our estimator yields stable, economically meaningful estimates robust to specification choices. By contrast, standard GMM methods display considerable sensitivity to instrument lags, collapsing, and the choice between difference and system variant, often producing substantively different results under comparable setups. |
| Keywords: | Dynamic panel data; Instrumental variables |
| JEL: | C23 C33 |
| Date: | 2026–03–20 |
| URL: | https://d.repec.org/n?u=RePEc:hhs:iuiwop:1555 |
| By: | Minkey Chang; Jae-Young Kim |
| Abstract: | We propose the Identifiable Variational Dynamic Factor Model (iVDFM), which learns latent factors from multivariate time series with identifiability guarantees. By applying iVAE-style conditioning to the innovation process driving the dynamics rather than to the latent states, we show that factors are identifiable up to permutation and component-wise affine (or monotone invertible) transformations. Linear diagonal dynamics preserve this identifiability and admit scalable computation via companion-matrix and Krylov methods. We demonstrate improved factor recovery on synthetic data, stable intervention accuracy on synthetic SCMs, and competitive probabilistic forecasting on real-world benchmarks. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.22886 |
| By: | Georg Keilbar; Sonja Greven |
| Abstract: | We propose a novel framework for conducting causal inference based on counterfactual densities. While the current paradigm of causal inference is mostly focused on estimating average treatment effects (ATEs), which restricts the analysis to the first moment of the outcome variable, our density-based approach is able to detect causal effects based on general distributional characteristics. Following the Oaxaca-Blinder decomposition approach, we consider two types of counterfactual density effects that together explain observed discrepancies between the densities of the treated and control group. First, the distribution effect is the counterfactual effect of changing the conditional density of the control group to that of the treatment group, while keeping the covariates fixed at the treatment group distribution. Second, the covariate effect represents the effect of a hypothetical change in the covariate distribution. Both effects have a causal interpretation under the classical unconfoundedness and overlap assumptions. Methodologically, our approach is based on analyzing the conditional densities as elements of a Bayes Hilbert space, which preserves the non-negativity and integration-to-one constraints. We specify a flexible functional additive regression model for estimating the conditional densities. We apply our method to analyze the German East-West income gap, i.e., the observed differences in wages between East Germans and West Germans. While most existing studies focus on average differences and neglect other distributional characteristics, our density-based approach is suited to detect all nuances of the counterfactual distributions, including differences in probability masses at zero. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.28470 |
| By: | Matthew Read (Reserve Bank of Australia) |
| Abstract: | I propose identifying structural vector autoregressions using 'shock-percentile' restrictions. These restrictions require the realisation of a structural shock in a selected episode to lie in the tail of the shock's historical distribution, representing the belief that a relatively large shock has occurred. I argue that shock-percentile restrictions are an attractive alternative to imposing numeric bounds on shock magnitudes, which are difficult to credibly elicit. Simulations demonstrate the potential for shock-percentile restrictions to provide identifying information. In two empirical applications, I exploit shock-percentile restrictions to disentangle the relationship between uncertainty and real activity, and to sharpen identification of the macroeconomic effects of US monetary policy. |
| Keywords: | monetary policy; narrative restrictions; set identification; structural vector autoregression; uncertainty |
| JEL: | C32 D80 E32 E44 E52 |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:rba:rbardp:rdp2026-01 |
| By: | Anish Agarwal; Jungjun Choi; Ming Yuan |
| Abstract: | We introduce a flexible framework for high-dimensional matrix estimation to incorporate side information for both rows and columns. Existing approaches, such as inductive matrix completion, often impose restrictive structure (for example, an exact low-rank covariate interaction term, linear covariate effects, and a limited ability to exploit components explained only by one side, row or column, or by neither) and frequently omit an explicit noise component. To address these limitations, we propose to decompose the underlying matrix as the sum of four complementary components: a (possibly nonlinear) interaction between row and column characteristics; a component driven by row characteristics; a component driven by column characteristics; and residual low-rank structure unexplained by observed characteristics. By combining sieve-based projection with nuclear-norm penalization, each component can be estimated separately, and the estimated components can then be aggregated to yield a final estimate. We derive convergence rates that highlight robustness across a range of model configurations depending on the informativeness of the side information. We further extend the method to partially observed matrices under both missing-at-random and missing-not-at-random mechanisms, including block-missing patterns motivated by causal panel data. Simulations and a real-data application to tobacco sales show that leveraging side information improves imputation accuracy and can enhance treatment-effect estimation relative to standard low-rank and spectral-based alternatives. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.24833 |
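The nuclear-norm penalization mentioned here has a simple computational core: singular value soft-thresholding, the proximal operator of the nuclear norm. A sketch showing it recover a low-rank signal from a noisy matrix (the paper's four-component decomposition and sieve projections are not reproduced):

```python
import numpy as np

def svt(M, lam):
    """Singular value soft-thresholding: the prox of lam * nuclear norm,
    the basic building block of nuclear-norm-penalized matrix estimation."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

rng = np.random.default_rng(6)
L = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 80))   # rank-5 signal
noisy = L + rng.normal(0.0, 0.5, (100, 80))
# lam chosen near the noise singular-value edge ~ sigma*(sqrt(m)+sqrt(n))
denoised = svt(noisy, lam=10.0)
err_raw = np.linalg.norm(noisy - L)
err_den = np.linalg.norm(denoised - L)
```

Thresholding at roughly the largest noise singular value kills the noise directions while keeping (slightly shrunk) signal directions, which is why the denoised estimate is both low rank and closer to the truth.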
| By: | Joseph Marshall |
| Abstract: | This paper studies how to estimate an individual's taste for forming a connection with another individual in a network. It compares the difficulty of estimation with and without the assumption that utility is transferable between individuals, and with and without the assumption that regressors are symmetric across individuals in the pair. I show that when pair-specific regressors are symmetric, the sufficient conditions for consistency and asymptotic normality of the maximum likelihood estimator that assumes transferable utility (TU-MLE) are also sufficient for the maximum likelihood estimator that does not assume transferable utility (NTU-MLE). When regressors are asymmetric, I provide sufficient conditions for the consistency and asymptotic normality of the NTU-MLE. I also provide a specification test to assess the validity of the transferable utility assumption. Two applications from different fields of economics demonstrate the value of my results. I find evidence of researchers using the TU-MLE when the transferable utility assumption is violated, and evidence of researchers using NTU-model-based estimators when the validity of the transferable utility assumption cannot be rejected. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.25641 |
| By: | Jia-Han Shih; Simon M. S. Lo; Ralf A. Wilke |
| Abstract: | Single-index models for time-to-event data are frequently applied in empirical research. These models are non-identifiable in the presence of unknown (dependent) censoring or competing risks and do not give informative results in empirical analysis unless rather strong, non-testable restrictions hold. Little is known about whether the known robustness properties of the single-index model carry over to models with dependent censoring or competing risks. This paper shows that the ratio of partial covariate effects on the margins is identifiable in nonparametric models with unknown dependent censoring or nonparametric competing risks models with nonparametric dependence structure, provided an exclusion restriction holds. Commonly used (semi)parametric models for the margin and independent censoring, such as Cox proportional hazards, accelerated failure time or proportional odds models, can be used to obtain relative covariate effects despite their misspecified censoring mechanism. Several nonparametric estimators for the general model are introduced and their numerical properties are studied. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.22914 |
| By: | Haoge Chang; Zeyang Yu |
| Abstract: | This article studies randomization inference for treatment effects in randomized controlled trials with attrition, where outcomes are observed for only a subset of units. We assume monotonicity in reporting behavior as in Lee (2009) and focus on the average treatment effect for always-reporters (AR-ATE), defined as units whose outcomes are observed under both treatment and control. Because always-reporter status is only partially revealed by observed assignment and response patterns, we propose a worst-case randomization test that maximizes the randomization p-value over all always-reporter configurations consistent with the data, with an optional pretest to prune implausible configurations. Using studentized Hájek- and chi-square-type statistics, we show the resulting procedure is finite-sample valid for the sharp null and asymptotically valid for the weak null. We also discuss computational implementations for discrete outcomes and integer-programming-based bounds for continuous outcomes. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.24970 |
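The building block of the worst-case procedure is the ordinary Fisher randomization test: under the sharp null of no effect, re-randomizing the treatment labels generates the reference distribution of the test statistic. A minimal sketch with a difference-in-means statistic (the maximization over always-reporter configurations, which is the paper's contribution, is not reproduced):

```python
import numpy as np

def randomization_pvalue(y, d, draws=2000, seed=0):
    """Fisher randomization test of the sharp null of no treatment effect:
    permute treatment labels and compare |difference-in-means| statistics."""
    rng = np.random.default_rng(seed)
    stat = lambda t: y[t == 1].mean() - y[t == 0].mean()
    obs = stat(d)
    perm = np.array([stat(rng.permutation(d)) for _ in range(draws)])
    return float(np.mean(np.abs(perm) >= np.abs(obs) - 1e-12))

rng = np.random.default_rng(8)
d = np.array([1] * 50 + [0] * 50)
y_null = rng.normal(size=100)        # no treatment effect
y_eff = y_null + 1.0 * d             # unit treatment effect
p_null = randomization_pvalue(y_null, d)
p_eff = randomization_pvalue(y_eff, d)
```

With attrition, which units enter the statistic depends on unobserved always-reporter status, so the paper computes this p-value for every data-consistent configuration and reports the worst case.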
| By: | Marcelo J. Moreira; Geert Ridder; Mahrad Sharifvaghefi |
| Abstract: | We characterize the maximal attainable power-size gap in overidentified instrumental variables models with heteroskedastic or autocorrelated (HAC) errors. Using total variation distance and Kraft's theorem, we define the decision-theoretic frontier of the testing problem. We show that Lagrange multiplier and conditional quasi-likelihood ratio tests can have power arbitrarily close to size even when the null and alternative are well separated, because they do not fully exploit the reduced-form likelihood. In contrast, the conditional likelihood ratio (CLR) test uses the full reduced-form likelihood. We prove that the power-size gap of the CLR test converges to one if and only if the testing problem becomes trivial in total variation distance, so that the CLR test attains the decision-theoretic frontier whenever any test can. An empirical illustration based on Yogo (2004) shows that these failures arise in empirically relevant configurations. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.21004 |
| By: | Roberto Fuentes-Martínez; Irene Crimaldi |
| Abstract: | A model-free measure of Granger causality in expectiles is proposed, generalizing the traditional mean-based measure to arbitrary positions of the conditional distribution. Expectiles are the only law-invariant risk measures that are both coherent and elicitable, making them particularly well-suited for studying distributional Granger causality where risk quantification and forecast evaluation are both relevant. Based on this measure, a test is developed using M-vine copula models, accounting for multivariate Granger causality with $d+1$ series under non-linear and non-Gaussian dependence, without imposing parametric assumptions on the joint distribution. Strong consistency of the test statistic is established under some regularity conditions. In finite samples, simulations show accurate size control and power increasing with sample size. A key advantage is the joint testing capability: causal relationships invisible to pairwise tests can be detected, as demonstrated both theoretically and empirically. Two applications to international stock market indices at the global and Asian regional level illustrate the practical relevance of the proposed framework. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.23294 |
| By: | Bulat Gafarov; Takuya Ura |
| Abstract: | It has become standard for empirical studies to conduct inference robust to cluster dependence and heterogeneity. With a small number of clusters, the normal approximation for the $t$-statistics of regression coefficients may be poor. This paper tackles this problem using a critical value based on the conditional Cramér–Edgeworth expansion for the $t$-statistics. Our approach guarantees third-order refinement, regardless of whether a regressor is discrete or not, and, unlike the cluster pairs bootstrap, avoids resampling data. Simulations show that our proposal can make a difference in size control with as few as 10 clusters. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.24786 |
| By: | Martin Bruns (School of Economics, University of East Anglia); Helmut Lütkepohl (DIW Berlin & FU Berlin) |
| Abstract: | In structural vector autoregressive analysis it has become quite popular to identify some structural shocks of interest by external instruments or proxies. This study points out a range of areas where such proxies have been used and sketches the way the proxies have been constructed. It reviews identification and estimation methods that have been considered in this context. Moreover, it points out some features such as heteroskedasticity, nonfundamentalness of the shocks and violations of the standard assumptions for proxies that may result in complications. |
| Keywords: | Structural vector autoregression, proxy VAR, local projection, weak instruments, internal instruments, external instruments, fundamental shocks |
| JEL: | C32 |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:uea:ueaeco:2026-01 |
| By: | Vadim Ustyuzhanin |
| Abstract: | This paper proposes Covariate-Balanced Weighted Stacked Difference-in-Differences (CBWSDID), a design-based extension of weighted stacked DID for settings in which untreated trends may be conditionally rather than unconditionally parallel. The estimator separates within-subexperiment design adjustment from across-subexperiment aggregation: matching or weighting improves treated-control comparability within each stacked subexperiment, while the corrective stacked weights of Wing et al. recover the target aggregate ATT. I show that the same logic extends from absorbing treatment to repeated $0 \to 1$ and $1 \to 0$ episodes under a finite-memory assumption. The paper develops the identifying framework, discusses inference, presents simulation evidence, and illustrates the estimator in applications based on Trounstine (2020) and Acemoglu et al. (2019). Across these examples, CBWSDID serves as a bridge between weighted stacked DID and design-based panel matching. The accompanying R package cbwsdid is available on GitHub. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.02293 |
| By: | Demetrio Lacava |
| Abstract: | This paper introduces a new extension of the Conditional Autoregressive Value at Risk (CAViaR) model aimed at improving tail risk forecasting across assets. The proposed component-based model, CAViaR with Spillover Effects (CAViaR-SE), decomposes the conditional Value at Risk into a proper-risk component and a spillover component driven by a linear combination of tail risks from influential assets. These assets are selected via a recursive partial correlation algorithm, allowing multiple spillover sources with minimal parameterization. The spillover component acts as a predictable quantile shifter, directly affecting the conditional quantile dynamics rather than the volatility scale. Empirical results on Dow Jones Industrial Average stocks show that spillover effects account for a substantial share of total tail risk and significantly improve out-of-sample tail risk forecasts. Backtesting procedures, together with Model Confidence Set (MCS) analysis, confirm that CAViaR-SE provides well-calibrated risk measures and statistically superior forecasts compared to standard and augmented CAViaR models. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.25217 |
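As a rough illustration of the model class this abstract extends, the symmetric absolute value CAViaR recursion of Engle and Manganelli can be sketched with an additive quantile-shifter term standing in for the spillover component. This is not the paper's code: the parameter values, the spillover series, and the function name are all hypothetical.

```python
import numpy as np

def caviar_sav(returns, beta, q0, spill=None, gamma=0.0):
    """Symmetric-absolute-value CAViaR recursion with an optional additive
    spillover term (a stylized stand-in for the CAViaR-SE decomposition):
    VaR_t = b0 + b1 * VaR_{t-1} + b2 * |r_{t-1}| + gamma * spill_{t-1}."""
    b0, b1, b2 = beta
    var = np.empty(len(returns))
    var[0] = q0  # initial quantile level
    s = np.zeros(len(returns)) if spill is None else np.asarray(spill)
    for t in range(1, len(returns)):
        var[t] = b0 + b1 * var[t - 1] + b2 * abs(returns[t - 1]) + gamma * s[t - 1]
    return var

# Hypothetical return series and parameters, for illustration only.
rng = np.random.default_rng(0)
r = rng.standard_normal(500) * 0.01
v = caviar_sav(r, beta=(0.001, 0.9, 0.3), q0=0.02)
```

In the paper, the spillover series is itself built from the tail risks of assets selected by a recursive partial-correlation algorithm; here it is left as an exogenous input.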
| By: | Wayne Gao |
| Abstract: | Normalization is ubiquitous in economics, and a growing literature shows that ``normalizations'' can matter for interpretation, counterfactual analysis, misspecification, and inference. This paper provides a general framework for these issues, based on the formalized notion of modeling equivalence that partitions the space of unknowns into equivalence classes, and defines normalization as a WLOG selection of one representative from each class. A counterfactual parameter is normalization-free if and only if it is constant on equivalence classes; otherwise any point identification is created by the normalization rather than by the model. Applications to discrete choice, demand estimation, and network formation illustrate the insights made explicit through this criterion. We then study two further sources of fragility: an extension trilemma establishes that fidelity, invariance, and regularity cannot simultaneously hold at a boundary singularity, while a normalization can itself introduce a coordinate singularity that distorts the topological and metric structures of the parameter space, with consequences for estimation and inference. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.27762 |
| By: | Anton Malandii; Stan Uryasev |
| Abstract: | This paper introduces \emph{biased mean regression}, estimating the \emph{biased mean}, i.e., $\mathbb{E}[Y] + x$, where $x \in \mathbb{R}$. The approach addresses a fundamental statistical problem that covers numerous applications. For instance, it can be used to estimate factors driving portfolio loss exceeding the expected loss by a specified amount (e.g., $x = \$10$ billion) or to estimate factors impacting a specific excess release of radiation in the environment, where nuclear safety regulations specify different severity levels. The estimation is performed by minimizing the so-called \emph{superexpectation error}. We establish two equivalence results that connect the method to popular paradigms: (i) biased mean regression is equivalent to quantile regression for an appropriate parameterization and is equivalent to ordinary least squares when $x=0$; (ii) in portfolio optimization, minimizing \emph{superexpectation risk}, associated with the superexpectation error, is equivalent to CVaR optimization. The approach is computationally attractive, as minimizing the superexpectation error reduces to linear programming (LP), thereby offering algorithmic and modeling advantages. It is also a good alternative to ordinary least squares (OLS) regression. The approach is based on the \emph{Risk Quadrangle} (RQ) framework, which links four stochastic functionals -- error, regret, risk, and deviation -- through a statistic. For the biased mean quadrangle, the statistic is the biased mean. We study properties of the new quadrangle, such as \emph{subregularity}, and establish its relationship to the quantile quadrangle. Numerical experiments confirm the theoretical statements and illustrate the practical implications. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.26901 |
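The abstract's two computational claims — equivalence to quantile regression under an appropriate parameterization, and reduction to linear programming — can be illustrated with the standard LP formulation of quantile regression. This sketch shows the LP reduction only; the specific superexpectation parameterization follows the paper, and the data here are simulated.

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression_lp(X, y, alpha):
    """Quantile regression as an LP: split residuals y - X b = u+ - u-,
    then minimize alpha * 1'u+ + (1 - alpha) * 1'u- subject to
    X b + u+ - u- = y, with u+, u- >= 0 and b free."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), alpha * np.ones(n), (1 - alpha) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

# Simulated data: median regression should recover roughly (1, 2).
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
X = np.column_stack([np.ones(200), x])
y = 1.0 + 2.0 * x + rng.standard_normal(200)
b_med = quantile_regression_lp(X, y, alpha=0.5)
```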
| By: | Emanuele Lopetuso; Massimiliano Caporin |
| Abstract: | Traditional econometric analyses represent observations as vectors despite the inherent complexity of empirical data structures. When data are organized along dual classification dimensions, a matrix representation provides a more natural and interpretable framework. Building on recent advances in matrix autoregressive (MAR) modeling, this study introduces a novel error correction representation tailored for matrix-structured data. Through comparative analysis with existing methodologies, we demonstrate two critical advancements. First, the proposed model preserves the interpretative foundations of conventional cointegration analysis, with coefficients that explicitly capture dynamics rooted in adjustment toward steady-state positions. Second, in contrast to previous formulations, our error correction framework allows for an equivalent matrix autoregressive representation, preserving the fundamental structure of the data in both specifications. This ensures that the matrix representation reflects an intrinsic characteristic of the data. |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2604.00723 |
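For readers unfamiliar with the baseline model, a first-order matrix autoregression takes the bilinear form $X_t = A X_{t-1} B' + E_t$, which the proposed error correction representation is designed to remain equivalent to. A minimal simulation sketch, with hypothetical coefficient matrices chosen for stationarity:

```python
import numpy as np

def simulate_mar1(A, B, T, rng):
    """Simulate a first-order matrix autoregression X_t = A X_{t-1} B' + E_t,
    where each observation X_t is an m x n matrix (rows and columns index
    the two classification dimensions)."""
    m, n = A.shape[0], B.shape[0]
    X = np.zeros((T, m, n))
    for t in range(1, T):
        X[t] = A @ X[t - 1] @ B.T + rng.standard_normal((m, n))
    return X

# Hypothetical coefficients: spectral radius of B kron A is 0.2 < 1,
# so the process is stationary.
rng = np.random.default_rng(6)
A = 0.5 * np.eye(3)
B = 0.4 * np.eye(2)
X = simulate_mar1(A, B, 200, rng)
```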
| By: | Valli, Roberto (ETH Zürich) |
| Abstract: | Difference-in-differences designs often study place-based treatments that can trigger migration in and out of treated areas. When treatment changes who is observed, aggregate and within-individual DiD no longer retrieve the average treatment effect on the treated. This paper uses principal stratification to characterize three estimands under treatment-induced migration: a locality-level treatment effect, a stayer average treatment effect, and an individual ATT for the pre-treatment population. Aggregate DiD identifies the locality-level effect under aggregate parallel trends, while within-individual DiD identifies the stayer effect under stayer parallel trends. Together, the two assumptions imply parallel trends for escapees, a non-trivial restriction on the group whose departure drives compositional change. With panel data, the stayer effect and the compositional term are point-identified, and the ATT lies in a one-parameter sensitivity region. With repeated cross-sections, Lee-type bounds apply in the one-sided exit case. An appendix extends these results to natural turnover and treated-control interference. |
| Date: | 2026–03–23 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:s7pw3_v1 |
| By: | Victor Medina-Olivares; Wangzhen Xia; Stefan Lessmann; Nadja Klein |
| Abstract: | We propose a semi-structured discrete-time multi-state model to analyse mortgage delinquency transitions. This model combines an easy-to-understand structured additive predictor, which includes linear effects and smooth functions of time and covariates, with a flexible neural network component that captures complex nonlinearities and higher-order interactions. To ensure identifiability when covariates are present in both components, we orthogonalise the unstructured part relative to the structured design. For discrete-time competing transitions, we derive exact transformations that map binary logistic models to valid competing transition probabilities, avoiding the need for continuous-time approximations. In simulations, our framework effectively recovers structured baseline and covariate effects while using the neural component to detect interaction patterns. We demonstrate the method using the Freddie Mac Single-Family Loan-Level Dataset, employing an out-of-time test design. Compared with a structured generalised additive benchmark, the semi-structured model provides modest but consistent gains in discrimination across the earliest prediction spans, while maintaining similar Brier scores. Adding macroeconomic indicators provides limited incremental benefit in this out-of-time evaluation and does not materially change the estimated borrower-, loan-, or duration-driven effects. Overall, semi-structured multi-state modelling offers a practical compromise between transparent effect estimates and flexible pattern learning, with potential applications beyond credit-transition forecasting. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.26309 |
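The identifiability device mentioned in this abstract — orthogonalising the unstructured (neural) part relative to the structured design — amounts to a residual projection. A minimal sketch of that projection step, with hypothetical inputs; the paper's actual architecture sits on top of this:

```python
import numpy as np

def orthogonalize(U, X):
    """Project the unstructured features U off the column space of the
    structured design X: U_perp = (I - X (X'X)^+ X') U. After this step
    the neural component cannot absorb effects already captured by the
    structured additive predictor."""
    P = X @ np.linalg.pinv(X.T @ X) @ X.T  # hat matrix of the structured design
    return U - P @ U

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(100), rng.standard_normal((100, 2))])
U = rng.standard_normal((100, 5))
U_perp = orthogonalize(U, X)
```

By construction the orthogonalised features are uncorrelated with every structured column, which is what makes the structured effect estimates interpretable.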
| By: | Karun Adusumilli; Maximilian Kasy; Ashia Wilson |
| Abstract: | We derive the asymptotic risk function of regularized empirical risk minimization (ERM) estimators tuned by $n$-fold cross-validation (CV). The out-of-sample prediction loss of such estimators converges in distribution to the squared-error loss (risk function) of shrinkage estimators in the normal means model, tuned by Stein's unbiased risk estimate (SURE). This risk function provides a more fine-grained picture of predictive performance than uniform bounds on worst-case regret, which are common in learning theory: it quantifies how risk varies with the true parameter. As key intermediate steps, we show that (i) $n$-fold CV converges uniformly to SURE, and (ii) while SURE typically has multiple local minima, its global minimum is generically well separated. Well-separation ensures that uniform convergence of CV to SURE translates into convergence of the tuning parameter chosen by CV to that chosen by SURE. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.20388 |
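The limiting object in this abstract — SURE-tuned shrinkage in the normal means model — can be made concrete with the classical Donoho–Johnstone formula for soft thresholding with unit noise variance. The data and threshold grid below are illustrative, not from the paper.

```python
import numpy as np

def sure_soft_threshold(y, lam):
    """Stein's unbiased risk estimate for the soft-threshold estimator in
    the normal means model with unit noise variance:
    SURE(lam) = n - 2 * #{|y_i| <= lam} + sum_i min(y_i^2, lam^2)."""
    n = len(y)
    return n - 2 * np.sum(np.abs(y) <= lam) + np.sum(np.minimum(y**2, lam**2))

# Sparse means: 10 strong signals, 90 nulls (illustrative data).
rng = np.random.default_rng(3)
theta = np.concatenate([np.full(10, 5.0), np.zeros(90)])
y = theta + rng.standard_normal(100)

grid = np.linspace(0.0, 4.0, 81)
risks = np.array([sure_soft_threshold(y, t) for t in grid])
lam_hat = grid[np.argmin(risks)]  # SURE-tuned threshold
```

The paper's result is that the tuning parameter chosen by $n$-fold CV converges to the minimizer of exactly this kind of SURE criterion, provided the global minimum is well separated.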
| By: | David Bruns-Smith |
| Abstract: | The Riesz representer is a central object in semiparametric statistics and debiased/doubly-robust estimation. Two literatures in econometrics have highlighted the role for directly estimating Riesz representers: the automatic debiased machine learning literature (as in Chernozhukov et al., 2022b), and an independent literature on sieve methods for conditional moment models (as in Chen et al., 2014). These two literatures solve distinct optimization problems that in the population both have the Riesz representer as their solution. We show that with unregularized or ridge-regularized linear, sieve, or RKHS models, the two resulting estimators are numerically equivalent. However, for other regularization schemes such as the Lasso, or more general machine learning function classes including neural networks, the estimators are not necessarily equivalent. In the latter case, the Chen et al. (2014) formulation yields a novel constrained optimization problem for directly estimating Riesz representers with machine learning. Drawing on results from Birrell et al. (2022), we conjecture that this approach may offer statistical advantages at the cost of greater computational complexity. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.20936 |
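For the unregularized linear case where the abstract's two formulations coincide, direct Riesz estimation has a closed form: with basis $b(w)$, solve $G\rho = M$ where $G = \mathbb{E}[b b']$ and $M = \mathbb{E}[m(W; b)]$. A sketch for the ATE functional, where $m(w; g) = g(1, x) - g(0, x)$; the basis and data below are hypothetical.

```python
import numpy as np

def riesz_ate(D, X):
    """Linear Riesz regression for the ATE functional: with basis
    b(d, x) = [1, d, x, d*x], solve rho = G^{-1} M, where G = E[b b'] and
    M = E[b(1, x) - b(0, x)]. The fit alpha(d, x) = b(d, x)'rho estimates
    the Riesz representer (the inverse-propensity weights)."""
    def b(d, x):
        return np.column_stack([np.ones(len(d)), d, x, d * x])
    B = b(D, X)
    G = B.T @ B / len(D)
    M = (b(np.ones(len(D)), X) - b(np.zeros(len(D)), X)).mean(axis=0)
    rho = np.linalg.solve(G, M)
    return B @ rho  # alpha-hat at the observed (D_i, X_i)

# Simulated data with a logistic propensity score (illustrative only).
rng = np.random.default_rng(5)
X = rng.standard_normal(2000)
p = 1 / (1 + np.exp(-X))
D = (rng.uniform(size=2000) < p).astype(float)
alpha_hat = riesz_ate(D, X)
```

The normal equations enforce the defining moment property in-sample: the empirical inner product of $\hat\alpha$ with each basis function equals the corresponding entry of $M$.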
| By: | Koichiro Kamada (Faculty of Business and Commerce, Keio University) |
| Abstract: | We propose a simple method for estimating multiple natural rates in a system of simultaneous equations. Our estimators of natural rates are closely related to the HP filter and accessible to many practitioners. As an application, Japan’s potential output and natural foreign exchange rate are estimated. It is shown that Japan’s potential output has been growing, but the natural foreign exchange rate has experienced stepwise downward shifts since the beginning of the 21st century. While Japan suffered long-lasting stagnation, emerging markets, particularly China, achieved tremendous economic growth. The declines in the natural foreign exchange rate clearly indicate Japan’s loss of competitiveness in the world economy. |
| Keywords: | Phillips curve, net export, potential output, output gap, exchange rate, productivity, international competitiveness, HP filter |
| JEL: | C13 C32 E31 E32 F14 F41 O47 |
| Date: | 2026–03–18 |
| URL: | https://d.repec.org/n?u=RePEc:keo:dpaper:dp2026-005 |
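Since the proposed natural-rate estimators are described as closely related to the HP filter, it may help to recall the filter's closed form: the trend solves $\min_\tau \sum (y_t - \tau_t)^2 + \lambda \sum (\Delta^2 \tau_t)^2$, i.e., $\tau = (I + \lambda D'D)^{-1} y$ with $D$ the second-difference matrix. A sketch with an illustrative series (not the paper's data):

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott filter: trend = (I + lam * D'D)^{-1} y, where D is
    the (n-2) x n second-difference matrix. Returns (trend, cycle)."""
    n = len(y)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)
    return trend, y - trend

# Illustrative series: linear growth plus a 20-period cycle.
t = np.arange(120)
y = 0.02 * t + np.sin(2 * np.pi * t / 20)
trend, cycle = hp_filter(y, lam=1600.0)
```

A linear series passes through untouched (its second differences are zero), which is the sense in which the filter extracts a smooth "natural" path.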
| By: | Manuel Quintero; Advik Shreekumar; William T. Stephenson; Tamara Broderick |
| Abstract: | Scientists often want to explain why an outcome is different in two groups. For instance, differences in patient mortality rates across two hospitals could be due to differences in the patients themselves (covariates) or differences in medical care (outcomes given covariates). The Oaxaca--Blinder decomposition (OBD) is a standard tool to tease apart these factors. It is well known that the OBD requires choosing one of the groups as a reference, and the numerical answer can vary with the reference. To the best of our knowledge, there has not been a systematic investigation into whether the choice of OBD reference can yield different substantive conclusions and how common this issue is. In the present paper, we give existence proofs in real and simulated data that the OBD references can yield substantively different conclusions and that these differences are not entirely driven by model misspecification or small data. We prove that substantively different conclusions occur in up to half of the parameter space, but find these discrepancies rare in the real-data analyses we study. We explain this empirical rarity by examining how realistic data-generating processes can be biased towards parameters that do not change conclusions under the OBD. |
| Date: | 2026–03 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2603.29972 |
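The reference-dependence at issue in this abstract is easy to exhibit with the standard two-fold decomposition: the mean gap $\bar y_A - \bar y_B$ splits into an explained (covariate) part and an unexplained (coefficient) part, and the split depends on which group's coefficients serve as reference. A sketch with simulated data; the variable names are illustrative.

```python
import numpy as np

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def oaxaca_blinder(XA, yA, XB, yB, ref="B"):
    """Two-fold Oaxaca-Blinder decomposition of yA.mean() - yB.mean()
    into explained and unexplained parts, using group ref's coefficients
    to price the covariate differences."""
    bA, bB = ols(XA, yA), ols(XB, yB)
    xA, xB = XA.mean(axis=0), XB.mean(axis=0)
    if ref == "B":
        return (xA - xB) @ bB, xA @ (bA - bB)
    return (xA - xB) @ bA, xB @ (bA - bB)

# Groups differ in both covariate means and coefficients, so the two
# references split the same gap differently.
rng = np.random.default_rng(4)
XA = np.column_stack([np.ones(300), rng.normal(1.0, 1.0, 300)])
XB = np.column_stack([np.ones(300), rng.normal(0.0, 1.0, 300)])
yA = XA @ np.array([0.5, 1.5]) + rng.standard_normal(300)
yB = XB @ np.array([0.0, 1.0]) + rng.standard_normal(300)
gap = yA.mean() - yB.mean()
eB, uB = oaxaca_blinder(XA, yA, XB, yB, ref="B")
eA, uA = oaxaca_blinder(XA, yA, XB, yB, ref="A")
```

Both decompositions sum exactly to the same gap, yet the explained shares differ; the paper's question is how often that numerical difference flips the substantive conclusion.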