NEP: New Economics Papers on Econometrics
| By: | Paul Goldsmith-Pinkham; Peter Hull; Michal Kolesár |
| Abstract: | We develop a step-by-step guide to leniency (a.k.a. judge or examiner instrument) designs, drawing on recent econometric literatures. The unbiased jackknife instrumental variables estimator (UJIVE) is purpose-built for leveraging exogenous leniency variation, avoiding subtle biases even in the presence of many decision-makers or controls. We show how UJIVE can also be used to assess key assumptions underlying leniency designs, including quasi-random assignment and average first-stage monotonicity, and to probe the external validity of treatment effect estimates. We further discuss statistical inference, arguing that non-clustered standard errors are often appropriate. A reanalysis of Farre-Mensa et al. (2020), using quasi-random examiner assignment to estimate the value of patents to startups, illustrates our checklist. |
| JEL: | C01 G0 H0 J0 K0 |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:34473 |
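A minimal sketch of the leave-out leniency idea behind jackknife IV estimators such as UJIVE: each case's instrument is its examiner's decision rate computed excluding the case itself, which removes own-observation bias. The function names and the simple just-identified IV step are illustrative only; UJIVE as developed in the paper additionally handles controls and many-examiner asymptotics.

```python
import numpy as np

def leave_out_leniency(examiner, decision):
    """Leave-one-out leniency: for case i assigned to examiner j, the mean
    decision of examiner j computed excluding case i itself."""
    examiner = np.asarray(examiner)
    decision = np.asarray(decision, dtype=float)
    z = np.empty_like(decision)
    for j in np.unique(examiner):
        mask = examiner == j
        n_j = mask.sum()
        if n_j < 2:
            raise ValueError(f"examiner {j} has fewer than two cases")
        z[mask] = (decision[mask].sum() - decision[mask]) / (n_j - 1)
    return z

def iv_estimate(y, d, z):
    """Just-identified IV slope of outcome y on treatment d using instrument z."""
    zc = z - z.mean()
    return zc @ (y - y.mean()) / (zc @ (d - d.mean()))
```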
| By: | Sofiia Dolgikh; Bogdan Potanin |
| Abstract: | We propose plug-in (PI) and double machine learning (DML) estimators of average treatment effect (ATE), average treatment effect on the treated (ATET) and local average treatment effect (LATE) in the multivariate sample selection model with ordinal selection equations. Our DML estimators are doubly-robust and based on the efficient influence functions. Finite sample properties of the proposed estimators are studied and compared on simulated data. Specifically, the results of the analysis suggest that without addressing multivariate sample selection, the estimates of the causal parameters may be highly biased. However, the proposed estimators allow us to avoid these biases. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.12640 |
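The paper's PI and DML estimators target multivariate selection models with ordinal selection equations; as background, here is the generic cross-fitted doubly robust (AIPW) estimate of the ATE built from the efficient influence function, with scikit-learn learners standing in for arbitrary nuisance estimators. All names are illustrative, not the paper's implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

def dml_ate(y, d, X, n_splits=5, seed=0):
    """Cross-fitted AIPW estimate of the ATE and its standard error.
    y, d, X are numpy arrays; d is a binary treatment indicator."""
    y, d, X = map(np.asarray, (y, d, X))
    psi = np.empty(len(y))
    for tr, te in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        # propensity score e(x) = P(D = 1 | X = x), clipped for overlap
        e = RandomForestClassifier(random_state=seed).fit(X[tr], d[tr])
        e_hat = np.clip(e.predict_proba(X[te])[:, 1], 0.01, 0.99)
        # outcome regressions m_d(x) = E[Y | X = x, D = d]
        m1 = RandomForestRegressor(random_state=seed).fit(X[tr][d[tr] == 1], y[tr][d[tr] == 1])
        m0 = RandomForestRegressor(random_state=seed).fit(X[tr][d[tr] == 0], y[tr][d[tr] == 0])
        m1_hat, m0_hat = m1.predict(X[te]), m0.predict(X[te])
        # efficient influence function for the ATE
        psi[te] = (m1_hat - m0_hat
                   + d[te] * (y[te] - m1_hat) / e_hat
                   - (1 - d[te]) * (y[te] - m0_hat) / (1 - e_hat))
    return psi.mean(), psi.std() / np.sqrt(len(y))
```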
| By: | Paul Goldsmith-Pinkham; Tianshu Lyu |
| Abstract: | Financial event studies, ubiquitous in finance research, typically use linear factor models with known factors to estimate abnormal returns and identify causal effects of information events. This paper demonstrates that when factor models are misspecified -- an almost certain reality -- traditional event study estimators produce inconsistent estimates of treatment effects. The bias is particularly severe during volatile periods, over long horizons, and when event timing correlates with market conditions. We derive precise conditions for identification and expressions for asymptotic bias. As an alternative, we propose synthetic control methods that construct replicating portfolios from control securities without imposing specific factor structures. Revisiting four empirical applications, we show that some established findings may reflect model misspecification rather than true treatment effects. While traditional methods remain reliable for short-horizon studies with random event timing, our results suggest caution when interpreting long-horizon or volatile-period event studies and highlight the importance of quasi-experimental designs when available. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.15123 |
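A sketch of the replicating-portfolio idea: choose nonnegative weights summing to one so that a portfolio of control securities matches the treated security's pre-event returns, then read abnormal returns off the post-event gap. This is a generic synthetic control construction under those weight constraints, not necessarily the paper's exact estimator.

```python
import numpy as np
from scipy.optimize import minimize

def replicating_weights(treated_pre, controls_pre):
    """Nonnegative weights summing to one that best replicate the treated
    security's pre-event returns with a portfolio of control securities."""
    k = controls_pre.shape[1]
    obj = lambda w: np.sum((treated_pre - controls_pre @ w) ** 2)
    res = minimize(obj, np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(0.0, 1.0)] * k,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
    return res.x

def abnormal_returns(treated_post, controls_post, w):
    """Post-event abnormal returns: actual minus synthetic portfolio return."""
    return treated_post - controls_post @ w
```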
| By: | Andrei Zeleneev; Weisheng Zhang |
| Abstract: | Interactive fixed effects are routinely controlled for in linear panel models. While an analogous fixed effects (FE) estimator for nonlinear models has been available in the literature (Chen, Fernandez-Val and Weidner, 2021), it sees much more limited use in applied research because its implementation involves solving a high-dimensional non-convex problem. In this paper, we complement the theoretical analysis of Chen, Fernandez-Val and Weidner (2021) by providing a new computationally efficient estimator that is asymptotically equivalent to their estimator. Unlike the previously proposed FE estimator, our estimator avoids solving a high-dimensional optimization problem and can be feasibly computed in large nonlinear panels. Our proposed method involves two steps. In the first step, we convexify the optimization problem using nuclear norm regularization (NNR) and obtain preliminary NNR estimators of the parameters, including the fixed effects. Then, we find the global solution of the original optimization problem using a standard gradient descent method initialized at these preliminary estimates. Thus, in practice, one can simply combine our computationally efficient estimator with the inferential theory provided in Chen, Fernandez-Val and Weidner (2021) to construct confidence intervals and perform hypothesis testing. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.15427 |
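The convex first step rests on nuclear norm regularization, whose proximal operator is singular value soft-thresholding. Below is a toy proximal-gradient sketch for a linear low-rank problem; the paper's setting is nonlinear panels, so this only illustrates the NNR convexification that produces the preliminary estimates.

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding: the prox operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nnr_fit(Y, mask, lam, n_iter=200):
    """Proximal gradient for min_Theta 0.5*||mask*(Y - Theta)||_F^2 + lam*||Theta||_*,
    the kind of convex problem used to initialize a non-convex refinement."""
    Theta = np.zeros_like(Y)
    for _ in range(n_iter):
        grad = mask * (Theta - Y)   # gradient of the smooth part (Lipschitz constant 1)
        Theta = svt(Theta - grad, lam)
    return Theta
```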
| By: | Amaze Lusompa |
| Abstract: | It is well known that model selection via cross-validation can be biased for time series models. However, many researchers have argued that this bias does not apply when using cross-validation with vector autoregressions (VARs) or with time series models whose errors follow a martingale-like structure. I show that, even under these circumstances, performing cross-validation on time series data will in general still generate bias. |
| Keywords: | time series; model selection; model validation; martingale |
| JEL: | C52 C50 C10 |
| Date: | 2025–11–24 |
| URL: | https://d.repec.org/n?u=RePEc:fip:fedkrw:102151 |
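For context, the standard forward-chaining scheme that the VAR/martingale argument concerns trains only on the past and validates on the subsequent block; the paper's point is that even such schemes can remain biased for model selection. A minimal sketch using scikit-learn's TimeSeriesSplit:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit

def forward_chain_cv_mse(X, y, n_splits=5):
    """Expanding-window (forward-chaining) cross-validation MSE: each fold
    trains on earlier observations and validates on the block that follows."""
    errs = []
    for train, test in TimeSeriesSplit(n_splits=n_splits).split(X):
        model = LinearRegression().fit(X[train], y[train])
        errs.append(np.mean((y[test] - model.predict(X[test])) ** 2))
    return float(np.mean(errs))
```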
| By: | Masahiro Tanaka |
| Abstract: | We develop a computationally efficient framework for quasi-Bayesian inference based on linear moment conditions. The approach employs a delayed acceptance Markov chain Monte Carlo (DA-MCMC) algorithm that uses a surrogate target kernel and a proposal distribution derived from an approximate conditional posterior, thereby exploiting the structure of the quasi-likelihood. Two implementations are introduced. DA-MCMC-Exact fully incorporates prior information into the proposal distribution and maximizes per-iteration efficiency, whereas DA-MCMC-Approx omits the prior in the proposal to reduce matrix inversions, improving numerical stability and computational speed in higher dimensions. Simulation studies on heteroskedastic linear regressions show substantial gains over standard MCMC and conventional DA-MCMC baselines, measured by multivariate effective sample size per iteration and per second. The Approx variant yields the best overall throughput, while the Exact variant attains the highest per-iteration efficiency. Applications to two empirical instrumental variable regressions corroborate these findings: the Approx implementation scales to larger designs where other methods become impractical, while still delivering precise inference. Although developed for moment-based quasi-posteriors, the proposed approach also extends to risk-based quasi-Bayesian formulations when first-order conditions are linear and can be transformed analogously. Overall, the proposed algorithms provide a practical and robust tool for quasi-Bayesian analysis in statistical applications. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.17117 |
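A generic delayed-acceptance Metropolis-Hastings loop with a symmetric random-walk proposal: a cheap surrogate log-density screens proposals, and the expensive target is evaluated only for survivors, with a second-stage correction that keeps the chain exact for the target. The paper's DA-MCMC-Exact and DA-MCMC-Approx variants instead build proposals from an approximate conditional posterior; this sketch shows only the two-stage acceptance logic.

```python
import numpy as np

def da_mh(log_target, log_surrogate, x0, n_iter=5000, step=0.5, seed=0):
    """Two-stage (delayed acceptance) Metropolis-Hastings. With a symmetric
    proposal, stage 2's ratio corrects exactly for the surrogate screen."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    ls_x, lt_x = log_surrogate(x), log_target(x)
    chain = np.empty((n_iter, x.size))
    for t in range(n_iter):
        y = x + step * rng.standard_normal(x.size)
        ls_y = log_surrogate(y)
        if np.log(rng.uniform()) < ls_y - ls_x:            # stage 1: cheap screen
            lt_y = log_target(y)                           # expensive call
            if np.log(rng.uniform()) < (lt_y - lt_x) - (ls_y - ls_x):  # stage 2
                x, ls_x, lt_x = y, ls_y, lt_y
        chain[t] = x
    return chain
```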
| By: | Nan Liu; Yanbo Liu; Yuya Sasaki; Yuanyuan Wan |
| Abstract: | We develop methods for nonparametric uniform inference in cost-sensitive binary classification, a framework that encompasses maximum score estimation, predicting utility maximizing actions, and policy learning. These problems are well known for slow convergence rates and non-standard limiting behavior, even under point identified parametric frameworks. In nonparametric settings, they may further suffer from failures of identification. To address these challenges, we introduce a strictly convex surrogate loss that point-identifies a representative nonparametric policy function. We then estimate this surrogate policy to conduct inference on both the optimal classification policy and the optimal policy value. This approach enables Gaussian inference, substantially simplifying empirical implementation relative to working directly with the original classification problem. In particular, we establish root-$n$ asymptotic normality for the optimal policy value and derive a Gaussian approximation for the optimal classification policy at the standard nonparametric rate. Extensive simulation studies corroborate the theoretical findings. We apply our method to the National JTPA Study to conduct inference on the optimal treatment assignment policy and its associated welfare. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.14700 |
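The reduction behind surrogate approaches to policy learning, in its simplest weighted-classification form: the sign of an estimated treatment gain gives the label, its magnitude the weight, and a strictly convex logistic loss stands in for the non-convex 0-1 policy objective. The paper's surrogate loss and nonparametric policy class differ from this parametric toy; all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def surrogate_policy(X, gain):
    """Weighted-classification reduction of policy learning: label = sign of
    the estimated per-unit treatment gain, weight = its magnitude, with the
    strictly convex logistic loss replacing the 0-1 policy objective."""
    label = (gain > 0).astype(int)
    clf = LogisticRegression().fit(X, label, sample_weight=np.abs(gain))
    return clf   # clf.predict(X_new) == 1 means "treat"
```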
| By: | Toru Kitagawa; Yizhou Kuang |
| Abstract: | Leaving posterior sensitivity concerns aside, non-identifiability of the parameters does not pose a difficulty for Bayesian inference as long as the posterior is proper, but the multi-modality or flat regions of the posterior induced by the lack of identification do pose a challenge for modern Bayesian computation. Sampling methods often struggle with slow convergence or non-convergence when the target distribution has multiple modes or flat regions. This paper develops a novel Markov chain Monte Carlo (MCMC) approach for non-identified models that leverages knowledge of observationally equivalent sets of parameters, and highlights the important role that identification plays in modern Bayesian analysis. We show that our proposal avoids becoming trapped in a local mode and achieves a faster rate of convergence than existing MCMC techniques, including random walk Metropolis-Hastings and Hamiltonian Monte Carlo. The gain in the speed of convergence is more significant as the dimension or cardinality of the identified sets increases. Simulation studies show superior performance compared to other popular computational methods, including Hamiltonian Monte Carlo and sequential Monte Carlo. We also demonstrate that our method uncovers non-trivial modes of the target distribution in a structural vector moving-average (SVMA) application. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.12847 |
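One way to exploit known observationally equivalent parameter sets is to mix random-walk moves with deterministic jumps across equivalent points. This sketch assumes each equivalence map is an involution with unit Jacobian, for example a sign flip or a permutation, so the ordinary Metropolis ratio applies; the paper's algorithm is more general than this.

```python
import numpy as np

def mode_jumping_mh(log_post, equiv_maps, x0, n_iter=5000, step=0.3, p_jump=0.2, seed=0):
    """Random-walk MH augmented with jumps to observationally equivalent
    points. Each map in equiv_maps must be its own inverse with |Jacobian| = 1
    (e.g. lambda x: -x), making the jump proposal symmetric."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    lp_x = log_post(x)
    chain = np.empty((n_iter, x.size))
    for t in range(n_iter):
        if rng.uniform() < p_jump:
            g = equiv_maps[rng.integers(len(equiv_maps))]
            y = g(x)                              # jump across equivalent modes
        else:
            y = x + step * rng.standard_normal(x.size)
        lp_y = log_post(y)
        if np.log(rng.uniform()) < lp_y - lp_x:   # standard Metropolis ratio
            x, lp_x = y, lp_y
        chain[t] = x
    return chain
```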
| By: | Jessy Xinyi Han; Devavrat Shah |
| Abstract: | Estimating causal effects on time-to-event outcomes from observational data is particularly challenging due to censoring, limited sample sizes, and non-random treatment assignment. The need for answering such "when-if" questions--how the timing of an event would change under a specified intervention--commonly arises in real-world settings with heterogeneous treatment adoption and confounding. To address these challenges, we propose Synthetic Survival Control (SSC) to estimate counterfactual hazard trajectories in a panel data setting where multiple units experience potentially different treatments over multiple periods. In such a setting, SSC estimates the counterfactual hazard trajectory for a unit of interest as a weighted combination of the observed trajectories from other units. To provide formal justification, we introduce a panel framework with a low-rank structure for causal survival analysis. Indeed, such a structure naturally arises under classical parametric survival models. Within this framework, for the causal estimand of interest, we establish identification and finite sample guarantees for SSC. We validate our approach using a multi-country clinical dataset of cancer treatment outcomes, where the staggered introduction of new therapies creates a quasi-experimental setting. Empirically, we find that access to novel treatments is associated with improved survival, as reflected by lower post-intervention hazard trajectories relative to their synthetic counterparts. Given the broad relevance of survival analysis across medicine, economics, and public policy, our framework offers a general and interpretable tool for counterfactual survival inference using observational data. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.14133 |
| By: | Jeonghwan Lee; Cong Ma |
| Abstract: | Distribution shift between the training domain and the test domain poses a key challenge for modern machine learning. An extensively studied instance is covariate shift, where the marginal distribution of covariates differs across domains while the conditional distribution of the outcome remains the same. The doubly robust (DR) estimator, recently introduced by Kato (2023), combines density ratio estimation with a pilot regression model and achieves asymptotic normality and $\sqrt{n}$-consistency even when the pilot estimates converge slowly. However, prior work has focused exclusively on asymptotic results, leaving open the question of non-asymptotic guarantees for the DR estimator. This paper establishes the first non-asymptotic learning bounds for DR covariate shift adaptation. Our main contributions are two-fold: (i) we establish structure-agnostic high-probability upper bounds on the excess target risk of the DR estimator that depend only on the $L^2$-errors of the pilot estimates and the Rademacher complexity of the model class, without assuming specific procedures for obtaining the pilot estimates, and (ii) under well-specified parametric models, we analyze DR covariate shift adaptation using modern techniques for the non-asymptotic analysis of MLE, with key terms governed by the Fisher information mismatch between the source and target distributions. Together, these findings bridge asymptotic efficiency properties and finite-sample out-of-distribution generalization bounds, providing comprehensive theoretical underpinnings for DR covariate shift adaptation. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.11003 |
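A hedged sketch of one standard doubly robust risk construction for covariate shift: density-ratio weighting of source residuals as a correction around a pilot-model term evaluated on unlabeled target covariates. The exact estimator analyzed in the paper follows Kato (2023); here `w` (an estimated density ratio on the source sample) and `m_hat` (a pilot regression) are taken as given, and the linear predictor is illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def dr_covshift_fit(Xs, ys, Xt, w, m_hat):
    """Fit a linear predictor for the target domain by minimizing a doubly
    robust squared-error risk: pilot-model term on unlabeled target data
    plus an importance-weighted residual correction on labeled source data."""
    Xs1 = np.column_stack([np.ones(len(Xs)), Xs])
    Xt1 = np.column_stack([np.ones(len(Xt)), Xt])
    ms, mt = m_hat(Xs), m_hat(Xt)            # pilot predictions, both domains
    def risk(beta):
        fs, ft = Xs1 @ beta, Xt1 @ beta
        correction = np.mean(w * ((ys - fs) ** 2 - (ms - fs) ** 2))
        return np.mean((mt - ft) ** 2) + correction
    return minimize(risk, np.zeros(Xs1.shape[1])).x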
| By: | Patrick M. Kline |
| Abstract: | Economists often rely on estimates of linear fixed effects models developed by other teams of researchers. Assessing the uncertainty in these estimates can be challenging. I propose a form of sample splitting for network data that breaks two-way fixed effects estimates into statistically independent branches, each of which provides an unbiased estimate of the parameters of interest. These branches facilitate uncertainty quantification, moment estimation, and shrinkage. Algorithms are developed for efficiently extracting branches from large datasets. I illustrate these techniques using a benchmark dataset from Veneto, Italy that has been widely used to study firm wage effects. |
| JEL: | C01 J30 |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:34486 |
| By: | M. Stocker; W. Małgorzewicz; M. Fontana; S. Ben Taieb |
| Abstract: | Conformal prediction is a powerful post-hoc framework for uncertainty quantification that provides distribution-free coverage guarantees. However, these guarantees crucially rely on the assumption of exchangeability. This assumption is fundamentally violated in time series data, where temporal dependence and distributional shifts are pervasive. As a result, classical split-conformal methods may yield prediction intervals that fail to maintain nominal validity. This review unifies recent advances in conformal forecasting methods specifically designed to address nonexchangeable data. We first present a theoretical foundation, deriving finite-sample guarantees for split-conformal prediction under mild weak-dependence conditions. We then survey and classify state-of-the-art approaches that mitigate serial dependence by reweighting calibration data, dynamically updating residual distributions, or adaptively tuning target coverage levels in real time. Finally, we present a comprehensive simulation study that compares these techniques in terms of empirical coverage, interval width, and computational cost, highlighting practical trade-offs and open research directions. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.13608 |
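Among the surveyed adaptive methods, the simplest online scheme (in the spirit of adaptive conformal inference, Gibbs and Candes, 2021) tunes the working miscoverage level so that realized coverage tracks the target even under shift. A sketch on a stream of nonconformity scores; the finite-sample quantile correction and the paper's other reweighting variants are omitted.

```python
import numpy as np

def adaptive_conformal(scores, cal_scores, alpha=0.1, gamma=0.01):
    """Online threshold tracking: after a miss the working level alpha_t
    falls (wider intervals), after a hit it rises. Returns the rolling
    thresholds and the miss indicators."""
    cal = list(cal_scores)
    alpha_t, qs, miss = alpha, [], []
    for s in scores:
        q = np.quantile(cal, np.clip(1.0 - alpha_t, 0.0, 1.0))
        m = float(s > q)                  # 1 = point fell outside the interval
        alpha_t += gamma * (alpha - m)    # the adaptive update
        qs.append(q); miss.append(m)
        cal.append(s)                     # roll the calibration set forward
    return np.array(qs), np.array(miss)
```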
| By: | Mountford, Andrew |
| Abstract: | Restrictions on the contemporaneous effects matrix used to identify fundamental shocks in a structural VAR also determine the mapping from the structural constant terms to the reduced-form constant terms. In some models one will have priors about these structural constant terms, and these should therefore be included in a Bayesian estimation procedure. We illustrate the significance of this using a standard three-variable VAR estimated in Baumeister and Hamilton (2018). We show that imposing priors on the structural constant terms can lead to a more intuitive estimated monetary policy rule and a larger role for monetary policy in describing the evolution of the data, particularly for inflation. |
| Keywords: | Vector Autoregressions, Historical Decompositions, Monetary Policy |
| JEL: | C32 E00 E50 |
| Date: | 2025–11–14 |
| URL: | https://d.repec.org/n?u=RePEc:pra:mprapa:126806 |
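The mapping the abstract refers to, written out for a standard SVAR in notation assumed here rather than taken from the paper:

```latex
% Structural form:  A y_t = c + B_1 y_{t-1} + \dots + B_p y_{t-p} + \varepsilon_t
% Reduced form:     y_t = \mu + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + u_t
% Premultiplying the structural form by A^{-1} gives
\mu = A^{-1} c, \qquad \Phi_i = A^{-1} B_i, \qquad u_t = A^{-1} \varepsilon_t,
% so restrictions on A pin down how a prior on the structural constants c
% translates into a prior on the reduced-form constants \mu.
```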
| By: | Marina Khismatullina; Bernhard van der Sluis |
| Abstract: | This paper proposes a novel framework for testing slope heterogeneity between time-varying coefficients in panel data models. Our test not only detects whether the coefficient functions are the same across all units, but also determines which of them differ and where these differences are located. We establish the asymptotic validity of our multiscale test. As an extension of the proposed procedure, we show how to use the results to uncover latent group structures in the model. We apply our methods to test for heterogeneity in the effect of U.S. monetary shocks on 49 foreign economies and on the U.S. itself. We find evidence that such heterogeneity indeed exists, and we discuss the clustering results for two groups. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.12600 |
| By: | Ferreira Batista Martins, Igor (Örebro University School of Business); Virbickaitė, Audronė (CUNEF University, Madrid, Spain); Nguyen, Hoang (Linköping University); Freitas Lopes, Hedibert (Insper Institute of Education and Research, Sao Paulo, Brazil) |
| Abstract: | We propose a high-frequency stochastic volatility model that integrates a persistent component, intraday periodicity, and volume-driven time-of-day effects. By allowing intraday volatility patterns to respond to lagged trading activity, the model captures economically and statistically relevant departures from traditional intraday seasonality effects. We find that the volume-driven component accounts for a substantial share of intraday volatility in futures data across equity indexes, currencies, and commodities. Out-of-sample, our forecasts achieve near-zero intercepts, unit slopes, and the highest R2 values in Mincer-Zarnowitz regressions, while horse-race regressions indicate that competing forecasts add little information once our predictions are included. These statistical improvements translate into economically meaningful gains, as volatility-managed portfolio strategies based on our model consistently improve Sharpe ratios. Our results highlight the value of incorporating lagged trading activity into high-frequency volatility models. |
| Keywords: | Intraday volatility; high-frequency; volume; periodicity. |
| JEL: | C11 C22 C53 C58 |
| Date: | 2025–11–21 |
| URL: | https://d.repec.org/n?u=RePEc:hhs:oruesi:2025_014 |
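The Mincer-Zarnowitz check cited in the abstract is a regression of the realized quantity on its forecast; an unbiased forecast has intercept near zero and slope near one. A minimal sketch:

```python
import numpy as np

def mincer_zarnowitz(realized, forecast):
    """Regress realized volatility on the forecast; an unbiased, informative
    forecast has intercept near 0, slope near 1, and a high R2."""
    X = np.column_stack([np.ones(len(forecast)), forecast])
    beta, *_ = np.linalg.lstsq(X, realized, rcond=None)
    resid = realized - X @ beta
    r2 = 1.0 - resid @ resid / np.sum((realized - realized.mean()) ** 2)
    return {"intercept": beta[0], "slope": beta[1], "R2": r2}
```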
| By: | Emmanuel Flachaire; Bertille Picard |
| Abstract: | The Kitagawa-Oaxaca-Blinder decomposition splits the difference in means between two groups into an explained part, due to observable factors, and an unexplained part. In this paper, we reformulate this framework using potential outcomes, highlighting the critical role of the reference outcome. To address limitations such as lack of common support and model misspecification, we extend Neumark's (1988) weighted reference approach with a doubly robust estimator. Using Neyman orthogonality and double machine learning, our method avoids trimming and extrapolation. This improves flexibility and robustness, as illustrated by two empirical applications. Nevertheless, we also highlight that the decomposition based on the Neumark reference outcome is particularly sensitive to the inclusion of irrelevant explanatory variables. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.13433 |
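For reference, the textbook two-fold decomposition the paper builds on, with group-b coefficients as the reference outcome; the paper's contribution replaces this with a Neumark-weighted, doubly robust version.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients with an intercept prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

def kob_twofold(y_a, X_a, y_b, X_b):
    """Two-fold Kitagawa-Oaxaca-Blinder decomposition of the mean gap
    ybar_a - ybar_b = explained + unexplained, referenced to group b."""
    beta_a, beta_b = ols(y_a, X_a), ols(y_b, X_b)
    ma = np.concatenate([[1.0], X_a.mean(axis=0)])
    mb = np.concatenate([[1.0], X_b.mean(axis=0)])
    explained = (ma - mb) @ beta_b           # due to observable differences
    unexplained = ma @ (beta_a - beta_b)     # coefficient (structure) gap
    return explained, unexplained
```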
| By: | Daisuke Kurisu; Yuta Okamoto; Taisuke Otsu |
| Abstract: | Since the seminal work of Beresteanu and Molinari (2008), random set theory and related inference methods have been widely applied to partially identified econometric models. Meanwhile, an emerging field in statistics, called metric statistics, studies random objects in metric spaces. This paper clarifies the relationship between two fundamental concepts in these literatures, the Aumann and Fréchet means, and presents some applications of metric statistics to econometric problems involving random sets. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.13440 |
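The two means being related, in their standard definitions (notation assumed here): the Aumann mean collects the expectations of measurable selections of a random set, while the Fréchet mean minimizes expected squared distance in a metric space.

```latex
% Aumann mean of a random set X: expectations of its integrable selections
\mathbb{E}_A[X] = \mathrm{cl}\,\{\, \mathbb{E}[\xi] \;:\; \xi \in X \text{ a.s.},\ \xi \text{ integrable} \,\}
% Frechet mean of a random object Y in a metric space (\Omega, d)
\mu_F = \operatorname*{arg\,min}_{\omega \in \Omega} \; \mathbb{E}\big[\, d(\omega, Y)^2 \,\big]
```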
| By: | Christian Matthes; Naoya Nagasaka |
| Abstract: | Cross-sectional data have proven to be increasingly useful for macroeconomic research. However, their use often leads to the 'missing intercept' problem, in which aggregate general equilibrium effects and policy responses are absorbed into fixed effects. We present a statistical approach to jointly estimate aggregate and idiosyncratic effects within a panel framework, leveraging identification strategies from both cross-sectional and time-series settings. We then apply our methodology to study government spending multipliers (Nakamura and Steinsson, 2014) and wealth effects from stock returns (Chodorow-Reich et al., 2021). |
| Keywords: | Fixed Effects; Aggregate Effects; Government Spending; Regional Data; Bayesian Analysis |
| JEL: | C11 C50 E62 H50 R12 |
| Date: | 2025–10–03 |
| URL: | https://d.repec.org/n?u=RePEc:fip:fedrwp:102112 |
| By: | Ping Wu; Dan Zhu |
| Abstract: | Financial markets are interconnected, with micro-currents propagating across global markets and shaping economic trends. This paper moves beyond traditional stock market indices to examine cross-sectional return distributions (15 in our empirical application, each representing a distinct global market). To facilitate this analysis, we develop a matrix functional VAR method with interpretable factors extracted from the cross-sectional return distributions. Our approach extends the existing framework from modeling a single function to multiple functions, allowing for a richer representation of cross-sectional dependencies. By jointly modeling these distributions with U.S. macroeconomic indicators, we uncover the predictive power of financial markets for forecasting macroeconomic dynamics. Our findings reveal that U.S. contractionary monetary policy not only lowers global stock returns, as traditionally understood, but also dampens cross-sectional return kurtosis, highlighting an overlooked channel of policy transmission. This framework enables conditional forecasting, equipping policymakers with a flexible tool to assess macro-financial linkages under different economic scenarios. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.17140 |
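A toy version of the "interpretable factors from cross-sectional distributions" step: summarize each period's cross-section of returns by a few moments and fit a VAR(1) by equation-wise OLS. The paper's matrix functional VAR jointly models 15 full distributions together with macro indicators; everything here is illustrative.

```python
import numpy as np
from scipy import stats

def distributional_factors(returns_by_period):
    """One row of factors per period: mean, dispersion, skewness, and
    (excess) kurtosis of that period's cross-sectional return distribution."""
    return np.array([[r.mean(), r.std(), stats.skew(r), stats.kurtosis(r)]
                     for r in returns_by_period])

def var1_ols(F):
    """VAR(1) on the stacked factors by equation-wise OLS.
    Returns the coefficient matrix: first row intercepts, then lag loadings."""
    X = np.column_stack([np.ones(len(F) - 1), F[:-1]])
    B = np.linalg.lstsq(X, F[1:], rcond=None)[0]
    return B
```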