New Economics Papers on Econometrics
| By: | Koichiro Moriya; Akihiko Noda |
| Abstract: | This paper proposes a new multivariate model specification test that generalizes Durbin regression to a seemingly unrelated regression framework and reframes the Durbin approach as a GLS-class estimator. The proposed estimator explicitly models cross-equation dependence and the joint second-order dynamics of regressors and disturbances. It remains consistent under a comparatively weak dependence condition in which conventional OLS- and GLS-based estimators can be inconsistent, and it is asymptotically efficient under stronger conditions. Monte Carlo experiments indicate that the associated Wald test achieves improved size control and competitive power in finite samples, especially when combined with a bootstrap-based bias correction. An empirical application further illustrates that the proposed procedure delivers stable inference and is practically useful for multi-equation specification testing. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.21272 |
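The abstract above describes a GLS-class, SUR-based specification test but gives no implementation detail. As a point of reference only, here is a minimal sketch of the generic machinery involved: a two-equation SUR system fit by feasible GLS, followed by a Wald test of a cross-equation restriction. The data, the restriction, and all names are invented for illustration; the paper's Durbin-style test statistic is not reproduced.

```python
# Generic two-equation SUR/FGLS fit with a Wald test -- textbook
# machinery only, not the authors' Durbin-style specification test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
# Disturbances correlated across equations
E = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)
y1 = X1 @ np.array([1.0, 2.0]) + E[:, 0]
y2 = X2 @ np.array([-1.0, 0.5]) + E[:, 1]

# Equation-by-equation OLS to estimate the cross-equation covariance
b1 = np.linalg.lstsq(X1, y1, rcond=None)[0]
b2 = np.linalg.lstsq(X2, y2, rcond=None)[0]
U = np.column_stack([y1 - X1 @ b1, y2 - X2 @ b2])
Sigma = U.T @ U / n

# Stacked FGLS: block-diagonal design, weight matrix Sigma^{-1} kron I_n
X = np.block([[X1, np.zeros_like(X2)], [np.zeros_like(X1), X2]])
y = np.concatenate([y1, y2])
W = np.kron(np.linalg.inv(Sigma), np.eye(n))
XtW = X.T @ W
b_fgls = np.linalg.solve(XtW @ X, XtW @ y)
V = np.linalg.inv(XtW @ X)                  # asymptotic covariance

# Wald test of the cross-equation restriction: equal slopes
R = np.array([[0.0, 1.0, 0.0, -1.0]])
diff = R @ b_fgls
wald = float(diff.T @ np.linalg.solve(R @ V @ R.T, diff))
print("Wald stat:", wald, "p-value:", stats.chi2.sf(wald, df=1))
```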
| By: | Ting-Chih Hung; Yu-Chang Chen |
| Abstract: | We study the identification and estimation of long-term treatment effects under unobserved confounding by combining an experimental sample, where the long-term outcome is missing, with an observational sample, where the treatment assignment is unobserved. While standard surrogate index methods fail when unobserved confounders exist, we establish novel identification results by leveraging proxy variables for the unobserved confounders. We further develop multiply robust estimation and inference procedures based on these results. Applying our method to the Job Corps program, we demonstrate its ability to recover experimental benchmarks even when unobserved confounders bias standard surrogate index estimates. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.17712 |
| By: | Rowan Cherodian; Guy Tchuente |
| Abstract: | We study instrumental-variable designs where policy reforms strongly shift the distribution of an endogenous variable but only weakly move its mean. We formalize this by introducing distributional relevance: instruments may be purely distributional. Within a triangular model, distributional relevance suffices for nonparametric identification of average structural effects via a control function. We then propose Quantile Least Squares (Q-LS), which aggregates conditional quantiles of X given Z into an optimal mean-square predictor and uses this projection as an instrument in a linear IV estimator. We establish consistency, asymptotic normality, and the validity of standard 2SLS variance formulas, and we discuss regularization across quantiles. Monte Carlo designs show that Q-LS delivers well-centered estimates and near-correct size when mean-based 2SLS suffers from weak instruments. In Health and Retirement Study data, Q-LS exploits Medicare Part D-induced distributional shifts in out-of-pocket risk to sharpen estimates of its effects on depression. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.16865 |
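A minimal sketch of the Q-LS idea as described in the abstract above: fit conditional quantiles of X given Z on a grid, aggregate them into a single predictor of X, and use that projection as the instrument in a linear IV regression. The equal quantile weights and the heavy-tailed data-generating process below are illustrative assumptions; the paper derives an optimal mean-square aggregation and discusses regularization across quantiles.

```python
# Sketch of quantile-aggregated instruments: equal-weight averaging of
# conditional quantiles stands in for the paper's optimal aggregation.
import numpy as np
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(1)
n = 2000
z = rng.normal(size=n)
v = rng.standard_t(df=1.5, size=n)           # heavy-tailed first-stage error,
x = 0.3 * z + v                              # so a mean-based first stage is erratic
u = 0.5 * np.tanh(v) + 0.5 * rng.normal(size=n)   # endogeneity through v
y = 1.0 + 0.8 * x + u

# Conditional quantiles of X given Z on a grid, then a naive aggregate
Z = np.column_stack([np.ones(n), z])
taus = np.linspace(0.1, 0.9, 9)
qhat = np.column_stack([QuantReg(x, Z).fit(q=t).predict(Z) for t in taus])
x_proj = qhat.mean(axis=1)

# Linear IV of y on x with the aggregated projection as instrument
W = np.column_stack([np.ones(n), x])
Zi = np.column_stack([np.ones(n), x_proj])
beta_iv = np.linalg.solve(Zi.T @ W, Zi.T @ y)
print("IV slope estimate (true 0.8):", beta_iv[1].round(3))
```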
| By: | Silvia De Nicolò; Beatrice Biondi; Mario Mazzocchi |
| Abstract: | The paper studies identification in triple-difference designs when spillover effects contaminate one or more control groups. We show that, under conventional identifying assumptions, the triple-difference model fails to identify both the treatment effect and the spillover effect under such interference. To overcome this limitation, we propose an alternative specification, the double-triple-difference model, and explicitly formalize identifying assumptions and spillover structures required for consistent identification of both effects. We derive formal identification results and assess the performance of the proposed model through Monte Carlo simulations. An empirical application evaluating a Special Economic Zone in Italy is provided. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.15764 |
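For reference, a sketch of the conventional triple-difference (DDD) regression that the entry above takes as its starting point. The paper's double-triple-difference specification, which separates the treatment effect from spillovers onto controls, is not reproduced here; the simulated design is invented.

```python
# Conventional DDD via a fully interacted OLS regression; the triple
# interaction coefficient is the DDD estimate.
import numpy as np

rng = np.random.default_rng(2)
n = 4000
region = rng.integers(0, 2, n)    # treated vs control region
group = rng.integers(0, 2, n)     # eligible vs ineligible group
post = rng.integers(0, 2, n)      # pre vs post period
tau = 2.0                         # true DDD effect
y = (0.5 * region + 0.3 * group + 0.2 * post
     + 0.4 * region * post + 0.1 * group * post + 0.2 * region * group
     + tau * region * group * post + rng.normal(size=n))

X = np.column_stack([
    np.ones(n), region, group, post,
    region * post, group * post, region * group,
    region * group * post,
])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print("DDD estimate (true 2.0):", beta[-1].round(3))
```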
| By: | Pedro Picchetti |
| Abstract: | This paper develops a finite population framework for analyzing causal effects in settings with imperfect compliance where multiple treatments affect the outcome of interest. Two prominent examples are factorial designs and panel experiments with imperfect compliance. I define finite population causal effects that capture the relative effectiveness of alternative treatment sequences. I provide nonparametric estimators for a rich class of factorial and dynamic causal effects and derive their finite population distributions as the sample size increases. Monte Carlo simulations illustrate the desirable properties of the estimators. Finally, I use the estimator for causal effects in factorial designs to revisit a famous voter mobilization experiment that analyzes the effects of voting encouragement through phone calls on turnout. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.16749 |
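As a baseline for the entry above, here is a minimal sketch of intention-to-treat contrasts in a 2x2 factorial experiment, estimated by differences in assigned-group means. The paper's finite-population estimators additionally handle imperfect compliance and dynamic treatment sequences, which this toy example does not attempt.

```python
# Main-effect and interaction contrasts in a 2x2 factorial design,
# estimated from cell means of the assigned groups (ITT only).
import numpy as np

rng = np.random.default_rng(3)
n = 4000
a = rng.integers(0, 2, n)              # assignment to factor A
b = rng.integers(0, 2, n)              # assignment to factor B
y = 1.0 + 0.8 * a + 0.3 * b + 0.5 * a * b + rng.normal(size=n)

def cell_mean(ai, bi):
    return y[(a == ai) & (b == bi)].mean()

# Main effect of A averaged over B levels, and the A x B interaction
main_a = 0.5 * ((cell_mean(1, 0) - cell_mean(0, 0))
                + (cell_mean(1, 1) - cell_mean(0, 1)))
inter = ((cell_mean(1, 1) - cell_mean(0, 1))
         - (cell_mean(1, 0) - cell_mean(0, 0)))
print("main effect of A (true 1.05):", round(main_a, 3))
print("A x B interaction (true 0.5):", round(inter, 3))
```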
| By: | Victor Aguirregabiria; Hui Liu; Yao Luo |
| Abstract: | We propose a fast algorithm for computing the GMM estimator in the BLP demand model (Berry, Levinsohn, and Pakes, 1995). Inspired by nested pseudo-likelihood methods for dynamic discrete choice models, our approach avoids repeatedly solving the inverse demand system by swapping the order of the GMM optimization and the fixed-point computation. We show that, by fixing consumer-level outside-option probabilities, BLP’s market-share–mean-utility inversion becomes closed-form and, crucially, separable across products, yielding a nested pseudo-GMM algorithm with analytic gradients. The resulting estimator scales dramatically better with the number of products and is naturally suited for parallel and multithreaded implementation. In the inner loop, outside-option probabilities are treated as fixed objects while a pseudo-GMM criterion is minimized with respect to the structural parameters, substantially reducing computational cost. Monte Carlo simulations and an empirical application show that our method is significantly faster than the fastest existing alternatives, with efficiency gains that grow more than proportionally in the number of products. We provide MATLAB and Julia code to facilitate implementation. |
| Keywords: | Random Coefficients Logit; Sufficient Statistics; Market Share Inversion; Newton-Kantorovich Iteration; Asymptotic Properties; LCBO |
| JEL: | C23 C25 C51 C61 D12 L11 |
| Date: | 2026–02–04 |
| URL: | https://d.repec.org/n?u=RePEc:tor:tecipa:tecipa-819 |
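The key inner-loop step the abstract describes can be sketched directly: when each simulated consumer's outside-option probability is held fixed, the mixed-logit choice probability factors as p_ij = p_i0 * exp(delta_j + mu_ij), so the share equation inverts for the mean utilities delta in closed form, product by product. The outer pseudo-GMM loop that updates the outside-option probabilities and structural parameters is omitted, and the simulation below fixes p_i0 at its true value purely to verify the identity.

```python
# Closed-form, product-separable mean-utility inversion with fixed
# consumer-level outside-option probabilities (inner-loop step only).
import numpy as np

rng = np.random.default_rng(4)
J, N = 50, 500                        # products, simulated consumers
mu = 0.5 * rng.normal(size=(N, J))    # consumer-specific utility draws
delta_true = rng.normal(size=J)

# Market shares implied by the mixed logit
num = np.exp(delta_true[None, :] + mu)
p_ij = num / (1.0 + num.sum(axis=1, keepdims=True))
shares = p_ij.mean(axis=0)

# Outside-option probabilities, held fixed in the inner loop (in the
# algorithm they come from the previous outer iteration)
p_i0 = 1.0 / (1.0 + num.sum(axis=1))

# s_j = exp(delta_j) * mean_i[ p_i0 * exp(mu_ij) ]  =>  solve for delta_j
denom = (p_i0[:, None] * np.exp(mu)).mean(axis=0)
delta_hat = np.log(shares) - np.log(denom)
print("max inversion error:", np.max(np.abs(delta_hat - delta_true)))
```

Because the inversion is separable across products, its cost grows only linearly in J, which is consistent with the scaling gains the abstract reports.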
| By: | Junnan He; Jean-Marc Robin |
| Abstract: | We study a ridge estimator for the high-dimensional two-way fixed effect regression model with a sparse bipartite network. We develop concentration inequalities showing that when the ridge parameters increase as the log of the network size, the bias and the variance-covariance matrix of the vector of estimated fixed effects converge to deterministic equivalents that depend only on the expected network. We provide simulations and an application using administrative data on wages for worker-firm matches. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.04101 |
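A generic sketch of the estimator the entry above studies: ridge-penalized worker and firm effects on a sparse bipartite match network, with the penalty scaled as the log of the network size as in the abstract. This is plain penalized least squares on simulated data; the paper's concentration inequalities and deterministic equivalents are not reproduced.

```python
# Ridge two-way fixed effects on a sparse worker-firm match network.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

rng = np.random.default_rng(5)
W, F, M = 300, 60, 1500             # workers, firms, matches
wid = rng.integers(0, W, M)
fid = rng.integers(0, F, M)
alpha = rng.normal(size=W)           # worker effects
psi = rng.normal(size=F)             # firm effects
y = alpha[wid] + psi[fid] + 0.5 * rng.normal(size=M)

# Sparse design: each match row loads one worker and one firm dummy
rows = np.arange(M)
D = sparse.csr_matrix(
    (np.ones(2 * M), (np.tile(rows, 2), np.concatenate([wid, W + fid]))),
    shape=(M, W + F),
)
lam = np.log(M)                      # ridge parameter ~ log(network size)
A = (D.T @ D + lam * sparse.eye(W + F)).tocsc()
effects = spsolve(A, D.T @ y)
print("corr(estimated worker effects, truth):",
      np.corrcoef(effects[:W], alpha)[0, 1].round(3))
```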
| By: | Ilya Archakov |
| Abstract: | We construct and analyze an estimator of association between random variables based on their similarity in both direction and magnitude. Under special conditions, the proposed measure becomes a robust and consistent estimator of the linear correlation, for which an exact sampling distribution is available. This distribution is intrinsically insensitive to heavy tails and outliers, thereby facilitating robust inference for correlations. The measure can be naturally extended to higher dimensions, where it admits an interpretation as an indicator of joint similarity among multiple random variables. We investigate the empirical performance of the proposed measure with financial return data at both high and low frequencies. Specifically, we apply the new estimator to construct confidence intervals for correlations based on intraday returns and to develop a new specification for multivariate GARCH models. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.12198 |
| By: | Jesse Hoekstra; Frank Windmeijer |
| Abstract: | For subvector inference in the linear instrumental variables model under homoskedasticity but allowing for weak instruments, Guggenberger, Kleibergen, and Mavroeidis (2019) (GKM) propose a conditional subvector Anderson and Rubin (1949) (AR) test that uses data-dependent critical values that adapt to the strength of the parameters not under test. This test has correct size and strictly higher power than the test that uses standard asymptotic chi-square critical values. The subvector AR test statistic is the minimum eigenvalue of a data-dependent matrix. The GKM critical value function conditions on the largest eigenvalue of this matrix. We consider instead a data-dependent critical value function that conditions on the second-smallest eigenvalue, as this eigenvalue is the appropriate indicator of weak identification. We find that the data-dependent critical value function of GKM also applies under this conditioning and show that the resulting test has correct size and power strictly higher than the GKM test when the number of parameters not under test is larger than one. Our proposed procedure further applies to the subvector AR test statistic that is robust to an approximate Kronecker product structure of conditional heteroskedasticity, as proposed by Guggenberger, Kleibergen, and Mavroeidis (2024), carrying its power advantage over to this setting as well. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.17843 |
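A schematic sketch, under homoskedasticity, of the eigenvalue objects the entry above works with: the subvector AR statistic as the smallest eigenvalue of a covariance-normalized projected moment matrix, with the second-smallest eigenvalue (this paper) and the largest eigenvalue (GKM) as candidate conditioning statistics. The conditional critical-value function itself is not reproduced, and the simulated design is invented.

```python
# Eigenvalues of the subvector AR matrix: smallest = test statistic,
# second-smallest and largest = alternative conditioning statistics.
import numpy as np

rng = np.random.default_rng(6)
n, k = 1000, 5                          # observations, instruments
Z = rng.normal(size=(n, k))
X1 = Z @ rng.normal(size=k) + rng.normal(size=n)    # regressor under test
X2 = 0.1 * (Z @ rng.normal(size=(k, 2))) + rng.normal(size=(n, 2))  # weak, untested
y = X2 @ np.array([1.0, 0.5]) + rng.normal(size=n)  # true beta1 = 0
b10 = 0.0                                            # hypothesized beta1

Y = np.column_stack([y - X1 * b10, X2])              # (y - X1*b10, X2)
P = Z @ np.linalg.solve(Z.T @ Z, Z.T)                # projection onto Z
Sig = Y.T @ (Y - P @ Y) / (n - k)                    # residual covariance
evals, evecs = np.linalg.eigh(Sig)
S = evecs @ np.diag(evals ** -0.5) @ evecs.T         # Sig^{-1/2}
eigs = np.linalg.eigvalsh(S @ (Y.T @ P @ Y) @ S)     # ascending order
print("subvector AR statistic (smallest eigenvalue):", eigs[0].round(3))
print("conditioning statistic (second smallest):", eigs[1].round(3))
print("GKM conditioning statistic (largest):", eigs[-1].round(3))
```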
| By: | Oliver Snellman |
| Abstract: | The paper develops a Transformer architecture for estimating dynamic factors from multivariate time series data under flexible identification assumptions. Performance on small datasets is improved substantially by using a conventional factor model as prior information via a regularization term in the training objective. The results are interpreted with Attention matrices that quantify the relative importance of variables and their lags for the factor estimate. Time variation in Attention patterns can help detect regime switches and evaluate narratives. Monte Carlo experiments suggest that the Transformer is more accurate than the linear factor model when the data deviate from linear-Gaussian assumptions. An empirical application uses the Transformer to construct a coincident index of U.S. real economic activity. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.12039 |
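A toy sketch of the regularization idea in the entry above: a factor network is trained to reconstruct the panel, with an extra penalty pulling its factor path toward a conventional estimate (here the first principal component, standing in for the linear factor model). The architecture, identification constraints, and Attention readouts are all simplified away, and every name below is illustrative.

```python
# Training objective = reconstruction loss + penalty toward a PCA
# factor used as prior information; a stand-in for the paper's setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
T, N = 200, 10                        # time periods, variables
f_true = torch.cumsum(0.1 * torch.randn(T, 1), dim=0)
lam = torch.randn(1, N)
X = f_true @ lam + 0.3 * torch.randn(T, N)

# PCA factor as the prior (first principal component of X)
U, S, Vh = torch.linalg.svd(X - X.mean(0), full_matrices=False)
f_pca = (U[:, :1] * S[0]).detach()

enc_layer = nn.TransformerEncoderLayer(d_model=N, nhead=2, batch_first=True)
encoder = nn.TransformerEncoder(enc_layer, num_layers=1)
to_factor = nn.Linear(N, 1)
loadings = nn.Linear(1, N, bias=False)
params = (list(encoder.parameters()) + list(to_factor.parameters())
          + list(loadings.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)
penalty = 1.0                         # weight on the factor-model prior

for step in range(500):
    opt.zero_grad()
    f_hat = to_factor(encoder(X.unsqueeze(0))).squeeze(0)   # (T, 1) factor path
    recon = loadings(f_hat)                                 # reconstructed panel
    loss = ((recon - X) ** 2).mean() + penalty * ((f_hat - f_pca) ** 2).mean()
    loss.backward()
    opt.step()
print("final training loss:", float(loss))
```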
| By: | Xinran Liu |
| Abstract: | Standard Distributional Synthetic Controls (DSC) estimate counterfactual distributions by minimizing the Euclidean $L_2$ distance between quantile functions. We demonstrate that this geometric reliance renders estimators fragile: they lack informative gradients under support mismatch and produce structural artifacts when outcomes are multimodal. This paper proposes a robust estimator grounded in Optimal Transport (OT). We construct the synthetic control by minimizing the Wasserstein-1 distance between probability measures, implemented via a Wasserstein Generative Adversarial Network (WGAN). We establish the formal point identification of synthetic weights under an affine independence condition on the donor pool. Monte Carlo simulations confirm that while standard estimators exhibit catastrophic variance explosions under heavy-tailed contamination and support mismatch, our WGAN-based approach remains consistent and stable. Furthermore, we show that our measure-based method correctly recovers complex bimodal mixtures where traditional quantile averaging fails structurally. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.17296 |
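The contrast the entry above draws can be sketched without any adversarial machinery: standard DSC averages donor quantile functions and minimizes an L2 distance in quantile space, while the OT alternative matches the *mixture* of donor distributions to the target in Wasserstein-1 distance (in one dimension, computable directly from CDFs). The WGAN implementation is omitted; the toy data are invented.

```python
# Quantile-L2 DSC weights vs Wasserstein-1 mixture weights on a
# bimodal target that is an exact half-half mixture of two donors.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
target = np.concatenate([rng.normal(-2, 0.5, 500), rng.normal(2, 0.5, 500)])
donors = [rng.normal(-2, 0.5, 1000), rng.normal(2, 0.5, 1000),
          rng.normal(0, 1.0, 1000)]

u = np.linspace(0.01, 0.99, 99)
Qt = np.quantile(target, u)
Qd = np.stack([np.quantile(d, u) for d in donors])
grid = np.linspace(-5, 5, 400)
Ft = (target[None, :] <= grid[:, None]).mean(axis=1)
Fd = np.stack([(d[None, :] <= grid[:, None]).mean(axis=1) for d in donors])

def dsc_loss(w):          # L2 between averaged quantile functions
    return np.mean((w @ Qd - Qt) ** 2)

def w1_loss(w):           # Wasserstein-1 between mixture and target, via CDFs
    return np.abs(w @ Fd - Ft).sum() * (grid[1] - grid[0])

cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)
w0 = np.full(3, 1.0 / 3.0)
for name, loss in (("quantile-L2 weights:", dsc_loss),
                   ("W1 mixture weights: ", w1_loss)):
    res = minimize(loss, w0, bounds=[(0, 1)] * 3,
                   constraints=cons, method='SLSQP')
    print(name, res.x.round(3))
```

By construction the 0.5/0.5 mixture of the first two donors reproduces the target distribution exactly, so the W1 objective is minimized near those weights, whereas averaging their quantile functions yields a unimodal distribution and pulls the L2 solution toward the third donor, mirroring the abstract's point about quantile averaging failing structurally.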
| By: | Justin Young; Muthoni Ngatia; Eleanor Wiske Dillon |
| Abstract: | Recent developments in causal machine learning methods have made it easier to estimate flexible relationships between confounders, treatments, and outcomes, making unconfoundedness assumptions in causal analysis more palatable. How successful are these approaches in recovering ground-truth baselines? In this paper we analyze a new data sample including an experimental rollout of a new feature at a large technology company and a simultaneous sample of users who endogenously opted into the feature. We find that recovering ground-truth causal effects is feasible -- but only with careful modeling choices. Our results build on the observational causal literature beginning with LaLonde (1986), offering best practices for more credible treatment effect estimation in modern, high-dimensional datasets. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.11845 |
| By: | Jason B. Cho; David S. Matteson |
| Abstract: | We introduce BASTION (Bayesian Adaptive Seasonality and Trend DecompositION), a flexible Bayesian framework for decomposing time series into trend and multiple seasonality components. We cast the decomposition as a penalized nonparametric regression and establish formal conditions under which the trend and seasonal components are uniquely identifiable, an issue only treated informally in the existing literature. BASTION offers three key advantages over existing decomposition methods: (1) accurate estimation of trend and seasonality amidst abrupt changes, (2) enhanced robustness against outliers and time-varying volatility, and (3) robust uncertainty quantification. We evaluate BASTION against established methods, including TBATS, STR, and MSTL, using both simulated and real-world datasets. By effectively capturing complex dynamics while accounting for irregular components such as outliers and heteroskedasticity, BASTION delivers a more nuanced and interpretable decomposition. To support further research and practical applications, BASTION is available as an R package (https://github.com/Jasoncho0914/BASTION). |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.18052 |
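A generic sketch of "decomposition as penalized regression," the framing the entry above starts from: estimate a smooth trend via a second-difference penalty and one seasonal component penalized toward exact periodicity, solved jointly in one least-squares problem. BASTION's adaptive Bayesian shrinkage, outlier handling, and uncertainty quantification are not reproduced; the penalty values are arbitrary.

```python
# Trend + seasonal decomposition as one penalized least-squares solve.
import numpy as np

rng = np.random.default_rng(8)
T, s = 240, 12
t = np.arange(T)
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / s) + rng.normal(0, 0.5, T)

I = np.eye(T)
D2 = np.diff(I, n=2, axis=0)     # second-difference penalty -> smooth trend
Ds = I[s:] - I[:-s]              # penalizes deviations from s-periodicity
X = np.hstack([I, I])            # y_t = trend_t + seasonal_t
lam_tr, lam_se = 1e4, 1e2
P = np.zeros((2 * T, 2 * T))
P[:T, :T] = lam_tr * D2.T @ D2
P[T:, T:] = lam_se * Ds.T @ Ds + 1e-2 * I   # small ridge pins down the split
theta = np.linalg.solve(X.T @ X + P, X.T @ y)
trend, seasonal = theta[:T], theta[T:]
print("estimated trend slope (true 0.05):", np.polyfit(t, trend, 1)[0].round(4))
```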
| By: | Tjeerd De Vries |
| Abstract: | We propose a projection method to estimate risk-neutral moments from option prices. We derive a finite-sample bound implying that the projection estimator attains (up to a constant) the smallest pricing error within the span of traded option payoffs. This finite-sample optimality is not available for the widely used Carr--Madan approximation. Simulations show sizable accuracy gains for key quantities such as VIX and SVIX. We then extend the framework to multiple underlyings, deriving necessary and sufficient conditions under which simple options complete the market in higher dimensions, and providing estimators for joint moments. In our empirical application, we recover risk-neutral correlations and joint tail risk from FX options alone, addressing a longstanding measurement problem raised by Ross (1976). Our joint tail-risk measure predicts future joint currency crashes and identifies periods in which currency portfolios are particularly useful for hedging. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.14852 |
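One way to read the projection idea in the entry above: approximate the target payoff by a least-squares projection onto traded payoffs (bond, stock, calls), then price the projected portfolio with the options' prices. The sketch below assumes a known lognormal risk-neutral density purely so the projection weights and the "true" value can be computed; the paper works without such an assumption and derives finite-sample pricing-error bounds.

```python
# Projection pricing of a squared-log-return payoff onto option payoffs.
import numpy as np
from scipy.stats import norm

S0, sig, T = 100.0, 0.2, 0.25            # spot, volatility, maturity; zero rates
strikes = np.arange(60.0, 165.0, 5.0)

def bs_call(K):                           # Black-Scholes call, zero rates
    d1 = (np.log(S0 / K) + 0.5 * sig**2 * T) / (sig * np.sqrt(T))
    return S0 * norm.cdf(d1) - K * norm.cdf(d1 - sig * np.sqrt(T))

# Assumed lognormal risk-neutral density on a grid (illustration only)
s = np.linspace(40.0, 250.0, 2001)
ds = s[1] - s[0]
q = norm.pdf(np.log(s), np.log(S0) - 0.5 * sig**2 * T, sig * np.sqrt(T)) / s

f = np.log(s / S0) ** 2                   # target payoff: squared log return
X = np.column_stack([np.ones_like(s), s] +
                    [np.maximum(s - K, 0.0) for K in strikes])
wts = np.sqrt(q * ds)                     # density-weighted least squares
w = np.linalg.lstsq(X * wts[:, None], f * wts, rcond=None)[0]

prices = np.concatenate([[1.0, S0], bs_call(strikes)])   # bond, stock, calls
print("projection price:", (w @ prices).round(6))
print("true risk-neutral value:", (np.sum(f * q) * ds).round(6))
```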
| By: | Etienne Wijler (Vrije Universiteit Amsterdam); Andre Lucas (Vrije Universiteit Amsterdam and Tinbergen Institute) |
| Abstract: | We develop a data-driven procedure to identify which correlations in high-dimensional dynamic systems should be time-varying, constant, or zero. The method integrates a vine-based multivariate partial correlation model with sequential penalized estimation. Applied to 50 US equities and systematic risk factors, the method indicates that asset-level correlation dynamics are primarily induced by time-varying exposures to systematic factors. We further uncover persistent, non-zero, and occasionally time-varying partial correlations within industries, even after controlling for standard risk and industry factors. Finally, we show how the new methodology may be used to explore the relevance of systematic risk factors in an impartial way. |
| Keywords: | conditional correlations, score-driven models, financial market structure, regularization |
| JEL: | C32 C58 |
| Date: | 2025–09–19 |
| URL: | https://d.repec.org/n?u=RePEc:tin:wpaper:20250051 |
| By: | Jan Magnus (Vrije Universiteit Amsterdam and Tinbergen Institute); Andrey L. Vasnev (University of Sydney) |
| Abstract: | Given several studies (inputs) of some phenomenon of interest, each input presents an estimate of a key parameter with an associated estimated precision. The random-effects model used in meta-analysis estimates this parameter based on a decomposition of the error term into within-input noise and across-input noise. Our interest is in the precision of this estimator, which leads to a confidence interval of the parameter. But we shall also be interested in the precision when we transform the inputs into one input, which leads to a (much wider) prediction interval. We review and extend the meta-analysis framework in a maximum-likelihood context, paying special attention to conflict between the inputs, correlation between the inputs, and the difference between confidence and prediction intervals and the corresponding notions of precision. We illustrate our approach with two meta-analyses, one from clinical trials and one from finance. |
| Keywords: | Conflicting evidence, confidence interval, prediction interval, information aggregation, meta-analysis, random-effects model, nonstandard errors |
| JEL: | C13 C53 G10 I19 |
| Date: | 2025–08–26 |
| URL: | https://d.repec.org/n?u=RePEc:tin:wpaper:20250048 |
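The confidence/prediction distinction in the entry above is easy to make concrete with the textbook random-effects calculation. The sketch below uses the DerSimonian-Laird moment estimator for the across-input variance tau^2 for simplicity (the paper itself works in a maximum-likelihood context), and the input estimates are invented.

```python
# Random-effects meta-analysis: pooled estimate, confidence interval,
# and the wider prediction interval for a single new input.
import numpy as np

y = np.array([0.30, 0.10, 0.45, -0.05, 0.25])   # input estimates (illustrative)
se = np.array([0.10, 0.08, 0.15, 0.12, 0.09])   # their standard errors

wf = 1.0 / se**2                                 # fixed-effect weights
yf = np.sum(wf * y) / np.sum(wf)
Q = np.sum(wf * (y - yf) ** 2)                   # heterogeneity statistic
k = len(y)
tau2 = max(0.0, (Q - (k - 1)) / (wf.sum() - (wf**2).sum() / wf.sum()))

w = 1.0 / (se**2 + tau2)                         # random-effects weights
mu = np.sum(w * y) / np.sum(w)
se_mu = np.sqrt(1.0 / np.sum(w))
ci = (mu - 1.96 * se_mu, mu + 1.96 * se_mu)      # confidence interval for mu
half = 1.96 * np.sqrt(tau2 + se_mu**2)           # adds across-input variance
pi = (mu - half, mu + half)                      # prediction interval
print("mu:", round(mu, 3), "CI:", np.round(ci, 3), "PI:", np.round(pi, 3))
```

The prediction interval adds tau^2 inside the square root, which is why it is necessarily wider than the confidence interval whenever the inputs conflict.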
| By: | Paker, Meredith; Stephenson, Judy; Wallis, Patrick |
| Abstract: | Understanding long-run economic growth requires reliable historical data, yet the vast majority of long-run economic time series are drawn from incomplete records with significant temporal and geographic gaps. Conventional solutions to these gaps rely on linear regressions that risk bias or overfitting when data are scarce. We introduce “past predictive modeling,” a framework that leverages machine learning and out-of-sample predictive modeling techniques to reconstruct representative historical time series from scarce data. Validating our approach using nominal wage data from England, 1300-1900, we show that this new method leads to more accurate and generalizable estimates, with bootstrapped standard errors 72% lower than benchmark linear regressions. Beyond their gains in accuracy, the improved wage estimates for England yield new insights into the impact of the Black Death on inequality, the economic geography of pre-industrial growth, and productivity over the long run. |
| Keywords: | machine learning; predictive modeling; wages; black death; industrial revolution |
| JEL: | J31 C53 N33 N13 N63 |
| Date: | 2025–06–13 |
| URL: | https://d.repec.org/n?u=RePEc:ehl:wpaper:128852 |
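A generic sketch of the out-of-sample validation logic behind "past predictive modeling" as described above: hold out later observed periods, predict them from covariates with a flexible learner, and bootstrap the fits to attach standard errors to reconstructed values. The data and model below are placeholders, not the paper's wage series or its chosen learner.

```python
# Out-of-sample validation plus bootstrap SEs for gap-filled periods.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(10)
T = 300
X = rng.normal(size=(T, 4))                     # proxy covariates
y = X @ np.array([1.0, 0.5, -0.3, 0.2]) + 0.5 * rng.normal(size=T)
observed = rng.random(T) < 0.6                  # sparse historical record

# Temporal holdout: train on earlier observed periods, score later ones
idx = np.where(observed)[0]
train, test = idx[: len(idx) // 2], idx[len(idx) // 2:]
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[train], y[train])
rmse = np.sqrt(np.mean((model.predict(X[test]) - y[test]) ** 2))

# Bootstrap SEs for the reconstructed (unobserved) periods
preds = []
for b in range(50):
    bs = rng.choice(idx, size=len(idx), replace=True)
    m = RandomForestRegressor(n_estimators=100, random_state=b)
    m.fit(X[bs], y[bs])
    preds.append(m.predict(X[~observed]))
se = np.asarray(preds).std(axis=0)
print("holdout RMSE:", round(rmse, 3), " mean bootstrap SE:", round(se.mean(), 3))
```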
| By: | Mohammad Ghaderi |
| Abstract: | This paper introduces the attention-entropy random utility (AERU) model, a behavioral model of discrete choice in which a decision-maker endogenously allocates attention across subsets of attributes in order to increase subjective confidence by reducing ex post choice uncertainty, and subsequently chooses an option based solely on the attended information. By endogenizing attention, the decision problem is reformulated from “which alternative to choose” to “which informational cues to process,” with the observed choice emerging as the outcome of this attentional allocation. The AERU framework nests random utility model (RUM)-like behavior under transparent conditions, yet it is not restricted by Luce’s independence of irrelevant alternatives (IIA), order-independence, or regularity. This flexibility enables AERU to capture key context effects in a disciplined manner and to generate sharp, testable predictions regarding the conditions for each context effect. From an empirical standpoint, AERU preserves the parsimony of the multinomial logit, requiring only a single additional attention parameter. Employing a scalable estimation procedure based on block coordinate ascent combined with a quasi-Newton method, I provide results from computational experiments demonstrating that AERU can produce better in-sample and out-of-sample predictions. Overall, AERU provides a flexible, parsimonious, and interpretable model of boundedly rational choice with a clear behavioral foundation and implications for context effects. |
| Keywords: | discrete choice, endogenous attention, entropy, subjective confidence, random utility, context effects, regularity |
| JEL: | D91 C35 D01 D83 C63 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:upf:upfgen:1936 |
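A heavily hedged sketch of the AERU mechanism as one might read the abstract above: attention is allocated across attribute subsets so as to favor low-entropy (high-confidence) ex post choice distributions, and the observed choice mixes the subset-conditional logits. The functional forms below (softmax attention over subsets, one temperature-like attention parameter gamma) are illustrative guesses, not the paper's exact specification.

```python
# Toy attention-over-attribute-subsets choice model: attention favors
# subsets that yield confident (low-entropy) choice distributions.
import numpy as np
from itertools import combinations

V = np.array([[1.0, 0.2],     # part-utilities: 3 alternatives x 2 attributes
              [0.2, 1.0],
              [0.6, 0.6]])
gamma = 0.5                    # single attention parameter (assumed form)

def softmax(u):
    e = np.exp(u - u.max())
    return e / e.sum()

subsets = [s for r in (1, 2) for s in combinations(range(V.shape[1]), r)]
p_given_s = np.array([softmax(V[:, list(s)].sum(axis=1)) for s in subsets])
entropies = -(p_given_s * np.log(p_given_s)).sum(axis=1)
q = softmax(-entropies / gamma)          # attention allocation over subsets
p = q @ p_given_s                        # observed choice probabilities
for s, qs in zip(subsets, q):
    print("attend attributes", s, "with prob", round(float(qs), 3))
print("choice probabilities:", p.round(3))
```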