nep-ecm New Economics Papers
on Econometrics
Issue of 2026–01–26
forty-two papers chosen by
Sune Karlsson, Örebro universitet


  1. Automatic Debiased Machine Learning of Structural Parameters with General Conditional Moments By Facundo Argañaraz
  2. Raking for estimation and inference in panel models with nonignorable attrition and refreshment By Grigory Franguridi; Jinyong Hahn; Pierre Hoonhout; Arie Kapteyn; Geert Ridder
  3. Double Machine Learning of Continuous Treatment Effects with General Instrumental Variables By Shuyuan Chen; Peng Zhang; Yifan Cui
  4. Distribution-Matching Posterior Inference for Incomplete Structural Models By Takashi Kano
  5. Estimating Program Participation with Partial Validation By Augustine Denteh; Pierre E. Nguimkeu
  6. Order-Constrained Spectral Causality in Multivariate Time Series By Alejandro Rodriguez Dominguez
  7. Estimation of a Dynamic Tobit Model with a Unit Root By Anna Bykhovskaya; James A. Duffy
  8. Robust Two-Sample Mean Inference under Serial Dependence By Ulrich Hounyo; Min Seong Kim
  9. Lee Bounds for Random Objects By Daisuke Kurisu; Yuta Okamoto; Taisuke Otsu
  10. Automatic debiased machine learning and sensitivity analysis for sample selection models By Jakob Bjelac; Victor Chernozhukov; Phil-Adrian Klotz; Jannis Kueck; Theresa M. A. Schmitz
  11. Event Studies with Feedback By Irene Botosaru; Laura Liu
  12. Distributionally Robust Treatment Effect By Ruonan Xu; Xiye Yang
  13. Semiparametric inference for inequality measures under nonignorable nonresponse using callback data By Xinyu Wang; Chunlin Wang; Tao Yu; Pengfei Li
  14. Testing shock independence in Gaussian structural VARs By Dante Amengual; Gabriele Fiorentini; Enrique Sentana
  15. From Unstructured Data to Demand Counterfactuals: Theory and Practice By Timothy Christensen; Giovanni Compiani
  16. Variational Regularized Bilevel Estimation for Exponential Random Graph Models By Yoon Choi
  17. Testing the Significance of the Difference-in-Differences Coefficient via Doubly Randomised Inference By Stanisław Marek Sergiusz Halkiewicz; Andrzej Kałuża
  18. Ill-Conditioned Orthogonal Scores in Double Machine Learning By Gabriel Saco
  19. Learning about Treatment Effects with Prior Studies: A Bayesian Model Averaging Approach By Frederico Finan; Demian Pouzo
  20. Linear Regression in a Nonlinear World By Nadav Kunievsky
  21. A Shrinkage Factor-Augmented VAR for High-Dimensional Macro–Fiscal Dynamics By Kyriakopoulou, Dimitra
  22. Policy-Aligned Estimation of Conditional Average Treatment Effects By Artem Timoshenko; Caio Waisman
  23. Transfer Learning (Il)liquidity By Andrea Conti; Giacomo Morelli
  24. One Instrument, Many Treatments: Instrumental Variables Identification of Multiple Causal Effects By Joshua Angrist; Andres Santos; Otávio Tecchio
  25. Reinforcement Learning Based Computationally Efficient Conditional Choice Simulation Estimation of Dynamic Discrete Choice Models By Ahmed Khwaja; Sonal Srivastava
  26. Heterogeneous Effects of Endogenous Treatments with Interference and Spillovers in a Large Network By Lin Chen; Yuya Sasaki
  27. Time-Aware Synthetic Control By Saeyoung Rho; Cyrus Illick; Samhitha Narasipura; Alberto Abadie; Daniel Hsu; Vishal Misra
  28. Bounds on inequality with incomplete data By James Banks; Thomas Glinnan; Tatiana Komarova
  29. Empirical Asset Pricing with Score-Driven Conditional Betas By Thomas Giroux; Julien Royer; Olivier David Zerbib
  30. On lead-lag estimation of non-synchronously observed point processes By Takaaki Shiotani; Takaki Hayashi; Yuta Koike
  31. Limitations of Randomization Tests in Finite Samples By Deniz Dutz; Xinyi Zhang
  32. How to Measure the Invisible? Wars, Pentagon Pizzerias, and the Use of Proxy Variables in Econometrics By Guilherme Vianna; Victor Rangel
  33. Covariate Augmented CUSUM Bubble Monitoring Procedures By Astill, Sam; Taylor, A. M. Robert; Zu, Yang
  34. Corrected Forecast Combinations By Chu-An Liu; Andrey L. Vasnev
  35. Learning and Testing Exposure Mappings of Interference using Graph Convolutional Autoencoder By Martin Huber; Jannis Kueck; Mara Mattes
  36. Selecting and Testing Asset Pricing Models: A Stepwise Approach By Guanhao Feng; Wei Lan; Hansheng Wang; Jun Zhang
  37. Dynamic Mortality Forecasting via Mixed-Frequency State-Space Models By Runze Li; Rui Zhou; David Pitt
  38. Large-dimensional cointegrated threshold factor models: The Global Term Structure of Interest Rates By Paulo M.M. Rodrigues; Daniel Abreu
  39. Efficiency versus Robustness under Tail Misspecification: Importance Sampling and Moment-Based VaR Bracketing By Aditri
  40. Limits To (Machine) Learning By Zhimin Chen; Bryan T. Kelly; Semyon Malamud
  41. ProbFM: Probabilistic Time Series Foundation Model with Uncertainty Decomposition By Arundeep Chinta; Lucas Vinh Tran; Jay Katukuri
  42. The Fourier estimator of spot volatility: Unbounded coefficients and jumps in the price process By L. J. Espinosa González; Erick Treviño Aguilar

  1. By: Facundo Argañaraz
    Abstract: This paper proposes a method to automatically construct or estimate Neyman-orthogonal moments in general models defined by a finite number of conditional moment restrictions (CMRs), with possibly different conditioning variables and endogenous regressors. CMRs are allowed to depend on non-parametric components, which might be flexibly modeled using Machine Learning tools, and non-linearly on finite-dimensional parameters. The key step in this construction is the estimation of Orthogonal Instrumental Variables (OR-IVs) -- "residualized" functions of the conditioning variables, which are then combined to obtain a debiased moment. We argue that computing OR-IVs necessarily requires solving potentially complicated functional equations, which depend on unknown terms. However, by imposing an approximate sparsity condition, our method finds the solutions to those equations using a Lasso-type program and can then be implemented straightforwardly. Based on this, we introduce a GMM estimator of finite-dimensional parameters (structural parameters) in a two-step framework. We derive theoretical guarantees for our construction of OR-IVs and show $\sqrt{n}$-consistency and asymptotic normality for the estimator of the structural parameters. Our Monte Carlo experiments and an empirical application on estimating firm-level production functions highlight the importance of relying on inference methods like the one proposed.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.08423
  2. By: Grigory Franguridi; Jinyong Hahn; Pierre Hoonhout; Arie Kapteyn; Geert Ridder
    Abstract: In panel data subject to nonignorable attrition, auxiliary (refreshment) sampling may restore full identification under weak assumptions on the attrition process. Despite their generality, these identification strategies have seen limited empirical use, largely because the implied estimation procedure requires solving a functional minimization problem for the target density. We show that this problem can be solved using the iterative proportional fitting (raking) algorithm, which converges rapidly even with continuous and moderately high-dimensional data. This resulting density estimator is then used as input into a parametric moment condition. We establish consistency and convergence rates for both the raking-based density estimator and the resulting moment estimator when the distributions of the observed data are parametric. We also derive a simple recursive procedure for estimating the asymptotic variance. Finally, we demonstrate the satisfactory performance of our estimator in simulations and provide an empirical illustration using data from the Understanding America Study panel.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.13270
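    The raking step above refers to the classical iterative proportional fitting algorithm. As a minimal, hypothetical sketch of that building block (not the authors' density estimator; the toy table and target marginals are invented), the following Python rescales a nonnegative table until its margins match prescribed totals:
```python
import numpy as np

def ipf(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Iterative proportional fitting (raking): rescale a nonnegative table
    so its row and column sums match the given target marginals."""
    w = table.astype(float).copy()
    for _ in range(max_iter):
        w *= (row_targets / w.sum(axis=1))[:, None]   # match row sums
        w *= (col_targets / w.sum(axis=0))[None, :]   # match column sums
        if np.allclose(w.sum(axis=1), row_targets, atol=tol):
            break
    return w

# toy example: rake a 2x3 seed table to new (consistent) marginals
seed = np.array([[40., 30., 30.], [35., 35., 30.]])
raked = ipf(seed, row_targets=np.array([60., 40.]),
            col_targets=np.array([50., 30., 20.]))
print(raked.round(3), raked.sum(axis=1), raked.sum(axis=0))
```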
  3. By: Shuyuan Chen; Peng Zhang; Yifan Cui
    Abstract: Estimating causal effects of continuous treatments is a common problem in practice, for example, in studying dose-response functions. Classical analyses typically assume that all confounders are fully observed, whereas in real-world applications, unmeasured confounding often persists. In this article, we propose a novel framework for local identification of dose-response functions using instrumental variables, thereby mitigating bias induced by unobserved confounders. We introduce the concept of a uniform regular weighting function and consider covering the treatment space with a finite collection of open sets. On each of these sets, such a weighting function exists, allowing us to identify the dose-response function locally within the corresponding region. For estimation, we develop an augmented inverse probability weighting score for continuous treatments under a debiased machine learning framework with instrumental variables. We further establish the asymptotic properties when the dose-response function is estimated via kernel regression or empirical risk minimization. Finally, we conduct both simulation and empirical studies to assess the finite-sample performance of the proposed methods.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.01471
  4. By: Takashi Kano
    Abstract: This paper introduces a Bayesian inference framework for incomplete structural models, termed distribution-matching posterior inference (DMPI). Extending the minimal econometric interpretation (MEI), DMPI constructs a divergence-based quasi-likelihood using the Jensen-Shannon divergence between theoretical and empirical population-moment distributions, based on a Dirichlet-multinomial structure with additive smoothing. The framework accommodates model misspecification and stochastic singularity. Posterior inference is implemented via a sequential Monte Carlo algorithm with Metropolis-Hastings mutation that jointly samples structural parameters and theoretical moment distributions. Monte Carlo experiments using misspecified New Keynesian (NK) models demonstrate that DMPI yields robust inference and improves distribution-matching coherence by probabilistically down-weighting moment distributions inconsistent with the structural model. An empirical application to U.S. data shows that a parsimonious stochastic singular NK model provides a better fit to business-cycle moments than an overparameterized full-rank counterpart.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.01077
  5. By: Augustine Denteh; Pierre E. Nguimkeu
    Abstract: This paper considers the estimation of binary choice models when survey responses are possibly misclassified but one of the response categories can be validated. Partial validation may occur when survey questions about participation include follow-up questions on that particular response category. In this case, we show that the initial two-sided misclassification problem can be transformed into a one-sided one, based on the partially validated responses. Using the updated responses naively for estimation does not solve or mitigate the misclassification bias, and we derive the ensuing asymptotic bias under general conditions. We then show how the partially validated responses can be used to construct a model for participation and propose consistent and asymptotically normal estimators that overcome misclassification error. Monte Carlo simulations are provided to demonstrate the finite sample performance of the proposed and selected existing methods. We provide an empirical illustration on the determinants of health insurance coverage in Ghana. We discuss implications for the design of survey questionnaires that allow researchers to overcome misclassification biases without recourse to relatively costly and often imperfect validation data.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.14616
  6. By: Alejandro Rodriguez Dominguez
    Abstract: We introduce an operator-theoretic framework for causal analysis in multivariate time series based on order-constrained spectral non-invariance. Directional influence is defined as sensitivity of second-order dependence operators to admissible, order-preserving temporal deformations of a designated source component, yielding an intrinsically multivariate causal notion summarized through orthogonally invariant spectral functionals. Under linear Gaussian assumptions, the criterion coincides with linear Granger causality, while beyond this regime it captures collective and nonlinear directional dependence not reflected in pairwise predictability. We establish existence, uniform consistency, and valid inference for the resulting non-smooth supremum--infimum statistics using shift-based randomization that exploits order-induced group invariance, yielding finite-sample exactness under exact invariance and asymptotic validity under weak dependence without parametric assumptions. Simulations demonstrate correct size and strong power against distributed and bulk-dominated alternatives, including nonlinear dependence missed by linear Granger tests with appropriate feature embeddings. An empirical application to a high-dimensional panel of daily financial return series spanning major asset classes illustrates system-level causal monitoring in practice. Directional organization is episodic and stress-dependent, causal propagation strengthens while remaining multi-channel, dominant causal hubs reallocate rapidly, and statistically robust transmission channels are sparse and horizon-heterogeneous even when aggregate lead--lag asymmetry is weak. The framework provides a scalable and interpretable complement to correlation-, factor-, and pairwise Granger-style analyses for complex systems.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.01216
  7. By: Anna Bykhovskaya; James A. Duffy
    Abstract: This paper studies robust estimation in the dynamic Tobit model under local-to-unity (LUR) asymptotics. We show that both Gaussian maximum likelihood (ML) and censored least absolute deviations (CLAD) estimators are consistent, extending results from the stationary case where ordinary least squares (OLS) is inconsistent. The asymptotic distributions of MLE and CLAD are derived; for the short-run parameters they are shown to be Gaussian, yielding standard normal t-statistics. In contrast, although OLS remains consistent under LUR, its t-statistics are not standard normal. These results enable reliable model selection via sequential t-tests based on ML and CLAD, paralleling the linear autoregressive case. Applications to financial and epidemiological time series illustrate their practical relevance.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.12110
  8. By: Ulrich Hounyo; Min Seong Kim
    Abstract: We propose robust two-sample tests for comparing means in time series. The framework accommodates a wide range of applications, including structural breaks, treatment-control comparisons, and group-averaged panel data. We first consider series HAR two-sample t-tests, where standardization employs orthonormal basis projections, ensuring valid inference under heterogeneity and nonparametric dependence structures. We propose a Welch-type t-approximation with adjusted degrees of freedom to account for long-run variance heterogeneity across the series. We further develop a series-based HAR wild bootstrap test, extending traditional wild bootstrap methods to the time-series setting. Our bootstrap avoids resampling blocks of observations and delivers superior finite-sample performance.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.11259
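    For readers unfamiliar with series HAR standardization, the sketch below illustrates the general idea under assumptions of our own (a cosine basis, K = 8 projections, a Satterthwaite-style degrees-of-freedom adjustment, and synthetic AR(1) data); it is not the authors' exact statistic or their bootstrap.
```python
import numpy as np

def series_lrv(x, K=8):
    """Series long-run variance estimator: project the demeaned series on K
    orthonormal cosine basis functions and average the squared projections."""
    T = len(x)
    t = (np.arange(1, T + 1) - 0.5) / T
    xd = x - x.mean()
    lam = np.array([np.sqrt(2 / T) * np.sum(np.cos(np.pi * j * t) * xd)
                    for j in range(1, K + 1)])
    return np.mean(lam ** 2)

def welch_har_t(x, y, K=8):
    """Welch-type HAR t-statistic for equality of means of two series,
    with a Satterthwaite-style degrees-of-freedom adjustment."""
    vx, vy = series_lrv(x, K) / len(x), series_lrv(y, K) / len(y)
    tstat = (x.mean() - y.mean()) / np.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / K + vy ** 2 / K)
    return tstat, df

rng = np.random.default_rng(0)
# two AR(1) series with equal means but different persistence (hypothetical)
e1, e2 = rng.standard_normal(300), rng.standard_normal(200)
x, y = np.zeros(300), np.zeros(200)
for s in range(1, 300): x[s] = 0.5 * x[s - 1] + e1[s]
for s in range(1, 200): y[s] = 0.8 * y[s - 1] + e2[s]
print(welch_har_t(x, y))
```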
  9. By: Daisuke Kurisu; Yuta Okamoto; Taisuke Otsu
    Abstract: In applied research, Lee (2009) bounds are widely used to bound the average treatment effect in the presence of selection bias. This paper extends the methodology of Lee bounds to accommodate outcomes in a general metric space, such as compositional and distributional data. By exploiting a representation of the Fréchet mean of the potential outcome via embedding in a Euclidean or Hilbert space, we present a feasible characterization of the identified set of the causal effect of interest, and then propose its analog estimator and bootstrap confidence region. The proposed method is illustrated by numerical examples on compositional and distributional data.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.09453
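    As background for readers new to Lee bounds, here is a minimal sketch of the classical scalar-outcome trimming bounds that the paper generalizes to metric-space outcomes; the data-generating process and the monotonicity direction (treatment raises selection) are hypothetical.
```python
import numpy as np

def lee_bounds(y, d, s):
    """Classical Lee (2009) trimming bounds for the scalar-outcome case.
    y: outcome (used only where s == 1), d: treatment, s: selection indicator."""
    p1, p0 = s[d == 1].mean(), s[d == 0].mean()
    trim = 1 - p0 / p1                              # share of selected treated units to trim
    y1 = np.sort(y[(d == 1) & (s == 1)])
    k = int(np.floor(trim * y1.size))
    y0_mean = y[(d == 0) & (s == 1)].mean()
    lower = y1[:y1.size - k].mean() - y0_mean       # trim the largest treated outcomes
    upper = y1[k:].mean() - y0_mean                 # trim the smallest treated outcomes
    return lower, upper

rng = np.random.default_rng(7)
n = 4000
d = rng.integers(0, 2, n)
s = (rng.random(n) < np.where(d == 1, 0.9, 0.7)).astype(int)   # treatment raises selection
y = 1.0 * d + rng.standard_normal(n)                            # outcome observed only when s == 1
print(lee_bounds(y, d, s))
```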
  10. By: Jakob Bjelac; Victor Chernozhukov; Phil-Adrian Klotz; Jannis Kueck; Theresa M. A. Schmitz
    Abstract: In this paper, we extend the Riesz representation framework to causal inference under sample selection, where both treatment assignment and outcome observability are non-random. Formulating the problem in terms of a Riesz representer enables stable estimation and a transparent decomposition of omitted variable bias into three interpretable components: a data-identified scale factor, outcome confounding strength, and selection confounding strength. For estimation, we employ the ForestRiesz estimator, which accounts for selective outcome observability while avoiding the instability associated with direct propensity score inversion. We assess finite-sample performance through a simulation study and show that conventional double machine learning approaches can be highly sensitive to tuning parameters due to their reliance on inverse probability weighting, whereas the ForestRiesz estimator delivers more stable performance by leveraging automatic debiased machine learning. In an empirical application to the gender wage gap in the U.S., we find that our ForestRiesz approach yields larger treatment effect estimates than a standard double machine learning approach, suggesting that ignoring sample selection leads to an underestimation of the gender wage gap. Sensitivity analysis indicates that implausibly strong unobserved confounding would be required to overturn our results. Overall, our approach provides a unified, robust, and computationally attractive framework for causal inference under sample selection.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.08643
  11. By: Irene Botosaru; Laura Liu
    Abstract: Event studies often conflate direct treatment effects with indirect effects operating through endogenous covariate adjustment. We develop a dynamic panel event study framework that separates these effects. The framework allows for persistent outcomes and treatment effects and for covariates that respond to past outcomes and treatment exposure. Under sequential exogeneity and homogeneous feedback, we establish point identification of common parameters governing outcome and treatment effect dynamics, the distribution of heterogeneous treatment effects, and the covariate feedback process. We propose an algorithm for dynamic decomposition that enables researchers to assess the relative importance of each effect in driving treatment effect dynamics.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.05493
  12. By: Ruonan Xu; Xiye Yang
    Abstract: Using only retrospective data, we propose an estimator for predicting the treatment effect for the same treatment/policy to be implemented in another location or time period, which requires no input from the target population. More specifically, we minimize the worst-case mean square error for the prediction of treatment effect within a class of distributions inside the Wasserstein ball centered on the source distribution. Since the joint distribution of potential outcomes is not identified, we pick the best and worst copulas of the marginal distributions of two potential outcomes as our optimistic and pessimistic optimization objects for partial identification. As a result, we can attain the upper and lower bounds of the minimax optimizer. The minimax solution differs depending on whether treatment effects are homogeneous or heterogeneous. We derive the consistency and asymptotic distribution of the bound estimators, provide a two-step inference procedure, and discuss the choice of the Wasserstein ball radius.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.12781
  13. By: Xinyu Wang; Chunlin Wang; Tao Yu; Pengfei Li
    Abstract: This paper develops semiparametric methods for estimation and inference of widely used inequality measures when survey data are subject to nonignorable nonresponse, a challenging setting in which response probabilities depend on the unobserved outcomes. Such nonresponse mechanisms are common in household surveys and invalidate standard inference procedures due to selection bias and lack of population representativeness. We address this problem by exploiting callback data from repeated contact attempts and adopting a semiparametric model that leaves the outcome distribution unspecified. We construct semiparametric full-likelihood estimators for the underlying distribution and the associated inequality measures, and establish their large-sample properties for a broad class of functionals, including quantiles, the Theil index, and the Gini index. Explicit asymptotic variance expressions are derived, enabling valid Wald-type inference under nonignorable nonresponse. To facilitate implementation, we propose a stable and computationally convenient expectation-maximization algorithm, whose steps either admit closed-form expressions or reduce to fitting a standard logistic regression model. Simulation studies demonstrate that the proposed procedures effectively correct nonresponse bias and achieve near-benchmark efficiency. An application to Consumer Expenditure Survey data illustrates the practical gains from incorporating callback information when making inference on inequality measures.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.10501
  14. By: Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Gabriele Fiorentini (Università di Firenze and RCEA); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros)
    Abstract: We propose specification tests for Gaussian SVAR models identified with short- and long-run restrictions that assess the theoretical justification of the chosen identification scheme by checking the independence of the structural shocks. We consider both moment tests that focus on their coskewness and cokurtosis and contingency table tests with discrete and continuous grids. Our simulations confirm the finite sample reliability of resampling versions of our proposals, and their power against interesting alternatives. We also apply them to two influential studies: Kilian (2009) with short-run restrictions in oil markets and Blanchard and Quah (1989) with long-run ones for the aggregate economy.
    Keywords: Consistent test, coskewness, cokurtosis, independence test, moment tests, oil market, pseudo maximum likelihood estimators, supply and demand shocks.
    JEL: C32 C52 E32 Q41 Q43
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:cmf:wpaper:wp2025_2532
  15. By: Timothy Christensen; Giovanni Compiani
    Abstract: Empirical models of demand for differentiated products rely on low-dimensional product representations to capture substitution patterns. These representations are increasingly proxied by applying ML methods to high-dimensional, unstructured data, including product descriptions and images. When proxies fail to capture the true dimensions of differentiation that drive substitution, standard workflows will deliver biased counterfactuals and invalid inference. We develop a practical toolkit that corrects this bias and ensures valid inference for a broad class of counterfactuals. Our approach applies to market-level and/or individual data, requires minimal additional computation, is efficient, delivers simple formulas for standard errors, and accommodates data-dependent proxies, including embeddings from fine-tuned ML models. It can also be used with standard quantitative attributes when mismeasurement is a concern. In addition, we propose diagnostics to assess the adequacy of the proxy construction and dimension. The approach yields meaningful improvements in predicting counterfactual substitution in both simulations and an empirical application.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.05374
  16. By: Yoon Choi
    Abstract: I propose an estimation algorithm for Exponential Random Graph Models (ERGM), a popular statistical network model for estimating the structural parameters of strategic network formation in economics and finance. Existing methods often produce unreliable estimates of parameters for the triangle, a key network structure that captures the tendency of two individuals with friends in common to connect. Such unreliable estimates may lead to untrustworthy policy recommendations for networks with triangles. Through a variational mean-field approach, my algorithm addresses the two well-known difficulties when estimating the ERGM, the intractability of its normalizing constant and model degeneracy. In addition, I introduce $\ell_2$ regularization that ensures a unique solution to the mean-field approximation problem under suitable conditions. I provide a non-asymptotic optimization convergence rate analysis for my proposed algorithm under mild regularity conditions. Through Monte Carlo simulations, I demonstrate that my method achieves a perfect sign recovery rate for triangle parameters for small and mid-sized networks under perturbed initialization, compared to a 50% rate for existing algorithms. I provide the sensitivity analysis of estimates of ERGM parameters to hyperparameter choices, offering practical insights for implementation.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07176
  17. By: Stanisław Marek Sergiusz Halkiewicz; Andrzej Kałuża
    Abstract: This article develops a significance test for the Difference-in-Differences (DiD) estimator based on doubly randomised inference, in which both the treatment and time indicators are permuted to generate an empirical null distribution of the DiD coefficient. Unlike classical $t$-tests or single-margin permutation procedures, the proposed method exploits a substantially enlarged randomization space. We formally characterise this expansion and show that dual randomization increases the number of admissible relabelings by a factor of $\binom{n}{n_T}$, yielding an exponentially richer permutation universe. This combinatorial gain implies a denser and more stable approximation of the null distribution, a result further justified through an information-theoretic (entropy) interpretation. The validity and finite-sample behaviour of the test are examined using multiple empirical datasets commonly analysed in applied economics, including the Indonesian school construction program (INPRES), brand search data, minimum wage reforms, and municipality-level refugee inflows in Greece. Across all settings, doubly randomised inference performs comparably to standard approaches while offering superior small-sample stability and sharper critical regions due to the enlarged permutation space. The proposed procedure therefore provides a robust, nonparametric alternative for assessing the statistical significance of DiD estimates, particularly in designs with limited group sizes or irregular assignment structures.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.06946
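    A stripped-down illustration of the doubly randomised idea, permuting both the unit-level treatment labels and the period-level post indicators to build a null distribution for the 2x2 DiD coefficient, on synthetic panel data of our own; the procedure and datasets in the paper are richer.
```python
import numpy as np

def did_coef(y, treat, post):
    """2x2 difference-in-differences coefficient from group/period means."""
    return ((y[(treat == 1) & (post == 1)].mean() - y[(treat == 1) & (post == 0)].mean())
            - (y[(treat == 0) & (post == 1)].mean() - y[(treat == 0) & (post == 0)].mean()))

rng = np.random.default_rng(1)
n_units, n_periods = 40, 6
unit_treat = (np.arange(n_units) < 20).astype(int)      # hypothetical treated group
period_post = (np.arange(n_periods) >= 3).astype(int)   # hypothetical post periods
treat = np.repeat(unit_treat, n_periods)
post = np.tile(period_post, n_units)
y = 0.5 * treat + 0.3 * post + 1.0 * treat * post + rng.standard_normal(treat.size)

obs = did_coef(y, treat, post)
null = []
for _ in range(2000):
    t_perm = np.repeat(rng.permutation(unit_treat), n_periods)   # permute units' treatment labels
    p_perm = np.tile(rng.permutation(period_post), n_units)      # permute periods' post labels
    null.append(did_coef(y, t_perm, p_perm))
pval = np.mean(np.abs(null) >= abs(obs))
print(obs, pval)
```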
  18. By: Gabriel Saco
    Abstract: Double Machine Learning is often justified by nuisance-rate conditions, yet finite-sample reliability also depends on the conditioning of the orthogonal-score Jacobian. This conditioning is typically assumed rather than tracked. When residualized treatment variance is small, the Jacobian is ill-conditioned and small systematic nuisance errors can be amplified, so nominal confidence intervals may look precise yet systematically under-cover. Our main result is an exact identity for the cross-fitted PLR-DML estimator, with no Taylor approximation. From this identity, we derive a stochastic-order bound that separates oracle noise from a conditioning-amplified nuisance remainder and yields a sufficiency condition for root-n inference. We further connect the amplification factor to semiparametric efficiency geometry via the Riesz representer and use a triangular-array framework to characterize regimes as residual treatment variation weakens. These results motivate an out-of-fold diagnostic that summarizes the implied amplification scale. We do not propose universal thresholds. Instead, we recommend reporting the diagnostic alongside cross-learner sensitivity summaries as a fragility assessment, illustrated in simulation and an empirical example.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07083
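    To make the object under discussion concrete, here is a generic cross-fitted PLR-DML sketch with random-forest nuisances on hypothetical data with deliberately weak residual treatment variation; the final printed quantity, the out-of-fold residual treatment variance, is a crude stand-in for the conditioning diagnostic the paper develops, not the paper's exact statistic.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(6)
n, p, theta_true = 500, 10, 1.0
X = rng.standard_normal((n, p))
d = X[:, 0] + 0.1 * rng.standard_normal(n)      # weak residual treatment variation (hypothetical)
y = theta_true * d + X[:, 0] ** 2 + rng.standard_normal(n)

d_res, y_res = np.zeros(n), np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    m = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[train], d[train])
    g = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[train], y[train])
    d_res[test] = d[test] - m.predict(X[test])   # out-of-fold treatment residuals
    y_res[test] = y[test] - g.predict(X[test])   # out-of-fold outcome residuals

theta_hat = np.sum(d_res * y_res) / np.sum(d_res ** 2)
se = np.sqrt(np.mean((y_res - theta_hat * d_res) ** 2 * d_res ** 2)) / np.mean(d_res ** 2) / np.sqrt(n)
print(theta_hat, se, np.mean(d_res ** 2))        # last number: residual treatment variance
```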
  19. By: Frederico Finan; Demian Pouzo
    Abstract: We establish concentration rates for estimation of treatment effects in experiments that incorporate prior sources of information -- such as past pilots, related studies, or expert assessments -- whose external validity is uncertain. Each source is modeled as a Gaussian prior with its own mean and precision, and sources are combined using Bayesian model averaging (BMA), allowing data from the new experiment to update posterior weights. To capture empirically relevant settings in which prior studies may be as informative as the current experiment, we introduce a nonstandard asymptotic framework in which prior precisions grow with the experiment's sample size. In this regime, posterior weights are governed by an external-validity index that depends jointly on a source's bias and information content: biased sources are exponentially downweighted, while unbiased sources dominate. When at least one source is unbiased, our procedure concentrates on the unbiased set and achieves faster convergence than relying on new data alone. When all sources are biased, including a deliberately conservative (diffuse) prior guarantees robustness and recovers the standard convergence rate.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.09888
  20. By: Nadav Kunievsky
    Abstract: The interpretation of coefficients from multivariate linear regression relies on the assumption that the conditional expectation function is linear in the variables. However, in many cases the underlying data generating process is nonlinear. This paper examines how to interpret regression coefficients under nonlinearity. We show that if the relationships between the variable of interest and other covariates are linear, then the coefficient on the variable of interest represents a weighted average of the derivatives of the outcome conditional expectation function with respect to the variable of interest. If these relationships are nonlinear, the regression coefficient becomes biased relative to this weighted average. We show that this bias is interpretable, analogous to the biases from measurement error and omitted variable bias under the standard linear model.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.13645
  21. By: Kyriakopoulou, Dimitra
    Abstract: We propose a ridge-regularized Factor-Augmented Vector Autoregression (FAVAR) for forecasting macro–fiscal systems in data-rich environments where the cross-sectional dimension is large relative to the available sample. The framework combines principal-component factor extraction with a shrinkage-based VAR for the joint dynamics of observed macro–fiscal variables and latent components. Applying the model to Greece, we show that the extracted factors capture meaningful real and nominal structures, while the ridge-regularized VAR delivers stable impulse responses and coherent short- and medium-term dynamics for variables central to the sovereign debt identity. A recursive out-of-sample evaluation indicates that the ridge-FAVAR systematically improves medium-term forecasting accuracy relative to standard AR benchmarks, particularly for real GDP growth and the interest–growth differential. The results highlight the usefulness of shrinkage-augmented factor models for macro–fiscal forecasting and motivate further econometric work on regularized state-space and structural factor VARs.
    Keywords: FAVAR, Ridge Regression, Forecasting, High-Dimensional Data, Fiscal Policy, Debt Dynamics, Macro–Fiscal Modelling
    JEL: C32 C38 C53 C55 E62 H63
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:127158
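    A bare-bones sketch of the two ingredients named above, principal-component factor extraction plus a ridge-penalized VAR(1), on synthetic data with made-up dimensions and penalty; it omits the lag selection, standardization choices, and recursive evaluation of the paper.
```python
import numpy as np

rng = np.random.default_rng(2)
T, N, k = 120, 60, 3                       # hypothetical sample length, panel width, number of factors
panel = rng.standard_normal((T, N))        # stand-in for a large macro panel
observed = rng.standard_normal((T, 2))     # stand-in for observed macro-fiscal variables

# 1) factor extraction: first k principal components of the standardized panel
Z = (panel - panel.mean(0)) / panel.std(0)
U, S, _ = np.linalg.svd(Z, full_matrices=False)
factors = U[:, :k] * S[:k]

# 2) ridge-regularized VAR(1) on [observed variables, factors]
X = np.column_stack([observed, factors])
Y, lagged = X[1:], X[:-1]
L = np.column_stack([np.ones(len(lagged)), lagged])   # intercept + lagged state
penalty = 10.0 * np.eye(L.shape[1])                   # hypothetical ridge penalty
penalty[0, 0] = 0.0                                   # leave the intercept unpenalized
A = np.linalg.solve(L.T @ L + penalty, L.T @ Y)       # shrinkage estimate of VAR coefficients
forecast = np.concatenate([[1.0], X[-1]]) @ A         # one-step-ahead forecast of all variables
print(forecast)
```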
  22. By: Artem Timoshenko; Caio Waisman
    Abstract: Firms often develop targeting policies to personalize marketing actions and improve incremental profits. Effective targeting depends on accurately separating customers with positive versus negative treatment effects. We propose an approach to estimate the conditional average treatment effects (CATEs) of marketing actions that aligns their estimation with the firm's profit objective. The method recognizes that, for many customers, treatment effects are so extreme that additional accuracy is unlikely to change the recommended actions. However, accuracy matters near the decision boundary, as small errors can alter targeting decisions. By modifying the firm's objective function in the standard profit maximization problem, our method yields a near-optimal targeting policy while simultaneously estimating CATEs. This introduces a new perspective on CATE estimation, reframing it as a problem of profit optimization rather than prediction accuracy. We establish the theoretical properties of the proposed method and demonstrate its performance and trade-offs using synthetic data.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.13400
  23. By: Andrea Conti; Giacomo Morelli
    Abstract: The estimation of the Risk Neutral Density (RND) implicit in option prices is challenging, especially in illiquid markets. We introduce the Deep Log-Sum-Exp Neural Network, an architecture that leverages Deep and Transfer learning to address RND estimation in the presence of irregular and illiquid strikes. We prove key statistical properties of the model and the consistency of the estimator. We illustrate the benefits of transfer learning to improve the estimation of the RND in severe illiquidity conditions through Monte Carlo simulations, and we test it empirically on SPX data, comparing it with popular estimation methods. Overall, our framework shows recovery of the RND in conditions of extreme illiquidity with as few as three option quotes.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.11731
  24. By: Joshua Angrist; Andres Santos; Otávio Tecchio
    Abstract: Many instrumental variables applications specify a single Bernoulli treatment. But instruments may change outcomes through multiple pathways or by varying treatment intensity. Lottery instruments that boost charter school enrollment, for instance, may affect outcomes by lengthening time enrolled in a charter school and by moving students between charter schools of different types. We analyze the identification problem such scenarios present in a framework that generalizes the always-taker/never-taker/complier partition of treatment response types to cover a wide range of multinomial and ordered treatments with heterogeneous potential outcomes. This framework yields novel estimators in which a single randomly assigned instrument identifies (i) causal effects averaged over complier types and (ii) a causal conditional expectation function that captures effects for each element in a set of response types. Three empirical applications demonstrate the utility of these results. The first extends an earlier analysis of the Head Start Impact Study allowing for multiple fallbacks. The second examines two causal channels for the impact of post-secondary financial aid on degree completion. The third estimates effects of additional births (an ordered treatment) on mothers’ earnings.
    JEL: C14 C21 C26 I23 I26
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:34607
  25. By: Ahmed Khwaja; Sonal Srivastava
    Abstract: Dynamic discrete choice (DDC) models have found widespread application in marketing. However, estimating these becomes challenging in "big data" settings with high-dimensional state-action spaces. To address this challenge, this paper develops a Reinforcement Learning (RL)-based two-step ("computationally light") Conditional Choice Simulation (CCS) estimation approach that combines the scalability of machine learning with the transparency, explainability, and interpretability of structural models, which is particularly valuable for counterfactual policy analysis. The method is premised on three insights: (1) the CCS ("forward simulation") approach is a special case of RL algorithms, (2) starting from an initial state-action pair, CCS updates the corresponding value function only after each simulation path has terminated, whereas RL algorithms may update for all the state-action pairs visited along a simulated path, and (3) RL focuses on inferring an agent's optimal policy with known reward functions, whereas DDC models focus on estimating the reward functions presupposing optimal policies. The procedure's computational efficiency over CCS estimation is demonstrated using Monte Carlo simulations with a canonical machine replacement and a consumer food purchase model. Framing CCS estimation of DDC models as an RL problem increases their applicability and scalability to high-dimensional marketing problems while retaining both interpretability and tractability.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.02069
  26. By: Lin Chen; Yuya Sasaki
    Abstract: This paper studies the identification and estimation of heterogeneous effects of an endogenous treatment under interference and spillovers in a large single-network setting. We model endogenous treatment selection as an equilibrium outcome that explicitly accounts for spillovers and derive conditions guaranteeing the existence and uniqueness of this equilibrium. We then identify heterogeneous marginal exposure effects (MEEs), which may vary with both the treatment status of neighboring nodes and unobserved heterogeneity. We develop estimation strategies and establish their large-sample properties. Equipped with these tools, we analyze the heterogeneous effects of import competition on U.S. local labor markets in the presence of interference and spillovers. We find negative MEEs, consistent with the existing literature. However, these effects are amplified by spillovers in the presence of treated neighbors and among localities that tend to select into lower levels of import competition. These additional empirical findings are novel and would not be credibly obtainable without the econometric framework proposed in this paper.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.14515
  27. By: Saeyoung Rho; Cyrus Illick; Samhitha Narasipura; Alberto Abadie; Daniel Hsu; Vishal Misra
    Abstract: The synthetic control (SC) framework is widely used for observational causal inference with time-series panel data. SC has been successful in diverse applications, but existing methods typically treat the ordering of pre-intervention time indices as interchangeable. This invariance means they may not fully take advantage of temporal structure when strong trends are present. We propose Time-Aware Synthetic Control (TASC), which employs a state-space model with a constant trend while preserving a low-rank structure of the signal. TASC uses the Kalman filter and Rauch-Tung-Striebel smoother: it first fits a generative time-series model with expectation-maximization and then performs counterfactual inference. We evaluate TASC on both simulated and real-world datasets, including policy evaluation and sports prediction. Our results suggest that TASC offers advantages in settings with strong temporal trends and high levels of observation noise.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.03099
  28. By: James Banks; Thomas Glinnan; Tatiana Komarova
    Abstract: We develop a unified, nonparametric framework for sharp partial identification and inference on inequality indices when income or wealth is only coarsely observed – for example via grouped tables or individual interval reports – possibly together with linear restrictions such as known means or subgroup totals. First, for a broad class of Schur-convex inequality measures, we characterize extremal allocations and show that sharp bounds are attained by distributions with simple, finite support, reducing the underlying infinite-dimensional problem to finite-dimensional optimization. Second, for indices that admit linear-fractional representations after suitable ordering of the data (including the Gini coefficient, quantile ratios, and the Hoover index), we recast the bound problems as linear or quadratic programs, yielding fast computation of numerically sharp bounds. Third, we establish $\sqrt{n}$ inference for bound endpoints using a uniform directional delta method and a bootstrap procedure for standard errors. In ELSA wealth data with mixed point and interval observations, we obtain sharp Gini bounds of 0.714–0.792 for liquid savings and 0.686–0.767 for a broad savings measure; historical U.S. income tables deliver time-series bounds for the Gini, quantile ratios, and Hoover index under grouped information.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07709
  29. By: Thomas Giroux (CREST - Centre de Recherche en Économie et Statistique); Julien Royer (CREST - Centre de Recherche en Économie et Statistique); Olivier David Zerbib (CREST - Centre de Recherche en Économie et Statistique)
    Abstract: We develop a novel empirical asset pricing framework to estimate time-varying risk premia, building upon score-driven conditional betas models. First, we extend the theory by establishing the asymptotic distribution of standard test statistics, allowing us to assess the significance of a given factor in the regression. Additionally, we introduce a bootstrap procedure and establish its validity. Second, we propose a two-step estimation procedure to recover time-varying risk premia. We illustrate the performance of our tests and risk premia estimation through simulations. Third, we estimate a time-varying premium associated with a carbon risk factor in the cross-section of U.S. industry portfolios.
    Keywords: Asset Pricing Models, Dynamic Factor Models, Score-Driven Models, Carbon risk
    Date: 2024–04–24
    URL: https://d.repec.org/n?u=RePEc:hal:journl:hal-05415058
  30. By: Takaaki Shiotani; Takaki Hayashi; Yuta Koike
    Abstract: This paper introduces a new theoretical framework for analyzing lead-lag relationships between point processes, with a special focus on applications to high-frequency financial data. In particular, we are interested in lead-lag relationships between two sequences of order arrival timestamps. The seminal work of Dobrev and Schaumburg proposed model-free measures of cross-market trading activity based on cross-counts of timestamps. While their method is known to yield reliable results, it faces limitations because its original formulation inherently relies on discrete-time observations, an issue we address in this study. Specifically, we formulate the problem of estimating lead-lag relationships in two point processes as that of estimating the shape of the cross-pair correlation function (CPCF) of a bivariate stationary point process, a quantity well-studied in the neuroscience and spatial statistics literature. Within this framework, the prevailing lead-lag time is defined as the location of the CPCF's sharpest peak. Under this interpretation, the peak location in Dobrev and Schaumburg's cross-market activity measure can be viewed as an estimator of the lead-lag time in the aforementioned sense. We further propose an alternative lead-lag time estimator based on kernel density estimation and show that it possesses desirable theoretical properties and delivers superior numerical performance. Empirical evidence from high-frequency financial data demonstrates the effectiveness of our proposed method.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.01871
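    To fix ideas, a toy kernel-based lead-lag estimator in the spirit described above: collect cross-pair time differences between two timestamp sequences, smooth them with a Gaussian kernel as a stand-in for the CPCF, and take the peak location. The simulated arrival times, true lag, bandwidth, and grid are all invented for illustration.
```python
import numpy as np

def lead_lag_kde(t_a, t_b, max_lag=0.5, bandwidth=0.01, grid=1001):
    """Kernel estimate of the cross-pair correlation shape: smooth the
    cross-pair differences t_b - t_a and return the peak location,
    interpreted as the prevailing lead-lag time."""
    diffs = (t_b[None, :] - t_a[:, None]).ravel()
    diffs = diffs[np.abs(diffs) <= max_lag]
    lags = np.linspace(-max_lag, max_lag, grid)
    dens = np.exp(-0.5 * ((lags[:, None] - diffs[None, :]) / bandwidth) ** 2).sum(axis=1)
    return lags[np.argmax(dens)]

rng = np.random.default_rng(5)
t_a = np.sort(rng.uniform(0, 100, 800))          # order arrivals in market A (hypothetical)
keep = rng.random(t_a.size) < 0.6                # a subset triggers activity in market B
t_b = np.sort(t_a[keep] + 0.05 + 0.005 * rng.standard_normal(keep.sum()))
print(lead_lag_kde(t_a, t_b))                    # should recover a lag of about 0.05
```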
  31. By: Deniz Dutz; Xinyi Zhang
    Abstract: Randomization tests yield exact finite-sample Type 1 error control when the null satisfies the randomization hypothesis. However, achieving these guarantees in practice often requires stronger conditions than the null hypothesis of primary interest. For instance, sign-change tests for mean zero require symmetry and fail to control finite-sample error for non-symmetric mean-zero distributions. We investigate whether such limitations stem from specific test choices or reflect a fundamental inability to construct valid randomization tests for certain hypotheses. We develop a framework providing a simple necessary and sufficient condition for when null hypotheses admit randomization tests. Applying this framework to one-sample tests, we provide characterizations of which nulls satisfy this condition for both finite and continuous supports. In doing so, we prove that certain null hypotheses -- including mean zero -- do not admit randomization tests. We further show that nulls that admit randomization tests based on linear group actions correspond only to subsets of symmetric or normal distributions. Overall, our findings affirm that practitioners are not inadvertently incurring additional Type 1 error when using existing tests and further motivate focusing on the asymptotic validity of randomization tests.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.07099
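    For concreteness, the sign-change test discussed above can be sketched as follows (toy data; the skewed sample illustrates the kind of mean-zero but asymmetric distribution for which the randomization hypothesis, and hence exact finite-sample control, fails):
```python
import numpy as np

def sign_change_test(x, n_draws=5000, seed=0):
    """Sign-change randomization test of H0: mean zero.
    Exact only if the data distribution is symmetric about zero."""
    rng = np.random.default_rng(seed)
    obs = abs(x.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_draws, x.size))
    null = np.abs((signs * x).mean(axis=1))
    return np.mean(null >= obs)          # randomization p-value

rng = np.random.default_rng(3)
symmetric = rng.standard_normal(50)                   # symmetric, mean zero
skewed = rng.exponential(1.0, 50) - 1.0               # mean zero but asymmetric
print(sign_change_test(symmetric), sign_change_test(skewed))
```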
  32. By: Guilherme Vianna; Victor Rangel
    Abstract: Many economically relevant variables (risk, confidence, uncertainty) are latent and therefore not directly observable, which creates identification challenges in applied regressions. This text formalizes how omitting latent factors generates omitted-variable bias and discusses when including a proxy variable can mitigate it. We distinguish the case of a perfect proxy, which can eliminate the bias, from the more realistic case of an imperfect proxy, where residual bias remains and the estimated effect is attenuated. We propose a practical evaluation protocol based on four properties: relevance, conditional sufficiency, exogeneity, and stability. As an illustration, we use micromobility data from Arlington together with the U.S. Geopolitical Risk Index, estimating cointegration and a bivariate VEC model to interpret local activity as a high-frequency signal of the latent component of geopolitical tension.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.10352
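    A small simulation of the abstract's central point, with invented coefficients: omitting a latent factor biases the OLS coefficient, a perfect proxy removes the bias, and a noisy proxy only attenuates it.
```python
import numpy as np

rng = np.random.default_rng(8)
n = 100_000
latent = rng.standard_normal(n)                  # unobserved factor (e.g., latent tension)
x = 0.8 * latent + rng.standard_normal(n)        # regressor correlated with the latent factor
y = 1.0 * x + 2.0 * latent + rng.standard_normal(n)

def ols_coef(y, *cols):
    """OLS with intercept; return the coefficient on the first regressor."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

proxy = latent + 1.0 * rng.standard_normal(n)    # imperfect proxy (measurement noise)
print(ols_coef(y, x),                 # omitted-variable bias: well above 1.0
      ols_coef(y, x, latent),         # perfect proxy recovers roughly 1.0
      ols_coef(y, x, proxy))          # imperfect proxy: bias reduced but not eliminated
```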
  33. By: Astill, Sam; Taylor, A. M. Robert; Zu, Yang
    Abstract: We explore how information from covariates can be incorporated into the CUSUM-based real-time monitoring procedure for explosive asset price bubbles developed in Homm and Breitung (2012). Where dynamic covariates are present in the data generating process, the false positive rate under the null of no explosivity of the basic CUSUM procedure, which is based on the assumption that prices follow a univariate data generating process, will not, in general, be properly controlled, even asymptotically. In contrast, accounting for these relevant covariates in the construction of the CUSUM statistics leads to a procedure whose false positive rate can be controlled using the same asymptotic crossing function as employed by Homm and Breitung (2012). Doing so is also shown to have the potential to significantly increase the chance of detecting an emerging bubble episode in finite samples. We additionally allow for time-varying volatility in the innovations driving the model through the use of a kernel-based variance estimator.
    Date: 2026–01–21
    URL: https://d.repec.org/n?u=RePEc:esy:uefcwp:42634
  34. By: Chu-An Liu; Andrey L. Vasnev
    Abstract: This paper proposes corrected forecast combinations when the original combined forecast errors are serially dependent. Motivated by the classic Bates and Granger (1969) example, we show that combined forecast errors can be strongly autocorrelated and that a simple correction – adding a fraction of the previous combined error to the next-period combined forecast – can deliver sizable improvements in forecast accuracy, often exceeding the original gains from combining. We formalize the approach within the conditional risk framework of Gibbs and Vasnev (2024), in which the combined error decomposes into a predictable component (measurable at the forecast origin) and an innovation. We then link this correction to efficient estimation of combination weights under time-series dependence via GLS, allowing joint estimation of weights and an error-covariance structure. Using the U.S. Survey of Professional Forecasters for major macroeconomic indices across various subsamples (including pre- and post-2000, GFC, and COVID), we find that a parsimonious correction of the mean forecast with a coefficient around 0.5 is a robust starting point and often yields material improvements in forecast accuracy. For optimal-weight forecasts, the correction substantially mitigates the forecast combination puzzle by turning poorly performing out-of-sample optimal-weight combinations into competitive forecasts.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.09999
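    The core correction is simple enough to sketch directly. Below, a persistent AR(1) target is forecast by a random-walk rule and an unconditional-mean rule (both hypothetical), so the equal-weight combined errors are serially correlated; adding half of the previous combined error lowers the mean squared error, in the spirit of the coefficient-around-0.5 rule of thumb reported above.
```python
import numpy as np

rng = np.random.default_rng(4)
T = 400
y = np.zeros(T)
for t in range(1, T):                        # persistent AR(1) target (hypothetical)
    y[t] = 0.95 * y[t - 1] + rng.standard_normal()

# two simple individual forecasts made at t-1 for y_t
f_rw = y[:-1]                                # random-walk forecast
f_mean = np.zeros(T - 1)                     # unconditional-mean forecast
target = y[1:]

combined = 0.5 * (f_rw + f_mean)             # equal-weight combination
err = target - combined                      # combined forecast errors (serially correlated here)

rho = 0.5                                    # correction coefficient, the paper's robust starting point
corrected = combined[1:] + rho * err[:-1]    # add a fraction of the previous combined error

print(np.mean(err[1:] ** 2), np.mean((target[1:] - corrected) ** 2))
```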
  35. By: Martin Huber; Jannis Kueck; Mara Mattes
    Abstract: Interference or spillover effects arise when an individual's outcome (e.g., health) is influenced not only by their own treatment (e.g., vaccination) but also by the treatment of others, creating challenges for evaluating treatment effects. Exposure mappings provide a framework to study such interference by explicitly modeling how the treatment statuses of contacts within an individual's network affect their outcome. Most existing research relies on a priori exposure mappings of limited complexity, which may fail to capture the full range of interference effects. In contrast, this study applies a graph convolutional autoencoder to learn exposure mappings in a data-driven way, which exploit dependencies and relations within a network to more accurately capture interference effects. As our main contribution, we introduce a machine learning-based test for the validity of exposure mappings and thus test the identification of the direct effect. In this testing approach, the learned exposure mapping is used as an instrument to test the validity of a simple, user-defined exposure mapping. The test leverages the fact that, if the user-defined exposure mapping is valid (so that all interference operates through it), then the learned exposure mapping is statistically independent of any individual's outcome, conditional on the user-defined exposure mapping. We assess the finite-sample performance of this proposed validity test through a simulation study.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.05728
  36. By: Guanhao Feng; Wei Lan; Hansheng Wang; Jun Zhang
    Abstract: The asset pricing literature emphasizes factor models that minimize pricing errors but overlooks unselected candidate factors that could enhance the performance of test assets. This paper proposes a framework for factor model selection and testing by (i) selecting the optimal model that spans the joint efficient frontier of test assets and all candidate factors, and (ii) testing pricing performance on both test assets and unselected candidate factors. Our framework updates a baseline model (e.g., CAPM) sequentially by adding or removing factors based on asset pricing tests. Ensuring model selection consistency, our framework utilizes the asset pricing duality: minimizing cross-sectionally unexplained pricing errors aligns with maximizing the Sharpe ratio of the selected factor model. Empirical evidence shows that workhorse factor models fail asset pricing tests, whereas our proposed 8-factor model is not rejected and exhibits robust out-of-sample performance.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.10279
  37. By: Runze Li; Rui Zhou; David Pitt
    Abstract: High-frequency death counts are now widely available and contain timely information about intra-year mortality dynamics, but most stochastic mortality models are still estimated on annual data and therefore update only when annual totals are released. We propose a mixed-frequency state-space (MF–SS) extension of the Lee–Carter framework that jointly uses annual mortality rates and monthly death counts. The two series are linked through a shared latent monthly mortality factor, with the annual period factor defined as the intra-year average of the monthly factors. The latent monthly factor follows a seasonal ARIMA process, and parameters are estimated by maximum likelihood using an EM algorithm with Kalman filtering and smoothing. This setup enables real-time intra-year updates of the latent state and forecasts as new monthly observations arrive without re-estimating model parameters. Using U.S. data for ages 20–90 over 1999–2019, we evaluate intra-year annual nowcasts and one- to five-year-ahead forecasts. The MF–SS model produces both a direct annual forecast and an annual forecast implied by aggregating monthly projections. In our application, the aggregated monthly forecast is typically more accurate. Incorporating monthly information substantially improves intra-year annual nowcasts, especially after the first few months of the year. As a benchmark, we also fit separate annual and monthly Lee–Carter models and combine their forecasts using temporal reconciliation. Reconciliation improves these independent forecasts but adds little to MF–SS forecasts, consistent with MF–SS pooling information across frequencies during estimation. The MF–SS aggregated monthly forecasts generally outperform both unreconciled and temporally reconciled Lee–Carter forecasts and produce more cautious predictive intervals than the reconciled Lee–Carter approach.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.05702
  38. By: Paulo M.M. Rodrigues; Daniel Abreu
    Abstract: We extend the two-level factor model to account for cointegration between group-specific factors in large datasets. We propose two nonlinear specifications: (i) a threshold vector error correction model (VECM) that accounts for asymmetric responses across regimes; and (ii) a band VECM that captures discontinuous state-dependent adjustment which activates only when deviations from equilibrium exceed a certain threshold. We examine the small-sample performance of both models through Monte Carlo simulations. In an empirical application, we estimate a band factor VECM on a panel of government bond yields from multiple countries, estimating one global factor and two group-specific factors associated with long- and short-term maturities. The results provide evidence of a discontinuous adjustment in the global term structure of interest rates.
    JEL: E43 C38 C32
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ptu:wpaper:w202528
  39. By: Aditri
    Abstract: Value-at-Risk (VaR) estimation at high confidence levels is inherently a rare-event problem and is particularly sensitive to tail behavior and model misspecification. This paper studies the performance of two simulation-based VaR estimation approaches, importance sampling and discrete moment matching, under controlled tail misspecification. The analysis separates the nominal model used for estimator construction from the true data-generating process used for evaluation, allowing the effects of heavy-tailed returns to be examined in a transparent and reproducible setting. Daily returns of a broad equity market proxy are used to calibrate a nominal Gaussian model, while true returns are generated from Student-t distributions with varying degrees of freedom to represent increasingly heavy tails. Importance sampling is implemented via exponential tilting of the Gaussian model, and VaR is estimated through likelihood-weighted root-finding. Discrete moment matching constructs deterministic lower and upper VaR bounds by enforcing a finite number of moment constraints on a discretized loss distribution. The results demonstrate a clear trade-off between efficiency and robustness. Importance sampling produces low-variance VaR estimates under the nominal model but systematically underestimates the true VaR under heavy-tailed returns, with bias increasing at higher confidence levels and for thicker tails. In contrast, discrete moment matching yields conservative VaR bracketing that remains robust under tail misspecification. These findings highlight that variance reduction alone is insufficient for reliable tail risk estimation when model uncertainty is significant.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.09927
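    A minimal sketch of the importance-sampling leg under the nominal Gaussian model (the tilt size, sample size, and calibration values are our own choices; the paper's moment-matching bounds and misspecification experiments are not reproduced here):
```python
import numpy as np

def is_var_gaussian(mu, sigma, alpha=0.99, shift_sd=2.5, n=100_000, seed=0):
    """VaR of a nominal N(mu, sigma^2) loss via importance sampling with
    exponential tilting. Tilting a Gaussian by theta shifts its mean by
    theta*sigma^2; here we shift the sampling mean by `shift_sd` standard
    deviations and reweight by f/q = exp(theta*mu + 0.5*theta^2*sigma^2 - theta*x)."""
    rng = np.random.default_rng(seed)
    theta = shift_sd / sigma                          # tilt parameter
    x = rng.normal(mu + theta * sigma ** 2, sigma, size=n)
    w = np.exp(theta * mu + 0.5 * theta ** 2 * sigma ** 2 - theta * x)
    order = np.argsort(x)[::-1]                       # sort losses from largest down
    tail_prob = np.cumsum(w[order]) / n               # estimated P(L > x) along the sorted losses
    idx = np.searchsorted(tail_prob, 1 - alpha)       # first level where the tail prob reaches 1 - alpha
    return x[order][idx]

# nominal daily loss model (hypothetical calibration); compare with the Gaussian closed form
mu, sigma = 0.0, 0.01
print(is_var_gaussian(mu, sigma, alpha=0.99), mu + sigma * 2.326)
```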
  40. By: Zhimin Chen (Nanyang Business School, Nanyang Technological University); Bryan T. Kelly (Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)); Semyon Malamud (Ecole Polytechnique Federale de Lausanne; Centre for Economic Policy Research (CEPR); Swiss Finance Institute)
    Abstract: Machine learning (ML) methods are highly flexible, but their ability to approximate the true data-generating process is fundamentally constrained by finite samples. We characterize a universal lower bound, the Limits-to-Learning Gap (LLG), quantifying the unavoidable discrepancy between a model's empirical fit and the population benchmark. Recovering the true population R^2, therefore, requires correcting observed predictive performance by this bound. Using a broad set of variables, including excess returns, yields, credit spreads, and valuation ratios, we find that the implied LLGs are large. This indicates that standard ML approaches can substantially understate true predictability in financial data. We also derive LLG-based refinements to the classic Hansen and Jagannathan (1991) bounds, analyze implications for parameter learning in general-equilibrium settings, and show that the LLG provides a natural mechanism for generating excess volatility.
    Keywords: machine learning, asset pricing, predictability, big data, limits to learning, excess volatility, stochastic discount factor, kernel methods
    JEL: C13 C32 C55 C58 G12 G17
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:chf:rpseri:rp25106
  41. By: Arundeep Chinta; Lucas Vinh Tran; Jay Katukuri
    Abstract: Time Series Foundation Models (TSFMs) have emerged as a promising approach for zero-shot financial forecasting, demonstrating strong transferability and data efficiency gains. However, their adoption in financial applications is hindered by fundamental limitations in uncertainty quantification: current approaches either rely on restrictive distributional assumptions, conflate different sources of uncertainty, or lack principled calibration mechanisms. While recent TSFMs employ sophisticated techniques such as mixture models, Student's t-distributions, or conformal prediction, they fail to address the core challenge of providing theoretically-grounded uncertainty decomposition. For the very first time, we present a novel transformer-based probabilistic framework, ProbFM (probabilistic foundation model), that leverages Deep Evidential Regression (DER) to provide principled uncertainty quantification with explicit epistemic-aleatoric decomposition. Unlike existing approaches that pre-specify distributional forms or require sampling-based inference, ProbFM learns optimal uncertainty representations through higher-order evidence learning while maintaining single-pass computational efficiency. To rigorously evaluate the core DER uncertainty quantification approach independent of architectural complexity, we conduct an extensive controlled comparison study using a consistent LSTM architecture across five probabilistic methods: DER, Gaussian NLL, Student's-t NLL, Quantile Loss, and Conformal Prediction. Evaluation on cryptocurrency return forecasting demonstrates that DER maintains competitive forecasting accuracy while providing explicit epistemic-aleatoric uncertainty decomposition. This work establishes both an extensible framework for principled uncertainty quantification in foundation models and empirical evidence for DER's effectiveness in financial applications.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.10591
  42. By: L. J. Espinosa González; Erick Treviño Aguilar
    Abstract: In this paper we study the Fourier estimator of Malliavin and Mancino for the spot volatility. We establish the convergence of the trigonometric polynomial to the volatility's path in a setting that includes the following aspects. First, the volatility is required to satisfy a mild integrability condition, but is otherwise allowed to be unbounded. Second, the price process is assumed to have càdlàg paths, not necessarily continuous. We obtain convergence rates for the probability of a bad approximation of the estimated coefficients, at a speed that allows us to obtain almost sure convergence, and not just convergence in probability, of the estimated reconstruction of the volatility's path. This is a new result even in the setting of continuous paths. We prove that a rescaled trigonometric polynomial approximates the quadratic jump process.
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2601.09074

This nep-ecm issue is ©2026 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.