on Econometrics |
By: | Xiduo Chen; Xingdong Feng; Antonio F. Galvao; Yeheng Ge |
Abstract: | Obtaining valid treatment effect inferences remains a challenging problem when dealing with numerous instruments and non-sparse control variables. In this paper, we propose a novel ridge regularization-based instrumental variables method for estimation and inference in the presence of both high-dimensional instrumental variables and high-dimensional control variables. The method is applicable both with and without sparsity assumptions. To address the bias caused by high-dimensional instruments, we introduce a two-step procedure incorporating a data-splitting strategy. We establish statistical properties of the estimator, including consistency and asymptotic normality. Furthermore, we develop statistical inference procedures by providing a consistent estimator for the asymptotic variance of the estimator. The finite-sample performance of the proposed method is evaluated through numerical simulations. Results indicate that the new estimator consistently outperforms existing sparsity-based approaches across various settings, offering valuable insights for more complex scenarios. Finally, we provide an empirical application estimating the causal effect of schooling on earnings by addressing potential endogeneity through the use of high-dimensional instrumental variables and high-dimensional covariates. |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2503.20149 |
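To fix ideas, here is a minimal split-sample ridge-IV sketch under a linear model with one endogenous regressor, many instruments Z, and many controls X. The function name, the use of scikit-learn's Ridge, and the penalty value are illustrative assumptions; this is not the authors' two-step estimator or its bias correction.

```python
# A minimal split-sample ridge-IV sketch (illustrative only, not the authors'
# estimator): ridge first stage on one half, IV step on the other, then swap.
import numpy as np
from sklearn.linear_model import Ridge

def split_sample_ridge_iv(y, d, Z, X, alpha=1.0, seed=0):
    n = len(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    halves = (idx[: n // 2], idx[n // 2 :])
    estimates = []
    for a, b in (halves, halves[::-1]):
        # First stage on half A: predict the endogenous regressor from (Z, X).
        first = Ridge(alpha=alpha).fit(np.c_[Z[a], X[a]], d[a])
        d_hat = first.predict(np.c_[Z[b], X[b]])
        # Partial out controls on half B with ridge (residualise y, d, d_hat).
        def resid(v):
            return v - Ridge(alpha=alpha).fit(X[b], v).predict(X[b])
        ry, rd, rdh = resid(y[b]), resid(d[b]), resid(d_hat)
        # IV estimate: the residualised first-stage fit acts as the instrument.
        estimates.append(np.dot(rdh, ry) / np.dot(rdh, rd))
    return np.mean(estimates)
```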
By: | Jungbin Hwang (University of Connecticut); Gonzalo Valdés (Universidad de Tarapacá) |
Abstract: | This paper develops robust inference for conditional quantile regression (QR) under unknown forms of weak dependence in time series data. We first establish fixed-smoothing asymptotic theory for QR by showing that the long-run variance (LRV) estimator for the non-smooth QR score process weakly converges to a random matrix scaled by the true LRV. Additionally, QR-Wald statistics based on the kernel LRV estimator converge to non-standard limits, while using orthonormal series LRV estimators yields standard F and t limits. For the practical implementation of our new asymptotic theory for Wald and t inference in QR, we extend heteroskedasticity and autocorrelation robust (HAR) inference for conditional mean regression to QR and apply the optimal smoothing parameter selection rule based on the Neyman-Pearson principle. Monte Carlo simulation results show that our QR-HAR procedure reduces size distortions of the HAR inference based on the conditional mean regression and the QR-HAC inference, particularly in scenarios with moderate sample sizes, strong temporal dependence, and multiple parameters in the joint null hypothesis. |
Keywords: | Quantile regression, heteroskedasticity and autocorrelation robust, long-run variance, alternative asymptotics, testing-optimal smoothing parameter choice |
JEL: | C12 C19 C22 C32 |
Date: | 2025–02 |
URL: | https://d.repec.org/n?u=RePEc:uct:uconnp:2025-03 |
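The orthonormal-series LRV construction that delivers standard F and t limits can be sketched as follows for the QR score process. The cosine basis, the fixed number of basis functions K, and the statsmodels QuantReg call are illustrative assumptions; the paper's optimal smoothing-parameter selection rule is not reproduced.

```python
# Sketch: orthonormal-series (cosine basis) long-run variance estimate of the
# quantile-regression score process.  Illustrative; not the paper's full QR-HAR
# procedure (no smoothing-parameter optimisation).
import numpy as np
import statsmodels.api as sm

def qr_score_os_lrv(y, X, tau=0.5, K=8):
    T = len(y)
    Xc = sm.add_constant(X)
    beta = sm.QuantReg(y, Xc).fit(q=tau).params
    u = y - Xc @ beta
    psi = (tau - (u < 0)).reshape(-1, 1)        # quantile check-function score
    v = psi * Xc                                # score process, T x p
    v = v - v.mean(axis=0)                      # demean before projecting
    t = (np.arange(1, T + 1) - 0.5) / T
    lrv = np.zeros((Xc.shape[1], Xc.shape[1]))
    for j in range(1, K + 1):
        phi = np.sqrt(2.0) * np.cos(np.pi * j * t)   # orthonormal cosine basis
        Lam = (phi @ v) / np.sqrt(T)                 # projection coefficient
        lrv += np.outer(Lam, Lam)
    return lrv / K        # OS-LRV; fixed-K asymptotics give t/F limits with K dof
```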
By: | Jacob Dorn |
Abstract: | In the presence of sufficiently weak overlap, it is known that no regular root-n-consistent estimators exist and standard estimators may fail to be asymptotically normal. This paper shows that a thresholded version of the standard doubly robust estimator is asymptotically normal with well-calibrated Wald confidence intervals even when constructed using nonparametric estimates of the propensity score and conditional mean outcome. The analysis implies a cost of weak overlap in terms of black-box nuisance rates, borne when the semiparametric bound is infinite, and the contribution of outcome smoothness to the outcome regression rate, which is incurred even when the semiparametric bound is finite. As a byproduct of this analysis, I show that under weak overlap, the optimal global regression rate is the same as the optimal pointwise regression rate, without the usual polylogarithmic penalty. The high-level conditions yield new rules of thumb for thresholding in practice. In simulations, thresholded AIPW can exhibit moderate overrejection in small samples, but I am unable to reject a null hypothesis of exact coverage in large samples. In an empirical application, the clipped AIPW estimator that targets the standard average treatment effect yields similar precision to a heuristic 10% fixed-trimming approach that changes the target sample. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.13273 |
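A minimal sketch of a thresholded (clipped) AIPW estimator of the ATE, assuming gradient-boosting nuisance estimates and a fixed clipping threshold; cross-fitting and the paper's rules of thumb for choosing the threshold are omitted.

```python
# Minimal sketch of a thresholded ("clipped") AIPW / doubly robust ATE estimator:
# propensity scores are truncated at a threshold eps before weighting.  The
# nuisance learners are placeholders and are fitted in-sample for brevity.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

def clipped_aipw(y, d, X, eps=0.05):
    ps = GradientBoostingClassifier().fit(X, d).predict_proba(X)[:, 1]
    ps = np.clip(ps, eps, 1 - eps)                      # thresholding step
    mu1 = GradientBoostingRegressor().fit(X[d == 1], y[d == 1]).predict(X)
    mu0 = GradientBoostingRegressor().fit(X[d == 0], y[d == 0]).predict(X)
    phi = (mu1 - mu0
           + d * (y - mu1) / ps
           - (1 - d) * (y - mu0) / (1 - ps))            # influence-function terms
    ate = phi.mean()
    se = phi.std(ddof=1) / np.sqrt(len(y))              # Wald standard error
    return ate, se
```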
By: | Masahiro Tanaka |
Abstract: | While local projections (LPs) are widely used for impulse response analysis, existing Bayesian approaches face fundamental challenges because a set of LPs does not constitute a likelihood function. Prior studies address this issue by constructing a pseudo-likelihood, either by treating LPs as a system of seemingly unrelated regressions with a multivariate normal error structure or by applying a quasi-Bayesian approach with a sandwich estimator. However, these methods lead to posterior distributions that are not "well calibrated", preventing proper Bayesian belief updates and complicating the interpretation of posterior distributions. To resolve these issues, we propose a novel quasi-Bayesian approach for inferring LPs using the Laplace-type estimator. Specifically, we construct a quasi-likelihood based on a generalized method of moments criterion, which avoids restrictive distributional assumptions and provides well-calibrated inferences. The proposed framework enables the estimation of simultaneous credible bands and naturally extends to LPs with an instrumental variable, offering the first Bayesian treatment of this method. Furthermore, we introduce two posterior simulators capable of handling the high-dimensional parameter space of LPs with the Laplace-type estimator. We demonstrate the effectiveness of our approach through extensive Monte Carlo simulations and an empirical application to U.S. monetary policy. |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2503.20249 |
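A generic sketch of the Laplace-type (quasi-Bayesian) estimator idea: a GMM criterion defines a quasi-likelihood that is sampled with random-walk Metropolis. The moment function, weight matrix, step size, and flat prior are placeholders; the paper's LP moment conditions and its two posterior simulators are not shown.

```python
# Sketch of a Laplace-type (quasi-Bayesian) estimator: a GMM criterion defines a
# quasi-likelihood, sampled with random-walk Metropolis.  The moment function g
# is a user-supplied placeholder.
import numpy as np

def lte_sampler(g, theta0, data, W, n_draws=5000, step=0.05, seed=0):
    """g(theta, data) -> (T, m) array of moment contributions."""
    rng = np.random.default_rng(seed)
    def quasi_loglik(theta):
        gm = g(theta, data)
        gbar = gm.mean(axis=0)
        return -0.5 * len(gm) * gbar @ W @ gbar       # L_T = -T/2 * gbar' W gbar
    theta = np.asarray(theta0, float)
    ll = quasi_loglik(theta)
    draws = []
    for _ in range(n_draws):
        prop = theta + step * rng.standard_normal(theta.shape)
        ll_prop = quasi_loglik(prop)
        if np.log(rng.uniform()) < ll_prop - ll:       # flat prior assumed
            theta, ll = prop, ll_prop
        draws.append(theta.copy())
    return np.array(draws)    # quantiles give quasi-posterior credible bands
```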
By: | Badi H. Baltagi (Center for Policy Research, Maxwell School, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244); Long Liu (Florida Atlantic University) |
Abstract: | This paper revisits the fixed effects panel data model with AR(1) remainder disturbances and provides a bias-corrected estimator for the serial correlation coefficient based on first differencing the panel regression to remove the fixed effects. This bias-corrected estimator builds upon the estimator proposed by Han and Phillips (2010). Asymptotic properties as well as Monte Carlo results are provided that show the improved performance of the newly proposed bias-corrected estimator. The approach is extended to the unbalanced panel data case and also illustrated using the empirical application in Donohue and Levitt (2001). |
Keywords: | Panel Data, Serial Correlation, Generalized Least Squares, Fixed Effects, First Difference, Nonstationarity |
JEL: | C23 C24 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:max:cprwps:267 |
By: | Ben Deaner; Chen-Wei Hsiang; Andrei Zeleneev |
Abstract: | The presence of unobserved confounders is one of the main challenges in identifying treatment effects. In this paper, we propose a new approach to causal inference using panel data with large $N$ and $T$. Our approach imputes the untreated potential outcomes for treated units using the outcomes for untreated individuals with similar values of the latent confounders. In order to find units with similar latent characteristics, we utilize long pre-treatment histories of the outcomes. Our analysis is based on a nonparametric, nonlinear, and nonseparable factor model for untreated potential outcomes and treatments. The model satisfies minimal smoothness requirements. We impute both missing counterfactual outcomes and propensity scores using kernel smoothing based on the constructed measure of latent similarity between units, and demonstrate that our estimates can achieve the optimal nonparametric rate of convergence up to log terms. Using these estimates, we construct a doubly robust estimator of the period-specific average treatment effect on the treated (ATT), and provide conditions under which this estimator is $\sqrt{N}$-consistent, asymptotically normal, and unbiased. Our simulation study demonstrates that our method provides accurate inference for a wide range of data generating processes. |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2503.20769 |
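The imputation idea can be sketched crudely: weight untreated units by a kernel in the distance between pre-treatment outcome histories and average their post-period outcomes. The Euclidean history distance and Gaussian kernel below are stand-ins for the paper's latent-similarity measure, and the doubly robust step is omitted.

```python
# Sketch: impute untreated potential outcomes of treated units by kernel-weighting
# untreated units whose pre-treatment outcome histories are close.  A crude proxy
# for the paper's latent-similarity construction, for illustration only.
import numpy as np

def impute_y0(Y_pre, y_post, treated, bandwidth=1.0):
    """Y_pre: (N, T0) pre-treatment outcomes; y_post: (N,) post-period outcome;
    treated: (N,) boolean indicator."""
    donors = ~treated
    imputed = np.full(treated.sum(), np.nan)
    for i, idx in enumerate(np.where(treated)[0]):
        dist = np.mean((Y_pre[donors] - Y_pre[idx]) ** 2, axis=1)  # history distance
        w = np.exp(-dist / (2 * bandwidth ** 2))                   # Gaussian kernel
        imputed[i] = np.sum(w * y_post[donors]) / np.sum(w)
    att = y_post[treated].mean() - imputed.mean()   # naive plug-in ATT, no DR step
    return imputed, att
```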
By: | Cen, Zetai; Lam, Clifford |
Abstract: | We propose tensor time series imputation when the missing pattern in the tensor data can be general, as long as any two data positions along a tensor fibre are both observed for enough time points. The method is based on a tensor time series factor model with Tucker decomposition of the common component. One distinguishing feature of the tensor time series factor model used is that there can be weak factors in the factor loading matrix for each mode. This better reflects reality, since real data can have weak factors which drive only groups of observed variables, for instance, a sector factor in a financial market driving only stocks in a particular sector. Using the data with missing entries, asymptotic normality is derived for rows of estimated factor loadings, while consistent covariance matrix estimation enables us to carry out inferences. As a first in the literature, we also propose a ratio-based estimator for the rank of the core tensor under general missing patterns. Rates of convergence are spelt out for the imputations from the estimated tensor factor models. Simulation results show that our imputation procedure works well, with asymptotic normality and corresponding inferences also demonstrated. Re-imputation performance is also gauged, and we demonstrate that using a slightly larger rank than the estimated one gives superior re-imputation performance. A Fama–French portfolio example with matrix returns and an OECD data example with a matrix of economic indicators are presented and analysed, showing the efficacy of our imputation approach compared to direct vector imputation. |
Keywords: | generalized cross-covariance matrix; tensor unfolding; core tensor; α-mixing time series variables; missingness tensor |
JEL: | C1 J1 |
Date: | 2025–05–31 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:127231 |
By: | Matthew Read (Reserve Bank of Australia); Dan Zhu (Department of Econometrics and Business Statistics, Monash University) |
Abstract: | We propose algorithms for conducting Bayesian inference in structural vector autoregressions identified using sign restrictions. The key feature of our approach is a sampling step based on 'soft' sign restrictions. This step draws from a target density that smoothly penalises parameter values violating the restrictions, facilitating the use of computationally efficient Markov chain Monte Carlo sampling algorithms. An importance-sampling step yields draws from the desired distribution conditional on the 'hard' sign restrictions. Relative to standard accept-reject sampling, the method substantially improves computational efficiency when identification is 'tight'. It can also greatly reduce the computational burden of implementing prior-robust Bayesian methods. We illustrate the broad applicability of the approach in a model of the global oil market identified using a rich set of sign, elasticity and narrative restrictions. |
Keywords: | Bayesian inference; Markov chain Monte Carlo; oil market; sign restrictions; structural vector autoregression |
JEL: | C32 Q35 Q43 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:rba:rbardp:rdp2025-03 |
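A toy illustration of the soft-then-hard logic: sample from a target that smoothly penalises sign violations, then importance-reweight to the hard restrictions. The Gaussian "posterior", the sigmoid penalty, and the two-dimensional parameter are placeholders, not an SVAR posterior.

```python
# Toy sketch of the "soft-then-hard" idea: random-walk Metropolis on a target that
# smoothly penalises sign-restriction violations, followed by importance weights
# that recover the hard restrictions.
import numpy as np

def soft_then_hard_sampler(n_draws=20000, lam=0.1, seed=0):
    rng = np.random.default_rng(seed)
    log_post = lambda th: -0.5 * np.sum(th ** 2)                  # placeholder posterior
    log_soft = lambda th: -np.sum(np.logaddexp(0.0, -th / lam))   # log sigmoid penalty
    th = np.ones(2)
    lp = log_post(th) + log_soft(th)
    draws, weights = [], []
    for _ in range(n_draws):
        prop = th + 0.5 * rng.standard_normal(2)
        lp_prop = log_post(prop) + log_soft(prop)
        if np.log(rng.uniform()) < lp_prop - lp:                  # random-walk Metropolis
            th, lp = prop, lp_prop
        draws.append(th.copy())
        # Importance weight: hard sign indicator divided by the soft penalty.
        weights.append(float(np.all(th >= 0)) * np.exp(-log_soft(th)))
    return np.array(draws), np.array(weights)
```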
By: | Castellanos, Juan (Bank of England) |
Abstract: | This paper conducts a Monte Carlo study to examine the small sample performance of impulse response (IRF) matching and Indirect Inference estimators that target IRFs that have been estimated with Local Projections (LP) or Vector Autoregressions (VAR). The analysis considers various identification schemes for the shocks and several variants of LP and VAR estimators. Results show that the lower bias from LP responses is a big advantage when it comes to IRF matching, while the lower variance from VAR is desirable for Indirect Inference applications as it is robust to the higher bias of VAR-IRFs. Overall, I recommend the use of Indirect Inference over IRF matching when estimating Dynamic Stochastic General Equilibrium (DSGE) models as the former is robust to potential misspecification coming from invalid identification assumptions, small sample issues or incorrect lag selection. |
Keywords: | DSGE estimation; impulse responses; Indirect Inference; Local Projection; Vector Autoregression; Monte Carlo analysis |
JEL: | C13 C15 E00 |
Date: | 2025–02–14 |
URL: | https://d.repec.org/n?u=RePEc:boe:boeewp:1116 |
By: | Uwe Hassler; Marc-Oliver Pohle; Tanja Zahn |
Abstract: | Sample autocorrelograms typically come with significance bands (non-rejection regions) for the null hypothesis of temporal independence. These bands have two shortcomings. First, they build on pointwise intervals and suffer from joint undercoverage (overrejection) under the null hypothesis. Second, if this null is clearly violated, one would instead prefer confidence bands that quantify estimation uncertainty. We propose and discuss both simultaneous significance bands and simultaneous confidence bands for time series and series of regression residuals. They are as easy to construct as their pointwise counterparts and at the same time provide an intuitive and visual quantification of sampling uncertainty as well as valid statistical inference. For regression residuals, we show that for static regressions the asymptotic variances underlying the construction of the bands are the same as for observed time series, and for dynamic regressions (with lagged endogenous regressors) we show how they need to be adjusted. We study theoretical properties of simultaneous significance bands and two types of simultaneous confidence bands (sup-t and Bonferroni) and analyse their finite-sample performance in a simulation study. Finally, we illustrate the use of the bands in an application to monthly US inflation and residuals from Phillips curve regressions. |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2503.18560 |
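Under the i.i.d. null, the pointwise and Bonferroni simultaneous significance bands for the first K sample autocorrelations can be computed as below; the sup-t bands and the residual-regression adjustments discussed in the paper are not reproduced.

```python
# Sketch: pointwise vs. Bonferroni simultaneous significance bands for a sample
# autocorrelogram under the i.i.d. null (asymptotic variance 1/T at each lag).
import numpy as np
from scipy import stats

def acf_with_bands(x, K=20, alpha=0.05):
    x = np.asarray(x, float) - np.mean(x)
    T = len(x)
    acf = np.array([np.dot(x[:-k], x[k:]) for k in range(1, K + 1)]) / np.dot(x, x)
    pointwise = stats.norm.ppf(1 - alpha / 2) / np.sqrt(T)
    simultaneous = stats.norm.ppf(1 - alpha / (2 * K)) / np.sqrt(T)   # Bonferroni
    return acf, pointwise, simultaneous
```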
By: | Gregor Steiner; Mark Steel |
Abstract: | Instrumental variables are a popular tool to infer causal effects under unobserved confounding, but choosing suitable instruments is challenging in practice. We propose gIVBMA, a Bayesian model averaging procedure that addresses this challenge by averaging across different sets of instrumental variables and covariates in a structural equation model. Our approach extends previous work through a scale-invariant prior structure and accommodates non-Gaussian outcomes and treatments, offering greater flexibility than existing methods. The computational strategy uses conditional Bayes factors to update models separately for the outcome and treatments. We prove that this model selection procedure is consistent. By explicitly accounting for model uncertainty, gIVBMA allows instruments and covariates to switch roles and provides robustness against invalid instruments. In simulation experiments, gIVBMA outperforms current state-of-the-art methods. We demonstrate its usefulness in two empirical applications: the effects of malaria and institutions on income per capita and the returns to schooling. A software implementation of gIVBMA is available in Julia. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.13520 |
By: | Brendan Kline; Matthew A. Masten |
Abstract: | We develop an approach to sensitivity analysis that uses design distributions to calibrate sensitivity parameters in a finite population model. We use this approach to (1) give a new formal analysis of the role of randomization, (2) provide a new motivation for examining covariate balance, and (3) show how to construct design-based confidence intervals for the average treatment effect, which allow for heterogeneous treatment effects but do not rely on asymptotics. This approach to confidence interval construction relies on partial identification analysis rather than hypothesis test inversion. Moreover, these intervals also have a non-frequentist, identification-based interpretation. We illustrate our approach in three empirical applications. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.14127 |
By: | Jushan Bai; Pablo Mones |
Abstract: | This paper examines the problem of global identification in dynamic panel models with interactive effects, a fundamental issue in econometric theory. We focus on the setting where the number of cross-sectional units (N) is large, but the time dimension (T) remains fixed. While local identification based on the Jacobian matrix is well understood and relatively straightforward to establish, achieving global identification remains a significant challenge. Under a set of mild and easily satisfied conditions, we demonstrate that the parameters of the model are globally identified, ensuring that no two distinct parameter values generate the same probability distribution of the observed data. Our findings contribute to the broader literature on identification in panel data models and have important implications for empirical research that relies on interactive effects. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.14354 |
By: | Christophe Bruneel-Zupanc |
Abstract: | This paper develops a general framework for dynamic models in which individuals simultaneously make both discrete and continuous choices. The framework incorporates a wide range of unobserved heterogeneity. I show that such models are nonparametrically identified. Based on constructive identification arguments, I build a novel two-step estimation method in the lineage of Hotz and Miller (1993) and Arcidiacono and Miller (2011) but extended to simultaneous discrete-continuous choice. In the first step, I recover the (type-dependent) optimal choices with an expectation-maximization algorithm and instrumental variable quantile regression. In the second step, I estimate the primitives of the model taking the estimated optimal choices as given. The method is especially attractive for complex dynamic models because it significantly reduces the computational burden associated with their estimation compared to alternative full solution methods. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.16630 |
By: | Monica Billio; Roberto Casarin; Fausto Corradin; Antonio Peruzzi |
Abstract: | Bayes Factor (BF) is one of the tools used in Bayesian analysis for model selection. The predictive BF finds application in detecting outliers, which are relevant sources of estimation and forecast errors. An efficient framework for outlier detection, purposely designed for large multidimensional datasets, is provided. Online detection and analytical tractability guarantee the procedure's efficiency. The proposed sequential Bayesian monitoring extends the univariate setup to a matrix-variate one. Prior perturbation based on power discounting is applied to obtain tractable predictive BFs. This way, computationally intensive procedures used in Bayesian analysis are not required. The conditions leading to inconclusive responses in outlier identification are derived, and some robust approaches are proposed that exploit the predictive BF's variability to improve the standard discounting method. The effectiveness of the procedure is studied using simulated data. An illustration is provided through applications to relevant benchmark datasets from macroeconomics and finance. |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2503.19515 |
By: | Alessandro Morico; Ovidijus Stauskas |
Abstract: | We present four novel tests of equal predictive accuracy and encompassing for out-of-sample forecasts based on factor-augmented regression. We extend the work of Pitarakis (2023a, b) to develop the inferential theory of predictive regressions with generated regressors estimated using Common Correlated Effects (henceforth CCE) - a technique that utilizes cross-sectional averages of grouped series. It is particularly useful since large datasets of such structure are becoming increasingly popular. Under our framework, CCE-based tests are asymptotically normal and robust to overspecification of the number of factors, which is in stark contrast to existing methodologies in the CCE context. Our tests are highly applicable in practice as they accommodate different predictor types (e.g., stationary and highly persistent factors), and remain invariant to the location of structural breaks in loadings. Extensive Monte Carlo simulations indicate that our tests exhibit excellent local power properties. Finally, we apply our tests to a novel EA-MD-QD dataset by Barigozzi et al. (2024b), which covers the Euro Area as a whole and its primary member countries. We demonstrate that CCE factors offer substantial predictive power even under varying data persistence and structural breaks. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.08455 |
By: | Achim Ahrens; Victor Chernozhukov; Christian Hansen; Damian Kozbur; Mark Schaffer; Thomas Wiemann |
Abstract: | This paper provides a practical introduction to Double/Debiased Machine Learning (DML). DML provides a general approach to performing inference about a target parameter in the presence of nuisance parameters. The aim of DML is to reduce the impact of nuisance parameter estimation on estimators of the parameter of interest. We describe DML and its two essential components: Neyman orthogonality and cross-fitting. We highlight that DML reduces functional form dependence and accommodates the use of complex data types, such as text data. We illustrate its application through three empirical examples that demonstrate DML's applicability in cross-sectional and panel settings. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.08324 |
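A textbook DML sketch for the partially linear model, showing the two essential components named in the abstract: cross-fitting of the nuisance predictions and the Neyman-orthogonal (partialling-out) score. The random-forest learners and fold count are illustrative choices.

```python
# DML sketch for the partially linear model Y = D*theta + g(X) + e:
# cross-fitted nuisance predictions plus the orthogonal partialling-out score.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr(y, d, X, n_folds=5, seed=0):
    y_res = np.zeros(len(y))
    d_res = np.zeros(len(d))
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        # Cross-fitting: nuisances fitted on the training folds only.
        y_res[test] = y[test] - RandomForestRegressor().fit(X[train], y[train]).predict(X[test])
        d_res[test] = d[test] - RandomForestRegressor().fit(X[train], d[train]).predict(X[test])
    theta = np.dot(d_res, y_res) / np.dot(d_res, d_res)    # orthogonal score solution
    psi = d_res * (y_res - d_res * theta)
    se = np.sqrt(np.mean(psi ** 2) / np.mean(d_res ** 2) ** 2 / len(y))
    return theta, se
```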
By: | Jan Rabenseifner; Sven Klaassen; Jannis Kueck; Philipp Bach |
Abstract: | The partitioning of data for estimation and calibration critically impacts the performance of propensity score-based estimators like inverse probability weighting (IPW) and double/debiased machine learning (DML) frameworks. We extend recent advances in calibration techniques for propensity score estimation, improving the robustness of propensity scores in challenging settings such as limited overlap, small sample sizes, or unbalanced data. Our contributions are twofold: First, we provide a theoretical analysis of the properties of calibrated estimators in the context of DML. To this end, we refine existing calibration frameworks for propensity score models, with a particular emphasis on the role of sample-splitting schemes in ensuring valid causal inference. Second, through extensive simulations, we show that calibration reduces the variance of estimators based on inverse propensity scores while also mitigating bias in IPW, even in small-sample regimes. Notably, calibration improves stability for flexible learners (e.g., gradient boosting) while preserving the doubly robust properties of DML. A key insight is that, even when methods perform well without calibration, incorporating a calibration step does not degrade performance, provided that an appropriate sample-splitting approach is chosen. |
Date: | 2025–03 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2503.17290 |
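One possible calibration-plus-splitting scheme, for illustration: fit the propensity model on one split, calibrate it by isotonic regression on the other, then form the IPW estimate. This is a sketch of the general idea, not the paper's exact procedure; the learner, split proportion, and clipping bounds are assumptions.

```python
# Sketch: calibrate propensity scores on a held-out split (isotonic regression)
# before inverse probability weighting.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

def calibrated_ipw(y, d, X, seed=0):
    idx = np.arange(len(y))
    fit_idx, cal_idx = train_test_split(idx, test_size=0.5, random_state=seed)
    clf = GradientBoostingClassifier().fit(X[fit_idx], d[fit_idx])
    raw = clf.predict_proba(X)[:, 1]
    # Calibration step: map raw scores to calibrated probabilities on held-out data.
    iso = IsotonicRegression(out_of_bounds="clip").fit(raw[cal_idx], d[cal_idx])
    ps = np.clip(iso.predict(raw), 1e-3, 1 - 1e-3)
    ate = np.mean(d * y / ps - (1 - d) * y / (1 - ps))     # Horvitz-Thompson IPW
    return ate
```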
By: | Chris Hays; Manish Raghavan |
Abstract: | Researchers and practitioners often wish to measure treatment effects in settings where units interact via markets and recommendation systems. In these settings, units are affected by certain shared states, like prices, algorithmic recommendations or social signals. We formalize this structure, calling it shared-state interference, and argue that our formulation captures many relevant applied settings. Our key modeling assumption is that individuals' potential outcomes are independent conditional on the shared state. We then prove an extension of a double machine learning (DML) theorem providing conditions for achieving efficient inference under shared-state interference. We also instantiate our general theorem in several models of interest where it is possible to efficiently estimate the average direct effect (ADE) or global average treatment effect (GATE). |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.08836 |
By: | Markus Bibinger; Jun Yu; Chen Zhang |
Abstract: | A multivariate fractional Brownian motion (mfBm) with component-wise Hurst exponents is used to model and forecast realized volatility. We investigate the interplay between correlation coefficients and Hurst exponents and propose a novel estimation method for all model parameters, establishing consistency and asymptotic normality of the estimators. Additionally, we develop a time-reversibility test, which is typically not rejected by real volatility data. When the data-generating process is a time-reversible mfBm, we derive optimal forecasting formulae and analyze their properties. A key insight is that an mfBm with different Hurst exponents and non-zero correlations can reduce forecasting errors compared to a one-dimensional model. Consistent with optimal forecasting theory, out-of-sample forecasts using the time-reversible mfBm show improvements over univariate fBm, particularly when the estimated Hurst exponents differ significantly. Empirical results demonstrate that mfBm-based forecasts outperform the (vector) HAR model. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.15985 |
By: | Alfonzetti, Giuseppe; Bellio, Ruggero; Chen, Yunxiao; Moustaki, Irini |
Abstract: | Pairwise likelihood is a limited-information method widely used to estimate latent variable models, including factor analysis of categorical data. It can often avoid evaluating high-dimensional integrals and, thus, is computationally more efficient than relying on the full likelihood. Despite its computational advantage, the pairwise likelihood approach can still be demanding for large-scale problems that involve many observed variables. We tackle this challenge by employing an approximation of the pairwise likelihood estimator, which is derived from an optimization procedure relying on stochastic gradients. The stochastic gradients are constructed by subsampling the pairwise log-likelihood contributions, for which the subsampling scheme controls the per-iteration computational complexity. The stochastic estimator is shown to be asymptotically equivalent to the pairwise likelihood one. However, finite-sample performance can be improved by compounding the sampling variability of the data with the uncertainty introduced by the subsampling scheme. We demonstrate the performance of the proposed method using simulation studies and two real data applications. |
Keywords: | item factor analysis; structural equation models; composite likelihood; asymptotic normality; stochastic gradient descent |
JEL: | C1 |
Date: | 2025–02–28 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:122638 |
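The subsampling idea can be sketched generically: each iteration averages the gradients of a random subsample of pairwise log-likelihood contributions. The gradient function is a user-supplied placeholder (e.g., the bivariate contribution of an item factor model), and the learning-rate schedule is illustrative.

```python
# Generic sketch of subsampled pairwise-likelihood estimation: stochastic gradients
# are built from a random subsample of variable pairs at each iteration.
import itertools
import numpy as np

def stochastic_pairwise_fit(grad_pair_loglik, theta0, n_items, data,
                            n_iter=500, pairs_per_iter=50, lr=0.01, seed=0):
    """grad_pair_loglik(theta, data, i, j) -> gradient of the (i, j) pair's
    log-likelihood contribution with respect to theta."""
    rng = np.random.default_rng(seed)
    all_pairs = list(itertools.combinations(range(n_items), 2))
    theta = np.asarray(theta0, float)
    for _ in range(n_iter):
        idx = rng.choice(len(all_pairs), size=pairs_per_iter, replace=False)
        grad = np.zeros_like(theta)
        for k in idx:
            i, j = all_pairs[k]
            grad += grad_pair_loglik(theta, data, i, j)
        # Rescale so the subsample gradient is unbiased for the full pairwise gradient.
        theta += lr * grad * len(all_pairs) / pairs_per_iter
    return theta
```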
By: | Luo, Lan; Shi, Chengchun; Wang, Jitao; Wu, Zhenke; Li, Lexin |
Abstract: | Mediation analysis is an important analytic tool commonly used in a broad range of scientific applications. In this article, we study the problem of mediation analysis when there are multivariate and conditionally dependent mediators, and when the variables are observed over multiple time points. The problem is challenging, because the effect of a mediator involves not only the path from the treatment to this mediator itself at the current time point, but also all possible paths pointing to this mediator from its upstream mediators, as well as the carryover effects from all previous time points. We propose a novel multivariate dynamic mediation analysis approach. Drawing inspiration from the Markov decision process model that is frequently employed in reinforcement learning, we introduce a Markov mediation process paired with a system of time-varying linear structural equation models to formulate the problem. We then formally define the individual mediation effect, built upon the idea of simultaneous interventions and intervention calculus. We next derive the closed-form expression, propose an iterative estimation procedure under the Markov mediation process model, and develop a bootstrap method to infer the individual mediation effect. We study both the asymptotic properties and the empirical performance of the proposed methodology, and further illustrate its usefulness with a mobile health application. |
JEL: | C1 |
Date: | 2025–02–01 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:127112 |
By: | Riccardo Della Vecchia (Scool, Centre Inria de l'Université de Lille; CRIStAL UMR 9189, Centrale Lille, Université de Lille, CNRS); Debabrota Basu (Scool, Centre Inria de l'Université de Lille; CRIStAL UMR 9189, Centrale Lille, Université de Lille, CNRS) |
Abstract: | The independence of noise and covariates is a standard assumption in online linear regression with unbounded noise and in the linear bandit literature. This assumption and the following analysis are invalid in the case of endogeneity, i.e., when the noise and covariates are correlated. In this paper, we study the online setting of Instrumental Variable (IV) regression, which is widely used in economics to identify the underlying model from an endogenous dataset. Specifically, we upper bound the identification and oracle regrets of the popular Two-Stage Least Squares (2SLS) approach to IV regression, but in the online setting. Our analysis shows that Online 2SLS (O2SLS) achieves $\mathcal O(d^2\log^2 T)$ identification and $\mathcal O(\gamma \sqrt{d T \log T})$ oracle regret after $T$ interactions, where $d$ is the dimension of covariates and $\gamma$ is the bias due to endogeneity. Then, we leverage O2SLS as an oracle to design OFUL-IV, a linear bandit algorithm. OFUL-IV can tackle endogeneity and achieves $\mathcal O(d\sqrt{T}\log T)$ regret. For different datasets with endogeneity, we experimentally demonstrate the efficiency of O2SLS and OFUL-IV. |
Keywords: | Causality, Instrumental Variables, Online linear regression, Online learning, Bandit / imperfect feedback, Linear bandits, Regret Bounds, Econometrics, Two-stage regression |
Date: | 2025–02 |
URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-03831210 |
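A streaming 2SLS sketch in the spirit of O2SLS: cross-moment matrices are accumulated online and the estimate re-solved on demand. The small ridge term and the class interface are illustrative; the paper's regret analysis and the OFUL-IV bandit are not shown.

```python
# Sketch of an online two-stage least squares (2SLS) update: accumulate the
# cross-moment matrices as (z_t, x_t, y_t) stream in and re-solve when needed.
import numpy as np

class Online2SLS:
    def __init__(self, dz, dx, ridge=1e-6):
        self.Szz = ridge * np.eye(dz)   # running sum of z z'
        self.Szx = np.zeros((dz, dx))   # running sum of z x'
        self.Szy = np.zeros(dz)         # running sum of z y

    def update(self, z, x, y):
        self.Szz += np.outer(z, z)
        self.Szx += np.outer(z, x)
        self.Szy += z * y

    def estimate(self):
        # beta = (X'P_Z X)^{-1} X'P_Z y, built from the running moments.
        A = self.Szx.T @ np.linalg.solve(self.Szz, self.Szx)
        b = self.Szx.T @ np.linalg.solve(self.Szz, self.Szy)
        return np.linalg.solve(A, b)
```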
By: | Xiyue Han; Alexander Schied |
Abstract: | In [8], an easily computable scale-invariant estimator $\widehat{\mathscr{R}}^s_n$ was constructed to estimate the Hurst parameter of the drifted fractional Brownian motion $X$ from its antiderivative. This paper extends this convergence result by proving that $\widehat{\mathscr{R}}^s_n$ also consistently estimates the Hurst parameter when applied to the antiderivative of $g \circ X$ for a general nonlinear function $g$. We also establish an almost sure rate of convergence in this general setting. Our result applies, in particular, to the estimation of the Hurst parameter of a wide class of rough stochastic volatility models from discrete observations of the integrated variance, including the fractional stochastic volatility model. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.09276 |
By: | Jingyi Wei; Steve Yang; Zhenyu Cui |
Abstract: | In this study, we propose a novel integrated Generalized Autoregressive Conditional Heteroskedasticity-Gated Recurrent Unit (GARCH-GRU) model for financial volatility modeling and forecasting. The model embeds the GARCH(1, 1) formulation directly into the GRU cell architecture, yielding a unified recurrent unit that jointly captures both traditional econometric properties and complex temporal dynamics. This hybrid structure leverages the strengths of GARCH in modeling key stylized facts of financial volatility, such as clustering and persistence, while utilizing the GRU's capacity to learn nonlinear dependencies from sequential data. Compared to the GARCH-LSTM counterpart, the GARCH-GRU model demonstrates superior computational efficiency, requiring significantly less training time, while maintaining and improving forecasting accuracy. Empirical evaluation across multiple financial datasets confirms the model's robust outperformance in terms of mean squared error (MSE) and mean absolute error (MAE) relative to a range of benchmarks, including standard neural networks, alternative hybrid architectures, and classical GARCH-type models. As an application, we compute Value-at-Risk (VaR) using the model's volatility forecasts and observe lower violation ratios, further validating the predictive reliability of the proposed framework in practical risk management settings. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.09380 |
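A deliberately simplified hybrid, for illustration only: a GARCH(1,1) variance recursion feeds a GRU cell that outputs a volatility forecast. This is not the paper's architecture, which embeds the GARCH recursion inside the GRU cell itself; the hidden size and parameter initialisations are arbitrary.

```python
# Simplified GARCH + GRU hybrid sketch (not the paper's architecture): the
# GARCH(1,1) recursion is computed alongside the returns and fed to a GRU cell.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GarchGruSketch(nn.Module):
    def __init__(self, hidden_size=16):
        super().__init__()
        self.cell = nn.GRUCell(2, hidden_size)            # input: [return, garch var]
        self.head = nn.Linear(hidden_size, 1)
        self.omega = nn.Parameter(torch.tensor(0.05))
        self.alpha = nn.Parameter(torch.tensor(0.05))
        self.beta = nn.Parameter(torch.tensor(0.90))

    def forward(self, returns):                            # returns: (T,) tensor
        h = torch.zeros(1, self.cell.hidden_size)
        sigma2 = returns.var().reshape(1, 1)
        forecasts = []
        for r in returns:
            r = r.reshape(1, 1)
            # GARCH(1,1) recursion with positivity enforced via softplus
            sigma2 = (F.softplus(self.omega) + F.softplus(self.alpha) * r ** 2
                      + F.softplus(self.beta) * sigma2)
            h = self.cell(torch.cat([r, sigma2], dim=1), h)
            forecasts.append(F.softplus(self.head(h)))     # next-period variance
        return torch.cat(forecasts).squeeze(-1)
```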
By: | Duncan K. Foley; Ellis Scharfenaker |
Abstract: | Bayes' theorem incorporates distinct types of information through the likelihood and prior. Direct observations of state variables enter the likelihood and modify posterior probabilities through consistent updating. Information in terms of expected values of state variables modifies posterior probabilities by constraining prior probabilities to be consistent with the information. Constraints on the prior can be exact, limiting hypothetical frequency distributions to only those that satisfy the constraints, or approximate, allowing residual deviations from the exact constraint to some degree of tolerance. When the model parameters and constraint tolerances are known, posterior probability follows directly from Bayes' theorem. When parameters and tolerances are unknown, a prior for them must be specified. When the system is close to statistical equilibrium, the computation of posterior probabilities is simplified due to the concentration of the prior on the maximum entropy hypothesis. The relationship between maximum entropy reasoning and Bayes' theorem from this point of view is that maximum entropy reasoning is a special case of Bayesian inference with a constrained entropy-favoring prior. |
Keywords: | Bayesian inference, Maximum entropy, Priors, Information theory, Statistical equilibrium |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:uta:papers:2024-03 |
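The constrained entropy-favoring prior can be illustrated with the classic finite-support case: maximise Shannon entropy subject to a mean constraint, which gives a Gibbs (exponential-family) distribution whose Lagrange multiplier is found numerically. The support, constraint value, and bracketing interval below are illustrative.

```python
# Sketch: maximum-entropy distribution on a discrete support subject to a mean
# constraint E[x] = m.  The solution has the Gibbs form p_i proportional to
# exp(-lam * x_i); the multiplier lam is solved for numerically.
import numpy as np
from scipy.optimize import brentq

def maxent_mean_constraint(x, m):
    x = np.asarray(x, float)
    def mean_gap(lam):
        w = np.exp(-lam * (x - x.mean()))      # centred for numerical stability
        p = w / w.sum()
        return p @ x - m
    lam = brentq(mean_gap, -50, 50)            # bracket assumes m lies inside (min, max)
    w = np.exp(-lam * (x - x.mean()))
    return w / w.sum()

# Example: die faces 1..6 constrained to have mean 4.5 instead of 3.5.
p = maxent_mean_constraint(np.arange(1, 7), 4.5)
```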
By: | Philippe Goulet Coulombe |
Abstract: | I show that ordinary least squares (OLS) predictions can be rewritten as the output of a restricted attention module, akin to those forming the backbone of large language models. This connection offers an alternative perspective on attention beyond the conventional information retrieval framework, making it more accessible to researchers and analysts with a background in traditional statistics. It falls into place when OLS is framed as a similarity-based method in a transformed regressor space, distinct from the standard view based on partial correlations. In fact, the OLS solution can be recast as the outcome of an alternative problem: minimizing squared prediction errors by optimizing the embedding space in which training and test vectors are compared via inner products. Rather than estimating coefficients directly, we equivalently learn optimal encoding and decoding operations for predictors. From this vantage point, OLS maps naturally onto the query-key-value structure of attention mechanisms. Building on this foundation, I discuss key elements of Transformer-style attention and draw connections to classic ideas from time series econometrics. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.09663 |
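The core identity is easy to verify numerically: the OLS prediction at a test point equals an attention-style weighted sum of training outcomes, with weight x_test'(X'X)^{-1}x_i on outcome y_i (and no softmax). The simulated data below are arbitrary, and this shows only the basic equivalence, not the paper's full query-key-value construction.

```python
# Numerical check: OLS prediction as a weighted ("attention-style") sum of
# training outcomes, with similarity weights x_test' (X'X)^{-1} x_i.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(100)
x_test = rng.standard_normal(3)

# Standard OLS prediction
beta = np.linalg.solve(X.T @ X, X.T @ y)
pred_ols = x_test @ beta

# Attention-style rewriting: weights over the training outcomes
weights = x_test @ np.linalg.solve(X.T @ X, X.T)     # shape (100,)
pred_attention = weights @ y

assert np.isclose(pred_ols, pred_attention)
```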
By: | Sokolov, Boris (HSE University) |
Abstract: | This paper reviews various estimands used in modern scientific and applied research to operationalize causal inquiries within the Rubin Causal Model framework. I first introduce the most widely utilized average treatment effects, such as ATE, ATT, and ATC. I then describe their popular extensions, including those targeting local and conditional treatment effects; causal interactions and mediation; effects for non-continuous outcomes, as well as for multi-valued and continuous treatments; and longitudinal treatment effects. For each of these estimands, a substantive explanation is provided, along with examples of research questions they can address. The key assumptions necessary for the identification of the most widely used effects are also discussed. |
Date: | 2025–04–24 |
URL: | https://d.repec.org/n?u=RePEc:osf:socarx:4vtpk_v1 |
By: | Yuming Ma; Shintaro Sengoku; Kazuhide Nakata |
Abstract: | For quantitative trading risk management purposes, we present a novel idea: the realized local volatility surface. Concisely, it stands for the conditional expected volatility when sudden market behaviors of the underlying occur. One is able to explore risk management uses by following the orthodox Delta-Gamma dynamic hedging framework. The realized local volatility surface is, mathematically, a generalized Wiener measure from historical prices. It is reconstructed from high-frequency trading market data. A Stick-Breaking Gaussian Mixture Model is fitted via Hamiltonian Monte Carlo, producing a local volatility surface with 95% credible intervals and yielding a practically validated Bayesian nonparametric estimation workflow. Empirical results on TSLA high-frequency data illustrate its ability to capture counterfactual volatility. We also discuss its application in improving volatility-based risk management. |
Date: | 2025–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2504.15626 |
By: | Koukorinis, Andreas; Peters, Gareth W.; Germano, Guido |
Abstract: | We combine a hidden Markov model (HMM) and a kernel machine (SVM/MKL) into a hybrid HMM-SVM/MKL generative-discriminative learning approach to accurately classify high-frequency financial regimes and predict the direction of trades. We capture temporal dependencies and key stylized facts in high-frequency financial time series by integrating the HMM to produce model-based generative feature embeddings from microstructure time series data. These generative embeddings then serve as inputs to an SVM with single- and multi-kernel (MKL) formulations for predictive discrimination. Our methodology, which does not require manual feature engineering, improves classification accuracy compared to single-kernel SVMs and kernel target alignment methods. It also outperforms both a logistic classifier and feed-forward networks. This hybrid HMM-SVM-MKL approach shows high-frequency time-series classification improvements that can significantly benefit applications in finance. |
Keywords: | Fisher information kernel; hidden Markov model; Kernel methods; support vector machine |
JEL: | C1 F3 G3 |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:128016 |
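A simplified stand-in for the hybrid pipeline: posterior state probabilities from a fitted Gaussian HMM serve as generative embeddings for an SVM classifier. The hmmlearn and scikit-learn calls, the state count, and the RBF kernel are illustrative assumptions; the paper's Fisher-kernel embeddings and MKL step are not reproduced.

```python
# Sketch: HMM-based generative embeddings feeding a discriminative SVM.
import numpy as np
from hmmlearn.hmm import GaussianHMM
from sklearn.svm import SVC

def hmm_svm_classifier(X_seq, labels, n_states=3):
    """X_seq: (T, d) observation sequence; labels: (T,) trade-direction labels."""
    hmm = GaussianHMM(n_components=n_states, covariance_type="diag").fit(X_seq)
    embeddings = hmm.predict_proba(X_seq)        # posterior state probabilities
    svm = SVC(kernel="rbf").fit(embeddings, labels)
    return hmm, svm
```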