on Econometrics
By: | Juan M. Rodriguez-Poo; Alexandra Soberon; Stefan Sperlich |
Abstract: | We consider identification, inference and validation of linear panel data models when both factors and factor loadings are accounted for by a nonparametric function. This general specification encompasses popular models such as the two-way fixed effects and the interactive fixed effects models. By applying a conditional mean independence assumption between unobserved heterogeneity and the covariates, we obtain consistent estimators of the parameters of interest at the optimal rate of convergence, for fixed and large $T$. We also provide a specification test for the modeling assumption based on the methodology of conditional moment tests and nonparametric estimation techniques. Using degenerate and nondegenerate theories of U-statistics, we show its convergence and asymptotic distribution under the null, and that it diverges under the alternative at a rate arbitrarily close to $\sqrt{NT}$. Finite-sample inference is based on the bootstrap. Simulations reveal excellent performance of our methods, and an empirical application is conducted. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.10690 |
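A hedged illustration of the kind of specification the abstract above describes (the notation below is ours, not taken from the paper): a linear panel model in which unobserved heterogeneity enters through an unknown function of individual effects and time effects,

```latex
y_{it} = x_{it}'\beta + h(\lambda_i, f_t) + \varepsilon_{it},
\qquad i = 1, \dots, N, \quad t = 1, \dots, T,
```

so that $h(\lambda_i, f_t) = \lambda_i + f_t$ recovers the two-way fixed effects model and $h(\lambda_i, f_t) = \lambda_i' f_t$ recovers the interactive fixed effects model.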
By: | Chen, J.; Li, Y.; Linton, O. B. |
Abstract: | This paper studies the estimation of dynamic precision matrices with multiple conditioning variables for high-dimensional time series. We assume that the high-dimensional time series has an approximate factor structure plus an idiosyncratic error term; this allows the time series to have a non-sparse dynamic precision matrix, which enhances the applicability of our method. Exploiting the Sherman-Morrison-Woodbury formula, the estimation of the dynamic precision matrix for the time series boils down to the estimation of a low-rank factor structure and the precision matrix of the idiosyncratic error term. For the latter, we introduce an easy-to-implement semiparametric method to estimate the entries of the corresponding dynamic covariance matrix via the Model Averaging MArginal Regression (MAMAR) before applying the constrained ℓ1 minimisation for inverse matrix estimation (CLIME) method to obtain the dynamic precision matrix. Under some regularity conditions, we derive uniform consistency of the proposed estimators. We provide a simulation study that illustrates the finite-sample performance of the developed methodology and an application to the construction of minimum-variance portfolios using daily returns of S&P 500 constituents from 2000 to 2024. |
Keywords: | Approximate Factor Model, Conditional Sparsity, Large Precision Matrix, MAMAR, Semiparametric Estimation |
JEL: | C10 C14 |
Date: | 2025–06–02 |
URL: | https://d.repec.org/n?u=RePEc:cam:camjip:2514 |
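The Sherman-Morrison-Woodbury step described in the abstract above has a simple closed form. The sketch below (plain NumPy; the MAMAR and CLIME steps that produce the inputs are not implemented here) shows how the dynamic precision matrix would be assembled from estimates of the loadings, the factor covariance, and the idiosyncratic precision:

```python
# Minimal sketch, assuming estimates are already available:
#   B       (N x r)  factor loadings
#   S_f     (r x r)  factor covariance
#   Omega_u (N x N)  precision of the idiosyncratic errors (e.g. from CLIME)
import numpy as np

def factor_precision(B, S_f, Omega_u):
    """Woodbury inverse of Sigma = B S_f B' + Sigma_u, using only Omega_u."""
    core = np.linalg.inv(S_f) + B.T @ Omega_u @ B          # r x r, cheap to invert
    return Omega_u - Omega_u @ B @ np.linalg.solve(core, B.T @ Omega_u)
```

Only an r x r system has to be solved, which is what makes the factor-plus-sparse route tractable in high dimensions.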
By: | Undral Byambadalai; Tomu Hirata; Tatsushi Oka; Shota Yasui |
Abstract: | This paper focuses on the estimation of distributional treatment effects in randomized experiments that use covariate-adaptive randomization (CAR). These include designs such as Efron's biased-coin design and stratified block randomization, where participants are first grouped into strata based on baseline covariates and assigned treatments within each stratum to ensure balance across groups. In practice, datasets often contain additional covariates beyond the strata indicators. We propose a flexible distribution regression framework that leverages off-the-shelf machine learning methods to incorporate these additional covariates, enhancing the precision of distributional treatment effect estimates. We establish the asymptotic distribution of the proposed estimator and introduce a valid inference procedure. Furthermore, we derive the semiparametric efficiency bound for distributional treatment effects under CAR and demonstrate that our regression-adjusted estimator attains this bound. Simulation studies and empirical analyses of microcredit programs highlight the practical advantages of our method. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.05945 |
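As a rough illustration of the regression-adjustment idea in the abstract above, the sketch below estimates a distributional treatment effect at a single threshold with an off-the-shelf classifier and an AIPW-type combination. It assumes simple randomization with a known assignment probability; the paper's estimator and inference procedure are tailored to covariate-adaptive randomization and are not reproduced here, and all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def dte_at_threshold(Y, D, X, y0, pi):
    """P(Y(1) <= y0) - P(Y(0) <= y0) via regression adjustment.
    Y: outcomes, D: 0/1 treatment, X: covariate matrix, pi: P(D=1) by design.
    The threshold y0 should be interior so both arms contain 0s and 1s."""
    Z = (Y <= y0).astype(int)
    m1 = LogisticRegression(max_iter=1000).fit(X[D == 1], Z[D == 1])
    m0 = LogisticRegression(max_iter=1000).fit(X[D == 0], Z[D == 0])
    p1, p0 = m1.predict_proba(X)[:, 1], m0.predict_proba(X)[:, 1]
    aipw1 = p1 + D * (Z - p1) / pi
    aipw0 = p0 + (1 - D) * (Z - p0) / (1 - pi)
    return float(np.mean(aipw1 - aipw0))
```

Sweeping y0 over a grid of thresholds traces out an estimate of the whole distributional treatment effect curve.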
By: | Jiang Hu; Jiahui Xie; Yangchun Zhang; Wang Zhou |
Abstract: | Factor models are essential tools for analyzing high-dimensional data, particularly in economics and finance. However, standard methods for determining the number of factors often overestimate the true number when data exhibit heavy-tailed randomness, misinterpreting noise-induced outliers as genuine factors. This paper addresses this challenge within the framework of Elliptical Factor Models (EFM), which accommodate both heavy tails and potential non-linear dependencies common in real-world data. We demonstrate theoretically and empirically that heavy-tailed noise generates spurious eigenvalues that mimic true factor signals. To distinguish these, we propose a novel methodology based on a fluctuation magnification algorithm. We show that under magnifying perturbations, the eigenvalues associated with real factors exhibit significantly less fluctuation (stabilizing asymptotically) than spurious eigenvalues arising from heavy-tailed effects. This differential behavior allows true factors to be distinguished from spurious ones. We develop a formal testing procedure based on this principle and apply it to the problem of accurately selecting the number of common factors in heavy-tailed EFMs. Simulation studies and real data analysis confirm the effectiveness of our approach compared to existing methods, particularly in scenarios with pronounced heavy-tailedness. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.05116 |
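The abstract above does not spell out the fluctuation magnification algorithm, but the underlying diagnostic can be caricatured as follows (a speculative sketch, not the paper's procedure): perturb the data, recompute the spectrum, and compare how much each leading eigenvalue moves.

```python
import numpy as np

def eigenvalue_fluctuation(X, n_perturb=50, scale=0.1, seed=None):
    """X: (T, N) data. Returns eigenvalues of the sample covariance (descending)
    and their relative fluctuation under small additive perturbations."""
    rng = np.random.default_rng(seed)
    base = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
    draws = np.empty((n_perturb, X.shape[1]))
    for b in range(n_perturb):
        Xp = X + scale * X.std() * rng.standard_normal(X.shape)
        draws[b] = np.linalg.eigvalsh(np.cov(Xp, rowvar=False))[::-1]
    return base, draws.std(axis=0) / base
```

Per the abstract, eigenvalues tied to genuine factors should fluctuate comparatively little, while spurious, heavy-tail-driven eigenvalues should move much more.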
By: | Ziyang Xiong; Zhao Chen; Christina Dan Wang |
Abstract: | Estimating the leverage effect from high-frequency data is vital but challenged by complex, dependent microstructure noise, often exhibiting non-Gaussian higher-order moments. This paper introduces a novel multi-scale framework for efficient and robust leverage effect estimation under such flexible noise structures. We develop two new estimators, the Subsampling-and-Averaging Leverage Effect (SALE) and the Multi-Scale Leverage Effect (MSLE), which adapt subsampling and multi-scale approaches holistically using a unique shifted window technique. This design simplifies the multi-scale estimation procedure and enhances noise robustness without requiring the pre-averaging approach. We establish central limit theorems and stable convergence, with MSLE achieving convergence rates of an optimal $n^{-1/4}$ and a near-optimal $n^{-1/9}$ for the noise-free and noisy settings, respectively. A cornerstone of our framework's efficiency is a specifically designed MSLE weighting strategy that leverages covariance structures across scales. This significantly reduces asymptotic variance and, critically, yields substantially smaller finite-sample errors than existing methods under both noise-free and realistic noisy settings. Extensive simulations and empirical analyses confirm the superior efficiency, robustness, and practical advantages of our approach. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.08654 |
By: | Wen, Kaiyue; Wang, Tengyao; Wang, Yuhao |
Abstract: | We consider the problem of testing whether a single coefficient is equal to zero in linear models when the dimension of covariates p can be up to a constant fraction of sample size n. In this regime, an important topic is to propose tests with finite-population valid size control without requiring the noise to follow strong distributional assumptions. In this paper, we propose a new method, called residual permutation test (RPT), which is constructed by projecting the regression residuals onto the space orthogonal to the union of the column spaces of the original and permuted design matrices. RPT can be proved to achieve finite-population size validity under fixed design with just exchangeable noises, whenever p |
Keywords: | distribution-free test; permutation test; finite-population validity; heavy tail distribution; high-dimensional data |
JEL: | C1 |
Date: | 2025–04–30 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:126275 |
By: | Cui Rui; Li Yuhao; Song Xiaojun |
Abstract: | We propose power-boosting strategies for kernel-based specification tests in conditional moment models, with a focus on the Kernel Conditional Moment (KCM) test. By decomposing the KCM statistic into spectral components, we show that truncating poorly estimated directions and selecting kernels based on a non-asymptotic signal-to-noise ratio significantly improve both test power and size control. Our theoretical and simulation results demonstrate that, while divergent component weights may offer higher asymptotic power, convergent component weights perform better in finite samples. The methods outperform existing tests across various settings and are illustrated in an empirical application. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.04900 |
By: | Alexander Chudik; M. Hashem Pesaran; Ron P. Smith |
Abstract: | This paper provides a new methodology for the analysis of multiple long run relations in panel data models where the cross section dimension, $n$, is large relative to the time series dimension, $T$. For panel data models with large $n$, researchers have focused on panels with a single long run relationship. The main difficulty has been to eliminate short run dynamics without generating significant uncertainty for identification of the long run. We overcome this problem by using non-overlapping sub-sample time averages as deviations from their full-sample counterpart and estimating the number of long run relations and their coefficients using eigenvalues and eigenvectors of the pooled covariance matrix of these sub-sample deviations. We refer to this procedure as pooled minimum eigenvalue (PME) and show that it applies to unbalanced panels generated from general linear processes with interactive stationary time effects and does not require knowing long run causal linkages. To our knowledge, no other estimation procedure exists for this setting. We show the PME estimator is consistent and asymptotically normal as $n$ and $T \rightarrow \infty$ jointly, such that $T\approx n^{d}$, with $d>0$ for consistency and $d>1/2$ for asymptotic normality. Extensive Monte Carlo studies show that the number of long run relations can be estimated with high precision and the PME estimates of the long run coefficients show small bias and RMSE and have good size and power properties. The utility of our approach is illustrated with an application to key financial variables using an unbalanced panel of US firms from the merged CRSP-Compustat data set, covering more than 2,000 firms over the period 1950-2021. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.02135 |
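The PME construction in the abstract above can be caricatured in a few lines (a stylized sketch only; the paper's standardization, selection thresholds and inference are not reproduced, and the variable names are ours):

```python
import numpy as np

def pme_sketch(W, n_subsamples=4):
    """W: (N, T, m) panel with m variables per unit.
    Returns eigenvalues (ascending) and eigenvectors of the pooled covariance
    of sub-sample-average deviations from each unit's full-sample average."""
    N, T, m = W.shape
    blocks = np.array_split(np.arange(T), n_subsamples)
    devs = []
    for i in range(N):
        full_avg = W[i].mean(axis=0)
        for idx in blocks:
            devs.append(W[i, idx].mean(axis=0) - full_avg)
    S = np.cov(np.asarray(devs), rowvar=False)     # pooled m x m covariance
    return np.linalg.eigh(S)                       # ascending eigenvalues
```

The idea, as described in the abstract, is that directions corresponding to long run relations contribute little variation to these time-averaged deviations, so the number of long run relations is read off the small eigenvalues and their coefficients off the associated eigenvectors.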
By: | Giuseppe Cavaliere; Thomas Mikosch; Anders Rahbek; Frederik Vilandt |
Abstract: | Integrated autoregressive conditional duration (ACD) models serve as natural counterparts to the well-known integrated GARCH models used for financial returns. However, despite their resemblance, asymptotic theory for ACD models is challenging and still incomplete, in particular for integrated ACD. Central challenges arise from the facts that (i) integrated ACD processes imply durations with infinite expectation, and (ii) even in the non-integrated case, conventional asymptotic approaches break down due to the randomness in the number of durations within a fixed observation period. Addressing these challenges, we provide a unified asymptotic theory for the (quasi-)maximum likelihood estimator in ACD models, one that includes integrated ACD models. Based on the new results, we also provide a novel framework for hypothesis testing in duration models, enabling inference on a key empirical question: whether durations possess a finite or infinite expectation. We apply our results to high-frequency cryptocurrency ETF trading data. Motivated by parameter estimates near the integrated ACD boundary, we assess whether durations between trades in these markets have finite expectation, an assumption often made implicitly in the literature on point process models. Our empirical findings indicate infinite-mean durations for all five cryptocurrencies examined, with the integrated ACD hypothesis rejected, against alternatives with tail index less than one, for four of the five cryptocurrencies considered. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.06190 |
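For reference, the standard ACD(1,1) recursion of Engle and Russell, which the abstract does not restate (our notation); "integrated ACD" refers to the boundary $\alpha + \beta = 1$, which, as noted above, implies durations with infinite expectation:

```latex
x_i = \psi_i \varepsilon_i, \qquad
\psi_i = \omega + \alpha x_{i-1} + \beta \psi_{i-1}, \qquad
\varepsilon_i \ \text{i.i.d.},\ \varepsilon_i > 0,\ \mathbb{E}[\varepsilon_i] = 1,
```

where $x_i$ denotes the $i$-th duration and $\psi_i$ its conditional expectation given the past.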
By: | Aditya Ghosh; Dominik Rothenhäusler |
Abstract: | In observational causal inference, it is common to encounter multiple adjustment sets that appear equally plausible. It is often untestable which of these adjustment sets are valid to adjust for (i.e., satisfy ignorability). This discrepancy can pose practical challenges, as it is typically unclear how to reconcile multiple, possibly conflicting estimates of the average treatment effect (ATE). A naive approach is to report the whole range (convex hull of the union) of the resulting confidence intervals. However, the width of this interval might not shrink to zero in large samples and can be unnecessarily wide in real applications. To address this issue, we propose a summary procedure that generates a single estimate, one confidence interval, and identifies a set of units for which the causal effect estimate remains valid, provided at least one adjustment set is valid. The width of our proposed confidence interval shrinks to zero with sample size at the $n^{-1/2}$ rate, unlike the original range, which is of constant order. Thus, our assumption-robust approach enables reliable causal inference on the ATE even in scenarios where most of the adjustment sets are invalid. Admittedly, this robustness comes at a cost: our inferential guarantees apply to a target population close to, but different from, the one originally intended. We use synthetic and real-data examples to demonstrate that our proposed procedure provides substantially tighter confidence intervals for the ATE as compared to the whole range. In particular, for a real-world dataset on 401(k) retirement plans our method produces a confidence interval 50% shorter than the whole range of confidence intervals based on multiple adjustment sets. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.08729 |
By: | Florian Gunsilius; Lonjezo Sithole |
Abstract: | Economic theory implies strong limitations on what types of consumption behavior are considered rational. Rationality implies that the Slutsky matrix, which captures the substitution effects of compensated price changes on demand for different goods, is symmetric and negative semi-definite. While negative semi-definiteness has been shown to be nonparametrically testable, a fully nonparametric test of symmetry has remained elusive due to the inherent multidimensionality of the problem. Recently, it has even been shown that the symmetry condition is not testable via the average Slutsky matrix, prompting conjectures about its non-testability. We settle this question by deriving nonparametric conditional quantile restrictions on observable data that permit construction of a fully nonparametric test for the symmetry condition. The theoretical contribution is a multivariate extension of identification results for partial effects in nonseparable models without monotonicity, which is of independent interest. The derived conditional restrictions induce challenges related to generated regressors and multiple hypothesis testing, which can be addressed using recent statistical methods. Our results provide researchers with the missing tool in many econometric models that rely on Slutsky matrices: from welfare analysis with individual heterogeneity to testing an empirical version of rationality in consumption behavior. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.05603 |
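For readers who want the object in front of them, the Slutsky matrix referenced above is, in standard notation (not restated in the abstract): with Marshallian demand $x(p, w)$, prices $p$ and wealth $w$,

```latex
S_{ij}(p, w) \;=\; \frac{\partial x_i(p, w)}{\partial p_j}
\;+\; \frac{\partial x_i(p, w)}{\partial w}\, x_j(p, w),
```

and rationality requires the matrix $S = (S_{ij})$ to be symmetric and negative semi-definite, the two conditions discussed in the abstract.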
By: | Thomas B. Marvell |
Abstract: | Textbook theory predicts that t-ratios decline towards zero in regressions when there is increasing collinearity between two independent variables. This article shows that this rarely happens if the two variables are endogenous, and that coefficients increase greatly with more collinearity. The purposes of this article are 1) to illustrate this bias and explain why it occurs, and 2) to use the phenomenon to develop a test for endogeneity. For the test, one creates a variable that is highly collinear with the independent variable of interest, and endogeneity is indicated if t-ratios do not decline with increasing collinearity. False negatives are possible, but not likely. The test is confirmed with algebraic examples and simulations. I give many empirical examples of the bias and the test, including testing exogeneity assumptions behind instrumental variables and Granger causality. |
Keywords: | Endogeneity, collinearity, simultaneity, omitted variable bias, instrumental variables. |
JEL: | C12 C13 C26 |
Date: | 2025–05–05 |
URL: | https://d.repec.org/n?u=RePEc:eei:rpaper:eeri_rp_2025_05 |
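A toy simulation of the phenomenon described above can be written in a few lines (an illustrative sketch with made-up parameters, not the article's design): an endogenous regressor x is paired with a near-duplicate x2, and the t-ratios are tracked as the collinearity is tightened.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
u = rng.standard_normal(n)                # structural error
x = rng.standard_normal(n) + 0.8 * u      # endogenous: x is correlated with u
y = 1.0 + 0.5 * x + u

for noise_sd in (1.0, 0.1, 0.01):         # smaller noise -> higher collinearity
    x2 = x + noise_sd * rng.standard_normal(n)
    fit = sm.OLS(y, sm.add_constant(np.column_stack([x, x2]))).fit()
    print(f"corr(x, x2) = {np.corrcoef(x, x2)[0, 1]:.4f}",
          "t-ratios:", fit.tvalues[1:].round(2))
```

Per the article's argument, with an exogenous x the t-ratios should shrink toward zero as corr(x, x2) approaches one, whereas with the endogenous x above they need not.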
By: | Jarek Duda |
Abstract: | Nonstationarity of real-life time series requires model adaptation. Classical approaches like ARMA-ARCH assume some arbitrarily chosen dependence type. To avoid their bias, we focus on a novel, more agnostic approach: the moving estimator, which estimates parameters separately for every time $t$: optimizing $F_t=\sum_{\tau |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.05354 |
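The abstract's formula is cut off in this listing, so the exact weighting is not reproduced here; the sketch below only illustrates the general "moving estimator" idea under the assumption of exponentially decaying weights, with Gaussian parameters re-estimated at every time step from past observations only.

```python
import numpy as np

def moving_gaussian_estimates(x, eta=0.05):
    """Exponentially weighted mean/variance, using only observations before t."""
    mu, var = 0.0, 1.0
    mus, vars_ = [], []
    for xt in x:
        mus.append(mu)          # estimate for time t is recorded before seeing x_t
        vars_.append(var)
        mu = (1 - eta) * mu + eta * xt
        var = (1 - eta) * var + eta * (xt - mu) ** 2
    return np.array(mus), np.array(vars_)
```

Each time step thus gets its own parameter estimate, which is what allows the model to track nonstationarity.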
By: | Daniil Bargman |
Abstract: | This paper introduces a new least squares regression methodology called (C)LARX: a (constrained) latent variable autoregressive model with exogenous inputs. Two additional contributions are made along the way: first, a new matrix operator is introduced for matrices and vectors with blocks along one dimension; second, a new latent variable regression (LVR) framework is proposed for economics and finance. The empirical section examines how well the stock market predicts real economic activity in the United States. (C)LARX models outperform the baseline OLS specification in out-of-sample forecasts and offer novel analytical insights about the underlying functional relationship. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.04488 |
By: | Zhongren Chen; Siyu Chen; Zhengling Qi; Xiaohong Chen; Zhuoran Yang |
Abstract: | We study quantile-optimal policy learning where the goal is to find a policy whose reward distribution has the largest $\alpha$-quantile for some $\alpha \in (0, 1)$. We focus on the offline setting, where the data-generating process involves unobserved confounders. Such a problem suffers from three main challenges: (i) nonlinearity of the quantile objective as a functional of the reward distribution, (ii) unobserved confounding, and (iii) insufficient coverage of the offline dataset. To address these challenges, we propose a suite of causal-assisted policy learning methods that provably enjoy strong theoretical guarantees under mild conditions. In particular, to address (i) and (ii), using causal inference tools such as instrumental variables and negative controls, we propose to estimate the quantile objectives by solving nonlinear functional integral equations. Then we adopt a minimax estimation approach with nonparametric models to solve these integral equations, and propose to construct conservative policy estimates that address (iii). The final policy is the one that maximizes these pessimistic estimates. In addition, we propose a novel regularized policy learning method that is more amenable to computation. Finally, we prove that the policies learned by these methods are $\tilde{\mathscr{O}}(n^{-1/2})$ quantile-optimal under a mild coverage assumption on the offline dataset. Here, $\tilde{\mathscr{O}}(\cdot)$ omits poly-logarithmic factors. To the best of our knowledge, we propose the first sample-efficient policy learning algorithms for estimating the quantile-optimal policy in the presence of unmeasured confounding. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.07140 |
By: | De Graeve, Ferre (KU Leuven); Westermark, Andreas (Research Department, Central Bank of Sweden) |
Abstract: | Macroeconomic research often relies on structural vector autoregressions, (S)VARs, to uncover empirical regularities. Critics argue the method goes awry due to lag truncation: short lag-lengths imply a poor approximation to important data-generating processes (e.g. DSGE models). Empirically, short lag-length is deemed necessary as increased parametrization induces excessive uncertainty. The paper shows that this argument is incomplete. Longer lag-length simultaneously reduces misspecification, which in turn reduces variance. For data generated by frontier DSGE models, long-lag VARs are feasible, reduce bias and variance, and have better coverage. Long-lag VARs are also viable in common macroeconomic data and applications. Thus, contrary to conventional wisdom, the trivial solution to the critique actually works. |
Keywords: | VAR; SVAR; Lag-length; Lag truncation |
JEL: | C18 E37 |
Date: | 2025–05–01 |
URL: | https://d.repec.org/n?u=RePEc:hhs:rbnkwp:0451 |
By: | Damir Filipović (École Polytechnique Fédérale de Lausanne (EPFL); Swiss Finance Institute); Paul Schneider (University of Lugano - Institute of Finance; Swiss Finance Institute) |
Abstract: | We introduce kernel density machines (KDM), a novel density ratio estimator in a reproducing kernel Hilbert space setting. KDM applies to general probability measures on countably generated measurable spaces without restrictive assumptions on continuity, or the existence of a Lebesgue density. For computational efficiency, we incorporate a low-rank approximation with precisely controlled error that grants scalability to large-sample settings. We provide rigorous theoretical guarantees, including asymptotic consistency, a functional central limit theorem, and finite-sample error bounds, establishing a strong foundation for practical use. Empirical results based on simulated and real data demonstrate the efficacy and precision of KDM. |
Keywords: | density ratio estimation, reproducing kernel Hilbert space (RKHS), low-rank approximation, finite-sample guarantees |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2553 |
By: | Hasan Fallahgoul |
Abstract: | Recent advances in machine learning have shown promising results for financial prediction using large, over-parameterized models. This paper provides theoretical foundations and empirical validation for understanding when and how these methods achieve predictive success. I examine three key aspects of high-dimensional learning in finance. First, I prove that within-sample standardization in Random Fourier Features implementations fundamentally alters the underlying Gaussian kernel approximation, replacing shift-invariant kernels with training-set dependent alternatives. Second, I derive sample complexity bounds showing when reliable learning becomes information-theoretically impossible under weak signal-to-noise ratios typical in finance. Third, VC-dimension analysis reveals that ridgeless regression's effective complexity is bounded by sample size rather than nominal feature dimension. Comprehensive numerical validation confirms these theoretical predictions, revealing systematic breakdown of claimed theoretical properties across realistic parameter ranges. These results show that when sample size is small and features are high-dimensional, observed predictive success is necessarily driven by low-complexity artifacts, not genuine high-dimensional learning. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.03780 |
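For context on the first point above, the sketch below shows the textbook Random Fourier Features (Rahimi-Recht) approximation of the Gaussian kernel together with a within-sample standardization step of the kind the paper analyzes; the exact implementations criticized in the paper may differ.

```python
import numpy as np

def rff(X, n_features=512, gamma=1.0, seed=None, standardize=False):
    """Random Fourier features approximating k(x, x') = exp(-gamma * ||x - x'||^2)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    Z = np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
    if standardize:
        # Within-sample standardization: the implied kernel now depends on the
        # training sample and is no longer the shift-invariant Gaussian kernel,
        # which is the alteration the paper's first result formalizes.
        Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    return Z

X = np.random.default_rng(1).normal(size=(200, 5))
Z = rff(X, n_features=2000, gamma=0.5, seed=2)
K_approx = Z @ Z.T      # approximates the Gaussian kernel Gram matrix on X
```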
By: | Nicolas Camenzind (Swiss Federal Institute of Technology in Lausanne - EPFL); Damir Filipović (École Polytechnique Fédérale de Lausanne (EPFL); Swiss Finance Institute) |
Abstract: | We propose a framework for transfer learning of discount curves across different fixed-income product classes. Motivated by challenges in estimating discount curves from sparse or noisy data, we extend kernel ridge regression (KR) to a vector-valued setting, formulating a convex optimization problem in a vector-valued reproducing kernel Hilbert space (RKHS). Each component of the solution corresponds to the discount curve implied by a specific product class. We introduce an additional regularization term motivated by economic principles, promoting smoothness of spread curves between product classes, and show that it leads to a valid separable kernel structure. A main theoretical contribution is a decomposition of the vector-valued RKHS norm induced by separable kernels. We further provide a Gaussian process interpretation of vector-valued KR, enabling quantification of estimation uncertainty. Illustrative examples demonstrate that transfer learning significantly improves extrapolation performance and tightens confidence intervals compared to single-curve estimation. |
Keywords: | yield curve estimation, transfer learning, nonparametric estimator, machine learning in finance, vector-valued reproducing kernel Hilbert space |
JEL: | C14 E43 G12 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2550 |
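The separable-kernel structure mentioned in the abstract above has a compact implementation. The sketch below (illustrative names and task-coupling matrix, not the paper's estimator) fits kernel ridge regression with a matrix-valued kernel $K((x, i), (x', j)) = k(x, x')\,B_{ij}$ on data stacked across product classes.

```python
import numpy as np

def gauss_kernel(x, y, ell=1.0):
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / ell ** 2)

def separable_krr_fit(x, task, y, B, lam=1e-3, ell=1.0):
    """x: inputs (e.g. maturities), task: integer class label per observation,
    y: stacked observations, B: class-coupling matrix (n_classes x n_classes)."""
    K = gauss_kernel(x, x, ell) * B[np.ix_(task, task)]   # separable Gram matrix
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def separable_krr_predict(x_new, task_new, x, task, alpha, B, ell=1.0):
    K_new = gauss_kernel(x_new, x, ell) * B[np.ix_(task_new, task)]
    return K_new @ alpha
```

Choosing B with sizeable off-diagonal entries couples the curves, so information transfers across product classes, while a diagonal B collapses the problem to independent single-curve estimation.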
By: | Stephane Hess; David Bunch; Andrew Daly |
Abstract: | Choice modellers routinely acknowledge the risk of convergence to inferior local optima when using structures other than a simple linear-in-parameters logit model. At the same time, there is no consensus on appropriate mechanisms for addressing this issue. Most analysts seem to ignore the problem, while others try a set of different starting values or put their faith in what they believe to be more robust estimation approaches. This paper puts forward the use of a profile likelihood approach that systematically analyses the parameter space around an initial maximum likelihood estimate and tests for the existence of better local optima in that space. We extend this to an iterative algorithm which progressively searches for the best local optimum under given settings for the algorithm. Using a well-known stated choice dataset, we show how the approach identifies better local optima for both latent class and mixed logit models, with the potential for substantially different policy implications. In the case studies we conduct, an added benefit of the approach is that the new solutions more closely adhere to asymptotic normality, which also highlights the benefits of the approach for analysing the statistical properties of a solution. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.02722 |
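A generic version of the profile likelihood scan described above can be sketched as follows (a hypothetical helper around a user-supplied log-likelihood; the paper's iterative search rules and settings are not reproduced): fix one parameter on a grid around the incumbent estimate, re-maximize over the remaining parameters, and flag grid points whose profile value exceeds the incumbent maximum.

```python
import numpy as np
from scipy.optimize import minimize

def profile_scan(loglik, theta_hat, k, grid):
    """Profile the k-th parameter of `loglik` over `grid`, starting each inner
    maximization from the incumbent estimate `theta_hat`."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    free = [i for i in range(len(theta_hat)) if i != k]
    incumbent = loglik(theta_hat)
    profile = []
    for val in grid:
        def negll(free_vals):
            theta = theta_hat.copy()
            theta[free] = free_vals
            theta[k] = val
            return -loglik(theta)
        res = minimize(negll, theta_hat[free], method="Nelder-Mead")
        profile.append(-res.fun)
    better = [v for v, p in zip(grid, profile) if p > incumbent + 1e-6]
    return np.array(profile), better   # `better` flags evidence of a superior optimum
```

Repeating the scan parameter by parameter, and re-estimating from any flagged point, gives the kind of iterative search for better local optima that the abstract describes.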
By: | Haoyuan Wang; Chen Liu; Minh-Ngoc Tran; Chao Wang |
Abstract: | This paper introduces a novel multivariate volatility modeling framework, named Long Short-Term Memory enhanced BEKK (LSTM-BEKK), that integrates deep learning into multivariate GARCH processes. By combining the flexibility of recurrent neural networks with the econometric structure of BEKK models, our approach is designed to better capture nonlinear, dynamic, and high-dimensional dependence structures in financial return data. The proposed model addresses key limitations of traditional multivariate GARCH-based methods, particularly in capturing persistent volatility clustering and asymmetric co-movement across assets. Leveraging the data-driven nature of LSTMs, the framework adapts effectively to time-varying market conditions, offering improved robustness and forecasting performance. Empirical results across multiple equity markets confirm that the LSTM-BEKK model achieves superior performance in terms of out-of-sample portfolio risk forecasts, while maintaining the interpretability of BEKK models. These findings highlight the potential of hybrid econometric-deep learning models in advancing financial risk management and multivariate volatility forecasting. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.02796 |
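For reference, the standard BEKK(1,1) recursion that LSTM-BEKK builds on (how the LSTM output enters this recursion is not specified in the abstract):

```latex
H_t \;=\; CC' \;+\; A\,\epsilon_{t-1}\epsilon_{t-1}'\,A' \;+\; B\,H_{t-1}\,B',
```

where $H_t$ is the conditional covariance matrix of the return vector, $\epsilon_{t-1}$ is the lagged return innovation, $C$ is lower triangular, and $A$, $B$ are parameter matrices.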
By: | Shunxin Yao |
Abstract: | OC-DeepIV is a neural network model designed for estimating causal effects. It characterizes heterogeneity by adding interaction features and reduces redundancy through orthogonal constraints. The model includes two feature extractors, one for the instrumental variable Z and the other for the covariate X*. The training process is divided into two stages: the first stage uses the mean squared error (MSE) loss function, and the second stage incorporates orthogonal regularization. Experimental results show that this model outperforms DeepIV and DML in terms of accuracy and stability. Future research directions include applying the model to real-world problems and handling scenarios with multiple treatment variables. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.02790 |
By: | Johannes Schwab (École Polytechnique Fédérale de Lausanne (EPFL)); Bryan T. Kelly (Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)); Semyon Malamud (Ecole Polytechnique Federale de Lausanne; Centre for Economic Policy Research (CEPR); Swiss Finance Institute); Teng Andrea Xu (AQR Capital Management, LLC) |
Abstract: | The performance of the data-dependent neural tangent kernel (NTK; Jacot et al. (2018)) associated with a trained deep neural network (DNN) often matches or exceeds that of the full network. This implies that DNN training via gradient descent implicitly performs kernel learning by optimizing the NTK. In this paper, we propose instead to optimize the NTK explicitly. Rather than minimizing empirical risk, we train the NTK to minimize its generalization error using the recently developed Kernel Alignment Risk Estimator (KARE; Jacot et al. (2020)). Our simulations and real data experiments show that NTKs trained with KARE consistently match or significantly outperform the original DNN and the DNN-induced NTK (the after-kernel). These results suggest that explicitly trained kernels can outperform traditional end-to-end DNN optimization in certain settings, challenging the conventional dominance of DNNs. We argue that explicit training of the NTK is a form of over-parametrized feature learning. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2551 |
By: | Andrew Paskaramoorthy; Terence van Zyl; Tim Gebbie |
Abstract: | Backtests on historical data are the basis for practical evaluations of portfolio selection rules, but their reliability is often limited by reliance on a single sample path. This can lead to high estimation variance. Resampling techniques offer a potential solution by increasing the effective sample size, but can disrupt the temporal ordering inherent in financial data and introduce significant bias. This paper investigates two critical questions: first, how large is this bias for Sharpe ratio estimates, and second, what are its primary drivers? We focus on the canonical rolling-window mean-variance portfolio rule. Our contributions are to identify the bias mechanism and to provide a practical heuristic for gauging bias severity. We show that the bias arises from the disruption of train-test dependence linked to the return auto-covariance structure and derive bounds for the bias which show a strong dependence on the observable first-lag autocorrelation. Simulations confirm these findings and reveal that the resulting Sharpe ratio bias is often a fraction of a typical backtest's estimation noise, benefiting from partial offsetting of component biases. Empirical analysis further illustrates that differences between IID-resampled and standard backtests align qualitatively with these drivers. Surprisingly, our results suggest that while IID resampling can disrupt temporal dependence, the resulting bias is often tolerable. However, we highlight the need for structure-preserving resampling methods. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.06383 |
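A stylized illustration of the mechanism discussed above (all parameters hypothetical, not the paper's simulation design): returns with first-lag autocorrelation are backtested with a simple rolling-mean sign rule, once on the original path and once on an IID-shuffled path, which destroys the train-test dependence the bias hinges on.

```python
import numpy as np

rng = np.random.default_rng(0)
T, phi = 2000, 0.2                         # AR(1) coefficient -> lag-1 autocorrelation
r = np.empty(T)
r[0] = rng.standard_normal()
for t in range(1, T):
    r[t] = phi * r[t - 1] + rng.standard_normal()

def backtest_sharpe(returns, window=60):
    """Per-period Sharpe of a rule that holds sign(rolling mean) each period."""
    pnl = np.array([np.sign(returns[t - window:t].mean()) * returns[t]
                    for t in range(window, len(returns))])
    return pnl.mean() / pnl.std()

print("original path :", round(backtest_sharpe(r), 3))
print("IID-resampled :", round(backtest_sharpe(rng.permutation(r)), 3))
```

Comparing the two numbers (and repeating over many shuffles) gives a feel for how much of a backtested Sharpe ratio rides on the return auto-covariance structure that IID resampling removes.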