on Big Data |
By: | Alexandre d'Aspremont (LIENS - Laboratoire d'informatique de l'école normale supérieure - DI-ENS - Département d'informatique - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - Inria - Institut National de Recherche en Informatique et en Automatique - CNRS - Centre National de la Recherche Scientifique, SIERRA - Statistical Machine Learning and Parsimony - DI-ENS - Centre Inria de Paris, Kayrros); Simon Ben Arous (Kayrros); Jean-Charles Bricongne (LEO - Laboratoire d'Économie d'Orléans [2022-...] - UO - Université d'Orléans - UT - Université de Tours - UCA - Université Clermont Auvergne, Centre de recherche de la Banque de France - Banque de France); Benjamin Lietti (EPEE - Centre d'Etudes des Politiques Economiques - UEVE - Université d'Évry-Val-d'Essonne - Université Paris-Saclay); Baptiste Meunier (Centre de recherche de la Banque Centrale Européenne - Banque Centrale Européenne, AMSE - Aix-Marseille Sciences Economiques - EHESS - École des hautes études en sciences sociales - AMU - Aix Marseille Université - ECM - École Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique) |
Abstract: | This paper exploits daily infrared images taken from satellites to track economic activity in advanced and emerging countries. We first develop a framework to read, clean, and exploit satellite images. Our algorithm uses the laws of physics (Planck's law) and machine learning to detect the heat produced by cement plants in activity. This allows us to monitor in real time whether a cement plant is working. Applying this to around 1,000 plants, we construct a satellite-based index. We show that this satellite index outperforms benchmark models and alternative indicators for nowcasting the production of the cement industry as well as activity in the construction sector. Comparing across methods, neural networks appear to yield more accurate predictions as they allow us to exploit the granularity of our dataset. Overall, combining satellite images and machine learning can help policymakers take informed and swift economic policy decisions by nowcasting economic activity accurately and in real time. |
Keywords: | Big data, Data science, Machine learning, Construction, High-frequency data |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-05104995 |
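The physical step in the pipeline above can be made concrete. The following minimal Python sketch inverts Planck's law to turn thermal-band radiance into brightness temperature and flags a plant as active when its pixels run hotter than the local background; the band wavelength, radiance units, and 15 K threshold are illustrative assumptions, not the authors' specification.

    import numpy as np

    H = 6.626e-34   # Planck constant (J s)
    C = 2.998e8     # speed of light (m/s)
    K = 1.381e-23   # Boltzmann constant (J/K)

    def brightness_temperature(radiance, wavelength):
        """Invert Planck's law: spectral radiance (W m^-2 sr^-1 m^-1) -> temperature (K)."""
        return (H * C / (wavelength * K)) / np.log(1.0 + 2.0 * H * C**2 / (wavelength**5 * radiance))

    def plant_is_active(plant_radiances, background_radiances, wavelength=11e-6, delta_k=15.0):
        """Illustrative activity test: plant pixels hotter than local background by delta_k kelvin."""
        t_plant = brightness_temperature(np.asarray(plant_radiances), wavelength)
        t_bg = brightness_temperature(np.asarray(background_radiances), wavelength)
        return float(t_plant.max() - np.median(t_bg)) > delta_k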
By: | Yuke Zhang |
Abstract: | This study introduces an interpretable machine learning (ML) framework to extract macroeconomic alpha from global news sentiment. We process the Global Database of Events, Language, and Tone (GDELT) Project's worldwide news feed using FinBERT -- a Bidirectional Encoder Representations from Transformers (BERT) based model pretrained on finance-specific language -- to construct daily sentiment indices incorporating mean tone, dispersion, and event impact. These indices drive an XGBoost classifier, benchmarked against logistic regression, to predict next-day returns for EUR/USD, USD/JPY, and 10-year U.S. Treasury futures (ZN). Rigorous out-of-sample (OOS) backtesting (5-fold expanding-window cross-validation, OOS period: c. 2017-April 2025) demonstrates exceptional, cost-adjusted performance for the XGBoost strategy: Sharpe ratios achieve 5.87 (EUR/USD), 4.65 (USD/JPY), and 4.65 (Treasuries), with respective compound annual growth rates (CAGRs) exceeding 50% in Foreign Exchange (FX) and 22% in bonds. Shapley Additive Explanations (SHAP) affirm that sentiment dispersion and article impact are key predictive features. Our findings establish that integrating domain-specific Natural Language Processing (NLP) with interpretable ML offers a potent and explainable source of macro alpha. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.16136 |
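The index construction described above can be sketched in a few lines. The snippet below scores headlines with a FinBERT checkpoint and aggregates to daily mean tone, dispersion, and an impact proxy; the model id, column names, and the use of mention counts for impact are assumptions for illustration, not the paper's exact recipe.

    import pandas as pd
    from transformers import pipeline

    finbert = pipeline("text-classification", model="ProsusAI/finbert")  # assumed checkpoint

    def daily_sentiment_index(news: pd.DataFrame) -> pd.DataFrame:
        """news: columns ['date', 'headline', 'num_mentions'] (num_mentions proxies impact)."""
        signs = {"positive": 1.0, "negative": -1.0, "neutral": 0.0}
        tone = [signs[o["label"]] * o["score"]
                for o in finbert(news["headline"].tolist(), truncation=True)]
        return news.assign(tone=tone).groupby("date").agg(
            mean_tone=("tone", "mean"),
            dispersion=("tone", "std"),
            impact=("num_mentions", "sum"),
        )

Daily features like these would then feed the XGBoost classifier on next-day return direction.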
By: | Altug Aydemir; Cem Cebi |
Abstract: | This study aims to forecast the future behavior of budget variables for Türkiye using Artificial Neural Network (ANN) and Deep Neural Network (DNN) techniques. In particular, we focus on budget expenditures, tax revenues, and their main components. Annual data were used and divided into two sub-periods: a training set (2002-2019) and a test set (2020-2022). Each fiscal item is estimated using relevant explanatory variables selected based on economic theory. We achieved good forecasting performance for the main budget items using ANN and DNN methodologies. First, we found that most of the Mean Absolute Error (MAE) values fell within the acceptable range, an indicator of good prediction performance. Second, the MAE values for public expenditures are lower than those for taxes. Third, estimating total tax revenues (aggregate data) performs better than estimating the subcomponents of taxes (disaggregated data); the opposite is the case for public expenditures. |
Keywords: | Machine Learning, Deep Learning, Artificial Neural Network (ANN), Deep Neural Network (DNN), Budget Forecast, Government Spending, Tax Revenue |
JEL: | C53 H20 H50 H68 |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:tcb:wpaper:2509 |
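As a stylized version of the exercise above, the sketch below fits a shallow and a deeper feed-forward network on the 2002-2019 training years and scores 2020-2022 with MAE; the file name, regressor columns, and layer sizes are placeholders, not the authors' data or architecture.

    import pandas as pd
    from sklearn.metrics import mean_absolute_error
    from sklearn.neural_network import MLPRegressor

    df = pd.read_csv("budget_items.csv", index_col="year")         # hypothetical annual dataset
    X, y = df[["gdp", "inflation", "imports"]], df["tax_revenue"]  # illustrative theory-based regressors
    X_tr, y_tr = X.loc[2002:2019], y.loc[2002:2019]
    X_te, y_te = X.loc[2020:2022], y.loc[2020:2022]

    for name, model in [("ANN", MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)),
                        ("DNN", MLPRegressor(hidden_layer_sizes=(32, 16, 8), max_iter=5000, random_state=0))]:
        model.fit(X_tr, y_tr)
        print(name, "test MAE:", mean_absolute_error(y_te, model.predict(X_te)))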
By: | Altug Aydemir; Mert Gokcu |
Abstract: | In recent years, machine learning-based techniques have gained prominence in forecasting crude oil prices due to their ability to handle the highly volatile and nonlinear nature of oil prices effectively. The primary objective of this paper is to forecast monthly oil prices with the highest possible precision and accuracy. To this end, we propose a deepened and highly parametrized version of a deep neural network framework that integrates widely adopted algorithms and a variety of datasets. Our approach also identifies an optimal architecture for the deep neural networks used in oil price forecasting and offers forecasts that are repeatable and consistent. All evaluation metrics indicate that the proposed model achieves superior forecasting performance compared to simple conventional statistical models. |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:tcb:econot:2511 |
By: | Giorgio Alfredo Spedicato (Leitha SRL); Christophe Dutang (ASAR - Applied Statistics And Reliability - LJK - Laboratoire Jean Kuntzmann - Inria - Institut National de Recherche en Informatique et en Automatique - CNRS - Centre National de la Recherche Scientifique - UGA - Université Grenoble Alpes - Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology); Quentin Guibert (CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique, LSAF - Laboratoire de Sciences Actuarielle et Financière - UCBL - Université Claude Bernard Lyon 1 - Université de Lyon) |
Abstract: | Credibility theory is the usual framework in actuarial science for reinforcing individual experience by transferring rates estimated from collective information. Based on the paradigm of transfer learning, this article presents the idea that a machine learning (ML) model pre-trained on a rich market data portfolio can improve the prediction of rates for an individual insurance portfolio. This framework consists first in training several ML models on a market portfolio of insurance data. Pre-trained models provide valuable information on the relations between features and predicted rates. Furthermore, features shared with the company dataset are used to predict rates better than the same ML models trained on the insurer's dataset alone. Our approach is illustrated with classical ML models on an anonymized dataset including both market data and data from a European non-life insurance company, and is compared with a hierarchical Bühlmann-Straub credibility model. We observe that the transfer learning strategy combining company data with external market data significantly improves prediction accuracy compared to an ML model trained only on the insurer's data, and provides competitive results compared to hierarchical credibility models. |
Keywords: | Transfer learning, Hierarchical credibility theory, Bühlmann credibility theory, Boosting, Deep Learning |
Date: | 2025–06–27 |
URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-04821310 |
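One concrete way to realize the pre-train-then-adapt idea above is gradient-boosting warm starts. The sketch below pre-trains an XGBoost model on a large synthetic "market" portfolio and continues boosting on a small "company" portfolio via the xgb_model argument; the Poisson objective and synthetic data are assumptions standing in for the paper's models and confidential datasets.

    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    beta = np.array([0.3, -0.2, 0.1])
    X_market = rng.normal(size=(20000, 3)); y_market = rng.poisson(np.exp(X_market @ beta))    # rich market data
    X_company = rng.normal(size=(1000, 3)); y_company = rng.poisson(np.exp(X_company @ beta))  # small insurer data

    params = {"objective": "count:poisson", "max_depth": 4, "eta": 0.05}
    pretrained = xgb.train(params, xgb.DMatrix(X_market, label=y_market), num_boost_round=300)
    # The transfer step: keep boosting from the market model on the insurer's own data.
    transferred = xgb.train(params, xgb.DMatrix(X_company, label=y_company),
                            num_boost_round=100, xgb_model=pretrained)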
By: | Duane, Jackson; Morgan, Ashley; Carter, Emily |
Abstract: | Financial institutions are increasingly leveraging unstructured data---such as text, audio, and images---to gain insights and competitive advantage. Deep learning (DL) has emerged as a powerful paradigm for analyzing these complex data types, transforming tasks like financial news analysis, earnings call interpretation, and document parsing. This paper provides a comprehensive academic review of deep learning techniques for unstructured financial data. We present a taxonomy of data types and DL methods, including natural language processing models, speech and audio processing frameworks, multimodal fusion approaches, and transformer-based architectures. We survey key applications ranging from sentiment analysis and market prediction to fraud detection, credit risk assessment, and beyond, highlighting recent advancements in each domain. Additionally, we discuss major challenges unique to financial settings, such as data scarcity and annotation cost, model interpretability and regulatory compliance, and the dynamic, non-stationary nature of financial data. We enumerate prominent datasets and benchmarks that have accelerated research, and identify research gaps and future directions. The review emphasizes the latest developments up to 2025, including the rise of large pre-trained models and multimodal learning, and outlines how these innovations are shaping the next generation of financial analytics. |
Date: | 2025–06–25 |
URL: | https://d.repec.org/n?u=RePEc:osf:osfxxx:gdvbj_v1 |
By: | Qingyu Li; Chiranjib Mukhopadhyay; Abolfazl Bayat; Ali Habibnia |
Abstract: | Recent advances in quantum computing have demonstrated its potential to significantly enhance the analysis and forecasting of complex classical data. Among these, quantum reservoir computing has emerged as a particularly powerful approach, combining quantum computation with machine learning for modeling nonlinear temporal dependencies in high-dimensional time series. As with many data-driven disciplines, quantitative finance and econometrics can benefit greatly from emerging quantum technologies. In this work, we investigate the application of quantum reservoir computing to realized volatility forecasting. Our model employs a fully connected transverse-field Ising Hamiltonian as the reservoir, with distinct input and memory qubits to capture temporal dependencies. The quantum reservoir computing approach is benchmarked against several econometric models and standard machine learning algorithms. The models are evaluated using multiple error metrics and the model confidence set procedure. To enhance interpretability and mitigate current quantum hardware limitations, we use wrapper-based forward selection to identify optimal feature subsets and quantify feature importance via Shapley values. Our results indicate that the proposed quantum reservoir approach consistently outperforms benchmark models across various metrics, highlighting its potential for financial forecasting despite existing quantum hardware constraints. This work serves as a proof of concept for the applicability of quantum computing in econometrics and financial analysis, paving the way for further research into quantum-enhanced predictive modeling as quantum hardware capabilities continue to advance. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.13933 |
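For reference, the reservoir named above is the fully connected transverse-field Ising model, which in standard notation reads

    H = -\sum_{i<j} J_{ij}\, \sigma_i^z \sigma_j^z - h \sum_i \sigma_i^x

where \sigma_i^z and \sigma_i^x are Pauli operators on qubit i, J_{ij} are the all-to-all couplings, and h is the transverse field; in the setup described above, the time series drives the input qubits while memory qubits carry information across steps. The paper's specific couplings and encoding are not reproduced here.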
By: | Kubra Bolukbas; Ertan Tok |
Abstract: | The goal of this study is to identify the most effective model for predicting credit risk, that is, the likelihood that a commercial loan defaults (becomes a non-performing loan), in the Turkish banking sector, and to determine which firm and loan characteristics influence that risk. The analysis draws on an unbalanced dataset of 1.2 million firm-level observations for 2018–2023, combining financial ratios with detailed loan- and firm-specific information. Class imbalance is addressed through oversampling (including SMOTE) and multiple down-sampling schemes. Although the risk is assessed ex-ante, model performance is evaluated ex-post using the ROC-AUC metric. Among the conventional econometric and machine learning approaches tested with different sampling techniques, Extreme Gradient Boosting (XGBoost) with oversampling delivers the best result, with a ROC-AUC score of 0.914. Compared with logistic regression under the same sampling setup, a 4.9-percentage-point increase in test ROC-AUC is attained, confirming the model’s superior predictive performance over conventional approaches. Accordingly, the study finds the industry and location in which a firm operates, its loan-restructuring status, loan cost and type (fixed vs. floating rate), the firm’s record of bad checks, and core ratios capturing profitability, liquidity, and leverage to be the most influential predictors of credit risk. |
Keywords: | Credit Risk, Machine Learning Techniques, Financial Ratios, Banking Sector, Macro-Financial Stability, Feature Importance |
JEL: | C52 C53 C55 G17 G2 G32 G33 |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:tcb:wpaper:2508 |
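The winning configuration above can be sketched compactly: oversample the minority (default) class with SMOTE on the training fold only, fit XGBoost, and score ROC-AUC out of sample. Synthetic data stands in for the confidential loan-level dataset, and the hyperparameters are illustrative.

    from imblearn.over_sampling import SMOTE
    from sklearn.datasets import make_classification
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=50000, n_features=20, weights=[0.95], random_state=0)  # ~5% defaults
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)   # balance the training fold only
    clf = XGBClassifier(n_estimators=400, max_depth=5, learning_rate=0.05, eval_metric="auc")
    clf.fit(X_res, y_res)
    print("test ROC-AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))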
By: | Millend Roy; Vladimir Pyltsov; Yinbo Hu |
Abstract: | Accurate electricity load forecasting is essential for grid stability, resource optimization, and renewable energy integration. While transformer-based deep learning models like TimeGPT have gained traction in time-series forecasting, their effectiveness in long-term electricity load prediction remains uncertain. This study evaluates forecasting models ranging from classical regression techniques to advanced deep learning architectures using data from the ESD 2025 competition. The dataset includes two years of historical electricity load data, alongside temperature and global horizontal irradiance (GHI) across five sites, with a one-day-ahead forecasting horizon. Since actual test-set load values remain undisclosed, leveraging predicted values would accumulate errors, making this a long-term forecasting challenge. We (i) employ Principal Component Analysis (PCA) for dimensionality reduction, (ii) frame the task as a regression problem, using temperature and GHI as covariates to predict load for each hour, and (iii) ultimately stack 24 models to generate yearly forecasts. Our results reveal that deep learning models, including TimeGPT, fail to consistently outperform simpler statistical and machine learning approaches due to the limited availability of training data and exogenous variables. In contrast, XGBoost, with minimal feature engineering, delivers the lowest error rates across all test cases while maintaining computational efficiency. This highlights the limitations of deep learning in long-term electricity forecasting and reinforces the importance of model selection based on dataset characteristics rather than complexity. Our study provides insights into practical forecasting applications and contributes to the ongoing discussion on the trade-offs between traditional and modern forecasting methods. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.11390 |
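The per-hour design above amounts to training 24 independent regressors keyed by hour of day. A minimal sketch, assuming a frame with timestamp, temperature, GHI, and load columns (the file and column names are placeholders):

    import pandas as pd
    from xgboost import XGBRegressor

    df = pd.read_csv("site_load.csv", parse_dates=["timestamp"])   # hypothetical two-year history
    df["hour"] = df["timestamp"].dt.hour

    models = {hour: XGBRegressor(n_estimators=300, max_depth=4).fit(grp[["temperature", "ghi"]], grp["load"])
              for hour, grp in df.groupby("hour")}                 # one model per hour of day

    def forecast(future: pd.DataFrame) -> pd.Series:
        """future: ['timestamp', 'temperature', 'ghi'] rows; routes each row to its hourly model."""
        hours = future["timestamp"].dt.hour
        return pd.Series([models[h].predict([[row["temperature"], row["ghi"]]])[0]
                          for h, (_, row) in zip(hours, future.iterrows())], index=future.index)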
By: | Paker, Meredith; Stephenson, Judy; Wallis, Patrick |
Abstract: | Understanding long-run economic growth requires reliable historical data, yet the vast majority of long-run economic time series are drawn from incomplete records with significant temporal and geographic gaps. Conventional solutions to these gaps rely on linear regressions that risk bias or overfitting when data are scarce. We introduce “past predictive modeling, ” a framework that leverages machine learning and out-of-sample predictive modeling techniques to reconstruct representative historical time series from scarce data. Validating our approach using nominal wage data from England, 1300-1900, we show that this new method leads to more accurate and generalizable estimates, with bootstrapped standard errors 72% lower than benchmark linear regressions. Beyond improving accuracy, these improved wage estimates for England yield new insights into the impact of the Black Death on inequality, the economic geography of pre-industrial growth, and productivity over the long run. |
Keywords: | machine learning; predictive modeling; wages; black death; industrial revolution |
JEL: | J31 C53 N33 N13 N63 |
Date: | 2025–06–13 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:128852 |
By: | Mahdi Kohan Sefidi |
Abstract: | Financial crises often occur without warning, yet markets leading up to these events display increasing volatility and complex interdependencies across multiple sectors. This study proposes a novel approach to predicting market crises by combining multilayer network analysis with Long Short-Term Memory (LSTM) models, using Granger causality to capture within-layer connections and Random Forest to model interlayer relationships. Specifically, we utilize Granger causality to model the temporal dependencies between market variables within individual layers, such as asset prices, trading values, and returns. To represent the interactions between different market variables across sectors, we apply Random Forest to model the interlayer connections, capturing the spillover effects between these features. The LSTM model is then trained to predict market instability and potential crises based on the dynamic features of the multilayer network. Our results demonstrate that this integrated approach, combining Granger causality, Random Forest, and LSTM, significantly enhances the accuracy of market crisis prediction, outperforming traditional forecasting models. This methodology provides a powerful tool for financial institutions and policymakers to better monitor systemic risks and take proactive measures to mitigate financial crises. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.11019 |
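The within-layer step above, wiring directed edges between series that Granger-cause one another, can be sketched as follows; the lag order, test choice, and 5% threshold are assumptions, and random data stands in for the market variables.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.stattools import grangercausalitytests

    rng = np.random.default_rng(0)
    layer = pd.DataFrame(rng.normal(size=(500, 3)), columns=["price", "value", "return"])

    def granger_adjacency(df: pd.DataFrame, maxlag: int = 5, alpha: float = 0.05) -> pd.DataFrame:
        """Directed adjacency matrix: 1 where the row series Granger-causes the column series."""
        adj = pd.DataFrame(0, index=df.columns, columns=df.columns)
        for cause in df.columns:
            for effect in df.columns:
                if cause == effect:
                    continue
                res = grangercausalitytests(df[[effect, cause]], maxlag=maxlag, verbose=False)
                pvals = [res[lag][0]["ssr_ftest"][1] for lag in res]
                adj.loc[cause, effect] = int(min(pvals) < alpha)
        return adj

    print(granger_adjacency(layer))

The resulting layer adjacencies, together with the Random-Forest-modeled interlayer links, would form the dynamic network features fed to the LSTM.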
By: | Junzhe Jiang; Chang Yang; Aixin Cui; Sihan Jin; Ruiyu Wang; Bo Li; Xiao Huang; Dongning Sun; Xinrun Wang |
Abstract: | Financial tasks are pivotal to global economic stability; however, their execution faces challenges including labor-intensive processes, low error tolerance, data fragmentation, and tool limitations. Although large language models (LLMs) have succeeded in various natural language processing tasks and have shown potential in automating workflows through reasoning and contextual understanding, current benchmarks for evaluating LLMs in finance lack sufficient domain-specific data, have simplistic task designs, and use incomplete evaluation frameworks. To address these gaps, this article presents FinMaster, a comprehensive financial benchmark designed to systematically assess the capabilities of LLMs in financial literacy, accounting, auditing, and consulting. Specifically, FinMaster comprises three main modules: i) FinSim, which builds simulators that generate synthetic, privacy-compliant financial data for companies to replicate market dynamics; ii) FinSuite, which provides tasks in core financial domains, spanning 183 tasks of various types and difficulty levels; and iii) FinEval, which develops a unified interface for evaluation. Extensive experiments over state-of-the-art LLMs reveal critical capability gaps in financial reasoning, with accuracy dropping from over 90% on basic tasks to merely 40% on complex scenarios requiring multi-step reasoning. This degradation exhibits the propagation of computational errors, where single-metric calculations that initially demonstrate 58% accuracy fall to 37% in multi-metric scenarios. To the best of our knowledge, FinMaster is the first benchmark that covers full-pipeline financial workflows with challenging tasks. We hope that FinMaster can bridge the gap between research and industry practitioners, driving the adoption of LLMs in real-world financial practices to enhance efficiency and accuracy. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.13533 |
By: | Yingjie Kuang; Tianchen Zhang; Zhen-Wei Huang; Zhongjie Zeng; Zhe-Yuan Li; Ling Huang; Yuefang Gao |
Abstract: | Accurately predicting customers' purchase intentions is critical to the success of a business strategy. Current research mainly focuses on analyzing the specific types of products that customers are likely to purchase in the future, while little attention has been paid to the critical question of whether customers will engage in repurchase behavior. Predicting whether a customer will make the next purchase is a classic time series forecasting task. However, in real-world purchasing behavior, customer groups typically exhibit imbalance: there are a large number of occasional buyers and a small number of loyal customers. This head-to-tail distribution means traditional time series forecasting methods face certain limitations when dealing with such problems. To address these challenges, this paper proposes a unified Clustering and Attention mechanism GRU model (CAGRU) that leverages multi-modal data for customer purchase intention prediction. The framework first performs customer profiling with respect to customer characteristics and clusters the customers to delineate the different customer clusters that contain similar features. Then, the time series features of different customer clusters are extracted by a GRU neural network, and an attention mechanism is introduced to capture the significance of sequence locations. Furthermore, to mitigate the head-to-tail distribution of customer segments, we train the model separately for each customer segment, to adapt to and capture more accurately the differences in behavioral characteristics between customer segments, as well as the similar characteristics of customers within the same segment. We constructed four datasets and conducted extensive experiments to demonstrate the superiority of the proposed CAGRU approach. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.13558 |
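A minimal PyTorch sketch of the per-segment predictor described above: a GRU encodes the purchase-history sequence and a soft attention layer weights sequence positions before a binary repurchase head. Dimensions are illustrative; following the paper's design, one such model would be trained per customer cluster.

    import torch
    import torch.nn as nn

    class AttentionGRU(nn.Module):
        def __init__(self, n_features: int, hidden: int = 32):
            super().__init__()
            self.gru = nn.GRU(n_features, hidden, batch_first=True)
            self.attn = nn.Linear(hidden, 1)     # scores each sequence position
            self.head = nn.Linear(hidden, 1)     # repurchase logit

        def forward(self, x):                    # x: (batch, seq_len, n_features)
            h, _ = self.gru(x)                   # (batch, seq_len, hidden)
            w = torch.softmax(self.attn(h), dim=1)   # attention weights over positions
            context = (w * h).sum(dim=1)         # weighted sum of hidden states
            return self.head(context).squeeze(-1)    # logits; apply sigmoid for probabilities

    model = AttentionGRU(n_features=6)
    logits = model(torch.randn(8, 30, 6))        # 8 customers, 30-step histories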
By: | Dumas, Christelle (University of Fribourg, Switzerland); Gautrain, Elsa (University of Fribourg, Switzerland); Gosselin-Pali, Adrien (Université Clermont Auvergne) |
Abstract: | In sub-Saharan Africa, child fostering—a widespread practice in which a child moves out of the household of her biological parents—can have significant implications for a child’s overall well-being. Using longitudinal data from South Africa that includes individual tracking, we employ double machine learning techniques to evaluate the impact of fostering on nutrition, addressing biases related to selection into treatment and endogenous attrition, two common challenges in the literature. Our findings reveal that fostering reduces the probability of being stunted by 6.8 percentage points, corresponding to a 37 percent reduction compared to the mean prevalence. This improvement appears to be driven by foster children relocating to smaller, rural households, often including retired individuals, typically grandparents, who receive a pension. Furthermore, we find that it not only enhances the nutritional status of foster children but also benefits the nutrition of other children from sending households, suggesting that fostering can be mutually beneficial for both groups. |
Keywords: | Child Fostering; Nutrition; Machine Learning; South Africa |
JEL: | I15 J12 J13 O15 C14 |
Date: | 2025–07–01 |
URL: | https://d.repec.org/n?u=RePEc:fri:fribow:fribow00542 |
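The estimator behind the result above can be illustrated with the standard partialling-out form of double machine learning: predict the outcome and the treatment from controls with flexible learners, then regress residual on residual. The data below are synthetic, with the treatment effect set to echo the 6.8-percentage-point estimate reported above; nothing here reproduces the South African panel.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 10))                    # household controls
    d = rng.binomial(1, 0.2, size=5000)                # fostering indicator
    y = -0.068 * d + 0.1 * X[:, 0] + rng.normal(scale=0.5, size=5000)  # synthetic stunting outcome

    m_hat = cross_val_predict(RandomForestRegressor(random_state=0), X, y, cv=5)   # E[y|X]
    e_hat = cross_val_predict(RandomForestClassifier(random_state=0), X, d, cv=5,
                              method="predict_proba")[:, 1]                        # E[d|X]
    theta = np.sum((y - m_hat) * (d - e_hat)) / np.sum((d - e_hat) ** 2)           # residual-on-residual
    print("estimated effect of fostering:", theta)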
By: | Sander de Vries (Vrije Universiteit Amsterdam and Tinbergen Institute) |
Abstract: | This paper provides new insights on the importance of family background by linking 1.7 million Dutch children’s incomes to an exceptionally rich set of family characteristics — including income, wealth, education, occupation, crime, and health. Using a machine learning approach, I show that conventional analyses using parental income only considerably underestimate intergenerational dependence. This underestimation is concentrated at the extremes of the child income distribution, where families are often (dis)advantaged across multiple dimensions. Gender differences in intergenerational dependence are minimal, despite allowing for complex gender-specific patterns. A comparison with adoptees highlights the role of pre-birth factors in driving intergenerational transmission. |
Keywords: | Intergenerational mobility, inequality of opportunity |
JEL: | I24 J24 J62 |
Date: | 2025–02–14 |
URL: | https://d.repec.org/n?u=RePEc:tin:wpaper:20250010 |
By: | Capistrano, Daniel (University College Dublin); Creighton, Mathew (University College Dublin); Fernández-Reino, Mariña |
Abstract: | In this study, we assessed whether Large Language Models provided biased answers when prompted to assist with the evaluation of requests made by individuals of different ethnic backgrounds and genders. We emulated an experimental procedure traditionally used in correspondence studies to test discrimination in social settings. The recommendations given by the language models were compared across groups, revealing a significant bias against names associated with ethnic minorities, particularly in the housing domain. However, the magnitude of this ethnic bias, as well as differences by gender, depended on the context mentioned in the prompt to the model. Finally, directing the model to take into consideration regulatory provisions on Artificial Intelligence or on potential gender and ethnic discrimination does not seem to mitigate the observed bias between groups. |
Date: | 2025–07–06 |
URL: | https://d.repec.org/n?u=RePEc:osf:socarx:9zusq_v1 |
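The audit design above follows the correspondence-study template: hold the request fixed, vary only the applicant name, and compare the models' recommendations across name groups. A stylized sketch, in which query_model is a hypothetical stand-in for the audited LLM and the name pools are illustrative:

    from collections import defaultdict

    PROMPT = "Two applicants ask to rent the flat. Recommend one: {a} or {b}."
    majority, minority = ["Sean Murphy"], ["Mohammed Ali"]          # illustrative name pools

    def query_model(prompt: str) -> str:
        """Hypothetical stand-in; replace with a call to the audited LLM."""
        return prompt.split(": ")[1].split(" or ")[0]               # dummy: always picks the first name

    counts = defaultdict(int)
    for name_maj in majority:
        for name_min in minority:
            for a, b in [(name_maj, name_min), (name_min, name_maj)]:  # rotate order to cancel position bias
                answer = query_model(PROMPT.format(a=a, b=b))
                counts["majority" if name_maj in answer else "minority"] += 1
    print(dict(counts))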
By: | Gianluca De Nard; Damjan Kostovic |
Abstract: | The paper introduces a new type of shrinkage estimation that is not based on asymptotic optimality but uses artificial intelligence (AI) techniques to shrink the sample eigenvalues. The proposed AI Shrinkage estimator applies to both linear and nonlinear shrinkage, demonstrating improved performance compared to the classic shrinkage estimators. Our results demonstrate that reinforcement learning solutions identify a downward bias in classic shrinkage intensity estimates derived under the i.i.d. assumption and automatically correct for it in response to prevailing market conditions. Additionally, our data-driven approach enables more efficient implementation of risk-optimized portfolios and is well-suited for real-world investment applications including various optimization constraints. |
Keywords: | Covariance matrix estimation, linear and nonlinear shrinkage, portfolio management, reinforcement learning, risk optimization |
JEL: | C13 C58 G11 |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:zur:econwp:470 |
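For context on the baseline being improved upon, classic linear shrinkage of the sample covariance is a one-liner in scikit-learn; the proposed AI estimator instead learns the eigenvalue shrinkage with reinforcement learning, which this sketch does not reproduce.

    import numpy as np
    from sklearn.covariance import LedoitWolf

    rng = np.random.default_rng(0)
    returns = rng.normal(size=(250, 100))      # T=250 days, N=100 assets: high concentration N/T
    lw = LedoitWolf().fit(returns)
    print("linear shrinkage intensity:", lw.shrinkage_)   # weight placed on the structured target
    sigma_hat = lw.covariance_                 # shrunk covariance, usable in risk optimization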
By: | Lo, Chi-Sheng |
Abstract: | This study explores whether a NASDAQ-100 derivatives ETF portfolio can outperform the Invesco QQQ Trust (QQQ) using a Deep Reinforcement Learning framework based on Proximal Policy Optimization (PPO). The portfolio dynamically allocates across three NASDAQ-100 derivative ETFs: YQQQ (short options income), QYLD (covered calls), and TQQQ (3x leveraged), employing Isolation Forest anomaly detection to optimize rebalancing timing. A train-validation-test framework (2010-2018 training, 2019-2023 validation, 2024-2025 testing) utilizes a multi-objective function to balance tracking-error minimization and excess-return maximization, integrating dividend payments and combining quarterly with event-driven rebalancing. The results show significant alpha generation over QQQ by leveraging YQQQ’s inverse exposure, QYLD’s income stability, and TQQQ’s leveraged growth. Though the strategy experiences higher volatility and drawdowns, the PPO agent skillfully optimizes allocations, achieving positive excess returns in the testing phase, with performance varying by market condition, emphasizing the need for adaptive strategies in dynamic markets. |
Keywords: | Deep reinforcement learning, enhanced index tracking, isolation forest, QQQ, Nasdaq 100, exchange traded fund, options derivatives |
JEL: | C32 C44 C61 |
Date: | 2025–07–10 |
URL: | https://d.repec.org/n?u=RePEc:pra:mprapa:125307 |
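The timing component above can be sketched with scikit-learn's Isolation Forest: fit on a history of market features and trigger an event-driven rebalance when the current day is flagged anomalous. The features, training window, and contamination rate are assumptions; the PPO allocation policy itself is not reproduced.

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    features = rng.normal(size=(1000, 3))      # e.g. daily return, volatility, and volume z-scores

    detector = IsolationForest(contamination=0.05, random_state=0).fit(features[:750])
    for t in range(750, 1000):
        if detector.predict(features[t : t + 1])[0] == -1:   # -1 flags an anomaly
            pass  # event-driven rebalance: let the PPO agent reset YQQQ/QYLD/TQQQ weights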