|
on Financial Markets |
By: | Shasha Yu; Qinchen Zhang; Yuwei Zhao |
Abstract: | This project aims to predict short-term and long-term upward trends in the S&P 500 index using machine learning models and feature engineering based on the "101 Formulaic Alphas" methodology. The study employed multiple models, including Logistic Regression, Decision Trees, Random Forests, Neural Networks, K-Nearest Neighbors (KNN), and XGBoost, to identify market trends from historical stock data collected from Yahoo! Finance. Data preprocessing involved handling missing values, standardization, and iterative feature selection to ensure relevance and variability. For short-term predictions, KNN emerged as the most effective model, delivering robust performance with high recall for upward trends, while for long-term forecasts, XGBoost demonstrated the highest accuracy and AUC scores after hyperparameter tuning and class imbalance adjustments using SMOTE. Feature importance analysis highlighted the dominance of momentum-based and volume-related indicators in driving predictions. However, models exhibited limitations such as overfitting and low recall for positive market movements, particularly in imbalanced datasets. The study concludes that KNN is ideal for short-term alerts, whereas XGBoost is better suited for long-term trend forecasting. Future enhancements could include advanced architectures like Long Short-Term Memory (LSTM) networks and further feature refinement to improve precision and generalizability. These findings contribute to developing reliable machine learning tools for market trend prediction and investment decision-making. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.11462 |
By: | Jiajun Gu; Zichen Yang; Xintong Lin; Sixun Chen; YuTing Lu |
Abstract: | This project investigates the interplay of technical, market, and statistical factors in predicting stock market performance, with a primary focus on S&P 500 companies. Utilizing a comprehensive dataset spanning multiple years, the analysis constructs advanced financial metrics, such as momentum indicators, volatility measures, and liquidity adjustments. The machine learning framework is employed to identify patterns, relationships, and predictive capabilities of these factors. The integration of traditional financial analytics with machine learning enables enhanced predictive accuracy, offering valuable insights into market behavior and guiding investment strategies. This research highlights the potential of combining domain-specific financial expertise with modern computational tools to address complex market dynamics. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.12438 |
By: | Viral V. Acharya; Markus K. Brunnermeier; Diane Pierret |
Abstract: | We assess the efficacy of systemic risk measures that rely on U.S. financial firms’ stock return co-movements with market- or sector-wide returns under stress from 1927 to 2023. We ascertain stress episodes based on widening of corporate bond spreads and narrative dating. Systemic risk measures exhibit substantial and robust predictive power in explaining the cross-section of market realized outcomes, viz., volatility and returns, during stress episodes. The measures also help predict bank failures and balance-sheet outcomes, confirming their relevance for understanding risks to the real economy emanating from banking sector fragility. Overall, market-based systemic risk measures offer a promising complement to macro-prudential and supervisory assessments of the financial sector. |
JEL: | G01 G20 G21 G23 G28 |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33211 |
By: | Maria Teresa Medeiros Garcia; Carolina e Silva Correia de Carvalho |
Abstract: | This paper provides insights into the impact of sentiment factors on stock market volatility using monthly panel data from Germany, the UK and the US from 2002-2022. The main objective is to understand how the consumer confidence index, the trading volume, the put/call ratio, and the number of IPOs - components of the sentiment index used in this research - affect the volatility of the DAX 40, FTSE 100, and S&P 500 indices, respectively. The results suggest that investor sentiment has an impact on market volatility in all three indices. A higher consumer confidence index correlates with lower volatility, suggesting that positive sentiment stabilizes markets. Conversely, increased trading volume and a higher put/call ratio are associated with increased volatility, reflecting greater market activity and investor uncertainty. In addition, the number of IPOs serves as a sentiment gauge, with increased IPO activity corresponding to a more optimistic market outlook and contributing to lower volatility. Overall, the results underscore the importance of integrating sentiment measures into financial analysis and provide valuable insights for investors and policymakers seeking to understand and manage market fluctuations. This research contributes to the behavioural finance literature by elucidating the complex interplay between investor sentiment and stock market behaviour. |
Keywords: | sentiment; volatility; stock market. |
JEL: | G12 G14 G17 C58 E44 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:ise:remwps:wp03652025 |
By: | Gang Huang; Xiaohua Zhou; Qingyang Song |
Abstract: | Artificial intelligence is fundamentally transforming financial investment decision-making paradigms, with deep reinforcement learning (DRL) demonstrating significant application potential in domains such as robo-advisory services. Given that traditional portfolio optimization methods face significant challenges in effectively managing dynamic asset weight adjustments, this paper approaches the problem from the perspective of practical trading processes and develops a dynamic optimization model using deep reinforcement learning to achieve more effective asset allocation. The study's innovations are twofold: First, we propose a Sharpe ratio reward function specifically designed for Actor-Critic deep reinforcement learning algorithms, which optimizes portfolio performance by maximizing the average Sharpe ratio through random sampling and reinforcement learning algorithms during the training process; Second, we design deep neural networks that are specifically structured to meet asset optimization objectives. The study empirically evaluates the model using randomly selected constituent stocks from the CSI300 index and conducts comparative analyses against traditional approaches, including mean-variance optimization and risk parity strategies. Backtesting results demonstrate the dynamic optimization model's effectiveness in portfolio asset allocation, yielding enhanced risk reduction, superior risk-return metrics, and optimal performance across comprehensive evaluation criteria. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.18563 |
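The Sharpe ratio reward mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the reward is the plain average of annualized Sharpe ratios computed over several sampled windows of portfolio returns. The function names, the zero risk-free rate, and the 252-period annualization are illustrative assumptions, not the authors' implementation.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a sequence of per-period returns.

    Needs at least two observations because the sample standard
    deviation is used.
    """
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)
    if sd == 0.0:
        return 0.0  # convention chosen here for a flat return series
    return mean(excess) / sd * math.sqrt(periods_per_year)


def average_sharpe_reward(return_windows):
    """Hypothetical reward: mean Sharpe ratio over sampled return windows."""
    return mean(sharpe_ratio(window) for window in return_windows)


# Two sampled windows of daily portfolio returns (made-up numbers).
windows = [[0.01, 0.03, -0.01], [0.00, 0.02, 0.01]]
reward = average_sharpe_reward(windows)
```

In an Actor-Critic setup a scalar like `reward` would be fed back to the learner after each sampled batch; how the paper does this in detail is not stated in the abstract.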
By: | Akash Deep; Chris Monico; Abootaleb Shirvani; Svetlozar Rachev; Frank J. Fabozzi |
Abstract: | This study evaluates the performance of random forest regression models enhanced with technical indicators for high-frequency stock price prediction. Using minute-level SPY data, we assessed 13 models that incorporate technical indicators such as Bollinger bands, exponential moving average, and Fibonacci retracement. While these models improved risk-adjusted performance metrics, they struggled with out-of-sample generalization, highlighting significant overfitting challenges. Feature importance analysis revealed that primary price-based features consistently outperformed technical indicators, suggesting that the indicators have limited utility in high-frequency trading contexts. These findings challenge the weak form of the efficient market hypothesis, identifying short-lived inefficiencies during volatile periods that show limited persistence across market regimes. The study emphasizes the need for selective feature engineering, adaptive modeling, and a stronger focus on risk-adjusted performance metrics to navigate the complexities of high-frequency trading environments. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.15448 |
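Two of the technical indicators named in the abstract above, the exponential moving average and Bollinger bands, can be sketched as follows. This uses the common textbook definitions (smoothing factor 2 / (span + 1), bands at k population standard deviations around a rolling mean). The window lengths, the seeding of the average with the first price, and the function names are assumptions for illustration and may differ from the features the authors built.

```python
from statistics import mean, pstdev


def ema(prices, span):
    """Exponential moving average with smoothing alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]  # seed with the first price (an assumption)
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out


def bollinger_bands(prices, window=20, k=2.0):
    """Return a (lower, middle, upper) tuple for each full rolling window."""
    bands = []
    for i in range(window - 1, len(prices)):
        chunk = prices[i - window + 1 : i + 1]
        mid = mean(chunk)
        sd = pstdev(chunk)  # population standard deviation of the window
        bands.append((mid - k * sd, mid, mid + k * sd))
    return bands


# Made-up minute-level closing prices.
closes = [100.0, 100.5, 100.2, 101.0, 100.8, 101.4]
smoothed = ema(closes, span=3)
bands = bollinger_bands(closes, window=3)
```

Each output could then be added as a feature column next to the raw price-based features that the abstract reports as more informative.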
By: | Adair Morse; Parinitha R. Sastry |
Abstract: | Banks have voluntarily committed to align their lending portfolios with a net zero path toward a decarbonized economy. In this review, we explore the economic channels for why portfolio decarbonization might be consistent with lender profit maximization. We frame the question by positing that net zero lending may create differential value through the channels of risk and returns, where return topics span profit margins and lending book growth arguments. We then use the lens of the frame to survey the literature and speak to gaps in research knowledge. We uncover multiple roles for risk arguments influencing decarbonization. Moreover, decarbonization and green investment are tied to enhanced profitability through bank lending growth. Yet, the literature has many dots yet to connect. We suggest that future work may draw further connections between the literature in climate finance and the broader literature in banking, to enhance our understanding of the role that banks will play in the net zero transition. |
JEL: | G21 G28 G31 Q54 |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33148 |
By: | Yilie Huang; Yanwei Jia; Xun Yu Zhou |
Abstract: | We study continuous-time mean-variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL algorithm that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black-Scholes markets without factors, we further devise a baseline algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of Sharpe ratio. For performance enhancement and practical implementation, we modify the baseline algorithm into four variants, and carry out an extensive empirical study to compare their performance, in terms of a host of common metrics, with a large number of widely used portfolio allocation strategies on S&P 500 constituents. The results demonstrate that the continuous-time RL strategies are consistently among the best, especially in a volatile bear market, and decisively outperform the model-based continuous-time counterparts by significant margins. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.16175 |
By: | Olamilekan Shobayo; Sidikat Adeyemi-Longe; Olusogo Popoola; Bayode Ogunleye |
Abstract: | This study explores the comparative performance of cutting-edge AI models, i.e., Finance Bidirectional Encoder Representations from Transformers (FinBERT), Generative Pre-trained Transformer 4 (GPT-4), and Logistic Regression, for sentiment analysis and stock index prediction using financial news and the NGX All-Share Index data label. By leveraging advanced natural language processing models like GPT-4 and FinBERT, alongside a traditional machine learning model, Logistic Regression, we aim to classify market sentiment, generate sentiment scores, and predict market price movements. This research highlights global AI advancements in stock markets, showcasing how state-of-the-art language models can contribute to understanding complex financial data. The models were assessed using metrics such as accuracy, precision, recall, F1 score, and ROC AUC. Results indicate that Logistic Regression outperformed the more computationally intensive FinBERT and the predefined approach of the versatile GPT-4, with an accuracy of 81.83% and a ROC AUC of 89.76%. The GPT-4 predefined approach exhibited a lower accuracy of 54.19% but demonstrated strong potential in handling complex data. FinBERT, while offering more sophisticated analysis, was resource-demanding and yielded moderate performance. Hyperparameter optimization using Optuna and cross-validation techniques ensured the robustness of the models. This study highlights the strengths and limitations of the practical applications of AI approaches in stock market prediction and presents Logistic Regression as the most efficient model for this task, with FinBERT and GPT-4 representing emerging tools with potential for future exploration and innovation in AI-driven financial analytics. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.06837 |
By: | Lin William Cong; Ke Tang; Danxia Xie; Weiyi Zhao |
Abstract: | We conceptually identify and empirically verify using marketplace lending data the features distinguishing FinTech platforms from non-financial platforms: (i) Long-term contracts introducing default risk at both the individual and platform levels; (ii) Lenders’ investment diversification to mitigate individual default risk; (iii) Platform-level default risk leading to greater asymmetric user stickiness and rendering platform-level cross-side network effects (p-CNEs), a novel metric we introduce, crucial for adoption and market dynamics. We incorporate these features into a model of a two-sided FinTech platform with potential failures and endogenous participation/fees. The model predicts lenders’ single-homing, occasional lower fees for borrowers, asymmetric p-CNEs, and the predictive power of lenders’ p-CNEs in forecasting platform failures. Marketplace lending in China empirically corroborates our model predictions in this dynamic industry characterized by entries, exits, and network externalities. Specifically, lenders’ p-CNEs are empirically lower on declining or more established platforms compared to growing or new ones. Moreover, lenders’ p-CNEs predict platforms’ survival likelihood among others, even at very early stages. Our findings provide novel economic insights on multi-sided FinTech platforms for both practitioners and regulators. |
JEL: | G19 G23 |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33173 |
By: | Yixuan Liang; Yuncong Liu; Boyu Zhang; Christina Dan Wang; Hongyang Yang |
Abstract: | Financial sentiment analysis is crucial for understanding the influence of news on stock prices. Recently, large language models (LLMs) have been widely adopted for this purpose due to their advanced text analysis capabilities. However, these models often only consider the news content itself, ignoring its dissemination, which hampers accurate prediction of short-term stock movements. Additionally, current methods often lack sufficient contextual data and explicit instructions in their prompts, limiting LLMs' ability to interpret news. In this paper, we propose a data-driven approach that enhances LLM-powered sentiment-based stock movement predictions by incorporating news dissemination breadth, contextual data, and explicit instructions. We cluster recent company-related news to assess its reach and influence, enriching prompts with more specific data and precise instructions. This data is used to construct an instruction tuning dataset to fine-tune an LLM for predicting short-term stock price movements. Our experimental results show that our approach improves prediction accuracy by 8% compared to existing methods. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.10823 |
By: | Laura Chioda; Paul Gertler; Sean Higgins; Paolina C. Medina |
Abstract: | Despite the promise of FinTech lending to expand access to credit to populations without a formal credit history, FinTech lenders primarily lend to applicants with a formal credit history and rely on conventional credit bureau scores as an input to their algorithms. Using data from a large FinTech lender in Mexico, we show that alternative data from digital transactions through a delivery app are effective at predicting creditworthiness for borrowers with no credit history. We also show that segmenting our machine learning model by gender can improve credit allocation fairness without a substantive effect on the model’s predictive performance. |
JEL: | G23 G5 O16 |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33208 |
By: | Siqiao Zhao; Dan Wang; Raphael Douady |
Abstract: | The domain of hedge fund investments is undergoing significant transformation, influenced by the rapid expansion of data availability and the advancement of analytical technologies. This study explores the enhancement of hedge fund investment performance through the integration of machine learning techniques, the application of PolyModel feature selection, and the analysis of fund size. We address three critical questions: (1) the effect of machine learning on trading performance, (2) the role of PolyModel feature selection in fund selection and performance, and (3) the comparative reliability of larger versus smaller funds. Our findings offer compelling insights. We observe that while machine learning techniques enhance cumulative returns, they also increase annual volatility, indicating variability in performance. PolyModel feature selection proves to be a robust strategy, with approaches that utilize a comprehensive set of features for fund selection outperforming more selective methodologies. Notably, Long-Term Stability (LTS) effectively manages portfolio volatility while delivering favorable returns. Contrary to popular belief, our results suggest that larger funds do not consistently yield better investment outcomes, challenging the assumption of their inherent reliability. This research highlights the transformative impact of data-driven approaches in the hedge fund investment arena and provides valuable implications for investors and asset managers. By leveraging machine learning and PolyModel feature selection, investors can enhance portfolio optimization and reassess the dependability of larger funds, leading to more informed investment strategies. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.11019 |
By: | Zong Ke; Yuchen Yin |
Abstract: | With the increasing application of AI in finance, this paper leverages AI algorithms to examine tail risk and develops a model to alter tail risk, with the aim of promoting the stability of US financial markets and enhancing the resilience of the US economy. Specifically, the paper constructs a multivariate multilevel CAViaR model, optimized by gradient descent and a genetic algorithm, to study the tail risk spillover between the US stock market, foreign exchange market, and credit market. The model is used to provide early warning of related risks in US stocks, US credit bonds, etc. An analysis of the direction, magnitude, and pseudo-impulse response of the risk spillover shows that the credit market's spillover effect on the stock market and its duration are both greater than the spillover effects of the other markets on the credit market, placing the credit market in a central position for warning of extreme risks. Its historical information on extreme risks can serve as a predictor of the VaR of other markets. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.06193 |
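The CAViaR approach named in the abstract above models a conditional quantile (and hence VaR) as an autoregressive process. The paper's multivariate multilevel specification is not given in the abstract, so the sketch below shows only the simplest univariate form, the symmetric absolute value recursion q_t = beta0 + beta1 * q_{t-1} + beta2 * |r_{t-1}|, together with the quantile (pinball) loss that an optimizer such as gradient descent would minimize. The parameter values and function names are illustrative assumptions, and q_t is treated here as a left-tail return quantile, so it is typically negative.

```python
def sav_caviar(returns, beta0, beta1, beta2, q0):
    """Univariate symmetric absolute value CAViaR recursion.

    Returns one quantile per observation, starting from the initial
    value q0, which must be supplied by the caller.
    """
    q = [q0]
    for r_prev in returns[:-1]:
        q.append(beta0 + beta1 * q[-1] + beta2 * abs(r_prev))
    return q


def pinball_loss(returns, quantiles, tau):
    """Average quantile loss at level tau; lower is a better fit."""
    total = 0.0
    for r, q in zip(returns, quantiles):
        u = r - q
        total += u * (tau - (1.0 if u < 0 else 0.0))
    return total / len(returns)


# Made-up daily returns and hypothetical parameter values.
rets = [0.01, -0.02, 0.03, -0.01]
q_path = sav_caviar(rets, beta0=0.0, beta1=0.5, beta2=-1.0, q0=-0.02)
loss = pinball_loss(rets, q_path, tau=0.05)
```

A spillover model of the kind the abstract describes would add lagged quantiles or returns from the other markets to the right-hand side of the recursion; that extension is not shown here.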