on Computational Economics |
By: | Kuan-Ming Liu; Ming-Chih Lo |
Abstract: | Recent advances in deep learning and large language models (LLMs) have facilitated the deployment of the mixture-of-experts (MoE) mechanism in the stock investment domain. While these models have demonstrated promising trading performance, they are often unimodal, neglecting the wealth of information available in other modalities, such as textual data. Moreover, the traditional neural network-based router selection mechanism fails to consider contextual and real-world nuances, resulting in suboptimal expert selection. To address these limitations, we propose LLMoE, a novel framework that employs LLMs as the router within the MoE architecture. Specifically, we replace the conventional neural network-based router with LLMs, leveraging their extensive world knowledge and reasoning capabilities to select experts based on historical price data and stock news. This approach provides a more effective and interpretable selection mechanism. Our experiments on multimodal real-world stock datasets demonstrate that LLMoE outperforms state-of-the-art MoE models and other deep neural network approaches. Additionally, the flexible architecture of LLMoE allows for easy adaptation to various downstream tasks. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.09636 |
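The routing idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two experts, the `moe_predict` helper, and both routers are hypothetical stand-ins, and the "LLM" router is mocked by a keyword rule so that the example stays self-contained. The only point is that the router is a pluggable function returning one weight per expert, so a text-aware router can replace a purely numeric one.

```python
import math
from typing import Callable, List, Sequence

Features = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over raw router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two toy experts (hypothetical): each maps price features to a signal in [-1, 1].
def bullish_expert(prices: Features) -> float:
    return 1.0


def bearish_expert(prices: Features) -> float:
    return -1.0


EXPERTS = [bullish_expert, bearish_expert]


def neural_style_router(prices: Features, news: str) -> List[float]:
    """Conventional-style router: uses numeric features only and ignores the news."""
    momentum = prices[-1] - prices[0]
    return softmax([momentum, -momentum])


def mock_llm_router(prices: Features, news: str) -> List[float]:
    """Stand-in for an LLM call (assumption): a keyword rule that reads the news."""
    if "downgrade" in news.lower():
        return [0.0, 1.0]
    return [1.0, 0.0]


def moe_predict(
    prices: Features,
    news: str,
    router: Callable[[Features, str], List[float]],
) -> float:
    """Weighted combination of expert outputs using the router's weights."""
    weights = router(prices, news)
    return sum(w * expert(prices) for w, expert in zip(weights, EXPERTS))
```

With flat prices and a negative headline, the numeric router splits its weight evenly between the experts, while the text-aware router commits to the bearish expert.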
By: | Jingfeng Chen; Wanlin Deng; Dangxing Chen; Luyao Zhang |
Abstract: | Machine learning is critical for innovation and efficiency in financial markets, offering predictive models and data-driven decision-making. However, challenges such as missing data, lack of transparency, untimely updates, insecurity, and incompatible data sources limit its effectiveness. Blockchain technology, with its transparency, immutability, and real-time updates, addresses these challenges. We present a framework for integrating high-frequency on-chain data with low-frequency off-chain data, providing a benchmark for addressing novel research questions in economic mechanism design. This framework generates modular, extensible datasets for analyzing economic mechanisms such as the Transaction Fee Mechanism, enabling multi-modal insights and fairness-driven evaluations. Using four machine learning techniques, including linear regression, deep neural networks, XGBoost, and LSTM models, we demonstrate the framework's ability to produce datasets that advance financial research and improve understanding of blockchain-driven systems. Our contributions include: (1) proposing a research scenario for the Transaction Fee Mechanism and demonstrating how the framework addresses previously unexplored questions in economic mechanism design; (2) providing a benchmark for financial machine learning by open-sourcing a sample dataset generated by the framework and the code for the pipeline, enabling continuous dataset expansion; and (3) promoting reproducibility, transparency, and collaboration by fully open-sourcing the framework and its outputs. This initiative supports researchers in extending our work and developing innovative financial machine-learning models, fostering advancements at the intersection of machine learning, blockchain, and economics. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.16277 |
By: | Adamantios Ntakaris; Gbenga Ibikunle |
Abstract: | This study presents an autonomous experimental machine learning protocol for high-frequency trading (HFT) stock price forecasting that involves a dual competitive feature importance mechanism and clustering via shallow neural network topology for fast training. By incorporating the k-means algorithm into the radial basis function neural network (RBFNN), the proposed method addresses the challenges of manual clustering and the reliance on potentially uninformative features. More specifically, our approach involves a dual competitive mechanism for feature importance, combining the mean-decrease impurity (MDI) method and a gradient descent (GD) based feature importance mechanism. This approach, tested on HFT Level 1 order book data for 20 S&P 500 stocks, enhances the forecasting ability of the RBFNN regressor. Our findings suggest that an autonomous approach to feature selection and clustering is crucial, as each stock requires a different input feature space. Overall, by automating the feature selection and clustering processes, we remove the need for manual topological grid search and provide a more efficient way to predict LOB's mid-price. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.16160 |
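As a rough illustration of how k-means can supply the centres of a radial basis function layer, the sketch below clusters one-dimensional inputs and then computes Gaussian RBF activations. It is a simplification under stated assumptions, not the protocol from the paper: the inputs are scalar rather than multi-feature order book vectors, the initialisation is a deterministic spread over the sorted data, and the names `kmeans_1d` and `rbf_features` are illustrative.

```python
import math
from typing import List, Sequence


def kmeans_1d(points: Sequence[float], k: int, iters: int = 20) -> List[float]:
    """Plain Lloyd's algorithm on scalars with a deterministic initialisation."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    ordered = sorted(points)
    # Spread the initial centres evenly across the sorted data (assumption).
    centres = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centres[c]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty.
        centres = [
            sum(members) / len(members) if members else centres[i]
            for i, members in enumerate(clusters)
        ]
    return centres


def rbf_features(x: float, centres: Sequence[float], width: float) -> List[float]:
    """Gaussian RBF activation of x for each centre."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centres]
```

A linear regressor fitted on the output of `rbf_features` would complete a basic RBF network; the feature-importance mechanisms described in the abstract are not reproduced here.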
By: | Suyeol Yun |
Abstract: | Developing effective quantitative trading strategies using reinforcement learning (RL) is challenging due to the high risks associated with online interaction with live financial markets. Consequently, offline RL, which leverages historical market data without additional exploration, becomes essential. However, existing offline RL methods often struggle to capture the complex temporal dependencies inherent in financial time series and may overfit to historical patterns. To address these challenges, we introduce a Decision Transformer (DT) initialized with pre-trained GPT-2 weights and fine-tuned using Low-Rank Adaptation (LoRA). This architecture leverages the generalization capabilities of pre-trained language models and the efficiency of LoRA to learn effective trading policies from expert trajectories solely from historical data. Our model performs competitively with established offline RL algorithms, including Conservative Q-Learning (CQL), Implicit Q-Learning (IQL), and Behavior Cloning (BC), as well as a baseline Decision Transformer with randomly initialized GPT-2 weights and LoRA. Empirical results demonstrate that our approach effectively learns from expert trajectories and secures superior rewards in certain trading scenarios, highlighting the effectiveness of integrating pre-trained language models and parameter-efficient fine-tuning in offline RL for quantitative trading. Replication code for our experiments is publicly available at https://github.com/syyunn/finrl-dt |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.17900 |
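To make the efficiency argument for LoRA concrete, the following sketch counts trainable parameters for one weight matrix. LoRA freezes a weight matrix of shape `d_out x d_in` and trains two low-rank factors of shapes `d_out x r` and `r x d_in`. The hidden size of 768 matches the smallest GPT-2 model; the rank of 8 is an assumed value chosen for illustration, since the abstract does not state the rank used.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters of the two low-rank factors B (d_out x r) and A (r x d_in)."""
    if rank < 1:
        raise ValueError("rank must be positive")
    return rank * (d_out + d_in)


HIDDEN = 768  # hidden size of the smallest GPT-2 model
RANK = 8      # assumed LoRA rank, for illustration only

full = full_finetune_params(HIDDEN, HIDDEN)
low_rank = lora_params(HIDDEN, HIDDEN, RANK)
print(f"full: {full}, LoRA: {low_rank}, ratio: {low_rank / full:.4f}")
```

Under these assumptions the low-rank update trains roughly 2% of the parameters of a single 768 x 768 projection.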
By: | Muhammad Arslan (ICB - Laboratoire Interdisciplinaire Carnot de Bourgogne - UTBM - Université de Technologie de Belfort-Montbeliard - UB - Université de Bourgogne - UBFC - Université Bourgogne Franche-Comté [COMUE] - CNRS - Centre National de la Recherche Scientifique); Christophe Cruz (ICB - Laboratoire Interdisciplinaire Carnot de Bourgogne - UTBM - Université de Technologie de Belfort-Montbeliard - UB - Université de Bourgogne - UBFC - Université Bourgogne Franche-Comté [COMUE] - CNRS - Centre National de la Recherche Scientifique) |
Abstract: | Enterprises depend on diverse data like invoices, news articles, legal documents, and financial records to operate. Efficient Information Extraction (IE) is essential for extracting valuable insights from this data for decision-making. Natural Language Processing (NLP) has transformed IE, enabling rapid and accurate analysis of vast datasets. Tasks such as Named Entity Recognition (NER), Relation Extraction (RE), Event Extraction (EE), Term Extraction (TE), and Topic Modeling (TM) are vital across sectors. Yet, implementing these methods individually can be resource-intensive, especially for smaller organizations lacking in Research and Development (R&D) capabilities. Large Language Models (LLMs), powered by Generative Artificial Intelligence (GenAI), offer a cost-effective solution, seamlessly handling multiple IE tasks. Despite their capabilities, LLMs may struggle with domain-specific queries, leading to inaccuracies. To overcome this challenge, Retrieval-Augmented Generation (RAG) complements LLMs by enhancing IE with external data retrieval, ensuring accuracy and relevance. While the adoption of RAG with LLMs is increasing, comprehensive business applications utilizing this integration remain limited. This paper addresses this gap by introducing a novel application named Business-RAG, showcasing its potential and encouraging further research in this domain. |
Keywords: | Business Intelligence (BI), Decision-Making, Information Extraction (IE), Large Language Models (LLMs), Natural Language Processing (NLP), Retrieval-Augmented Generation (RAG) |
Date: | 2024–07–09 |
URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-04862172 |
By: | Fischer, Manfred M. |
Abstract: | Convolutional Neural Networks (CNNs) are a specialized class of deep neural networks tailored for processing high-dimensional data, excelling in tasks like image classification, object detection, and facial recognition. Their architecture is built on convolutional layers interspersed with pooling layers, which efficiently extract hierarchical features while reducing computational complexity. Convolutional layers utilize filters to detect patterns such as edges or textures, employing shared weights to enhance translational invariance and reduce the number of parameters. Pooling layers further simplify feature maps by downsampling, preserving critical information while minimizing spatial dimensions. Prominent architectures, including LeNet, AlexNet, VGGNet, and ResNet, have set benchmarks in image recognition and inspired advancements in deep learning. The careful tuning of hyperparameters, such as filter size, stride, and padding, plays a pivotal role in optimizing performance, balancing accuracy with computational efficiency. CNNs continue to drive innovation, expanding their applications across diverse fields like natural language processing, speech recognition, and autonomous systems. |
Keywords: | Deep convolutional neural networks; LeNet; AlexNet; VGGNet; ResNet |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:wiw:wus046:71119092 |
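The role of filter size, stride, and padding mentioned in the abstract above can be made concrete with the standard output-size formula for a convolution or pooling layer, `floor((n + 2p - f) / s) + 1`, and the parameter count of a convolutional layer with shared weights. The helper names are illustrative; the worked numbers use the 32 x 32 input and six 5 x 5 filters of LeNet-5's first layer.

```python
def conv_output_size(n: int, f: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size for input size n, filter size f, given stride and padding."""
    if stride < 1:
        raise ValueError("stride must be positive")
    if f > n + 2 * padding:
        raise ValueError("filter is larger than the padded input")
    return (n + 2 * padding - f) // stride + 1


def conv_param_count(f: int, c_in: int, c_out: int) -> int:
    """Weights plus one bias per output channel; weights are shared across positions."""
    return (f * f * c_in + 1) * c_out


after_conv = conv_output_size(32, 5)                 # 5x5 filter, stride 1, no padding
after_pool = conv_output_size(after_conv, 2, stride=2)  # 2x2 pooling, stride 2
print(after_conv, after_pool, conv_param_count(5, 1, 6))
```

Because the weights are shared across spatial positions, the parameter count depends on the filter size and channel counts but not on the image size.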
By: | Gabriel Okasa; Alberto de León; Michaela Strinzel; Anne Jorstad; Katrin Milzow; Matthias Egger; Stefan Müller |
Abstract: | Peer review in grant evaluation informs funding decisions, but the contents of peer review reports are rarely analyzed. In this work, we develop a thoroughly tested pipeline to analyze the texts of grant peer review reports using methods from applied Natural Language Processing (NLP) and machine learning. We start by developing twelve categories reflecting content of grant peer review reports that are of interest to research funders. This is followed by multiple human annotators' iterative annotation of these categories in a novel text corpus of grant peer review reports submitted to the Swiss National Science Foundation. After validating the human annotation, we use the annotated texts to fine-tune pre-trained transformer models to classify these categories at scale, while conducting several robustness and validation checks. Our results show that many categories can be reliably identified by human annotators and machine learning approaches. However, the choice of text classification approach considerably influences the classification performance. We also find a high correspondence between out-of-sample classification performance and human annotators' perceived difficulty in identifying categories. Our results and publicly available fine-tuned transformer models will allow researchers and research funders and anybody interested in peer review to examine and report on the contents of these reports in a structured manner. Ultimately, we hope our approach can contribute to ensuring the quality and trustworthiness of grant peer review. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.16662 |
By: | Filipovska, Elena; Mladenovska, Ana; Bajrami, Merxhan; Dobreva, Jovana; Hillman, Velislava; Lameski, Petre; Zdravevski, Eftim |
Abstract: | The rapid growth of document volumes and complexity in various domains necessitates advanced automated methods to enhance the efficiency and accuracy of information extraction and analysis. This paper aims to evaluate the efficiency and repeatability of OpenAI's APIs and other Large Language Models (LLMs) in automating question-answering tasks across multiple documents, specifically focusing on analyzing Data Privacy Policy (DPP) documents of selected EdTech providers. We test how well these models perform on large-scale text processing tasks using OpenAI's LLMs (GPT 3.5 Turbo, GPT 4, GPT 4o) and APIs in several frameworks: direct API calls (i.e., one-shot learning), LangChain, and Retrieval Augmented Generation (RAG) systems. We also evaluate a local deployment of quantized versions (with FAISS) of LLMs (Llama-2-13B-chat-GPTQ). Through systematic evaluation against predefined use cases and a range of metrics, including response format, execution time, and cost, our study aims to provide insights into the optimal practices for document analysis. Our findings demonstrate that using OpenAI's LLMs via API calls is a workable workaround for accelerating document analysis when using a local GPU-powered infrastructure is not a viable solution, particularly for long texts. On the other hand, the local deployment is quite valuable for maintaining the data within the private infrastructure. Our findings show that the quantized models retain substantial relevance even with fewer parameters than ChatGPT and do not impose processing restrictions on the number of tokens. This study offers insights on maximizing the use of LLMs for better efficiency and data governance, in addition to confirming their usefulness in improving document analysis procedures. |
Keywords: | few-shot learning Q&A; GPT; LangChain; Large Language Models; Llama; LLM; multi-document; one-shot learning; OpenAI; QA; RAG |
JEL: | J50 |
Date: | 2024–12–31 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:126674 |
By: | Roberto-Rafael Maura-Rivero; Chirag Nagpal; Roma Patel; Francesco Visin |
Abstract: | Current methods that train large language models (LLMs) with reinforcement learning feedback often resort to averaging the outputs of multiple reward functions during training. This overlooks crucial aspects of individual reward dimensions and inter-reward dependencies that can lead to sub-optimal outcomes in generations. In this work, we show how linear aggregation of rewards exhibits some vulnerabilities that can lead to undesired properties of generated text. We then propose a transformation of reward functions inspired by the economic theory of utility functions (specifically the Inada conditions) that enhances sensitivity to low reward values while diminishing sensitivity to already high values. We compare our approach to the existing baseline methods that linearly aggregate rewards and show how the Inada-inspired reward feedback is superior to traditional weighted averaging. We quantitatively and qualitatively analyse the difference in the methods, and see that models trained with Inada-transformations score as more helpful while being less harmful. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.06248 |
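The contrast between linear averaging and a utility-style transformation can be sketched as follows. This is not the transformation proposed in the paper, whose exact form the abstract does not give. It uses a power utility `r ** alpha` with `0 < alpha < 1`, a textbook function that satisfies the Inada conditions (marginal utility grows without bound as the reward approaches zero and vanishes as the reward grows), and it assumes rewards are non-negative. All names and the example reward values are illustrative.

```python
from typing import Sequence


def inada_style_utility(reward: float, alpha: float = 0.5) -> float:
    """Power utility: steep near zero, flat for large rewards (assumed form)."""
    if reward < 0:
        raise ValueError("this sketch assumes non-negative rewards")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return reward ** alpha


def linear_aggregate(rewards: Sequence[float]) -> float:
    """Baseline: plain average of the reward dimensions."""
    return sum(rewards) / len(rewards)


def transformed_aggregate(rewards: Sequence[float], alpha: float = 0.5) -> float:
    """Average of the transformed rewards."""
    return sum(inada_style_utility(r, alpha) for r in rewards) / len(rewards)


# Hypothetical (helpfulness, harmlessness) scores for two responses.
balanced = [0.5, 0.5]
lopsided = [0.9, 0.1]
print(linear_aggregate(balanced), linear_aggregate(lopsided))
print(transformed_aggregate(balanced), transformed_aggregate(lopsided))
```

The linear average cannot distinguish the two responses, whereas the concave transformation ranks the balanced response higher because it penalises the very low score on one dimension.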
By: | Bo Yuan; Damiano Brigo; Antoine Jacquier; Nicola Pede |
Abstract: | Deep learning methods have become a widespread toolbox for pricing and calibration of financial models. While they often provide new directions and research results, their `black box' nature also results in a lack of interpretability. We provide a detailed interpretability analysis of these methods in the context of rough volatility - a new class of volatility models for Equity and FX markets. Our work sheds light on the neural network learned inverse map between the rough volatility model parameters, seen as mathematical model inputs and network outputs, and the resulting implied volatility across strikes and maturities, seen as mathematical model outputs and network inputs. This contributes to building a solid framework for a safer use of neural networks in this context and in quantitative finance more generally. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.19317 |
By: | Qian Yu; Zhen Xu; Zong Ke |
Abstract: | In the context of globalization and the rapid expansion of the digital economy, anti-money laundering (AML) has become a crucial aspect of financial oversight, particularly in cross-border transactions. The rising complexity and scale of international financial flows necessitate more intelligent and adaptive AML systems to combat increasingly sophisticated money laundering techniques. This paper explores the application of unsupervised learning models in cross-border AML systems, focusing on rule optimization through contrastive learning techniques. Five deep learning models, ranging from basic convolutional neural networks (CNNs) to hybrid CNN-GRU architectures, were designed and tested to assess their performance in detecting abnormal transactions. The results demonstrate that as model complexity increases, so do the system's detection accuracy and responsiveness. In particular, the self-developed hybrid Convolutional-Recurrent Neural Integration Model (CRNIM) showed superior performance in terms of accuracy and area under the receiver operating characteristic curve (AUROC). These findings highlight the potential of unsupervised learning models to significantly improve the intelligence, flexibility, and real-time capabilities of AML systems. By optimizing detection rules and enhancing adaptability to emerging money laundering schemes, this research provides both theoretical and practical contributions to the advancement of AML technologies, which are essential for safeguarding the global financial system against illicit activities. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.07027 |
By: | Hamza Bodor; Laurent Carlier |
Abstract: | The Queue-Reactive model introduced by Huang et al. (2015) has become a standard tool for limit order book modeling, widely adopted by both researchers and practitioners for its simplicity and effectiveness. We present the Multidimensional Deep Queue-Reactive (MDQR) model, which extends this framework in three ways: it relaxes the assumption of queue independence, enriches the state space with market features, and models the distribution of order sizes. Through a neural network architecture, the model learns complex dependencies between different price levels and adapts to varying market conditions, while preserving the interpretable point-process foundation of the original framework. Using data from the Bund futures market, we show that MDQR captures key market properties including the square-root law of market impact, cross-queue correlations, and realistic order size patterns. The model demonstrates particular strength in reproducing both conditional and stationary distributions of order sizes, as well as various stylized facts of market microstructure. The model achieves this while maintaining the computational efficiency needed for practical applications such as strategy development through reinforcement learning or realistic backtesting. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.08822 |
By: | Bryan T. Kelly; Boris Kuznetsov; Semyon Malamud; Teng Andrea Xu |
Abstract: | The core statistical technology in artificial intelligence is the large-scale transformer network. We propose a new asset pricing model that implants a transformer in the stochastic discount factor. This structure leverages conditional pricing information via cross-asset information sharing and nonlinearity. We also develop a linear transformer that serves as a simplified surrogate from which we derive an intuitive decomposition of the transformer's asset pricing mechanisms. We find large reductions in pricing errors from our artificial intelligence pricing model (AIPM) relative to previous machine learning models and dissect the sources of these gains. |
JEL: | C45 G10 G11 G14 G17 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33351 |
By: | Jerick Shi; Burton Hollifield |
Abstract: | Predicting the movement of the stock market and other assets has been valuable over the past few decades. Knowing how the value of a certain sector market may move in the future provides much information for investors, as they use that information to develop strategies to maximize profit or minimize risk. However, market data are quite noisy, and it is challenging to choose the right data or the right model to create such predictions. With the rise of large language models, there are ways to analyze certain data much more efficiently than before. Our goal is to determine whether the GPT model provides more useful information compared to other traditional transformer models, such as the BERT model. We use data from the Federal Reserve Beige Book, which provides summaries of economic conditions in different districts in the US. Using such data, we then employ the LLMs to make predictions on the correlations. Using these correlations, we then compare the results with well-known strategies and determine whether knowing the economic conditions improves investment decisions. We conclude that the Beige Book does contain information regarding correlations amongst different assets, yet that the GPT model has too much look-ahead bias and that traditional models still triumph. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.16569 |
By: | Jebish Purbey; Siddhant Gupta; Nikhil Manali; Siddartha Pullakhandam; Drishti Sharma; Ashay Srivastava; Ram Mohan Rao Kadiyala |
Abstract: | This paper presents the system description of our entry for the COLING 2025 FMD challenge, focusing on misinformation detection in financial domains. We experimented with a combination of large language models, including Qwen, Mistral, and Gemma-2, and leveraged pre-processing and sequential learning not only to identify fraudulent financial content but also to generate coherent and concise explanations that clarify the rationale behind the classifications. Our approach achieved competitive results with an F1-score of 0.8283 for classification and a ROUGE-1 of 0.7253 for explanations. This work highlights the transformative potential of LLMs in financial applications, offering insights into their capabilities for combating misinformation and enhancing transparency while identifying areas for future improvement in robustness and domain adaptation. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.00549 |
By: | Chi-Sheng Chen; Ying-Jung Chen |
Abstract: | Graph Neural Networks (GNNs) have emerged as transformative tools for modeling complex relational data, offering unprecedented capabilities in tasks like forecasting and optimization. This study investigates the application of GNNs to demand forecasting within supply chain networks using the SupplyGraph dataset, a benchmark for graph-based supply chain analysis. By leveraging advanced GNN methodologies, we enhance the accuracy of forecasting models, uncover latent dependencies, and address temporal complexities inherent in supply chain operations. Comparative analyses demonstrate that GNN-based models significantly outperform traditional approaches, including Multilayer Perceptrons (MLPs) and Graph Convolutional Networks (GCNs), particularly in single-node demand forecasting tasks. The integration of graph representation learning with temporal data highlights GNNs' potential to revolutionize predictive capabilities for inventory management, production scheduling, and logistics optimization. This work underscores the pivotal role of forecasting in supply chain management and provides a robust framework for advancing research and applications in this domain. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.06221 |
By: | Benjamin Patrick Evans; Sihan Zeng; Sumitra Ganesh; Leo Ardon |
Abstract: | Agent-based models (ABMs) are valuable for modelling complex, potentially out-of-equilibria scenarios. However, ABMs have long suffered from the Lucas critique, stating that agent behaviour should adapt to environmental changes. Furthermore, the environment itself often adapts to these behavioural changes, creating a complex bi-level adaptation problem. Recent progress integrating multi-agent reinforcement learning into ABMs introduces adaptive agent behaviour, beginning to address the first part of this critique. However, the approaches are still relatively ad hoc, lack a general formulation, and do not tackle the second aspect of simultaneously adapting environmental level characteristics in addition to the agent behaviours. In this work, we develop a generic two-layer framework for ADaptive AGEnt based modelling (ADAGE) for addressing these problems. This framework formalises the bi-level problem as a Stackelberg game with conditional behavioural policies, providing a consolidated framework for adaptive agent-based modelling based on solving a coupled set of non-linear equations. We demonstrate how this generic approach encapsulates several common (previously viewed as distinct) ABM tasks, such as policy design, calibration, scenario generation, and robust behavioural learning under one unified framework. We provide example simulations on multiple complex economic and financial environments, showing the strength of the novel framework under these canonical settings, addressing long-standing critiques of traditional ABMs. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.09429 |
By: | Haonan Xu; Alessio Brini |
Abstract: | This paper applies deep reinforcement learning (DRL) to optimize liquidity provisioning in Uniswap v3, a decentralized finance (DeFi) protocol implementing an automated market maker (AMM) model with concentrated liquidity. We model the liquidity provision task as a Markov Decision Process (MDP) and train an active liquidity provider (LP) agent using the Proximal Policy Optimization (PPO) algorithm. The agent dynamically adjusts liquidity positions by using information about price dynamics to balance fee maximization and impermanent loss mitigation. We use a rolling window approach for training and testing, reflecting realistic market conditions and regime shifts. This study compares the data-driven performance of the DRL-based strategy against common heuristics adopted by small retail LP actors that do not systematically modify their liquidity positions. By promoting more efficient liquidity management, this work aims to make DeFi markets more accessible and inclusive for a broader range of participants. Through a data-driven approach to liquidity management, this work seeks to contribute to the ongoing development of more efficient and user-friendly DeFi markets. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.07508 |
By: | Eric Innocenti (LISA - Laboratoire « Lieux, Identités, eSpaces, Activités » (UMR CNRS 6240 LISA) - CNRS - Centre National de la Recherche Scientifique - Università di Corsica Pasquale Paoli [Université de Corse Pascal Paoli]); Dominique Prunetti (LISA - Laboratoire « Lieux, Identités, eSpaces, Activités » (UMR CNRS 6240 LISA) - CNRS - Centre National de la Recherche Scientifique - Università di Corsica Pasquale Paoli [Université de Corse Pascal Paoli], Università di Corsica Pasquale Paoli [Université de Corse Pascal Paoli]); Marielle Delhom; Corinne Idda |
Abstract: | The 5-Steps Simplified Modelling Process (5-SSIMP) is a method designed to develop modular and reusable Agent-Based Models of Land Use and Cover Change (ABM/LUCC) in an interdisciplinary context. It is based on the definition of three types of Reusable Building Blocks (RBBs): Conceptual-RBB (conRBB), Computer-RBB (comRBB), and Executable-RBB (exeRBB). We present a practical modelling example based on the 5-SSIMP method for creating an ABM/LUCC applied to the socio-economic tourism system of Corsica. |
Keywords: | Modelling method, ABM/LUCC, RBB |
Date: | 2024–09–29 |
URL: | https://d.repec.org/n?u=RePEc:hal:journl:hal-04862884 |
By: | Yonggai Zhuang; Haoran Chen; Kequan Wang; Teng Fei |
Abstract: | The complexity of stocks and industries presents challenges for stock prediction. Currently, stock prediction models can be divided into two categories. One category, represented by GRU and ALSTM, relies solely on stock factors for prediction, with limited effectiveness. The other category, represented by HIST and TRA, incorporates not only stock factors but also industry information, industry financial reports, public sentiment, and other inputs for prediction. The second category of models can capture correlations between stocks by introducing additional information, but the extra data is difficult to standardize and generalize. Considering the current state and limitations of these two types of models, this paper proposes the GRU-PFG (Project Factors into Graph) model. This model takes only stock factors as input and extracts inter-stock correlations using graph neural networks. It achieves prediction results that not only outperform the other models that rely solely on stock factors but are also comparable to those of the second-category models. The experimental results show that on the CSI300 dataset, the IC of GRU-PFG is 0.134, outperforming HIST's 0.131 and significantly surpassing GRU and Transformer, achieving results better than the second-category models. Moreover, as a model that relies solely on stock factors, it has greater potential for generalization. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.18997 |
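The IC figures quoted above refer to the information coefficient. The abstract does not define it, so the sketch below assumes the common convention: the Pearson correlation between predicted and realised returns across stocks on a given day (a rank-based variant uses Spearman correlation instead). The function name and the sample numbers are illustrative.

```python
import math
from typing import Sequence


def information_coefficient(
    predicted: Sequence[float], realised: Sequence[float]
) -> float:
    """Pearson correlation between predicted and realised cross-sectional returns."""
    if len(predicted) != len(realised) or len(predicted) < 2:
        raise ValueError("need two equal-length series with at least two stocks")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(realised) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, realised))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in realised)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_p * var_r)


# Hypothetical one-day cross-section of four stocks.
scores = [0.02, -0.01, 0.03, 0.00]
returns = [0.01, -0.02, 0.01, 0.03]
print(round(information_coefficient(scores, returns), 3))
```

In practice the daily values are usually averaged over the test period, so a reported IC of around 0.13 reflects a weak but consistent alignment between scores and subsequent returns.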
By: | Jimmy Cheung; Smruthi Rangarajan; Amelia Maddocks; Xizhe Chen; Rohitash Chandra |
Abstract: | Uncertainty quantification is crucial in time series prediction, and quantile regression offers a valuable mechanism for uncertainty quantification which is useful for extreme value forecasting. Although deep learning models have been prominent in multi-step ahead prediction, the development and evaluation of quantile deep learning models have been limited. We present a novel quantile regression deep learning framework for multi-step time series prediction. In this way, we elevate the capabilities of deep learning models by incorporating quantile regression, thus providing a more nuanced understanding of predictive values. We provide an implementation of prominent deep learning models for multi-step ahead time series prediction and evaluate their performance under high volatility and extreme conditions. We include multivariate and univariate modelling strategies and provide a comparison with conventional deep learning models from the literature. Our models are tested on two cryptocurrencies, Bitcoin and Ethereum, using daily close-price data and selected benchmark time series datasets. The results show that integrating a quantile loss function with deep learning provides additional predictions for selected quantiles without a loss in the prediction accuracy when compared to the literature. Our quantile model has the ability to handle volatility more effectively and provides additional information for decision-making and uncertainty quantification through the use of quantiles when compared to conventional deep learning models. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.15674 |
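The quantile loss function referred to above is usually the pinball loss, which penalises under- and over-prediction asymmetrically according to the target quantile `tau`. The sketch below shows that standard loss in plain Python; it assumes this is the loss the authors use, and the function names and prices are illustrative rather than taken from the paper.

```python
from typing import Sequence


def pinball_loss(y_true: float, y_pred: float, tau: float) -> float:
    """Pinball (quantile) loss for one observation at quantile level tau."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = y_true - y_pred
    # Under-prediction (diff > 0) costs tau per unit; over-prediction costs 1 - tau.
    return max(tau * diff, (tau - 1) * diff)


def mean_pinball_loss(
    y_true: Sequence[float], y_pred: Sequence[float], tau: float
) -> float:
    """Average pinball loss over a multi-step forecast horizon."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty series of equal length")
    return sum(pinball_loss(t, p, tau) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical three-step-ahead close prices and a 0.9-quantile forecast.
actual = [100.0, 102.0, 98.0]
upper_forecast = [104.0, 104.0, 104.0]
print(mean_pinball_loss(actual, upper_forecast, tau=0.9))
```

Training one output per quantile level with this loss yields a set of forecasts (for example the 0.05, 0.5, and 0.95 quantiles) from which prediction intervals can be read off.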
By: | Jens Ludwig; Sendhil Mullainathan; Ashesh Rambachan |
Abstract: | How can we use the novel capacities of large language models (LLMs) in empirical research? And how can we do so while accounting for their limitations, which are themselves only poorly understood? We develop an econometric framework to answer this question that distinguishes between two types of empirical tasks. Using LLMs for prediction problems (including hypothesis generation) is valid under one condition: no “leakage” between the LLM’s training dataset and the researcher’s sample. No leakage can be ensured by using open-source LLMs with documented training data and published weights. Using LLM outputs for estimation problems to automate the measurement of some economic concept (expressed either by some text or from human subjects) requires the researcher to collect at least some validation data: without such data, the errors of the LLM’s automation cannot be assessed and accounted for. As long as these steps are taken, LLM outputs can be used in empirical research with the familiar econometric guarantees we desire. Using two illustrative applications to finance and political economy, we find that these requirements are stringent; when they are violated, the limitations of LLMs now result in unreliable empirical estimates. Our results suggest the excitement around the empirical uses of LLMs is warranted – they allow researchers to effectively use even small amounts of language data for both prediction and estimation – but only with these safeguards in place. |
JEL: | C01 C45 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33344 |
By: | Connor Douglas; Foster Provost; Arun Sundararajan |
Abstract: | Algorithmic agents are used in a variety of competitive decision settings, notably in making pricing decisions in contexts that range from online retail to residential home rentals. Business managers, algorithm designers, legal scholars, and regulators alike are all starting to consider the ramifications of "algorithmic collusion." We study the emergent behavior of multi-armed bandit machine learning algorithms used in situations where agents are competing, but they have no information about the strategic interaction they are engaged in. Using a general-form repeated Prisoner's Dilemma game, agents engage in online learning with no prior model of game structure and no knowledge of competitors' states or actions (e.g., no observation of competing prices). We show that these context-free bandits, with no knowledge of opponents' choices or outcomes, still will consistently learn collusive behavior - what we call "naive collusion." We primarily study this system through an analytical model and examine perturbations to the model through simulations. Our findings have several notable implications for regulators. First, calls to limit algorithms from conditioning on competitors' prices are insufficient to prevent algorithmic collusion. This is a direct result of collusion arising even in the naive setting. Second, symmetry in algorithms can increase collusion potential. This highlights a new, simple mechanism for "hub-and-spoke" algorithmic collusion. A central distributor need not imbue its algorithm with supra-competitive tendencies for apparent collusion to arise; it can simply arise by using certain (common) machine learning algorithms. Finally, we highlight that collusive outcomes depend starkly on the specific algorithm being used, and we highlight market and algorithmic conditions under which it will be unknown a priori whether collusion occurs. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.16574 |
By: | Stella C. Dong; James R. Finlay |
Abstract: | Reinsurance optimization is critical for insurers to manage risk exposure, ensure financial stability, and maintain solvency. Traditional approaches often struggle with dynamic claim distributions, high-dimensional constraints, and evolving market conditions. This paper introduces a novel hybrid framework that integrates Generative Models, specifically Variational Autoencoders (VAEs), with Reinforcement Learning (RL) using Proximal Policy Optimization (PPO). The framework enables dynamic and scalable optimization of reinsurance strategies by combining the generative modeling of complex claim distributions with the adaptive decision-making capabilities of reinforcement learning. The VAE component generates synthetic claims, including rare and catastrophic events, addressing data scarcity and variability, while the PPO algorithm dynamically adjusts reinsurance parameters to maximize surplus and minimize ruin probability. The framework's performance is validated through extensive experiments, including out-of-sample testing, stress-testing scenarios (e.g., pandemic impacts, catastrophic events), and scalability analysis across portfolio sizes. Results demonstrate its superior adaptability, scalability, and robustness compared to traditional optimization techniques, achieving higher final surpluses and computational efficiency. Key contributions include the development of a hybrid approach for high-dimensional optimization, dynamic reinsurance parameterization, and validation against stochastic claim distributions. The proposed framework offers a transformative solution for modern reinsurance challenges, with potential applications in multi-line insurance operations, catastrophe modeling, and risk-sharing strategy design. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.06404 |
By: | Aaron Wheeler; Jeffrey D. Varner |
Abstract: | This work presents a generative pre-trained transformer (GPT) designed for modeling financial time series. The GPT functions as an order generation engine within a discrete event simulator, enabling realistic replication of limit order book dynamics. Our model leverages recent advancements in large language models to produce long sequences of order messages in a streaming manner. Our results demonstrate that the model successfully reproduces key features of order flow data, even when the initial order flow prompt is no longer present within the model's context window. Moreover, evaluations reveal that the model captures several statistical properties, or 'stylized facts', characteristic of real financial markets and broader macro-scale data distributions. Collectively, this work marks a significant step toward creating high-fidelity, interactive market simulations. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.16585 |
By: | Kun Liu; Jin Zhao |
Abstract: | In the field of finance, the prediction of individual credit default is of vital importance. However, existing methods face problems such as insufficient interpretability and transparency as well as limited performance when dealing with high-dimensional and nonlinear data. To address these issues, this paper introduces a method based on Kolmogorov-Arnold Networks (KANs). KANs are a new type of neural network architecture with learnable activation functions and no linear weights, which has potential advantages in handling complex multi-dimensional data. Specifically, this paper applies KANs to the field of individual credit risk prediction for the first time and constructs the Kolmogorov-Arnold Credit Default Predict (KACDP) model. Experiments show that the KACDP model outperforms mainstream credit default prediction models in performance metrics (ROC_AUC and F1 values). Meanwhile, through methods such as feature attribution scores and visualization of the model structure, the model's decision-making process and the importance of different features are clearly demonstrated, providing a transparent and interpretable decision-making basis for financial institutions and meeting the industry's strict requirements for model interpretability. In conclusion, the KACDP model constructed in this paper exhibits excellent predictive performance and satisfactory interpretability in individual credit risk prediction, providing an effective way to address the limitations of existing methods and offering a new and practical credit risk prediction tool for financial institutions. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.17783 |
By: | Ling Chen |
Abstract: | This paper investigates the application of Feature-Enriched Generative Adversarial Networks (FE-GAN) in financial risk management, with a focus on improving the estimation of Value at Risk (VaR) and Expected Shortfall (ES). FE-GAN enhances existing GAN architectures by incorporating an additional input sequence derived from preceding data to improve model performance. Two specialized GAN models, the Wasserstein Generative Adversarial Network (WGAN) and the Tail Generative Adversarial Network (Tail-GAN), were evaluated under the FE-GAN framework. The results demonstrate that FE-GAN significantly outperforms traditional architectures in both VaR and ES estimation. Tail-GAN, leveraging its task-specific loss function, consistently outperforms WGAN in ES estimation, while both models exhibit similar performance in VaR estimation. Despite these promising results, the study acknowledges limitations, including reliance on highly correlated temporal data and restricted applicability to other domains. Future research directions include exploring alternative input generation methods, dynamic forecasting models, and advanced neural network architectures to further enhance GAN-based financial risk estimation. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.15519 |
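For readers unfamiliar with the two risk measures named in the abstract above, the sketch below computes plain historical (empirical) VaR and ES from a sample of returns. It is a textbook-style estimator under stated assumptions, not the FE-GAN method; the function name `historical_var_es`, the choice of tail convention, and the sample returns are all hypothetical.

```python
import math


def historical_var_es(returns, alpha=0.95):
    """Empirical VaR and ES of losses at confidence level alpha.

    Losses are the negated returns. VaR is taken as the empirical
    alpha-quantile of the losses; ES is the mean of the losses at or
    beyond VaR (one common convention among several).
    """
    losses = sorted(-r for r in returns)  # ascending losses
    n = len(losses)
    if n == 0:
        raise ValueError("returns must not be empty")
    # Zero-based index of the alpha-quantile loss, clamped to the sample.
    k = min(n - 1, max(0, math.ceil(alpha * n) - 1))
    var = losses[k]
    tail = losses[k:]
    es = sum(tail) / len(tail)
    return var, es


# Hypothetical daily returns, for illustration only.
var_75, es_75 = historical_var_es([0.01, -0.02, -0.04, 0.03], alpha=0.75)
print(var_75, es_75)
```

Because ES averages the losses in the tail beyond VaR, it is never smaller than VaR at the same confidence level, which is why it is more sensitive to how well a model captures extreme outcomes.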
By: | Robert Novy-Marx; Mihail Z. Velikov |
Abstract: | This paper describes a process for automatically generating academic finance papers using large language models (LLMs). It demonstrates the process’s efficacy by producing hundreds of complete papers on stock return predictability, a topic particularly well-suited for our illustration. We first mine over 30,000 potential stock return predictor signals from accounting data, and apply the Novy-Marx and Velikov (2024) “Assaying Anomalies” protocol to generate standardized “template reports” for 96 signals that pass the protocol’s rigorous criteria. Each report details a signal’s performance predicting stock returns using a wide array of tests and benchmarks it to more than 200 other known anomalies. Finally, we use state-of-the-art LLMs to generate three distinct complete versions of academic papers for each signal. The different versions include creative names for the signals, contain custom introductions providing different theoretical justifications for the observed predictability patterns, and incorporate citations to existing (and, on occasion, imagined) literature supporting their respective claims. This experiment illustrates AI’s potential for enhancing financial research efficiency, but also serves as a cautionary tale, illustrating how it can be abused to industrialize HARKing (Hypothesizing After Results are Known). |
JEL: | A12 C12 C18 C45 G11 G12 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33363 |
By: | Augusto Gonzalez-Bonorino (Pomona College Economics Department); Monica Capra (Claremont Graduate University Economics Department; University of Arizona Center for the Philosophy of Freedom); Emilio Pantoja (Pitzer College Economics and Computer Science Department) |
Abstract: | Despite its importance, studying economic behavior across diverse, non-WEIRD (Western, Educated, Industrialized, Rich, and Democratic) populations presents significant challenges. We address this issue by introducing a novel methodology that uses Large Language Models (LLMs) to create synthetic cultural agents (SCAs) representing these populations. We subject these SCAs to classic behavioral experiments, including the dictator and ultimatum games. Our results demonstrate substantial cross-cultural variability in experimental behavior. Notably, for populations with available data, SCAs' behaviors qualitatively resemble those of real human subjects. For unstudied populations, our method can generate novel, testable hypotheses about economic behavior. By integrating AI into experimental economics, this approach offers an effective and ethical method to pilot experiments and refine protocols for hard-to-reach populations. Our study provides a new tool for cross-cultural economic studies and demonstrates how LLMs can help experimental behavioral research. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.06834 |
By: | Ramshreyas Rao |
Abstract: | I introduce an agent-based model of a Perpetual Futures market with heterogeneous agents trading via a central limit order book. Perpetual Futures (henceforth Perps) are financial derivatives introduced by the economist Robert Shiller, designed to peg their price to that of the underlying Spot market. This paper extends the limit order book model of Chiarella et al. (2002) by taking their agent and orderbook parameters, designed for a simple stock exchange, and applying it to the more complex environment of a Perp market with long and short traders who exhibit both positional and basis-trading behaviors. I find that despite the simplicity of the agent behavior, the simulation is able to reproduce the most salient feature of a Perp market, the pegging of the Perp price to the underlying Spot price. In contrast to fundamental simulations of stock markets which aim to reproduce empirically observed stylized facts such as the leptokurtosis and heteroscedasticity of returns, volatility clustering and others, in derivatives markets many of these features are provided exogenously by the underlying Spot price signal. This is especially true of Perps since the derivative is designed to mimic the price of the Spot market. Therefore, this paper will focus exclusively on analyzing how market and agent parameters such as order lifetime, trading horizon and spread affect the premiums at which Perps trade with respect to the underlying Spot market. I show that this simulation provides a simple and robust environment for exploring the dynamics of Perpetual Futures markets and their microstructure in this regard. Lastly, I explore the ability of the model to reproduce the effects of biasing long traders to trade positionally and short traders to basis-trade, which was the original intention behind the market design, and is a tendency observed empirically in real Perp markets. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.09404 |
By: | Briola, Antonio; Bartolucci, Silvia; Aste, Tomaso |
Abstract: | We introduce a novel large-scale deep learning model for Limit Order Book mid-price changes forecasting, and we name it ‘HLOB’. This architecture (i) exploits the information encoded by an Information Filtering Network, namely the Triangulated Maximally Filtered Graph, to unveil deeper and non-trivial dependency structures among volume levels; and (ii) guarantees deterministic design choices to handle the complexity of the underlying system by drawing inspiration from the groundbreaking class of Homological Convolutional Neural Networks. We test our model against 9 state-of-the-art deep learning alternatives on 3 real-world Limit Order Book datasets, each including 15 stocks traded on the NASDAQ exchange, and we systematically characterize the scenarios where HLOB outperforms state-of-the-art architectures. Our approach sheds new light on the spatial distribution of information in Limit Order Books and on its degradation over increasing prediction horizons, narrowing the gap between microstructural modeling and deep learning-based forecasting in high-frequency financial markets. |
Keywords: | deep learning; Econophysics; High frequency trading; limit order book; market microstructure |
JEL: | F3 G3 |
Date: | 2025–03–25 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:126623 |
By: | Andrew Lesniewski; Giulio Trigila |
Abstract: | We propose a highly efficient and accurate methodology for generating synthetic financial market data using a diffusion model approach. The synthetic data produced by our methodology align closely with observed market data in several key aspects: (i) they pass the two-sample Cramér-von Mises test for portfolios of assets, and (ii) Q-Q plots demonstrate consistency across quantiles, including in the tails, between observed and generated market data. Moreover, the covariance matrices derived from a large set of synthetic market data exhibit significantly lower condition numbers compared to the estimated covariance matrices of the observed data. This property makes them suitable for use as regularized versions of the latter. For model training, we develop an efficient and fast algorithm based on numerical integration rather than Monte Carlo simulations. The methodology is tested on a large set of equity data. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.00036 |
By: | Mahdi Salahshour; Amirahmad Shafiee; Mojtaba Tefagh |
Abstract: | The Lightning Network (LN) has emerged as a second-layer solution to Bitcoin's scalability challenges. The rise of Payment Channel Networks (PCNs) and their specific mechanisms incentivize individuals to join the network for profit-making opportunities. According to the latest statistics, the total value locked within the Lightning Network is approximately $500 million. Meanwhile, joining the LN with the profit-making incentives presents several obstacles, as it involves solving a complex combinatorial problem that encompasses both discrete and continuous control variables related to node selection and resource allocation, respectively. Current research inadequately captures the critical role of resource allocation and lacks realistic simulations of the LN routing mechanism. In this paper, we propose a Deep Reinforcement Learning (DRL) framework, enhanced by the power of transformers, to address the Joint Combinatorial Node Selection and Resource Allocation (JCNSRA) problem. We have improved upon an existing environment by introducing modules that enhance its routing mechanism, thereby narrowing the gap with the actual LN routing system and ensuring compatibility with the JCNSRA problem. We compare our model against several baselines and heuristics, demonstrating its superior performance across various settings. Additionally, we address concerns regarding centralization in the LN by deploying our agent within the network and monitoring the centrality measures of the evolved graph. Our findings suggest not only an absence of conflict between LN's decentralization goals and individuals' revenue-maximization incentives but also a positive association between the two. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.17353 |
By: | Xuesong Wang; Sharaf K. Magableh; Oraib Dawaghreh; Caisheng Wang; Jiaxuan Gong; Zhongyang Zhao; Michael H. Liao |
Abstract: | Virtual bidding plays an important role in two-settlement electric power markets, as it can reduce discrepancies between day-ahead and real-time markets. Renewable energy penetration increases volatility in electricity prices, making accurate forecasting critical for virtual bidders, reducing uncertainty and maximizing profits. This study presents a Transformer-based deep learning model to forecast the price spread between real-time and day-ahead electricity prices in the ERCOT (Electric Reliability Council of Texas) market. The proposed model leverages various time-series features, including load forecasts, solar and wind generation forecasts, and temporal attributes. The model is trained under realistic constraints and validated using a walk-forward approach by updating the model every week. Based on the price spread prediction results, several trading strategies are proposed and the most effective strategy for maximizing cumulative profit under realistic market conditions is identified through backtesting. The results show that the strategy of trading only at the peak hour with a precision score of over 50% produces nearly consistent profit over the test period. The proposed method underscores the importance of an accurate electricity price forecasting model and introduces a new method of evaluating the price forecast model from a virtual bidder's perspective, providing valuable insights for future research. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.00062 |