New Economics Papers on Computational Economics
By: | Kamil Ł. Szydłowski; Jarosław A. Chudziak |
Abstract: | This paper investigates the application of Transformer-based neural networks to stock price forecasting, with a special focus on the intersection of machine learning techniques and financial market analysis. The evolution of Transformer models, from their inception to their adaptation for time series analysis in financial contexts, is reviewed and discussed. Central to our study is the exploration of the Hidformer model, which is currently recognized for its promising performance in time series prediction. The primary aim of this paper is to determine whether Hidformer will also prove itself in the task of stock price prediction. This slightly modified model serves as the framework for our experiments, integrating the principles of technical analysis with advanced machine learning concepts to enhance stock price prediction accuracy. We conduct an evaluation of the Hidformer model's performance, using a set of criteria to determine its efficacy. Our findings offer additional insights into the practical application of Transformer architectures in financial time series forecasting, highlighting their potential to improve algorithmic trading strategies and to support human decision making. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.19932 |
By: | Adamantios Ntakaris; Gbenga Ibikunle |
Abstract: | High-frequency trading (HFT) has transformed modern financial markets, making reliable short-term price forecasting models essential. In this study, we present a novel approach to mid-price forecasting using Level 1 limit order book (LOB) data from NASDAQ, focusing on 100 U.S. stocks from the S&P 500 index during the period from September to November 2022. Expanding on our previous work with Radial Basis Function Neural Networks (RBFNN), which leveraged automated feature importance techniques based on mean decrease impurity (MDI) and gradient descent (GD), we introduce the Adaptive Learning Policy Engine (ALPE), a reinforcement learning (RL)-based agent designed for batch-free, immediate mid-price forecasting. ALPE incorporates adaptive epsilon decay to dynamically balance exploration and exploitation, outperforming a diverse range of highly effective machine learning (ML) and deep learning (DL) models in forecasting performance. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.19372 |
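To make the exploration-exploitation mechanism named in the abstract above concrete: adaptive epsilon decay shrinks the exploration rate as the agent accumulates experience. Below is a minimal illustrative sketch; it is not the authors' ALPE implementation, and the decay schedule, parameter values, and reward definition are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class EpsilonGreedyForecaster:
    """Toy epsilon-greedy agent: at each tick, either explore a random
    direction forecast or exploit the greedy estimate, then decay epsilon."""

    def __init__(self, eps_start=1.0, eps_min=0.01, decay=0.995):
        self.eps = eps_start          # current exploration rate
        self.eps_min = eps_min        # floor so some exploration survives
        self.decay = decay            # multiplicative decay per step
        self.q = np.zeros(2)          # value estimates for {down, up}
        self.n = np.zeros(2)          # visit counts

    def act(self):
        # Explore with probability eps, otherwise exploit the best estimate.
        if rng.random() < self.eps:
            return int(rng.integers(2))
        return int(np.argmax(self.q))

    def update(self, action, reward):
        # Incremental mean update, then decay epsilon toward its floor.
        self.n[action] += 1
        self.q[action] += (reward - self.q[action]) / self.n[action]
        self.eps = max(self.eps_min, self.eps * self.decay)

agent = EpsilonGreedyForecaster()
for _ in range(1000):
    a = agent.act()                  # 0 = predict down-tick, 1 = up-tick
    true_move = int(rng.integers(2)) # placeholder for a real mid-price move
    agent.update(a, reward=1.0 if a == true_move else -1.0)
```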
By: | Oudom Hean; Utsha Saha; Binita Saha |
Abstract: | In recent years, Large Language Models (LLMs) have emerged as a transformative development in artificial intelligence (AI), drawing significant attention from industry and academia. Trained on vast datasets, these sophisticated AI systems exhibit impressive natural language processing and content generation capabilities. This paper explores the potential of LLMs to address key challenges in personal finance, focusing on the United States. We evaluate several leading LLMs, including OpenAI's ChatGPT, Google's Gemini, Anthropic's Claude, and Meta's Llama, to assess their effectiveness in providing accurate financial advice on topics such as mortgages, taxes, loans, and investments. Our findings show that while these models achieve an average accuracy rate of approximately 70%, they also display notable limitations in certain areas. Specifically, LLMs struggle to provide accurate responses for complex financial queries, with performance varying significantly across different topics. Despite these limitations, the analysis reveals notable improvements in newer versions of these models, highlighting their growing utility for individuals and financial advisors. As these AI systems continue to evolve, their potential for advancing AI-driven applications in personal finance becomes increasingly promising. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.19784 |
By: | Francesco Audrino (University of St. Gallen; Swiss Finance Institute); Jonathan Chassot (University of St. Gallen) |
Abstract: | We investigate the predictive abilities of the heterogeneous autoregressive (HAR) model compared to machine learning (ML) techniques across an unprecedented dataset of 1,445 stocks. Our analysis focuses on the role of fitting schemes, particularly the training window and re-estimation frequency, in determining the HAR model's performance. Despite extensive hyperparameter tuning, ML models fail to surpass the linear benchmark set by HAR when utilizing a refined fitting approach for the latter. Moreover, the simplicity of HAR allows for an interpretable model with drastically lower computational costs. We assess performance using QLIKE, MSE, and realized utility metrics, finding that HAR consistently outperforms its ML counterparts when both rely solely on realized volatility and VIX as predictors. Our results underscore the importance of a correctly specified fitting scheme. They suggest that properly fitted HAR models provide superior forecasting accuracy, establishing robust guidelines for their practical application and use as a benchmark. This study not only reaffirms the efficacy of the HAR model but also provides a critical perspective on the practical limitations of ML approaches in realized volatility forecasting. |
Keywords: | Forecasting practice, HAR, Machine learning, Realized volatility, Volatility forecasting |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2470 |
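For reference, the HAR benchmark discussed above is the standard heterogeneous autoregressive regression of next-period realized volatility on daily, weekly, and monthly RV averages; the paper's exact specification (e.g., the VIX augmentation) follows the text above.

```latex
RV_{t+1} = \beta_0 + \beta_d\, RV_t + \beta_w\, \overline{RV}^{(5)}_t
         + \beta_m\, \overline{RV}^{(22)}_t + \varepsilon_{t+1},
\qquad
\overline{RV}^{(h)}_t = \frac{1}{h}\sum_{i=0}^{h-1} RV_{t-i}
```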
By: | Sermet Pekin; Aykut Sengul |
Abstract: | This study aims to classify high-growth firms using several machine learning algorithms, including K-Nearest Neighbors, Logistic Regression with L1 (Lasso) and L2 (Ridge) Regularization, XGBoost, Gradient Descent, Naive Bayes and Random Forest. Leveraging a dataset composed of financial metrics and firm characteristics between 2009 and 2022 with 1,318,799 unique firms (averaging 554,178 annually), we evaluate the performance of each model using metrics such as MCC, ROC AUC, accuracy, precision, recall and F1-score. In our study, ROC AUC values ranged from 0.53 to 0.87 for employee-high growth and from 0.53 to 0.91 for turnover-high growth, depending on the method used. Our findings indicate that XGBoost achieves the highest performance, followed by Random Forest and Logistic Regression, demonstrating their effectiveness in distinguishing between high-growth and non-high-growth firms. Conversely, KNN and Naive Bayes yield lower accuracy. Furthermore, our findings reveal that growth opportunity emerges as the most significant factor in our study. This research contributes valuable insights in identifying high-growth firms and underscores the potential of machine learning in economic prediction. |
Keywords: | High-growth firms, Machine learning, Prediction, Firm dynamics |
JEL: | C40 C55 C60 C81 L25 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:tcb:wpaper:2413 |
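A minimal sketch of the evaluation loop the abstract describes: fit an XGBoost classifier and report the same metric set (MCC, ROC AUC, accuracy, precision, recall, F1). The synthetic data and hyperparameters are placeholders, not the authors' setup.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import (matthews_corrcoef, roc_auc_score,
                             accuracy_score, precision_score,
                             recall_score, f1_score)
from xgboost import XGBClassifier

# Synthetic stand-in for firm-level features; high-growth firms are rare,
# hence the imbalanced class weights.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.9, 0.1], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                      eval_metric="logloss")
model.fit(X_tr, y_tr)

proba = model.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)

# Same metric set the paper reports.
print("MCC      :", matthews_corrcoef(y_te, pred))
print("ROC AUC  :", roc_auc_score(y_te, proba))
print("Accuracy :", accuracy_score(y_te, pred))
print("Precision:", precision_score(y_te, pred))
print("Recall   :", recall_score(y_te, pred))
print("F1       :", f1_score(y_te, pred))
```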
By: | Hendriks, Patrick; Sturm, Timo; Olt, Christian M.; Nan, Ning; Buxmann, Peter |
Abstract: | Organizations must constantly align their operations with their environment in order to survive. To achieve this, organizations use two primary strategies: Organizations (1) learn about their environment to adapt to it and (2) enact their environment to adapt it to their own needs. With the rise of artificial intelligence (AI), humans are no longer the only ones capable of learning and enacting on behalf of organizations. We investigate how organizations can effectively coordinate organizational learning and enactment to ensure that they complement rather than conflict with each other. Through a series of agent-based simulations, we find that the optimal coordination of learning and enactment between humans and AI depends largely on the prevailing nature of human learning and on allowing both humans and AI to enact for organizations. Our findings contribute to rethinking theory on organizational learning and enactment in the AI era and provide practical guidance for organizations’ strategic alignment. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:dar:wpaper:150817 |
By: | Yichen Luo; Yebo Feng; Jiahua Xu; Paolo Tasca; Yang Liu |
Abstract: | Cryptocurrency investment is inherently difficult due to its shorter history compared to traditional assets, the need to integrate vast amounts of data from various modalities, and the requirement for complex reasoning. While deep learning approaches have been applied to address these challenges, their black-box nature raises concerns about trust and explainability. Recently, large language models (LLMs) have shown promise in financial applications due to their ability to understand multi-modal data and generate explainable decisions. However, a single LLM faces limitations in complex, comprehensive tasks such as asset investment. These limitations are even more pronounced in cryptocurrency investment, where LLMs have less domain-specific knowledge in their training corpora. To overcome these challenges, we propose an explainable, multi-modal, multi-agent framework for cryptocurrency investment. Our framework uses specialized agents that collaborate within and across teams to handle subtasks such as data analysis, literature integration, and investment decision-making for the top 30 cryptocurrencies by market capitalization. The expert training module fine-tunes agents using multi-modal historical data and professional investment literature, while the multi-agent investment module employs real-time data to make informed cryptocurrency investment decisions. Unique intrateam and interteam collaboration mechanisms enhance prediction accuracy by adjusting final predictions based on confidence levels within agent teams and facilitating information sharing between teams. Empirical evaluation using data from November 2023 to September 2024 demonstrates that our framework outperforms single-agent models and market benchmarks in classification, asset pricing, portfolio, and explainability performance. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.00826 |
By: | Yijia Xiao; Edward Sun; Di Luo; Wei Wang |
Abstract: | Significant progress has been made in automated problem-solving using societies of agents powered by large language models (LLMs). In finance, efforts have largely focused on single-agent systems handling specific tasks or multi-agent frameworks independently gathering data. However, multi-agent systems' potential to replicate real-world trading firms' collaborative dynamics remains underexplored. TradingAgents proposes a novel stock trading framework inspired by trading firms, featuring LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. The framework includes Bull and Bear researcher agents assessing market conditions, a risk management team monitoring exposure, and traders synthesizing insights from debates and historical data to make informed decisions. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance. Detailed architecture and extensive experiments reveal its superiority over baseline models, with notable improvements in cumulative returns, Sharpe ratio, and maximum drawdown, highlighting the potential of multi-agent LLM frameworks in financial trading. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.20138 |
By: | Kauhanen, Antti; Kässi, Otto; Pajarinen, Mika; Rouvinen, Petri; Vanhala, Pekka |
Abstract: | Half of Finns have tried generative artificial intelligence (AI). Among the employed, thirty percent have used generative AI for work purposes. Generative AI has already had a significant impact on leisure and work in Finland. Although the United States is the leading country in generative AI, Finns are more active users than Americans, driven especially by women. Finns view the effects of generative AI on work productivity, quality, and satisfaction positively. The situation among employer organizations is mixed: one-third of employees have received AI guidance or training from their employer, but another one-third have not received any AI instructions from their employer. Despite many positive aspects, the intensity and domains of Finns’ AI use suggest that they are far from fully utilizing it. Generative AI is used for work purposes by 29%, but only 11% use it weekly, and only 8% use it in domains where generative AI is particularly suitable. For example, Nobel laureate in economics Daron Acemoglu has suggested that, without significant advancements, the economic growth impact of generative AI will be only modestly positive. The evidence presented in this brief suggests that, to have a sizable impact on Finland’s future growth, AI applications ought to be more widespread and deeper than they currently are. |
Keywords: | Generative artificial intelligence, Technological change, Employment, Labor market, Occupations |
JEL: | E24 J21 O33 |
Date: | 2024–11–19 |
URL: | https://d.repec.org/n?u=RePEc:rif:briefs:144 |
By: | Francesco Audrino (University of St. Gallen; Swiss Finance Institute); Jessica Gentner (University of St. Gallen; Swiss National Bank); Simon Stalder (Swiss National Bank; University of Lugano) |
Abstract: | This paper presents an innovative method for measuring uncertainty using Large Language Models (LLMs), offering enhanced precision and contextual sensitivity compared to the conventional methods used to construct prominent uncertainty indices. By analyzing newspaper texts with state-of-the-art LLMs, our approach captures nuances often missed by conventional methods. We develop indices for various types of uncertainty, including geopolitical risk, economic policy, monetary policy, and financial market uncertainty. Our findings show that shocks to these LLM-based indices exhibit stronger associations with macroeconomic variables, shifts in investor behaviour, and asset return variations than conventional indices, underscoring their potential for more accurately reflecting uncertainty. |
Keywords: | Large Language Models, Economic policy, Geopolitical risk, Monetary policy, Financial markets, Uncertainty measurement |
JEL: | C45 C55 E44 G12 |
Date: | 2024–08 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2468 |
By: | Jule Schuettler (University of St. Gallen); Francesco Audrino (University of St. Gallen; Swiss Finance Institute); Fabio Sigrist (Lucerne University of Applied Sciences and Arts) |
Abstract: | We present a novel approach to sentiment analysis in financial markets by using a state-of-the-art large language model, a market data-driven labeling approach, and a large dataset consisting of diverse financial text sources including earnings call transcripts, newspapers, and social media tweets. Based on our approach, we define a predictive high-low sentiment asset pricing factor which is significant in explaining cross-sectional asset pricing for U.S. stocks. Further, we find that a long/short equal-weighted portfolio yields an average annualized return of 35.56% and an annualized Sharpe ratio of 2.21, remaining substantially profitable even when transaction costs are considered. A comparison with an alternative financial sentiment analysis tool (FinBERT) underscores the superiority of our data-driven labeling approach over traditional human-annotated labeling. |
Keywords: | natural language processing, large language models, DeBERTa, asset pricing |
Date: | 2024–08 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2469 |
By: | Abdollah Rida |
Abstract: | Credit scoring is one of the problems banks and financial institutions have to solve on a daily basis. While state-of-the-art research in machine and deep learning for finance has achieved interesting results on credit scoring models, such models have so far not been used in a heavily regulated context such as banking. Our work is thus an attempt to challenge the current regulatory status quo and introduce new Basel 2 and 3 compliant techniques, while still meeting the Federal Reserve Bank and European Central Bank requirements. With the help of gradient boosting machines (mainly XGBoost), we challenge an actual model used by BANK A for scoring through-the-door auto loan applicants. We show that the use of such algorithms for credit scoring models drastically improves performance and the default capture rate. Furthermore, we leverage the power of Shapley values to show that these relatively simple models are not as black-box as the current regulatory system assumes, and we explain the model outputs and credit scores within the BANK A model design and validation framework. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.20225 |
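A minimal sketch of the Shapley-value explanation step described above, using the open-source shap package on a gradient-boosted scorecard. The features, data, and target below are invented placeholders, not BANK A's model.

```python
import numpy as np
import pandas as pd
import shap
from xgboost import XGBClassifier

# Placeholder loan-application features; the real model uses BANK A's data.
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "income":         rng.normal(50_000, 15_000, 2_000),
    "loan_amount":    rng.normal(20_000, 8_000, 2_000),
    "debt_to_income": rng.uniform(0.0, 0.6, 2_000),
    "prior_defaults": rng.integers(0, 3, 2_000),
})
# Toy default flag loosely tied to debt-to-income and prior defaults.
y = ((X["debt_to_income"] > 0.4) | (X["prior_defaults"] > 1)).astype(int)

model = XGBClassifier(n_estimators=200, max_depth=3).fit(X, y)

# TreeExplainer computes exact Shapley values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Per-applicant attribution: which features pushed the score up or down.
print(pd.DataFrame(shap_values, columns=X.columns).head())
```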
By: | Vanessa Heinemann-Heile (Paderborn University) |
Abstract: | I investigate whether a machine learning model can reliably predict firms’ tax rate perception. While standard models assume that decision-makers in firms are perfectly informed about firms’ tax rates and tax implications, their tax rate perception also influences the way in which they incorporate taxes into their decision-making processes. However, studies examining firms’ tax rate perception and its consequences remain scarce, mostly due to a lack of observations of firms’ tax rate perception. Using a dataset of German SMEs, I apply machine learning, in the form of Extreme Gradient Boosting, to predict firms’ tax rate perception based on firm and personal characteristics of the decision-maker. The results show that Extreme Gradient Boosting outperforms traditional OLS regression. The model is highly accurate, as evidenced by a mean prediction error of less than one percentage point; produces reasonably precise predictions, as indicated by a root mean square error comparable to the standard deviation; and explains up to 23.2% of the variance in firms’ tax rate perception. Even based on firm characteristics only, the model maintains high accuracy, albeit with some decline in precision and explained variance. Consistent with this finding, Shapley values highlight the importance of firm and personal characteristics such as tax compliance costs, tax literacy, and trust in government for the prediction. The results show that machine learning models can provide a time- and cost-effective way to fill the information gap created by the lack of observations on firms’ tax rate perception. This approach allows researchers and policymakers to further analyze the impact of firms’ tax rate perception on tax reforms, tax compliance, or business decisions. |
Keywords: | Tax Rate Perception, Business Taxation, Prediction, XGBoost, Shapley |
JEL: | H25 D91 C8 C53 |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:pdn:dispap:128 |
By: | Denisa Millo; Blerina Vika; Nevila Baci |
Abstract: | The financial sector, a pivotal force in economic development, increasingly uses intelligent technologies such as natural language processing to enhance data processing and insight extraction. Through a review covering 2018-2023, this paper explores the use of text mining and natural language processing techniques in various components of the financial system, including asset pricing, corporate finance, derivatives, risk management, and public finance, and highlights the specific problems to be addressed in the discussion section. We observe that most of the research combines probabilistic with vector-space models, and text data with numerical data. The most used information-processing technique is classification, and the most used algorithms include long short-term memory and bidirectional encoder models. We find that new special-purpose algorithms are being developed and that the literature focuses mainly on the asset pricing component of the financial system. The paper also proposes a path, from an engineering perspective, for researchers who need to analyze financial text. Challenges on the text mining side, such as data quality, context adaptation, and model interpretability, need to be solved in order to integrate advanced natural language processing models and techniques into financial analysis and prediction. |
Keywords: | Financial System (FS), Natural Language Processing (NLP), Software and Text Engineering, Probabilistic and Vector-Space Models, Text Data, Financial Analysis |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.20438 |
By: | Kemal Kirtac; Guido Germano |
Abstract: | We investigate the efficacy of large language models (LLMs) in sentiment analysis of U.S. financial news and their potential in predicting stock market returns. We analyze a dataset comprising 965,375 news articles that span from January 1, 2010, to June 30, 2023; we focus on the performance of various LLMs, including BERT, OPT, FINBERT, and the traditional Loughran-McDonald dictionary model, which has been a dominant methodology in the finance literature. The study documents a significant association between LLM scores and subsequent daily stock returns. Specifically, OPT, which is a GPT-3 based LLM, shows the highest accuracy in sentiment prediction with an accuracy of 74.4%, slightly ahead of BERT (72.5%) and FINBERT (72.2%). In contrast, the Loughran-McDonald dictionary model demonstrates considerably lower effectiveness with only 50.1% accuracy. Regression analyses highlight a robust positive impact of OPT model scores on next-day stock returns, with coefficients of 0.274 and 0.254 in different model specifications. BERT and FINBERT also exhibit predictive relevance, though to a lesser extent. Notably, we do not observe a significant relationship between the Loughran-McDonald dictionary model scores and stock returns, challenging the efficacy of this traditional method in the current financial context. In portfolio performance, the long-short OPT strategy excels with a Sharpe ratio of 3.05, compared to 2.11 for BERT and 2.07 for FINBERT long-short strategies. Strategies based on the Loughran-McDonald dictionary yield the lowest Sharpe ratio of 1.23. Our findings emphasize the superior performance of advanced LLMs, especially OPT, in financial market prediction and portfolio management, marking a significant shift in the landscape of financial analysis tools with implications for financial regulation and policy analysis. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2412.19245 |
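To illustrate the portfolio construction behind the reported Sharpe ratios, here is a hedged sketch of a daily long-short strategy that buys the top sentiment decile, shorts the bottom decile, and computes an annualized Sharpe ratio. Rebalancing rules, weights, and data are assumptions; the paper's exact methodology may differ.

```python
import numpy as np
import pandas as pd

def long_short_returns(scores: pd.DataFrame, rets: pd.DataFrame, q=0.1):
    """Each day, go long the top-q sentiment decile and short the bottom-q,
    equal-weighted; realize next-day returns."""
    daily = []
    for day in scores.index[:-1]:
        s = scores.loc[day].dropna()
        hi, lo = s.quantile(1 - q), s.quantile(q)
        longs, shorts = s[s >= hi].index, s[s <= lo].index
        nxt = rets.index[rets.index.get_loc(day) + 1]
        daily.append(rets.loc[nxt, longs].mean() - rets.loc[nxt, shorts].mean())
    return pd.Series(daily)

def annualized_sharpe(r: pd.Series, periods=252):
    return np.sqrt(periods) * r.mean() / r.std()

# Placeholder panels (dates x tickers) of sentiment scores and returns.
idx = pd.bdate_range("2020-01-01", periods=500)
cols = [f"stock_{i}" for i in range(50)]
rng = np.random.default_rng(7)
scores = pd.DataFrame(rng.normal(size=(500, 50)), index=idx, columns=cols)
rets = pd.DataFrame(rng.normal(0, 0.02, size=(500, 50)), index=idx, columns=cols)

print("Sharpe:", annualized_sharpe(long_short_returns(scores, rets)))
```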
By: | Hendriks, Patrick; Sturm, Timo; Mehler, Maren F.; Buxmann, Peter |
Abstract: | Culture is fundamental to our society, shaping the traditions, ethics, and laws that guide people’s beliefs and behaviors. At the same time, culture is also shaped by people—it evolves as people interact and collectively select, modify, and transmit the beliefs they deem desirable. As artificial intelligence (AI) becomes more integrated into our lives, it plays an increasing role in how cultural beliefs are (re)shaped and promoted. Using a series of agent-based simulations, we analyze how different ways of integrating AI into society (e.g., national vs. global AI) impact cultural evolution, thereby shaping cultural diversity. We find that less globalized AI can help promote diversity in the short run, but risks eliminating diversity in the long run. This becomes more pronounced the less humans and AI are grounded in each other’s beliefs. Our findings help researchers revisit cultural evolution in the presence of AI and assist policymakers with AI governance. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:dar:wpaper:150816 |
By: | Vetter, Oliver Andreas |
Abstract: | In today’s world, artificial intelligence (AI) permeates almost all areas of human life. Modern AI supports us both in our leisure time (e.g., built into applications on our smartphones that recommend music we may like, recognize people in pictures, or act as digital assistants) and in our work (e.g., by automating tasks or creating analyses, predictions, or almost perfectly formulated texts). For organizations, AI, particularly instances of AI built with the data-based learning approach of machine learning (ML), also unlocks entirely new possibilities. Such ML systems can, for instance, be integrated into organizational processes to achieve efficiency gains by automating (parts of) tasks or to elevate decision-making quality by provisioning information. Furthermore, as the examples above illustrate, ML also enables the creation of novel kinds of products and services. The associated entrepreneurial opportunities unveiled by the latest technological advancements in the field of ML are correspondingly diverse and numerous, offering enormous potential for exploitation through suitable business models. A business model is an activity system that illustrates the logic of how an organization conducts business, i.e., how it creates value through the creation of its products and services, how it delivers this value to its customers, and how it ultimately captures value for itself, e.g., in the form of profits. However, at the same time, the challenges that accompany the integration of ML into the business models of organizations exhibit similar diversity and also differ from those posed by other digital technologies. Therefore, the existing literature underscores that organizations wishing to harness the power of ML to drive their business models must carefully consider the peculiarities of the technology to be able to benefit in the long term. Yet, the state of existing research on the actual implementation of the various facets of ML-driven business models is sparse and lacks insights into their alignment with the particularities of ML. To expand this understanding in both academia and practice, this dissertation incorporates five papers that successively investigate ML-driven business models along the three business model dimensions of value creation, value delivery, and value capture. It examines both the ML-induced challenges that arise in each of these dimensions and the opportunities unlocked by ML, elaborating on their influences on the business logic of organizations from the perspective of the respective dimension of the business model. First, two studies address the dimension of ML-driven value creation. The creation of ML systems requires experts from various disciplines to collectively reflect on the organization’s existing knowledge (e.g., when making sense of data), which can lead to the creation of additional knowledge (e.g., through insights into inefficiencies in routines). Moreover, their data-based learning enables ML systems to generate knowledge in a way that complements the strengths of humans and thus to uniquely contribute to knowledge creation and revision in organizations. Existing literature on organizational learning hence regards productive ML systems as a new type of learner alongside humans. Yet, the potential for learning during ML development efforts, which include interactions of interdisciplinary groups of experts and (prototypical) ML systems, has to date remained largely unexplored. 
The first associated study therefore illuminates the beneficial learning processes that the creation of ML systems can stimulate. It also highlights the resulting human knowledge as a valuable additional by-product that can contribute to the knowledge base of the organization and thus to its long-term success. The second study examines a downside of the data-based learning approach of ML: the need for extensive experimentation during ML development. This runs counter to the demand of conventional business processes for efficiency and exploiting existing strengths, and organizations must allocate their limited resources between the two approaches, creating tensions during ML development that can take various forms in different structural approaches. Building on the theoretical foundation of organizational ambidexterity, the study identifies these tensions and corresponding tactics that organizations can employ to alleviate the tensions, depending on their manifestation, and facilitate ML-driven value creation. Next, the dissertation discusses the second dimension, ML-driven value delivery. A particular problem in this context is that ML systems are often highly complex, making them and their outputs incomprehensible to humans. If they cannot use them due to a lack of understanding, customers of ML-driven business models may thus fail to benefit from the value the business model intends to deliver (i.e., the ML system or its outputs). Therefore, the literature on explainable AI contains approaches that can provide users of ML systems with explanations that disclose their inner workings and the reasoning behind their decisions. Yet, thus far, these approaches have lacked a focus on distinct user groups and their specific requirements of the system. Especially lay users have often been neglected in previous studies. However, fostering the lay users’ understanding is critical if they are to incorporate the output of ML systems in their decision-making to benefit from the products and services of ML-driven business models. Hence, the third study in this dissertation follows a design science research process and presents an approach to elaborate the requirements specific to the users of ML systems. On this basis, the study further derives design principles for designing ML systems that provide user-centric explanations and thereby enhance value delivery. Finally, two more studies shed light on the third dimension of ML-driven value capture. In pursuit of their own goals, organizations must align all components of their business model to enable the capture of value, e.g., the reaping of profits from their business model in the long term. Only creating valuable solutions and supplying them to customers does not guarantee value capture for the organization, as the decade-long search of Twitter (now X) for a suitable way to profit from its unique offering and massive user base illustrates. With the current literature yielding little clarity on the nature of ML-driven business models, the fourth study in this dissertation aims to create a fundamental understanding of the business model components that organizations must align for successful value capture. Specifically, the resulting taxonomy offers insights into the components of ML-driven business models and is supplemented by archetypes that represent structural compositions of ML-driven business models commonly found in practice. 
Building on these findings, the fifth study investigates the question of how organizations seeking to profit from ML-driven business models can successfully realize them, which is under-researched in today’s scientific literature as well. Realizing business models is an inherently dynamic and iterative process. In the case of ML-driven business models, the particularities of ML systems further complicate the effort, due (for instance) to the additional uncertainty stemming from the experimental character of ML development. Therefore, the study shows that organizations must build dynamic capabilities to be able to successfully realize ML-driven business models in the long term. Moreover, the study develops microfoundations (e.g., practices or processes) that empower the creation of the necessary dynamic capabilities, consequently contributing to the understanding of how organizations can successfully capture value sustainably from their ML-driven business models. The studies within this dissertation illustrate that organizations must consider the unique characteristics of ML when designing and implementing their ML-driven business models to achieve sustainable success. Specifically, they show that the effects of ML particularities, such as the need for extensive experimentation in ML development, can manifest themselves in all three dimensions of the business model and can be both inhibiting (e.g., through additional uncertainty in the realization of the business model) and value-adding (e.g., through stimulated learning processes). The studies further delineate how organizations can take these influences into account through appropriate responses. This dissertation thus represents an important step toward a holistic understanding of ML-driven business models, emphasizes the value of the business model perspective for investigating the influence of ML on the business logic of organizations, and yields contributions to strategic management, entrepreneurship, and information systems literature. Thereby, it provides fertile ground for future examinations of ML-driven value creation, value delivery, and value capture against the backdrop of the high-level technological and entrepreneurial dynamism in the field of ML. |
Date: | 2024–11–21 |
URL: | https://d.repec.org/n?u=RePEc:dar:wpaper:150671 |
By: | Philipp Bach; Victor Chernozhukov; Sven Klaassen; Martin Spindler; Jan Teichert-Kluge; Suhas Vijaykumar |
Abstract: | This paper advances empirical demand analysis by integrating multimodal product representations derived from artificial intelligence (AI). Using a detailed dataset of toy cars on Amazon.com, we combine text descriptions, images, and tabular covariates to represent each product using transformer-based embedding models. These embeddings capture nuanced attributes, such as quality, branding, and visual characteristics, that traditional methods often struggle to summarize. Moreover, we fine-tune these embeddings for causal inference tasks. We show that the resulting embeddings substantially improve the predictive accuracy of sales ranks and prices and that they lead to more credible causal estimates of price elasticity. Notably, we uncover strong heterogeneity in price elasticity driven by these product-specific features. Our findings illustrate that AI-driven representations can enrich and modernize empirical demand analysis. The insights generated may also prove valuable for applied causal inference more broadly. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.00382 |
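A sketch of the embedding step under stated assumptions: product text is encoded with a transformer embedding model and the vectors enter a downstream regression. The model name (all-MiniLM-L6-v2), the tiny example, and the Ridge regression are placeholders; the paper fine-tunes its embeddings for causal (price-elasticity) tasks, which this sketch does not show.

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import Ridge

# Encode product descriptions into dense vectors.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice
descriptions = [
    "1:64 die-cast racing car, red, rubber tires",
    "wooden toy car, handcrafted, natural finish",
    "remote-control stunt car with LED lights",
]
emb = encoder.encode(descriptions)  # shape (n_products, embedding_dim)

# Toy target: log sales rank; embeddings serve as covariates.
log_rank = np.array([8.1, 9.7, 7.3])
reg = Ridge(alpha=1.0).fit(emb, log_rank)
print(reg.predict(emb))
```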
By: | Wei Li; Yi-Lun Du; Nan Su; Konrad Tywoniuk; Kyle Godbey; Horst St\"ocker |
Abstract: | Community detection, also known as graph partitioning, is a well-known NP-hard combinatorial optimization problem with applications in diverse fields such as complex network theory, transportation, and smart power grids. The problem's solution space grows drastically with the number of vertices and subgroups, making efficient algorithms crucial. In recent years, quantum computing has emerged as a promising approach to tackling NP-hard problems. This study explores the use of a quantum-inspired algorithm, Simulated Bifurcation (SB), for community detection. Modularity is employed as both the objective function and a metric to evaluate the solutions. The community detection problem is formulated as a Quadratic Unconstrained Binary Optimization (QUBO) problem, enabling seamless integration with the SB algorithm. Experimental results demonstrate that SB effectively identifies community structures in benchmark networks such as Zachary's Karate Club and the IEEE 33-bus system. Remarkably, SB achieved the highest modularity, matching the performance of Fujitsu's Digital Annealer, while surpassing results obtained from two quantum machines, D-Wave and IBM. These findings highlight the potential of Simulated Bifurcation as a powerful tool for solving community detection problems. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.00075 |
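For reference, the modularity objective that the study maximizes, and then recasts as a QUBO for the simulated-bifurcation solver, takes the standard Newman form:

```latex
Q = \frac{1}{2m}\sum_{i,j}\left(A_{ij} - \frac{k_i k_j}{2m}\right)\delta(c_i, c_j)
```

where A is the adjacency matrix, k_i the degree of vertex i, m the number of edges, and the delta function equals one when vertices i and j share a community. Because community assignments are binary variables, maximizing Q maps naturally to a quadratic unconstrained binary optimization problem.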
By: | Lee, Heungmin |
Abstract: | The rapid advancements in large language models (LLMs) have ushered in a new era of transformative potential for the finance industry. This paper explores the latest developments in the application of LLMs across key areas of the finance domain, highlighting their significant impact and future implications. In the realm of financial analysis and modelling, LLMs have demonstrated the ability to outperform traditional models in tasks such as stock price prediction, portfolio optimization, and risk assessment. By processing vast amounts of financial data and leveraging their natural language understanding capabilities, these models can generate insightful analyses, identify patterns, and provide data-driven recommendations to support decision-making processes. The conversational capabilities of LLMs have also revolutionized the customer service landscape in finance. LLMs can engage in natural language dialogues, addressing customer inquiries, providing personalized financial advice, and even handling complex tasks like loan applications and investment planning. This integration of LLMs into financial institutions has the potential to enhance customer experiences, improve response times, and reduce the workload of human customer service representatives. Furthermore, LLMs are making significant strides in the realm of risk management and compliance. These models can analyze complex legal and regulatory documents, identify potential risks, and suggest appropriate remedial actions. By automating routine compliance tasks, such as anti-money laundering (AML) checks and fraud detection, LLMs can help financial institutions enhance their risk management practices and ensure better compliance, mitigating the risk of costly penalties or reputational damage. As the finance industry continues to embrace the transformative potential of LLMs, it will be crucial to address the challenges surrounding data privacy, algorithmic bias, and the responsible development of these technologies. By navigating these considerations, the finance sector can harness the full capabilities of LLMs to drive innovation, improve efficiency, and ultimately, enhance the overall financial ecosystem. |
Date: | 2025–01–03 |
URL: | https://d.repec.org/n?u=RePEc:osf:osfxxx:ahkd3 |
By: | Hannes Mueller; Christopher Rauh; Benjamin R Seimon; Mr. Raphael A Espinoza |
Abstract: | Can macroeconomic policy effectively help prevent armed conflicts? This paper contends that two key criteria need to be satisfied: the long-term benefits of prevention policies must exceed the costs associated with uncertain forecasts, and the policies themselves must be able to contribute directly to conflict prevention. This paper proposes policy simulations, based on a novel method of Mueller et al. (2024a) that integrates machine learning and dynamic optimization, to show that investing in prevention can generate huge long-run benefits. Returns to prevention policies in countries that have not suffered recently from violence range from $26 to $75 per $1 spent on prevention; for countries with recent violence, the rate of return could be as high as $103 per $1 spent. Furthermore, an analysis of the available data and results in the literature suggests that sound macroeconomic policies and international support for these policies can play key roles in conflict prevention. Based on these findings, this paper proposes actionable recommendations for both global and domestic policymakers, as well as international financial institutions and multilateral organizations, to promote peace and stability through macroeconomic policy. |
Keywords: | Prevention; fragile state; conflict; machine learning |
Date: | 2024–12–20 |
URL: | https://d.repec.org/n?u=RePEc:imf:imfwpa:2024/256 |
By: | Zöll, Anne |
Abstract: | Companies’ data-driven digital services rely on the collection of personal data and its processing by self-learning algorithms. With the help of machine learning, companies can offer personalized services tailored to customer needs. As a result of the intensive collection of personal information by companies, customers have a sense of loss of control over their own personal information. They also have high privacy concerns about data handling. These concerns are amplified by high-profile data breaches such as the Cambridge Analytica scandal. Consequently, customers are increasingly hesitant to share their personal data with these companies, which could pose a risk to data-driven digital services. A smaller amount of data could compromise the performance of algorithms and thus reduce the quality of data-driven digital services. Therefore, the stated goal of this dissertation is to establish the complex balance between protecting customers’ privacy and improving value creation processes. Thus, the central research question of this dissertation is how companies can mitigate the dilemma between protecting individual privacy and enhancing data-driven digital services. This dissertation examines the issue from three different perspectives: technological, individual, and organizational. Over the past decades, privacy-enhancing technologies have been developed. These information and communication technologies protect individuals’ privacy either by removing or minimizing personal information or by preventing unnecessary or unwanted processing of personal information, while maintaining the functionality of information systems. Despite the advanced implementation of these privacy-enhancing technologies, they are rarely used in data-driven digital services. Therefore, this dissertation provides an overview of the reasons why these privacy-enhancing technologies are only reluctantly adopted by companies. In particular, it highlights the barriers that arise when integrating these technologies into data-driven digital services. Thus, this dissertation demonstrates that a purely technological solution is not sufficient to fully answer the research question. This is the starting point of this dissertation, which aims to find a solution to mitigate the aforementioned dilemma. As privacy concerns are primarily customer-driven, this dissertation focuses on individuals as a further perspective. This perspective aims to examine how companies should design data-driven digital services to alleviate customer privacy concerns. To achieve this goal, the dissertation draws on theories from privacy research, focusing on individuals’ control over their personal information and trust in data-driven digital services. Essentially, design principles are developed that are necessary to create data-driven digital services that allow individuals to regain control over their personal data. Furthermore, this dissertation develops design principles to enhance customers’ trust in data-driven digital services, especially those based on machine learning. As a third perspective, organizations are included, particularly examining how machine learning can be integrated into companies’ value creation processes to build data-driven digital services. The focus of this research is to identify the factors that either support or hinder the integration of machine learning into companies’ value creation processes. 
Although many factors for the adoption of innovations have been examined in previous literature, a re-examination is important because the characteristics of machine learning differ significantly from those of other technologies. For instance, vast amounts of personal information are processed to generate personalized recommendations for individuals. The ability of machine learning to uncover hidden patterns can lead to the inadvertent disclosure of sensitive personal information, thereby intensifying privacy concerns. Additionally, this dissertation builds on previous research that highlights differences in the acceptance of innovations across cultures and examines which factors are important for the adoption of machine learning in data-driven digital services in different cultures. In this regard, this dissertation applies the organizational readiness concept for artificial intelligence within cultural research to gain deeper insights into this intersection. In summary, this dissertation presents three important perspectives that aim to alleviate the dilemma between the protection of individuals’ privacy and the use of machine learning for value creation in companies. It deals with privacy-enhancing technologies, prioritizes user-centered approaches, and addresses the strategic design of value creation processes within companies. Driven by these three perspectives, this dissertation motivates the development of a multilevel theory that aims to enable a holistic approach to alleviating the dilemma between privacy protection and value creation by bringing together technology, individuals, and organizations. |
Date: | 2024–12–03 |
URL: | https://d.repec.org/n?u=RePEc:dar:wpaper:150796 |
By: | Yuhao Fu; Nobuyuki Hanaki |
Abstract: | This experimental study investigates whether people rely more on ChatGPT (GPT-4) than on their human peers when detecting AI-generated fake news (deepfake news). In multiple rounds of deepfake detection tasks conducted in a laboratory setting, student participants exhibited a greater reliance on ChatGPT compared to their peers. We explored this over-reliance on AI from two perspectives: the weight of advice (WOA) and the decomposition of reliance (DOR) into two stages. Our analysis indicates that reliance on external advice is primarily influenced by the source and quality of the advice, as well as the subjects’ prior beliefs, knowledge, and experience, while the type of news and time spent on tasks have no effect. Additionally, our study indicates a potential sequential mechanism of advice utilization, wherein the advice source affects reliance in both stages—activation and integration—whereas the quality of the advice, along with knowledge and experience, influences only the second stage. Our findings suggest that relying on AI to detect AI may not be detrimental and could, in fact, contribute to a deeper understanding of human-AI interaction and support advancements in AI development during the Generative Artificial Intelligence (GAI) era. |
Date: | 2024–03 |
URL: | https://d.repec.org/n?u=RePEc:dpr:wpaper:1233r |
By: | Filip Stefaniuk (University of Warsaw, Faculty of Economic Sciences); Robert Ślepaczuk (University of Warsaw, Faculty of Economic Sciences, Department of Quantitative Finance and Machine Learning, Quantitative Finance Research Group) |
Abstract: | The thesis investigates the use of the Informer architecture for building automated trading strategies for high-frequency Bitcoin data. Two strategies using Informer models with different loss functions, Quantile loss and Generalized Mean Absolute Directional Loss (GMADL), are proposed and evaluated against the Buy and Hold benchmark and two benchmark strategies based on technical indicators. The evaluation is conducted using data of various frequencies (5-minute, 15-minute, and 30-minute intervals) over six different periods. Although the Informer-based model with Quantile loss did not manage to outperform the benchmark, the model that uses the novel GMADL loss function benefited from higher-frequency data and beat all the other strategies on most of the testing periods. The primary contribution of this study is the application and assessment of the Quantile and GMADL loss functions with the Informer model to forecast future returns, subsequently using these forecasts to develop automated trading strategies. The research provides evidence that employing an Informer model trained with the GMADL loss function can result in superior trading outcomes compared to the buy-and-hold approach. |
Keywords: | Machine Learning, Financial Series Forecasting, Automated Trading Strategy, Informer, Transformer, Bitcoin, High Frequency Trading, Statistics, GMADL |
JEL: | C4 C14 C45 C53 C58 G13 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:war:wpaper:2024-27 |
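The GMADL idea named above trains the forecaster on directional profitability rather than squared error. A hedged sketch of a directional loss in this spirit follows; the smooth tanh term stands in for the sign function so that gradients flow, and the exact GMADL parametrization used in the thesis may differ.

```python
import torch

def directional_loss(pred: torch.Tensor, ret: torch.Tensor,
                     a: float = 100.0, b: float = 1.0) -> torch.Tensor:
    """Smooth directional loss in the spirit of (G)MADL: reward sign
    agreement between forecast and realized return, weighted by |ret|**b.
    tanh(a*x) is a differentiable stand-in for sign(x); the exact GMADL
    form used in the thesis may differ (assumption)."""
    return (-torch.tanh(a * pred * ret) * torch.abs(ret) ** b).mean()

pred = torch.tensor([0.010, -0.020, 0.005], requires_grad=True)
ret = torch.tensor([0.008, 0.015, -0.003])
loss = directional_loss(pred, ret)
loss.backward()  # gradients push forecasts toward sign agreement
print(loss.item(), pred.grad)
```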
By: | MEDAGLIA Rony; MIKALEF Patrick; TANGI Luca (European Commission - JRC) |
Abstract: | The diffusion of artificial intelligence (AI) in the public sector depends largely on ensuring the presence of appropriate competences and establishing appropriate governance practices to deploy solutions. This report builds on a synthesis of empirical research, grey and policy literature, on an expert workshop and on interviews from seven case studies of European public organisations to identify the competences and governance practices around AI required to enable value generation in the public sector. Based on the analysis, we present a comprehensive framework for relevant competences and a framework for the governance practices for AI in the public sector. The report also introduces six recommendations to be implemented through 18 actions to facilitate the development of the competences and governance practices needed for AI in the public sector in Europe. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:ipt:iptwpa:jrc138702 |
By: | MORIKAWA Masayuki |
Abstract: | Based on a survey of Japanese workers, this study documents the characteristics of workers who use artificial intelligence (AI) in their jobs and estimates the effects of this new general-purpose technology on macroeconomic productivity. The results indicate, first, that 8.3% of workers used AI in their jobs in 2024, approximately 1.5 times the share in 2023. Second, more educated and higher-wage workers tend to use AI, suggesting that its diffusion may increase labor market inequality. Third, the use of AI is estimated to have increased labor productivity in the macroeconomy by 0.5–0.6%. Fourth, nearly 30% of workers expect to use AI in their jobs in the future, suggesting that its macroeconomic effects will increase. However, the productivity effect of AI for those who recently started using it is relatively small, suggesting a diminishing productivity impact of AI. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:eti:dpaper:24084 |
By: | Fetzer, Thiemo (Warwick University & University of Bonn, Departments of Economics. Also affiliated or fellowed at Grantham Research Institute, ECONtribute, CEPR, NIESR, CESifo and STICERD); Lambert, Peter John (London School of Economics and Political Science (LSE), Department of Economics. Also affiliated: Centre for Economic Performance (CEP), Programme on Innovation and Diffusion (POID)); Feld, Bennet (London School of Economics and Political Science (LSE), Department of Economics); Garg, Prashant (Imperial College London, Department of Economics and Public Policy. Also affiliated with IFC) |
Abstract: | This paper leverages generative AI to build a network structure over 5,000 product nodes, where directed edges represent input-output relationships in production. We lay out a two-step 'build-prune' approach using an ensemble of prompt-tuned generative AI classifications. The 'build' step provides an initial distribution of edge predictions; the 'prune' step then re-evaluates all edges. With our AI-generated Production Network (AIPNET) in tow, we document a host of shifts in the network position of products and countries during the 21st century. Finally, we study production network spillovers using the natural experiment presented by the 2017 blockade of Qatar. We find strong evidence of such spillovers, suggestive of on-shoring of critical production. This descriptive and causal evidence demonstrates some of the many research possibilities opened up by our granular measurement of product linkages, including studies of on-shoring, industrial policy, and other recent shifts in global trade. |
Keywords: | Supply-Chain Network Analysis, Large Language Models, On-shoring, industrial policy, Trade wars, Econometrics-of-LLMs |
JEL: | F14 F23 L16 F52 O25 N74 C81 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:cge:wacage:733 |
By: | ITO Arata; SATO Masahiro; OTA Rui |
Abstract: | Policy uncertainty has the potential to reduce policy effectiveness. Existing studies have measured policy uncertainty by tracking the frequency of specific keywords in newspaper articles. However, this keyword-based approach fails to account for the context of the articles and to differentiate the types of uncertainty that such contexts indicate. This study introduces a new method of measuring different types of policy uncertainty in news content that utilizes large language models (LLMs). Specifically, we differentiate policy uncertainty into forward-looking and backward-looking uncertainty, or in other words, uncertainty regarding future policy direction and uncertainty about the effectiveness of the current policy. We fine-tune the LLMs to identify each type of uncertainty expressed in newspaper articles based on their context, even in the absence of specific keywords indicating uncertainty. By applying this method, we measure Japan’s monetary policy uncertainty (MPU) from 2015 to 2016. To reflect the unprecedented monetary policy conditions during this period, when unconventional policies were in place, we further classify MPU by layers of policy changes: changes in specific market operations and changes in the broader policy framework. The experimental results show that our approach successfully captures the dynamics of MPU, particularly forward-looking uncertainty, which is not fully captured by the existing approach. Forward- and backward-looking uncertainty indices exhibit distinct movements depending on the conditions under which changes in the policy framework occur. This suggests that perceived uncertainty regarding monetary policy is state-dependent, varying with the prevailing social environment. |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:eti:dpaper:24080 |
By: | Paul, Joseph R.; Schaffer, Mark E. |
Abstract: | This paper introduces conformal inference, a powerful and flexible framework for constructing prediction intervals with guaranteed coverage in finite samples. Unlike conventional methods, conformal inference makes no assumptions about the underlying data distribution other than exchangeability. The paper begins with some simple examples of full and split conformal prediction that highlight the key assumption of exchangeability. We then provide more formal treatments of full and split conformal prediction along with extensions of the basic framework, including the Jackknife+ and CV+ algorithms, both of which offer a better balance between computational and statistical efficiency compared to full and split conformal prediction. The paper then discusses the limitations to achieving exact conditional coverage and several methods that aim to improve conditional coverage in practice. The final section briefly discusses areas of current research and software options for implementing conformal methods. |
Keywords: | conformal inference, conformal prediction, distribution-free inference, machine learning |
JEL: | C12 C14 C53 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:zbw:hwuaef:308058 |
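Since the paper is expository, a minimal sketch of split conformal prediction is worth spelling out: fit on one half of the data, compute absolute residuals on the other half, and widen point predictions by the appropriate empirical quantile. The model choice and data below are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Split conformal prediction: fit on one half, calibrate on the other.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=2000)

X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.5,
                                              random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_fit, y_fit)

# Calibration residuals and the finite-sample conformal quantile.
alpha = 0.1
scores = np.abs(y_cal - model.predict(X_cal))
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Intervals [f(x) - q, f(x) + q] cover with probability >= 1 - alpha
# under exchangeability, with no further distributional assumptions.
x_new = rng.normal(size=(5, 5))
pred = model.predict(x_new)
print(np.c_[pred - q, pred + q])
```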
By: | Erdinc Akyildirim (University of Bradford); Matteo Gambara (INAIT SA); Josef Teichmann (ETH Zurich; Swiss Finance Institute); Syang Zhou (ETH) |
Abstract: | We present convincing empirical results on the application of Randomized Signature Methods for non-linear, non-parametric drift estimation for a multivariate financial market. Even though drift estimation is notoriously ill-defined due to the small signal-to-noise ratio, one can still try to learn optimal non-linear maps from data to future returns for the purposes of portfolio optimization. Randomized signatures, in contrast to classical signatures, allow for high market dimension and provide features on the same scale. We do not contribute to the theory of randomized signatures here, but rather present our empirical findings on portfolio selection in real-world settings, including real market data and transaction costs. |
Keywords: | Machine Learning, Randomized Signature, Drift estimation, Returns forecast, Portfolio Optimization, Path-dependent Signal |
JEL: | C21 C22 G11 G14 G17 |
Date: | 2024–01 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2479 |
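A sketch of a discretized randomized-signature feature map under stated assumptions: the state is updated by a random controlled recursion driven by path increments, yielding a fixed-size feature vector for downstream drift regression. The random matrices, tanh activation, and dimensions are illustrative choices, not the paper's exact construction.

```python
import numpy as np

def randomized_signature(path, dim=50, seed=0):
    """Discretized randomized signature of a d-dimensional path.
    The state Z follows a random controlled recursion driven by increments:
        Z <- Z + sum_k tanh(A_k @ Z + b_k) * dX_k
    The random A_k, b_k and the tanh activation are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n_steps, d = path.shape
    A = rng.normal(scale=1 / np.sqrt(dim), size=(d, dim, dim))
    b = rng.normal(size=(d, dim))
    Z = np.ones(dim)
    dX = np.diff(path, axis=0)
    for t in range(n_steps - 1):
        Z = Z + sum(np.tanh(A[k] @ Z + b[k]) * dX[t, k] for k in range(d))
    return Z  # fixed-size feature vector for downstream (drift) regression

# Toy 3-asset price path; features would feed a returns-forecasting model.
prices = np.cumsum(np.random.default_rng(1).normal(size=(100, 3)), axis=0)
print(randomized_signature(prices).shape)  # (50,)
```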
By: | Fetzer, Thiemo (Warwick University & University of Bonn); Lambert, Peter John (London School of Economics and Political Science); Feld, Bennet (London School of Economics and Political Science); Garg, Prashant (Imperial College London) |
Abstract: | This paper leverages generative AI to build a network structure over 5,000 product nodes, where directed edges represent input-output relationships in production. We lay out a two-step ‘build-prune’ approach using an ensemble of prompt-tuned generative AI classifications. The ‘build’ step provides an initial distribution of edge predictions; the ‘prune’ step then re-evaluates all edges. With our AI-generated Production Network (AIPNET) in tow, we document a host of shifts in the network position of products and countries during the 21st century. Finally, we study production network spillovers using the natural experiment presented by the 2017 blockade of Qatar. We find strong evidence of such spillovers, suggestive of on-shoring of critical production. This descriptive and causal evidence demonstrates some of the many research possibilities opened up by our granular measurement of product linkages, including studies of on-shoring, industrial policy, and other recent shifts in global trade. |
Keywords: | Supply-Chain Network Analysis; Large Language Models; On-shoring; industrial policy; Trade wars; Econometrics-of-LLMs |
JEL: | F14 F23 L16 F52 O25 N74 C81 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:wrk:warwec:1528 |
By: | Darudi, Ali |
Abstract: | The Future Electricity Market Model (FEM) is a comprehensive technoeconomic model designed to simulate the investment, dispatch, and trade dynamics within the power systems of Switzerland and Europe. FEM operates as a partial equilibrium model of the wholesale electricity market, minimizing total system costs while adhering to a wide range of technical constraints. It provides projections of capacity mix, hourly prices, generation profiles, storage dispatch, flexible consumption, and cross-border electricity trading across different market areas. The model is formulated as a quadratic programming problem, implemented in Python, and solved using the Gurobi optimizer. With an hourly resolution over a year and a specific focus on the Swiss power system, FEM allows investment decisions solely within Switzerland, while the rest of Europe follows predefined development scenarios. Key features include the modeling of various renewable and conventional energy technologies, integration of storage systems, and incorporation of detailed electricity demand and trade constraints. In order to model the hydro power system more realistically, the model follows a hydro calendar year, i.e., the model starts at the beginning of October. Despite its deterministic approach, assuming perfect foresight, which may introduce an optimism bias, FEM serves as a powerful tool for analyzing the future dynamics of electricity markets under various scenarios. |
Keywords: | Electricity market, Numerical modelling, Energy |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:zbw:esprep:306396 |
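To illustrate the model class, a toy one-hour dispatch problem in FEM's stated stack (Python with the Gurobi optimizer) follows. The plants, costs, capacities, and demand are invented; FEM's actual formulation is a full-year hourly quadratic program with trade, storage, and investment decisions.

```python
import gurobipy as gp
from gurobipy import GRB

# Toy single-hour dispatch: minimize generation cost subject to demand.
# plants: name -> (marginal cost in EUR/MWh, capacity in GW); all invented.
plants = {"hydro": (0.0, 5.0), "nuclear": (12.0, 8.0), "gas": (60.0, 10.0)}
demand = 18.0  # GW, invented

m = gp.Model("toy_dispatch")
gen = {p: m.addVar(lb=0.0, ub=cap, name=p)
       for p, (cost, cap) in plants.items()}
m.setObjective(gp.quicksum(plants[p][0] * gen[p] for p in plants),
               GRB.MINIMIZE)
m.addConstr(gp.quicksum(gen[p] for p in plants) == demand, name="balance")
m.optimize()

for p in plants:
    print(p, gen[p].X)  # dispatched output per plant
# Dual of the balance constraint is the marginal (wholesale) price.
print("price:", m.getConstrByName("balance").Pi)
```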
By: | Isaak, Niklas (RWI); Jessen, Robin (RWI) |
Abstract: | How much does society value redistribution? The common method to derive inverse-optimum welfare weights is by inverting an optimal-tax model. Our alternative imposes fewer restrictions on labor supply and enables comparisons across household types. We use a structural labor supply model to calculate the marginal value of public funds for various small tax reductions, directly linked to welfare weights. An application to Germany finds: i) The tax-transfer system is optimal if society values one additional Euro for the bottom decile three times as much as for the median. ii) At low-medium incomes, weights for couples exceed those for singles substantially. |
Keywords: | inverse optimum, microsimulation, marginal value of public funds, social welfare function, optimal taxation, labor supply, efficiency |
JEL: | H21 H31 J22 |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:iza:izadps:dp17566 |
By: | Lu, Hongyu; Rodgers, Michael O.; Guensler, Randall |
Abstract: | The Georgia Tech research team developed MOVES-Matrix 3.0 based on the EPA's MOVES3 (version 3.1.0) energy use and emission rate model by running MOVES3 thousands of times on the PACE supercomputing cluster across all combinations of input variables and storing the output as lookup tables. MOVES-Matrix 3.0 allows on-road energy consumption and emissions modeling to be conducted more than 800 times faster than running MOVES, while generating exactly the same results, as verified in this report. MOVES-Matrix 3.0 was designed similarly to its predecessor, MOVES-Matrix 2014, but required extensive code modifications to accommodate changes in the MOVES3 environment (including a shift from MySQL to MariaDB and the incorporation of new vehicle source sub-types and operating parameters). The review of the fuel and I/M scenarios indicated that MOVES3 now defines 122 modeling regions, compared with 109 regions in MOVES 2014b (different matrices need to be developed for each modeling region). The development of matrices for each modeling region takes approximately 15-20 days on the PACE supercomputing cluster given our assigned resources (compared with only 5-7 days to develop matrices for MOVES 2014). A case study of 3,000 roadway links using Atlanta's matrices confirmed that MOVES-Matrix 3.0 produces exactly the same energy consumption and emissions results as MOVES3, while execution modules operate 800 times faster using MOVES-Matrix lookups than running MOVES for any single run. |
Keywords: | Engineering, Emission Rate Modeling, Energy Use Modeling, MOVES, MOVES-Matrix |
Date: | 2024–10–01 |
URL: | https://d.repec.org/n?u=RePEc:cdl:itsdav:qt4cs5q28b |