nep-ain New Economics Papers
on Artificial Intelligence
Issue of 2025–01–27
24 papers chosen by
Ben Greiner, Wirtschaftsuniversität Wien


  1. The Emergence of Strategic Reasoning of Large Language Models By Dongwoo Lee; Gavin Kader
  2. Large Language Models: An Applied Econometric Framework By Jens Ludwig; Sendhil Mullainathan; Ashesh Rambachan
  3. New Technologies and Jobs in Europe By Stefania Albanesi
  4. Generative AI Impact on Labor Market: Analyzing ChatGPT's Demand in Job Advertisements By Mahdi Ahmadi; Neda Khosh Kheslat; Adebola Akintomide
  5. Automation, Techies, and Labor Market Restructuring By Ariell Reshef; Farid Toubal
  6. Augmenting Minds or Automating Skills: The Differential Role of Human Capital in Generative AI's Impact on Creative Tasks By Meiling Huang; Ming Jin; Ning Li
  7. Artificial Intelligence, Scientific Discovery, and Product Innovation By Aidan Toner-Rodgers
  8. Concepts and Challenges of Measuring Production of Artificial Intelligence in the U.S. Economy By Tina Highfill; David Wasshausen; Gregory Prunchak
  9. Follow the money: a startup-based measure of AI exposure across occupations, industries and regions By Enrico Maria Fenoaltea; Dario Mazzilli; Aurelio Patelli; Angelica Sbardella; Andrea Tacchella; Andrea Zaccaria; Marco Trombetti; Luciano Pietronero
  10. Safe AI made in the EU By Rehse, Dominik; Valet, Sebastian; Walter, Johannes
  11. A Scoping Review of ChatGPT Research in Accounting and Finance By Mengming Michael Dong; Theophanis C. Stratopoulos; Victor Xiaoqi Wang
  12. The Promise and Peril of Generative AI: Evidence from GPT-4 as Sell-Side Analysts By Edward Li; Zhiyuan Tu; Dexin Zhou
  13. FinGPT: Enhancing Sentiment-Based Stock Movement Prediction with Dissemination-Aware and Context-Enriched LLMs By Yixuan Liang; Yuncong Liu; Boyu Zhang; Christina Dan Wang; Hongyang Yang
  14. Auto-Generating Earnings Report Analysis via a Financial-Augmented LLM By Van-Duc Le
  15. Innovative Sentiment Analysis and Prediction of Stock Price Using FinBERT, GPT-4 and Logistic Regression: A Data-Driven Approach By Olamilekan Shobayo; Sidikat Adeyemi-Longe; Olusogo Popoola; Bayode Ogunleye
  16. INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent By Haohang Li; Yupeng Cao; Yangyang Yu; Shashidhar Reddy Javaji; Zhiyang Deng; Yueru He; Yuechen Jiang; Zining Zhu; Koduvayur Subbalakshmi; Guojun Xiong; Jimin Huang; Lingfei Qian; Xueqing Peng; Qianqian Xie; Jordan W. Suchow
  17. LLMs for Time Series: an Application for Single Stocks and Statistical Arbitrage By Sebastien Valeyre; Sofiane Aboura
  18. Interpretable Company Similarity with Sparse Autoencoders By Marco Molinari; Victor Shao; Vladimir Tregubiak; Abhimanyu Pandey; Mateusz Mikolajczak; Sebastian Kuznetsov Ryder Torres Pereira
  19. Unveiling the Role of Artificial Intelligence and Stock Market Growth in Achieving Carbon Neutrality in the United States: An ARDL Model Analysis By Azizul Hakim Rafi; Abdullah Al Abrar Chowdhury; Adita Sultana; Abdulla All Noman
  20. SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation By Qilong Wu; Xiaoneng Xiang; Hejia Huang; Xuan Wang; Yeo Wei Jie; Ranjan Satapathy; Ricardo Shirota Filho; Bharadwaj Veeravalli
  21. Generative AI for Economic Research: LLMs Learn to Collaborate and Reason By Anton Korinek
  22. Delving into Youth Perspectives on In-game Gambling-like Elements: A Proof-of-Concept Study Utilising Large Language Models for Analysing User-Generated Text Data By Thomas Krause; Steffen Otterbach; Johannes Singer
  23. Leveraging Large Language Models to Democratize Access to Costly Financial Datasets for Academic Research By Julian Junyan Wang; Victor Xiaoqi Wang
  24. The Value of AI-Generated Metadata for UGC Platforms: Evidence from a Large-scale Field Experiment By Xinyi Zhang; Chenshuo Sun; Renyu Zhang; Khim-Yong Goh

  1. By: Dongwoo Lee; Gavin Kader
    Abstract: As Large Language Models (LLMs) are increasingly used for a variety of complex and critical tasks, it is vital to assess their logical capabilities in strategic environments. This paper examines their ability in strategic reasoning -- the process of choosing an optimal course of action by predicting and adapting to other agents' behavior. Using six LLMs, we analyze responses from play in classical games from behavioral economics (p-Beauty Contest, 11-20 Money Request Game, and Guessing Game) and evaluate their performance through hierarchical models of reasoning (level-k theory and cognitive hierarchy theory). Our findings reveal that while LLMs show understanding of the games, the majority struggle with higher-order strategic reasoning. Although most LLMs did demonstrate learning ability with games involving repeated interactions, they still consistently fall short of the reasoning levels demonstrated by typical behavior from human subjects. The exception to these overall findings is OpenAI's GPT-o1 -- specifically trained to solve complex reasoning tasks -- which consistently outperforms other LLMs and human subjects. These findings highlight the challenges and pathways in advancing LLMs toward robust strategic reasoning from the perspective of behavioral economics.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.13013
  2. By: Jens Ludwig; Sendhil Mullainathan; Ashesh Rambachan
    Abstract: How can we use the novel capacities of large language models (LLMs) in empirical research? And how can we do so while accounting for their limitations, which are themselves only poorly understood? We develop an econometric framework to answer this question that distinguishes between two types of empirical tasks. Using LLMs for prediction problems (including hypothesis generation) is valid under one condition: no "leakage" between the LLM's training dataset and the researcher's sample. No leakage can be ensured by using open-source LLMs with documented training data and published weights. Using LLM outputs for estimation problems to automate the measurement of some economic concept (expressed either by some text or from human subjects) requires the researcher to collect at least some validation data: without such data, the errors of the LLM's automation cannot be assessed and accounted for. As long as these steps are taken, LLM outputs can be used in empirical research with the familiar econometric guarantees we desire. Using two illustrative applications to finance and political economy, we find that these requirements are stringent; when they are violated, the limitations of LLMs result in unreliable empirical estimates. Our results suggest the excitement around the empirical uses of LLMs is warranted -- they allow researchers to effectively use even small amounts of language data for both prediction and estimation -- but only with these safeguards in place.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.07031
  3. By: Stefania Albanesi (Department of Economics, University of Miami)
    Abstract: We examine the link between labour market developments and new technologies such as artificial intelligence (AI) and software in 16 European countries over the period 2011-2019. Using data for occupations at the 3-digit level in Europe, we find that on average employment shares have increased in occupations more exposed to AI. This is particularly the case for occupations with a relatively higher proportion of younger and skilled workers. This evidence is in line with the Skill Biased Technological Change theory. While there exists heterogeneity across countries, only very few countries show a decline in employment shares of occupations more exposed to AI-enabled automation. Country heterogeneity for this result seems to be linked to the pace of technology diffusion and education, but also to the level of product market regulation (competition) and employment protection laws. In contrast to the findings for employment, we find little evidence for a relationship between wages and potential exposures to new technologies.
    Keywords: artificial intelligence, employment, skills, occupations
    JEL: J23 O33
    Date: 2023–06–15
    URL: https://d.repec.org/n?u=RePEc:mia:wpaper:wp2023-01.rdf
  4. By: Mahdi Ahmadi; Neda Khosh Kheslat; Adebola Akintomide
    Abstract: The rapid advancement of Generative AI (Gen AI) technologies, particularly tools like ChatGPT, is significantly impacting the labor market by reshaping job roles and skill requirements. This study examines the demand for ChatGPT-related skills in the U.S. labor market by analyzing job advertisements collected from major job platforms between May and December 2023. Using text mining and topic modeling techniques, we extracted and analyzed the Gen AI-related skills that employers are hiring for. Our analysis identified five distinct ChatGPT-related skill sets: general familiarity, creative content generation, marketing, advanced functionalities (such as prompt engineering), and product development. In addition, the study provides insights into job attributes such as occupation titles, degree requirements, salary ranges, and other relevant job characteristics. These findings highlight the increasing integration of Gen AI across various industries, emphasizing the growing need for both foundational knowledge and advanced technical skills. The study offers valuable insights into the evolving demands of the labor market, as employers seek candidates equipped to leverage generative AI tools to improve productivity, streamline processes, and drive innovation.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.07042
  5. By: Ariell Reshef (UP1 - Université Paris 1 Panthéon-Sorbonne, PSE - Paris School of Economics, CESifo - CESifo - Munich); Farid Toubal (CEPII - Centre d'Etudes Prospectives et d'Informations Internationales - Centre d'analyse stratégique, CEPR - Center for Economic Policy Research, LEDa - Laboratoire d'Economie de Dauphine - IRD - Institut de Recherche pour le Développement - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique)
    Abstract: While job polarization was a salient feature in European economies in the decade up to 2010, this phenomenon has all but disappeared, except in a handful of Southern-European economies. The decade following 2010 is characterized by occupational upgrading, where low-paid jobs shrink and high-paid jobs expand. We show that this is associated with automation: employment shares in low-paid, highly automatable jobs shrink, while employment shares of better-paid jobs that are unlikely to be automated expand. Techies (engineers and technicians with strong STEM skills) help explain cross-country variation in occupational upgrading: economies that are abundant in techies or exhibit high growth of techies see strong skill upgrading; in contrast, polarization is observed in economies with few techies. Robotization is associated with skill upgrading in manufacturing. We discuss the additional roles of globalization, structural change and labor market institutions in driving these phenomena. Hitherto, artificial intelligence (AI) seems to have similar impacts as other automation technologies. However, there is uncertainty about what new AI technologies harbor.
    Keywords: automation, robots, techies, tasks, STEM, occupations, employment, polarization
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:hal:journl:hal-04837769
  6. By: Meiling Huang; Ming Jin; Ning Li
    Abstract: Generative AI is rapidly reshaping creative work, raising critical questions about its beneficiaries and societal implications. This study challenges prevailing assumptions by exploring how generative AI interacts with diverse forms of human capital in creative tasks. Through two randomized controlled experiments in flash fiction writing and song composition, we uncover a paradox: while AI democratizes access to creative tools, it simultaneously amplifies cognitive inequalities. Our findings reveal that AI enhances general human capital (cognitive abilities and education) by facilitating adaptability and idea integration but diminishes the value of domain-specific expertise. We introduce a novel theoretical framework that merges human capital theory with the automation-augmentation perspective, offering a nuanced understanding of human-AI collaboration. This framework elucidates how AI shifts the locus of creative advantage from specialized expertise to broader cognitive adaptability. Contrary to the notion of AI as a universal equalizer, our work highlights its potential to exacerbate disparities in skill valuation, reshaping workplace hierarchies and redefining the nature of creativity in the AI era. These insights advance theories of human capital and automation while providing actionable guidance for organizations navigating AI integration amidst workforce inequalities.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.03963
  7. By: Aidan Toner-Rodgers
    Abstract: This paper studies the impact of artificial intelligence on innovation, exploiting the randomized introduction of a new materials discovery technology to 1,018 scientists in the R&D lab of a large U.S. firm. AI-assisted researchers discover 44% more materials, resulting in a 39% increase in patent filings and a 17% rise in downstream product innovation. These compounds possess more novel chemical structures and lead to more radical inventions. However, the technology has strikingly disparate effects across the productivity distribution: while the bottom third of scientists see little benefit, the output of top researchers nearly doubles. Investigating the mechanisms behind these results, I show that AI automates 57% of "idea-generation" tasks, reallocating researchers to the new task of evaluating model-produced candidate materials. Top scientists leverage their domain knowledge to prioritize promising AI suggestions, while others waste significant resources testing false positives. Together, these findings demonstrate the potential of AI-augmented research and highlight the complementarity between algorithms and expertise in the innovative process. Survey evidence reveals that these gains come at a cost, however, as 82% of scientists report reduced satisfaction with their work due to decreased creativity and skill underutilization.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.17866
  8. By: Tina Highfill; David Wasshausen; Gregory Prunchak
    Abstract: Much of the current literature on the economic impact of Artificial Intelligence (AI) focuses on the uses of AI, but little is known about the production of AI and its contribution to economic growth. In this paper, we discuss basic concepts and challenges related to measuring the production of AI within a standard national accounting framework. We first present a variety of examples that illustrate how both the production and use of AI software are currently reflected in macroeconomic statistics like Gross Domestic Product and the Supply and Use Tables. We then discuss a broader approach to measurement using a thematic satellite account framework that highlights production of AI across foundational areas, including manufacturing, software publishing, computer and data services, and research & development. The challenges of identifying and quantifying AI production in the national accounts using existing data sources are discussed and some possible solutions for the future are offered.
    JEL: E01 O30
    Date: 2025–01
    URL: https://d.repec.org/n?u=RePEc:bea:papers:0134
  9. By: Enrico Maria Fenoaltea; Dario Mazzilli; Aurelio Patelli; Angelica Sbardella; Andrea Tacchella; Andrea Zaccaria; Marco Trombetti; Luciano Pietronero
    Abstract: The integration of artificial intelligence (AI) into the workplace is advancing rapidly, necessitating robust metrics to evaluate its tangible impact on the labour market. Existing measures of AI occupational exposure largely focus on AI's theoretical potential to substitute or complement human labour on the basis of technical feasibility, providing limited insight into actual adoption and offering inadequate guidance for policymakers. To address this gap, we introduce the AI Startup Exposure (AISE) index -- a novel metric based on occupational descriptions from O*NET and AI applications developed by startups funded by the Y Combinator accelerator. Our findings indicate that while high-skilled professions are theoretically highly exposed according to conventional metrics, they are heterogeneously targeted by startups. Roles involving routine organizational tasks -- such as data analysis and office management -- display significant exposure, while occupations involving tasks that are less amenable to AI automation due to ethical or high-stakes considerations, more than feasibility -- such as judges or surgeons -- present lower AISE scores. By focusing on venture-backed AI applications, our approach offers a nuanced perspective on how AI is reshaping the labour market. It challenges the conventional assumption that high-skilled jobs uniformly face high AI risks, highlighting instead the role of today's AI players' societal desirability-driven and market-oriented choices as critical determinants of AI exposure. Contrary to fears of widespread job displacement, our findings suggest that AI adoption will be gradual and shaped by social factors as much as by the technical feasibility of AI applications. This framework provides a dynamic, forward-looking tool for policymakers and stakeholders to monitor AI's evolving impact and navigate the changing labour landscape.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.04924
  10. By: Rehse, Dominik; Valet, Sebastian; Walter, Johannes
    Abstract: We propose an EU Safe Generative AI Innovation Program to address a market failure in generative AI development. While developers can capture significant value from generative AI capability improvements, they bear only a fraction of potential safety failure costs, which leads to underinvestment in the technological breakthroughs necessary to make generative AI safe. The EU should establish explicit incentives for the necessary technological breakthroughs, complementing its existing policy responses to the rapid proliferation of generative AI. We propose a milestone-based incentive scheme where pre-specified payments would reward the achievement of verifiable safety milestones. This "pull" funding mechanism would aim to create predictable development paths for safety improvements, similar to how scaling laws have guided capability advances. The scheme would use robust safety metrics and competitive evaluation to prevent gaming while ensuring meaningful progress. Success would be measured through a combination of specific safety dimensions (like factual accuracy and harm prevention) and broader performance metrics, validated through adversarial testing and public comparative evaluation. The program's design would be technology-neutral and it could be open to all qualified institutions, with rewards calibrated through incentive-compatible elicitation mechanisms. This approach mirrors other applications of outcome-based funding, such as advance market commitments in vaccine development. It might also provide the breeding ground for "Safe AI made in the EU".
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:zbw:zewpbs:308835
  11. By: Mengming Michael Dong; Theophanis C. Stratopoulos; Victor Xiaoqi Wang
    Abstract: This paper provides a review of recent publications and working papers on ChatGPT and related Large Language Models (LLMs) in accounting and finance. The aim is to understand the current state of research in these two areas and identify potential research opportunities for future inquiry. We identify three common themes from these earlier studies. The first theme focuses on applications of ChatGPT and LLMs in various fields of accounting and finance. The second theme utilizes ChatGPT and LLMs as a new research tool by leveraging their capabilities such as classification, summarization, and text generation. The third theme investigates implications of LLM adoption for accounting and finance professionals, as well as for various organizations and sectors. While these earlier studies provide valuable insights, they leave many important questions unanswered or partially addressed. We propose venues for further exploration and provide technical guidance for researchers seeking to employ ChatGPT and related LLMs as a tool for their research.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.05731
  12. By: Edward Li; Zhiyuan Tu; Dexin Zhou
    Abstract: We investigate how advanced large language models (LLMs), specifically GPT-4, process corporate disclosures to forecast earnings. Using earnings press releases issued around GPT-4's knowledge cutoff date, we address two questions: (1) Do GPT-generated earnings forecasts outperform analysts in accuracy? (2) How is GPT's performance related to its processing of textual and quantitative information? Our findings suggest that GPT forecasts are significantly less accurate than those of analysts. This underperformance can be traced to GPT's distinct textual and quantitative approaches: its textual processing follows a consistent, generalized pattern across firms, highlighting its strengths in language tasks. In contrast, its quantitative processing capabilities vary significantly across firms, revealing limitations tied to the uneven availability of domain-specific training data. Additionally, there is some evidence that GPT's forecast accuracy diminishes beyond its knowledge cutoff, underscoring the need to evaluate LLMs under hindsight-free conditions. Overall, this study provides a novel exploration of the "black box" of GPT-4's information processing, offering insights into LLMs' potential and challenges in financial applications.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.01069
  13. By: Yixuan Liang; Yuncong Liu; Boyu Zhang; Christina Dan Wang; Hongyang Yang
    Abstract: Financial sentiment analysis is crucial for understanding the influence of news on stock prices. Recently, large language models (LLMs) have been widely adopted for this purpose due to their advanced text analysis capabilities. However, these models often only consider the news content itself, ignoring its dissemination, which hampers accurate prediction of short-term stock movements. Additionally, current methods often lack sufficient contextual data and explicit instructions in their prompts, limiting LLMs' ability to interpret news. In this paper, we propose a data-driven approach that enhances LLM-powered sentiment-based stock movement predictions by incorporating news dissemination breadth, contextual data, and explicit instructions. We cluster recent company-related news to assess its reach and influence, enriching prompts with more specific data and precise instructions. This data is used to construct an instruction tuning dataset to fine-tune an LLM for predicting short-term stock price movements. Our experimental results show that our approach improves prediction accuracy by 8% compared to existing methods.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.10823
  14. By: Van-Duc Le
    Abstract: Financial analysis heavily relies on the evaluation of earnings reports to gain insights into company performance. Traditional generation of these reports requires extensive financial expertise and is time-consuming. With the impressive progress in Large Language Models (LLMs), a wide variety of financially focused LLMs has emerged, addressing tasks like sentiment analysis and entity recognition in the financial domain. This paper presents a novel challenge: developing an LLM specifically for automating the generation of earnings reports analysis. Our methodology involves an in-depth analysis of existing earnings reports followed by a unique approach to fine-tune an LLM for this purpose. This approach combines retrieval augmentation and the generation of instruction-based data, specifically tailored for the financial sector, to enhance the LLM's performance. With extensive financial documents, we construct financial instruction data, enabling the refined adaptation of our LLM to financial contexts. Preliminary results indicate that our augmented LLM outperforms general open-source models and rivals commercial counterparts like GPT-3.5 in financial applications. Our research paves the way for streamlined and insightful automation in financial report generation, marking a significant stride in the field of financial analysis.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.08179
  15. By: Olamilekan Shobayo; Sidikat Adeyemi-Longe; Olusogo Popoola; Bayode Ogunleye
    Abstract: This study explores the comparative performance of cutting-edge AI models, i.e., Finance Bidirectional Encoder Representations from Transformers (FinBERT), Generative Pre-trained Transformer (GPT-4), and Logistic Regression, for sentiment analysis and stock index prediction using financial news and the NGX All-Share Index data label. By leveraging advanced natural language processing models like GPT-4 and FinBERT, alongside a traditional machine learning model, Logistic Regression, we aim to classify market sentiment, generate sentiment scores, and predict market price movements. This research highlights global AI advancements in stock markets, showcasing how state-of-the-art language models can contribute to understanding complex financial data. The models were assessed using metrics such as accuracy, precision, recall, F1 score, and ROC AUC. Results indicate that Logistic Regression outperformed the more computationally intensive FinBERT and the predefined approach of the versatile GPT-4, with an accuracy of 81.83% and a ROC AUC of 89.76%. The GPT-4 predefined approach exhibited a lower accuracy of 54.19% but demonstrated strong potential in handling complex data. FinBERT, while offering more sophisticated analysis, was resource-demanding and yielded a moderate performance. Hyperparameter optimization using Optuna and cross-validation techniques ensured the robustness of the models. This study highlights the strengths and limitations of the practical applications of AI approaches in stock market prediction and presents Logistic Regression as the most efficient model for this task, with FinBERT and GPT-4 representing emerging tools with potential for future exploration and innovation in AI-driven financial analytics.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.06837
  16. By: Haohang Li; Yupeng Cao; Yangyang Yu; Shashidhar Reddy Javaji; Zhiyang Deng; Yueru He; Yuechen Jiang; Zining Zhu; Koduvayur Subbalakshmi; Guojun Xiong; Jimin Huang; Lingfei Qian; Xueqing Peng; Qianqian Xie; Jordan W. Suchow
    Abstract: Recent advancements have underscored the potential of large language model (LLM)-based agents in financial decision-making. Despite this progress, the field currently encounters two main challenges: (1) the lack of a comprehensive LLM agent framework adaptable to a variety of financial tasks, and (2) the absence of standardized benchmarks and consistent datasets for assessing agent performance. To tackle these issues, we introduce InvestorBench, the first benchmark specifically designed for evaluating LLM-based agents in diverse financial decision-making contexts. InvestorBench enhances the versatility of LLM-enabled agents by providing a comprehensive suite of tasks applicable to different financial products, including single equities like stocks, cryptocurrencies and exchange-traded funds (ETFs). Additionally, we assess the reasoning and decision-making capabilities of our agent framework using thirteen different LLMs as backbone models, across various market environments and tasks. Furthermore, we have curated a diverse collection of open-source, multi-modal datasets and developed a comprehensive suite of environments for financial decision-making. This establishes a highly accessible platform for evaluating financial agents' performance across various scenarios.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.18174
  17. By: Sebastien Valeyre; Sofiane Aboura
    Abstract: Recently, LLMs (Large Language Models) have been adapted for time series prediction with significant success in pattern recognition. However, the common belief is that these models are not suitable for predicting financial market returns, which are known to be almost random. We aim to challenge this misconception through a counterexample. Specifically, we utilized the Chronos model from Ansari et al. (2024) and tested both pretrained configurations and fine-tuned supervised forecasts on the largest American single stocks using data from Guijarro-Ordonnez et al. (2022). We constructed a long/short portfolio, and the performance simulation indicates that LLMs can in reality handle time series that are nearly indistinguishable from noise, demonstrating an ability to identify inefficiencies amidst randomness and generate alpha. Finally, we compared these results with those of specialized models and smaller deep learning models, highlighting significant room for improvement in LLM performance to further enhance their predictive capabilities.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.09394
  18. By: Marco Molinari; Victor Shao; Vladimir Tregubiak; Abhimanyu Pandey; Mateusz Mikolajczak; Sebastian Kuznetsov Ryder Torres Pereira
    Abstract: Determining company similarity is a vital task in finance, underpinning hedging, risk management, portfolio diversification, and more. Practitioners often rely on sector and industry classifications to gauge similarity, such as SIC-codes and GICS-codes -- the former being used by the U.S. Securities and Exchange Commission (SEC), and the latter widely used by the investment community. Since these classifications can lack granularity and often need to be updated, using clusters of embeddings of company descriptions has been proposed as a potential alternative, but the lack of interpretability in token embeddings poses a significant barrier to adoption in high-stakes contexts. Sparse Autoencoders (SAEs) have shown promise in enhancing the interpretability of Large Language Models (LLMs) by decomposing LLM activations into interpretable features. We apply SAEs to company descriptions, obtaining meaningful clusters of equities in the process. We benchmark SAE features against SIC-codes, Major Group codes, and Embeddings. Our results demonstrate that SAE features not only replicate but often surpass sector classifications and embeddings in capturing fundamental company characteristics. This is evidenced by their superior performance in correlating monthly returns - a proxy for similarity - and generating higher Sharpe ratio co-integration strategies, which underscores deeper fundamental similarities among companies.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.02605
  19. By: Azizul Hakim Rafi; Abdullah Al Abrar Chowdhury; Adita Sultana; Abdulla All Noman
    Abstract: Given that climate change has become one of the most pressing problems in many countries in recent years, specialized research on how to mitigate climate change has been adopted by many countries. Within this discussion, the influence of advanced technologies in achieving carbon neutrality has been discussed. While several studies investigated how AI and digital innovations could be used to reduce the environmental footprint, the actual influence of AI in reducing CO2 emissions (a proxy for the carbon footprint) has yet to be investigated. This paper studies the role of advanced technologies in general, and Artificial Intelligence (AI) and ICT use in particular, in advancing carbon neutrality in the United States, between 2021. Secondly, this paper examines how Stock Market Growth, ICT use, Gross Domestic Product (GDP), and Population affect CO2 emissions using the STIRPAT model. After examining stationarity among the variables using a variety of unit root tests, this study concluded that there are no unit root problems across all the variables, with a mixed order of integration. The ARDL bounds test for cointegration revealed that variables in this study have a long-run relationship. Moreover, the estimates revealed from the ARDL model indicated that economic growth, stock market capitalization, and population significantly contributed to carbon emissions in both the short run and the long run. Conversely, AI and ICT use significantly reduced carbon emissions over both periods. Furthermore, findings were confirmed to be robust using FMOLS, DOLS, and CCR estimations. Finally, diagnostic tests indicated the absence of serial correlation, heteroscedasticity, and specification errors and, thus, the model was robust.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.16166
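The ARDL model at the heart of this paper regresses emissions on their own lag and on current and lagged regressors, from which short- and long-run effects are read off. A minimal illustration of fitting an ARDL(1,1) by ordinary least squares on synthetic data follows; the series, coefficients, and single regressor are illustrative assumptions, not the authors' specification or data.

```python
import random

random.seed(1)

# Synthetic DGP: y_t = 2 + 0.5*y_{t-1} + 0.8*x_t - 0.3*x_{t-1} + noise
T = 300
x = [random.gauss(0, 1) for _ in range(T)]
y = [0.0] * T
for t in range(1, T):
    y[t] = 2 + 0.5 * y[t - 1] + 0.8 * x[t] - 0.3 * x[t - 1] + random.gauss(0, 0.1)

# ARDL(1,1) design matrix: constant, y_{t-1}, x_t, x_{t-1}
rows = [[1.0, y[t - 1], x[t], x[t - 1]] for t in range(1, T)]
target = y[1:]

def ols(A, b):
    """OLS via normal equations, solved by Gaussian elimination."""
    k = len(A[0])
    M = [[sum(r[i] * r[j] for r in A) for j in range(k)] for i in range(k)]
    v = [sum(r[i] * t for r, t in zip(A, b)) for i in range(k)]
    for col in range(k):                      # forward elimination w/ pivoting
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(col + 1, k):
            f = M[r][col] / M[col][col]
            for c in range(col, k):
                M[r][c] -= f * M[col][c]
            v[r] -= f * v[col]
    beta = [0.0] * k                          # back substitution
    for i in reversed(range(k)):
        beta[i] = (v[i] - sum(M[i][j] * beta[j]
                              for j in range(i + 1, k))) / M[i][i]
    return beta

const, rho, b0, b1 = ols(rows, target)
# Long-run multiplier of x on y: (b0 + b1) / (1 - rho)
long_run = (b0 + b1) / (1 - rho)
```

Here `b0` is the short-run impact of x and `long_run` its cumulative long-run effect, mirroring the short-/long-run distinction the abstract draws for GDP, stock market capitalization, and AI/ICT use.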
  20. By: Qilong Wu; Xiaoneng Xiang; Hejia Huang; Xuan Wang; Yeo Wei Jie; Ranjan Satapathy; Ricardo Shirota Filho; Bharadwaj Veeravalli
    Abstract: The rapid growth of the financial sector and the rising focus on Environmental, Social, and Governance (ESG) considerations highlight the need for advanced NLP tools. However, open-source LLMs proficient in both finance and ESG domains remain scarce. To address this gap, we introduce SusGen-30K, a category-balanced dataset comprising seven financial NLP tasks and ESG report generation, and propose TCFD-Bench, a benchmark for evaluating sustainability report generation. Leveraging this dataset, we developed SusGen-GPT, a suite of models achieving state-of-the-art performance across six adapted and two off-the-shelf tasks, trailing GPT-4 by only 2% despite using 7-8B parameters compared to GPT-4's 1,700B. Based on this, we propose the SusGen system, integrated with Retrieval-Augmented Generation (RAG), to assist in sustainability report generation. This work demonstrates the efficiency of our approach, advancing research in finance and ESG.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.10906
  21. By: Anton Korinek
    Abstract: Large language models (LLMs) have seen remarkable progress over the past year in speed, cost efficiency, accuracy, and the capacity to process larger amounts of text. This article is a practical guide for economists on how to use these advancements in their research. The main innovations covered are (i) new reasoning capabilities, (ii) novel workspaces for interactive LLM collaboration such as Claude's Artifacts, ChatGPT's Canvas, or Microsoft's Copilot, and (iii) recent improvements in LLM-powered internet search. Incorporating these capabilities into their work allows economists to achieve significant productivity gains. Additionally, I highlight new use cases in promoting research, such as automatically generated blog posts, presentation slides, and interviews, as well as podcasts via Google's NotebookLM.
    JEL: A10 B4 C88 O33
    Date: 2024–11
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:33198
  22. By: Thomas Krause; Steffen Otterbach; Johannes Singer
    Abstract: This report documents the development, testing, and application of Large Language Models (LLMs) for automated text analysis, with a specific focus on gambling-like elements in digital games, such as lootboxes. The project aimed not only to analyze user opinions and attitudes towards these mechanics, but also to advance methodological research in text analysis. By employing prompting techniques and iterative prompt refinement, the study sought to test and improve the accuracy of LLM-based text analysis. The findings indicate that while LLMs can identify relevant patterns and themes on par with human coders, challenges remain in handling more complex tasks, underscoring the need for ongoing methodological refinement. The advancements achieved through this study enhance the application of LLMs to real-world text analysis and provide valuable insights into how these models can be better utilized to analyze complex, user-generated content.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.09345
  23. By: Julian Junyan Wang; Victor Xiaoqi Wang
    Abstract: Unequal access to costly datasets essential for empirical research has long hindered researchers from disadvantaged institutions, limiting their ability to contribute to their fields and advance their careers. Recent breakthroughs in Large Language Models (LLMs) have the potential to democratize data access by automating data collection from unstructured sources. We develop and evaluate a novel methodology using GPT-4o-mini within a Retrieval-Augmented Generation (RAG) framework to collect data from corporate disclosures. Our approach achieves human-level accuracy in collecting CEO pay ratios from approximately 10,000 proxy statements and Critical Audit Matters (CAMs) from more than 12,000 10-K filings, with LLM processing times of 9 and 40 minutes respectively, each at a cost under $10. This stands in stark contrast to the hundreds of hours needed for manual collection or the thousands of dollars required for commercial database subscriptions. To foster a more inclusive research community by empowering researchers with limited resources to explore new avenues of inquiry, we share our methodology and the resulting datasets.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.02065
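The retrieval step of such a RAG pipeline - locating the passage of a long filing most relevant to a query before handing it to the LLM - can be sketched with simple term-overlap scoring. The chunk size, scoring rule, toy filing text, and prompt template below are illustrative assumptions, and the actual LLM call (GPT-4o-mini in the paper) is omitted.

```python
import re
from collections import Counter

def chunk(text, size=40):
    """Split a document into overlapping windows of `size` words."""
    words = text.split()
    step = size // 2
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - step, 1), step)]

def score(query, passage):
    """Crude relevance score: term overlap between query and passage."""
    q = Counter(re.findall(r"[a-z0-9]+", query.lower()))
    p = Counter(re.findall(r"[a-z0-9]+", passage.lower()))
    return sum(min(q[w], p[w]) for w in q)

def retrieve(query, document, k=1):
    """Return the k highest-scoring passages for the query."""
    passages = chunk(document)
    return sorted(passages, key=lambda p: score(query, p), reverse=True)[:k]

# Toy stand-in for a proxy statement.
filing = (
    "Item 1. Business overview and strategy discussion. " * 5
    + "The ratio of the annual total compensation of our CEO to the median "
    + "annual total compensation of all employees was 250 to 1. "
    + "Item 2. Properties and other disclosures. " * 5
)
query = "CEO pay ratio median employee compensation"
top = retrieve(query, filing, k=1)[0]
# In the paper's pipeline, the retrieved passage would be sent to the LLM:
prompt = f"Extract the CEO pay ratio from this passage:\n{top}"
```

A production system would replace the term-overlap scorer with embedding similarity, but the structure - chunk, retrieve, then prompt the model with only the relevant passage - is what keeps per-filing cost and processing time low.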
  24. By: Xinyi Zhang; Chenshuo Sun; Renyu Zhang; Khim-Yong Goh
    Abstract: AI-generated content (AIGC), such as advertisement copy, product descriptions, and social media posts, is becoming ubiquitous in business practices. However, the value of AI-generated metadata, such as titles, remains unclear on user-generated content (UGC) platforms. To address this gap, we conducted a large-scale field experiment on a leading short-video platform in Asia to provide about 1 million users access to AI-generated titles for their uploaded videos. Our findings show that the provision of AI-generated titles significantly boosted content consumption, increasing valid watches by 1.6% and watch duration by 0.9%. When producers adopted these titles, these increases jumped to 7.1% and 4.1%, respectively. This viewership-boost effect was largely attributed to the use of this generative AI (GAI) tool increasing the likelihood of videos having a title by 41.4%. The effect was more pronounced for groups more affected by metadata sparsity. Mechanism analysis revealed that AI-generated metadata improved user-video matching accuracy in the platform's recommender system. Interestingly, for a video for which the producer would have posted a title anyway, adopting the AI-generated title decreased its viewership on average, implying that AI-generated titles may be of lower quality than human-generated ones. However, when producers chose to co-create with GAI and significantly revised the AI-generated titles, the videos outperformed their counterparts with either fully AI-generated or human-generated titles, showcasing the benefits of human-AI co-creation. This study highlights the value of AI-generated metadata and human-AI metadata co-creation in enhancing user-content matching and content consumption for UGC platforms.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2412.18337

This nep-ain issue is ©2025 by Ben Greiner. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.