By: | Kasy, Maximilian (University of Oxford) |
Abstract: | This chapter discusses the regulation of artificial intelligence (AI) from the vantage point of political economy, based on the following premises: (i) AI systems maximize a single, measurable objective. (ii) In society, different individuals have different objectives. AI systems generate winners and losers. (iii) Society-level assessments of AI require trading off individual gains and losses. (iv) AI requires democratic control of algorithms, data, and computational infrastructure, to align algorithm objectives and social welfare. The chapter addresses several debates regarding the ethics and social impact of AI, including (i) fairness, discrimination, and inequality, (ii) privacy, data property rights, and data governance, (iii) value alignment and the impending robot apocalypse, (iv) explainability and accountability for automated decision-making, and (v) automation and the impact of AI on the labor market and on wage inequality. |
Keywords: | AI, machine learning, regulation, fairness, privacy, value alignment, explainability, automation |
JEL: | P00 O3 |
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp16948&r= |
By: | Daron Acemoglu |
Abstract: | This paper evaluates claims about large macroeconomic implications of new advances in AI. It starts from a task-based model of AI’s effects, working through automation and task complementarities. So long as AI’s microeconomic effects are driven by cost savings/productivity improvements at the task level, its macroeconomic consequences will be given by a version of Hulten’s theorem: GDP and aggregate productivity gains can be estimated by what fraction of tasks are impacted and average task-level cost savings. Using existing estimates on exposure to AI and productivity improvements at the task level, these macroeconomic effects appear nontrivial but modest—no more than a 0.66% increase in total factor productivity (TFP) over 10 years. The paper then argues that even these estimates could be exaggerated, because early evidence is from easy-to-learn tasks, whereas some of the future effects will come from hard-to-learn tasks, where there are many context-dependent factors affecting decision-making and no objective outcome measures from which to learn successful performance. Consequently, predicted TFP gains over the next 10 years are even more modest and are predicted to be less than 0.53%. I also explore AI’s wage and inequality effects. I show theoretically that even when AI improves the productivity of low-skill workers in certain tasks (without creating new tasks for them), this may increase rather than reduce inequality. Empirically, I find that AI advances are unlikely to increase inequality as much as previous automation technologies because their impact is more equally distributed across demographic groups, but there is also no evidence that AI will reduce labor income inequality. Instead, AI is predicted to widen the gap between capital and labor income. Finally, some of the new tasks created by AI may have negative social value (such as design of algorithms for online manipulation), and I discuss how to incorporate the macroeconomic effects of new tasks that may have negative social value. |
JEL: | E24 J24 O30 O33 |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:32487&r= |
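The aggregation logic described in the abstract above follows Hulten's theorem: to a first order, economy-wide TFP gains are the sum over AI-affected tasks of each task's GDP share times its cost saving. A minimal LaTeX rendering of that back-of-the-envelope calculation is shown below; the numbers are purely illustrative placeholders chosen only to reproduce the 0.66% order of magnitude quoted in the abstract, not estimates taken from the paper.

```latex
% Hulten-style first-order aggregation, as described in the abstract.
% All numbers below are illustrative placeholders, not the paper's estimates.
\[
  \Delta \ln \mathrm{TFP}
  \;\approx\; \sum_{i \in \mathcal{A}} s_i \,\Delta \ln c_i
  \;\approx\; \underbrace{0.20}_{\text{share of tasks exposed}}
  \times \underbrace{0.23}_{\text{share profitably automated}}
  \times \underbrace{0.144}_{\text{avg.\ task cost saving}}
  \;\approx\; 0.0066 = 0.66\%,
\]
where $\mathcal{A}$ is the set of AI-affected tasks, $s_i$ is the GDP share of
task $i$, and $\Delta \ln c_i$ is its proportional cost saving.
```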
By: | Gorny, Paul M.; Groos, Eva; Strobel, Christina |
Abstract: | Regulators of artificial intelligence (AI) emphasize the importance of human autonomy and oversight in AI-assisted decision-making (European Commission, Directorate-General for Communications Networks, Content and Technology, 2021; 117th Congress, 2022). Predictions are the foundation of all AI tools; thus, if AI can predict our decisions, how might these predictions influence our ultimate choices? We examine how salient, personalized AI predictions affect decision outcomes and investigate the role of reactance, i.e., an adverse reaction to a perceived reduction in individual freedom. We trained an AI tool on previous dictator game decisions to generate personalized predictions of dictators’ choices. In our AI treatment, dictators received this prediction before deciding. In a treatment involving human oversight, the decision of whether participants in our experiment were provided with the AI prediction was made by a previous participant (a ‘human overseer’). In the baseline, participants did not receive the prediction. We find that participants sent less to the recipient when they received a personalized prediction, but the strongest reduction occurred when the AI’s prediction was intentionally not shared by the human overseer. Our findings underscore the importance of considering human reactions to AI predictions in assessing the accuracy and impact of these tools as well as the potential adverse effects of human oversight. |
Keywords: | Artificial intelligence, Predictions, Decision-making, Reactance, Free will |
JEL: | C90 C91 D01 O33 |
Date: | 2024–05–24 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:121065&r= |
By: | Ziyi Wang; Lijia Wei; Lian Xue |
Abstract: | This study evaluates the effectiveness of Artificial Intelligence (AI) in mitigating medical overtreatment, a significant issue characterized by unnecessary interventions that inflate healthcare costs and pose risks to patients. We conducted a lab-in-the-field experiment at a medical school, utilizing a novel medical prescription task, manipulating monetary incentives and the availability of AI assistance among medical students using a three-by-two factorial design. We tested three incentive schemes: Flat (constant pay regardless of treatment quantity), Progressive (pay increases with the number of treatments), and Regressive (penalties for overtreatment) to assess their influence on the adoption and effectiveness of AI assistance. Our findings demonstrate that AI significantly reduced overtreatment rates by up to 62% in the Regressive incentive conditions where (prospective) physician and patient interests were most aligned. Diagnostic accuracy improved by 17% to 37%, depending on the incentive scheme. Adoption of AI advice was high, with approximately half of the participants modifying their decisions based on AI input across all settings. For policy implications, we quantified the monetary (57%) and non-monetary (43%) incentives of overtreatment and highlighted AI's potential to mitigate non-monetary incentives and enhance social welfare. Our results provide valuable insights for healthcare administrators considering AI integration into healthcare systems. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.10539&r= |
By: | Humlum, Anders (University of Chicago Booth School of Business); Vestergaard, Emilie (University of Copenhagen) |
Abstract: | We study the adoption of ChatGPT, the icon of Generative AI, using a large-scale survey experiment linked to comprehensive register data in Denmark. Surveying 100,000 workers from 11 exposed occupations, we document that ChatGPT is pervasive: half of workers have used it, with younger, less experienced, higher-achieving, and especially male workers leading the curve. Why have some workers adopted ChatGPT, and others not? Workers see a substantial productivity potential in ChatGPT but are often hindered by employer restrictions and required training. Informing workers about expert assessments of ChatGPT shifts workers' beliefs and intentions but has limited impacts on actual adoption. |
Keywords: | technology adoption, labor productivity |
JEL: | J24 O33 |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp16992&r= |
By: | Kässi, Otto |
Abstract: | We examine the effects of generative artificial intelligence (GenAI) on the labor market, specifically focusing on the impact of ChatGPT on job demand. Using micro-level data from one of the largest online labor platforms, we classify new job postings into three categories: substitutable, augmenting, and unaffected. We apply a difference-in-differences method to explore how ChatGPT’s deployment has altered labor demand within these categories. Our findings show a slight decrease in openings for substitutable jobs, where GenAI can fully perform tasks without loss of quality. However, there is an increase in demand for augmenting and unaffected jobs, which either benefit from faster task completion due to GenAI assistance or remain unchanged by it. The data indicates that ChatGPT’s introduction has not uniformly decreased labor demand but rather redistributed it, leading to growth in some sectors and declines in others. |
Keywords: | Generative artificial intelligence, Technological change, Labour demand, Labour markets |
JEL: | J23 J24 O33 |
Date: | 2024–06–06 |
URL: | https://d.repec.org/n?u=RePEc:rif:briefs:136&r= |
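The difference-in-differences design described in the abstract above compares postings in substitutable, augmenting, and unaffected job categories before and after ChatGPT's release. One minimal way to write such a specification is sketched below; the notation and the exact outcome and time units are assumptions, not necessarily the paper's model.

```latex
% A stylized difference-in-differences specification; notation is illustrative.
\[
  \ln y_{ct} \;=\; \alpha_c + \gamma_t
  + \beta_{\mathrm{sub}}\,\mathbf{1}[c \in \text{substitutable}] \times \mathrm{Post}_t
  + \beta_{\mathrm{aug}}\,\mathbf{1}[c \in \text{augmenting}] \times \mathrm{Post}_t
  + \varepsilon_{ct},
\]
where $y_{ct}$ is the number of new postings in job category $c$ in period $t$,
$\alpha_c$ and $\gamma_t$ are category and time fixed effects, $\mathrm{Post}_t$
indicates the period after ChatGPT's deployment, and unaffected jobs are the
omitted baseline. The abstract's findings correspond to
$\beta_{\mathrm{sub}} < 0$ and $\beta_{\mathrm{aug}} > 0$.
```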
By: | Bloom, David E. (Harvard School of Public Health); Prettner, Klaus (Vienna University of Economics and Business); Saadaoui, Jamel (Université de Strasbourg); Veruete, Mario (Quantum DataLab) |
Abstract: | How will the emergence of ChatGPT and other forms of artificial intelligence (AI) affect the skill premium? To address this question, we propose a nested constant elasticity of substitution production function that distinguishes among three types of capital: traditional physical capital (machines, assembly lines), industrial robots, and AI. Following the literature, we assume that industrial robots predominantly substitute for low-skill workers, whereas AI mainly helps to perform the tasks of high-skill workers. We show that AI reduces the skill premium as long as it is more substitutable for high-skill workers than low-skill workers are for high-skill workers. |
Keywords: | automation, artificial intelligence, ChatGPT, skill premium, wages, productivity |
JEL: | J30 O14 O15 O33 |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp16972&r= |
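The abstract above describes a nested CES structure in which industrial robots share a nest with low-skill labor and AI shares a nest with high-skill labor. A stylized version consistent with that description is sketched below; the functional form, share parameters, and exact nesting are illustrative assumptions and may differ from the paper's.

```latex
% Stylized nested CES production function; notation is illustrative only.
\[
  Y = \Big[ a\,K^{\rho} + b\,X_L^{\rho} + c\,X_H^{\rho} \Big]^{1/\rho},
  \qquad
  X_L = \big[ L^{\sigma} + R^{\sigma} \big]^{1/\sigma},
  \qquad
  X_H = \big[ H^{\eta} + A^{\eta} \big]^{1/\eta},
\]
with traditional capital $K$, low-skill labor $L$, industrial robots $R$,
high-skill labor $H$, and AI capital $A$. The skill premium is the ratio of
marginal products, $w_H / w_L = (\partial Y/\partial H)/(\partial Y/\partial L)$.
The abstract's condition, that AI is more substitutable for high-skill workers
than low-skill workers are for high-skill workers, corresponds loosely in this
stylized form to the inner parameter $\eta$ exceeding the outer parameter $\rho$;
under that condition an increase in $A$ lowers $w_H / w_L$.
```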
By: | Kentaro Hoffman; Stephen Salerno; Jeff Leek; Tyler McCormick |
Abstract: | Large-scale prediction models (typically using tools from artificial intelligence, AI, or machine learning, ML) are increasingly ubiquitous across a variety of industries and scientific domains. Such methods are often paired with detailed data from sources such as electronic health records, wearable sensors, and omics data (high-throughput technology used to understand biology). Despite their utility, implementing AI and ML tools at the scale necessary to work with this data introduces two major challenges. First, it can cost tens of thousands of dollars to train a modern AI/ML model at scale. Second, once the model is trained, its predictions may become less relevant as patient and provider behavior change, and predictions made for one geographical area may be less accurate for another. These two challenges raise a fundamental question: how often should you refit the AI/ML model to optimally trade off between cost and relevance? Our work provides a framework for making decisions about when to refit AI/ML models when the goal is to maintain valid statistical inference (e.g., estimating a treatment effect in a clinical trial). Drawing on portfolio optimization theory, we treat the decision of recalibrating versus refitting the model as a choice between 'investing' in one of two 'assets.' One asset, recalibrating the model based on another model, is quick and relatively inexpensive but bears uncertainty from sampling and the possibility that the other model is not relevant to current circumstances. The other asset, refitting the model, is costly but removes the irrelevance concern (though not the risk of sampling error). We explore the balancing act between these two potential investments in this paper. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.13926&r= |
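The refit-versus-recalibrate trade-off described in the abstract above can be illustrated as a toy expected-cost comparison between the two "assets". The sketch below is a hypothetical illustration of that idea, not the authors' framework; the function, cost figures, and risk numbers are all made up.

```python
# Toy comparison of two "assets" for keeping a deployed model useful:
# recalibrating (cheap, but risk that the reference model is no longer relevant)
# versus refitting from scratch (expensive, but removes the irrelevance risk).
# All numbers are illustrative placeholders.

def expected_loss(action_cost: float, p_irrelevant: float,
                  irrelevance_loss: float, sampling_loss: float) -> float:
    """Expected total loss = direct cost + expected loss from irrelevance + sampling error."""
    return action_cost + p_irrelevant * irrelevance_loss + sampling_loss

# Recalibration: cheap, but the model we recalibrate against may not apply anymore.
recalibrate = expected_loss(action_cost=1_000, p_irrelevant=0.3,
                            irrelevance_loss=20_000, sampling_loss=2_000)

# Refitting: costly, but only sampling error remains.
refit = expected_loss(action_cost=15_000, p_irrelevant=0.0,
                      irrelevance_loss=0.0, sampling_loss=2_000)

print(f"expected loss if recalibrating: {recalibrate:,.0f}")
print(f"expected loss if refitting:     {refit:,.0f}")
print("decision:", "refit" if refit < recalibrate else "recalibrate")
```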
By: | Tom S\"uhr; Samira Samadi; Chiara Farronato |
Abstract: | Machine learning (ML) models are increasingly used in various applications, from recommendation systems in e-commerce to diagnosis prediction in healthcare. In this paper, we present a novel dynamic framework for thinking about the deployment of ML models in a performative, human-ML collaborative system. In our framework, the introduction of ML recommendations changes the data generating process of human decisions, which are only a proxy to the ground truth and which are then used to train future versions of the model. We show that this dynamic process in principle can converge to different stable points, i.e., where the ML model and the Human+ML system have the same performance. Some of these stable points are suboptimal with respect to the actual ground truth. We conduct an empirical user study with 1,408 participants to showcase this process. In the study, humans solve instances of the knapsack problem with the help of machine learning predictions. This is an ideal setting because we can see how ML models learn to imitate human decisions and how this learning process converges to a stable point. We find that for many levels of ML performance, humans can improve the ML predictions to dynamically reach an equilibrium performance that is around 92% of the maximum knapsack value. We also find that the equilibrium performance could be even higher if humans rationally followed the ML recommendations. Finally, we test whether monetary incentives can increase the quality of human decisions, but we fail to find any positive effect. Our results have practical implications for the deployment of ML models in contexts where human decisions may deviate from the indisputable ground truth. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.13753&r= |
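The dynamic described in the abstract above, where the model is retrained on human decisions that were themselves influenced by the previous model, can settle at stable points below the ground-truth optimum. The following is a minimal, purely illustrative simulation of that feedback loop; the accuracy numbers and the behavioral rule are assumptions, not values from the study.

```python
# Minimal simulation of a human-ML feedback loop: the ML model is repeatedly
# retrained on human+ML decisions, which mix human judgment with ML advice.
# Parameters are illustrative, not the paper's.

human_accuracy = 0.80   # probability a human alone matches the ground truth
follow_rate = 0.60      # probability a human adopts the ML recommendation
ml_accuracy = 0.50      # initial ML accuracy (trained on little data)

for step in range(50):
    # Accuracy of the joint human+ML decision under this simple behavioral rule.
    joint_accuracy = follow_rate * ml_accuracy + (1 - follow_rate) * human_accuracy
    # The next model is trained on those joint decisions (its labels), so its
    # accuracy against the ground truth is capped by the quality of those labels.
    new_ml_accuracy = joint_accuracy
    if abs(new_ml_accuracy - ml_accuracy) < 1e-9:
        break
    ml_accuracy = new_ml_accuracy

print(f"stable point after {step} steps: ML accuracy = {ml_accuracy:.3f}")
# The fixed point here equals human_accuracy: the system settles below the
# ground-truth optimum of 1.0 whenever imperfect human decisions are imitated.
```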
By: | Hongyang Yang; Boyu Zhang; Neng Wang; Cheng Guo; Xiaoli Zhang; Likun Lin; Junlin Wang; Tianyu Zhou; Mao Guan; Runjia Zhang; Christina Dan Wang |
Abstract: | As financial institutions and professionals increasingly incorporate Large Language Models (LLMs) into their workflows, substantial barriers, including proprietary data and specialized knowledge, persist between the finance sector and the AI community. These challenges impede the AI community's ability to enhance financial tasks effectively. Acknowledging financial analysis's critical role, we aim to devise financial-specialized LLM-based toolchains and democratize access to them through open-source initiatives, promoting wider AI adoption in financial decision-making. In this paper, we introduce FinRobot, a novel open-source AI agent platform supporting multiple financially specialized AI agents, each powered by an LLM. Specifically, the platform consists of four major layers: 1) the Financial AI Agents layer, which formulates Financial Chain-of-Thought (CoT) by breaking sophisticated financial problems down into logical sequences; 2) the Financial LLM Algorithms layer, which dynamically configures appropriate model application strategies for specific tasks; 3) the LLMOps and DataOps layer, which produces accurate models by applying training/fine-tuning techniques and using task-relevant data; 4) the Multi-source LLM Foundation Models layer, which integrates various LLMs and enables the above layers to access them directly. Finally, FinRobot provides hands-on support for both professional-grade analysts and laypersons to utilize powerful AI techniques for advanced financial analysis. We open-source FinRobot at https://github.com/AI4Finance-Foundation/FinRobot. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.14767&r= |
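The four-layer decomposition described in the abstract above can be illustrated schematically. The sketch below is a hypothetical, simplified illustration of that layering and is not FinRobot's actual classes or API; class names, method names, and the stub "models" are invented for illustration, and the LLMOps/DataOps layer is omitted for brevity.

```python
# Schematic sketch of a layered agent stack like the one described above.
# Hypothetical illustration only; NOT FinRobot's actual interfaces.
from dataclasses import dataclass
from typing import List

@dataclass
class FoundationModelLayer:
    """Bottom layer: routes requests to one of several underlying LLMs."""
    models: dict  # model name -> callable(prompt) -> str
    def complete(self, model_name: str, prompt: str) -> str:
        return self.models[model_name](prompt)

@dataclass
class LLMAlgorithmsLayer:
    """Middle layer: picks a model/strategy for a given financial task."""
    def select_model(self, task: str) -> str:
        return "small-model" if task == "sentiment" else "large-model"

@dataclass
class FinancialAgentLayer:
    """Top layer: breaks a problem into a chain of subtasks and solves each."""
    algorithms: LLMAlgorithmsLayer
    foundation: FoundationModelLayer
    def answer(self, question: str, subtasks: List[str]) -> List[str]:
        steps = []
        for task in subtasks:
            model = self.algorithms.select_model(task)
            steps.append(self.foundation.complete(model, f"{task}: {question}"))
        return steps

# Hypothetical usage with stub callables standing in for real LLM calls.
stack = FinancialAgentLayer(
    algorithms=LLMAlgorithmsLayer(),
    foundation=FoundationModelLayer(models={
        "small-model": lambda p: f"[small] {p}",
        "large-model": lambda p: f"[large] {p}",
    }),
)
print(stack.answer("How did ACME's margins evolve?", ["sentiment", "valuation"]))
```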
By: | Raeid Saqur; Ken Kato; Nicholas Vinden; Frank Rudzicz |
Abstract: | We introduce and make publicly available the NIFTY Financial News Headlines dataset, designed to facilitate and advance research in financial market forecasting using large language models (LLMs). This dataset comprises two distinct versions tailored for different modeling approaches: (i) NIFTY-LM, which targets supervised fine-tuning (SFT) of LLMs with an auto-regressive, causal language-modeling objective, and (ii) NIFTY-RL, formatted specifically for alignment methods (like reinforcement learning from human feedback (RLHF)) to align LLMs via rejection sampling and reward modeling. Each dataset version provides curated, high-quality data incorporating comprehensive metadata, market indices, and deduplicated financial news headlines systematically filtered and ranked to suit modern LLM frameworks. We also include experiments demonstrating some applications of the dataset in tasks like stock price movement prediction and the role of LLM embeddings in information acquisition/richness. The NIFTY dataset, along with utilities (like systematically truncating a prompt's context length), is available on Hugging Face at https://huggingface.co/datasets/raeidsaqur/NIFTY. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.09747&r= |
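Since the dataset is hosted on Hugging Face, it can presumably be pulled with the standard `datasets` library. The snippet below is only a usage sketch: the repository id comes from the abstract's URL, while the configuration and split names are assumptions rather than documented values.

```python
# Minimal sketch of loading the NIFTY headlines dataset from Hugging Face.
# Repository id taken from the abstract's URL; split/config/field names are
# assumptions and may differ from the actual dataset card.
from datasets import load_dataset

dataset = load_dataset("raeidsaqur/NIFTY")  # may require a config name, e.g. "NIFTY-LM"
print(dataset)                              # inspect available splits and columns

train = dataset["train"]                    # assumed split name
print(train[0])                             # inspect one record (headlines, metadata, index data)
```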
By: | Daniel Aromí (IIEP UBA-Conicet/FCE UBA); Daniel Heymann (IIEP UBA-Conicet/FCE UBA) |
Abstract: | We propose a method to generate “synthetic surveys” that shed light on policymakers’ perceptions and narratives. This exercise is implemented using 80 time-stamped Large Language Models (LLMs) fine-tuned with FOMC meetings’ transcripts. Given a text input, fine-tuned models identify highly likely responses for the corresponding FOMC meeting. We evaluate this tool in three different tasks: sentiment analysis, evaluation of transparency in Central Bank communication, and characterization of policymaking narratives. Our analysis covers the housing bubble and the subsequent Great Recession (2003-2012). For the first task, LLMs are prompted to generate phrases that describe economic conditions. The resulting output is verified to transmit policymakers’ information regarding macroeconomic and financial dynamics. To analyze transparency, we compare the content of each FOMC minutes to content generated synthetically through the corresponding fine-tuned LLM. The evaluation suggests the tone of each meeting is transmitted adequately by the corresponding minutes. In the third task, we show LLMs produce insightful depictions of evolving policymaking narratives. This analysis reveals relevant narrative features such as goals, perceived threats, identified macroeconomic drivers, categorizations of the state of the economy, and manifestations of emotional states. |
Keywords: | Monetary policy, large language models, narratives, transparency. |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:aoz:wpaper:323&r= |
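The "synthetic survey" step described in the abstract above amounts to sampling likely completions from a meeting-specific fine-tuned model. A minimal sketch of that step is given below, with hypothetical model names and prompt; this is not the authors' code, and the sampling settings are illustrative.

```python
# Sketch of generating "synthetic survey" responses from a meeting-specific
# fine-tuned language model. Model names and prompt are hypothetical.
from transformers import pipeline

def synthetic_survey(model_name: str, prompt: str, n_responses: int = 5):
    """Sample likely completions from a fine-tuned LM for one FOMC meeting."""
    generator = pipeline("text-generation", model=model_name)
    outputs = generator(prompt, num_return_sequences=n_responses,
                        do_sample=True, max_new_tokens=40)
    return [o["generated_text"] for o in outputs]

# Hypothetical usage: one fine-tuned checkpoint per FOMC meeting.
# responses = synthetic_survey("fomc-lm-2008-09", "Current economic conditions are")
```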
By: | Knight, Simon (University of Technology Sydney) |
Abstract: | Objectives: There have been recent calls for new ethics guidelines regarding the use of artificial intelligence in research. How should we go about developing such ethics guidance documents with respect to emerging contexts such as new technologies, and established domains such as research in education? This paper provides a PRISMA-ETHICS informed scoping review of approaches to ethics guideline development, the structures of ethics guidelines, and their audiences and purposes, particularly in the context of education and AI. Search and synthesis approach: A broad search of scholarly and grey literature was conducted to identify both ethics guidelines and material discussing their development; n = 592 distinct items were identified, including 182 identified via recent reviews of AI ethics guidelines. n = 47 guideline-sets were identified as meeting our criteria as ‘guidelines’. Data extraction and analysis: Guidelines were analysed with respect to their development approach, audience and purpose, and structural elements through which guidance is delivered; most included statements regarding their development approach (79%) and audience (72%), typically in 1-2 paragraphs in the introduction. Where evidence underpinning the guidance was discussed, it was largely at a global content level (69%), rather than with respect to the specific context/domain of the guideline use, principles drawn on, or approaches and strategies one might adopt in navigating ethical issues (23, 29, and 21% respectively). Consultations with stakeholders and experts were the most common forms of evidence. Across the guidelines there are commonalities, with the majority including: an overview statement of the topic, audience, and guideline purpose; an indication of rights or license describing reuse conditions; an overview of the ethical concepts and their detailed elaboration; challenging cases or edge issues; and approaches or strategies one might adopt to navigate these. However, only the first element (overview) was present in all guidelines, a finding borne out in the further analysis of items relating specifically to AI and education. Recommendations regarding the development of ethics guidelines and their structure are provided. Funding: The work was supported through internal funding providing release of the author’s time. Systematic review registration: The review was not pre-registered. |
Date: | 2024–05–08 |
URL: | http://d.repec.org/n?u=RePEc:osf:osfxxx:n43d6&r= |