nep-ain New Economics Papers
on Artificial Intelligence
Issue of 2025–03–17
twelve papers chosen by
Ben Greiner, Wirtschaftsuniversität Wien


  1. How Humans Help LLMs: Assessing and Incentivizing Human Preference Annotators By Shang Liu; Hanzhao Wang; Zhongyao Ma; Xiaocheng Li
  2. Causal Inference on Outcomes Learned from Text By Iman Modarressi; Jann Spiess; Amar Venugopal
  3. AI Reliance and Decision Quality: Fundamentals, Interdependence, and the Effects of Interventions By Schoeffer, Jakob; Jakubik, Johannes; Vössing, Michael; Kühl, Niklas; Satzger, Gerhard
  4. Assessing Generative AI value in a public sector context: evidence from a field experiment By Trevor Fitzpatrick; Seamus Kelly; Patrick Carey; David Walsh; Ruairi Nugent
  5. The Labor Market Impact of Digital Technologies By Sangmin Aum; Yongseok Shin
  6. Wikipedia Contributions in the Wake of ChatGPT By Liang Lyu; James Siderius; Hannah Li; Daron Acemoglu; Daniel Huttenlocher; Asuman Ozdaglar
  7. The amplifier effect of artificial agents in social contagion By Eric Hitz; Mingmin Feng; Radu Tanase; René Algesheimer; Manuel S. Mariani
  8. FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents By Mostapha Benhenda
  9. Dynamic spillovers and investment strategies across artificial intelligence ETFs, artificial intelligence tokens, and green markets By Ying-Hui Shao; Yan-Hong Yang; Wei-Xing Zhou
  10. LLM Knows Geometry Better than Algebra: Numerical Understanding of LLM-Based Agents in A Trading Arena By Tianmi Ma; Jiawei Du; Wenxin Huang; Wenjie Wang; Liang Xie; Xian Zhong; Joey Tianyi Zhou
  11. Shifting Power: Leveraging LLMs to Simulate Human Aversion in ABMs of Bilateral Financial Exchanges, A bond market study By Alicia Vidler; Toby Walsh
  12. Artificial General Intelligence and the End of Human Employment: The Need to Renegotiate the Social Contract By Pascal Stiefenhofer

  1. By: Shang Liu; Hanzhao Wang; Zhongyao Ma; Xiaocheng Li
    Abstract: Human-annotated preference data play an important role in aligning large language models (LLMs). In this paper, we investigate how to assess the performance of human annotators and how to incentivize them to provide high-quality annotations. The quality assessment of language/text annotation faces two challenges: (i) the intrinsic heterogeneity among annotators, which rules out classic methods that assume the existence of a true underlying label; and (ii) the unclear relationship between annotation quality and downstream task performance, which excludes the possibility of inferring annotator behavior from the performance of models trained on the annotation data. We then formulate a principal-agent model to characterize the behaviors of, and the interactions between, the company and the human annotators. The model rationalizes a practical bonus scheme that incentivizes annotators and benefits both parties, and it underscores the importance of the joint presence of an assessment system and a proper contract scheme. From a technical perspective, our analysis extends the existing literature on the principal-agent model by considering a continuous action space for the agent. We show that the gap between the first-best and the second-best solutions (under the continuous action space) is of $\Theta(1/\sqrt{n \log n})$ for binary contracts and $\Theta(1/n)$ for linear contracts, where $n$ is the number of samples used for performance assessment; this contrasts with the known result of $\exp(-\Theta(n))$ for binary contracts when the action space is discrete. Throughout the paper, we use real preference annotation data to accompany our discussions.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.06387
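    The abstract's rate results can be restated compactly; the value notation $V^{\mathrm{FB}}$ and $V^{\mathrm{SB}}$ for the first-best and second-best payoffs is assumed here for exposition, not taken from the paper:

    ```latex
    % Gap between first-best and second-best payoffs as the number of
    % assessment samples n grows (continuous action space, per the abstract):
    \[
    V^{\mathrm{FB}} - V^{\mathrm{SB}}_{\mathrm{binary}}(n)
      = \Theta\!\left(\frac{1}{\sqrt{n \log n}}\right),
    \qquad
    V^{\mathrm{FB}} - V^{\mathrm{SB}}_{\mathrm{linear}}(n)
      = \Theta\!\left(\frac{1}{n}\right).
    \]
    % With a discrete action space, the binary-contract gap instead
    % shrinks exponentially, as \exp(-\Theta(n)).
    ```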
  2. By: Iman Modarressi; Jann Spiess; Amar Venugopal
    Abstract: We propose a machine-learning tool that yields causal inference on text in randomized trials. Based on a simple econometric framework in which text may capture outcomes of interest, our procedure addresses three questions: First, is the text affected by the treatment? Second, on which outcomes does the effect operate? And third, how complete is our description of the causal effects? To answer all three questions, our approach uses large language models (LLMs) to suggest systematic differences across two groups of text documents and then provides valid inference based on costly validation. Specifically, we highlight the need for sample splitting to allow for statistical validation of LLM outputs, as well as the need for human labeling to validate substantive claims about how documents differ across groups. We illustrate the tool in a proof-of-concept application using abstracts of academic manuscripts.
    Date: 2025–03
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2503.00725
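    The sample-splitting idea in this abstract can be sketched in a few lines: propose a distinguishing feature on one half of the corpus (in the paper, via an LLM) and test it only on the held-out half, so data-driven discovery does not invalidate the inference. This is a minimal illustration under assumed names and a simple two-proportion z-test, not the authors' procedure:

    ```python
    import random
    from statistics import NormalDist

    def split_and_validate(treated, control, propose_feature, seed=0):
        """Discovery/validation split: propose_feature sees only the training
        halves and returns a predicate; inference uses the held-out halves."""
        rng = random.Random(seed)
        t, c = treated[:], control[:]
        rng.shuffle(t)
        rng.shuffle(c)
        t_train, t_test = t[:len(t) // 2], t[len(t) // 2:]
        c_train, c_test = c[:len(c) // 2], c[len(c) // 2:]
        pred = propose_feature(t_train, c_train)  # e.g. an LLM-suggested keyword rule
        p1 = sum(map(pred, t_test)) / len(t_test)
        p0 = sum(map(pred, c_test)) / len(c_test)
        n1, n0 = len(t_test), len(c_test)
        p = (p1 * n1 + p0 * n0) / (n1 + n0)          # pooled proportion
        se = (p * (1 - p) * (1 / n1 + 1 / n0)) ** 0.5
        z = (p1 - p0) / se if se > 0 else 0.0
        pval = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided z-test
        return p1 - p0, pval
    ```

    Because the feature is chosen without looking at the test halves, the z-test on those halves retains its nominal validity.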
  3. By: Schoeffer, Jakob; Jakubik, Johannes; Vössing, Michael; Kühl, Niklas; Satzger, Gerhard
    Abstract: In AI-assisted decision-making, a central promise of having a human-in-the-loop is that the human can complement the AI system by overriding its wrong recommendations. In practice, however, humans often cannot assess the correctness of AI recommendations and, as a result, adhere to wrong advice or override correct advice. Different ways of relying on AI recommendations have immediate, yet distinct, implications for decision quality. Unfortunately, reliance and decision quality are often inappropriately conflated in the current literature on AI-assisted decision-making. In this work, we disentangle and formalize the relationship between reliance and decision quality, and we characterize the conditions under which human-AI complementarity is achievable. To illustrate how reliance and decision quality relate to one another, we propose a visual framework and demonstrate its usefulness for interpreting empirical findings, including the effects of interventions like explanations. Overall, our research highlights the importance of distinguishing between reliance behavior and decision quality in AI-assisted decision-making.
    Date: 2025–02–02
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:cekm9_v2
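    The reliance/quality distinction can be made concrete with a stylized calculation (this is an illustrative model under assumed parameters, not the authors' formalization): a decision is correct if the human adheres to a correct AI recommendation, or overrides and is right on their own.

    ```python
    def decision_quality(ai_acc, human_acc, adhere_given_correct, adhere_given_wrong):
        """Stylized decision quality: adherence inherits the AI's correctness;
        an override succeeds with the human's own accuracy."""
        correct_branch = ai_acc * (adhere_given_correct
                                   + (1 - adhere_given_correct) * human_acc)
        wrong_branch = (1 - ai_acc) * (1 - adhere_given_wrong) * human_acc
        return correct_branch + wrong_branch
    ```

    With AI accuracy 0.8 and human accuracy 0.6, blind adherence caps quality at 0.8, but appropriate reliance (adhering mostly when the AI is right) can exceed both individual accuracies, which is the complementarity condition the paper characterizes.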
  4. By: Trevor Fitzpatrick; Seamus Kelly; Patrick Carey; David Walsh; Ruairi Nugent
    Abstract: The emergence of Generative AI (Gen AI) has motivated an interest in understanding how it could be used to enhance productivity across various tasks. We add to the evidence on the performance impact of Gen AI for complex knowledge-based tasks in a public sector setting. In a pre-registered experiment, after establishing a baseline level of performance, we find mixed evidence for two types of composite tasks related to document understanding and data analysis. For the Documents task, the treatment group using Gen AI had a 17% improvement in answer quality scores (as judged by human evaluators) and a 34% improvement in task completion time compared to a control group. For the Data task, we find the Gen AI treatment group experienced a 12% reduction in quality scores and no significant difference in mean completion time compared to the control group. These results suggest that the benefits of Gen AI may be task-dependent and potentially respondent-dependent. We also discuss field notes and lessons learned, as well as supplementary insights from a post-trial survey and feedback workshop with participants.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.09479
  5. By: Sangmin Aum; Yongseok Shin
    Abstract: We investigate the impact of digital technology on employment patterns in Korea, where firms have rapidly adopted digital technologies such as artificial intelligence (AI), big data, and the internet of things (IoT). By exploiting regional variations in technology exposure, we find significant negative effects on high-skill and female workers, particularly those in non-IT (information technology) services. This contrasts with previous technological disruptions, such as the IT revolution and robotization, which primarily affected low-skill male workers in manufacturing. In IT services, although high-skill employment declined, vacancy postings for high-skill workers increased, implying a shift in labor demand toward newer skill sets. These findings highlight both the labor displacement and the new opportunities generated by digital transformation.
    JEL: J24 O33
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:33469
  6. By: Liang Lyu; James Siderius; Hannah Li; Daron Acemoglu; Daniel Huttenlocher; Asuman Ozdaglar
    Abstract: How has Wikipedia activity changed for articles with content similar to ChatGPT following its introduction? We estimate the impact using differences-in-differences models, with dissimilar Wikipedia articles as a baseline for comparison, to examine how changes in voluntary knowledge contributions and information-seeking behavior differ by article content. Our analysis reveals that newly created, popular articles whose content overlaps with ChatGPT 3.5 saw a greater decline in editing and viewership after the November 2022 launch of ChatGPT than dissimilar articles did. These findings indicate heterogeneous substitution effects, where users selectively engage less with existing platforms when AI provides comparable content. This points to potential uneven impacts on the future of human-driven online knowledge contributions.
    Date: 2025–03
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2503.00757
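    The differences-in-differences design described above reduces, in its simplest 2x2 form, to comparing the pre/post change for similar articles against the change for dissimilar ones. A minimal sketch with hypothetical numbers (the paper's actual specification uses article-level panel data):

    ```python
    def did_estimate(means):
        """2x2 difference-in-differences: (treated post - treated pre)
        minus (control post - control pre). Keys are (group, period) tuples."""
        treated_change = means[("similar", "post")] - means[("similar", "pre")]
        control_change = means[("dissimilar", "post")] - means[("dissimilar", "pre")]
        return treated_change - control_change
    ```

    If similar articles drop from 100 to 70 average weekly edits while dissimilar ones drop from 100 to 90, the estimated ChatGPT effect is -20, net of the common time trend.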
  7. By: Eric Hitz; Mingmin Feng; Radu Tanase; René Algesheimer; Manuel S. Mariani
    Abstract: Recent advances in artificial intelligence have led to the proliferation of artificial agents in social contexts, ranging from education to online social media and financial markets, among many others. The increasing rate at which artificial and human agents interact makes it urgent to understand the consequences of human-machine interactions for the propagation of new ideas, products, and behaviors in society. Across two distinct empirical contexts, we find here that artificial agents lead to significantly faster and wider social contagion. To this end, we replicate a choice experiment previously conducted with human subjects by using artificial agents powered by large language models (LLMs). We use the experiment's results to measure the adoption thresholds of artificial agents and their impact on the spread of social contagion. We find that artificial agents tend to exhibit lower adoption thresholds than humans, which leads to wider network-based social contagions. Our findings suggest that the increased presence of artificial agents in real-world networks may accelerate behavioral shifts, potentially in unforeseen ways.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.21037
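    The mechanism in this abstract — lower adoption thresholds producing wider contagion — can be illustrated with a standard Watts-style threshold cascade. This sketch is a textbook model under assumed parameters, not the paper's experimental setup:

    ```python
    def cascade(adjacency, thresholds, seeds):
        """Threshold contagion: a node adopts once the adopting fraction of its
        neighbors reaches its threshold. Returns the final set of adopters."""
        adopted = set(seeds)
        changed = True
        while changed:
            changed = False
            for node, nbrs in adjacency.items():
                if node in adopted or not nbrs:
                    continue
                frac = sum(n in adopted for n in nbrs) / len(nbrs)
                if frac >= thresholds[node]:
                    adopted.add(node)
                    changed = True
        return adopted
    ```

    On a ring where every node has two neighbors, thresholds of 0.5 let a single seed cascade through the whole network, while thresholds of 0.6 stop the contagion at the seed — a small drop in thresholds (as the paper measures for LLM agents versus humans) can flip the system from no spread to full spread.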
  8. By: Mostapha Benhenda (LAGA)
    Abstract: This paper presents a novel risk-sensitive trading agent combining reinforcement learning and large language models (LLMs). We extend the Conditional Value-at-Risk Proximal Policy Optimization (CPPO) algorithm by adding risk-assessment and trading-recommendation signals generated by an LLM from financial news. Our approach is backtested on the Nasdaq-100 index benchmark, using financial news data from the FNSPID dataset and the DeepSeek V3, Qwen 2.5 and Llama 3.3 language models. The code, data, and trading agents are available at: https://github.com/benstaf/FinRL_DeepSeek
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.07393
  9. By: Ying-Hui Shao; Yan-Hong Yang; Wei-Xing Zhou
    Abstract: This paper investigates risk spillovers among AI ETFs, AI tokens, and green markets using the $R^2$ decomposition method. We reveal several key insights. First, the overall transmission connectedness index (TCI) closely aligns with the contemporaneous TCI, while the lagged TCI is significantly lower. Second, AI ETFs and clean energy act as risk transmitters, whereas AI tokens and green bonds function as risk receivers. Third, AI tokens are difficult to hedge and provide limited hedging ability compared to AI ETFs and green assets. However, multivariate portfolios effectively reduce AI token investment risk. Among them, the minimum-correlation portfolio outperforms the minimum-variance and minimum-connectedness portfolios.
    Date: 2025–03
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2503.01148
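    Of the three portfolio rules compared above, the minimum-variance portfolio has a well-known closed form; the two-asset special case is sketched below with illustrative numbers (the paper's portfolios are multivariate, and the minimum-correlation and minimum-connectedness rules are not shown here):

    ```python
    def min_variance_weights(var1, var2, cov12):
        """Two-asset minimum-variance portfolio:
        w1 = (var2 - cov12) / (var1 + var2 - 2*cov12), w2 = 1 - w1."""
        w1 = (var2 - cov12) / (var1 + var2 - 2 * cov12)
        return w1, 1 - w1
    ```

    With uncorrelated assets of variance 0.04 (say, an AI ETF) and 0.09 (say, an AI token), the rule overweights the less volatile asset and the resulting portfolio variance falls below either asset's own, which is the diversification effect the paper exploits to reduce AI token risk.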
  10. By: Tianmi Ma; Jiawei Du; Wenxin Huang; Wenjie Wang; Liang Xie; Xian Zhong; Joey Tianyi Zhou
    Abstract: Recent advancements in large language models (LLMs) have significantly improved performance in natural language processing tasks. However, their ability to generalize to dynamic, unseen tasks, particularly in numerical reasoning, remains a challenge. Existing benchmarks mainly evaluate LLMs on problems with predefined optimal solutions, which may not align with real-world scenarios where clear answers are absent. To bridge this gap, we design the Agent Trading Arena, a virtual numerical game simulating complex economic systems through zero-sum games, where agents invest in stock portfolios. Our experiments reveal that LLMs, including GPT-4o, struggle with algebraic reasoning when dealing with plain-text stock data, often focusing on local details rather than global trends. In contrast, LLMs perform significantly better with geometric reasoning when presented with visual data, such as scatter plots or K-line charts, suggesting that visual representations enhance numerical reasoning. This capability is further improved by incorporating a reflection module, which aids in the analysis and interpretation of complex data. We validate our findings on the NASDAQ stock dataset, where LLMs demonstrate stronger reasoning with visual data compared to text. Our code and data are publicly available at https://github.com/wekjsdvnm/Agent-Trading-Arena.git.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.17967
  11. By: Alicia Vidler; Toby Walsh
    Abstract: Bilateral markets, such as those for government bonds, involve decentralized and opaque transactions between market makers (MMs) and clients, posing significant challenges for traditional modeling approaches. To address these complexities, we introduce TRIBE, an agent-based model augmented with a large language model (LLM) to simulate human-like decision-making in trading environments. TRIBE leverages publicly available data and stylized facts to capture realistic trading dynamics, integrating human biases like risk aversion and ambiguity sensitivity into the decision-making processes of agents. Our research yields three key contributions: first, we demonstrate that integrating LLMs into agent-based models to enhance client agency is feasible and enriches the simulation of agent behaviors in complex markets; second, we find that even slight trade aversion encoded within the LLM leads to a complete cessation of trading activity, highlighting the sensitivity of market dynamics to agents' risk profiles; third, we show that incorporating human-like variability shifts power dynamics towards clients and can disproportionately affect the entire system, often resulting in systemic agent collapse across simulations. These findings underscore the emergent properties that arise when introducing stochastic, human-like decision processes, revealing new system behaviors that enhance the realism and complexity of artificial societies.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2503.00320
  12. By: Pascal Stiefenhofer
    Abstract: The emergence of Artificial General Intelligence (AGI) labor, including AI agents and autonomous systems operating at near-zero marginal cost, reduces the marginal productivity of human labor, ultimately pushing wages toward zero. As AGI labor and capital replace human workers, economic power shifts to capital owners, resulting in extreme wealth concentration, rising inequality, and reduced social mobility. The collapse of human wages causes aggregate demand to deteriorate, creating a paradox where firms produce more using AGI, yet fewer consumers can afford to buy goods. To prevent economic and social instability, new economic structures must emerge, such as Universal Basic Income (UBI), which redistributes AGI-generated wealth; public or cooperative AGI ownership, which ensures broader access to AI-driven profits; and progressive AGI capital taxation, which mitigates inequality and sustains aggregate demand. Addressing these challenges by renegotiating the social contract is crucial to maintaining economic stability in a post-labor economy.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.07050

This nep-ain issue is ©2025 by Ben Greiner. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.