on Artificial Intelligence |
| By: | Mert Demirer; Andrey Fradkin; Nadav Tadelis; Sida Peng |
| Abstract: | We document six facts about the structure and dynamics of the LLM market using API usage data from OpenRouter and Microsoft Azure. First, we show rapid growth in the number of models, creators, and inference providers, driven by open-source entrants. Second, we show price declines and persistent price heterogeneity across and within intelligence tiers, with open-source models being 90% cheaper than closed-source models of comparable intelligence. Third, we document market dynamism, with frequent turnover among leading models and creators. Fourth, we present evidence of horizontal and vertical differentiation, with no single model dominating across use cases, and demand for intelligence varying widely across applications. Fifth, we estimate preliminary short-run price elasticities just above one, suggesting limited scope for Jevons Paradox effects. Finally, we show that although the share of firms that use multiple models increased over time, most firms concentrate their use on a single model, consistent with experimentation rather than persistent reliance on multiple models. |
| JEL: | L0 L10 |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:34608 |
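A back-of-the-envelope note (not from the paper) on why a short-run price elasticity only slightly above one leaves limited scope for Jevons Paradox effects: with isoelastic demand, total spending on inference barely rises as prices fall.

```latex
% Illustrative derivation, assuming isoelastic demand Q = A P^{-\varepsilon}.
% Expenditure on inference is E = P Q, so a price decline of |dP|/P changes
% spending by roughly (\varepsilon - 1)\,|dP|/P.
\[
  E = P\,Q = A\,P^{1-\varepsilon},
  \qquad
  \frac{dE}{E} \;\approx\; (\varepsilon - 1)\,\frac{|dP|}{P}
  \quad\text{for a price decline.}
\]
% With \varepsilon just above 1, a 10% price cut raises usage by roughly 10%
% but leaves total spending nearly flat, i.e. little room for a Jevons-type
% rebound in aggregate expenditure.
```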
| By: | Radosveta Ivanova-Stenzel (TU Berlin); Michel Tolksdorf (TU Berlin) |
| Abstract: | Despite the documented benefits of algorithmic decision-making, individuals often prefer to retain control rather than delegate decisions to AI agents. To what extent are the aversion to and distrust of algorithms rooted in a fundamental discomfort with giving up decision authority? Using two incentivized laboratory experiments across distinct decision domains, hiring (social decision-making) and forecasting (analytical decision-making), and decision architecture (nature and number of decisions), we elicit participants’ willingness to delegate decisions separately to an AI agent and a human agent. This within-subject design enables a direct comparison of delegation preferences across different agent types. We find that participants consistently underutilize both agents, even when informed of the agents’ superior performance. However, participants are more willing to delegate to the AI agent than to the human agent. Our results suggest that algorithm aversion may be driven less by distrust in AI and more by a general preference for decision autonomy. This implies that efforts to increase algorithm adoption should address broader concerns about control, rather than focusing solely on trust-building interventions. |
| Keywords: | algorithm; delegation; artificial intelligence; trust in AI; experiment; preferences |
| JEL: | C72 C91 D44 D83 |
| Date: | 2025–12–22 |
| URL: | https://d.repec.org/n?u=RePEc:rco:dpaper:558 |
| By: | Ferraz, Vinícius; Olah, Tamas; Sazedul, Ratin; Schmidt, Robert; Schwieren, Christiane |
| Abstract: | We investigate whether Large Language Models (LLMs) exhibit personality-driven strategic behavior in the Ultimatum Game by manipulating Dark Factor of Personality (D-Factor) profiles via standardized prompts. Across 400k decisions from 17 open-source models and 4,166 human benchmarks, we test whether LLMs playing the proposer and responder roles exhibit systematic behavioral shifts across five D-Factor levels (from least to most selfish). In the proposer role, fair offers declined monotonically from 91% (D1) to 17% (D5), mirroring human patterns but with 34% steeper gradients, indicating hypersensitivity to personality prompts. Responders diverged sharply: where humans became more punitive at higher D-levels, LLMs maintained high acceptance rates (75-92%) with weak or reversed D-Factor sensitivity, failing to reproduce reciprocity-punishment dynamics. These role-specific patterns align with strong-weak situation accounts: personality matters when incentives are ambiguous (proposers) but is muted when they are contingent (responders). Cross-model heterogeneity was substantial: the models most closely aligned with human behavior, according to composite similarity scores (integrating prosocial rates, D-Factor correlations, and odds ratios), were dolphin3, deepseek_1.5b, and llama3.2 (0.74-0.85), while others exhibited extreme or non-variable behavior. Temperature settings (0.2 vs. 0.8) exerted minimal influence. We interpret these patterns as prompt-driven regularities rather than genuine motivational processes, suggesting LLMs can approximate but not fully replicate human strategic behavior in social dilemmas. |
| Date: | 2025–12–16 |
| URL: | https://d.repec.org/n?u=RePEc:awi:wpaper:0768 |
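The entry above ranks models by a composite similarity score combining prosocial rates, D-Factor correlations, and odds ratios. A minimal sketch of how such a composite might be formed is below; the distance measures, equal weighting, and function name are illustrative assumptions, not the paper's formula.

```python
import numpy as np

def composite_similarity(prosocial_llm, prosocial_human,
                         d_corr_llm, d_corr_human,
                         log_or_llm, log_or_human):
    # Closeness of fair-offer (prosocial) rates, both expressed as proportions in [0, 1].
    rate_sim = 1.0 - abs(prosocial_llm - prosocial_human)
    # Closeness of the D-Factor behavioral gradients (correlations lie in [-1, 1]).
    corr_sim = 1.0 - abs(d_corr_llm - d_corr_human) / 2.0
    # Closeness of acceptance/offer odds ratios, compared on the log scale.
    or_sim = 1.0 / (1.0 + abs(log_or_llm - log_or_human))
    # Equal-weight average; the paper's weighting scheme is not specified in the abstract.
    return float(np.mean([rate_sim, corr_sim, or_sim]))

# Example: a model with near-human offer rates and gradients scores close to 1.
print(composite_similarity(0.85, 0.91, -0.70, -0.65, 0.4, 0.5))
```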
| By: | Mira Fischer (Federal Institute for Population Research, WZB Berlin, IZA - Institute of Labor Economics); Holger A. Rau (Georg-August-Universität Göttingen); Rainer Michael Rilke (WHU - Otto Beisheim School of Management) |
| Abstract: | We study how AI tutoring affects learning in higher education through a randomized experiment with 334 university students preparing for an incentivized exam. Students either received only textbook material, restricted access to an AI tutor requiring initial independent reading, or unrestricted access throughout the study period. AI tutor access raises test performance by 0.23 standard deviations relative to control. Surprisingly, unrestricted access significantly outperforms restricted access by 0.21 standard deviations, contradicting concerns about premature AI reliance. Behavioral analysis reveals that unrestricted access fosters gradual integration of AI support, while restricted access induces intensive bursts of prompting that disrupt learning flow. Benefits are heterogeneous: AI tutors prove most effective for students with lower baseline knowledge and stronger self-regulation skills, suggesting that seamless AI integration enhances learning when students can strategically combine independent study with targeted support. |
| Keywords: | AI tutors; large language models; self-regulated learning; higher education |
| JEL: | C91 I21 D83 |
| Date: | 2025–12–22 |
| URL: | https://d.repec.org/n?u=RePEc:rco:dpaper:557 |
| By: | Storm, Eduard; Gonschor, Myrielle; Schmidt, Marc Justin |
| Abstract: | We study how artificial intelligence (AI) affects workers' earnings and employment stability, combining German job vacancy data with administrative records from 2017-2023. Identification comes from changes in workers' exposure to local AI skill demand over time, instrumented with national demand trends. We find no meaningful displacement or productivity effects on average, but notable skill heterogeneity: expert workers with deep domain knowledge gain while non-experts often lose, with returns shaped by occupational task structures. We also document AI-driven reinstatement effects toward analytic and interactive tasks that raise earnings. Overall, our results imply distributional concerns but also job-augmenting potential of early AI technologies. |
| Keywords: | AI, Online Job Vacancies, Skill Demand, Worker-level Analysis, Employment, Earnings, Expertise |
| JEL: | D22 J23 J24 J31 O33 |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:rwirep:333893 |
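The entry above identifies effects from changes in workers' exposure to local AI skill demand, instrumented with national demand trends. One standard leave-one-out shift-share (Bartik) construction consistent with that description is sketched below; the paper's exact variable definitions may differ.

```latex
% Exposure of local labor market r at time t, built from base-period occupation
% weights \omega_{r,o} and the AI share of vacancies s^{AI}_{o,t} in occupation o:
\[
  \mathrm{AIexp}_{r,t} = \sum_{o} \omega_{r,o}\, s^{AI}_{o,t},
  \qquad
  \widetilde{\mathrm{AIexp}}_{r,t} = \sum_{o} \omega_{r,o}\, s^{AI,\,-r}_{o,t},
\]
% where the instrument \widetilde{\mathrm{AIexp}}_{r,t} replaces local AI vacancy
% shares with national, leave-region-out shares s^{AI,-r}_{o,t}, so that purely
% local demand shocks do not enter the instrument.
```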
| By: | Massfeller, Anna (University of Bonn); Hermann, Daniel; Leyens, Alexa; Storm, Hugo |
| Abstract: | The advancement of artificial intelligence (AI) technologies has the potential to improve farming efficiency globally, with decision support tools (DSTs) representing a particularly promising application. However, evidence from the medical and financial domains reveals a user reluctance to accept AI-based recommendations even when they outperform human alternatives, a phenomenon known as “algorithm aversion” (AA). This study is the first to examine the phenomenon in an agricultural setting. Drawing on survey data from a representative sample of 250 German farmers, we assessed farmers’ intention to use, and willingness to pay for, DSTs for wheat fungicide application based either on AI or on a human advisor. We implemented a novel Bayesian probabilistic programming workflow tailored to experimental studies, enabling a joint analysis that integrates an extended version of the unified theory of acceptance and use of technology with an economic experiment. Our results indicate that AA plays an important role in farmers’ decision-making. For most farmers, an AI-based DST must outperform a human advisor by 11–30% to be considered equally valuable. Similarly, an AI-based DST with equivalent performance must be 21–56% less expensive than the human advisor to be preferred. These findings underscore the importance of examining AA as a cognitive bias that may hinder the adoption of promising AI technologies in agriculture. |
| Date: | 2025–12–07 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:54khv_v1 |
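To make the reported thresholds concrete, the compensating differentials in the entry above can be read as indifference conditions between an AI-based DST and a human advisor; this simply restates the abstract's numbers, with δ and γ as the elicited algorithm-aversion premia.

```latex
% Equal prices: the AI-based DST must outperform the human advisor by a premium
% \delta before the two are valued equally.
\[
  \mathrm{perf}_{\mathrm{AI}} \;\ge\; (1+\delta)\,\mathrm{perf}_{\mathrm{human}},
  \qquad \delta \in [0.11,\,0.30].
\]
% Equal performance: the AI-based DST must be cheaper by a discount \gamma to be preferred.
\[
  p_{\mathrm{AI}} \;\le\; (1-\gamma)\,p_{\mathrm{human}},
  \qquad \gamma \in [0.21,\,0.56].
\]
```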
| By: | Adhikari, Ela; Lopez, Anitza; Lin, Lifeng (Florida State University); Li, Fiona |
| Abstract: | Background: Artificial intelligence (AI) is reshaping healthcare, presenting new opportunities in diagnostics, clinical decision support, workflow optimization, patient engagement, and population health. Yet important concerns remain about trust, transparency, bias, privacy, and the adequacy of existing regulatory frameworks. The objective of this study was to compare the perspectives of Healthcare Professionals (HCPs) and non-HCPs on the integration of AI in healthcare, with a focus on identifying perceived benefits, risks, ethical concerns, and barriers to adoption. Methods: We conducted an IRB-approved cross-sectional survey of adults aged ≥18, sampling both HCPs and non-HCPs. The questionnaire assessed perceived benefits and risks of AI, trust in AI systems, health bot applications, privacy and ethical concerns, regulatory priorities, and views on AI’s role in clinical decision-making. Responses from HCPs and non-HCPs were compared using descriptive statistics and group-level difference testing. Results: A total of 297 participants completed the survey, including 189 HCPs and 108 non-HCPs. Both groups expressed strong agreement that AI can improve efficiency, enhance access to care, support diagnosis, reduce medical errors, and aid early disease detection. However, trust in AI systems remained limited: nearly two-thirds of respondents expressed no confidence in AI’s ability to ensure privacy, safeguard data, or make unbiased ethical decisions. HCPs demonstrated greater emphasis on safety, accountability, transparency, and regulatory oversight, particularly in high-risk clinical environments, whereas non-HCPs were more likely to endorse shared responsibility when AI causes harm. Across groups, the majority believed that AI should serve primarily as an assistive tool, with humans retaining decision-making authority. Concerns about cost, infrastructure, and digital literacy were prominent barriers to equitable AI adoption. Conclusions: Despite recognizing AI’s potential benefits, both clinicians and the public remain cautious about its risks and ethical limitations. These findings highlight the need for robust governance, transparent design, targeted education, and human-centered approaches to promote trustworthy, safe, and equitable AI integration in healthcare. |
| Date: | 2025–12–06 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:q6gne_v1 |
| By: | Jean Xiao Timmerman |
| Abstract: | This paper examines the evolution of artificial intelligence (AI) patent rates (i.e., the number of AI patents divided by the number of firms of the same type) and concentration metrics (i.e., the Herfindahl-Hirschman Index (HHI) and Gini coefficient) among financial market participants from 2000 to 2020. It documents the historical trajectories of AI innovation for regulated banking entities and less-regulated firms, revealing that nonfinancial companies exhibit the highest baseline AI patent rate, while banks show the highest growth in AI patent rate over time. Banks have the highest HHI, and nonfinancial companies have the highest Gini coefficient, suggesting that a small number of banks dominate AI innovation and that the distribution of AI innovation at nonfinancial firms, though higher in number, is highly skewed toward a subset of players. These findings indicate that the AI technological gap between small and large banks may be widening and that the diversity of nonfinancial companies serving as third-party AI service providers may be limited. |
| Keywords: | Artificial intelligence; Banking; Financial innovation; Patents; Regulatory perimeter; Technological change |
| JEL: | G21 G23 G28 O31 O33 |
| Date: | 2025–12–12 |
| URL: | https://d.repec.org/n?u=RePEc:fip:fedgfe:2025-104 |
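The concentration metrics named in the entry above have standard definitions; a minimal sketch of computing the patent rate, HHI, and Gini coefficient from firm-level AI patent counts follows (the patent counts shown are made up for illustration).

```python
import numpy as np

def patent_rate(n_patents, n_firms):
    # AI patent rate as defined in the abstract: patents per firm of a given type.
    return n_patents / n_firms

def hhi(patent_counts):
    # Herfindahl-Hirschman Index on each firm's share of AI patents,
    # scaled to the conventional 0-10,000 range.
    shares = np.asarray(patent_counts, dtype=float)
    shares = shares / shares.sum()
    return float(10_000 * np.sum(shares ** 2))

def gini(patent_counts):
    # Gini coefficient of the patent distribution across firms (0 = equal, 1 = concentrated).
    x = np.sort(np.asarray(patent_counts, dtype=float))
    n = x.size
    cum = np.cumsum(x)
    return float((n + 1 - 2 * np.sum(cum) / cum[-1]) / n)

# Illustrative (made-up) patent counts for five hypothetical banks:
counts = [120, 30, 10, 5, 5]
print(patent_rate(sum(counts), len(counts)), hhi(counts), gini(counts))
```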
| By: | Jeffrey Allen; Max S. Hatfield |
| Abstract: | We examined the performance of four families of large language models (LLMs) and a variety of common fuzzy matching algorithms in assessing the similarity of names and addresses in a sanctions screening context. On average, across a range of realistic matching thresholds, the LLMs in our study reduced sanctions screening false positives by 92 percent and increased detection rates by 11 percent relative to the best-performing fuzzy matching baseline. Smaller, less computationally intensive models from the same language model families performed comparably, which may support scaling. In terms of computing performance, the LLMs were, on average, over four orders of magnitude slower than the fuzzy methods. To help address this, we propose a model cascade that escalates higher uncertainty screening cases to LLMs, while relying on fuzzy and exact matching for easier cases. The cascade is nearly twice as fast and just as accurate as the pure LLM system. We show even stronger runtime gains and comparable screening accuracy by relying on the fastest language models within the cascade. In the near term, the economic cost of running LLMs, inference latency, and other frictions, including API limits, will likely necessitate using these types of tiered approaches for sanctions screening in high-velocity and high-throughput financial activities, such as payments. Sanctions screening in slower-moving processes, such as customer due diligence for account opening and lending, may be able to rely on LLMs more extensively. |
| Keywords: | Large Language Models; Sanctions Screening; Model cascading |
| Date: | 2025–09–29 |
| URL: | https://d.repec.org/n?u=RePEc:fip:fedgfe:2025-92 |
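A minimal sketch of the tiered screening idea described above: exact and fuzzy matching resolve the easy cases, and only the uncertain middle band is escalated to an LLM. The thresholds and the LLM adjudication step are placeholders, not the paper's calibrated values.

```python
import difflib

FUZZY_LOW, FUZZY_HIGH = 0.70, 0.95  # illustrative thresholds, not from the paper

def fuzzy_score(a: str, b: str) -> float:
    # Simple character-based similarity as a stand-in for the paper's fuzzy matchers.
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

def llm_is_match(query: str, listed: str) -> bool:
    # Placeholder for an LLM call that judges whether two name/address records
    # refer to the same sanctioned party; a real implementation would prompt a model here.
    raise NotImplementedError("hypothetical LLM adjudication step")

def screen(query: str, listed: str) -> bool:
    # Tier 1: exact match handles the easiest cases at negligible cost.
    if query.strip().lower() == listed.strip().lower():
        return True
    # Tier 2: fuzzy matching resolves clear matches and clear non-matches.
    s = fuzzy_score(query, listed)
    if s >= FUZZY_HIGH:
        return True
    if s < FUZZY_LOW:
        return False
    # Tier 3: only the uncertain middle band is escalated to the (slow) LLM.
    return llm_is_match(query, listed)
```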
| By: | Zuoyou Jiang; Li Zhao; Rui Sun; Ruohan Sun; Zhongjian Li; Jing Li; Daxin Jiang; Zuo Bai; Cheng Hua |
| Abstract: | Signal decay and regime shifts pose recurring challenges for data-driven investment strategies in non-stationary markets. Conventional time-series and machine learning approaches, which rely primarily on historical correlations, often struggle to generalize when the economic environment changes. While large language models (LLMs) offer strong capabilities for processing unstructured information, their potential to support quantitative factor screening through explicit economic reasoning remains underexplored. Existing factor-based methods typically reduce alphas to numerical time series, overlooking the semantic rationale that determines when a factor is economically relevant. We propose Alpha-R1, an 8B-parameter reasoning model trained via reinforcement learning for context-aware alpha screening. Alpha-R1 reasons over factor logic and real-time news to evaluate alpha relevance under changing market conditions, selectively activating or deactivating factors based on contextual consistency. Empirical results across multiple asset pools show that Alpha-R1 consistently outperforms benchmark strategies and exhibits improved robustness to alpha decay. The full implementation and resources are available at https://github.com/FinStep-AI/Alpha-R1. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.23515 |
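A heavily simplified sketch of the context-aware screening loop the entry above describes: each factor's economic rationale is scored against current news and activated only when judged relevant. The function names, signature, and threshold are illustrative assumptions and are not taken from the Alpha-R1 repository.

```python
from typing import Callable, Dict, List

def screen_alphas(alphas: Dict[str, str], news_digest: str,
                  judge: Callable[[str, str], float],
                  threshold: float = 0.5) -> List[str]:
    # `alphas` maps factor names to short descriptions of their economic rationale.
    # `judge` stands in for the reasoning model: a call that returns a relevance
    # score in [0, 1] for a factor's rationale given the current news context.
    active = []
    for name, rationale in alphas.items():
        if judge(rationale, news_digest) >= threshold:
            active.append(name)  # activate the factor for this rebalancing period
        # factors scored below the threshold are deactivated until conditions change
    return active
```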
| By: | Muço, Arieda |
| Abstract: | Using Brazilian municipal audit reports, I construct an automated corruption index that combines a dictionary of audit irregularities with principal component analysis. The index validates strongly against independent human coders, explaining 71–73% of the variation in hand-coded corruption counts in samples where coders themselves exhibit high agreement, and the results are robust within these validation samples. The index behaves as theory predicts, correlating with municipal characteristics that prior research links to corruption. Supervised learning alternatives yield nearly identical municipal rankings (R² = 0.98), confirming that the dictionary approach captures the same underlying construct. The method scales to the full audit corpus and offers advantages over both manual coding and Large Language Models (LLMs) in transparency, cost, and long-run replicability. |
| Date: | 2025–12–12 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:cftvk_v1 |
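A minimal sketch of the dictionary-plus-PCA construction described above: count occurrences of irregularity terms in each audit report, standardize, and take the first principal component as the corruption score. The term list here is illustrative, not the paper's dictionary.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import StandardScaler

# Illustrative irregularity terms (the paper's actual dictionary is not reproduced here).
IRREGULARITY_TERMS = ["superfaturamento", "desvio de recursos",
                      "licitacao fraudulenta", "nota fiscal falsa"]

def corruption_index(audit_reports):
    # Count dictionary hits in each report (multi-word terms need ngram_range up to 3).
    vec = CountVectorizer(vocabulary=IRREGULARITY_TERMS, ngram_range=(1, 3))
    counts = vec.fit_transform(audit_reports).toarray().astype(float)
    # Standardize term counts, then summarize them with the first principal component.
    z = StandardScaler().fit_transform(counts)
    scores = PCA(n_components=1).fit_transform(z).ravel()
    # Orient the component so that higher values mean more irregularities.
    if np.corrcoef(scores, counts.sum(axis=1))[0, 1] < 0:
        scores = -scores
    return scores
```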
| By: | Richard Mills (University of Nottingham); Stuart Mills (University of Leeds); Cass R. Sunstein (Harvard University) |
| Abstract: | Policymakers and regulators are increasingly interested in behavioural auditing tools to counteract manipulative designs in Online Choice Architecture (OCA). To date, auditing tools have been largely manual, creating a trade-off between time, cost, and scale. This article presents a tool called ‘ManipulationDetect’, an internet browser plug-in that uses AI to detect, highlight, and record potentially manipulative OCA techniques in real-time. We offer a technical overview of how ManipulationDetect works, present an example audit which demonstrates the tool’s advantages, and highlight important practical next steps for further development. |
| Keywords: | AI; tool; manipulation; OCA; policymakers; regulations |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:not:notcdx:2025-04 |