nep-ain New Economics Papers
on Artificial Intelligence
Issue of 2025–11–24
seventeen papers chosen by
Ben Greiner, Wirtschaftsuniversität Wien


  1. Making Talk Cheap: Generative AI and Labor Market Signaling By Anais Galdin; Jesse Silbert
  2. The Use of Generative Artificial Intelligence in Research By Athina Karvounaraki; Alexis Stevenson; Isabelle Labrosse; David Campbell; Henrik Karlstrøm; Eric Iversen; Lili Wang; Ad Notten
  3. Misaligned by Design: Incentive Failures in Machine Learning By David Autor; Andrew Caplin; Daniel Martin; Philip Marx
  4. Algorithmic Advice as a Strategic Signal on Competitive Markets By Tobias R. Rebholz; Maxwell Uphoff; Christian H. R. Bernges; Florian Scholten
  5. AI Images, Labels and News Demand By Maja Adena; Eleonora Alabrese; Francesco Capozza; Isabelle Leader
  6. Why Do Civil Servants Delegate Empathic Engagement with Clients to Artificial Intelligence Systems? Insights from a Discrete Choice Experiment By König, Pascal; Weißmüller, Kristina Sabrina
  7. The Effects of Artificial Intelligence on Jobs: Evidence from an AI Subsidy Program By Hellsten, Mark; Khanna, Shantanu; Lodefalk, Magnus; Yakymovych, Yaroslav
  8. Labor Demand in the Age of Generative AI: Early Evidence from the U.S. Job Posting Data By Liu, Yan; Wang, He; Yu, Shu
  9. Prudential Reliability of Large Language Models in Reinsurance: Governance, Assurance, and Capital Efficiency By Stella C. Dong
  10. The LLM Pro Finance Suite: Multilingual Large Language Models for Financial Applications By Gaëtan Caillaut; Raheel Qader; Jingshu Liu; Mariam Nakhlé; Arezki Sadoune; Massinissa Ahmim; Jean-Gabriel Barthelemy
  11. Measuring and Mitigating Racial Disparities in Large Language Model Mortgage Underwriting By Don S. Bowen; McKay Price; Luke Stein; Ke Yang
  12. AI Investment and Firm Productivity: How Executive Demographics Drive Technology Adoption and Performance in Japanese Enterprises By Kikuchi, Tatsuru
  13. Information Extraction from Fiscal Documents using LLMs By Vikram Aggarwal; Jay Kulkarni; Aakriti Narang; Aditi Mascarenhas; Siddarth Raman; Ajay Shah; Susan Thomas
  14. Measuring economic outlook in the news timely and efficiently By Elliot Beck; Franziska Eckert; Linus Kühne; Helge Liebert; Rina Rosenblatt-Wisch
  15. Reasoning on Time-Series for Financial Technical Analysis By Kelvin J. L. Koa; Jan Chen; Yunshan Ma; Huanhuan Zheng; Tat-Seng Chua
  16. Potential Applications of Generative AI in Economic Simulations By Yusuke Takahashi; Kazuki Otaka; Naoya Kato
  17. Generative Agents and Expectations: Do LLMs Align with Heterogeneous Agent Models? By Filippo Gusella; Eugenio Vicario

  1. By: Anais Galdin; Jesse Silbert
    Abstract: Large language models (LLMs) like ChatGPT have significantly lowered the cost of producing written content. This paper studies how LLMs, through lowering writing costs, disrupt markets that traditionally relied on writing as a costly signal of quality (e.g., job applications, college essays). Using data from Freelancer.com, a major digital labor platform, we explore the effects of LLMs' disruption of labor market signaling on equilibrium market outcomes. We develop a novel LLM-based measure to quantify the extent to which an application is tailored to a given job posting. Taking the measure to the data, we find that employers have a high willingness to pay for workers with more customized applications in the period before LLMs are introduced, but not after. To isolate and quantify the effect of LLMs' disruption of signaling on equilibrium outcomes, we develop and estimate a structural model of labor market signaling, in which workers invest costly effort to produce noisy signals that predict their ability in equilibrium. We use the estimated model to simulate a counterfactual equilibrium in which LLMs render written applications useless in signaling workers' ability. Without costly signaling, employers are less able to identify high-ability workers, causing the market to become significantly less meritocratic: compared to the pre-LLM equilibrium, workers in the top quintile of the ability distribution are hired 19% less often, and workers in the bottom quintile are hired 14% more often.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.08785
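A hedged aside on the tailoring measure above: the paper's measure is LLM-based, but the underlying idea, scoring how customized an application is to a specific posting, can be sketched with off-the-shelf sentence embeddings. The encoder choice and example texts below are illustrative assumptions, not the authors' method.

```python
# Sketch of a tailoring proxy: cosine similarity between a job posting
# and an application, using sentence embeddings. This is NOT the paper's
# LLM-based measure; the encoder and texts are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

def tailoring_score(posting: str, application: str) -> float:
    """Higher scores suggest the application is more customized to the posting."""
    a, b = model.encode([posting, application])
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

posting = "Seeking a Python developer to build a web scraper for retail prices."
generic = "I am a hard-working freelancer with experience in many projects."
tailored = "I have built retail-price scrapers in Python with requests and lxml."
print(tailoring_score(posting, generic), tailoring_score(posting, tailored))
```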
  2. By: Athina Karvounaraki (European Commission); Alexis Stevenson (European Commission); Isabelle Labrosse (Science-Metrix); David Campbell (Science-Metrix); Henrik Karlstrøm (NIFU); Eric Iversen (NIFU); Lili Wang (UNU/MERIT, Maastricht University); Ad Notten (UNU/MERIT, Maastricht University)
    Abstract: The study examines the surge in GenAI chatbot mentions in scientific literature, showing a 13-fold increase from November 2022 to December 2023. GenAI chatbots are used in scientific research mainly in ICT and Applied Sciences, where AI improves research efficiency. Key applications include writing and practical implementation, demonstrating these tools’ widespread use in academic writing and research. Nonetheless, the increasing use of AI in research and academia raises concerns about quality assurance and trust.
    Keywords: Generative AI, Research, Scientific Literature, Chatbots, ICT, Applied Sciences, Academic Writing, Quality Assurance, Trust Issues
    JEL: O32 O38 C18
    Date: 2025–06
    URL: https://d.repec.org/n?u=RePEc:eug:wpaper:ki-01-25-084-en-n
  3. By: David Autor; Andrew Caplin; Daniel Martin; Philip Marx
    Abstract: The cost of error in many high-stakes settings is asymmetric: misdiagnosing pneumonia when absent is an inconvenience, but failing to detect it when present can be life-threatening. Because of this, artificial intelligence (AI) models used to assist such decisions are frequently trained with asymmetric loss functions that incorporate human decision-makers' trade-offs between false positives and false negatives. In two focal applications, we show that this standard alignment practice can backfire. In both cases, it would be better to train the machine learning model with a loss function that ignores the human's objective and then adjust predictions ex post according to that objective. We rationalize this result using an economic model of incentive design with endogenous information acquisition. The key insight from our theoretical framework is that machine classifiers perform not one but two incentivized tasks: choosing how to classify and learning how to classify. We show that while the adjustments engineers use correctly incentivize choosing, they can simultaneously reduce the incentives to learn. Our formal treatment of the problem reveals that methods embraced for their intuitive appeal can in fact misalign human and machine objectives in predictable ways.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.07699
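The remedy the abstract describes, training with a loss that ignores the decision-maker's asymmetric costs and adjusting predictions ex post, has a textbook form in binary classification: fit for calibrated probabilities, then move only the decision threshold. A minimal sketch, with the costs and data assumed for illustration:

```python
# Sketch: symmetric training plus ex-post threshold adjustment, versus
# baking the asymmetry into training via class weights. Costs and data
# are illustrative assumptions, not taken from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)
C_FN, C_FP = 10.0, 1.0  # a false negative is 10x worse (assumed costs)

# (a) Train with the plain symmetric log loss, then predict positive
# whenever p >= C_FP / (C_FP + C_FN), the cost-minimizing threshold.
clf = LogisticRegression(max_iter=1000).fit(X, y)
p = clf.predict_proba(X)[:, 1]
yhat_expost = (p >= C_FP / (C_FP + C_FN)).astype(int)

# (b) Alternatively, fold the asymmetry into the training loss itself.
clf_w = LogisticRegression(max_iter=1000, class_weight={0: C_FP, 1: C_FN})
yhat_weighted = clf_w.fit(X, y).predict(X)

def expected_cost(y_true, y_pred):
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return C_FP * fp + C_FN * fn

print(expected_cost(y, yhat_expost), expected_cost(y, yhat_weighted))
```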
  4. By: Tobias R. Rebholz; Maxwell Uphoff; Christian H. R. Bernges; Florian Scholten
    Abstract: As algorithms increasingly mediate competitive decision-making, their influence extends beyond individual outcomes to shaping strategic market dynamics. In two preregistered experiments, we examined how algorithmic advice affects human behavior in classic economic games with unique, non-collusive, and analytically traceable equilibria. In Experiment 1 (N = 107), participants played a Bertrand price competition with individualized or collective algorithmic recommendations. Initially, collusively upward-biased advice increased prices, particularly when individualized, but prices gradually converged toward equilibrium over the course of the experiment. However, participants avoided setting prices above the algorithm's recommendation throughout the experiment, suggesting that advice served as a soft upper bound for acceptable prices. In Experiment 2 (N = 129), participants played a Cournot quantity competition with equilibrium-aligned or strategically biased algorithmic recommendations. Here, individualized equilibrium advice supported stable convergence, whereas collusively downward-biased advice led to sustained underproduction and supracompetitive profits - hallmarks of tacit collusion. In both experiments, participants responded more strongly and consistently to individualized advice than collective advice, potentially due to greater perceived ownership of the former. These findings demonstrate that algorithmic advice can function as a strategic signal, shaping coordination even without explicit communication. The results echo real-world concerns about algorithmic collusion and underscore the need for careful design and oversight of algorithmic decision-support systems in competitive environments.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.09454
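For readers less familiar with the benchmark game: in a linear Cournot duopoly the unique Nash quantity exceeds the joint-profit-maximizing quantity, so downward-biased quantity advice that both players follow produces exactly the supracompetitive profits the abstract describes. A worked sketch with assumed parameters, not the experiment's:

```python
# Linear Cournot duopoly: inverse demand P = a - b*(q1 + q2), cost c.
# Parameters are illustrative assumptions, not the experiment's design.
a, b, c = 100.0, 1.0, 10.0

def profit(q_own, q_other):
    return (a - b * (q_own + q_other) - c) * q_own

q_nash = (a - c) / (3 * b)        # unique Cournot-Nash quantity: 30.0
q_biased = (a - c) / (4 * b)      # joint-monopoly quantity per firm: 22.5

# If both firms follow downward-biased advice, output falls and
# per-firm profit rises above the competitive (Nash) level.
print(profit(q_nash, q_nash))      # 900.0 at equilibrium
print(profit(q_biased, q_biased))  # 1012.5 under collusive advice
```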
  5. By: Maja Adena; Eleonora Alabrese; Francesco Capozza; Isabelle Leader
    Abstract: We test whether AI-generated news images affect outlet demand and trust. In a pre-registered experiment with 2,870 UK adults, the same article was paired with a wire-service photo (with/without credit) or a matched AI image (with/without label). Average newsletter demand changes little. Ex-post photo origin recollection is poor, and many believe even the real photo is synthetic. Beliefs drive behavior: thinking the image is AI cuts demand and perceived outlet quality by about 10 p.p., even when the photo is authentic; believing it is real has the opposite effect. Labels modestly reduce penalties but do little to correct mistaken attributions.
    Keywords: AI, demand for news, trust, online experiment
    JEL: C81 C93 D83
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ces:ceswps:_12277
  6. By: König, Pascal; Weißmüller, Kristina Sabrina (Vrije Universiteit Amsterdam)
    Abstract: What factors lie behind bureaucrats’ readiness to delegate client interactions that commonly involve human empathy to artificial intelligence (AI) systems? Such delegation entails a crucial trade-off as it may reduce workload but simultaneously introduces inauthentic empathic engagement in citizen-state relations, which may undermine the moral integrity of public administration (PA). Drawing on bureaucratic legitimacy theory, this study tests the impact of efficiency gains, AI features, and organizational norms on civil servants’ willingness to delegate citizen engagement to AI. Findings from a pre-registered discrete choice experiment conducted with 300 active German civil servants (Obs. = 3,000) show that while efficiency gains and norms do have some impact, utilitarian considerations concerning AI’s ability to serve clients well are clearly the most important motivator. The findings show that the acceptability of delegating empathic engagement with citizens to AI can be tied to key dimensions of bureaucratic legitimacy, and provide novel evidence that the delegation of counselling to AI in PA is more strongly linked with public service motivation than with self-serving efficiency gains. These insights advance theory and inform responsible and client-centered use of AI in public bureaucracies.
    Date: 2025–11–21
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:v9nj3_v1
  7. By: Hellsten, Mark (University of Tübingen); Khanna, Shantanu (Northeastern University); Lodefalk, Magnus (The Ratio Institute); Yakymovych, Yaroslav (Uppsala University)
    Abstract: Artificial intelligence (AI) is expected to reshape labor markets, yet causal evidence remains scarce. We exploit a novel Swedish subsidy program that encouraged small and mid-sized firms to adopt AI. Using a synthetic difference-in-differences design comparing awarded and non-awarded firms, we find that AI subsidies led to a sustained increase in job postings over five years, but with no statistically detectable change in employment. This pattern reflects hiring signals concentrated in AI occupations and white-collar roles. Our findings align with task-based models of automation, in which AI adoption reconfigures work and spurs demand for new skills, but hiring frictions and the need for complementary investments delay workforce expansion.
    Keywords: Artificial intelligence; Labor markets; Hiring; Task content; Technological change
    JEL: J23 J24 O33
    Date: 2025–11–14
    URL: https://d.repec.org/n?u=RePEc:hhs:ratioi:0386
  8. By: Liu, Yan; Wang, He; Yu, Shu
    Abstract: This paper examines the causal impact of generative artificial intelligence on U.S. labor demand using online job posting data. Exploiting ChatGPT’s release in November 2022 as an exogenous shock, the paper applies difference-in-differences and event study designs to estimate the job displacement effects of generative artificial intelligence. The identification strategy compares labor demand for occupations with high versus low artificial intelligence substitution vulnerability following ChatGPT’s launch, conditioning on similar generative artificial intelligence exposure levels to isolate substitution effects from complementary uses. The analysis uses 285 million job postings collected by Lightcast from the first quarter of 2018 to the second quarter of 2025. The findings show that the number of postings for occupations with above-median artificial intelligence substitution scores fell by an average of 12 percent relative to those with below-median scores. The effect increased from 6 percent in the first year after the launch to 18 percent by the third year. Losses were particularly acute for entry-level positions that require neither advanced degrees (18 percent) nor extensive experience (20 percent), as well as those in administrative support (40 percent) and professional services (30 percent). Although generative artificial intelligence creates new occupations and enhances productivity, which may increase labor demand, early evidence suggests that some occupations may be less likely to be complemented by generative artificial intelligence than others.
    Date: 2025–11–18
    URL: https://d.repec.org/n?u=RePEc:wbk:wbrwps:11263
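At its core, the comparison described above is a two-way fixed-effects difference-in-differences design. A minimal sketch of that core, with the file and column names assumed for illustration (the paper's specification, which conditions on exposure levels and adds event-study dynamics, is richer):

```python
# DiD sketch: occupations with above-median AI-substitution scores vs.
# below-median, before and after ChatGPT's launch (2022Q4). The input
# file and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("postings_by_occupation_quarter.csv")  # hypothetical file
df["post"] = (df["quarter"] >= "2022Q4").astype(int)    # quarter as "2018Q1", ...
df["high_sub"] = (df["sub_score"] > df["sub_score"].median()).astype(int)
df["log_postings"] = np.log(df["postings"])

# Occupation and quarter fixed effects absorb the main effects; the
# interaction coefficient is the DiD estimate of the relative decline.
m = smf.ols("log_postings ~ high_sub:post + C(occupation) + C(quarter)",
            data=df).fit(cov_type="cluster",
                         cov_kwds={"groups": df["occupation"]})
print(m.params["high_sub:post"])
```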
  9. By: Stella C. Dong
    Abstract: This paper develops a prudential framework for assessing the reliability of large language models (LLMs) in reinsurance. A five-pillar architecture (governance, data lineage, assurance, resilience, and regulatory alignment) translates supervisory expectations from Solvency II, SR 11-7, and guidance from EIOPA (2025), NAIC (2023), and IAIS (2024) into measurable lifecycle controls. The framework is implemented through the Reinsurance AI Reliability and Assurance Benchmark (RAIRAB), which evaluates whether governance-embedded LLMs meet prudential standards for grounding, transparency, and accountability. Across six task families, retrieval-grounded configurations achieved higher grounding accuracy (0.90), reduced hallucination and interpretive drift by roughly 40%, and nearly doubled transparency. These mechanisms lower informational frictions in risk transfer and capital allocation, showing that existing prudential doctrines already accommodate reliable AI when governance is explicit, data are traceable, and assurance is verifiable.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.08082
  10. By: Gaëtan Caillaut; Raheel Qader; Jingshu Liu; Mariam Nakhlé; Arezki Sadoune; Massinissa Ahmim; Jean-Gabriel Barthelemy
    Abstract: The financial industry's growing demand for advanced natural language processing (NLP) capabilities has highlighted the limitations of generalist large language models (LLMs) in handling domain-specific financial tasks. To address this gap, we introduce the LLM Pro Finance Suite, a collection of five instruction-tuned LLMs (ranging from 8B to 70B parameters) specifically designed for financial applications. Our approach focuses on enhancing generalist instruction-tuned models, leveraging their existing strengths in instruction following, reasoning, and toxicity control, while fine-tuning them on a curated, high-quality financial corpus comprising over 50% finance-related data in English, French, and German. We evaluate the LLM Pro Finance Suite on a comprehensive financial benchmark suite, demonstrating consistent improvement over state-of-the-art baselines in finance-oriented tasks and financial translation. Notably, our models maintain the strong general-domain capabilities of their base models, ensuring reliable performance across non-specialized tasks. This dual proficiency, enhanced financial expertise without compromise on general abilities, makes the LLM Pro Finance Suite an ideal drop-in replacement for existing LLMs in financial workflows, offering improved domain-specific performance while preserving overall versatility. We publicly release two 8B-parameter models to foster future research and development in financial NLP applications: https://huggingface.co/collections/DragonLLM/llm-open-finance.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.08621
  11. By: Don S. Bowen; McKay Price; Luke Stein; Ke Yang
    Abstract: We conduct the first study exploring the application of large language models (LLMs) to mortgage underwriting, using an audit study design that combines real loan application data with experimentally manipulated race and credit scores. First, we find that LLMs systematically recommend more denials and higher interest rates for Black applicants than otherwise-identical white applicants. These racial disparities are largest for lower-credit-score applicants and riskier loans, and exist across multiple generations of LLMs developed by three leading firms. Second, we identify a straightforward and effective mitigation strategy: Simply instructing the LLM to make unbiased decisions. Doing so eliminates the racial approval gap and significantly reduces interest rate disparities. Finally, we show LLM recommendations correlate strongly with real-world lender decisions, even without fine-tuning, specialized training, macroeconomic context, or extensive application data. Our findings have important implications for financial firms exploring LLM applications and regulators overseeing AI’s rapidly expanding role in finance.
    Keywords: Artificial Intelligence; Fair Lending; Mortgage Underwriting; Racial Bias
    JEL: R3
    Date: 2025–01–01
    URL: https://d.repec.org/n?u=RePEc:arz:wpaper:eres2025_75
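The audit design crosses experimentally manipulated race and credit score in otherwise identical applications and compares the model's recommendations, and the mitigation is a one-line instruction. A hedged sketch of that loop, in which the prompt wording, model name, and one-word output format are illustrative assumptions rather than the paper's protocol:

```python
# Sketch of an audit loop: identical loan applications varying only in
# race and credit score, scored by an LLM. Prompt text, model name, and
# parsing are illustrative assumptions, not the paper's materials.
from itertools import product
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

BASE = ("Loan application: income $85,000; loan amount $300,000; "
        "debt-to-income 32%; credit score {score}; applicant race: {race}. "
        "Answer with exactly one word, APPROVE or DENY.")
MITIGATION = "Make an unbiased decision; do not let race affect it. "

results = {}
for race, score in product(["white", "Black"], [580, 640, 700, 760]):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{"role": "user",
                   "content": MITIGATION + BASE.format(score=score, race=race)}],
        temperature=0,
    )
    results[(race, score)] = resp.choices[0].message.content.strip()
print(results)
```

Running the loop with and without the MITIGATION prefix, over many applications and repetitions, is the analogue of the paper's approval-gap comparison.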
  12. By: Kikuchi, Tatsuru
    Abstract: This paper investigates how executive demographics—particularly age and gender—influence artificial intelligence (AI) investment decisions and subsequent firm productivity using comprehensive data from over 500 Japanese enterprises spanning 2018-2023. Our central research question addresses the role of executive characteristics in technology adoption, finding that CEO age and technical background significantly predict AI investment propensity. Employing these demographic characteristics as instrumental variables to address endogeneity concerns, we identify a statistically significant 2.4% increase in total factor productivity attributable to AI investment adoption. Our novel mechanism decomposition framework reveals that productivity gains operate through three distinct channels: cost reduction (40% of total effect), revenue enhancement (35%), and innovation acceleration (25%). The results demonstrate that younger executives (below 50 years) are 23% more likely to adopt AI technologies, while firm size significantly moderates this relationship. Aggregate projections suggest potential GDP impacts of ¥1.15 trillion from widespread AI adoption across the Japanese economy. These findings provide crucial empirical guidance for understanding the human factors driving digital transformation and inform both corporate governance and public policy regarding AI investment incentives.
    Keywords: Artificial Intelligence, Executive Demographics, Technology Adoption, Productivity, Digital Transformation
    JEL: D24 L25 M12 O33 O47
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:126734
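The identification step, using executive demographics as instruments for AI adoption in a productivity regression, is standard two-stage least squares. A sketch under assumed variable names, with linearmodels supplying the estimator:

```python
# 2SLS sketch: CEO age and technical background instrument AI adoption
# in a TFP regression. The file and variable names are hypothetical.
import pandas as pd
from linearmodels.iv import IV2SLS

df = pd.read_csv("firm_panel.csv")  # hypothetical firm-year panel

model = IV2SLS.from_formula(
    "log_tfp ~ 1 + log_employment + log_capital"
    " + [ai_adopted ~ ceo_age + ceo_tech_background]",
    data=df,
).fit(cov_type="clustered", clusters=df["firm_id"])
print(model.params["ai_adopted"])  # cf. the paper's 2.4% TFP estimate
```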
  13. By: Vikram Aggarwal (Google); Jay Kulkarni (xKDR Forum); Aakriti Narang (xKDR Forum); Aditi Mascarenhas (xKDR Forum); Siddarth Raman (xKDR Forum); Ajay Shah (xKDR Forum); Susan Thomas (xKDR Forum)
    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in text comprehension, but their ability to process complex, hierarchical tabular data remains underexplored. We present a novel approach to extracting structured data from multi-page government fiscal documents using LLM-based techniques. Applied to large annual fiscal documents from the State of Karnataka in India, our method achieves high accuracy through a multi-stage pipeline that leverages domain knowledge, sequential context, and algorithmic validation. Traditional OCR methods perform poorly on such documents, producing errors that are hard to detect. The inherent structure of fiscal tables, with totals at each level of the hierarchy, allows for robust internal validation of the extracted data. We use these hierarchical relationships to create multi-level validation checks. We demonstrate that LLMs can read tables and also process document-specific structural hierarchies, offering a scalable process for converting PDF-based fiscal disclosures into research-ready databases. Our implementation shows promise for broader applications across developing country contexts.
    JEL: H6 H7 Y10
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:anf:wpaper:43
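The internal-validation idea is simple to state: the printed total at every node of a fiscal table should equal the sum of its children, so extraction errors surface as inconsistent subtrees. A minimal sketch, with the node structure assumed for illustration:

```python
# Sketch of hierarchical validation: flag every node whose children do
# not sum to its printed total. The Node structure is an assumption.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    total: float                       # amount as printed in the document
    children: list = field(default_factory=list)

def validate(node: Node, tol: float = 0.5) -> list:
    """Return names of nodes whose children fail the sum check."""
    errors = []
    if node.children:
        if abs(sum(c.total for c in node.children) - node.total) > tol:
            errors.append(node.name)
        for child in node.children:
            errors.extend(validate(child, tol))
    return errors

budget = Node("Education", 1000.0, [
    Node("Salaries", 700.0),
    Node("Schemes", 250.0),
    Node("Capital", 60.0),   # 700 + 250 + 60 = 1010 != 1000 -> flagged
])
print(validate(budget))  # ['Education']
```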
  14. By: Elliot Beck; Franziska Eckert; Linus Kühne; Helge Liebert; Rina Rosenblatt-Wisch
    Abstract: We introduce a novel indicator that combines machine learning and large language models with traditional statistical methods to track sentiment regarding the economic outlook in Swiss news. The indicator is interpretable and timely, and it significantly improves the accuracy of GDP growth forecasts. Our approach is resource-efficient, modular, and offers a way of benefitting from state-of-the-art large language models even if data are proprietary and cannot be stored or analyzed on external infrastructure - a restriction faced by many central banks and public institutions.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.04299
  15. By: Kelvin J. L. Koa; Jan Chen; Yunshan Ma; Huanhuan Zheng; Tat-Seng Chua
    Abstract: While Large Language Models have been used to produce interpretable stock forecasts, they mainly focus on analyzing textual reports but not historical price data, also known as Technical Analysis. This task is challenging as it switches between domains: the stock price inputs and outputs lie in the time-series domain, while the reasoning step should be in natural language. In this work, we introduce Verbal Technical Analysis (VTA), a novel framework that combines verbal and latent reasoning to produce stock time-series forecasts that are both accurate and interpretable. To reason over time-series, we convert stock price data into textual annotations and optimize the reasoning trace using an inverse Mean Squared Error (MSE) reward objective. To produce time-series outputs from textual reasoning, we condition the outputs of a time-series backbone model on the reasoning-based attributes. Experiments on stock datasets across U.S., Chinese, and European markets show that VTA achieves state-of-the-art forecasting accuracy, while the reasoning traces also perform well on evaluation by industry experts.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.08616
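The inverse-MSE reward mentioned above fits in a few lines: a reasoning trace is scored by the accuracy of the forecast it leads to, so lower error means higher reward. The exact functional form below is an assumption for illustration:

```python
# Sketch of an inverse-MSE reward for a reasoning trace, scored via the
# forecast it produces. The form 1 / (1 + MSE) is an assumed example.
import numpy as np

def inverse_mse_reward(forecast: np.ndarray, actual: np.ndarray) -> float:
    mse = float(np.mean((forecast - actual) ** 2))
    return 1.0 / (1.0 + mse)

actual = np.array([101.0, 102.5, 103.0])
print(inverse_mse_reward(np.array([101.2, 102.4, 103.1]), actual))  # ~0.98
print(inverse_mse_reward(np.array([105.0, 99.0, 108.0]), actual))   # ~0.05
```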
  16. By: Yusuke Takahashi (Bank of Japan); Kazuki Otaka (Bank of Japan); Naoya Kato (Bank of Japan)
    Abstract: In this article, we present some preliminary analyses in which Large Language Models (LLMs) are used as economic agents in simulations, as an example of utilizing Generative AI in economic analysis. Existing research reports that Generative AI provides responses consistent with predictions suggested in fields like behavioral economics. There are also some studies which have applied Agent-Based Models (ABM) by treating Generative AI as "players" in a market. However, even though Generative AI exhibits behavior similar to actual economic agents, in reality, it is merely outputting statistically consistent responses based on patterns found in its training data. Therefore, whether the results of simulations that treat Generative AI as economic agents are consistent with economic theory depends crucially on the AI's training data. In this article, we conduct simple ABM simulations to demonstrate how Generative AI can be applied, and examine whether its responses are aligned with intuition and economic theory. Our results are consistent with economic theory: (1) consumers adjust their spending in response to real wage fluctuations; and (2) firms find it easier to pass costs on to consumers in a monopoly market compared to a duopoly market. We conclude that it is necessary to continue verifying through other economic analyses whether simulations using Generative AI consistently lead to conclusions congruent with economic theory.
    Keywords: Generative AI; Agent-Based Model; Consumer Behavior; Price Setting Behavior
    JEL: C63 D11 D40
    Date: 2025–11–13
    URL: https://d.repec.org/n?u=RePEc:boj:bojlab:lab25e01
  17. By: Filippo Gusella; Eugenio Vicario
    Abstract: Results in the Heterogeneous Agent Model (HAM) literature determine the proportion of fundamentalists and trend followers in the financial market. This proportion varies according to the periods analyzed. In this paper, we use a large language model (LLM) to construct a generative agent (GA) that determines the probability of adopting one of the two strategies based on current information. The probabilities of strategy adoption are compared with those in the HAM literature for the S&P 500 index between 1990 and 2020. Our findings suggest that the resulting artificial intelligence (AI) expectations align with those reported in the HAM literature. At the same time, extending the analysis to artificial market data helps us to filter the decision-making process of the AI agent. In the artificial market, results confirm the heterogeneity in expectations but reveal systematic asymmetry toward the fundamentalist behavior.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.08604
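For context on the comparison: the HAM literature typically derives the fundamentalist and trend-follower fractions from a discrete-choice (logit) rule over past strategy performance, in the style of Brock and Hommes; the paper instead elicits these adoption probabilities from an LLM. A sketch of the classic rule, with all parameters assumed:

```python
# Sketch of Brock-Hommes style strategy switching: logit shares over
# past performance, plus the two expectation rules. Parameters assumed.
import numpy as np

beta = 2.0   # intensity of choice (assumed)
v = 0.5      # fundamentalist speed of mean reversion (assumed)
g = 1.2      # trend-follower extrapolation strength (assumed)

def strategy_fractions(perf_fund: float, perf_trend: float):
    """Logit shares of fundamentalists and trend followers."""
    w = np.exp(beta * np.array([perf_fund, perf_trend]))
    w = w / w.sum()
    return float(w[0]), float(w[1])

def expectations(p: float, p_lag: float, p_fund: float):
    e_fund = p + v * (p_fund - p)    # revert toward the fundamental value
    e_trend = p + g * (p - p_lag)    # extrapolate the recent trend
    return e_fund, e_trend

n_f, n_t = strategy_fractions(perf_fund=0.8, perf_trend=1.1)
e_f, e_t = expectations(p=100.0, p_lag=98.0, p_fund=95.0)
print(n_f, n_t, n_f * e_f + n_t * e_t)  # fractions and market expectation
```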

This nep-ain issue is ©2025 by Ben Greiner. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.