nep-ain New Economics Papers
on Artificial Intelligence
Issue of 2025–12–22
seven papers chosen by
Ben Greiner, Wirtschaftsuniversität Wien


  1. Foundation Priors By Sanjog Misra
  2. Mitigating Generative AI Hallucinations By Alessandro De Chiara; Ester Manna; Shubhranshu Singh
  3. The Economics of Professional Decision-Making: Can Artificial Intelligence Reduce Decision Uncertainty? By W Bentley MacLeod
  4. Artificial intelligence as a method of invention By Guillermo Arenas Díaz; Mariacristina Piva; Marco Vivarelli
  5. Automated data extraction from unstructured text using LLMs: A scalable workflow for Stata users By Loreta Isaraj
  6. This Candidate is [MASK]. Prompt-based Sentiment Extraction and Reference Letters By Slonimczyk, Fabian
  7. Artificial Intelligence for Detecting Price Surges Based on Network Features of Crypto Asset Transactions By Yuichi IKEDA; Hideaki AOYAMA; Tetsuo HATSUDA; Tomoyuki SHIRAI; Taro HASUI; Yoshimasa HIDAKA; Krongtum SANKAEWTONG; Hiroshi IYETOMI; Yuta YARAI; Abhijit CHAKRABORTY; Yasushi NAKAYAMA; Akihiro FUJIHARA; Pierluigi CESANA; Wataru SOUMA

  1. By: Sanjog Misra
    Abstract: Foundation models, and in particular large language models, can generate highly informative responses, prompting growing interest in using these "synthetic" outputs as data in empirical research and decision-making. This paper introduces the idea of a foundation prior, under which model-generated outputs are treated not as real observations but as draws from the prior predictive distribution induced by the foundation prior. Synthetic data therefore reflect both the model's learned patterns and the user's subjective priors, expectations, and biases. We model the subjectivity of the generative process by making explicit the dependence of synthetic outputs on the user's anticipated data distribution, the prompt-engineering process, and the trust placed in the foundation model. We derive the foundation prior as an exponentially tilted, generalized Bayesian update of the user's primitive prior, where a trust parameter governs the weight assigned to synthetic data. We then show how synthetic data and the associated foundation prior can be incorporated into standard statistical and econometric workflows, and discuss their use in applications such as refining complex models, informing latent constructs, guiding experimental design, and augmenting random-coefficient and partially linear specifications. By treating generative outputs as structured, explicitly subjective priors rather than as empirical observations, the framework offers a principled way to harness foundation models in empirical work while avoiding the conflation of synthetic "facts" with real data.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.01107
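    The exponentially tilted update described in the abstract above can be written out as a minimal sketch; the notation (primitive prior \pi_0, synthetic draws \tilde{y}, likelihood p, trust parameter \tau) is assumed here for illustration and is not taken from the paper:

      \pi_F(\theta \mid \tilde{y}) \;\propto\; \pi_0(\theta)\, p(\tilde{y} \mid \theta)^{\tau}, \qquad \tau \in [0, 1]

    Under this reading, \tau = 0 discards the synthetic data and recovers the primitive prior, while \tau = 1 would treat synthetic draws as if they were real observations; intermediate values downweight the synthetic "likelihood" according to the trust placed in the foundation model.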
  2. By: Alessandro De Chiara (Universitat de Barcelona); Ester Manna (Universitat de Barcelona); Shubhranshu Singh (Carey Business School, Johns Hopkins University)
    Abstract: We theoretically investigate whether AI developers or AI operators should be liable for the harm AI systems may cause when they hallucinate. We find that the optimal liability framework may vary over time with the evolution of AI technology, and that making AI operators liable can be desirable only if it induces monitoring of the AI systems. We also highlight non-trivial relationships between welfare and reputational concerns, human supervision ability, and the accuracy of the technology. Our results have implications for regulatory design and business strategies.
    Keywords: AI hallucinations, AI liability, AI supervision
    JEL: K2 L51
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ewp:wpaper:492web
  3. By: W Bentley MacLeod (Cowles Foundation for Research in Economics, Yale University)
    Abstract: This paper outlines an economic model that provides a framework for organising the growing literature on the performance of physicians and judges. The primary task of these professionals is to make decisions based on the information provided by their clients. The paper discusses professional decisions in terms of what Kahneman (2011) calls fast and slow decisions, known as System 1 and System 2 in cognitive science. Slow (System 2) decisions correspond to the economist's model of rational choice, while fast (System 1) decisions are high-speed, intuitive choices guided by training and human capital. This distinction is used to build a model of decision-making under uncertainty based on Bewley's (2011) theory of Knightian uncertainty, showing that human values are an essential input to optimal choice. This, in turn, provides conditions under which artificial intelligence (AI) tools can assist professional decision-making, while pointing to cases where such tools need to explicitly incorporate human values in order to make better decisions.
    Date: 2025–12–01
    URL: https://d.repec.org/n?u=RePEc:cwl:cwldpp:2475
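    The Bewley-style notion of Knightian uncertainty invoked in the abstract above can be sketched in standard textbook form; this formalization is illustrative and not taken from the paper. A decision maker entertains a set \Pi of priors and ranks action x above y only under unanimous dominance:

      E_{\pi}[u(x)] \ge E_{\pi}[u(y)] \quad \text{for all } \pi \in \Pi

    When neither action dominates across every prior, the ranking is incomplete and some further criterion (on the paper's reading, human values) is needed to select an action; this is the gap that purely predictive AI tools cannot fill on their own.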
  4. By: Guillermo Arenas Díaz (Dipartimento di Politica Economica, DISCE, Università Cattolica del Sacro Cuore, Milano, Italy); Mariacristina Piva (Dipartimento di Politica Economica, DISCE, Università Cattolica del Sacro Cuore, Piacenza, Italy); Marco Vivarelli (Dipartimento di Politica Economica, DISCE, Università Cattolica del Sacro Cuore, Milano, Italy – UNU-MERIT, Maastricht, The Netherlands – IZA, Bonn, Germany - Global Labor Organization (GLO), Essen, Germany)
    Abstract: This study investigates the relationship between Artificial Intelligence (AI) and innovation inputs in Spanish manufacturing firms. While AI is increasingly recognized as a driver of productivity and economic growth, its role in shaping firms’ innovation strategies remains underexplored. Using firm-level data, our analysis focuses on whether AI complements innovation inputs - specifically R&D and Embodied Technological Change (ETC) - and whether AI can be considered a Method of Invention, able to trigger subsequent innovation investments. Results show a positive association between AI adoption and both internal R&D and ETC, in both static and dynamic frameworks. Furthermore, the empirical evidence highlights heterogeneity, with important peculiarities affecting large vs small firms and high-tech vs low-tech companies. These findings suggest that AI may act as both a complement and a catalyst, depending on firm characteristics.
    Keywords: Artificial Intelligence, Method of Invention, R&D, Innovation Inputs, Innovative Complementarities
    JEL: O31 O32
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:ctc:serie5:dipe0052
  5. By: Loreta Isaraj (IRCrES-CNR)
    Abstract: In several data-rich domains such as finance, medicine, law, and scientific publishing, most of the valuable information is embedded in unstructured textual formats, from clinical notes and legal briefs to financial statements and research papers. These sources are rarely available in structured formats suitable for immediate quantitative analysis. This presentation introduces a scalable and fully integrated workflow that employs large language models (LLMs), specifically ChatGPT 4.0 via API, in conjunction with Python and Stata to extract structured variables from unstructured documents and make them ready for further statistical processing in Stata. As a representative use case, I demonstrate the extraction of information from a SOAP clinical note, treated as a typical example of unstructured medical documentation. The process begins with a single PDF and extends to an automated pipeline capable of batch-processing multiple documents, highlighting the scalability of this approach. The workflow involves PDF parsing and text preprocessing using Python, followed by prompt engineering designed to optimize the performance of the LLM. In particular, the temperature parameter is tuned to a low value (for example, 0.0–0.3) to promote deterministic and concise extraction, minimizing variation across similar documents and ensuring consistency in output structure. Once the LLM returns structured data, typically in JSON or CSV format, it is seamlessly imported into Stata using custom .do scripts that handle parsing (insheet), transformation (split, reshape), and data cleaning. The final dataset is used for exploratory or inferential analysis, with visualization and summary statistics executed entirely within Stata. The presentation also addresses critical considerations including the computational cost of using commercial LLM APIs (token-based billing), privacy and compliance risks when processing sensitive data (such as patient records), and the potential for bias or hallucination inherent to generative models. To assess the reliability of the extraction process, I report evaluation metrics such as cosine similarity (for text alignment and summarization accuracy) and F1-score (for evaluating named entity and numerical field extraction). By bridging the capabilities of LLMs with Stata’s powerful analysis tools, this workflow equips researchers and analysts with an accessible method to unlock structured insights from complex unstructured sources, extending the reach of empirical research into previously inaccessible text-heavy datasets.
    Date: 2025–10–01
    URL: https://d.repec.org/n?u=RePEc:boc:isug25:13
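    A minimal Python sketch of the extraction step described in the abstract above, assuming the pypdf and openai packages; the model name, prompt wording, field schema, and file paths are placeholders for illustration, not the author's exact setup:

      import csv
      import json
      from pypdf import PdfReader        # PDF text extraction
      from openai import OpenAI          # official OpenAI client; token-based billing applies

      client = OpenAI()                  # expects OPENAI_API_KEY in the environment

      def extract_fields(pdf_path):
          """Parse one PDF and ask the model for a fixed JSON schema."""
          text = "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
          prompt = (
              "Extract these fields from the SOAP note below and answer with JSON only: "
              '{"patient_age": number, "diagnosis": string, "medication": string}\n\n' + text
          )
          resp = client.chat.completions.create(
              model="gpt-4o",                          # placeholder model name
              messages=[{"role": "user", "content": prompt}],
              temperature=0.0,                         # low temperature for deterministic extraction
              response_format={"type": "json_object"}, # constrain output to valid JSON
          )
          return json.loads(resp.choices[0].message.content)

      # Batch step: process many notes into one CSV that Stata can read.
      rows = [extract_fields(p) for p in ["note1.pdf", "note2.pdf"]]   # placeholder paths
      with open("notes.csv", "w", newline="") as f:
          writer = csv.DictWriter(f, fieldnames=rows[0].keys())
          writer.writeheader()
          writer.writerows(rows)

    In Stata, the resulting file can then be read with import delimited (or the older insheet) before the split/reshape cleaning steps mentioned in the abstract.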
  6. By: Slonimczyk, Fabian
    Abstract: I propose a relatively simple way to deploy pre-trained large language models (LLMs) to extract sentiment and other useful features from text data. The method, which I refer to as prompt-based sentiment extraction, offers multiple advantages over other methods used in economics and finance. In particular, it accepts the text input as is (without preprocessing) and produces a sentiment score that has a probability interpretation. Unlike other LLM-based approaches, it does not require any fine-tuning or labeled data. I apply my prompt-based strategy to a hand-collected corpus of confidential reference letters (RLs). I show that the sentiment content of RLs is clearly reflected in job market outcomes: candidates with higher average sentiment in their RLs perform markedly better regardless of the measure of success chosen. Moreover, I show that sentiment dispersion among letter writers negatively affects the job market candidate’s performance. I compare my sentiment extraction approach to other commonly used methods for sentiment analysis: ‘bag-of-words’ approaches, fine-tuned language models, and querying advanced chatbots. No other method fully reproduces the results obtained by prompt-based sentiment extraction. Finally, I slightly modify the method to obtain ‘gendered’ sentiment scores (as in Eberhardt et al., 2023). I show that RLs written for female candidates emphasize ‘grindstone’ personality traits, whereas male candidates’ letters emphasize ‘standout’ traits. These gender differences negatively affect women’s job market outcomes.
    Keywords: Large language models; text data; sentiment analysis; reference letters
    JEL: C45 J16 M51
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:126675
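    One plausible instantiation of the prompt-based extraction idea in the abstract above, using the Hugging Face transformers fill-mask pipeline; the checkpoint and target words are assumptions for illustration, not the author's exact configuration:

      from transformers import pipeline

      # Placeholder model choice; the paper does not commit to this checkpoint here.
      fill = pipeline("fill-mask", model="bert-base-uncased")

      def sentiment_score(letter_text):
          """Append a cloze prompt; score = P(positive) / (P(positive) + P(negative))."""
          # Long letters would need truncation to BERT's 512-token window.
          prompt = letter_text + " This candidate is [MASK]."
          preds = fill(prompt, targets=["excellent", "terrible"])  # illustrative targets
          scores = {p["token_str"]: p["score"] for p in preds}
          pos = scores.get("excellent", 0.0)
          neg = scores.get("terrible", 0.0)
          return pos / (pos + neg)       # lies in [0, 1], with a probability reading

      print(sentiment_score("She writes superb papers and leads every seminar."))

    Restricting the masked-token distribution to a positive and a negative target and renormalizing is what gives the score its probability interpretation; no fine-tuning or labeled data is involved.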
  7. By: Yuichi IKEDA; Hideaki AOYAMA; Tetsuo HATSUDA; Tomoyuki SHIRAI; Taro HASUI; Yoshimasa HIDAKA; Krongtum SANKAEWTONG; Hiroshi IYETOMI; Yuta YARAI; Abhijit CHAKRABORTY; Yasushi NAKAYAMA; Akihiro FUJIHARA; Pierluigi CESANA; Wataru SOUMA
    Abstract: This study proposes an artificial intelligence framework to detect price surges in crypto assets by leveraging network features extracted from transaction data. Motivated by the challenges in Anti-Money Laundering, Countering the Financing of Terrorism, and Counter-Proliferation Financing, we focus on structural features within crypto asset networks that may precede extreme market events. Building on theories from complex network analysis and rate-induced tipping, we characterize early warning signals. Granger causality is applied for feature selection, identifying network dynamics that causally precede price movements. To quantify surge likelihood, we employ a Boltzmann machine as a generative model to derive nonlinear indicators that are sensitive to critical shifts in transactional topology. Furthermore, we develop a method to trace back and identify individual nodes that contribute significantly to price surges. The findings have practical implications for investors, risk management officers, regulatory supervision by financial authorities, and the evaluation of systemic risk. This framework presents a novel approach to integrating explainable AI, financial network theory, and regulatory objectives in crypto asset markets.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:eti:dpaper:25113
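    The Granger-causality feature-selection step named in the abstract above can be sketched as follows, assuming statsmodels; the series, lag order, and significance threshold are placeholders for illustration:

      import numpy as np
      from statsmodels.tsa.stattools import grangercausalitytests

      rng = np.random.default_rng(0)
      price = rng.normal(size=300)                     # placeholder: crypto price changes
      features = {                                     # placeholder network features
          "degree_entropy": rng.normal(size=300),
          "clustering_coef": rng.normal(size=300),
      }

      selected = []
      for name, series in features.items():
          # statsmodels convention: test whether column 2 Granger-causes column 1.
          res = grangercausalitytests(np.column_stack([price, series]), maxlag=5)
          pvalues = [res[lag][0]["ssr_ftest"][1] for lag in res]
          if min(pvalues) < 0.05:                      # placeholder threshold
              selected.append(name)

      print("features that Granger-cause price:", selected)

    Features passing such a screen would then feed the downstream surge indicator (in the paper, a Boltzmann-machine-based generative model), keeping only network dynamics that precede price movements.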

This nep-ain issue is ©2025 by Ben Greiner. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.