nep-ain New Economics Papers
on Artificial Intelligence
Issue of 2025–12–15
fourteen papers chosen by
Ben Greiner, Wirtschaftsuniversität Wien


  1. Equalizer or amplifier? How AI may reshape human cognitive differences By Maria Bigoni; Andrea Ichino; Aldo Rustichini; Giulio Zanella
  2. My Advisor, Her AI and Me: Evidence from a Field Experiment on Human-AI Collaboration and Investment Decisions By Yang, Cathy L.; Bauer, Kevin; Li, Xitong; Hinz, Oliver
  3. Tacit Bidder-Side Collusion: Artificial Intelligence in Dynamic Auctions By Sriram Tolety
  4. Barriers to AI Adoption: Image Concerns at Work By David Almog
  5. From FLOPs to Footprints: The Resource Cost of Artificial Intelligence By Sophia Falk; Nicholas Kluge Corrêa; Sasha Luccioni; Lisa Biber-Freudenberger; Aimee van Wynsberghe
  6. Beyond Automation: Redesigning Jobs with LLMs to Enhance Productivity By Andrew Ledingham; Michael Hollins; Matthew Lyon; David Gillespie; Umar Yunis-Guerra; Jamie Siviter; David Duncan; Oliver P. Hauser
  7. The impact of AI exposure on labour market outcomes and well-being: Evidence from Australia By Duran Vanegas, Juan; Tuda, Dora
  8. The Potential Distributive Impact of AI-driven Labor Changes in Latin America By Matias Ciaschi; Guillermo Falcone; Santiago Garganta; Leonardo Gasparini; Octavio Bertín; Lucía Ramirez-Leira
  9. The Impact of Artificial Intelligence on Enterprise Decision-Making Process By Ernest Górka; Dariusz Baran; Gabriela Wojak; Michał Ćwiąkała; Sebastian Zupok; Dariusz Starkowski; Dariusz Reśko; Oliwia Okrasa
  10. Does Firm-Level AI Adoption Improve Early-Warning of Corporate Financial Distress? Evidence from Chinese Non-Financial Firms By Frederik Rech; Fanchen Meng; Hussam Musa; Martin Šebeňa; Siele Jean Tuo
  11. Detecting AI Hallucinations in Finance: An Information-Theoretic Method Cuts Hallucination Rate by 92% By Mainak Singha
  12. A Hybrid Architecture for Options Wheel Strategy Decisions: LLM-Generated Bayesian Networks for Transparent Trading By Xiaoting Kuang; Boken Lin
  13. Can technology augment order writing capacity at regulators? By Natasha Aggarwal; Satyavrat Bondre; Amrutha Desikan; Bhavin Patel; Dipyaman Sanyal
  14. Narratives to Numbers: Large Language Models and Economic Policy Uncertainty By Ethan Hartley

  1. By: Maria Bigoni; Andrea Ichino; Aldo Rustichini; Giulio Zanella
    Abstract: Machines have at times equalized differences in physical strength by substituting for human effort, and at other times amplified them. Artificial intelligence (AI) may likewise narrow or widen disparities in cognitive ability. Recent evidence from the Information and Communication Technology (ICT) revolution suggests that computers increased inequality by education but reduced it by cognitive ability. Early research on generative AI shows larger productivity gains for less-skilled than for high-skilled workers. Whether AI ultimately acts as an equalizer or an amplifier of human cognitive differences is especially crucial for education systems, which must decide whether -- and how -- to allow students to use AI in coursework and exams. This decision is urgent because employers value workers who can leverage AI effectively rather than operate independently of it.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.03902
  2. By: Yang, Cathy L. (HEC Paris - Department of Information Systems and Operations Management); Bauer, Kevin (Goethe University Frankfurt; Leibniz Institute for Financial Research SAFE); Li, Xitong (HEC Paris); Hinz, Oliver (Goethe University Frankfurt - Faculty of Economics and Business Administration)
    Abstract: Amid ongoing policy and managerial debates on keeping humans in the loop of AI decision-making processes, we investigate whether human involvement in AI-based service production benefits downstream consumers. Partnering with a large savings bank in Europe, we produced pure AI and human-AI collaborative investment advice, which we passed to the bank's customers, and investigated the degree of their advice-taking in a field experiment. On the production side, contrary to concerns that humans might inefficiently override AI output, our findings show that having a human banker in the loop of AI-based financial advisory, by giving her the final say over the advice provided, does not compromise the quality of the advice. More importantly, on the consumption side, we find that the bank's customers are more likely to align their final investment decisions with advice from the human-AI collaboration, compared to pure AI, especially when facing riskier investments. In our setting, this increased reliance on human-AI collaborative advice leads to higher material welfare for consumers. Additional analyses from the field experiment, along with an online controlled experiment, indicate that the persuasive efficacy of human-AI collaborative advice cannot be attributed to consumers' belief in increased advice quality resulting from complementarities between human and AI capabilities. Instead, the consumption-side benefits of human involvement in the AI-based service largely stem from human involvement serving as a peripheral cue that enhances the affective appeal of the advice. Our findings indicate that regulations and guidelines should adopt a consumer-centric approach by fostering environments where human capabilities and AI systems can synergize effectively to benefit consumers while safeguarding consumer welfare. These nuanced insights are crucial for managers who face decisions about offering pure AI versus human-AI collaborative services, and also for regulators advocating for having humans in the loop.
    Keywords: Human intervention; human-in-the-loop; human-AI collaboration; algorithmic aversion; social influence
    JEL: O30
    Date: 2025–06–04
    URL: https://d.repec.org/n?u=RePEc:ebg:heccah:1570
  3. By: Sriram Tolety
    Abstract: We study whether large language models acting as autonomous bidders can tacitly collude by coordinating on when to accept platform-posted payouts in repeated Dutch auctions, without any communication. We present a minimal repeated-auction model that yields a simple incentive-compatibility condition and a closed-form threshold for sustainable collusion in subgame-perfect Nash equilibria. In controlled simulations with multiple language models, we observe systematic supra-competitive prices in small auction settings and a return to competitive behavior as the number of bidders in the market increases, consistent with the theoretical model. We also find that LLMs use various mechanisms to facilitate tacit coordination, such as focal-point acceptance timing, as opposed to patient strategies that track the theoretical incentives. The results provide, to our knowledge, the first evidence of bidder-side tacit collusion by LLMs and show that market-structure levers can be more effective than capability limits for mitigation.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.21802
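    Illustration: the abstract does not reproduce the closed-form threshold, but a standard grim-trigger benchmark conveys its shape. Assume (this is a textbook stylization, not the paper's exact model) that n symmetric bidders share a collusive per-period surplus \pi equally, while a deviator captures \pi once before permanent reversion to competitive, zero-surplus play. Collusion is incentive compatible when

      \frac{\pi / n}{1 - \delta} \;\ge\; \pi
      \quad\Longleftrightarrow\quad
      \delta \;\ge\; \frac{n - 1}{n},

    where \delta is the common discount factor. The threshold rises toward 1 as n grows, consistent with the reported return to competitive behavior as the number of bidders increases.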
  4. By: David Almog
    Abstract: Concerns about how workers are perceived can deter effective collaboration with artificial intelligence (AI). In a field experiment on a large online labor market, I hired 450 U.S.-based remote workers to complete an image-categorization job assisted by AI recommendations. Workers were incentivized by the prospect of a contract extension based on an HR evaluator's feedback. I find that workers adopt AI recommendations at lower rates when their reliance on AI is visible to the evaluator, resulting in a measurable decline in task performance. The effects are present despite a conservative design in which workers know that the evaluator is explicitly instructed to assess expected accuracy on the same AI-assisted task. This reduction in AI reliance persists even when the evaluator is reassured about workers' strong performance history on the platform, underscoring how difficult these concerns are to alleviate. Leveraging the platform's public feedback feature, I introduce a novel incentive-compatible elicitation method showing that workers fear that heavy reliance on AI signals a lack of confidence in their own judgment, a trait they view as essential when collaborating with AI.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.18582
  5. By: Sophia Falk; Nicholas Kluge Corr\^ea; Sasha Luccioni; Lisa Biber-Freudenberger; Aimee van Wynsberghe
    Abstract: As computational demands continue to rise, assessing the environmental footprint of AI requires moving beyond energy and water consumption to include the material demands of specialized hardware. This study quantifies the material footprint of AI training by linking computational workloads to physical hardware needs. The elemental composition of the Nvidia A100 SXM 40 GB graphics processing unit (GPU) was analyzed using inductively coupled plasma optical emission spectroscopy, which identified 32 elements. The results show that AI hardware consists of about 90% heavy metals and only trace amounts of precious metals. The elements copper, iron, tin, silicon, and nickel dominate the GPU composition by mass. In a multi-step methodology, we integrate these measurements with computational throughput per GPU across varying lifespans, accounting for the computational requirements of training specific AI models at different training efficiency regimes. Scenario-based analyses reveal that, depending on Model FLOPs Utilization (MFU) and hardware lifespan, training GPT-4 requires between 1,174 and 8,800 A100 GPUs, corresponding to the extraction and eventual disposal of up to 7 tons of toxic elements. Combined software and hardware optimization strategies can reduce material demands: increasing MFU from 20% to 60% lowers GPU requirements by 67%, while extending lifespan from 1 to 3 years yields comparable savings; implementing both measures together reduces GPU needs by up to 93%. Our findings highlight that incremental performance gains, such as those observed between GPT-3.5 and GPT-4, come at disproportionately high material costs. The study underscores the necessity of incorporating material resource considerations into discussions of AI scalability, emphasizing that future progress in AI must align with principles of resource efficiency and environmental responsibility.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.04142
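    The scenario arithmetic is easy to reproduce. A minimal sketch in Python, assuming a commonly cited external estimate of GPT-4 training compute (about 2.1e25 FLOPs) and the A100's public peak dense BF16 throughput (312 TFLOP/s); neither figure comes from the paper itself, so the outputs only approximate its reported 1,174-8,800 range:

      # GPU count needed to deliver a training run within the hardware lifespan.
      # TRAIN_FLOPS and PEAK_FLOPS are assumptions, not the paper's inputs.
      PEAK_FLOPS = 312e12                    # A100 SXM peak dense BF16, FLOP/s
      TRAIN_FLOPS = 2.1e25                   # assumed GPT-4 training compute
      SECONDS_PER_YEAR = 365 * 24 * 3600

      def gpus_needed(mfu: float, lifespan_years: float) -> float:
          """GPUs required at a given Model FLOPs Utilization and lifespan."""
          per_gpu = PEAK_FLOPS * mfu * lifespan_years * SECONDS_PER_YEAR
          return TRAIN_FLOPS / per_gpu

      for mfu in (0.2, 0.4, 0.6):
          for years in (1, 2, 3):
              print(f"MFU={mfu:.0%}, lifespan={years}y -> ~{gpus_needed(mfu, years):,.0f} GPUs")

    Because the required count scales inversely with both MFU and lifespan, tripling either cuts it to a third, and tripling both cuts it to roughly a ninth, which is the mechanism behind the paper's combined-savings scenario.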
  6. By: Andrew Ledingham; Michael Hollins; Matthew Lyon; David Gillespie; Umar Yunis-Guerra; Jamie Siviter; David Duncan; Oliver P. Hauser
    Abstract: The adoption of generative artificial intelligence (AI) is predicted to lead to fundamental shifts in the labour market, resulting in displacement or augmentation of AI-exposed roles. To investigate the impact of AI across a large organisation, we assessed AI exposure at the task level within roles at the UK Civil Service (UKCS). Using a novel dataset of UKCS job adverts, covering 193,497 vacancies over 6 years, our large language model (LLM)-driven analysis estimated AI exposure scores for 1,542,411 tasks. By aggregating AI exposure scores for tasks within each role, we calculated the mean and variance of job-level exposure to AI, highlighting the heterogeneous impacts of AI, even for seemingly identical jobs. We then use an LLM to redesign jobs, focusing on task automation, task optimisation, and task reallocation. We find that the redesign process leads to tasks where humans have comparative advantage over AI, including strategic leadership, complex problem resolution, and stakeholder management. Overall, automation and augmentation are expected to have nuanced effects across all levels of the organisational hierarchy. Most of the economic value of AI is expected to arise from productivity gains rather than role displacement. We contribute to the automation, augmentation, and productivity debates, and advance our understanding of job redesign in the age of AI.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.05659
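    The job-level aggregation the abstract describes is a simple two-moment summary. A minimal sketch in Python, with invented task scores standing in for the LLM-estimated exposure values:

      # Aggregate task-level AI exposure scores into job-level mean and variance.
      # Scores are hypothetical; in the paper they come from an LLM reading
      # task descriptions in UKCS job adverts.
      from statistics import mean, pvariance

      task_exposure = {
          "vacancy_A": [0.9, 0.1, 0.8, 0.2],   # mixed tasks: same mean, high variance
          "vacancy_B": [0.5, 0.5, 0.5, 0.5],   # uniform tasks: same mean, zero variance
      }

      for job, scores in task_exposure.items():
          print(f"{job}: mean={mean(scores):.2f}, variance={pvariance(scores):.3f}")

    Both hypothetical vacancies have mean exposure 0.50, but only the first mixes highly exposed tasks with insulated ones; that is the heterogeneity across seemingly identical jobs that the variance term surfaces.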
  7. By: Duran Vanegas, Juan; Tuda, Dora
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:esr:wpaper:wp808
  8. By: Matias Ciaschi (CEDLAS-IIE-FCE-UNLP and CONICET); Guillermo Falcone (CEDLAS-IIE-FCE-UNLP and CONICET); Santiago Garganta (CEDLAS-IIE-FCE-UNLP); Leonardo Gasparini (CEDLAS-IIE-FCE-UNLP and CONICET); Octavio Bertín (CEDLAS-IIE-FCE-UNLP); Lucía Ramirez-Leira (CEDLAS-IIE-FCE-UNLP)
    Abstract: This paper investigates the potential distributional consequences of artificial intelligence (AI) adoption in Latin American labor markets. Using harmonized household survey data from 14 countries, we combine four recently developed AI occupational exposure indices—the AI Occupational Exposure Index (AIOE), the Complementarity-Adjusted AIOE (C-AIOE), the Generative AI Exposure Index (GBB), and the AI-Generated Occupational Exposure Index (GENOE)—to analyze patterns across countries and worker groups. We validate these measures by comparing task profiles between Latin America and high-income economies using PIAAC data, and develop a contextual adjustment that incorporates informality, wage structures, and union coverage. Finally, we simulate first-order impacts of AI-induced displacement on earnings, poverty, and inequality. The results show substantial heterogeneity, with higher levels of AI-related risk among women, younger, more educated, and formal workers. Indices that account for task complementarities show flatter gradients across the income and education distribution. Simulations suggest that displacement effects may lead to only moderate increases in inequality and poverty in the absence of mitigating policies.
    JEL: O33 J21 D31
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:dls:wpaper:0361
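    The first-order simulation logic fits in a few lines of Python; the earnings, exposure values, and displacement rule below are invented for illustration, not taken from the paper:

      # Cut the earnings of workers flagged as displaced by an exposure index
      # and recompute inequality. A first-order exercise: no labor supply
      # responses, reallocation, or price effects.
      def gini(incomes):
          """Gini coefficient of a list of non-negative incomes."""
          xs = sorted(incomes)
          n, total = len(xs), sum(xs)
          cum = sum((i + 1) * v for i, v in enumerate(xs))
          return (2 * cum) / (n * total) - (n + 1) / n

      earnings = [300, 500, 800, 1200, 2500, 4000]   # hypothetical monthly wages
      exposure = [0.2, 0.3, 0.7, 0.8, 0.6, 0.4]      # hypothetical AI exposure
      THRESHOLD, WAGE_LOSS = 0.75, 0.5               # displaced if exposure > 0.75

      shocked = [w * (1 - WAGE_LOSS) if e > THRESHOLD else w
                 for w, e in zip(earnings, exposure)]
      print(f"Gini before: {gini(earnings):.3f}, after: {gini(shocked):.3f}")

    Because measured exposure in the region is concentrated among more educated, formal workers, such shocks need not fall on the bottom of the distribution, which may help explain why the simulated increases in inequality and poverty are only moderate.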
  9. By: Ernest Górka; Dariusz Baran; Gabriela Wojak; Michał Ćwiąkała; Sebastian Zupok; Dariusz Starkowski; Dariusz Reśko; Oliwia Okrasa
    Abstract: Artificial intelligence improves enterprise decision-making by accelerating data analysis, reducing human error, and supporting evidence-based choices. A quantitative survey of 92 companies across multiple industries examines how AI adoption influences managerial performance, decision efficiency, and organizational barriers. Results show that 93 percent of firms use AI, primarily in customer service, data forecasting, and decision support. AI systems increase the speed and clarity of managerial decisions, yet implementation faces challenges. The most frequent barriers include employee resistance, high costs, and regulatory ambiguity. Respondents indicate that organizational factors are more significant than technological limitations. Critical competencies for successful AI use include understanding algorithmic mechanisms and change management. Technical skills such as programming play a smaller role. Employees report difficulties in adapting to AI tools, especially when formulating prompts or accepting system outputs. The study highlights the importance of integrating AI with human judgment and communication practices. When supported by adaptive leadership and transparent processes, AI adoption enhances organizational agility and strengthens decision-making performance. These findings contribute to ongoing research on how digital technologies reshape management and the evolution of hybrid human-machine decision environments.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.02048
  10. By: Frederik Rech (School of Economics, Beijing Institute of Technology, Beijing, China); Fanchen Meng (Faculty of Economics, Shenzhen MSU-BIT University, Shenzhen, China); Hussam Musa (Faculty of Economics, Matej Bel University, Banská Bystrica, Slovakia); Martin Šebeňa (Faculty of Arts and Social Sciences, Hong Kong Baptist University, Hong Kong, China); Siele Jean Tuo (Business School, Liaoning University, Shenyang, China)
    Abstract: This study investigates whether firm-level artificial intelligence (AI) adoption improves the out-of-sample prediction of corporate financial distress models beyond traditional financial ratios. Using a sample of Chinese listed firms (2008-2023), we address sparse AI data with a novel pruned training window method, testing multiple machine learning models. We find that AI adoption consistently increases predictive accuracy, with the largest gains in recall rates for identifying distressed firms. Tree-based models and AI density metrics proved most effective. Crucially, models using longer histories outperformed those relying solely on recent "AI-rich" data. The analysis also identifies divergent adoption patterns, with healthy firms exhibiting earlier and higher AI uptake than distressed peers. These findings, while based on Chinese data, provide a framework for early-warning signals and demonstrate the broader potential of AI metrics as a stable, complementary risk indicator distinct from traditional accounting measures.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.02510
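    The model comparison at the core of the study can be mimicked on synthetic data. A minimal sketch in Python with scikit-learn; the data-generating rule below is invented, whereas the real study uses Chinese listed firms over 2008-2023 and a pruned training window:

      # Compare recall of a tree-based distress classifier with and without a
      # firm-level AI-adoption feature. All data are synthetic.
      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.metrics import recall_score
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      n = 2000
      ratios = rng.normal(size=(n, 5))                   # traditional financial ratios
      ai_density = rng.poisson(2, size=n).astype(float)  # e.g., AI mentions per report
      # Synthetic rule: distress rises with leverage (ratios[:, 0]) and falls
      # with AI adoption, echoing the divergent adoption patterns reported.
      logit = 0.5 + ratios[:, 0] - 0.4 * ai_density + rng.normal(scale=0.5, size=n)
      y = (logit > 0).astype(int)                        # 1 = financial distress

      for name, X in [("ratios only", ratios),
                      ("ratios + AI", np.column_stack([ratios, ai_density]))]:
          Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
          clf = GradientBoostingClassifier(random_state=0).fit(Xtr, ytr)
          print(f"{name}: recall = {recall_score(yte, clf.predict(Xte)):.3f}")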
  11. By: Mainak Singha
    Abstract: Large language models (LLMs) produce fluent but unsupported answers (hallucinations), limiting safe deployment in high-stakes domains. We propose ECLIPSE, a framework that treats hallucination as a mismatch between a model's semantic entropy and the capacity of available evidence. We combine entropy estimation via multi-sample clustering with a novel perplexity decomposition that measures how models use retrieved evidence. We prove that under mild conditions, the resulting entropy-capacity objective is strictly convex with a unique stable optimum. We evaluate on a controlled financial question answering dataset with GPT-3.5-turbo (n=200 balanced samples with synthetic hallucinations), where ECLIPSE achieves ROC AUC of 0.89 and average precision of 0.90, substantially outperforming a semantic entropy-only baseline (AUC 0.50). A controlled ablation with Claude-3-Haiku, which lacks token-level log probabilities, shows AUC dropping to 0.59 with coefficient magnitudes decreasing by 95%, demonstrating that ECLIPSE is a logprob-native mechanism whose effectiveness depends on calibrated token-level uncertainties. The perplexity decomposition features exhibit the largest learned coefficients, confirming that evidence utilization is central to hallucination detection. We position this work as a controlled mechanism study; broader validation across domains and naturally occurring hallucinations remains future work.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.03107
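    The semantic-entropy component fits in a short sketch. A minimal Python version, where the equivalence check is a crude stub (case-insensitive exact match); ECLIPSE's actual clustering, perplexity decomposition, and entropy-capacity objective are not reproduced here:

      # Sample several answers, cluster semantically equivalent ones, and take
      # the entropy of the cluster frequencies. High entropy means the model's
      # answers disagree, a classic hallucination warning sign.
      import math

      def semantic_entropy(samples,
                           equivalent=lambda a, b: a.strip().lower() == b.strip().lower()):
          clusters = []
          for s in samples:
              for c in clusters:
                  if equivalent(s, c[0]):
                      c.append(s)
                      break
              else:
                  clusters.append([s])
          n = len(samples)
          return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

      answers = ["Revenue rose 4%", "revenue rose 4%", "Revenue fell 2%", "Revenue rose 4%"]
      print(f"semantic entropy: {semantic_entropy(answers):.3f} nats")

    In practice the equivalence test would itself be model-based (e.g., bidirectional entailment), and ECLIPSE additionally conditions on how well retrieved evidence explains each answer via its perplexity decomposition.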
  12. By: Xiaoting Kuang; Boken Lin
    Abstract: Large Language Models (LLMs) excel at understanding context and qualitative nuances but struggle with the rigorous and transparent reasoning required in high-stakes quantitative domains such as financial trading. We propose a model-first hybrid architecture for the options "wheel" strategy that combines the strengths of LLMs with the robustness of a Bayesian Network. Rather than using the LLM as a black-box decision-maker, we employ it as an intelligent model builder. For each trade decision, the LLM constructs a context-specific Bayesian network by interpreting current market conditions, including prices, volatility, trends, and news, and hypothesizing relationships among key variables. The LLM also selects relevant historical data from an 18.75-year, 8,919-trade dataset to populate the network's conditional probability tables. This selection focuses on scenarios analogous to the present context. The instantiated Bayesian network then performs transparent probabilistic inference, producing explicit probability distributions and risk metrics to support decision-making. A feedback loop enables the LLM to analyze trade outcomes and iteratively refine subsequent network structures and data selection, learning from both successes and failures. Empirically, our hybrid system demonstrates effective performance on the wheel strategy. Over nearly 19 years of out-of-sample testing, it achieves a 15.3% annualized return with significantly superior risk-adjusted performance (Sharpe ratio 1.08 versus 0.62 for market benchmarks) and dramatically lower drawdown (-8.2% versus -60%) while maintaining a 0% assignment rate through strategic option rolling. Crucially, each trade decision is fully explainable, involving on average 27 recorded decision factors (e.g., volatility level, option premium, risk indicators, market context).
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.01123
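    The division of labor is: the LLM proposes structure and selects analogous history; the Bayesian network does the arithmetic. A toy Python version of the inference step, using a two-node network whose numbers are invented (in the paper's pipeline, the LLM would populate the conditional probability table from analogous historical trades):

      # Two-node network: market trend -> assignment risk for a sold put.
      p_trend = {"bullish": 0.55, "neutral": 0.30, "bearish": 0.15}    # hypothetical prior
      p_assign = {"bullish": 0.04, "neutral": 0.10, "bearish": 0.35}   # hypothetical CPT

      # Marginal probability the put finishes in the money (assignment).
      marginal = sum(p_trend[t] * p_assign[t] for t in p_trend)
      print(f"P(assignment) = {marginal:.3f}")

      # Posterior over trend given assignment (Bayes' rule), fully auditable.
      posterior = {t: p_trend[t] * p_assign[t] / marginal for t in p_trend}
      print({t: round(p, 3) for t, p in posterior.items()})

    Every quantity that enters the trade decision is an explicit probability, which is what makes the roughly 27 decision factors recorded per trade auditable and explainable.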
  13. By: Natasha Aggarwal (TrustBridge Rule of Law Foundation); Satyavrat Bondre (Dono Consulting); Amrutha Desikan (TrustBridge Rule of Law Foundation); Bhavin Patel (TrustBridge Rule of Law Foundation); Dipyaman Sanyal (Dono Consulting)
    Abstract: This paper critically examines the opportunities and challenges of using technology, in particular Large Language Models (LLMs), to assist regulatory order writing in quasi-judicial settings, with a focus on the Indian context. The paper proposes augmenting rather than replacing human decision-makers, aiming to improve regulatory order writing practice through responsible use of LLMs. It identifies the core principles of administrative law that must be upheld in these settings — such as application of mind, reasoned orders, non-arbitrariness, rules against bias, and transparency — and analyses how inherent limitations of LLMs, including their probabilistic reasoning, opacity, potential for bias, confabulation, and lack of metacognition, may undermine these principles. The paper reviews international frameworks and case studies from various jurisdictions, highlighting common design principles like human oversight, transparency, non-discrimination, and security. It proposes a comprehensive Problem-Solution-Evaluation (PSE) framework for responsibly integrating LLMs into order writing processes. This framework maps specific technical, design, and systemic solutions to each identified risk, and outlines evaluation strategies — end-to-end, component-wise, human-in-the-loop, and automated — to ensure ongoing alignment with legal standards. The article concludes with practical recommendations for the development and deployment of LLM-based systems in regulatory environments.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:bjd:wpaper:16
  14. By: Ethan Hartley
    Abstract: This study evaluates large language models as estimable classifiers and clarifies how modeling choices shape downstream measurement error. Revisiting the Economic Policy Uncertainty index, we show that contemporary classifiers substantially outperform dictionary rules, better track human audit assessments, and extend naturally to noisy historical and multilingual news. We use these tools to construct a new nineteenth-century U.S. index from more than 360 million newspaper articles and exploratory cross-country indices with a single multilingual model. Taken together, our results show that LLMs can systematically improve text-derived measures and should be integrated as explicit measurement tools in empirical economics.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2511.17866
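    The measurement pipeline reduces to classifying articles and aggregating shares by period. A minimal sketch in Python, where classify() is a keyword stub standing in for the LLM call, and the articles, period labels, and normalisation are invented for illustration:

      # Build a monthly index as the share of articles flagged as discussing
      # economic policy uncertainty, scaled to mean 100 (a simplified version
      # of the usual EPU normalization).
      from collections import defaultdict

      def classify(article: str) -> bool:
          """Stub for an LLM classifier; the paper shows such classifiers
          outperform keyword dictionaries like this one."""
          text = article.lower()
          return "policy" in text and "uncertain" in text

      articles = [                                  # (month, text), invented
          ("1885-01", "Tariff policy outlook remains uncertain, merchants say."),
          ("1885-01", "Railway earnings steady this quarter."),
          ("1885-02", "Congress settles the currency question; doubts recede."),
      ]

      counts = defaultdict(lambda: [0, 0])
      for month, text in articles:
          counts[month][0] += classify(text)
          counts[month][1] += 1

      raw = {m: hits / total for m, (hits, total) in counts.items()}
      scale = 100 / (sum(raw.values()) / len(raw))
      print({m: round(v * scale, 1) for m, v in raw.items()})

    The paper's contribution is to replace the keyword stub with calibrated LLM classifiers, which better track human audit assessments and extend to noisy nineteenth-century scans and multilingual text.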

This nep-ain issue is ©2025 by Ben Greiner. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.