| on Artificial Intelligence |
| By: | Rainer Michael Rilke (WHU - Otto Beisheim School of Management); Dirk Sliwka (University of Cologne) |
| Abstract: | A large body of research across management, psychology, accounting, and economics shows that subjective performance evaluations are systematically biased: ratings cluster near the midpoint of scales and are often excessively lenient. As organizations increasingly adopt large language models (LLMs) for evaluative tasks, little is known about how these systems perform when assessing human performance. We document that, in the absence of clear objective standards and when individuals are rated independently, LLMs reproduce the familiar patterns of human raters. However, LLMs generate greater dispersion and accuracy when evaluating multiple individuals simultaneously. With noisy but objective performance signals, LLMs provide substantially more accurate evaluations than human raters, as they (i) are less subject to biases arising from concern for the evaluated employee and (ii) make fewer mistakes in information processing, closely approximating rational Bayesian benchmarks. |
| Keywords: | Performance Evaluation, Large Language Models, Signal Objectivity, Algorithmic Judgment, Gen-AI |
| JEL: | J24 J28 M12 M53 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:ajk:ajkdps:384 |
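
The rational Bayesian benchmark mentioned in the abstract above can be made concrete with a small normal-normal updating example. This is an illustrative sketch of the general benchmark, not the paper's calibration; the prior, noise variance, observed signal, and the stylized human rating are all hypothetical.

```python
# Minimal sketch of a rational Bayesian benchmark for rating performance from a noisy
# objective signal (normal-normal model). All numbers are hypothetical illustrations,
# not taken from the paper.

def bayesian_rating(signal, prior_mean=50.0, prior_var=100.0, noise_var=400.0):
    """Posterior mean of true performance given one noisy signal."""
    weight = prior_var / (prior_var + noise_var)   # shrinkage toward the prior mean
    return prior_mean + weight * (signal - prior_mean)

benchmark = bayesian_rating(80.0)          # 50 + 0.2 * (80 - 50) = 56.0
lenient_human_rating = 70.0                # stylized compressed, lenient rating for contrast
print(benchmark, lenient_human_rating)
```

The gap between the two numbers is the kind of deviation from the Bayesian benchmark that the paper attributes to leniency and information-processing mistakes in human raters.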
| By: | Felix Chopra (Frankfurt School of Finance & Management, CESifo); Ingar Haaland (NHH Norwegian School of Economics, FAIR, CEPR, NTNU); Nicolas Roever (University of Cologne); Christopher Roth (University of Cologne and ECONtribute, Max Planck Institute for Behavioral Economics, CEPR, NHH) |
| Abstract: | We test the effectiveness of different AI-delivered conversation protocols to increase people's motivation for change. In a large-scale experiment with 2,719 social media users, we randomly assign participants to a control conversation or one of three treatment arms: two Motivational Interviewing protocols promoting self-persuasion (change focus or decisional balance) and a direct persuasion protocol providing unsolicited advice and information. All conversations are led by an AI interviewer, enabling standardized delivery of each protocol at scale. Our results show that all three interventions significantly increase motivation for change and the perceived costs of social media use, with change-focused self-persuasion yielding the largest effects. These effects persist and translate into self-reported reductions in social media use more than two weeks after the intervention. Our findings illustrate how AI-led conversations can serve as a scalable platform both for delivering behavioral interventions and for testing what makes them effective by systematically varying how conversations are conducted. |
| Keywords: | AI interviews, Scaling, Motivation, Persuasion, Social Media, Beliefs |
| JEL: | C90 D83 D91 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:ajk:ajkdps:385 |
| By: | Paweł Niszczota; Cassandra Grützner |
| Abstract: | The rapid spread of large language models (LLMs) has raised concerns about the social reactions they provoke. Prior research documents negative attitudes toward AI users, but it remains unclear whether such disapproval translates into costly action. We address this question in a two-phase online experiment (N = 491 Phase II participants; Phase I provided targets) where participants could spend part of their own endowment to reduce the earnings of peers who had previously completed a real-effort task with or without LLM support. On average, participants destroyed 36% of the earnings of those who relied exclusively on the model, with punishment increasing monotonically with actual LLM use. Disclosure about LLM use created a credibility gap: self-reported null use was punished more harshly than actual null use, suggesting that declarations of "no use" are treated with suspicion. Conversely, at high levels of use, actual reliance on the model was punished more strongly than self-reported reliance. Taken together, these findings provide the first behavioral evidence that the efficiency gains of LLMs come at the cost of social sanctions. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.09772 |
| By: | Shengyu Cao; Ming Hu |
| Abstract: | We study how delegating pricing to large language models (LLMs) can facilitate collusion in a duopoly when both sellers rely on the same pre-trained model. The LLM is characterized by (i) a propensity parameter capturing its internal bias toward high-price recommendations and (ii) an output-fidelity parameter measuring how tightly outputs track that bias; the propensity evolves through retraining. We show that configuring LLMs for robustness and reproducibility can induce collusion via a phase transition: there exists a critical output-fidelity threshold that pins down long-run behavior. Below it, competitive pricing is the unique long-run outcome. Above it, the system is bistable, with competitive and collusive pricing both locally stable and the realized outcome determined by the model's initial preference. The collusive regime resembles tacit collusion: prices are elevated on average, yet occasional low-price recommendations provide plausible deniability. With perfect fidelity, full collusion emerges from any interior initial condition. For finite training batches of size $b$, infrequent retraining (driven by computational costs) further amplifies collusion: conditional on starting in the collusive basin, the probability of collusion approaches one as $b$ grows, since larger batches dampen stochastic fluctuations that might otherwise tip the system toward competition. The indeterminacy region shrinks at rate $O(1/\sqrt{b})$. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.01279 |
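
The phase-transition mechanism described in the abstract above can be pictured with a stylized simulation: a propensity toward the high price, an output-fidelity parameter controlling how tightly outputs track that propensity, and retraining that resets the propensity to the realized frequency of high-price recommendations. The reinforcement map, the competitive baseline of 0.1, and all parameter values below are illustrative assumptions, not the paper's model.

```python
import random

# Stylized sketch (not the paper's exact dynamics): below a fidelity threshold the
# propensity collapses to competitive pricing; above it, the long-run outcome depends
# on the initial preference (bistability).

def reinforcement(p):
    # contrast-enhancing map: the model amplifies whichever price it already favors
    return p**2 / (p**2 + (1 - p)**2)

def long_run_propensity(p0, fidelity, batch=500, rounds=300, seed=0):
    rng, p = random.Random(seed), p0
    for _ in range(rounds):
        prob_high = fidelity * reinforcement(p) + (1 - fidelity) * 0.1
        p = sum(rng.random() < prob_high for _ in range(batch)) / batch  # retraining step
    return p

for f in (0.4, 0.95):                      # below vs. above a fidelity threshold
    for p0 in (0.3, 0.9):                  # different initial preferences
        print(f"fidelity={f}, p0={p0} -> {long_run_propensity(p0, f):.2f}")
```

With low fidelity both starting points converge near the competitive baseline; with high fidelity the high-propensity start locks into elevated prices, mirroring the bistability described in the abstract.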
| By: | Michael Yang; Ruijiang Gao; Zhiqiang (Eric) Zheng |
| Abstract: | The rapid expansion of Artificial Intelligence is hindered by a fundamental friction in data markets: the value-privacy dilemma, where buyers cannot verify a dataset's utility without inspection, yet inspection may expose the data (Arrow's Information Paradox). We resolve this challenge by introducing the Trustworthy Influence Protocol (TIP), a privacy-preserving framework that enables prospective buyers to quantify the utility of external data without ever decrypting the raw assets. By integrating Homomorphic Encryption with gradient-based influence functions, our approach allows for the precise, blinded scoring of data points against a buyer's specific AI model. To ensure scalability for Large Language Models (LLMs), we employ low-rank gradient projections that reduce computational overhead while maintaining near-perfect fidelity to plaintext baselines, as demonstrated across BERT and GPT-2 architectures. Empirical simulations in healthcare and generative AI domains validate the framework's economic potential: we show that encrypted valuation signals achieve a high correlation with realized clinical utility and reveal a heavy-tailed distribution of data value in pre-training corpora where a minority of texts drive capability while the majority degrades it. These findings challenge prevailing flat-rate compensation models and offer a scalable technical foundation for a meritocratic, secure data economy. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.06033 |
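
The valuation core described above can be sketched in plaintext: score a seller's data point by the inner product between its gradient and the buyer's validation gradient after a shared low-rank random projection. The homomorphic-encryption layer that keeps these gradients blinded in TIP is omitted here, and all dimensions and variable names are hypothetical.

```python
import numpy as np

# Plaintext sketch of low-rank projected influence scoring (hypothetical, simplified).
rng = np.random.default_rng(0)
d, k = 10_000, 64                                # full gradient dim, projected dim
P = rng.normal(size=(k, d)) / np.sqrt(k)         # shared low-rank random projection

def projected_influence(grad_seller, grad_buyer_val):
    """Approximate influence score <g_seller, g_buyer> computed in the projected space."""
    return float((P @ grad_seller) @ (P @ grad_buyer_val))

g_buyer = rng.normal(size=d)                     # buyer's validation-loss gradient
useful  = g_buyer + 0.5 * rng.normal(size=d)     # seller point aligned with the buyer's model
useless = rng.normal(size=d)                     # unrelated seller point
print(round(projected_influence(useful, g_buyer), 1),
      round(projected_influence(useless, g_buyer), 1))
```

The aligned point scores far higher than the unrelated one, which is the kind of heavy-tailed value signal the abstract reports for pre-training corpora.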
| By: | Nikoleta Anesti; Edward Hill; Andreas Joseph |
| Abstract: | This paper investigates the ability of Large Language Models (LLMs), specifically GPT-3.5-turbo (GPT), to form inflation perceptions and expectations based on macroeconomic price signals. We compare the LLM's output to household survey data and official statistics, mimicking the information set and demographic characteristics of the Bank of England's Inflation Attitudes Survey (IAS). Our quasi-experimental design exploits the timing of GPT's training cut-off in September 2021 which means it has no knowledge of the subsequent UK inflation surge. We find that GPT tracks aggregate survey projections and official statistics at short horizons. At a disaggregated level, GPT replicates key empirical regularities of households' inflation perceptions, particularly for income, housing tenure, and social class. A novel Shapley value decomposition of LLM outputs suited for the synthetic survey setting provides well-defined insights into the drivers of model outputs linked to prompt content. We find that GPT demonstrates a heightened sensitivity to food inflation information similar to that of human respondents. However, we also find that it lacks a consistent model of consumer price inflation. More generally, our approach could be used to evaluate the behaviour of LLMs for use in the social sciences, to compare different models, or to assist in survey design. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.14306 |
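
The Shapley decomposition over prompt content mentioned above can be illustrated with a tiny exact computation. The value function here is a hard-coded toy standing in for the LLM's mean inflation response to prompts containing each subset of components; the component names and numbers are hypothetical, not the paper's.

```python
from itertools import combinations
from math import factorial

# Illustrative exact Shapley decomposition over prompt components (not the paper's code).
components = ["food_prices", "energy_prices", "demographics"]
toy_v = {  # hypothetical v(S): model's mean inflation expectation (%) given components S
    frozenset(): 2.0,
    frozenset({"food_prices"}): 4.0,
    frozenset({"energy_prices"}): 3.0,
    frozenset({"demographics"}): 2.2,
    frozenset({"food_prices", "energy_prices"}): 4.6,
    frozenset({"food_prices", "demographics"}): 4.1,
    frozenset({"energy_prices", "demographics"}): 3.2,
    frozenset(components): 4.8,
}

def shapley(player):
    others, n = [c for c in components if c != player], len(components)
    value = 0.0
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            value += weight * (toy_v[frozenset(S) | {player}] - toy_v[frozenset(S)])
    return value

for c in components:
    print(c, round(shapley(c), 3))   # contributions sum to v(all) - v(empty) = 2.8
```

In this toy example the food-price component carries most of the attributed output, echoing the heightened sensitivity to food inflation information reported in the abstract.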
| By: | Manshu Khanna; Ziyi Wang; Lijia Wei; Lian Xue |
| Abstract: | We document a fundamental paradox in AI transparency: explanations improve decisions when algorithms are correct but systematically worsen them when algorithms err. In an experiment with 257 medical students making 3,855 diagnostic decisions, we find explanations increase accuracy by 6.3 percentage points when AI is correct (73% of cases) but decrease it by 4.9 points when incorrect (27% of cases). This asymmetry arises because modern AI systems generate equally persuasive explanations regardless of recommendation quality: physicians cannot distinguish helpful from misleading guidance. We show physicians treat explained AI as 15.2 percentage points more accurate than reality, with over-reliance persisting even for erroneous recommendations. Competent physicians with appropriate uncertainty suffer most from the AI transparency paradox (-12.4pp when AI errs), while overconfident novices benefit most (+9.9pp net). Welfare analysis reveals that selective transparency generates $2.59 billion in annual healthcare value, 43% more than the $1.82 billion from mandated universal transparency. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.08424 |
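
A back-of-envelope check of the reported asymmetry (simple arithmetic on the abstract's numbers, not the paper's welfare analysis): weighting the gain and the loss by how often the AI is correct gives the average effect of always showing explanations.

```python
# Expected effect of universal explanations, using the figures quoted in the abstract.
p_correct, gain_when_correct = 0.73, 6.3      # percentage points
p_wrong,   loss_when_wrong   = 0.27, 4.9
net_effect = p_correct * gain_when_correct - p_wrong * loss_when_wrong
print(f"Average accuracy change from explanations: {net_effect:+.2f} pp")  # about +3.28 pp
```

The average effect is positive, which is why the paradox only becomes visible once correct and erroneous recommendations are analyzed separately.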
| By: | James Alm (Tulane University); Rida Belahouaoui (Cadi Ayyad University) |
| Abstract: | This study examines the role of artificial intelligence (AI) tools in enhancing tax fraud detection within the ambit of the OECD Tax Administration 3.0, focusing on how these technologies streamline the detection process through a new "Adaptive AI Tax Oversight" (AATO) framework. Through a textometric systematic review covering the period from 2014 to 2024, we examine the integration of AI in tax fraud detection. The methodology emphasizes the evaluation of AI's predictive, analytical, and procedural benefits in identifying and combating tax fraud. The research underscores AI's significant impact on increasing detection accuracy, predictive capabilities, and operational efficiency in tax administrations. Key findings reveal the ways in which the development and application of the AATO framework improve the tax fraud detection process, and the implications offer a roadmap for global tax authorities to utilize AI in bolstering detection efforts, potentially lowering compliance expenses and improving regulatory frameworks. |
| Keywords: | Artificial intelligence, tax fraud, AATO framework, blockchain, neural networks, data mining |
| JEL: | C45 H26 |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:tul:wpaper:2511 |
| By: | Robert D. Metcalfe; Andrew Schein; Cohen R. Simpson; Yixin Sun |
| Abstract: | One of the promising opportunities offered by AI to support the decarbonization of electricity grids is to align demand with low-carbon supply. We evaluated the effects of one of the world’s largest AI-managed EV charging tariffs (a retail electricity pricing plan) using a large-scale natural field experiment. The tariff dynamically controlled vehicle charging to follow real-time wholesale electricity prices and coordinate and optimize charging for the grid and the consumer through AI. We randomized financial incentives to encourage enrollment onto the tariff. Over more than a year, we found that the tariff led to a 42% reduction in household electricity demand during peak hours, with 100% of this demand shifted to lower-cost and lower-carbon-intensity periods. The tariff generated substantial consumer savings, while demonstrating potential to lower producer costs, energy system costs, and carbon emissions through significant load shifting. Overrides of the AI algorithm were low, suggesting that this tariff was likely more efficient than a real-time-pricing tariff without AI, given our theoretical framework. We found similar plug-in and override behavior in several markets, including the UK, US, Germany, and Spain, implying the potential for comparable demand and welfare effects. Our findings highlight the potential for scalable AI-managed charging and its substantial welfare gains for the electricity system and society. We also show that experimental estimates differed meaningfully from those obtained via non-randomized difference-in-differences analysis, due to differences in the samples in the two evaluation strategies, although we can reconcile the estimates with observables. |
| JEL: | Q4 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:34709 |
| By: | Junhui Jeff Cai; Xian Gu; Liugang Sheng; Mengjia Xia; Linda Zhao; Wu Zhu |
| Abstract: | This paper studies whether, how, and for whom generative artificial intelligence (GenAI) facilitates firm creation. Our identification strategy exploits the November 2022 release of ChatGPT as a global shock that lowered start-up costs and leverages variations across geo-coded grids with differential pre-existing AI-specific human capital. Using high-resolution and universal data on Chinese firm registrations by the end of 2024, we find that grids with stronger AI-specific human capital experienced a sharp surge in new firm formation, driven entirely by small firms, contributing to 6.0% of overall national firm entry. Large-firm entry declines, consistent with a shift toward leaner ventures. New firms are smaller in capital, shareholder number, and founding team size, especially among small firms. The effects are strongest among firms with potential AI applications, weaker financing needs, and among first-time entrepreneurs. Overall, our results highlight that GenAI serves as a pro-competitive force by disproportionately boosting small-firm entry. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.06506 |
| By: | Morgan R. Frank; Alireza Javadian Sabet; Lisa Simon; Sarah H. Bana; Renzhe Yu |
| Abstract: | Public debate links worsening job prospects for AI-exposed occupations to the release of ChatGPT in late 2022. Using monthly U.S. unemployment insurance records, we measure occupation- and location-specific unemployment risk and find that risk rose in AI-exposed occupations beginning in early 2022, months before ChatGPT. Analyzing millions of LinkedIn profiles, we show that graduate cohorts from 2021 onward entered AI-exposed jobs at lower rates than earlier cohorts, with gaps opening before late 2022. Finally, from millions of university syllabi, we find that graduates taking more AI-exposed curricula had higher first-job pay and shorter job searches after ChatGPT. Together, these results point to forces pre-dating generative AI and to the ongoing value of LLM-relevant education. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.02554 |
| By: | Sam J. Manning; Tomás Aguirre |
| Abstract: | We construct an occupation-level adaptive capacity index that measures a set of worker characteristics relevant for navigating job transitions if displaced, covering 356 occupations that represent 95.9% of the U.S. workforce. We find that AI exposure and adaptive capacity are positively correlated: many occupations highly exposed to AI contain workers with relatively strong means to manage a job transition. Of the 37.1 million workers in the top quartile of AI exposure, 26.5 million are in occupations that also have above-median adaptive capacity, leaving them comparatively well-equipped to handle job transitions if displacement occurs. At the same time, 6.1 million workers (4.2% of the workforce in our sample) work in occupations that are both highly exposed and where workers have low expected adaptive capacity. These workers are concentrated in clerical and administrative roles. Importantly, AI exposure reflects potential changes to work tasks, not inevitable displacement; only some of the changes brought on by AI will result in job loss. By distinguishing between highly exposed workers with relatively strong means to adjust and those with limited adaptive capacity, our analysis shows that exposure measures alone can obscure both areas of resilience to technological change and concentrated pockets of elevated vulnerability if displacement were to occur. |
| JEL: | J01 J20 J21 J24 J29 J63 O33 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:34705 |
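
The cross-classification described above, top-quartile AI exposure split by whether adaptive capacity is above the median, can be sketched on synthetic occupation-level data. The correlation structure, employment weights, and cutoffs below are illustrative assumptions, not the paper's data.

```python
import numpy as np

# Synthetic sketch of exposure-by-capacity cross-classification; all data are randomly
# generated, not the paper's. Only the occupation count (356) comes from the abstract.
rng = np.random.default_rng(1)
n_occ = 356
exposure   = rng.normal(size=n_occ)
capacity   = 0.4 * exposure + rng.normal(size=n_occ)   # positively correlated, as reported
employment = rng.lognormal(mean=10, sigma=1, size=n_occ)

top_exposed   = exposure >= np.quantile(exposure, 0.75)   # top quartile of AI exposure
high_capacity = capacity >= np.median(capacity)           # above-median adaptive capacity

exposed_workers   = employment[top_exposed].sum()
resilient_workers = employment[top_exposed & high_capacity].sum()
print(f"top-quartile exposure: {exposed_workers:,.0f} workers, "
      f"{resilient_workers / exposed_workers:.0%} with above-median adaptive capacity")
```

The point of the exercise mirrors the abstract: the same exposure figure can hide both a resilient majority and a smaller, concentrated vulnerable group.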
| By: | Jeremy Yang; Noah Yonack; Kate Zyskowski; Denis Yarats; Johnny Ho; Jerry Ma |
| Abstract: | This paper presents the first large-scale field study of the adoption, usage intensity, and use cases of general-purpose AI agents operating in open-world web environments. Our analysis centers on Comet, an AI-powered browser developed by Perplexity, and its integrated agent, Comet Assistant. Drawing on hundreds of millions of anonymized user interactions, we address three fundamental questions: Who is using AI agents? How intensively are they using them? And what are they using them for? Our findings reveal substantial heterogeneity in adoption and usage across user segments. Earlier adopters, users in countries with higher GDP per capita and educational attainment, and individuals working in digital or knowledge-intensive sectors -- such as digital technology, academia, finance, marketing, and entrepreneurship -- are more likely to adopt or actively use the agent. To systematically characterize the substance of agent usage, we introduce a hierarchical agentic taxonomy that organizes use cases across three levels: topic, subtopic, and task. The two largest topics, Productivity & Workflow and Learning & Research, account for 57% of all agentic queries, while the two largest subtopics, Courses and Shopping for Goods, make up 22%. The top 10 out of 90 tasks represent 55% of queries. Personal use constitutes 55% of queries, while professional and educational contexts comprise 30% and 16%, respectively. In the short term, use cases exhibit strong stickiness, but over time users tend to shift toward more cognitively oriented topics. The diffusion of increasingly capable AI agents carries important implications for researchers, businesses, policymakers, and educators, inviting new lines of inquiry into this rapidly emerging class of AI capabilities. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.07828 |
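
The three-level taxonomy described above (topic, subtopic, task) can be used to aggregate query shares with a few lines of code. The topic and subtopic labels come from the abstract; the example queries and their assignments are hypothetical.

```python
from collections import Counter

# Minimal sketch of share aggregation over a topic > subtopic > task taxonomy.
# The classified queries below are hypothetical illustrations.
labeled_queries = [
    ("Productivity & Workflow", "Shopping for Goods", "compare laptop prices"),
    ("Learning & Research", "Courses", "summarize lecture 3 notes"),
    ("Learning & Research", "Courses", "make a quiz from this syllabus"),
]

topic_share = Counter(topic for topic, _, _ in labeled_queries)
total = sum(topic_share.values())
for topic, count in topic_share.most_common():
    print(f"{topic}: {count / total:.0%} of queries")
```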
| By: | David Loschiavo; Olivier Armantier; Antonio Dalla Zuanna; Leonardo Gambacorta; Mirko Moscatelli; Ilaria Supino |
| Abstract: | This paper explores the household adoption of Generative Artificial Intelligence (GenAI) in the United States and Italy, leveraging survey data to compare usage patterns, demographic influences, and employment sectoral composition effects. Our findings reveal higher adoption rates in the US, driven by socio-demographic differences between the two countries. Despite their lower usage of GenAI, Italians are more confident in its potential to improve their well-being and financial situation. Both Italian and US users tend to trust GenAI tools less than human-operated services, but Italians report greater relative trust in government and institutions when handling personal data with GenAI tools. |
| Keywords: | generative artificial intelligence, technology adoption, cross-country comparison, socio-demographic factors, trust in technology, cultural attitudes |
| JEL: | O33 D10 J24 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:bis:biswps:1322 |
| By: | Fischer, Mira; Rau, Holger A.; Rilke, Rainer Michael |
| Abstract: | We study how AI tutoring affects learning in higher education through a randomized experiment with 334 university students preparing for an incentivized exam. Students received only textbook material, restricted access to an AI tutor requiring initial independent reading, or unrestricted access throughout the study period. AI tutor access raises test performance by 0.23 standard deviations relative to control. Surprisingly, unrestricted access significantly outperforms restricted access by 0.21 standard deviations, contradicting concerns about premature AI reliance. Behavioral analysis reveals that unrestricted access fosters gradual integration of AI support, while restricted access induces intensive bursts of prompting that disrupt learning flow. Benefits are heterogeneous: AI tutors prove most effective for students with lower baseline knowledge and stronger self-regulation skills, suggesting that seamless AI integration enhances learning when students can strategically combine independent study with targeted support. |
| Keywords: | AI Tutors, Large Language Models, Self-regulated Learning, Higher Education |
| JEL: | C91 I21 D83 |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:wzbmbh:335027 |
| By: | Bojidara Doseva; Catherine Dehon; Antonio Estache |
| Abstract: | The paper reports the results of an experiment designed to compare the impact on financial literacy skills of primary school students of a switch from a traditional pedagogical approach supported by textbooks to one relying on AI-supported methods favouring the gamification of the learning process. The study focuses on 152 students aged 8 to 11 distributed across six classes in a Bulgarian public school. The results show a large, statistically significant improvement in financial literacy for the treatment group. The paper also discusses the contextual dimensions, captured by control variables, that may lead to outcome differences according to families’ socio-economic background. |
| Keywords: | Artificial Intelligence; Education and Training; Financial Markets; Household Finance |
| Date: | 2025–09–01 |
| URL: | https://d.repec.org/n?u=RePEc:eca:wpaper:2013/401374 |
| By: | Shanoyan, Aleksan; Britton, Logan L.; Bergtold, Jason S.; Hobbs Jr., Lonnie; Sharma, Priyanka |
| Abstract: | The adoption of artificial intelligence (AI) is reshaping teaching in higher education. This study examines how instructors’ perceptions of AI’s instructional impact relate to their personal AI use, teaching experience, disciplinary affiliation, and exposure to AI-specific training. Drawing on survey data from over 600 faculty at a land-grant university, we use regression and latent class analysis to explore variation in perception and adoption. Results highlight that daily teaching use and interactive training formats are associated with more favorable views. Findings offer guidance for developing faculty support strategies and training programs that foster effective and context-aware AI integration. |
| Keywords: | Teaching/Communication/Extension/Profession |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:ags:aaea25:361162 |
| By: | Yu Liu; Wenwen Li; Yifan Dou; Guangnan Ye |
| Abstract: | Understanding decision-making in multi-AI-agent frameworks is crucial for analyzing strategic interactions in network-effect-driven contexts. This study investigates how AI agents navigate network-effect games, where individual payoffs depend on peer participation--a context underexplored in multi-agent systems despite its real-world prevalence. We introduce a novel workflow design using large language model (LLM)-based agents in repeated decision-making scenarios, systematically manipulating price trajectories (fixed, ascending, descending, random) and network-effect strength. Our key findings include: First, without historical data, agents fail to infer equilibrium. Second, ordered historical sequences (e.g., escalating prices) enable partial convergence under weak network effects but strong effects trigger persistent "AI optimism"--agents overestimate participation despite contradictory evidence. Third, randomized history disrupts convergence entirely, demonstrating that temporal coherence in data shapes LLMs' reasoning, unlike humans. These results highlight a paradigm shift: in AI-mediated systems, equilibrium outcomes depend not just on incentives, but on how history is curated, which is impossible for humans. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.11943 |
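
The repeated-decision workflow described above can be sketched as a simple loop: in each round the agents observe the posted price and the revealed history of participation, and each agent's payoff depends on peers' choices. The LLM call is replaced here by a stub, and all payoff parameters are hypothetical.

```python
# Stylized sketch of a repeated network-effect game with agent decisions;
# `llm_decide` is a stub standing in for an LLM API call.

def llm_decide(price, history, strength, optimism=0.7):
    """Stub agent: join if the expected network benefit beats the posted price."""
    expected_participation = history[-1] if history else optimism   # optimistic prior
    return strength * expected_participation > price

def run_game(prices, n_agents=10, strength=1.0):
    history = []
    for price in prices:                                  # one round per posted price
        joins = [llm_decide(price, history, strength) for _ in range(n_agents)]
        history.append(sum(joins) / n_agents)             # participation revealed next round
    return history

print(run_game([0.2, 0.4, 0.6, 0.8]))                     # ascending price trajectory
```

Because the stub conditions only on the most recent revealed participation, reordering or randomizing the price and history sequence changes its trajectory, which is the kind of history-curation sensitivity the abstract emphasizes.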
| By: | Eren Kurshan; Tucker Balch; David Byrd |
| Abstract: | Generative and agentic artificial intelligence is entering financial markets faster than existing governance can adapt. Current model-risk frameworks assume static, well-specified algorithms and one-time validations; large language models and multi-agent trading systems violate those assumptions by learning continuously, exchanging latent signals, and exhibiting emergent behavior. Drawing on complex adaptive systems theory, we model these technologies as decentralized ensembles whose risks propagate along multiple time-scales. We then propose a modular governance architecture. The framework decomposes oversight into four layers of "regulatory blocks": (i) self-regulation modules embedded beside each model, (ii) firm-level governance blocks that aggregate local telemetry and enforce policy, (iii) regulator-hosted agents that monitor sector-wide indicators for collusive or destabilizing patterns, and (iv) independent audit blocks that supply third-party assurance. Eight design strategies enable the blocks to evolve as fast as the models they police. A case study on emergent spoofing in multi-agent trading shows how the layered controls quarantine harmful behavior in real time while preserving innovation. The architecture remains compatible with today's model-risk rules yet closes critical observability and control gaps, providing a practical path toward resilient, adaptive AI governance in financial systems. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.11933 |
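
The layered "regulatory blocks" above can be pictured as composable monitors that escalate telemetry upward. This is an illustrative sketch of the four layers named in the abstract, with hypothetical class names, thresholds, and escalation logic rather than the paper's specification.

```python
from dataclasses import dataclass

# Illustrative sketch of four governance layers passing telemetry upward (hypothetical).

@dataclass
class Telemetry:
    model_id: str
    anomaly_score: float          # e.g., drift or coordination indicator

@dataclass
class GovernanceBlock:
    name: str
    threshold: float
    escalate_to: "GovernanceBlock | None" = None

    def review(self, t: Telemetry) -> str:
        if t.anomaly_score < self.threshold:
            return f"{self.name}: ok"
        if self.escalate_to is not None:
            return self.escalate_to.review(t)             # hand off to the next layer
        return f"{self.name}: intervene on {t.model_id}"

audit      = GovernanceBlock("independent audit", threshold=0.9)
regulator  = GovernanceBlock("sector regulator", threshold=0.7, escalate_to=audit)
firm       = GovernanceBlock("firm-level governance", threshold=0.5, escalate_to=regulator)
self_check = GovernanceBlock("self-regulation module", threshold=0.3, escalate_to=firm)

print(self_check.review(Telemetry("trading-agent-7", anomaly_score=0.95)))
```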
| By: | Tianyu Fan; Yuhao Yang; Yangqin Jiang; Yifei Zhang; Yuxuan Chen; Chao Huang |
| Abstract: | Large Language Models (LLMs) have demonstrated remarkable potential as autonomous agents, approaching human-expert performance through advanced reasoning and tool orchestration. However, decision-making in fully dynamic and live environments remains highly challenging, requiring real-time information integration and adaptive responses. While existing efforts have explored live evaluation mechanisms in structured tasks, a critical gap remains in systematic benchmarking for real-world applications, particularly in finance where stringent requirements exist for live strategic responsiveness. To address this gap, we introduce AI-Trader, the first fully-automated, live, and data-uncontaminated evaluation benchmark for LLM agents in financial decision-making. AI-Trader spans three major financial markets: U.S. stocks, A-shares, and cryptocurrencies, with multiple trading granularities to simulate live financial environments. Our benchmark implements a revolutionary fully autonomous minimal information paradigm where agents receive only essential context and must independently search, verify, and synthesize live market information without human intervention. We evaluate six mainstream LLMs across three markets and multiple trading frequencies. Our analysis reveals striking findings: general intelligence does not automatically translate to effective trading capability, with most agents exhibiting poor returns and weak risk management. We demonstrate that risk control capability determines cross-market robustness, and that AI trading strategies achieve excess returns more readily in highly liquid markets than policy-driven environments. These findings expose critical limitations in current autonomous agents and provide clear directions for future improvements. The code and evaluation data are open-sourced to foster community research: https://github.com/HKUDS/AI-Trader. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.10971 |
| By: | Gongao Zhang; Haijiang Zeng; Lu Jiang |
| Abstract: | Financial institutions and regulators require systems that integrate heterogeneous data to assess risks from stock fluctuations to systemic vulnerabilities. Existing approaches often treat these tasks in isolation, failing to capture cross-scale dependencies. We propose Uni-FinLLM, a unified multimodal large language model that uses a shared Transformer backbone and modular task heads to jointly process financial text, numerical time series, fundamentals, and visual data. Through cross-modal attention and multi-task optimization, it learns a coherent representation for micro-, meso-, and macro-level predictions. Evaluated on stock forecasting, credit-risk assessment, and systemic-risk detection, Uni-FinLLM significantly outperforms baselines. It raises stock directional accuracy to 67.4% (from 61.7%), credit-risk accuracy to 84.1% (from 79.6%), and macro early-warning accuracy to 82.3%. Results validate that a unified multimodal LLM can jointly model asset behavior and systemic vulnerabilities, offering a scalable decision-support engine for finance. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.02677 |
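
The shared-backbone-plus-task-heads design described above can be sketched in a few lines of PyTorch. Dimensions, head names, and the mean-pooling choice are hypothetical, not Uni-FinLLM's actual architecture, and the cross-modal fusion producing the input tokens is assumed to happen upstream and is not shown.

```python
import torch
import torch.nn as nn

# Minimal sketch of a shared Transformer backbone with modular task heads (hypothetical).
class SharedBackboneMultiTask(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.heads = nn.ModuleDict({
            "stock_direction": nn.Linear(d_model, 2),   # micro: up / down
            "credit_risk":     nn.Linear(d_model, 2),   # meso: default / no default
            "systemic_alert":  nn.Linear(d_model, 2),   # macro: warning / normal
        })

    def forward(self, fused_tokens, task):
        h = self.backbone(fused_tokens)          # fused text / time-series / fundamentals tokens
        return self.heads[task](h.mean(dim=1))   # pool over the sequence, then task head

model = SharedBackboneMultiTask()
x = torch.randn(8, 16, 256)                      # batch of 8 sequences of 16 fused tokens
print(model(x, task="credit_risk").shape)        # torch.Size([8, 2])
```

Training such a model jointly on all three heads is what lets the shared representation serve micro-, meso-, and macro-level predictions at once, which is the multi-task idea the abstract describes.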