nep-big New Economics Papers
on Big Data
Issue of 2025–08–11
24 papers chosen by
Tom Coupé, University of Canterbury


  1. HARLF: Hierarchical Reinforcement Learning and Lightweight LLM-Driven Sentiment Integration for Financial Portfolio Optimization By Benjamin Coriat; Eric Benhamou
  2. The Post Double LASSO for Efficiency Analysis By Christopher Parmeter; Artem Prokhorov; Valentin Zelenyuk
  3. Detecting Bubbles by Machine Learning Prediction By MINAMI, Koutaroh
  4. Artificial intelligence, distributional fairness, and pivotality By Victor Klockmann; Alicia von Schenk; Marie Claire Villeval
  5. Machine learning the first stage in 2SLS: Practical guidance from bias decomposition and simulation By Connor Lennon; Edward Rubin; Glen Waddell
  6. The Evolution of Alpha in Finance Harnessing Human Insight and LLM Agents By Mohammad Rubyet Islam
  7. FinDPO: Financial Sentiment Analysis for Algorithmic Trading through Preference Optimization of LLMs By Giorgos Iacovides; Wuyang Zhou; Danilo Mandic
  8. Hedging with memory: shallow and deep learning with signatures By Eduardo Abi Jaber; Louis-Amand Gérard
  9. Deep limit order book forecasting: a microstructural guide By Briola, Antonio; Bartolucci, Silvia; Aste, Tomaso
  10. Harnessing AI for accounting integrity: Innovations in fraud detection and prevention By Dulgeridis, Marcel; Schubart, Constantin; Dulgeridis, Sabrina
  11. Weather-Aware AI Systems versus Route-Optimization AI: A Comprehensive Analysis of AI Applications in Transportation Productivity By Tatsuru Kikuchi
  12. Machine learning approach to stock price crash risk By Abdullah Karasan; Ozge Sezgin Alp; Gerhard-Wilhelm Weber
  13. Central bank and media sentiment on central bank digital currency: an international perspective By Boris Hofmann; Xiaorui Tang; Feng Zhu
  14. Public Communication and Collusion: New Screening Tools for Competition Authorities By Tomaso Duso; Joseph E. Harrington Jr.; Carl Kreuzberg; Geza Sapi
  15. Zero-Shot Forecasting Mortality Rates: A Global Study By Gabor Petnehazi; Laith Al Shaggah; Jozsef Gall; Bernadett Aradi
  16. Decoding Consumer Preferences Using Attention-Based Language Models By Joshua Foster; Fredrik Odegaard
  17. Factor Investing with Delays By DICKERSON, Alexander; NOZAWA, Yoshio; ROBOTTI, Cesare
  18. Explainable Graph Neural Networks via Structural Externalities By Lijun Wu; Dong Hao; Zhiyi Fan
  19. Finding John Smith: Using Extra Information for Historical Record Linkage By Ran Abramitzky; Leah Platt Boustan; Harriet M. Brookes Gray; Katherine Eriksson; Santiago Pérez; Hannah M. Postel; Myera Rashid; Noah Simon
  20. Deep Learning for Continuous-time Stochastic Control with Jumps By Patrick Cheridito; Jean-Loup Dupret; Donatien Hainaut
  21. Words Matter: Central Bank Communication and Household Expectations in a Global Panel By Martin Feldkircher; Christos A. Makridis
  22. AI Employment and Political Risk Disclosures in Earnings Calls By Erdinc Akyildirim; Gamze Ozturk Danisman; Steven Ongena
  23. Geoeconomic Pressure By Christopher Clayton; Antonio Coppola; Matteo Maggiori; Jesse Schreger
  24. Uncovering Economic Policy Uncertainty During Conflict By Christopher Rauh; Sophie Brochet; Hannes Mueller

  1. By: Benjamin Coriat; Eric Benhamou
    Abstract: This paper presents a novel hierarchical framework for portfolio optimization, integrating lightweight Large Language Models (LLMs) with Deep Reinforcement Learning (DRL) to combine sentiment signals from financial news with traditional market indicators. Our three-tier architecture employs base RL agents to process hybrid data, meta-agents to aggregate their decisions, and a super-agent to merge decisions based on market data and sentiment analysis. Evaluated on data from 2018 to 2024, after training on 2000-2017, the framework achieves a 26% annualized return and a Sharpe ratio of 1.2, outperforming equal-weighted and S&P 500 benchmarks. Key contributions include scalable cross-modal integration, a hierarchical RL structure for enhanced stability, and open-source reproducibility.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.18560
  2. By: Christopher Parmeter; Artem Prokhorov; Valentin Zelenyuk
    Abstract: Big data and machine learning methods have become commonplace across economic milieus. One area that has not yet seen much attention to these topics is efficiency analysis. We show how the availability of big (wide) data can actually make detection of inefficiency more challenging. We then show how machine learning methods can be leveraged to adequately estimate the primitives of the frontier itself as well as inefficiency using the 'post double LASSO' by deriving Neyman orthogonal moment conditions for this problem. Finally, an application is presented to illustrate key differences of the post double LASSO compared to other approaches.
    Date: 2025–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2505.14282
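    A minimal numpy sketch of the double-selection idea behind the post double LASSO: controls selected by a Lasso in either the outcome equation or the treatment equation are retained before the final regression. The tiny coordinate-descent Lasso and all data-generating numbers below are illustrative, not the paper's estimator.

```python
import numpy as np

def lasso_cd(X, y, lam, iters=50):
    """Tiny coordinate-descent Lasso: (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]          # partial residual
            z = X[:, j] @ r / n
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / (X[:, j] @ X[:, j] / n)
    return beta

rng = np.random.default_rng(3)
n, p = 5000, 20
X = rng.normal(size=(n, p))
d = X[:, 0] + rng.normal(size=n)              # treatment driven by control 0
y = 1.0 * d + X[:, 0] + rng.normal(size=n)    # true effect of d is 1.0

# Naive OLS of y on d omits the confounding control and is biased upward.
beta_naive = np.linalg.lstsq(np.column_stack([d, np.ones(n)]), y, rcond=None)[0][0]

# Double selection: union of controls predictive of y and of d.
keep_y = np.abs(lasso_cd(X, y - y.mean(), lam=0.1)) > 1e-6
keep_d = np.abs(lasso_cd(X, d - d.mean(), lam=0.1)) > 1e-6
selected = keep_y | keep_d

W = np.column_stack([d, X[:, selected], np.ones(n)])
beta = np.linalg.lstsq(W, y, rcond=None)[0]   # post-double-selection OLS
print(round(beta_naive, 2), round(beta[0], 2))
```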
  3. By: MINAMI, Koutaroh
    Abstract: This study explores the potential of machine learning, specifically Long Short-Term Memory (LSTM) networks, to detect asset price bubbles by analyzing prediction errors. Using monthly data on the Nikkei 225 Index, I evaluate the performance of the LSTM model in forecasting prices and compare it with the GSADF test. I find that the LSTM's prediction accuracy deteriorates significantly during periods associated with asset bubbles, suggesting the presence of structural changes. In particular, the LSTM approach of this paper separately captures both the emergence and the collapse of Japan's late-1980s bubble. It also captures structural changes related to policy shifts in Japan in the 2010s, which are not identified by the GSADF test. These findings suggest that machine learning can be used not only for identifying bubbles but also for policy evaluation.
    Keywords: Bubbles, Generalized Supremum Augmented Dickey-Fuller test (GSADF), Machine learning, Long Short-Term Memory (LSTM)
    JEL: G10 G17
    Date: 2025–06
    URL: https://d.repec.org/n?u=RePEc:hit:hcfrwp:g-1-30
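    The detection logic can be sketched without the LSTM itself: flag periods where one-step-ahead prediction errors spike relative to their usual level. In this sketch the predictor is a naive random-walk forecast on synthetic data, and the window and threshold are illustrative stand-ins for the paper's model.

```python
import numpy as np

def flag_bubble_periods(prices, window=12, k=2.0):
    """Flag windows where rolling mean absolute prediction error exceeds
    k times its full-sample median (illustrative threshold)."""
    prices = np.asarray(prices, dtype=float)
    errors = np.abs(np.diff(prices))                       # |actual - naive forecast|
    roll = np.convolve(errors, np.ones(window) / window, mode="valid")
    return roll > k * np.median(roll)

# Synthetic monthly series: calm drift, then a bubble-like run-up/crash.
rng = np.random.default_rng(0)
calm = 100 + np.cumsum(rng.normal(0, 0.5, 120))
bubble = calm[-1] + np.cumsum(rng.normal(3.0, 4.0, 24))    # large, volatile moves
series = np.concatenate([calm, bubble])

flags = flag_bubble_periods(series)
print(flags[0], flags[-1])
```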
  4. By: Victor Klockmann (JMU - Julius-Maximilians-Universität Würzburg = University of Würzburg [Würzburg, Germany], Goethe University Frankfurt = Goethe-Universität Frankfurt am Main, Max Planck Institute for Human Development - Max-Planck-Gesellschaft); Alicia von Schenk (JMU - Julius-Maximilians-Universität Würzburg = University of Würzburg [Würzburg, Germany], Goethe University Frankfurt = Goethe-Universität Frankfurt am Main, Max Planck Institute for Human Development - Max-Planck-Gesellschaft); Marie Claire Villeval (GATE Lyon Saint-Étienne - Groupe d'Analyse et de Théorie Economique Lyon - Saint-Etienne - UL2 - Université Lumière - Lyon 2 - UJM - Université Jean Monnet - Saint-Étienne - EM - EMLyon Business School - CNRS - Centre National de la Recherche Scientifique)
    Abstract: In the field of machine learning, the decisions of algorithms depend on extensive training data contributed by numerous, often human, sources. How does this property affect the social nature of human decisions that serve to train these algorithms? By experimentally manipulating the pivotality of individual decisions for a supervised machine learning algorithm, we show that the diffusion of responsibility weakened revealed social preferences, leading to algorithmic models favoring selfish decisions. Importantly, this phenomenon cannot be attributed to shifts in incentive structures or the presence of externalities. Rather, our results suggest that the expansive nature of Big Data fosters a sense of diminished responsibility and serves as an excuse for selfish behavior that impacts individuals and the whole society.
    Keywords: Artificial intelligence, Big data, Pivotality, Distributional fairness, Experiment
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:hal:journl:hal-05165240
  5. By: Connor Lennon; Edward Rubin; Glen Waddell
    Abstract: Machine learning (ML) primarily evolved to solve "prediction problems." The first stage of two-stage least squares (2SLS) is a prediction problem, suggesting potential gains from ML first-stage assistance. However, little guidance exists on when ML helps 2SLS, or when it hurts. We investigate the implications of inserting ML into 2SLS, decomposing the bias into three informative components. Mechanically, ML-in-2SLS procedures face issues common to prediction and causal-inference settings, and their interaction. Through simulation, we show linear ML methods (e.g., post-Lasso) work well, while nonlinear methods (e.g., random forests, neural nets) generate substantial bias in second-stage estimates, potentially exceeding the bias of endogenous OLS.
    Date: 2025–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2505.13422
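    The setup the paper studies can be sketched in a few lines: the first stage (instrument to endogenous regressor) is a prediction problem into which any learner could be plugged. Here the "ML" first stage is plain OLS, and all data-generating numbers are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
z = rng.normal(size=n)                       # instrument
u = rng.normal(size=n)                       # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)         # endogenous regressor
y = 2.0 * x + u + rng.normal(size=n)         # true coefficient is 2.0

# Naive OLS: biased upward because x and the error share u.
beta_ols = (x @ y) / (x @ x)

# 2SLS: fit the first stage, regress y on the fitted values.
x_hat = z * ((z @ x) / (z @ z))              # first-stage prediction
beta_2sls = (x_hat @ y) / (x_hat @ x)        # IV estimate

print(round(beta_ols, 2), round(beta_2sls, 2))
```

Swapping a nonlinear learner into the `x_hat` step is exactly where, per the abstract, bias can re-enter the second stage.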
  6. By: Mohammad Rubyet Islam
    Abstract: The pursuit of alpha returns that exceed market benchmarks has undergone a profound transformation, evolving from intuition-driven investing to autonomous, AI-powered systems. This paper introduces a comprehensive five-stage taxonomy that traces this progression across manual strategies, statistical models, classical machine learning, deep learning, and agentic architectures powered by large language models (LLMs). Unlike prior surveys focused narrowly on modeling techniques, this review adopts a system-level lens, integrating advances in representation learning, multimodal data fusion, and tool-augmented LLM agents. The strategic shift from static predictors to context-aware financial agents capable of real-time reasoning, scenario simulation, and cross-modal decision making is emphasized. Key challenges in interpretability, data fragility, governance, and regulatory compliance, areas critical to production deployment, are examined. The proposed taxonomy offers a unified framework for evaluating maturity, aligning infrastructure, and guiding the responsible development of next-generation alpha systems.
    Date: 2025–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2505.14727
  7. By: Giorgos Iacovides; Wuyang Zhou; Danilo Mandic
    Abstract: Opinions expressed in online finance-related textual data are having an increasingly profound impact on trading decisions and market movements. This trend highlights the vital role of sentiment analysis as a tool for quantifying the nature and strength of such opinions. With the rapid development of Generative AI (GenAI), supervised fine-tuned (SFT) large language models (LLMs) have become the de facto standard for financial sentiment analysis. However, the SFT paradigm can lead to memorization of the training data and often fails to generalize to unseen samples. This is a critical limitation in financial domains, where models must adapt to previously unobserved events and the nuanced, domain-specific language of finance. To this end, we introduce FinDPO, the first finance-specific LLM framework based on post-training human preference alignment via Direct Preference Optimization (DPO). The proposed FinDPO achieves state-of-the-art performance on standard sentiment classification benchmarks, outperforming existing supervised fine-tuned models by 11% on average. Uniquely, the FinDPO framework enables the integration of a fine-tuned causal LLM into realistic portfolio strategies through a novel 'logit-to-score' conversion, which transforms discrete sentiment predictions into continuous, rankable sentiment scores (probabilities). In this way, simulations demonstrate that FinDPO is the first sentiment-based approach to maintain substantial positive returns of 67% annually and strong risk-adjusted performance, as indicated by a Sharpe ratio of 2.0, even under realistic transaction costs of 5 basis points (bps).
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.18417
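    The paper's exact 'logit-to-score' mapping is not spelled out in the abstract; one natural reading, sketched below, is to softmax the (negative, neutral, positive) class logits and score each text as P(positive) minus P(negative), yielding a continuous, rankable value in [-1, 1].

```python
import numpy as np

def logits_to_sentiment_score(logits):
    """Map (negative, neutral, positive) logits to a score in [-1, 1].
    Illustrative conversion, not necessarily the paper's exact formula."""
    logits = np.asarray(logits, dtype=float)
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    p = exp / exp.sum()
    return p[2] - p[0]                         # P(positive) - P(negative)

scores = [logits_to_sentiment_score(l)
          for l in ([-1.0, 0.0, 3.0], [0.0, 0.0, 0.0], [2.5, 0.5, -1.0])]
print([round(s, 3) for s in scores])
```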
  8. By: Eduardo Abi Jaber (CMAP - Centre de Mathématiques Appliquées de l'Ecole polytechnique - Inria - Institut National de Recherche en Informatique et en Automatique - X - École polytechnique - IP Paris - Institut Polytechnique de Paris - CNRS - Centre National de la Recherche Scientifique); Louis-Amand Gérard (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique)
    Abstract: We investigate the use of path signatures in a machine learning context for hedging exotic derivatives under non-Markovian stochastic volatility models. In a deep learning setting, we use signatures as features in feedforward neural networks and show that they outperform LSTMs in most cases, with orders of magnitude less training compute. In a shallow learning setting, we compare two regression approaches: the first directly learns the hedging strategy from the expected signature of the price process; the second models the dynamics of volatility using a signature volatility model, calibrated on the expected signature of the volatility. Solving the hedging problem in the calibrated signature volatility model yields more accurate and stable results across different payoffs and volatility dynamics.
    Keywords: Deep-hedging, non-Markovian stochastic volatility models, path-signatures, exotic derivatives, Fourier methods
    Date: 2025–08–03
    URL: https://d.repec.org/n?u=RePEc:hal:cesptp:hal-05197836
  9. By: Briola, Antonio; Bartolucci, Silvia; Aste, Tomaso
    Abstract: We exploit cutting-edge deep learning methodologies to explore the predictability of high-frequency Limit Order Book mid-price changes for a heterogeneous set of stocks traded on the NASDAQ exchange. In so doing, we release ‘LOBFrame’, an open-source code base to efficiently process large-scale Limit Order Book data and quantitatively assess state-of-the-art deep learning models' forecasting capabilities. Our results are twofold. We demonstrate that the stocks' microstructural characteristics influence the efficacy of deep learning methods and that their high forecasting power does not necessarily correspond to actionable trading signals. We argue that traditional machine learning metrics fail to adequately assess the quality of forecasts in the Limit Order Book context. As an alternative, we propose an innovative operational framework that evaluates predictions' practicality by focusing on the probability of accurately forecasting complete transactions. This work offers academics and practitioners an avenue to make informed and robust decisions on the application of deep learning techniques, their scope and limitations, effectively exploiting emergent statistical properties of the Limit Order Book.
    Keywords: deep learning; econophysics; high frequency trading; limit order book; market microstructure
    JEL: J1 F3 G3
    Date: 2025–07–22
    URL: https://d.repec.org/n?u=RePEc:ehl:lserod:128950
  10. By: Dulgeridis, Marcel; Schubart, Constantin; Dulgeridis, Sabrina
    Abstract: Accounting fraud poses significant financial and reputational risks for organizations. Traditional detection methods - such as manual audits and red-flag indicators - struggle to keep pace with the growing volume and complexity of financial data. In contrast, artificial intelligence technologies, including machine learning, anomaly detection, and natural language processing, offer scalable, real-time solutions to identify suspicious activity more efficiently. This paper compares conventional fraud detection techniques with AI-driven approaches, highlighting their respective strengths and limitations in terms of accuracy, efficiency, scalability, and adaptability. While AI enables faster and more comprehensive analysis, it also raises challenges related to data quality, algorithmic bias, and transparency. Ethical and legal considerations, including data privacy and compliance with regulations, are crucial for responsible implementation. The paper concludes with strategic recommendations for adopting AI-based fraud detection systems - emphasizing AI readiness, robust data governance, and human oversight. With a thoughtful approach, AI has the potential to significantly enhance the detection and prevention of accounting fraud.
    Keywords: Artificial Intelligence, Fraud Detection, Machine Learning, Anomaly Detection, Natural Language Processing, Data Quality, Financial Fraud, Auditor Oversight, Transparency, AI Implementation
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:zbw:iubhbm:321858
  11. By: Tatsuru Kikuchi
    Abstract: While recent research demonstrates that AI route-optimization systems improve taxi driver productivity by 14%, this study reveals that such findings capture only a fraction of AI's potential in transportation. We examine comprehensive weather-aware AI systems that integrate deep learning meteorological prediction with machine learning positioning optimization, comparing their performance against traditional operations and route-only AI approaches. Using simulation data from 10,000 taxi operations across varied weather conditions, we find that weather-aware AI systems increase driver revenue by 107.3%, compared to 14% improvements from route-optimization alone. Weather prediction contributes the largest individual productivity gain, with strong correlations between meteorological conditions and demand (r = 0.575). Economic analysis reveals annual earnings increases of 13.8 million yen per driver, with rapid payback periods and superior return on investment. These findings suggest that current AI literature significantly underestimates AI's transformative potential by focusing narrowly on routing algorithms, while weather intelligence represents an untapped $8.9 billion market opportunity. Our results indicate that future AI implementations should adopt comprehensive approaches that address multiple operational challenges simultaneously rather than optimizing isolated functions.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.17099
  12. By: Abdullah Karasan; Ozge Sezgin Alp; Gerhard-Wilhelm Weber
    Abstract: In this study, we propose a novel machine-learning-based measure for stock price crash risk, utilizing the minimum covariance determinant methodology. Employing this newly introduced dependent variable, we predict stock price crash risk through cross-sectional regression analysis. The findings confirm that the proposed method effectively captures stock price crash risk, with the model demonstrating strong performance in terms of both statistical significance and economic relevance. Furthermore, leveraging a newly developed firm-specific investor sentiment index, the analysis identifies a positive correlation between stock price crash risk and firm-specific investor sentiment. Specifically, higher levels of sentiment are associated with an increased likelihood of stock price crash risk. This relationship remains robust across different firm sizes and when using the detoned version of the firm-specific investor sentiment index, further validating the reliability of the proposed approach.
    Date: 2025–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2505.16287
  13. By: Boris Hofmann; Xiaorui Tang; Feng Zhu
    Abstract: This paper examines the sentiments of central banks and the media regarding central bank digital currencies across 15 major global economies. Leveraging large language models, we develop jurisdiction-level central bank digital currency sentiment indices derived from central bank publications and news articles on a daily basis. Our findings reveal significant divergences between central bank and media sentiments, with notable variations over time and across jurisdictions. Analyzing the interplay between these sentiments, we observe that central bank sentiment tends to exert a stronger influence on media sentiment than the reverse. Additionally, we identify substantial cross-border sentiment spillovers, where sentiment in leading economies shapes sentiment in other regions. Through an event study approach, we demonstrate that cryptocurrency and equity markets primarily respond to shifts in central bank sentiments. Specifically, more positive central bank sentiments on central bank digital currency are associated with negative impacts on cryptocurrency market returns and the stock performance of banking and payment-related firms.
    Keywords: Central bank digital currency (CBDC), central bank communication, media sentiment, large language model (LLM), financial market
    JEL: E58 G12 G18
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:bis:biswps:1279
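    The index construction step can be sketched independently of the LLM: once each article has a sentiment score, scores are aggregated into a per-jurisdiction daily index. The scores and country codes below are made up for illustration, and simple averaging stands in for whatever weighting the paper uses.

```python
from collections import defaultdict
from statistics import mean

# (date, jurisdiction, LLM sentiment score in [-1, 1]) -- illustrative data.
articles = [
    ("2025-07-01", "SE", 0.6), ("2025-07-01", "SE", 0.2),
    ("2025-07-01", "NG", -0.4), ("2025-07-02", "SE", -0.1),
]

def daily_sentiment_index(scored_articles):
    """Average article-level scores within each (date, jurisdiction) cell."""
    buckets = defaultdict(list)
    for date, country, score in scored_articles:
        buckets[(date, country)].append(score)
    return {key: mean(vals) for key, vals in buckets.items()}

index = daily_sentiment_index(articles)
print(index[("2025-07-01", "SE")])
```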
  14. By: Tomaso Duso; Joseph E. Harrington Jr.; Carl Kreuzberg; Geza Sapi
    Abstract: Competition authorities increasingly rely on economic screening tools to identify markets where firms deviate from competitive norms. Traditional screening methods assume that collusion occurs through secret agreements. However, recent research highlights that firms can use public announcements to coordinate decisions, reducing competition while avoiding detection. We propose a novel approach to screening for collusion in public corporate statements. Using natural language processing, we analyze more than 300,000 earnings call transcripts issued worldwide between 2004 and 2022. By identifying expressions commonly associated with collusion, our method provides competition authorities with a tool to detect potentially anticompetitive behavior in public communications. Our approach can extend beyond earnings calls to other sources, such as news articles, trade press, and industry reports. Our method informed the European Commission’s 2024 unannounced inspections in the car tire sector, prompted by concerns over price coordination through public communication.
    Keywords: Communication, Collusion, NLP, Screening, Text Analysis
    JEL: C23 D22 L1 L4 L64
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:diw:diwwpp:dp2131
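    The screening idea can be sketched as scoring each transcript by the frequency of expressions associated with coordination. The tiny phrase list and example transcripts below are purely illustrative; the paper's actual lexicon and NLP pipeline are not reproduced here.

```python
import re
from collections import Counter

# Illustrative watch list, not the paper's lexicon.
WATCH_PHRASES = ["price discipline", "industry stability",
                 "avoid a price war", "follow our competitors"]

def collusion_screen_score(transcript, phrases=WATCH_PHRASES):
    """Return (hit rate per word, per-phrase counts) for one transcript."""
    text = transcript.lower()
    hits = Counter({p: len(re.findall(re.escape(p), text)) for p in phrases})
    n_words = max(len(text.split()), 1)
    return sum(hits.values()) / n_words, hits

benign = "We grew volumes by investing in product quality and logistics."
flagged = ("We expect continued price discipline across the sector, "
           "and we intend to follow our competitors on list prices "
           "to preserve industry stability.")

for t in (benign, flagged):
    rate, hits = collusion_screen_score(t)
    print(round(rate, 3))
```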
  15. By: Gabor Petnehazi; Laith Al Shaggah; Jozsef Gall; Bernadett Aradi
    Abstract: This study explores the potential of zero-shot time series forecasting, an innovative approach leveraging pre-trained foundation models, to forecast mortality rates without task-specific fine-tuning. We evaluate two state-of-the-art foundation models, TimesFM and CHRONOS, alongside traditional and machine learning-based methods across three forecasting horizons (5, 10, and 20 years) using data from 50 countries and 111 age groups. In our investigations, zero-shot models showed varying results: while CHRONOS delivered competitive shorter-term forecasts, outperforming traditional methods like ARIMA and the Lee-Carter model, TimesFM consistently underperformed. Fine-tuning CHRONOS on mortality data significantly improved long-term accuracy. A Random Forest model, trained on mortality data, achieved the best overall performance. These findings underscore the potential of zero-shot forecasting while highlighting the need for careful model selection and domain-specific adaptation.
    Date: 2025–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2505.13521
  16. By: Joshua Foster; Fredrik Odegaard
    Abstract: This paper proposes a new demand estimation method using attention-based language models. An encoder-only language model is trained in a two-stage process to analyze the natural language descriptions of used cars from a large US-based online auction marketplace. The approach enables semi-nonparametric estimation of the demand primitives of a structural model representing the private valuations and market size for each vehicle listing. In the first stage, the language model is fine-tuned to encode the target auction outcomes using the natural language vehicle descriptions. In the second stage, the trained language model's encodings are projected into the parameter space of the structural model. The model's capability to conduct counterfactual analyses within the trained market space is validated using a subsample of withheld auction data, which includes a set of unique "zero shot" instances.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.17564
  17. By: DICKERSON, Alexander; NOZAWA, Yoshio; ROBOTTI, Cesare
    Abstract: We present a tractable framework for evaluating the cost of delays induced by infrequent trading in the corporate bond market. Using 341 corporate bond factors from OpenBondAssetPricing.com and machine learning models trained on their underlying signals, we demonstrate that, before transaction costs, 51 factors outperform the bond market. However, this number drops to nearly zero after accounting for trading frictions because the cost of delay is amplified for highly profitable factors. Trading a subset of liquid bonds does not eliminate this cost because liquidity is hard to predict and sales delays cannot be avoided, underscoring the critical impact of delay costs.
    Keywords: Corporate Bonds, Liquidity, Market Efficiency, Fixed-Income Securities, Credit Risk, Machine Learning
    JEL: G12 G13
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:hit:hituec:771
  18. By: Lijun Wu; Dong Hao; Zhiyi Fan
    Abstract: Graph Neural Networks (GNNs) have achieved outstanding performance across a wide range of graph-related tasks. However, their "black-box" nature poses significant challenges to their explainability, and existing methods often fail to effectively capture the intricate interaction patterns among nodes within the network. In this work, we propose a novel explainability framework, GraphEXT, which leverages cooperative game theory and the concept of social externalities. GraphEXT partitions graph nodes into coalitions, decomposing the original graph into independent subgraphs. By integrating graph structure as an externality and incorporating the Shapley value under externalities, GraphEXT quantifies node importance through their marginal contributions to GNN predictions as the nodes transition between coalitions. Unlike traditional Shapley value-based methods that primarily focus on node attributes, our GraphEXT places greater emphasis on the interactions among nodes and the impact of structural changes on GNN predictions. Experimental studies on both synthetic and real-world datasets show that GraphEXT outperforms existing baseline methods in terms of fidelity across diverse GNN architectures, significantly enhancing the explainability of GNN models.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.17848
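    The Shapley-style scoring at the heart of such frameworks can be sketched on a toy example. The block below computes the classic Shapley value (without externalities, so simpler than the paper's variant) for the characteristic function v(S) = number of edges inside coalition S, so a node's score reflects its structural role in the graph.

```python
from itertools import permutations

def shapley_node_importance(nodes, edges):
    """Exact Shapley values by enumerating all orderings (toy sizes only)."""
    def v(coalition):
        s = set(coalition)
        return sum(1 for a, b in edges if a in s and b in s)

    phi = dict.fromkeys(nodes, 0.0)
    orders = list(permutations(nodes))
    for order in orders:                      # average marginal contributions
        seen = []
        for node in order:
            phi[node] += v(seen + [node]) - v(seen)
            seen.append(node)
    return {n: phi[n] / len(orders) for n in nodes}

# Tiny star graph: hub 0 connects to leaves 1, 2, 3.
scores = shapley_node_importance([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
print(scores)
```

For this edge-counting game each edge splits its unit of value evenly between its endpoints, so the hub scores 1.5 and each leaf 0.5.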
  19. By: Ran Abramitzky; Leah Platt Boustan; Harriet M. Brookes Gray; Katherine Eriksson; Santiago Pérez; Hannah M. Postel; Myera Rashid; Noah Simon
    Abstract: We introduce a new rule-based linking method for historical Census records. We augment earlier algorithms based on name, age and place of birth (Abramitzky, Boustan, Eriksson, 2012, or “basic ABE”), with five matching characteristics – middle initial, county of residence, and spouse’s and parents’ names. Relative to basic ABE, ABE-Extra Information (“ABE-EI”) greatly increases match rates, improves accuracy and is similarly representative of the population on most attributes, with geographic mobility being one important exception. Relative to machine learning algorithms, ABE-EI has somewhat lower match rates, improved representativeness, and offers full replicability. We also create the first ABE-based links for women.
    JEL: N31 N32
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:33999
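    The rule-based logic can be sketched as: candidates must agree on name and birthplace with age inside a band, and extra fields then disambiguate ties, keeping only unique matches. The records, the age band, and the two extra fields used here (middle initial and spouse's name, two of the paper's five additions) are illustrative, not the authors' code.

```python
def link_record(rec, census, age_band=2):
    """Return the unique matching census record, or None."""
    candidates = [c for c in census
                  if c["name"] == rec["name"]
                  and c["birthplace"] == rec["birthplace"]
                  and abs(c["age"] - rec["age"]) <= age_band]
    if len(candidates) > 1:                   # extra information breaks ties
        candidates = [c for c in candidates
                      if c.get("middle") == rec.get("middle")
                      and c.get("spouse") == rec.get("spouse")]
    return candidates[0] if len(candidates) == 1 else None

census_1880 = [
    {"id": 1, "name": "John Smith", "middle": "A", "age": 34,
     "birthplace": "NY", "spouse": "Mary"},
    {"id": 2, "name": "John Smith", "middle": "B", "age": 35,
     "birthplace": "NY", "spouse": "Ann"},
]
query = {"name": "John Smith", "middle": "B", "age": 36,
         "birthplace": "NY", "spouse": "Ann"}

match = link_record(query, census_1880)
print(match["id"] if match else None)
```

Without the extra fields both census rows satisfy the basic criteria and the query would be discarded as ambiguous.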
  20. By: Patrick Cheridito; Jean-Loup Dupret; Donatien Hainaut
    Abstract: In this paper, we introduce a model-based deep-learning approach to solve finite-horizon continuous-time stochastic control problems with jumps. We iteratively train two neural networks: one to represent the optimal policy and the other to approximate the value function. Leveraging a continuous-time version of the dynamic programming principle, we derive two different training objectives based on the Hamilton-Jacobi-Bellman equation, ensuring that the networks capture the underlying stochastic dynamics. Empirical evaluations on different problems illustrate the accuracy and scalability of our approach, demonstrating its effectiveness in solving complex, high-dimensional stochastic control tasks.
    Date: 2025–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2505.15602
  21. By: Martin Feldkircher; Christos A. Makridis
    Abstract: This paper studies how the linguistic features of central bank communication affect household economic sentiment. Linking central bank speeches from 29 countries to individual-level data from the Gallup World Poll (2006-2023), we examine whether speech complexity (captured through sentiment, tone, length, and readability) is associated with public perceptions of the economy and labor market. We find that longer and more syntactically complex speeches are consistently linked to lower economic confidence and less favorable views of the job climate. Positive sentiment in modal (i.e., policy-relevant) sentences is associated with more optimistic household outlooks. These effects are stronger among younger and college-educated respondents, suggesting differential processing of complex information. These findings underscore the importance of clarity and tone in central bank messaging and support the view that communication is a key behavioral channel of monetary policy transmission.
    Keywords: central bank communication, household expectations, sentiment analysis, monetary policy, public trust, global survey data
    JEL: E52 E58 D84 H63 C23
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:een:camaaa:2025-43
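    Two of the simpler complexity ingredients the abstract mentions, length and readability, can be computed directly from speech text. The sketch below uses mean sentence length and the standard Flesch reading-ease formula with a crude vowel-group syllable counter; the two example sentences are invented, and real readability tooling would use a proper syllable dictionary.

```python
import re

def syllables(word):
    """Very rough syllable count: runs of vowels (illustrative stand-in)."""
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def speech_complexity(text):
    """Return (mean words per sentence, Flesch reading ease; higher = easier)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    wps = len(words) / len(sentences)
    spw = sum(syllables(w) for w in words) / len(words)
    return wps, 206.835 - 1.015 * wps - 84.6 * spw

simple = "Rates are on hold. Inflation is falling. Jobs look solid."
dense = ("Notwithstanding heightened macroeconomic uncertainty, the "
         "committee judged that maintaining restrictive policy remained "
         "appropriate pending durable disinflationary confirmation.")

print(speech_complexity(simple)[1] > speech_complexity(dense)[1])
```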
  22. By: Erdinc Akyildirim (University of Nottingham); Gamze Ozturk Danisman (Istanbul Bilgi University); Steven Ongena (University of Zurich - Department Finance; Swiss Finance Institute; KU Leuven; NTNU Business School; Centre for Economic Policy Research (CEPR))
    Abstract: Using a panel of 929 U.S. publicly listed firms, this paper investigates the impact of artificial intelligence (AI) employment on the disclosure of political risk in corporate earnings calls. We utilize the firm-level AI employment measure developed by Babina et al. (2024), based on resume and job posting records. Furthermore, we supplement it with our newly generated AI disclosure indices at the firm level, created through textual analysis of earnings call transcripts. Our findings indicate that firms with greater AI employment are significantly less likely to disclose information about political risk during earnings calls. We propose a dual mechanism that underpins this association. First, AI enables narrative management: firms use AI tools to strategically alter the tone and wording of disclosures, avoiding phrases that may elicit unfavorable sentiment, leading to a reduction in reputational risk. Second, AI improves firms’ internal performance and risk management, hence reducing the need for voluntary political risk disclosures. Our findings add to the literature on voluntary disclosure and the economic implications of AI by indicating that AI, as a general-purpose technology, has unintended consequences for corporate transparency.
    Keywords: Artificial Intelligence (AI), political risk, voluntary disclosures, earnings calls, textual analysis, AI disclosure index
    Date: 2025–06
    URL: https://d.repec.org/n?u=RePEc:chf:rpseri:rp2556
  23. By: Christopher Clayton; Antonio Coppola; Matteo Maggiori; Jesse Schreger
    Abstract: Geoeconomic pressure—the use of existing economic relationships by governments to achieve geopolitical or economic goals—is a prominent feature of global power dynamics. This paper introduces a methodology using large language models (LLMs) to systematically identify the application of and response to geoeconomic pressure from large textual corpora. We classify which governments apply pressure to which foreign targets, using which instruments, firms, and products. We demonstrate that firms affected by tariffs respond primarily with price changes whereas firms affected by export controls respond disproportionately by investing in research and development. We document significant heterogeneity in how firms respond to pressure based on whether their home government is applying the pressure, whether their home country is the recipient of the pressure, or whether they are based in an affected third party country. Finally, we quantify the degree of measurement uncertainty generated by the LLM-based analysis by comparing the classifications across multiple open-weight models as well as considering a wide range of variations of our prompts.
    JEL: C4 F3 F4 G3
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:34020
  24. By: Christopher Rauh; Sophie Brochet; Hannes Mueller
    Abstract: The correct measurement of economic policy uncertainty (EPU) plays a critical role in many policy settings, in particular where economic policy decisions need to be taken in response to large shocks. One such large shock is armed conflict. But, counterintuitively, the standard text-based EPU index systematically declines during armed conflict periods. Using a global news corpus covering 192 countries and over 5 million articles, we show that this decline is driven not by reduced uncertainty, but by a crowding out of reporting on economics and policy. We show that a combination of topic modeling and two-way fixed effects can be used to adjust the measurement of EPU, providing a new view on political risk during armed conflict. After adjustment, the EPU aligns more closely with firm perceptions, political risk insurance and investment during armed conflict.
    Keywords: armed conflict, Macroeconomic uncertainty, Measurement Bias, Text-Based Indices
    JEL: C43 D80 E32 F51 H56
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:bge:wpaper:1503
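    The two-way fixed-effects adjustment can be sketched on a toy country-by-month panel: demean the raw EPU measure and a conflict-coverage topic share within countries and months, estimate the crowding-out coefficient, and strip that component out. All data here are synthetic, and the single topic share is a stand-in for the paper's topic-model output.

```python
import numpy as np

rng = np.random.default_rng(2)
n_countries, n_months = 6, 40
country_fe = rng.normal(size=(n_countries, 1))
month_fe = rng.normal(size=(1, n_months))
conflict_share = rng.uniform(0, 1, size=(n_countries, n_months))

# Raw EPU falls mechanically as conflict coverage crowds out economics
# (true crowding-out coefficient set to -1.5 for this simulation).
epu_raw = country_fe + month_fe - 1.5 * conflict_share \
          + rng.normal(0, 0.1, size=(n_countries, n_months))

def two_way_demean(x):
    """Within transformation: remove country and month means (balanced panel)."""
    return x - x.mean(axis=1, keepdims=True) \
             - x.mean(axis=0, keepdims=True) + x.mean()

y, d = two_way_demean(epu_raw), two_way_demean(conflict_share)
beta = (d * y).sum() / (d * d).sum()           # two-way FE estimate
epu_adjusted = epu_raw - beta * conflict_share  # strip the crowding-out term
print(round(beta, 2))
```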

This nep-big issue is ©2025 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.