nep-big New Economics Papers
on Big Data
Issue of 2026–01–05
twenty papers chosen by
Tom Coupé, University of Canterbury


  1. Machine learning models for predicting catastrophe bond coupons using climate data By Julia Kończal; Michał Balcerek; Krzysztof Burnecki
  2. Alpha-R1: Alpha Screening with LLM Reasoning via Reinforcement Learning By Zuoyou Jiang; Li Zhao; Rui Sun; Ruohan Sun; Zhongjian Li; Jing Li; Daxin Jiang; Zuo Bai; Cheng Hua
  3. Parallel Trends Forest: Data-Driven Control Sample Selection in Difference-in-Differences By Yesol Huh; Matthew Kling
  4. The Nonstationarity-Complexity Tradeoff in Return Prediction By Agostino Capponi; Chengpiao Huang; J. Antonio Sidaoui; Kaizheng Wang; Jiacheng Zou
  5. Measuring Corruption from Text Data By Muço, Arieda
  6. Is smooth Energiewende possible? Improving the performance of climate policies in Germany by optimizing the risk of electricity delivery By Jakub Bandurski; Eliza Hałatek; Adam Łaziński; Michał Künstler
  7. Sentiment and Uncertainty Indices from economic news in Colombia By Rocío Clara A. Mora-Quiñones; Antonio José Orozco-Gallo; Dora Alicia Mora-Pérez
  8. Nowcasting GCC GDP: A Machine Learning Solution for Enhanced Non-Oil GDP Prediction By Greta Polo; Yuan Gao Rollinson; Ms. Yevgeniya Korniyenko; Tongfang Yuan
  9. Interpretable Deep Learning for Stock Returns: A Consensus-Bottleneck Asset Pricing Model By Bong-Gyu Jang; Younwoo Jeong; Changeun Kim
  10. Automated Credit Limit Increases and Consumer Welfare By Vitaly M. Bord; Agnes Kovacs; Patrick Moran
  11. From Tweets to Transactions: High-Frequency Inflation Expectations, Consumption, and Stock Returns By Benjamin Born; Nora Lamersdorf; Jana-Lynn Schuster; Sascha Steffen
  12. Model Estimation using Categorical Satellite Data with Misclassification By Wardle, Arthur R.; Bruno, Ellen
  13. Copyright and Competition: Estimating Supply and Demand with Unstructured Data By Sukjin Han; Kyungho Lee
  14. The Big Three in Marriage Talk: LLM-Assisted Analysis of Moral Ethics and Sentiment on Weibo and Xiaohongshu By Frank Tian-Fang Ye; Xiaozi Gao
  15. Childhood Aspirations and Adult Outcomes By Margaret Leighton; Irina Merkurieva
  16. When Artificial Minds Negotiate: Dark Personality and the Ultimatum Game in Large Language Models By Ferraz, Vinícius; Olah, Tamas; Sazedul, Ratin; Schmidt, Robert; Schwieren, Christiane
  17. Can LLMs Improve Sanctions Screening in the Financial System? Evidence from a Fuzzy Matching Assessment By Jeffrey Allen; Max S. Hatfield
  18. One Fed, Many Voices: Coordinated Communication vs. Transparent Debate By Milena Djourelova; Filippo Ferroni; Leonardo Melosi; Alessandro Villa
  19. Speaking to the Markets: The Role of IMF Announcements in Investors’ Confidence By Beatrice Maryline Sagna; Solo Zerbo
  20. One Sentence at a Time: A Quantitative History of Rationality in Economic Thought By Delcey, Thomas; Goutsmedt, Aurélien; Truc, Alexandre

  1. By: Julia Kończal; Michał Balcerek; Krzysztof Burnecki
    Abstract: In recent years, the growing frequency and severity of natural disasters have increased the need for effective tools to manage catastrophe risk. Catastrophe (CAT) bonds allow the transfer of part of this risk to investors, offering an alternative to traditional reinsurance. This paper examines the role of climate variability in CAT bond pricing and evaluates the predictive performance of various machine learning models in forecasting CAT bond coupons. We combine features typically used in the literature with a new set of climate indicators, including Oceanic Niño Index, Arctic Oscillation, North Atlantic Oscillation, Outgoing Longwave Radiation, Pacific-North American pattern, Pacific Decadal Oscillation, Southern Oscillation Index, and sea surface temperatures. We compare the performance of linear regression with several machine learning algorithms, such as random forest, gradient boosting, extremely randomized trees, and extreme gradient boosting. Our results show that including climate-related variables improves predictive accuracy across all models, with extremely randomized trees achieving the lowest root mean squared error (RMSE). These findings suggest that large-scale climate variability has a measurable influence on CAT bond pricing and that machine learning methods can effectively capture these complex relationships.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.22660
  2. By: Zuoyou Jiang; Li Zhao; Rui Sun; Ruohan Sun; Zhongjian Li; Jing Li; Daxin Jiang; Zuo Bai; Cheng Hua
    Abstract: Signal decay and regime shifts pose recurring challenges for data-driven investment strategies in non-stationary markets. Conventional time-series and machine learning approaches, which rely primarily on historical correlations, often struggle to generalize when the economic environment changes. While large language models (LLMs) offer strong capabilities for processing unstructured information, their potential to support quantitative factor screening through explicit economic reasoning remains underexplored. Existing factor-based methods typically reduce alphas to numerical time series, overlooking the semantic rationale that determines when a factor is economically relevant. We propose Alpha-R1, an 8B-parameter reasoning model trained via reinforcement learning for context-aware alpha screening. Alpha-R1 reasons over factor logic and real-time news to evaluate alpha relevance under changing market conditions, selectively activating or deactivating factors based on contextual consistency. Empirical results across multiple asset pools show that Alpha-R1 consistently outperforms benchmark strategies and exhibits improved robustness to alpha decay. The full implementation and resources are available at https://github.com/FinStep-AI/Alpha-R1.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.23515
  3. By: Yesol Huh; Matthew Kling
    Abstract: This paper introduces parallel trends forest, a novel approach to selecting optimal control samples when using difference-in-differences (DiD) in relatively long panel data with little randomization in treatment assignment. Our method uses machine learning techniques to find control units that best meet the parallel trends assumption. We demonstrate that our approach outperforms existing methods, particularly with noisy, granular data. Applying the parallel trends forest to analyze the impact of post-trade transparency in corporate bond markets, we find that it produces more robust estimates compared to traditional two-way fixed effects models. Our results suggest that the effect of transparency on bond turnover is small and not statistically significant when allowing for constrained deviations from parallel trends. This method offers researchers a powerful tool for conducting more reliable DiD analyses in complex, real-world settings.
    Keywords: Causal inference; Difference-in-differences; Parallel trends assumption; Random forest
    JEL: C10 C21 C23 G12
    Date: 2025–09–29
    URL: https://d.repec.org/n?u=RePEc:fip:fedgfe:2025-91
  4. By: Agostino Capponi; Chengpiao Huang; J. Antonio Sidaoui; Kaizheng Wang; Jiacheng Zou
    Abstract: We investigate machine learning models for stock return prediction in non-stationary environments, revealing a fundamental nonstationarity-complexity tradeoff: complex models reduce misspecification error but require longer training windows that introduce stronger non-stationarity. We resolve this tension with a novel model selection method that jointly optimizes model class and training window size using a tournament procedure that adaptively evaluates candidates on non-stationary validation data. Our theoretical analysis demonstrates that this approach balances misspecification error, estimation variance, and non-stationarity, performing close to the best model in hindsight. Applying our method to 17 industry portfolio returns, we consistently outperform standard rolling-window benchmarks, improving out-of-sample $R^2$ by 14-23% on average. During NBER-designated recessions, improvements are substantial: our method achieves positive $R^2$ during the Gulf War recession while benchmarks are negative, improves $R^2$ in absolute terms by at least 80 bps during the 2001 recession, and delivers superior performance during the 2008 Financial Crisis. Economically, a trading strategy based on our selected model generates 31% higher cumulative returns averaged across the industries.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.23596
  5. By: Muço, Arieda
    Abstract: Using Brazilian municipal audit reports, I construct an automated corruption index that combines a dictionary of audit irregularities with principal component analysis. The index validates strongly against independent human coders, explaining 71–73% of the variation in hand-coded corruption counts in samples where coders themselves exhibit high agreement, and the results are robust within these validation samples. The index behaves as theory predicts, correlating with municipal characteristics that prior research links to corruption. Supervised learning alternatives yield nearly identical municipal rankings (R² = 0.98), confirming that the dictionary approach captures the same underlying construct. The method scales to the full audit corpus and offers advantages over both manual coding and Large Language Models (LLMs) in transparency, cost, and long-run replicability.
    Date: 2025–12–12
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:cftvk_v1
  6. By: Jakub Bandurski (University of Warsaw, Faculty of Economic Sciences); Eliza Hałatek (University of Warsaw, Faculty of Economic Sciences); Adam Łaziński (University of Warsaw, Faculty of Economic Sciences); Michał Künstler (University of Warsaw, Faculty of Economic Sciences)
    Abstract: The Energiewende is a deep-rooted notion in the German economy. The main goal is to achieve climate neutrality by transitioning to renewable energy sources. However, the feasibility of this transition is partially hindered by power grid congestion, which undermines system efficiency and leads to both economic and environmental costs. We address this issue by predicting the likelihood of congestion occurrence within the German TenneT DE electricity network in the years 2020-2023. We propose a twofold approach offering a combination of advanced econometric models and state-of-the-art machine learning methods. We offer separate solutions for up congestion, when additional energy needs to be pushed to the network, as well as down congestion, when energy needs to be pulled away from the network. Analyzing the CatBoost model with XAI, we identify factors that play a significant role in driving redispatch events within the German electricity network.
    Keywords: energiewende, econometrics, machine learning, climate policy, catboost
    JEL: Q47 Q48 Q54 C01 C53
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:war:wpaper:2025-30
  7. By: Rocío Clara A. Mora-Quiñones; Antonio José Orozco-Gallo; Dora Alicia Mora-Pérez
    Abstract: This study introduces an approach for measuring sentiment and uncertainty indices in Colombia through text mining. Economic news from digital media, spanning March 2020 to September 2024, is analyzed using dictionary-based methods and predefined word lists. The constructed indices reflect major macroeconomic events, such as the phased reopening during the pandemic, the national strike in May 2021, and the decline in demand associated with elevated inflation. These indices function as leading indicators and exhibit statistically significant associations with high-frequency economic data. Incorporating news-based sentiment and uncertainty indices improves the precision of nowcasting Colombia’s economic activity using a dynamic factor model. The results indicate that incorporating qualitative, forward-looking news with traditional data enhances the monitoring of short-term economic fluctuations and the identification of turning points. *****SUMMARY (translated from Spanish): This study presents a method for measuring economic sentiment and uncertainty in Colombia using text-mining techniques. The sentiment and uncertainty indices were built from news published between March 2020 and September 2024, using dictionary methodologies based on predefined lists of positive and negative words. These indices identified relevant macroeconomic episodes, such as the gradual reopening after the pandemic, the 2021 National Strike, and the slowdown in demand amid elevated inflation. The indices exhibit the properties of leading series and maintain statistically significant relationships with high-frequency economic variables. The empirical analysis shows that incorporating them into dynamic factor models systematically improves the accuracy of forecasts of economic activity. The results show that the qualitative, forward-looking information contained in the news complements traditional data and strengthens the ability to determine short-run dynamics and anticipate turning points in Colombian economic activity.
    Keywords: sentiment, uncertainty, artificial intelligence, text analysis techniques, natural language processing, dynamic factor model
    JEL: C53 C82 E27
    Date: 2026–01
    URL: https://d.repec.org/n?u=RePEc:bdr:borrec:1340
  8. By: Greta Polo; Yuan Gao Rollinson; Ms. Yevgeniya Korniyenko; Tongfang Yuan
    Abstract: This paper presents a machine learning–based nowcasting framework for estimating quarterly non-oil GDP growth in the Gulf Cooperation Council (GCC) countries. Leveraging machine learning models tailored to each country, the framework integrates a broad range of high-frequency indicators—including real activity, financial conditions, trade, and oil-related variables—to produce timely, sector-specific estimates. Advancing the nowcasting literature for the MENA region, this approach moves beyond single-model methodologies by incorporating a richer set of high-frequency, cross-border indicators. It presents two key innovations: (i) a tailored data integration strategy that broadens and automates the use of high-frequency indicators; and (ii) a novel application of Shapley value decompositions to enhance model interpretability and guide the iterative selection of predictive indicators. The framework’s flexibility allows it to account for the region’s unique economic structures, ongoing reform agendas, and the spillover effects of oil market volatility on non-oil sectors. By enhancing the granularity, responsiveness, and transparency of short-term forecasts, the model enables faster, data-driven policy decisions, strengthening economic surveillance and enhancing policy agility across the GCC amid a rapidly evolving global environment.
    Keywords: GCC; Nowcasting; Machine Learning; Non-oil Growth
    Date: 2025–12–19
    URL: https://d.repec.org/n?u=RePEc:imf:imfwpa:2025/268
  9. By: Bong-Gyu Jang; Younwoo Jeong; Changeun Kim
    Abstract: We introduce the Consensus-Bottleneck Asset Pricing Model (CB-APM), a partially interpretable neural network that replicates the reasoning processes of sell-side analysts by capturing how dispersed investor beliefs are compressed into asset prices through a consensus formation process. By modeling this “bottleneck” to summarize firm- and macro-level information, CB-APM not only predicts future risk premiums of U.S. equities but also links belief aggregation to expected returns in a structurally interpretable manner. The model improves long-horizon return forecasts and outperforms standard deep learning approaches in both predictive accuracy and explanatory power. Comprehensive portfolio analyses show that CB-APM's out-of-sample predictions translate into economically meaningful payoffs, with monotonic return differentials and stable long-short performance across regularization settings. Empirically, CB-APM leverages consensus as a regularizer to amplify long-horizon predictability and yields interpretable consensus-based components that clarify how information is priced in returns. Moreover, regression and GRS-based pricing diagnostics reveal that the learned consensus representations capture priced variation only partially spanned by traditional factor models, demonstrating that CB-APM uncovers belief-driven structure in expected returns beyond the canonical factor space. Overall, CB-APM provides an interpretable and empirically grounded framework for understanding belief-driven return dynamics.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.16251
  10. By: Vitaly M. Bord; Agnes Kovacs; Patrick Moran
    Abstract: In the United States, credit card companies frequently use machine learning algorithms to proactively raise credit limits for borrowers. In contrast, an increasing number of countries have begun to prohibit credit limit increases initiated by banks rather than consumers. In this paper, we exploit detailed regulatory micro data to examine the extent to which bank-initiated credit limit increases are directed towards individuals with revolving debt. We then develop a model that captures the costs and benefits of regulating proactive credit limit increases, which we use to quantify their importance and evaluate the implications for household well-being.
    Keywords: Algorithmic lending; Behavioral finance; Consumer protection; Credit cards; Credit limit increases; Financial regulation
    JEL: D14 D18 D91 G21 G28 G51 L51
    Date: 2025–09–24
    URL: https://d.repec.org/n?u=RePEc:fip:fedgfe:2025-88
  11. By: Benjamin Born; Nora Lamersdorf; Jana-Lynn Schuster; Sascha Steffen
    Abstract: Using modern natural language processing, we construct a high-frequency inflation expectations index from German-language tweets. This index closely tracks realized inflation and aligns even more closely with household survey expectations. It also improves short-run forecasts relative to standard benchmarks. In response to monetary policy tightening, the index declines within about a week, with the effects concentrated in tweets by private individuals and during the recent period of elevated inflation. Using 117 million online transactions from German retailers, we show that higher inflation expectations are followed by lower household spending on discretionary goods. By linking these shifts in demand to stock returns, we find that, during periods of elevated inflation, firms operating in discretionary sectors experience significantly lower stock returns when inflation expectations rise. Thus, our Twitter-based index provides market participants and policymakers with a timely tool to monitor inflation sentiment and its economic consequences.
    Keywords: inflation expectations, social media (Twitter/X), large language models (LLMs), NLP, household consumption, stock returns, monetary policy
    JEL: E31 D84 E58 C45 C81
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ces:ceswps:_12361
  12. By: Wardle, Arthur R.; Bruno, Ellen
    Keywords: Research Methods/Statistical Methods, Crop Production/Industries
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:ags:aaea24:343766
  13. By: Sukjin Han; Kyungho Lee
    Abstract: We study the competitive and welfare effects of copyright in creative industries in the face of cost-reducing technologies such as generative artificial intelligence. Creative products often feature unstructured attributes (e.g., images and text) that are complex and high-dimensional. To address this challenge, we study a stylized design product—fonts—using data from the world’s largest font marketplace. We construct neural network embeddings to quantify unstructured attributes and measure visual similarity in a manner consistent with human perception. Spatial regression and event-study analyses demonstrate that competition is local in the visual characteristics space. Building on this evidence, we develop a structural model of supply and demand that incorporates embeddings and captures product positioning under copyright-based similarity constraints. Our estimates reveal consumers’ heterogeneous design preferences and producers’ cost-effective mimicry advantages. Counterfactual analyses show that copyright protection can raise consumer welfare by encouraging product relocation, and that the optimal policy depends on the interaction between copyright and cost-reducing technologies.
    Date: 2025–04–02
    URL: https://d.repec.org/n?u=RePEc:bri:uobdis:25/816
  14. By: Frank Tian-Fang Ye (Division of Social Sciences, The HKU SPACE Community College, Hong Kong SAR, PRC); Xiaozi Gao (Department of Early Childhood Education, Education University of Hong Kong, Hong Kong SAR, PRC)
    Abstract: China's marriage registrations have declined dramatically, dropping from 13.47 million couples in 2013 to 6.1 million in 2024. Understanding public attitudes toward marriage requires examining not only emotional sentiment but also the moral reasoning underlying these evaluations. This study analyzed 219,358 marriage-related posts from two major Chinese social media platforms (Sina Weibo and Xiaohongshu) using large language model (LLM)-assisted content analysis. Drawing on Shweder's Big Three moral ethics framework, posts were coded for sentiment (positive, negative, neutral) and moral dimensions (Autonomy, Community, Divinity). Results revealed platform differences: Weibo discourse skewed positive, while Xiaohongshu was predominantly neutral. Most posts across both platforms lacked explicit moral framing. However, when moral ethics were invoked, significant associations with sentiment emerged. Posts invoking Autonomy ethics and Community ethics were predominantly negative, whereas Divinity-framed posts tended toward neutral or positive sentiment. These findings suggest that concerns about both personal autonomy constraints and communal obligations drive negative marriage attitudes in contemporary China. The study demonstrates LLMs' utility for scaling qualitative analysis and offers insights for developing culturally informed policies addressing marriage decline in Chinese contexts.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.23609
  15. By: Margaret Leighton (University of St Andrews); Irina Merkurieva (University of St Andrews)
    Abstract: This paper extracts aspirations from texts written in childhood by members of a British longitudinal cohort and explores how these relate to later life outcomes. Applying Natural Language Processing (NLP) tools to short essays collected at age 11, we identify four aspiration themes: family, hobbies, financial success, and career. The weight of these four themes varies substantially across respondents, with girls on average placing more weight on family, and boys on financial success. Aspirations extracted using our method are strongly predictive of later life outcomes, even when controlling for detailed measures of early life environment, ability, and family background. These associations are often highly heterogeneous by gender; for example, family-related aspirations are associated with higher educational attainment for men, but lower educational attainment for women.
    Keywords: Aspirations; Education; Natural Language Processing; NCDS
    JEL: J24 J26 Z13
    Date: 2025–12–17
    URL: https://d.repec.org/n?u=RePEc:san:econdp:2505
  16. By: Ferraz, Vinícius; Olah, Tamas; Sazedul, Ratin; Schmidt, Robert; Schwieren, Christiane
    Abstract: We investigate whether Large Language Models (LLMs) exhibit personality-driven strategic behavior in the Ultimatum Game by manipulating Dark Factor of Personality (D-Factor) profiles via standardized prompts. Across 400k decisions from 17 open-source models and 4,166 human benchmarks, we test whether LLMs playing the proposer and responder roles exhibit systematic behavioral shifts across five D-Factor levels (from least to most selfish). The proposer role exhibited strong monotonic declines in fair offers from 91% (D1) to 17% (D5), mirroring human patterns but with 34% steeper gradients, indicating hypersensitivity to personality prompts. Responders diverged sharply: where humans became more punitive at higher D-levels, LLMs maintained high acceptance rates (75-92%) with weak or reversed D-Factor sensitivity, failing to reproduce reciprocity-punishment dynamics. These role-specific patterns align with strong-weak situation accounts—personality matters when incentives are ambiguous (proposers) but is muted when contingent (responders). Cross-model heterogeneity was substantial: Models exhibiting the closest alignment with human behavior, according to composite similarity scores (integrating prosocial rates, D-Factor correlations, and odds ratios), were dolphin3, deepseek_1.5b, and llama3.2 (0.74-0.85), while others exhibited extreme or non-variable behavior. Temperature settings (0.2 vs. 0.8) exerted minimal influence. We interpret these patterns as prompt-driven regularities rather than genuine motivational processes, suggesting LLMs can approximate but not fully replicate human strategic behavior in social dilemmas.
    Date: 2025–12–16
    URL: https://d.repec.org/n?u=RePEc:awi:wpaper:0768
  17. By: Jeffrey Allen; Max S. Hatfield
    Abstract: We examined the performance of four families of large language models (LLMs) and a variety of common fuzzy matching algorithms in assessing the similarity of names and addresses in a sanctions screening context. On average, across a range of realistic matching thresholds, the LLMs in our study reduced sanctions screening false positives by 92 percent and increased detection rates by 11 percent relative to the best-performing fuzzy matching baseline. Smaller, less computationally intensive models from the same language model families performed comparably, which may support scaling. In terms of computing performance, the LLMs were, on average, over four orders of magnitude slower than the fuzzy methods. To help address this, we propose a model cascade that escalates higher uncertainty screening cases to LLMs, while relying on fuzzy and exact matching for easier cases. The cascade is nearly twice as fast and just as accurate as the pure LLM system. We show even stronger runtime gains and comparable screening accuracy by relying on the fastest language models within the cascade. In the near term, the economic cost of running LLMs, inference latency, and other frictions, including API limits, will likely necessitate using these types of tiered approaches for sanctions screening in high-velocity and high-throughput financial activities, such as payments. Sanctions screening in slower-moving processes, such as customer due diligence for account opening and lending, may be able to rely on LLMs more extensively.
    Keywords: Large Language Models; Sanctions Screening; Model cascading
    Date: 2025–09–29
    URL: https://d.repec.org/n?u=RePEc:fip:fedgfe:2025-92
  18. By: Milena Djourelova; Filippo Ferroni; Leonardo Melosi; Alessandro Villa
    Abstract: We analyze 481 speeches by FOMC members since 2007, excluding official press conferences. Combining high-frequency financial data with text analysis, we identify monetary policy surprises and measure each speech’s similarity to the Chair’s press conference preceding it. On average, monetary surprises around these speeches have no significant effect on inflation expectations or stock prices. Yet, speeches closely aligned with the Chair’s press conference amplify policy transmission, while less coordinated remarks dilute earlier effects on yields, inflation expectations, and equities. A general equilibrium model with incomplete information rationalizes these findings.
    Keywords: monetary policy communication; FOMC; Text Analysis; Central bank; Market expectations
    JEL: C55 D83 E52 E58 G14
    Date: 2025–11–20
    URL: https://d.repec.org/n?u=RePEc:fip:fedhwp:102273
  19. By: Beatrice Maryline Sagna; Solo Zerbo
    Abstract: This paper examines the effect of IMF staff and executive board announcements on sovereign bond spreads across emerging and developing economies during economic uncertainty. We derive testable predictions from a stylized model in which IMF announcements serve as a signal to ambiguity-averse investors, narrowing their posterior beliefs about macroeconomic parameters and lowering the ambiguity premium in bond pricing. To empirically assess this, we train a large language model to identify and categorize economic signals by topic, assessing both the sentiment and the intensity of their coverage. The analysis yields several key insights. First, IMF announcements exhibit strong interdependencies among topics, indicating a holistic approach to economic diagnostics. Second, using a local projection approach, we show that an increase in the positivity of an IMF press release’s tone leads to a significant reduction in sovereign spreads in the short term. This effect is more pronounced for countries with higher spreads, larger program size, and for topics such as debt, FX and reserves, and fiscal.
    Keywords: Economic Uncertainty; IMF Announcements; Sovereign Spreads; Ambiguity Premium; Large Language Model; Local Projection
    Date: 2025–12–12
    URL: https://d.repec.org/n?u=RePEc:imf:imfwpa:2025/258
  20. By: Delcey, Thomas; Goutsmedt, Aurélien (UC Louvain - F.R.S-FNRS); Truc, Alexandre
    Abstract: This article demonstrates how unsupervised quantitative methods can enrich the history of economic thought. Using the largest English-language corpus ever assembled for the field—nearly 290,000 economics journal articles from 1900 to 2009 with citation data—we analyze the evolution of the concept of rationality. Combining large language model–based semantic analysis with bibliometric and network methods, we identify and cluster discussions of rationality across time and scales, such as the circulation of bounded rationality and the emergence of behavioral economics. We provide an open-source interactive tool to support transparency and reuse.
    Date: 2025–12–23
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:38na2_v1

This nep-big issue is ©2026 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.