nep-big New Economics Papers
on Big Data
Issue of 2025–10–20
twenty-one papers chosen by
Tom Coupé, University of Canterbury


  1. From Headlines to Holdings: Deep Learning for Smarter Portfolio Decisions By Yun Lin; Jiawei Lou; Jinghe Zhang
  2. Interpretable Machine Learning for Predicting Startup Funding, Patenting, and Exits By Saeid Mashhadi; Amirhossein Saghezchi; Vesal Ghassemzadeh Kashani
  3. Neural Network Convergence for Variational Inequalities By Yun Zhao; Harry Zheng
  4. Deep Learning in the Sequence Space By Marlon Azinovic-Yang; Jan Žemlička
  5. Identifying and Quantifying Financial Bubbles with the Hyped Log-Periodic Power Law Model By Zheng Cao; Xingran Shao; Yuheng Yan; Helyette Geman
  6. Denoised IPW-Lasso for Heterogeneous Treatment Effect Estimation in Randomized Experiments By Mingqian Guan; Komei Fujita; Naoya Sueishi; Shota Yasui
  7. Determinants of Latin American students academic resilience-Insights based on PISA 2022 using an explainable machine learning approach By Marcos Delprato
  8. Application of Deep Reinforcement Learning to At-the-Money S&P 500 Options Hedging By Zofia Bracha; Paweł Sakowski; Jakub Michańków
  9. Macroeconomic Forecasting and Machine Learning By Ta-Chung Chi; Ting-Han Fan; Raffaele M. Ghigliazza; Domenico Giannone; Zixuan Wang
  10. Continuous-Time Reinforcement Learning for Asset-Liability Management By Yilie Huang
  11. A Practitioner's Guide to AI+ML in Portfolio Investing By Mehmet Caner; Qingliang Fan
  12. Multimodal Language Models with Modality-Specific Experts for Financial Forecasting from Interleaved Sequences of Text and Time Series By Ross Koval; Nicholas Andrews; Xifeng Yan
  13. Forecasting Liquidity Withdraw with Machine Learning Models By Haochuan Wang
  14. Sensitivity Analysis for Causal ML: A Use Case at Booking.com By Philipp Bach; Victor Chernozhukov; Carlos Cinelli; Lin Jia; Sven Klaassen; Nils Skotara; Martin Spindler
  15. Board Gender Diversity and Carbon Emissions Performance: Insights from Panel Regressions, Machine Learning and Explainable AI By Mohammad Hassan Shakil; Arne Johan Pollestad; Khine Kyaw; Ziaul Haque Munim
  16. Mapping the space of central bankers' ideas By Taejin Park; Fernando Perez-Cruz; Hyun Song Shin
  17. Predictive Performance of LSTM Networks on Sectoral Stocks in an Emerging Market: A Case Study of the Pakistan Stock Exchange By Ahad Yaqoob; Syed M. Abdullah
  18. Recidivism and Peer Influence with LLM Text Embeddings in Low Security Correctional Facilities By Shanjukta Nath; Jiwon Hong; Jae Ho Chang; Keith Warren; Subhadeep Paul
  19. Leveraging LLMs to Improve Experimental Design: A Generative Stratification Approach By George Gui; Seungwoo Kim
  20. Extracting the Structure of Press Releases for Predicting Earnings Announcement Returns By Yuntao Wu; Ege Mert Akin; Charles Martineau; Vincent Grégoire; Andreas Veneris
  21. Beyond Words: Fed Chairs' Voice Sentiments and US Bank Stock Price Crash Risk By Dimitrios Anastasiou; Apostolos G. Katsafados; Steven Ongena; Christos Tzomakas

  1. By: Yun Lin; Jiawei Lou; Jinghe Zhang
    Abstract: Deep learning offers new tools for portfolio optimization. We present an end-to-end framework that directly learns portfolio weights by combining Long Short-Term Memory (LSTM) networks to model temporal patterns, Graph Attention Networks (GAT) to capture evolving inter-stock relationships, and sentiment analysis of financial news to reflect market psychology. Unlike prior approaches, our model unifies these elements in a single pipeline that produces daily allocations. It avoids the traditional two-step process of forecasting asset returns and then applying mean-variance optimization (MVO), a sequence that can introduce instability. We evaluate the framework on nine U.S. stocks spanning six sectors, chosen to balance sector diversity and news coverage. In this setting, the model delivers higher cumulative returns and Sharpe ratios than equal-weighted and CAPM-based MVO benchmarks. Although the stock universe is limited, the results underscore the value of integrating price, relational, and sentiment signals for portfolio management and suggest promising directions for scaling the approach to larger, more diverse asset sets.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.24144
  2. By: Saeid Mashhadi; Amirhossein Saghezchi; Vesal Ghassemzadeh Kashani
    Abstract: This study develops an interpretable machine learning framework to forecast startup outcomes, including funding, patenting, and exit. A firm-quarter panel for 2010-2023 is constructed from Crunchbase and matched to U.S. Patent and Trademark Office (USPTO) data. Three horizons are evaluated: next funding within 12 months, patent-stock growth within 24 months, and exit through an initial public offering (IPO) or acquisition within 36 months. Preprocessing is fit on a development window (2010-2019) and applied without change to later cohorts to avoid leakage. Class imbalance is addressed using inverse-prevalence weights and the Synthetic Minority Oversampling Technique for Nominal and Continuous features (SMOTE-NC). Logistic regression and tree ensembles, including Random Forest, XGBoost, LightGBM, and CatBoost, are compared using the area under the precision-recall curve (PR-AUC) and the area under the receiver operating characteristic curve (AUROC). Patent, funding, and exit predictions achieve AUROC values of 0.921, 0.817, and 0.872, providing transparent and reproducible rankings for innovation finance.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.09465
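The inverse-prevalence weighting mentioned in abstract 2 can be sketched as follows. This is only an illustration, not the authors' code: the function name and toy labels are hypothetical, and the normalization n / (k * n_c) is an assumption borrowed from the common "balanced" class-weight heuristic.

```python
from collections import Counter


def inverse_prevalence_weights(labels):
    """Map each class to a weight inversely proportional to its frequency.

    Assumed normalization: n_samples / (n_classes * class_count), so a
    perfectly balanced sample would give every class a weight of 1.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy labels: nine negatives, one positive (e.g. a rare exit event).
labels = [0] * 9 + [1]
weights = inverse_prevalence_weights(labels)
print(weights)  # the rare class receives the larger weight
```

Such weights would typically be passed to a classifier's per-sample or per-class weight argument so that errors on the rare class count more heavily.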
  3. By: Yun Zhao; Harry Zheng
    Abstract: We propose an approach to applying neural networks to linear parabolic variational inequalities. We use loss functions that directly incorporate the variational inequality on the whole domain to bypass the need to determine the stopping region in advance and prove the existence of neural networks whose losses converge to zero. We also prove the functional convergence in the Sobolev space. We then apply our approach to solving an optimal investment and stopping problem in finance. By leveraging duality, we convert the nonlinear HJB-type variational inequality of the primal problem into a linear variational inequality of the dual problem and prove the convergence of the primal value function from the dual neural network solution, an outcome made possible by our Sobolev norm analysis. We illustrate the versatility and accuracy of our method with numerical examples for both power and non-HARA utilities as well as high-dimensional American put option pricing. Our results underscore the potential of neural networks for solving variational inequalities in optimal stopping and control problems.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.26535
  4. By: Marlon Azinovic-Yang; Jan Žemlička
    Abstract: We develop a deep learning algorithm for approximating functional rational expectations equilibria of dynamic stochastic economies in the sequence space. We use deep neural networks to parameterize equilibrium objects of the economy as a function of truncated histories of exogenous shocks. We train the neural networks to fulfill all equilibrium conditions along simulated paths of the economy. To illustrate the performance of our method, we solve three economies of increasing complexity: the stochastic growth model, a high-dimensional overlapping generations economy with multiple sources of aggregate risk, and finally an economy where households and firms face uninsurable idiosyncratic risk, shocks to aggregate productivity, and shocks to idiosyncratic and aggregate volatility. Furthermore, we show how to design practical neural policy function architectures that guarantee monotonicity of the predicted policies, facilitating the use of the endogenous grid method to simplify parts of our algorithm.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.13623
  5. By: Zheng Cao; Xingran Shao; Yuheng Yan; Helyette Geman
    Abstract: We propose a novel model, the Hyped Log-Periodic Power Law Model (HLPPL), for the problem of quantifying and detecting financial bubbles, an ever-fascinating one for academics and practitioners alike. Bubble labels are generated using a Log-Periodic Power Law (LPPL) model, sentiment scores, and a hype index we introduced in previous research on NLP forecasting of stock return volatility. Using these tools, a dual-stream transformer model is trained with market data and machine learning methods, resulting in a time series of confidence scores as a Bubble Score. A distinctive feature of our framework is that it captures phases of extreme overpricing and underpricing within a unified structure. We achieve an average annualized return of 34.13 percent when backtesting U.S. equities during the period 2018 to 2024, while the approach exhibits a remarkable generalization ability across industry sectors. Its conservative bias in predicting bubble periods minimizes false positives, a feature which is especially beneficial for market signaling and decision-making. Overall, this approach utilizes both theoretical and empirical advances for real-time positive and negative bubble identification and measurement with HLPPL signals.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.10878
  6. By: Mingqian Guan; Komei Fujita; Naoya Sueishi; Shota Yasui
    Abstract: This paper proposes a new method for estimating conditional average treatment effects (CATE) in randomized experiments. We adopt inverse probability weighting (IPW) for identification; however, IPW-transformed outcomes are known to be noisy, even when true propensity scores are used. To address this issue, we introduce a noise reduction procedure and estimate a linear CATE model using Lasso, achieving both accuracy and interpretability. We theoretically show that denoising reduces the prediction error of the Lasso. The method is particularly effective when treatment effects are small relative to the variability of outcomes, which is often the case in empirical applications. Applications to the Get-Out-the-Vote dataset and Criteo Uplift Modeling dataset demonstrate that our method outperforms fully nonparametric machine learning methods in identifying individuals with higher treatment effects. Moreover, our method uncovers informative heterogeneity patterns that are consistent with previous empirical findings.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.10527
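To illustrate why abstract 6 calls IPW-transformed outcomes noisy, the sketch below builds the standard transformed outcome Y * (T/e - (1-T)/(1-e)) for a randomized experiment with a known propensity e. The function name and the toy data are hypothetical, and the paper's denoising procedure is not reproduced here.

```python
from statistics import mean, pvariance


def ipw_transformed_outcomes(outcomes, treated, propensity=0.5):
    """Return Y * (T/e - (1-T)/(1-e)) for each unit.

    Under randomization with known propensity e, the conditional mean of
    this quantity given covariates equals the CATE.
    """
    return [
        y * (t / propensity - (1 - t) / (1 - propensity))
        for y, t in zip(outcomes, treated)
    ]


# Hypothetical toy data: baseline outcome 10, treatment adds 1.
outcomes = [11.0, 10.0, 11.0, 10.0]
treated = [1, 0, 1, 0]

z = ipw_transformed_outcomes(outcomes, treated)
print(mean(z))              # average effect recovered from transformed outcomes
print(pvariance(outcomes))  # spread of the raw outcomes
print(pvariance(z))         # spread of the transformed outcomes (much larger)
```

In this toy case the transformed outcomes average to the true effect of 1, but their variance is far larger than that of the raw outcomes, which is the noise problem a denoising step would target.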
  7. By: Marcos Delprato
    Abstract: The learning crisis in the Latin American region (i.e., higher rates of students not reaching basic competencies at secondary level) is worrying, particularly post-pandemic given the stronger role of inequality behind achievement. Within this scenario, the concept of student academic resilience (SAR), that is, students who reach good performance levels despite coming from disadvantaged backgrounds, and an analysis of its determinants are policy relevant. In this paper, using advances in explainable machine learning methods (the SHAP method) and relying on PISA 2022 data for 9 countries from the region, I identify leading factors behind SAR using diverse indicators. I find that household inputs (books and digital devices), gender, homework, repetition and work intensity are leading factors for one indicator of academic resilience, whereas for the other indicator the leading drivers fall into the school domain: school size, the ratio of PCs connected to the internet, STR, teaching quality proxied by certified teachers and professional development rates, and school type (private school). Also, I find negative associations of SAR with the length of school closures and barriers to remote learning during the pandemic. The paper's findings add to the scarce regional literature and contribute to future policy designs where key features behind SAR can be used to lift disadvantaged students from lower achievement groups towards being academically resilient.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.24830
  8. By: Zofia Bracha; Paweł Sakowski; Jakub Michańków
    Abstract: This paper explores the application of deep Q-learning to hedging at-the-money options on the S&P 500 index. We develop an agent based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, trained to simulate hedging decisions without making explicit model assumptions on price dynamics. The agent was trained on historical intraday prices of S&P 500 call options across years 2004-2024, using a single time series of six predictor variables: option price, underlying asset price, moneyness, time to maturity, realized volatility, and current hedge position. A walk-forward procedure was applied for training, which led to nearly 17 years of out-of-sample evaluation. The performance of the deep reinforcement learning (DRL) agent is benchmarked against the Black-Scholes delta-hedging strategy over the same period. We assess both approaches using metrics such as annualized return, volatility, information ratio, and Sharpe ratio. To test the models' adaptability, we performed simulations across varying market conditions and added constraints such as transaction costs and risk-awareness penalties. Our results show that the DRL agent can outperform traditional hedging methods, particularly in volatile or high-cost environments, highlighting its robustness and flexibility in practical trading contexts. While the agent consistently outperforms delta-hedging, its performance deteriorates when the risk-awareness parameter is higher. We also observed that the longer the time interval used for volatility estimation, the more stable the results.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.09247
  9. By: Ta-Chung Chi; Ting-Han Fan; Raffaele M. Ghigliazza; Domenico Giannone; Zixuan (Kevin) Wang
    Abstract: We forecast the full conditional distribution of macroeconomic outcomes by systematically integrating three key principles: using high-dimensional data with appropriate regularization, adopting rigorous out-of-sample validation procedures, and incorporating nonlinearities. By exploiting the rich information embedded in a large set of macroeconomic and financial predictors, we produce accurate predictions of the entire profile of macroeconomic risk in real time. Our findings show that regularization via shrinkage is essential to control model complexity, while introducing nonlinearities yields limited improvements in predictive accuracy. Out-of-sample validation plays a critical role in selecting model architecture and preventing overfitting.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.11008
  10. By: Yilie Huang
    Abstract: This paper proposes a novel approach for Asset-Liability Management (ALM) by employing continuous-time Reinforcement Learning (RL) with a linear-quadratic (LQ) formulation that incorporates both interim and terminal objectives. We develop a model-free, policy gradient-based soft actor-critic algorithm tailored to ALM for dynamically synchronizing assets and liabilities. To ensure an effective balance between exploration and exploitation with minimal tuning, we introduce adaptive exploration for the actor and scheduled exploration for the critic. Our empirical study evaluates this approach against two enhanced traditional financial strategies, a model-based continuous-time RL method, and three state-of-the-art RL algorithms. Evaluated across 200 randomized market scenarios, our method achieves higher average rewards than all alternative strategies, with rapid initial gains and sustained superior performance. The outperformance stems not from complex neural networks or improved parameter estimation, but from directly learning the optimal ALM strategy without learning the environment.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.23280
  11. By: Mehmet Caner; Qingliang Fan
    Abstract: In this review, we provide practical guidance on some of the main machine learning tools used in portfolio weight formation. This is not an exhaustive list, but a selection of the tools in use that have some statistical analysis behind them. All this research is essentially tied to the precision matrix of excess asset returns. Our main point is that the techniques should be used in conjunction with outlined objective functions. In other words, there should be joint analysis of the Machine Learning (ML) technique with the possible portfolio choice-objective functions in terms of test period Sharpe Ratio or returns. The ML method with the best objective function should provide the weight for portfolio formation. Empirically, we analyze five out-of-sample time periods of interest and show the performance of some ML-Artificial Intelligence (AI) methods. We see that nodewise regression with Global Minimum Variance portfolio-based weights delivers very good Sharpe Ratio and returns across the five time periods in this century that we analyze. We cover three downturns and two long-term investment spans.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.25456
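The link between the precision matrix and portfolio weights in abstract 11 can be seen in the textbook Global Minimum Variance formula w = Θ1 / (1'Θ1), where Θ is the precision (inverse covariance) matrix. The sketch below assumes an estimate of Θ is already available; the function name and the toy matrix are hypothetical, and the nodewise-regression estimator itself is not shown.

```python
def gmv_weights(precision):
    """Global Minimum Variance weights from a precision matrix.

    w = Theta @ 1 / (1' @ Theta @ 1): the row sums of Theta, normalized
    so that the weights sum to one.
    """
    row_sums = [sum(row) for row in precision]
    total = sum(row_sums)
    if total == 0:
        raise ValueError("precision matrix rows sum to zero; weights undefined")
    return [r / total for r in row_sums]


# Hypothetical example: two uncorrelated assets with variances 0.04 and 0.01,
# so the precision matrix is diag(25, 100).
precision = [[25.0, 0.0], [0.0, 100.0]]
w = gmv_weights(precision)
print(w)  # the lower-variance asset receives the larger weight
```

Because the weights depend on returns only through Θ, any improvement in the precision-matrix estimate feeds directly into the portfolio.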
  12. By: Ross Koval; Nicholas Andrews; Xifeng Yan
    Abstract: Text and time series data offer complementary views of financial markets: news articles provide narrative context about company events, while stock prices reflect how markets react to those events. However, despite their complementary nature, effectively integrating these interleaved modalities for improved forecasting remains challenging. In this work, we propose a unified neural architecture that models these interleaved sequences using modality-specific experts, allowing the model to learn unique time series patterns, while still enabling joint reasoning across modalities and preserving pretrained language understanding capabilities. To further improve multimodal understanding, we introduce a cross-modal alignment framework with a salient token weighting mechanism that learns to align representations across modalities with a focus on the most informative tokens. We demonstrate the effectiveness of our approach on a large-scale financial forecasting task, achieving state-of-the-art performance across a wide variety of strong unimodal and multimodal baselines. We develop an interpretability method that reveals insights into the value of time series-context and reinforces the design of our cross-modal alignment objective. Finally, we demonstrate that these improvements translate to meaningful economic gains in investment simulations.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.19628
  13. By: Haochuan (Kevin) Wang
    Abstract: Liquidity withdrawal is a critical indicator of market fragility. In this project, I test a framework for forecasting liquidity withdrawal at the individual-stock level, ranging from less liquid stocks to highly liquid large-cap tickers, and evaluate the relative performance of competing model classes in predicting short-horizon order book stress. We introduce the Liquidity Withdrawal Index (LWI) -- defined as the ratio of order cancellations to the sum of standing depth and new additions at the best quotes -- as a bounded, interpretable measure of transient liquidity removal. Using Nasdaq market-by-order (MBO) data, we compare a spectrum of approaches: linear benchmarks (AR, HAR), and non-linear tree ensembles (XGBoost), across horizons ranging from 250 ms to 5 s. Beyond predictive accuracy, our results provide insights into order placement and cancellation dynamics, identify regimes where linear versus non-linear signals dominate, and highlight how early-warning indicators of liquidity withdrawal can inform both market surveillance and execution.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.22985
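The Liquidity Withdrawal Index defined in abstract 13 is a simple ratio, sketched below. The function name, the choice of share quantities as units, and the handling of an empty book are assumptions for illustration; the paper's exact construction and time windows may differ.

```python
def liquidity_withdrawal_index(cancellations, standing_depth, additions):
    """LWI = cancellations / (standing depth + new additions) at the best quotes.

    All inputs are quantities (e.g. shares) observed over one time window.
    Returning 0.0 when nothing was available to cancel is an assumption.
    """
    available = standing_depth + additions
    if available <= 0:
        return 0.0
    return cancellations / available


# Hypothetical window: 900 shares resting, 300 added, 300 cancelled.
lwi = liquidity_withdrawal_index(cancellations=300, standing_depth=900, additions=300)
print(lwi)
```

Since cancellations can only remove liquidity that was resting or newly added in the window, the ratio stays between 0 and 1, which is what makes the measure bounded.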
  14. By: Philipp Bach; Victor Chernozhukov; Carlos Cinelli; Lin Jia; Sven Klaassen; Nils Skotara; Martin Spindler
    Abstract: Causal Machine Learning has emerged as a powerful tool for flexibly estimating causal effects from observational data in both industry and academia. However, causal inference from observational data relies on untestable assumptions about the data-generating process, such as the absence of unobserved confounders. When these assumptions are violated, causal effect estimates may become biased, undermining the validity of research findings. In these contexts, sensitivity analysis plays a crucial role, by enabling data scientists to assess the robustness of their findings to plausible violations of unconfoundedness. This paper introduces sensitivity analysis and demonstrates its practical relevance through a (simulated) data example based on a use case at Booking.com. We focus our presentation on a recently proposed method by Chernozhukov et al. (2023), which derives general non-parametric bounds on biases due to omitted variables, and is fully compatible with (though not limited to) modern inferential tools of Causal Machine Learning. By presenting this use case, we aim to raise awareness of sensitivity analysis and highlight its importance in real-world scenarios.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.09109
  15. By: Mohammad Hassan Shakil; Arne Johan Pollestad; Khine Kyaw; Ziaul Haque Munim
    Abstract: With the European Union introducing gender quotas on corporate boards, this study investigates the impact of board gender diversity (BGD) on firms' carbon emission performance (CEP). Using panel regressions and advanced machine learning algorithms on data from European firms between 2016 and 2022, the analyses reveal a significant non-linear relationship. Specifically, CEP improves with BGD up to an optimal level of approximately 35 percent, beyond which further increases in BGD yield no additional improvement in CEP. A minimum threshold of 22 percent BGD is necessary for meaningful improvements in CEP. To assess the legitimacy of CEP outcomes, this study examines whether ESG controversies affect the relationship between BGD and CEP. The results show no significant effect, suggesting that the effect of BGD is driven by governance mechanisms rather than symbolic actions. Additionally, structural equation modelling (SEM) indicates that while environmental innovation contributes to CEP, it is not the mediating channel through which BGD promotes CEP. The results have implications for academics, businesses, and regulators.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.00244
  16. By: Taejin Park; Fernando Perez-Cruz; Hyun Song Shin
    Abstract: This paper explores the landscape of economic ideas as revealed in the machine learning embedding of a comprehensive dataset of central bank speeches. This dataset, maintained by the BIS, encompasses 19,742 speeches delivered by almost 1,000 officials from over 100 central banks over a period spanning three decades, from 1997 to 2025. As well as topic analysis of speeches at any moment in time, the evolution of the topics over time provides insights into how the focus of central bank thinking has been shaped by shifting policy challenges since 1997. Parsing the embedding both through topics and through time provides rich insights into how economic ideas have taken shape through communication practices of central banks worldwide. To demonstrate its utility, we have conducted a series of analyses that map the global landscape of monetary policy discourse. Furthermore, we construct a quantitative framework, referred to as the "space of central bankers' ideas", which uncovers institutional patterns and highlights shifts in policy approaches over time.
    Keywords: central bank communication, central bank speeches, AI, topic modeling, embeddings
    JEL: E52 E58 C55 C38
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:bis:biswps:1299
  17. By: Ahad Yaqoob; Syed M. Abdullah
    Abstract: The application of deep learning models for stock price forecasting in emerging markets remains underexplored despite their potential to capture complex temporal dependencies. This study develops and evaluates a Long Short-Term Memory (LSTM) network model for predicting the closing prices of ten major stocks across diverse sectors of the Pakistan Stock Exchange (PSX). Utilizing historical OHLCV data and an extensive set of engineered technical indicators, we trained and validated the model on a multi-year dataset. Our results demonstrate strong predictive performance ($R^2 > 0.87$) for stocks in stable, high-liquidity sectors such as power generation, cement, and fertilizers. Conversely, stocks characterized by high volatility, low liquidity, or sensitivity to external shocks (e.g., global oil prices) presented significant forecasting challenges. The study provides a replicable framework for LSTM-based forecasting in data-scarce emerging markets and discusses implications for investors and future research.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.14401
  18. By: Shanjukta Nath; Jiwon Hong; Jae Ho Chang; Keith Warren; Subhadeep Paul
    Abstract: We find AI embeddings obtained using a pre-trained transformer-based Large Language Model (LLM) of 80,000-120,000 written affirmations and correction exchanges among residents in low-security correctional facilities to be highly predictive of recidivism. The prediction accuracy is 30% higher with embedding vectors than with only pre-entry covariates. However, since the text embedding vectors are high-dimensional, we perform Zero-Shot classification of these texts to a low-dimensional vector of user-defined classes to aid interpretation while retaining the predictive power. To shed light on the social dynamics inside the correctional facilities, we estimate peer effects in these LLM-generated numerical representations of language with a multivariate peer effect model, adjusting for network endogeneity. We develop new methodology and theory for peer effect estimation that accommodate sparse networks, multivariate latent variables, and correlated multivariate outcomes. With these new methods, we find significant peer effects in language usage for interaction and feedback.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.20634
  19. By: George Gui; Seungwoo Kim
    Abstract: Pre-experiment stratification, or blocking, is a well-established technique for designing more efficient experiments and increasing the precision of the experimental estimates. However, when researchers have access to many covariates at the experiment design stage, they often face challenges in effectively selecting or weighting covariates when creating their strata. This paper proposes a Generative Stratification procedure that leverages Large Language Models (LLMs) to synthesize high-dimensional covariate data to improve experimental design. We demonstrate the value of this approach by applying it to a set of experiments and find that our method would have reduced the variance of the treatment effect estimate by 10%-50% compared to simple randomization in our empirical applications. When combined with other standard stratification methods, it can be used to further improve the efficiency. Our results demonstrate that LLM-based simulation is a practical and easy-to-implement way to improve experimental design in covariate-rich settings.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.25709
  20. By: Yuntao Wu; Ege Mert Akin; Charles Martineau; Vincent Grégoire; Andreas Veneris
    Abstract: We examine how textual features in earnings press releases predict stock returns on earnings announcement days. Using over 138,000 press releases from 2005 to 2023, we compare traditional bag-of-words and BERT-based embeddings. We find that press release content (soft information) is as informative as earnings surprise (hard information), with FinBERT yielding the highest predictive power. Combining models enhances explanatory strength and interpretability of the content of press releases. Stock prices fully reflect the content of press releases at market open. If press releases are leaked, they offer a predictive advantage. Topic analysis reveals self-serving bias in managerial narratives. Our framework supports real-time return prediction through the integration of online learning, provides interpretability, and reveals the nuanced role of language in price formation.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2509.24254
  21. By: Dimitrios Anastasiou (Athens University of Economics and Business - Department of Business Administration); Apostolos G. Katsafados (Athens University of Economics and Business - Department of Accounting and Finance; Bank of Greece); Steven Ongena (University of Zurich - Department Finance; Swiss Finance Institute; KU Leuven; NTNU Business School; Centre for Economic Policy Research (CEPR)); Christos Tzomakas (Athens University of Economics and Business)
    Abstract: Building on the methodology of Gorodnichenko et al. (2023), we reconstruct and propose a novel measure that quantifies the voice sentiment of the Chair of the Federal Reserve press conference responses and examine its impact on the stock price crash risk of U.S. banks. We find that a more positive vocal sentiment, indicative of happiness, significantly reduces banks' ex-ante crash risk, whereas negative emotions, such as sadness and anger, amplify it. Our findings suggest that, beyond the textual content of monetary policy statements, the emotional delivery of central bank communication plays a critical role in shaping financial stability outcomes.
    Keywords: US banks, Stock Price Crash Risk, Voice Sentiment, Financial Stability
    JEL: G01 G21 G41
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:chf:rpseri:rp2572

This nep-big issue is ©2025 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.