nep-big New Economics Papers
on Big Data
Issue of 2025–11–03
25 papers chosen by
Tom Coupé, University of Canterbury


  1. A three-step machine learning approach to predict market bubbles with financial news By Abraham Atsiwo
  2. Combining machine learning techniques with NDEA methodology: the use of R.F. and A.N.N. By Pinto, Claudio
  3. Comparing LLMs for Sentiment Analysis in Financial Market News By Lucas Eduardo Pereira Teles; Carlos M. S. Figueiredo
  4. News-Aware Direct Reinforcement Trading for Financial Markets By Qing-Yu Lan; Zhan-He Wang; Jun-Qian Jiang; Yu-Tong Wang; Yun-Song Piao
  5. Fusing Narrative Semantics for Financial Volatility Forecasting By Yaxuan Kong; Yoontae Hwang; Marcus Kaiser; Chris Vryonides; Roel Oomen; Stefan Zohren
  6. Physics-Informed Graph Neural Networks for Attack Path Prediction By Marin François; Pierre-Emmanuel Arduin; Myriam Merad
  7. Quantum and Classical Machine Learning in Decentralized Finance: Comparative Evidence from Multi-Asset Backtesting of Automated Market Makers By Chi-Sheng Chen; Aidan Hung-Wen Tsai
  8. A Topological Approach to Parameterizing Deep Hedging Networks By Alok Das; Kiseop Lee
  9. Convolutional Attention in Betting Exchange Markets By Rui Gonçalves; Vitor Miguel Ribeiro; Roman Chertovskih; António Pedro Aguiar
  10. At-Risk Transformation for U.S. Recession Prediction By Rahul Billakanti; Minchul Shin
  11. Beating the Winner's Curse via Inference-Aware Policy Optimization By Hamsa Bastani; Osbert Bastani; Bryce McLaughlin
  12. Sentiment and Volatility in Financial Markets: A Review of BERT and GARCH Applications during Geopolitical Crises By Domenica Mino; Cillian Williamson
  13. Parameter Proliferation in Nowcasting: Issues and Approaches—An Application to Nowcasting China’s Real GDP By Mr. Paul Cashin; Mr. Fei Han; Ivy Sabuga; Jing Xie; Fan Zhang
  14. Disentangling Age, Time, and Cohort Effects in Income Inequality: A Proxy Machine Learning Approach By David Bruns-Smith; Emi Nakamura; Jón Steinsson
  15. Quantum Machine Learning methods for Fourier-based distribution estimation with application in option pricing By Fernando Alonso; Álvaro Leitao; Carlos Vázquez
  16. Bitcoin Price Forecasting Based on Hybrid Variational Mode Decomposition and Long Short Term Memory Network By Emmanuel Boadi
  17. Aligning Language Models with Investor and Market Behavior for Financial Recommendations By Fernando Spadea; Oshani Seneviratne
  18. Spiking Neural Network for Cross-Market Portfolio Optimization in Financial Markets: A Neuromorphic Computing Approach By Amarendra Mohan; Ameer Tamoor Khan; Shuai Li; Xinwei Cao; Zhibin Li
  19. From Reviews to Actionable Insights: An LLM-Based Approach for Attribute and Feature Extraction By Khaled Boughanmi; Kamel Jedidi; Nour Jedidi
  20. Integrating Transparent Models, LLMs, and Practitioner-in-the-Loop: A Case of Nonprofit Program Evaluation By Ji Ma; Albert Casella
  21. A Neural Network-VAR for Long-Term Forecasting: An Application to Monetary Policy Effects in the Euro Area By Diana Barro; Antonella Basso; Marco Corazza; Guglielmo Alessandro Visentin
  22. A Multi-Layer Machine Learning and Econometric Pipeline for Forecasting Market Risk: Evidence from Cryptoasset Liquidity Spillovers By Yimeng Qiu; Feihuang Fang
  23. How Did People Tweet against Inflation in Japan? By SEKINE, Toshitaka; WADA, Tetsuro
  24. Bridging Language Barriers: The Impact of Large Language Models on Academic Writing By Dalaman, Burak; Kalay, Ali Furkan; Kettlewell, Nathan
  25. Robust Yield Curve Estimation for Mortgage Bonds Using Neural Networks By Sina Molavipour; Alireza M. Javid; Cassie Ye; Björn Löfdahl; Mikhail Nechaev

  1. By: Abraham Atsiwo
    Abstract: This study presents a three-step machine learning framework to predict bubbles in the S&P 500 stock market by combining financial news sentiment with macroeconomic indicators. Building on traditional econometric approaches, the proposed approach predicts bubble formation by integrating textual and quantitative data sources. In the first step, bubble periods in the S&P 500 index are identified using a right-tailed unit root test, a widely recognized real-time bubble detection method. The second step extracts sentiment features from large-scale financial news articles using natural language processing (NLP) techniques, which capture investors' expectations and behavioral patterns. In the final step, ensemble learning methods are applied to predict bubble occurrences based on high sentiment-based and macroeconomic predictors. Model performance is evaluated through k-fold cross-validation and compared against benchmark machine learning algorithms. Empirical results indicate that the proposed three-step ensemble approach significantly improves predictive accuracy and robustness, providing valuable early warning insights for investors, regulators, and policymakers in mitigating systemic financial risks.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.16636
  2. By: Pinto, Claudio
    Abstract: The objective of the present work is to combine the NDEA approach with machine learning techniques and neural networks. To this end we exploit the models proposed in Pinto, 2024. The integration process involves the application of a machine learning technique upstream of the resolution of NDEA models and the application of an artificial neural network downstream of the resolution of NDEA models. In particular, we propose the application of a Random Forest algorithm in regression models to adjust data on: 1) input and output, 2) resource allocation preferences among sub-processes, 3) cost budgets, revenue targets and profit targets, for the influence of internal and external factors in order to improve the calculation of optimal weights. Downstream of the resolution of NDEA models, the use of several artificial neural network models is proposed to optimise the calculation of the economic quantities of interest derived from optimal NDEA solutions. The approach enhances the discrimination power and robustness of optimal NDEA weights as well as the robustness of the calculation of the formulas for the economic quantities.
    Keywords: Network Data Envelopment Analysis, Random Forest Regression, Artificial Neural Network, external factors
    JEL: C45 C53 C61 L20
    Date: 2025–09–07
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:126539
  3. By: Lucas Eduardo Pereira Teles; Carlos M. S. Figueiredo
    Abstract: This article presents a comparative study of large language models (LLMs) in the task of sentiment analysis of financial market news. This work aims to analyze the performance difference of these models in this important natural language processing task within the context of finance. LLM models are compared with classical approaches, allowing for the quantification of the benefits of each tested model or approach. Results show that large language models outperform classical models in the vast majority of cases.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.15929
  4. By: Qing-Yu Lan; Zhan-He Wang; Jun-Qian Jiang; Yu-Tong Wang; Yun-Song Piao
    Abstract: The financial market is known to be highly sensitive to news. Therefore, effectively incorporating news data into quantitative trading remains an important challenge. Existing approaches typically rely on manually designed rules and/or handcrafted features. In this work, we directly use the news sentiment scores derived from large language models, together with raw price and volume data, as observable inputs for reinforcement learning. These inputs are processed by sequence models such as recurrent neural networks or Transformers to make end-to-end trading decisions. We conduct experiments using the cryptocurrency market as an example and evaluate two representative reinforcement learning algorithms, namely Double Deep Q-Network (DDQN) and Group Relative Policy Optimization (GRPO). The results demonstrate that our news-aware approach, which does not depend on handcrafted features or manually designed rules, can achieve performance superior to market benchmarks. We further highlight the critical role of time-series information in this process.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.19173
  5. By: Yaxuan Kong; Yoontae Hwang; Marcus Kaiser; Chris Vryonides; Roel Oomen; Stefan Zohren
    Abstract: We introduce M2VN: Multi-Modal Volatility Network, a novel deep learning-based framework for financial volatility forecasting that unifies time series features with unstructured news data. M2VN leverages the representational power of deep neural networks to address two key challenges in this domain: (i) aligning and fusing heterogeneous data modalities, numerical financial data and textual information, and (ii) mitigating look-ahead bias that can undermine the validity of financial models. To achieve this, M2VN combines open-source market features with news embeddings generated by Time Machine GPT, a recently introduced point-in-time LLM, ensuring temporal integrity. An auxiliary alignment loss is introduced to enhance the integration of structured and unstructured data within the deep learning architecture. Extensive experiments demonstrate that M2VN consistently outperforms existing baselines, underscoring its practical value for risk management and financial decision-making in dynamic markets.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.20699
  6. By: Marin François (LAMSADE - Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique); Pierre-Emmanuel Arduin (DRM - Dauphine Recherches en Management - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique); Myriam Merad (LAMSADE - Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique)
    Abstract: The automated identification and evaluation of potential attack paths within infrastructures is a critical aspect of cybersecurity risk assessment. However, existing methods become impractical when applied to complex infrastructures. While machine learning (ML) has proven effective in predicting the exploitation of individual vulnerabilities, its potential for full-path prediction remains largely untapped. This challenge stems from two key obstacles: the lack of adequate datasets for training the models and the dimensionality of the learning problem. To address the first issue, we provide a dataset of 1033 detailed environment graphs and associated attack paths, with the objective of supporting the community in advancing ML-based attack path prediction. To tackle the second, we introduce a novel Physics-Informed Graph Neural Network (PIGNN) architecture for attack path prediction. Our experiments demonstrate its effectiveness, achieving an F1 score of 0.9308 for full-path prediction. We also introduce a self-supervised learning architecture for initial access and impact prediction, achieving F1 scores of 0.9780 and 0.8214, respectively. Our results indicate that the PIGNN effectively captures adversarial patterns in high-dimensional spaces, demonstrating promising generalization potential towards fully automated assessments.
    Keywords: Attack path prediction, Deep learning, Physics-informed neural networks, Graph neural networks
    Date: 2025–04–10
    URL: https://d.repec.org/n?u=RePEc:hal:journl:hal-05323716
  7. By: Chi-Sheng Chen; Aidan Hung-Wen Tsai
    Abstract: This study presents a comprehensive empirical comparison between quantum machine learning (QML) and classical machine learning (CML) approaches in Automated Market Makers (AMM) and Decentralized Finance (DeFi) trading strategies through extensive backtesting on 10 models across multiple cryptocurrency assets. Our analysis encompasses classical ML models (Random Forest, Gradient Boosting, Logistic Regression), pure quantum models (VQE Classifier, QNN, QSVM), hybrid quantum-classical models (QASA Hybrid, QASA Sequence, QuantumRWKV), and transformer models. The results demonstrate that hybrid quantum models achieve superior overall performance with 11.2% average return and 1.42 average Sharpe ratio, while classical ML models show 9.8% average return and 1.47 average Sharpe ratio. The QASA Sequence hybrid model achieves the highest individual return of 13.99% with the best Sharpe ratio of 1.76, demonstrating the potential of quantum-classical hybrid approaches in AMM and DeFi trading strategies.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.15903
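The abstract above ranks models by Sharpe ratio but does not say how the ratio was computed. As a minimal sketch, assuming the common definition (mean excess return divided by the sample standard deviation of excess returns, optionally annualized), the calculation could look like the following. The function name, the zero risk-free rate, and the 365-period annualization factor are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=365):
    """Annualized Sharpe ratio of a series of per-period returns.

    Assumptions (not from the paper): the risk-free rate is constant per
    period, and annualization scales by sqrt(periods_per_year).
    """
    excess = np.asarray(returns, dtype=float) - risk_free_per_period
    # Sample standard deviation (ddof=1) of the excess returns.
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Hypothetical daily returns, for illustration only.
daily_returns = np.array([0.01, 0.02, -0.01, 0.03])
per_period = sharpe_ratio(daily_returns, periods_per_year=1)
annualized = sharpe_ratio(daily_returns, periods_per_year=365)
```

Because the ratio divides by volatility, a strategy with a higher average return can still have a lower Sharpe ratio, which is consistent with the figures quoted in the abstract for the hybrid and classical model groups.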
  8. By: Alok Das; Kiseop Lee
    Abstract: Deep hedging uses recurrent neural networks to hedge financial products that cannot be fully hedged in incomplete markets. Previous work in this area focuses on minimizing some measure of quadratic hedging error by calculating pathwise gradients, but doing so requires large batch sizes and can make training effective models in a reasonable amount of time challenging. We show that by adding certain topological features, we can reduce batch sizes substantially and make training these models more practically feasible without greatly compromising hedging performance.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.16938
  9. By: Rui Gonçalves; Vitor Miguel Ribeiro; Roman Chertovskih; António Pedro Aguiar
    Abstract: This study presents the implementation of a short-term forecasting system for price movements in exchange markets, using market depth data and a systematic procedure to enable a fully automated trading system. The case study focuses on the UK to Win Horse Racing market during the pre-live stage on the world's leading betting exchange, Betfair. Innovative convolutional attention mechanisms are introduced and applied to multiple recurrent neural networks and bi-dimensional convolutional recurrent neural network layers. Additionally, a novel padding method for convolutional layers is proposed, specifically designed for multivariate time series processing. These innovations are thoroughly detailed, along with their execution process. The proposed architectures follow a standard supervised learning approach, involving model training and subsequent testing on new data, which requires extensive pre-processing and data analysis. The study also presents a complete end-to-end framework for automated feature engineering and market interactions using the developed models in production. The key finding of this research is that all proposed innovations positively impact the performance metrics of the classification task under examination, thereby advancing the current state-of-the-art in convolutional attention mechanisms and padding methods applied to multivariate time series problems.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.16008
  10. By: Rahul Billakanti; Minchul Shin
    Abstract: We propose a simple binarization of predictors (an “at-risk” transformation) as an alternative to the standard practice of using continuous, standardized variables in recession forecasting models. By converting predictors into indicators of unusually weak states, we demonstrate their ability to capture the discrete nature of rare events such as U.S. recessions. Using a large panel of monthly U.S. macroeconomic and financial data, we show that binarized predictors consistently improve out-of-sample forecasting performance, often making linear models competitive with flexible machine learning methods, and that the gains are particularly pronounced around the onset of recessions.
    Keywords: Recession Forecasting; Machine Learning; Feature Engineering; At-Risk Transformation; Binarized Predictors; Diffusion Index
    JEL: C25 C53 E32 E37
    Date: 2025–10–30
    URL: https://d.repec.org/n?u=RePEc:fip:fedpwp:102004
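The binarization described in the abstract above can be illustrated with a short sketch. The abstract does not say how "unusually weak" is defined, so this example assumes a lower-tail quantile threshold estimated on the training sample only (to keep the out-of-sample exercise honest). The function name, the 10% quantile, and the toy data are all hypothetical.

```python
import numpy as np

def at_risk_transform(train_values, new_values, quantile=0.10):
    """Return 1 where a predictor is in an unusually weak state, else 0.

    Assumption (not from the paper): "weak" means at or below the given
    lower-tail quantile of the training sample.
    """
    threshold = np.quantile(np.asarray(train_values, dtype=float), quantile)
    return (np.asarray(new_values, dtype=float) <= threshold).astype(int)

# Hypothetical monthly growth readings, for illustration only.
train = np.array([2.1, 1.8, 2.5, -0.9, 2.0, 1.7, 2.3, 2.2, 1.9, 2.4])
new = np.array([2.0, -1.5, 0.5, -0.9])
flags = at_risk_transform(train, new, quantile=0.10)
```

The resulting 0/1 indicators would then replace the continuous, standardized predictor in a recession model such as a logit. For predictors where high values signal weakness (unemployment claims, for example), the comparison would need to be flipped.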
  11. By: Hamsa Bastani; Osbert Bastani; Bryce McLaughlin
    Abstract: There has been a surge of recent interest in automatically learning policies to target treatment decisions based on rich individual covariates. A common approach is to train a machine learning model to predict counterfactual outcomes, and then select the policy that optimizes the predicted objective value. In addition, practitioners also want confidence that the learned policy has better performance than the incumbent policy according to downstream policy evaluation. However, due to the winner's curse (an issue where the policy optimization procedure exploits prediction errors rather than finding actual improvements), predicted performance improvements are often not substantiated by downstream policy optimization. To address this challenge, we propose a novel strategy called inference-aware policy optimization, which modifies policy optimization to account for how the policy will be evaluated downstream. Specifically, it optimizes not only for the estimated objective value, but also for the chances that the policy will be statistically significantly better than the observational policy used to collect data. We mathematically characterize the Pareto frontier of policies according to the tradeoff of these two goals. Based on our characterization, we design a policy optimization algorithm that uses machine learning to predict counterfactual outcomes, and then plugs in these predictions to estimate the Pareto frontier; then, the decision-maker can select the policy that optimizes their desired tradeoff, after which policy evaluation can be performed on the test set as usual. Finally, we perform simulations to illustrate the effectiveness of our methodology.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.18161
  12. By: Domenica Mino; Cillian Williamson
    Abstract: Artificial intelligence techniques have increasingly been applied to understand the complex relationship between public sentiment and financial market behaviour. This study explores the relationship between the sentiment of news related to the Russia-Ukraine war and the volatility of the stock market. A comprehensive dataset of news articles from major US platforms, published between January 1 and July 17, 2024, was analysed using a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model adapted for financial language. We extracted sentiment scores and applied a Generalised Autoregressive Conditional Heteroscedasticity (GARCH) model, enhanced with a Student-t distribution to capture the heavy-tailed nature of financial returns data. The results reveal a statistically significant negative relationship between negative news sentiment and market stability, suggesting that pessimistic war coverage is associated with increased volatility in the S&P 500 index. This research demonstrates how artificial intelligence and natural language processing can be integrated with econometric modelling to assess real-time market dynamics, offering valuable tools for financial risk analysis during geopolitical crises.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.16503
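For readers unfamiliar with the model class named in the abstract above, a textbook GARCH(1,1) with Student-t innovations can be written as below. The abstract does not state the model order or how the sentiment score enters, so the exogenous term with coefficient \gamma and lagged sentiment s_{t-1} is an illustrative assumption and not the authors' specification.

```latex
\begin{aligned}
r_t &= \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t, \qquad
       z_t \sim t_\nu \ \text{(standardized to unit variance)},\\
\sigma_t^2 &= \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2
              + \gamma\, s_{t-1}.
\end{aligned}
```

Here r_t is the index return, \sigma_t^2 the conditional variance, and \nu > 2 the degrees of freedom that control tail heaviness. The usual conditions for the model without the sentiment term are \omega > 0, \alpha, \beta \ge 0 and \alpha + \beta < 1. Under this assumed form, a value of \gamma that raises \sigma_t^2 when sentiment is more negative would correspond to the association between pessimistic coverage and higher volatility that the abstract reports.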
  13. By: Mr. Paul Cashin; Mr. Fei Han; Ivy Sabuga; Jing Xie; Fan Zhang
    Abstract: This paper evaluates three approaches to address the parameter proliferation issue in nowcasting: (i) variable selection using adjusted stepwise autoregressive integrated moving average with exogenous variables (AS-ARIMAX); (ii) regularization in machine learning (ML); and (iii) dimensionality reduction via principal component analysis (PCA). Utilizing 166 variables, we estimate our models from 2007Q2 to 2019Q4 using rolling-window regression, while applying these three approaches. We then conduct a pseudo out-of-sample performance comparison of various nowcasting models (including Bridge, MIDAS, U-MIDAS, dynamic factor model (DFM), and machine learning techniques including Ridge Regression, LASSO, and Elastic Net) to predict China's annualized real GDP growth rate from 2020Q1 to 2023Q1. Our findings suggest that the LASSO method outperforms all other models, but only when guided by economic judgment and sign restrictions in variable selection. Notably, simpler models like Bridge with AS-ARIMAX variable selection yield reliable estimates nearly comparable to those from LASSO, underscoring the importance of effective variable selection in capturing strong signals.
    Keywords: China; GDP; Nowcasting
    Date: 2025–10–24
    URL: https://d.repec.org/n?u=RePEc:imf:imfwpa:2025/217
  14. By: David Bruns-Smith; Emi Nakamura; Jón Steinsson
    Abstract: A canonical finding from earlier research is that the cross-sectional variance of income increases sharply with age (Deaton and Paxson, 1994). However, the trend in this age profile is not separately identified from time and cohort trends. Conventional methods solve this identification problem by ruling out "time effects." This strong assumption is rejected by the data. We propose a new proxy variable machine learning approach to disentangle age, time and cohort effects. Using this method, we estimate a significantly smaller slope of the age profile of income variance for the US than conventional methods, as well as less erratic slopes for 11 other countries.
    JEL: E20 J20
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:34380
  15. By: Fernando Alonso; Álvaro Leitao; Carlos Vázquez
    Abstract: The ongoing progress in quantum technologies has fueled a sustained exploration of their potential applications across various domains. One particularly promising field is quantitative finance, where a central challenge is the pricing of financial derivatives, traditionally addressed through Monte Carlo integration techniques. In this work, we introduce two hybrid classical-quantum methods to address the option pricing problem. These approaches rely on reconstructing Fourier series representations of statistical distributions from the outputs of Quantum Machine Learning (QML) models based on Parametrized Quantum Circuits (PQCs). We analyze the impact of data size and PQC dimensionality on performance. Quantum Accelerated Monte Carlo (QAMC) is employed as a benchmark to quantitatively assess the proposed models in terms of computational cost and accuracy in the extraction of Fourier coefficients. Through the numerical experiments, we show that the proposed methods achieve remarkable accuracy, becoming a competitive quantum alternative for derivatives valuation.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.19494
  16. By: Emmanuel Boadi
    Abstract: This study proposes a hybrid deep learning model for forecasting the price of Bitcoin, as the digital currency is known to exhibit frequent fluctuations. The methods used are Variational Mode Decomposition (VMD) and the Long Short-Term Memory (LSTM) network. First, VMD is used to decompose the original Bitcoin price series into Intrinsic Mode Functions (IMFs). Each IMF is then modeled using an LSTM network to capture temporal patterns more effectively. The individual forecasts from the IMFs are aggregated to produce the final prediction of the original Bitcoin price series. To determine the predictive power of the proposed hybrid model, a comparative analysis was conducted against the standard LSTM. The results confirmed that the hybrid VMD+LSTM model outperforms the standard LSTM across all the evaluation metrics, including RMSE, MAE, and R², and also provides a reliable 30-day forecast.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.15900
  17. By: Fernando Spadea; Oshani Seneviratne
    Abstract: Financial recommendation systems often fail to account for key behavioral and regulatory factors, leading to advice that is misaligned with user preferences, difficult to interpret, or unlikely to be followed. We present FLARKO (Financial Language-model for Asset Recommendation with Knowledge-graph Optimization), a novel framework that integrates Large Language Models (LLMs), Knowledge Graphs (KGs), and Kahneman-Tversky Optimization (KTO) to generate asset recommendations that are both profitable and behaviorally aligned. FLARKO encodes users' transaction histories and asset trends as structured KGs, providing interpretable and controllable context for the LLM. To demonstrate the adaptability of our approach, we develop and evaluate both a centralized architecture (CenFLARKO) and a federated variant (FedFLARKO). To our knowledge, this is the first demonstration of using KTO for fine-tuning of LLMs for financial asset recommendation. We also present the first use of structured KGs to ground LLM reasoning over behavioral financial data in a federated learning (FL) setting. Evaluated on the FAR-Trans dataset, FLARKO consistently outperforms state-of-the-art recommendation baselines on behavioral alignment and joint profitability, while remaining interpretable and resource-efficient.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.15993
  18. By: Amarendra Mohan (IIT Kharagpur); Ameer Tamoor Khan (University of Copenhagen); Shuai Li (University of Oulu); Xinwei Cao (Jiangnan University); Zhibin Li (Chengdu University of Information Technology)
    Abstract: Cross-market portfolio optimization has become increasingly complex with the globalization of financial markets and the growth of high-frequency, multi-dimensional datasets. Traditional artificial neural networks, while effective in certain portfolio management tasks, often incur substantial computational overhead and lack the temporal processing capabilities required for large-scale, multi-market data. This study investigates the application of Spiking Neural Networks (SNNs) for cross-market portfolio optimization, leveraging neuromorphic computing principles to process equity data from both the Indian (Nifty 500) and US (S&P 500) markets. A five-year dataset comprising approximately 1,250 trading days of daily stock prices was systematically collected via the Yahoo Finance API. The proposed framework integrates Leaky Integrate-and-Fire neuron dynamics with adaptive thresholding, spike-timing-dependent plasticity, and lateral inhibition to enable event-driven processing of financial time series. Dimensionality reduction is achieved through hierarchical clustering, while population-based spike encoding and multiple decoding strategies support robust portfolio construction under realistic trading constraints, including cardinality limits, transaction costs, and adaptive risk aversion. Experimental evaluation demonstrates that the SNN-based framework delivers superior risk-adjusted returns and reduced volatility compared to ANN benchmarks, while substantially improving computational efficiency. These findings highlight the promise of neuromorphic computation for scalable, efficient, and robust portfolio optimization across global financial markets.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.15921
  19. By: Khaled Boughanmi; Kamel Jedidi; Nour Jedidi
    Abstract: This research proposes a systematic, large language model (LLM) approach for extracting product and service attributes, features, and associated sentiments from customer reviews. Grounded in marketing theory, the framework distinguishes perceptual attributes from actionable features, producing interpretable and managerially actionable insights. We apply the methodology to 20,000 Yelp reviews of Starbucks stores and evaluate eight prompt variants on a random subset of reviews. Model performance is assessed through agreement with human annotations and predictive validity for customer ratings. Results show high consistency between LLMs and human coders and strong predictive validity, confirming the reliability of the approach. Human coders required a median of six minutes per review, whereas the LLM processed each in two seconds, delivering comparable insights at a scale unattainable through manual coding. Managerially, the analysis identifies attributes and features that most strongly influence customer satisfaction and their associated sentiments, enabling firms to pinpoint "joy points," address "pain points," and design targeted interventions. We demonstrate how structured review data can power an actionable marketing dashboard that tracks sentiment over time and across stores, benchmarks performance, and highlights high-leverage features for improvement. Simulations indicate that enhancing sentiment for key service features could yield 1-2% average revenue gains per store.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.16551
  20. By: Ji Ma; Albert Casella
    Abstract: Public and nonprofit organizations often hesitate to adopt AI tools because most models are opaque even though standard approaches typically analyze aggregate patterns rather than offering actionable, case-level guidance. This study tests a practitioner-in-the-loop workflow that pairs transparent decision-tree models with large language models (LLMs) to improve predictive accuracy, interpretability, and the generation of practical insights. Using data from an ongoing college-success program, we build interpretable decision trees to surface key predictors. We then provide each tree's structure to an LLM, enabling it to reproduce case-level predictions grounded in the transparent models. Practitioners participate throughout feature engineering, model design, explanation review, and usability assessment, ensuring that field expertise informs the analysis at every stage. Results show that integrating transparent models, LLMs, and practitioner input yields accurate, trustworthy, and actionable case-level evaluations, offering a viable pathway for responsible AI adoption in the public and nonprofit sectors.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.19799
  21. By: Diana Barro (Ca’ Foscari University of Venice); Antonella Basso (Ca’ Foscari University of Venice); Marco Corazza (Ca’ Foscari University of Venice); Guglielmo Alessandro Visentin (Henley Business School, University of Reading)
    Abstract: We propose a hybrid approach that combines Neural Networks with a Vector Autoregression (VAR) model to generate long-term forecasts of time series. We apply this methodology to forecast the impact of shifts in monetary policies within the Euro area on a comprehensive set of macroeconomic variables. Our analysis begins with a standard (linear) VAR model, which is then enhanced by incorporating Neural Networks to generate long-term forecasts for key variables such as the interest rate, inflation, real output, narrow money, exchange rate, and corporate bond spread. The results suggest that a Neural Network-VAR model offers improvements over the traditional linear VAR for forecasting certain macroeconomic variables in the long run. However, due to the limited sample size, the nonlinear model does not consistently outperform the linear VAR.
    Keywords: Forecasting; VAR; Neural Networks; Monetary policies; Euro area
    JEL: C32 C45 C53 E52
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ven:wpaper:2025:24
  22. By: Yimeng Qiu; Feihuang Fang
    Abstract: We study whether liquidity and volatility proxies of a core set of cryptoassets generate spillovers that forecast market-wide risk. Our empirical framework integrates three statistical layers: (A) interactions between core liquidity and returns, (B) principal-component relations linking liquidity and returns, and (C) volatility-factor projections that capture cross-sectional volatility crowding. The analysis is complemented by vector autoregression impulse responses and forecast error variance decompositions (see Granger 1969; Sims 1980), heterogeneous autoregressive models with exogenous regressors (HAR-X, Corsi 2009), and a leakage-safe machine learning protocol using temporal splits, early stopping, validation-only thresholding, and SHAP-based interpretation. Using daily data from 2021 to 2025 (1462 observations across 74 assets), we document statistically significant Granger-causal relationships across layers and moderate out-of-sample predictive accuracy. We report the most informative figures, including the pipeline overview, Layer A heatmap, Layer C robustness analysis, vector autoregression variance decompositions, and the test-set precision-recall curve. Full data and figure outputs are provided in the artifact repository.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.20066
  23. By: SEKINE, Toshitaka; WADA, Tetsuro
    Abstract: During the chronic deflation era starting in the 1990s, Japanese inflation expectations were said to be firmly anchored at a very low level, say, around zero. These expectations seemed to have become something like the social norm. Households were quite against any price hikes, and as a consequence, firms hesitated to raise their prices; when they raised prices, they apologized for their misbehavior. People not only expected that prices would not increase, but also believed that prices should not increase. That social norm may have changed in response to inflationary shocks after COVID-19 and the Ukraine war. We applied a natural language processing technique to tweets that commented on price hikes and found an increase in posts after 2021 that accepted price hikes for various goods. Some of these posts even indicated positive feelings and mentioned salary hikes.
    Keywords: tweet, natural language processing, sentiment analysis, inflation expectation, monetary policy, Japan
    JEL: C0 E31
    Date: 2025–08–31
    URL: https://d.repec.org/n?u=RePEc:hit:hiasdp:hias-e-150
  24. By: Dalaman, Burak (University of London); Kalay, Ali Furkan (Macquarie University, Sydney); Kettlewell, Nathan (University of Technology, Sydney)
    Abstract: Large language models (LLMs) have altered the nature of academic writing. While the influence of LLMs on academic writing is not uncontroversial, one promise for this technology is to bridge language barriers faced by nonnative English-speaking researchers. This study empirically demonstrates that LLMs have led to convergence in the lexical diversity of native and nonnative speakers, potentially helping to level the playing field. There has also been an increase in language complexity for nonnatives. We classify over one million authors as native or nonnative English speakers based on the etymological origins of their names and analyze over one million abstracts from arXiv.org, evaluating changes in lexical diversity and readability before and after ChatGPT’s release in November 2022. The results demonstrate a sharp increase in writing sophistication among all researchers, with nonnative English speakers showing the greatest gains across all writing metrics. Our findings provide empirical evidence on the impact of LLMs in academic writing, supporting recent speculations about their potential to bridge language barriers.
    Keywords: technology adoption, large language models, academic equity, generative AI, language barrier, bayesian structural time series
    JEL: J24 I23
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:iza:izadps:dp18215
  25. By: Sina Molavipour; Alireza M. Javid; Cassie Ye; Björn Löfdahl; Mikhail Nechaev
    Abstract: Robust yield curve estimation is crucial in fixed-income markets for accurate instrument pricing, effective risk management, and informed trading strategies. Traditional approaches, including the bootstrapping method and parametric Nelson-Siegel models, often struggle with overfitting or instability issues, especially when underlying bonds are sparse or bond prices are volatile or contain hard-to-remove noise. In this paper, we propose a neural network-based framework for robust yield curve estimation tailored to small mortgage bond markets. Our model estimates the yield curve independently for each day and introduces a new loss function to enforce smoothness and stability, addressing challenges associated with limited and noisy data. Empirical results on Swedish mortgage bonds demonstrate that our approach delivers more robust and stable yield curve estimates compared to existing methods such as Nelson-Siegel-Svensson (NSS) and Kernel-Ridge (KR). Furthermore, the framework allows for the integration of domain-specific constraints, such as alignment with risk-free benchmarks, enabling practitioners to balance the trade-off between smoothness and accuracy according to their needs.
    Date: 2025–10
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2510.21347

This nep-big issue is ©2025 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.