on Big Data |
By: | Fabio Gatti (University of Bern, Switzerland & Baffi Center, Bocconi University, Italy); Joel Huesler (University of Bern, Switzerland) |
Abstract: | The correspondence of historical personalities serves as a rich source of psychological, social, and economic information. Letters were used not only as a means of communication within family circles but also as a primary method for exchanging information with colleagues, subordinates, and employers. A quantitative analysis of such material enables scholars to reconstruct both the internal psychology and the relational networks of historical figures, ultimately providing deeper insights into the socio-economic systems in which they were embedded. In this study, we analyze the outgoing correspondence of Michelangelo Buonarroti, a prominent Renaissance artist, using a collection of 523 letters as the basis for a structured text analysis. Our methodological approach compares three distinct Natural Language Processing methods: an Augmented Dictionary Approach, which relies on static lexicon analysis and Latent Dirichlet Allocation (LDA) for topic modeling; a Supervised Machine Learning Approach, which combines BERT-generated letter embeddings with a Random Forest classifier trained by the authors; and an Unsupervised Machine Learning method. The comparison of these three methods, benchmarked against biographical knowledge, allows us to construct a robust understanding of Michelangelo’s emotional associations with monetary, thematic, and social factors. Furthermore, it highlights how the Supervised Machine Learning method, by incorporating the authors’ domain knowledge of the documents and their background, can provide, in the context of Renaissance multi-themed letters, a more nuanced interpretation of contextual meanings, enabling the detection of subtle (positive or negative) sentiment variations driven by factors that the other methods can overlook. |
Keywords: | Text Analysis, Natural Language Processing, Art History, Economic History |
JEL: | N33 C55 Z11 |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:hes:wpaper:0279 |
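The supervised pipeline described in the entry above, letter embeddings fed to a Random Forest classifier, can be illustrated in a few lines. This is a minimal sketch rather than the authors' code: the encoder name, the two example letters, and the sentiment labels are placeholders, and a small sentence-transformer stands in for whichever BERT variant the paper uses.

```python
# Minimal sketch of the supervised approach above: encode each letter with a
# BERT-style model, then train a Random Forest on the embeddings.
# Model name, letters, and labels are illustrative, not the authors' data.
from sentence_transformers import SentenceTransformer
from sklearn.ensemble import RandomForestClassifier

letters = [
    "I have received the marble, but the payment has not yet arrived.",
    "The work on the chapel proceeds well and brings me great satisfaction.",
]
labels = [0, 1]  # e.g. 0 = negative sentiment, 1 = positive (hand-annotated)

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any BERT-style encoder
X = encoder.encode(letters)                        # one vector per letter

clf = RandomForestClassifier(n_estimators=500, random_state=0)
clf.fit(X, labels)
print(clf.predict_proba(X))  # with a real corpus, evaluate via cross-validation
```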
By: | Tobias Schmidt; Kai-Robin Lange; Matthias Reccius; Henrik Müller; Michael Roos; Carsten Jentsch |
Abstract: | As interest in economic narratives has grown in recent years, so has the number of pipelines dedicated to extracting such narratives from texts. Pipelines often employ a mix of state-of-the-art natural language processing techniques, such as BERT, to tackle this task. While effective on the foundational linguistic operations essential for narrative extraction, such models lack the deeper semantic understanding required to distinguish extracting economic narratives from merely conducting classic tasks like Semantic Role Labeling. Instead of relying on complex model pipelines, we evaluate the benefits of Large Language Models (LLMs) by analyzing a corpus of Wall Street Journal and New York Times newspaper articles about inflation. We apply a rigorous narrative definition and compare GPT-4o outputs to gold-standard narratives produced by expert annotators. Our results suggest that GPT-4o is capable of extracting valid economic narratives in a structured format, but it still falls short of expert-level performance when handling complex documents and narratives. Given the novelty of LLMs in economic research, we also provide guidance for future work in economics and the social sciences that employs LLMs to pursue similar objectives. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.15041 |
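A minimal sketch of structured narrative extraction with an LLM, assuming the OpenAI Python client; the prompt, the JSON fields, and the example article are illustrative and do not reproduce the paper's rigorous narrative definition or annotation protocol.

```python
# Illustrative sketch: ask an LLM for economic narratives in structured JSON.
# Prompt, fields, and article text are hypothetical, not the paper's protocol.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

article = "Consumer prices rose sharply as supply chains remained disrupted..."

response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system",
         "content": "Extract economic narratives about inflation as JSON with "
                    "keys 'cause', 'effect', and 'actor'. Return {} if none."},
        {"role": "user", "content": article},
    ],
)
narrative = json.loads(response.choices[0].message.content)
print(narrative)
```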
By: | Liexin Cheng; Xue Cheng; Shuaiqiang Liu |
Abstract: | This paper demonstrates that a broad class of problems in quantitative finance, including those previously addressed using deep neural networks, can be efficiently solved using single-layer neural networks without iterative gradient-based training, namely extreme learning machine (ELM). ELM utilizes a single-layer network with randomly initialized hidden nodes and analytically computed output weights obtained via convex optimization, enabling rapid training and inference. Both supervised and unsupervised learning tasks are explored. In supervised learning, ELM is employed to learn parametric option pricing functions, predict intraday stock returns, and complete implied volatility surfaces. Compared with deep neural networks, Gaussian process regression, and logistic regression, ELM achieves higher computational speed, comparable accuracy, and superior generalization. In unsupervised learning, ELM numerically solves Black-Scholes-type PDEs, and outperforms Physics-Informed Neural Networks in training speed without losing precision. The approximation and generalization abilities of ELM are briefly discussed. The findings establish ELM as a practical and efficient tool for various tasks in quantitative finance. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.09551 |
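The ELM recipe in the abstract, a random hidden layer followed by a closed-form (ridge-regularised) solve for the output weights, is compact enough to sketch directly; the toy regression task below is illustrative and stands in for tasks such as learning a parametric option pricing function.

```python
# Minimal extreme learning machine (ELM): random, fixed hidden-layer weights,
# output weights obtained analytically via ridge-regularised least squares.
import numpy as np

rng = np.random.default_rng(0)

# Toy supervised task standing in for, e.g., an option pricing function.
X = rng.uniform(-1, 1, size=(1000, 3))
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2] + 0.01 * rng.standard_normal(1000)

n_hidden, ridge = 200, 1e-6
W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights (never trained)
b = rng.standard_normal(n_hidden)                # random biases (never trained)
H = np.tanh(X @ W + b)                           # hidden-layer activations

# Output weights: one linear solve instead of iterative gradient descent.
beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)

y_hat = H @ beta
print("in-sample RMSE:", np.sqrt(np.mean((y - y_hat) ** 2)))
```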
By: | Timothée Hornek; Amir Sartipi; Igor Tchappi; Gilbert Fridgen |
Abstract: | Accurate electricity price forecasting (EPF) is crucial for effective decision-making in power trading on the spot market. While recent advances in generative artificial intelligence (GenAI) and pre-trained large language models (LLMs) have inspired the development of numerous time series foundation models (TSFMs) for time series forecasting, their effectiveness in EPF remains uncertain. To address this gap, we benchmark several state-of-the-art pretrained models--Chronos-Bolt, Chronos-T5, TimesFM, Moirai, Time-MoE, and TimeGPT--against established statistical and machine learning (ML) methods for EPF. Using 2024 day-ahead auction (DAA) electricity prices from Germany, France, the Netherlands, Austria, and Belgium, we generate daily forecasts with a one-day horizon. Chronos-Bolt and Time-MoE emerge as the strongest among the TSFMs, performing on par with traditional models. However, the biseasonal MSTL model, which captures daily and weekly seasonality, stands out for its consistent performance across countries and evaluation metrics, with no TSFM statistically outperforming it. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.08113 |
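The biseasonal MSTL benchmark highlighted above is available off the shelf. A minimal sketch, assuming hourly day-ahead prices so that the daily and weekly periods are 24 and 168 hours, with simulated data in place of the five countries' auction prices:

```python
# Sketch of a biseasonal MSTL decomposition for hourly day-ahead prices,
# capturing daily (24h) and weekly (168h) seasonality. Data are simulated.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import MSTL

idx = pd.date_range("2024-01-01", periods=24 * 90, freq="h")
hours = np.arange(len(idx))
prices = (
    50
    + 10 * np.sin(2 * np.pi * hours / 24)        # daily cycle
    + 5 * np.sin(2 * np.pi * hours / (24 * 7))   # weekly cycle
    + np.random.default_rng(0).normal(0, 2, len(idx))
)
series = pd.Series(prices, index=idx)

result = MSTL(series, periods=(24, 24 * 7)).fit()
# result.seasonal holds one column per seasonal period; forecasts would be built
# by extrapolating the components (e.g. ARIMA on the deseasonalised remainder).
print(result.seasonal.head())
```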
By: | Mihai Cucuringu; Kang Li; Chao Zhang |
Abstract: | This study focuses on forecasting intraday trading volumes, a crucial component for portfolio implementation, especially in high-frequency (HF) trading environments. Given the current scarcity of flexible methods in this area, we employ a suite of machine learning (ML) models enriched with numerous HF predictors to enhance the predictability of intraday trading volumes. Our findings reveal that intraday stock trading volume is highly predictable, especially with ML and considering commonality. Additionally, we assess the economic benefits of accurate volume forecasting through Volume Weighted Average Price (VWAP) strategies. The results demonstrate that precise intraday forecasting offers substantial advantages, providing valuable insights for traders to optimize their strategies. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.08180 |
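The link between volume forecasts and VWAP execution in the entry above is mechanical enough to sketch: a parent order is sliced in proportion to predicted intraday volume shares, and execution quality is judged against the realised volume-weighted average price. All numbers below are made up.

```python
# Stylised link between intraday volume forecasts and a VWAP schedule:
# slice a parent order in proportion to predicted volume shares per time bin.
import numpy as np

predicted_volume = np.array([1.8e6, 1.1e6, 0.9e6, 0.8e6, 1.0e6, 1.6e6])  # per bin
parent_order = 50_000                                                    # shares to buy

weights = predicted_volume / predicted_volume.sum()
child_orders = np.round(weights * parent_order).astype(int)

# Realised VWAP benchmark from executed prices and volumes in the same bins:
prices = np.array([101.2, 101.0, 100.8, 100.9, 101.1, 101.4])
volumes = np.array([1.7e6, 1.2e6, 0.8e6, 0.9e6, 1.1e6, 1.5e6])
vwap = (prices * volumes).sum() / volumes.sum()

print(child_orders, round(vwap, 3))
```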
By: | Sukru Selim Calik; Andac Akyuz; Zeynep Hilal Kilimci; Kerem Colak |
Abstract: | Financial literacy is increasingly dependent on the ability to interpret complex financial data and utilize advanced forecasting tools. In this context, this study proposes a novel approach that combines transformer-based time series models with explainable artificial intelligence (XAI) to enhance the interpretability and accuracy of stock price predictions. The analysis focuses on the daily stock prices of the five highest-volume banks listed in the BIST100 index, along with XBANK and XU100 indices, covering the period from January 2015 to March 2025. Models including DLinear, LTSNet, Vanilla Transformer, and Time Series Transformer are employed, with input features enriched by technical indicators. SHAP and LIME techniques are used to provide transparency into the influence of individual features on model outputs. The results demonstrate the strong predictive capabilities of transformer models and highlight the potential of interpretable machine learning to empower individuals in making informed investment decisions and actively engaging in financial markets. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.06345 |
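The explainability step above is easy to illustrate with SHAP. In this sketch a gradient-boosting model and toy technical-indicator features stand in for the paper's transformer forecasters and BIST100 data.

```python
# Illustrative SHAP workflow: explain a fitted model's predictions feature by
# feature. A tree model stands in for the paper's transformer forecasters.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "rsi_14": rng.uniform(0, 100, 500),    # toy technical indicators
    "macd": rng.normal(0, 1, 500),
    "volume_z": rng.normal(0, 1, 500),
})
y = 0.02 * X["macd"] - 0.0005 * (X["rsi_14"] - 50) + rng.normal(0, 0.01, 500)

model = GradientBoostingRegressor().fit(X, y)

explainer = shap.Explainer(model, X)  # auto-selects a suitable explainer
shap_values = explainer(X)
print(shap_values.values[:3])         # per-feature contribution to each prediction
# shap.plots.beeswarm(shap_values)    # global summary plot
```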
By: | Yang Qiang |
Abstract: | This paper explores the socioeconomic impacts of extracurricular education, specifically private tutoring, on social mobility in Japan. Using data from the 2015 National Survey on Social Stratification and Social Mobility (SSM), we employed a causal machine learning approach to evaluate this educational intervention on income, educational attainment, and occupational prestige. Our research suggests that while shadow education holds the potential for positive socioeconomic impacts, its benefits are undermined by the economic disparities among households, resulting in minimal overall improvement. This highlights the complex mechanisms between individual demographics and educational interventions, revealing promising machine learning applications in this field. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.07421 |
By: | Buckmann, Marcus (Bank of England); Hill, Ed (Bank of England) |
Abstract: | Text classification tasks such as sentiment analysis are common in economics and finance. We demonstrate that smaller, local generative language models can be effectively used for these tasks. Compared to large commercial models, they offer key advantages in privacy, availability, cost, and explainability. We use 17 sentence classification tasks (each with 2 to 4 classes) to show that penalised logistic regression on embeddings from a small language model often matches or exceeds the performance of a large model, even when trained on just dozens of labelled examples per class – the same amount typically needed to validate a large model’s performance. Moreover, this embedding-based approach yields stable and interpretable explanations for classification decisions. |
Keywords: | Text classification; large language models; machine learning; embeddings; explainability |
JEL: | C38 C45 C80 |
Date: | 2025–05–23 |
URL: | https://d.repec.org/n?u=RePEc:boe:boeewp:1127 |
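A minimal sketch of the embedding-plus-penalised-regression recipe: a small sentence-transformer stands in here for embeddings taken from a local generative language model, and the two labelled sentences are placeholders for the few dozen examples per class the paper works with.

```python
# Sketch: encode sentences with a small local model, then fit an L2-penalised
# logistic regression on a handful of labelled examples per class.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

sentences = [
    "Profit margins improved markedly this quarter.",
    "The outlook for demand has deteriorated sharply.",
]
labels = [1, 0]  # 1 = positive, 0 = negative (a few dozen per class in practice)

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # small model, runs locally
X = encoder.encode(sentences)

clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, labels)
print(clf.predict_proba(X))  # coefficients on embedding dimensions aid explanation
```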
By: | Paolo Verme |
Abstract: | Poverty prediction models are used to address missing data issues in a variety of contexts, such as poverty profiling, targeting with proxy-means tests, cross-survey imputations such as poverty mapping, top and bottom incomes studies, or vulnerability analyses. Based on the models used in this literature, this paper conducts a study that artificially corrupts data initially free of missing incomes with different patterns and shares of missingness. It then compares the capacity of classic econometric and machine learning models to predict poverty under different scenarios, with full information on observed and unobserved incomes and the true counterfactual poverty rate. Random forest provides more consistent and accurate predictions under most, but not all, scenarios. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.05958 |
By: | Mateusz Wilinski; Anubha Goel; Alexandros Iosifidis; Juho Kanniainen |
Abstract: | The rapid development of sophisticated machine learning methods, together with the increased availability of financial data, has the potential to transform financial research, but also poses a challenge in terms of validation and interpretation. A good case study is the task of classifying financial investors based on their behavioral patterns. Not only do we have access to both classification and clustering tools for high-dimensional data, but data identifying individual investors is finally available. The problem, however, is that we do not have access to ground truth when working with real-world data. This, together with the often limited interpretability of modern machine learning methods, makes it difficult to fully utilize the available research potential. In order to deal with this challenge we propose using a realistic agent-based model to generate synthetic data. This way one has access to ground truth, large replicable data, and limitless research scenarios. Using this approach we show that, even when classifying trading agents in a supervised manner is relatively easy, the more realistic task of unsupervised clustering may give incorrect or even misleading results. We complement these results by investigating in detail how the supervised techniques were able to successfully distinguish between different trading behaviors. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.21662 |
By: | Réka Juhász; Nathan J. Lane; Emily Oehlsen; Veronica C. Perez |
Abstract: | Since the 18th century, policymakers have debated the merits of industrial policy (IP). Yet, economists lack basic facts about its use due to measurement challenges. We propose a new approach to IP measurement based on information contained in policy text. We show how off-the-shelf supervised machine learning tools can be used to categorize industrial policies at scale. Using this approach, we validate longstanding concerns with earlier approaches to measurement which conflate IP with other types of policy. We apply our methodology to a global database of commercial policy descriptions, and provide a first look at IP use at the country, industry, and year levels (2010-2022). The new data on IP suggest that i) IP is on the rise; ii) modern IP tends to use subsidies and export promotion measures as opposed to tariffs; iii) rich countries heavily dominate IP use; iv) IP tends to target sectors with an established comparative advantage, particularly in high-income countries. |
JEL: | C38 L52 O25 |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33895 |
By: | Austin Pollok |
Abstract: | The discrepancy between realized volatility and the market's view of volatility has been known to predict individual equity options at the monthly horizon. It is not clear how this predictability depends on a forecast's ability to predict firm-level volatility. We consider this phenomenon at the daily frequency using high-dimensional machine learning models, as well as low-dimensional factor models. We find that marginal improvements to standard forecast error measurements can lead to economically significant gains in portfolio performance. This makes a case for re-imagining the way we train models that are used to construct portfolios. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.07928 |
By: | Marcus Buckmann; Quynh Anh Nguyen; Edward Hill |
Abstract: | We investigate whether the hidden states of large language models (LLMs) can be used to estimate and impute economic and financial statistics. Focusing on county-level (e.g. unemployment) and firm-level (e.g. total assets) variables, we show that a simple linear model trained on the hidden states of open-source LLMs outperforms the models' text outputs. This suggests that hidden states capture richer economic information than the responses of the LLMs reveal directly. A learning curve analysis indicates that only a few dozen labelled examples are sufficient for training. We also propose a transfer learning method that improves estimation accuracy without requiring any labelled data for the target variable. Finally, we demonstrate the practical utility of hidden-state representations in super-resolution and data imputation tasks. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.08662 |
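A minimal sketch of the hidden-state idea, assuming a small open-source model loaded through Hugging Face transformers; the prompts, the mean-pooling choice, and the target values are illustrative rather than the authors' setup.

```python
# Sketch: mean-pool the last-layer hidden states of an open-source LLM for a
# short prompt describing each unit, then fit a linear (ridge) model on them.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import Ridge

model_name = "gpt2"  # stand-in for any open-source LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

prompts = [
    "County: Example County, State A. Economic profile:",
    "County: Sample County, State B. Economic profile:",
]
targets = [4.1, 6.3]  # e.g. unemployment rates for the labelled counties

features = []
with torch.no_grad():
    for text in prompts:
        inputs = tokenizer(text, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state              # (1, tokens, dim)
        features.append(hidden.mean(dim=1).squeeze(0).numpy())  # pool over tokens

reg = Ridge(alpha=1.0).fit(features, targets)
print(reg.predict(features))
```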
By: | Dangxing Chen |
Abstract: | In recent years, machine learning models have achieved great success at the expense of highly complex black-box structures. By using axiomatic attribution methods, we can fairly allocate the contributions of each feature, thus allowing us to interpret the model predictions. In high-risk sectors such as finance, risk is just as important as mean predictions. Throughout this work, we address the following risk attribution problem: how can risk be fairly allocated, given a model and data? We demonstrate with analysis and empirical examples that risk can be well allocated by extending the Shapley value framework. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.06653 |
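For reference, the allocation underlying the abstract's extension is the standard Shapley value; in the risk-attribution setting described above, the value function v would be a risk measure evaluated on subsets of features rather than a mean prediction.

```latex
% Shapley value: the contribution assigned to feature i under value function v
% over the full feature set N (here v is read as a risk measure).
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr]
```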
By: | Guanhao Zhou; Yuefeng Han; Xiufan Yu |
Abstract: | This paper studies the task of estimating heterogeneous treatment effects in causal panel data models, in the presence of covariate effects. We propose a novel Covariate-Adjusted Deep Causal Learning (CoDEAL) for panel data models, that employs flexible model structures and powerful neural network architectures to cohesively deal with the underlying heterogeneity and nonlinearity of both panel units and covariate effects. The proposed CoDEAL integrates nonlinear covariate effect components (parameterized by a feed-forward neural network) with nonlinear factor structures (modeled by a multi-output autoencoder) to form a heterogeneous causal panel model. The nonlinear covariate component offers a flexible framework for capturing the complex influences of covariates on outcomes. The nonlinear factor analysis enables CoDEAL to effectively capture both cross-sectional and temporal dependencies inherent in the data panel. This latent structural information is subsequently integrated into a customized matrix completion algorithm, thereby facilitating more accurate imputation of missing counterfactual outcomes. Moreover, the use of a multi-output autoencoder explicitly accounts for heterogeneity across units and enhances the model interpretability of the latent factors. We establish theoretical guarantees on the convergence of the estimated counterfactuals, and demonstrate the compelling performance of the proposed method using extensive simulation studies and a real data application. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.20536 |
By: | Emily Aiken; Anik Ashraf; Joshua Blumenstock; Raymond Guiteras; Ahmed Mushfiq Mobarak |
Abstract: | Innovations in big data and algorithms are enabling new approaches to target interventions at scale. We compare the accuracy of three different systems for identifying the poor to receive benefit transfers — proxy means-testing, nominations from community members, and an algorithmic approach using machine learning to predict poverty using mobile phone usage behavior — and study how their cost-effectiveness varies with the scale and scope of the program. We collect mobile phone records from all major telecom operators in Bangladesh and conduct community-based wealth rankings and detailed consumption surveys of 5,000 households, to select the 22,000 poorest households for $300 transfers from 106,000 listed households. While proxy-means testing is most accurate, algorithmic targeting becomes more cost-effective for national-scale programs where large numbers of households have to be screened. We explore the external validity of these insights using survey data and mobile phone records data from Togo, and cross-country information on benefit transfer programs from the World Bank. |
JEL: | C55 I32 I38 O1 |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33919 |
By: | Shu Wang; Zijun Yao; Shuhuai Zhang; Jianuo Gai; Tracy Xiao Liu; Songfa Zhong |
Abstract: | Advancements in large language models (LLMs) have sparked a growing interest in measuring and understanding their behavior through experimental economics. However, there is still a lack of established guidelines for designing economic experiments for LLMs. By combining principles from experimental economics with insights from LLM research in artificial intelligence, we outline and discuss eight practical tactics for conducting experiments with LLMs. We further perform two sets of experiments to demonstrate the significance of these tactics. Our study enhances the design, replicability, and generalizability of LLM experiments, and broadens the scope of experimental economics in the digital age. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.21371 |
By: | Zonghan Wu; Junlin Wang; Congyuan Zou; Chenhan Wang; Yilei Shao |
Abstract: | Generative AI, particularly large language models (LLMs), is beginning to transform the financial industry by automating tasks and helping to make sense of complex financial information. One especially promising use case is the automatic creation of fundamental analysis reports, which are essential for making informed investment decisions, evaluating credit risks, guiding corporate mergers, etc. While LLMs attempt to generate these reports from a single prompt, the risks of inaccuracy are significant. Poor analysis can lead to misguided investments, regulatory issues, and loss of trust. Existing financial benchmarks mainly evaluate how well LLMs answer financial questions but do not reflect performance in real-world tasks like generating financial analysis reports. In this paper, we propose FinAR-Bench, a solid benchmark dataset focusing on financial statement analysis, a core competence of fundamental analysis. To make the evaluation more precise and reliable, we break this task into three measurable steps: extracting key information, calculating financial indicators, and applying logical reasoning. This structured approach allows us to objectively assess how well LLMs perform each step of the process. Our findings offer a clear understanding of LLMs' current strengths and limitations in fundamental analysis and provide a more practical way to benchmark their performance in real-world financial settings. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.07315 |
By: | Nofal, Bastián Castro; Flores, Ignacio; Cubillos, Pablo Gutiérrez |
Abstract: | This paper examines wealth inequality dynamics in Chile from 2007 to 2021, focusing on two key macroeconomic events: the sharp rise in housing prices after the introduction of a real estate value-added tax in 2016 and the substantial liquidation of pension assets through early withdrawals during the pandemic. We introduce a methodological innovation that aims to improve the measurement of wealth inequality by integrating administrative pension fund records into household wealth surveys using machine learning techniques. Our results reveal extreme levels of wealth concentration, with the top 10% holding approximately two-thirds of national private wealth. However, inequality slightly declined over the period, particularly after 2016, as the outcome of two opposing forces: housing appreciation, which benefited middle-class households, and pension fund withdrawals, which disproportionately reduced wealth at the lower end of the distribution. (Stone Center on Socio-Economic Inequality Working Paper) |
Date: | 2025–06–06 |
URL: | https://d.repec.org/n?u=RePEc:osf:socarx:b8zve_v1 |
By: | Zheng Cao; Wanchaloem Wunkaew; Helyette Geman |
Abstract: | This paper introduces the Hype Index as a novel metric to quantify media attention toward large-cap equities, leveraging advances in Natural Language Processing (NLP) for extracting predictive signals from financial news. Using the S&P 100 as the focus universe, we first construct a News Count-Based Hype Index, which measures relative media exposure by computing the share of news articles referencing each stock or sector. We then extend it to the Capitalization Adjusted Hype Index, which adjusts for economic size by taking the ratio of a stock's or sector's media weight to its market capitalization weight within its industry or sector. We compute both versions of the Hype Index at the stock and sector levels, and evaluate them through multiple lenses: (1) their classification into different hype groups, (2) their associations with returns, volatility, and the VIX index at various lags, (3) their signaling power for short-term market movements, and (4) their empirical properties, including correlations, sampling properties, and trends. Our findings suggest that the Hype Index family provides a valuable set of tools for stock volatility analysis, market signaling, and NLP extensions in Finance. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.06329 |
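The two index definitions above translate directly into ratios. A small pandas sketch on made-up news counts and market capitalizations:

```python
# Sketch of the two Hype Index variants: the news-count share of each stock
# within its sector, and that share divided by the stock's market-cap weight.
import pandas as pd

df = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC"],
    "sector": ["Tech", "Tech", "Tech"],
    "news_count": [120, 40, 40],        # articles mentioning each stock
    "market_cap": [2.0e12, 1.5e12, 0.5e12],
})

df["news_share"] = df["news_count"] / df.groupby("sector")["news_count"].transform("sum")
df["cap_weight"] = df["market_cap"] / df.groupby("sector")["market_cap"].transform("sum")

df["hype_count"] = df["news_share"]                       # News Count-Based Hype Index
df["hype_cap_adj"] = df["news_share"] / df["cap_weight"]  # Capitalization Adjusted
print(df[["ticker", "hype_count", "hype_cap_adj"]])
```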
By: | Falck-Zepeda, José B.; Zambrano, Patricia; Sanders, Arie; Trabanino, Carlos Rogelio |
Abstract: | Robust impact assessment methods need credible yield, cost, and other production performance parameter estimates. Sample data issues and the realities of producer heterogeneity and markets, including endogeneity, simultaneity, and outliers, can affect such parameters. Methods have continued to evolve that may address the data issues identified in the earlier literature examining genetically modified (GM) crop impacts, especially those of conventional field-level surveys. These methods may themselves have limitations, introduce trade-offs, and may not always succeed in addressing such issues. Experimental methods such as randomized control trials have been proposed to address several control-treatment data issues, but these may not be suitable for every situation and issue, and may be more expensive and complex than conventional field surveys. Furthermore, experimental methods may induce the unfortunate outcome of crowding out impact assessors from low- and middle-income countries. The continued search for alternatives that help address conventional survey shortcomings therefore remains critical. Previously, existing assessment methods were applied to the impact assessment of insect-resistant and herbicide-tolerant maize adoption in Honduras in 2008 and 2012. Results from these assessments identified endogeneity issues, such as self-selection and simultaneity, concurrently with influential outliers. Procedures used to address these issues independently showed trade-offs between addressing endogeneity and outliers. Thus, the need to identify methods that address both issues simultaneously, minimizing method trade-offs as much as possible, continues. We structure this paper as follows. First, we review the literature to delineate data and assessment issues potentially affecting robust performance indicators such as yield and cost differentials. Second, we discuss and apply four types of approaches that can be used to obtain robust performance estimates for yield and cost differentials: 1) Robust Instrumental Variables, 2) Instrumental Variable Regressions, 3) Control/Treatment approaches, and 4) Machine Learning methods amenable to robust strategies for dealing with outliers, including Random Forest and a Stacking regression approach that allows for a number of “base learners”, applied to the pooled 2008 and 2012 Honduras field surveys. Third, we discuss implications for impact assessment results and implementation limitations, especially in low- and middle-income countries. We further discuss and draw some conclusions regarding methodological issues for consideration by impact assessors and stakeholders. |
Keywords: | maize; yields; impact assessment; agriculture; data; capacity building; machine learning; parametric programming; herbicide resistance; Honduras; Latin America and the Caribbean; Central America |
Date: | 2025–04–24 |
URL: | https://d.repec.org/n?u=RePEc:fpr:ifprid:174327 |
By: | Mukashov, Askar; Robinson, Sherman; Thurlow, James; Arndt, Channing; Thomas, Timothy S. |
Abstract: | This paper uses machine learning, simulation, and data mining methods to develop Systematic Risk Profiles of three developing economies: Kenya, Rwanda, and Malawi. We focus on three exogenous shocks with implications for economic performance: world market prices, capital flows, and climate-driven sectoral productivity. In these and other developing countries, recent decades have been characterized by increased risks associated with all these factors, and there is a demand for instruments that can help to disentangle them. For each country, we utilize historical data to develop multi-variate distributions of shocks. We then sample from these distributions to obtain a series of shock vectors, which we label economic uncertainty scenarios. These scenarios are then entered into economywide computable general equilibrium (CGE) simulation models for the three countries, which allow us to quantify the impact of increased uncertainty on major economic indicators. Finally, we utilize importance metrics from the random forest machine learning algorithm and relative importance metrics from multiple linear regression models to quantify the importance of country-specific risk factors for country performance. We find that Malawi and Rwanda are more vulnerable to sectoral productivity shocks, while Kenya is more exposed to external risks. These findings suggest that a country’s level of development and integration into the global economy are key driving forces defining its risk profile. The methodology of Systematic Risk Profiling can be applied to many other countries, delineating country-specific risks and vulnerabilities. |
Keywords: | climate; computable general equilibrium models; machine learning; risk; uncertainty; Kenya; Rwanda; Malawi; Africa; Eastern Africa; Sub-Saharan Africa |
Date: | 2024–10–25 |
URL: | https://d.repec.org/n?u=RePEc:fpr:gsspwp:158180 |
By: | Giuseppe Arbia; Luca Morandini; Vincenzo Nardelli |
Abstract: | This paper investigates the ability of Large Language Models (LLMs) to assess the economic soundness and theoretical consistency of empirical findings in spatial econometrics. We created original and deliberately altered "counterfactual" summaries from 28 published papers (2005-2024), which were evaluated by a diverse set of LLMs. The LLMs provided qualitative assessments and structured binary classifications on variable choice, coefficient plausibility, and publication suitability. The results indicate that while LLMs can expertly assess the coherence of variable choices (with top models like GPT-4o achieving an overall F1 score of 0.87), their performance varies significantly when evaluating deeper aspects such as coefficient plausibility and overall publication suitability. The results further reveal that the choice of LLM, the specific characteristics of the paper, and the interaction between these two factors significantly influence the accuracy of the assessment, particularly for nuanced judgments. These findings highlight LLMs' current strengths in assisting with initial, more surface-level checks and their limitations in performing comprehensive, deep economic reasoning, suggesting a potential assistive role in peer review that still necessitates robust human oversight. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.06377 |
By: | Weber, Isabella; Wasner, Evan; Lang, Markus; Braun, Benjamin; Klooster, Jens van’t |
Abstract: | Supply shocks are now widely recognized as a driver of the recent inflation bout, but the role of firms’ pricing strategies in propagating input cost shocks remains contested. In this paper, we review the state of the academic debate over sellers’ inflation and assess whether, in line with this theory, economy-wide cost shocks have functioned as an implicit coordination mechanism for firms to hike prices. We use a dataset containing 138,962 corporate earnings call transcripts of 4,823 stock-market listed U.S. corporations from the period 2007-Q1 to 2022-Q2 to conduct sentiment analysis via both dictionary-based natural language processing and a large language model approach. We find that large input price shocks (as well as their co-occurrence with supply constraints) correlate with positive sentiments expressed in executives’ statements about cost increases. Qualitative analysis provides further insights into the reasoning behind executives’ optimism regarding their ability to turn an economy-wide cost shock into an opportunity to raise prices and protect or even increase profits. |
Keywords: | inflation; profits; price coordination; sentiment analysis; earnings calls |
JEL: | J1 F3 G3 |
Date: | 2025–09–30 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:128231 |
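The dictionary-based leg of the sentiment analysis above can be illustrated with a toy scorer; the word lists below are tiny stand-ins for a full financial lexicon (for instance Loughran-McDonald), not the authors' dictionary.

```python
# Toy dictionary-based sentiment scorer for earnings-call sentences about
# costs; the word lists are illustrative stand-ins for a full lexicon.
import re

POSITIVE = {"opportunity", "confident", "strong", "improve"}
NEGATIVE = {"pressure", "headwind", "decline", "weak", "uncertain"}

def sentiment_score(sentence: str) -> int:
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return sum(w in tokens for w in POSITIVE) - sum(w in tokens for w in NEGATIVE)

s = ("Input costs rose across the industry, but we are confident we can pass "
     "them through and see an opportunity to improve margins.")
print(sentiment_score(s))  # positive sentiment despite the cost increase
```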
By: | R. Maria del Rio-Chanona; Marco Pangallo; Cars Hommes |
Abstract: | We explore the potential of Large Language Models (LLMs) to replicate human behavior in economic market experiments. Compared to previous studies, we focus on dynamic feedback between LLM agents: the decisions of each LLM impact the market price at the current step, and so affect the decisions of the other LLMs at the next step. We compare LLM behavior to market dynamics observed in laboratory settings and assess their alignment with human participants' behavior. Our findings indicate that LLMs do not adhere strictly to rational expectations, displaying instead bounded rationality, similarly to human participants. Providing a minimal context window, i.e. memory of the three previous time steps, combined with a high-variability setting capturing response heterogeneity, allows LLMs to replicate broad trends seen in human experiments, such as the distinction between positive and negative feedback markets. However, differences remain at a granular level--LLMs exhibit less heterogeneity in behavior than humans. These results suggest that LLMs hold promise as tools for simulating realistic human behavior in economic contexts, though further research is needed to refine their accuracy and increase behavioral diversity. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.07457 |
By: | Fabian Muny |
Abstract: | Many programs evaluated in observational studies incorporate a sequential structure, where individuals may be assigned to various programs over time. While this complexity is often simplified by analyzing programs at single points in time, this paper reviews, explains, and applies methods for program evaluation within a sequential framework. It outlines the assumptions required for identification under dynamic confounding and demonstrates how extending sequential estimands to dynamic policies enables the construction of more realistic counterfactuals. Furthermore, the paper explores recently developed methods for estimating effects across multiple treatments and time periods, utilizing Double Machine Learning (DML), a flexible estimator that avoids parametric assumptions while preserving desirable statistical properties. Using Swiss administrative data, the methods are demonstrated through an empirical application assessing the participation of unemployed individuals in active labor market policies, where assignment decisions by caseworkers can be reconsidered between two periods. The analysis identifies a temporary wage subsidy as the most effective intervention, on average, even after adjusting for its extended duration compared to other programs. Overall, DML-based analysis of dynamic policies proves to be a useful approach within the program evaluation toolkit. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.11960 |
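The single-period building block of Double Machine Learning can be sketched with cross-fitting in a partially linear model; the paper's contribution layers sequential, multi-period nuisance estimation and dynamic policies on top of this idea. The data below are simulated.

```python
# Single-period DML building block: nuisance functions learned by ML with
# cross-fitting, treatment effect from a final regression on residuals.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                    # confounders (e.g. caseworker info)
D = (X[:, 0] + rng.normal(size=n) > 0) * 1.0   # treatment (e.g. programme entry)
Y = 1.5 * D + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

res_y, res_d = np.zeros(n), np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    m_y = RandomForestRegressor(n_estimators=200).fit(X[train], Y[train])
    m_d = RandomForestRegressor(n_estimators=200).fit(X[train], D[train])
    res_y[test] = Y[test] - m_y.predict(X[test])
    res_d[test] = D[test] - m_d.predict(X[test])

theta = (res_d @ res_y) / (res_d @ res_d)  # effect of D on Y, confounding removed
print(round(theta, 3))                     # approximately the true value of 1.5
```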
By: | Thiago Christiano Silva; Kei Moriya; Mr. Romain M Veyrune |
Abstract: | This paper introduces a classification framework to analyze central bank communications across four dimensions: topic, communication stance, sentiment, and audience. Using a fine-tuned large language model trained on central bank documents, we classify individual sentences to transform policy language into systematic and quantifiable metrics on how central banks convey information to diverse stakeholders. Applied to a multilingual dataset of 74,882 documents from 169 central banks spanning 1884 to 2025, this study delivers the most comprehensive empirical analysis of central bank communication to date. Monetary policy communication changes significantly with inflation targeting, as backward-looking exchange rate discussions give way to forward-looking statements on inflation, interest rates, and economic conditions. We develop a directional communication index that captures signals about future policy rate changes and unconventional measures, including forward guidance and balance sheet operations. This unified signal helps explain future movements in market rates. While tailoring messages to audiences is often asserted, we offer the first systematic quantification of this practice. Audience-specific risk communication has remained stable for decades, suggesting a structural and deliberate tone. Central banks adopt neutral, fact-based language with financial markets, build confidence with the public, and highlight risks to governments. During crises, however, this pattern shifts remarkably: confidence-building rises in communication to the financial sector and government, while risk signaling increases for other audiences. Forward-looking risk communication also predicts future market volatility, demonstrating that central bank language plays a dual role across monetary and financial stability channels. Together, these findings provide novel evidence that communication is an active policy tool for steering expectations and shaping economic and financial conditions. |
Keywords: | Central bank communication; large language models; forward guidance; monetary policy; sentiment analysis |
Date: | 2025–06–06 |
URL: | https://d.repec.org/n?u=RePEc:imf:imfwpa:2025/109 |
By: | Weixian Waylon Li; Hyeonjun Kim; Mihai Cucuringu; Tiejun Ma |
Abstract: | Large Language Models (LLMs) have recently been leveraged for asset pricing tasks and stock trading applications, enabling AI agents to generate investment decisions from unstructured financial data. However, most evaluations of LLM timing-based investing strategies are conducted on narrow timeframes and limited stock universes, overstating effectiveness due to survivorship and data-snooping biases. We critically assess their generalizability and robustness by proposing FINSABER, a backtesting framework evaluating timing-based strategies across longer periods and a larger universe of symbols. Systematic backtests over two decades and 100+ symbols reveal that previously reported LLM advantages deteriorate significantly under broader cross-section and over a longer-term evaluation. Our market regime analysis further demonstrates that LLM strategies are overly conservative in bull markets, underperforming passive benchmarks, and overly aggressive in bear markets, incurring heavy losses. These findings highlight the need to develop LLM strategies that are able to prioritise trend detection and regime-aware risk controls over mere scaling of framework complexity. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.07078 |
By: | Gabriel Nova; Sander van Cranenburgh; Stephane Hess |
Abstract: | Discrete choice modelling is a theory-driven modelling framework for understanding and forecasting choice behaviour. To obtain behavioural insights, modellers test several competing model specifications in their attempts to discover the 'true' data generation process. This trial-and-error process requires expertise, is time-consuming, and relies on subjective theoretical assumptions. Although metaheuristics have been proposed to assist choice modellers, they treat model specification as a classic optimisation problem, relying on static strategies, applying predefined rules, and neglecting outcomes from previous estimated models. As a result, current metaheuristics struggle to prioritise promising search regions, adapt exploration dynamically, and transfer knowledge to other modelling tasks. To address these limitations, we introduce a deep reinforcement learning-based framework where an 'agent' specifies models by estimating them and receiving rewards based on goodness-of-fit and parsimony. Results demonstrate the agent dynamically adapts its strategies to identify promising specifications across data generation processes, showing robustness and potential transferability, without prior domain knowledge. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.06410 |
By: | Mindy L. Mallory; Rundong Peng; Meilin Ma; H. Holly Wang |
Abstract: | Price transmission has been studied extensively in agricultural economics through the lens of spatial and vertical price relationships. Classical time series econometric techniques suffer from the "curse of dimensionality" and are applied almost exclusively to small sets of price series, either prices of one commodity in a few regions or prices of a few commodities in one region. However, an agrifood supply chain usually contains several commodities (e.g., cattle and beef) and spans numerous regions. Failing to jointly examine multi-region, multi-commodity price relationships limits researchers' ability to derive insights from increasingly high-dimensional price datasets of agrifood supply chains. We apply a machine-learning method - specifically, regularized regression - to augment the classical vector error correction model (VECM) and study large spatial-plus-vertical price systems. Leveraging weekly provincial-level data on the piglet-hog-pork supply chain in China, we uncover economically interesting changes in price relationships in the system before and after the outbreak of a major hog disease. To quantify price transmission in the large system, we rely on the spatial-plus-vertical price relationships identified by the regularized VECM to visualize comprehensive spatial and vertical price transmission of hypothetical shocks through joint impulse response functions. Price transmission shows considerable heterogeneity across regions and commodities as the VECM outcomes imply and display different dynamics over time. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.13967 |
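One stylised reading of "regularized regression augmenting the VECM" is to estimate each error-correction equation with an L1 penalty, so that only the relevant spatial and vertical links survive. The sketch below uses simulated random-walk prices, not the Chinese piglet-hog-pork data.

```python
# Stylised regularised-VECM step: regress each price change on lagged levels
# (error-correction terms) and lagged changes of all series with an L1 penalty.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
T, k = 300, 12                                  # 12 series: regions x chain stages
P = np.cumsum(rng.normal(size=(T, k)), axis=0)  # toy I(1) price levels

dP = np.diff(P, axis=0)      # price changes
y_all = dP[1:]               # dependent variables: each series' change at t
levels_lag = P[1:-1]         # lagged levels p_{t-1}
dP_lag = dP[:-1]             # lagged changes dp_{t-1}
X = np.hstack([levels_lag, dP_lag])

coefs = np.vstack([LassoCV(cv=5).fit(X, y_all[:, j]).coef_ for j in range(k)])
print((np.abs(coefs) > 1e-8).sum(), "non-zero price-transmission links retained")
```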
By: | Xueying Ding; Aakriti Mittal; Achintya Gopal |
Abstract: | Time-series data is a vital modality within data science communities. It is particularly valuable in financial applications, where it helps in detecting patterns, understanding market behavior, and making informed decisions based on historical data. Recent advances in language modeling have led to the rise of time-series pre-trained models that are trained on vast collections of datasets and applied to diverse tasks across financial domains. However, across financial applications, existing time-series pre-trained models have not shown boosts in performance over simple finance benchmarks in either zero-shot or fine-tuning settings. This phenomenon occurs because of i) a lack of financial data within the pre-training stage, and ii) a negative transfer effect due to inherently different time-series patterns across domains. Furthermore, time-series data is continuous, noisy, and can be collected at varying frequencies and with varying lags across different variables, making this data more challenging to model than languages. To address the above problems, we introduce a Pre-trained MoDEL for FINance TimE-series (Delphyne). Delphyne achieves performance competitive with existing foundation and full-shot models with only a few fine-tuning steps on publicly available datasets, and also shows superior performance on various financial tasks. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.06288 |
By: | Yu Li; Yuhan Wu; Shuhua Zhang |
Abstract: | In this paper, we study the continuous-time multi-asset mean-variance (MV) portfolio selection using a reinforcement learning (RL) algorithm, specifically the soft actor-critic (SAC) algorithm, in the time-varying financial market. A family of Gaussian portfolio selections is derived, and a policy iteration process is crafted to learn the optimal exploratory portfolio selection. We prove the convergence of the policy iteration process theoretically, based on which the SAC algorithm is developed. To improve the algorithm's stability and the learning accuracy in the multi-asset scenario, we divide the model parameters that influence the optimal portfolio selection into three parts, and learn each part progressively. Numerical studies in the simulated and real financial markets confirm the superior performance of the proposed SAC algorithm under various criteria. |
Date: | 2025–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2505.07537 |
By: | James Cussens; Julia Hatamyar; Vishalie Shah; Noemi Kreif |
Abstract: | We develop and implement a version of the popular "policytree" method (Athey and Wager, 2021) using discrete optimisation techniques. We test the performance of our algorithm in finite samples and find an improvement in the runtime of optimal policy tree learning by a factor of nearly 50 compared to the original version. We provide an R package, "fastpolicytree", for public use. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.15435 |
By: | Qirui Mi; Qipeng Yang; Zijun Fan; Wentian Fan; Heyang Ma; Chengdong Ma; Siyu Xia; Bo An; Jun Wang; Haifeng Zhang |
Abstract: | Artificial intelligence (AI) has become a powerful tool for economic research, enabling large-scale simulation and policy optimization. However, applying AI effectively requires simulation platforms for scalable training and evaluation, yet existing environments remain limited to simplified, narrowly scoped tasks, falling short of capturing complex economic challenges such as demographic shifts, multi-government coordination, and large-scale agent interactions. To address this gap, we introduce EconGym, a scalable and modular testbed that connects diverse economic tasks with AI algorithms. Grounded in rigorous economic modeling, EconGym implements 11 heterogeneous role types (e.g., households, firms, banks, governments), their interaction mechanisms, and agent models with well-defined observations, actions, and rewards. Users can flexibly compose economic roles with diverse agent algorithms to simulate rich multi-agent trajectories across 25+ economic tasks for AI-driven policy learning and analysis. Experiments show that EconGym supports diverse and cross-domain tasks, such as coordinating fiscal, pension, and monetary policies, and enables benchmarking across AI, economic methods, and hybrids. Results indicate that richer task composition and algorithm diversity expand the policy space, while AI agents guided by classical economic methods perform best in complex settings. EconGym also scales to 10k agents with high realism and efficiency. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.12110 |
By: | Weiyao Meng; John Harvey; James Goulding; Chris James Carter; Evgeniya Lukinova; Andrew Smith; Paul Frobisher; Mina Forrest; Georgiana Nica-Avram |
Abstract: | Reading and evaluating product reviews is central to how most people decide what to buy and consume online. However, the recent emergence of Large Language Models and Generative Artificial Intelligence now means writing fraudulent or fake reviews is potentially easier than ever. Through three studies we demonstrate that (1) humans are no longer able to distinguish between real and fake product reviews generated by machines, averaging only 50.8% accuracy overall - essentially the same as would be expected by chance alone; (2) LLMs are likewise unable to distinguish between fake and real reviews and perform as badly as, or even worse than, humans; and (3) humans and LLMs pursue different strategies for evaluating authenticity, which lead to equivalently poor accuracy but different precision, recall and F1 scores - indicating they perform worse at different aspects of judgment. The results reveal that review systems everywhere are now susceptible to mechanised fraud if they do not depend on trustworthy purchase verification to guarantee the authenticity of reviewers. Furthermore, the results provide insight into the consumer psychology of how humans judge authenticity, demonstrating there is an inherent 'scepticism bias' towards positive reviews and a special vulnerability to misjudging the authenticity of fake negative reviews. Additionally, the results provide a first insight into the 'machine psychology' of judging fake reviews, revealing that the strategies LLMs take to evaluate authenticity radically differ from those of humans, in ways that are equally wrong in terms of accuracy, but different in their misjudgments. |
Date: | 2025–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2506.13313 |