|
on Big Data |
By: | David Imhof; Emanuel W Viklund; Martin Huber |
Abstract: | We propose a novel application of graph attention networks (GATs), a type of graph neural network enhanced with attention mechanisms, to develop a deep learning algorithm for detecting collusive behavior, leveraging predictive features suggested in prior research. We test our approach on a large dataset covering 13 markets across seven countries. Our results show that predictive models based on GATs, trained on a subset of the markets, can be effectively transferred to other markets, achieving accuracy rates between 80% and 90%, depending on the hyperparameter settings. The best-performing configuration, applied to eight markets from Switzerland and the Japanese region of Okinawa, yields an average accuracy of 91% for cross-market prediction. When extended to 12 markets, the method maintains a strong performance with an average accuracy of 84%, surpassing traditional ensemble approaches in machine learning. These results suggest that GAT-based detection methods offer a promising tool for competition authorities to screen markets for potential cartel activity. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.12369 |
By: | Jieyu Chen; Sebastian Lerch; Melanie Schienle; Tomasz Serafin; Rafal Weron |
Abstract: | The growing importance of intraday electricity trading in Europe calls for improved price forecasting and tailored decision-support tools. In this paper, we propose a novel generative neural network model to generate probabilistic path forecasts for intraday electricity prices and use them to construct effective trading strategies for Germany's continuous-time intraday market. Our method demonstrates competitive performance in terms of statistical evaluation metrics compared to two state-of-the-art statistical benchmark approaches. To further assess its economic value, we consider a realistic fixed-volume trading scenario and propose various strategies for placing market sell orders based on the path forecasts. Among the different trading strategies, the price paths generated by our generative model lead to higher profit gains than the benchmark methods. Our findings highlight the potential of generative machine learning tools in electricity price forecasting and underscore the importance of economic evaluation. |
Keywords: | Intraday electricity market; Probabilistic forecast; Path forecast; Prediction bands; Energy score; Machine learning; Generative neural network; Trading recommendations |
JEL: | C22 C32 C45 C51 C53 Q41 Q47 |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:ahh:wpaper:worms2505 |
By: | Daniele Ballinari; Jessica Maly |
Abstract: | We enhance sentiment analysis in the foreign exchange (FX) market by fine-tuning large language models (LLMs) to better understand and interpret the complex language specific to FX markets. We build on existing methods by using state-of-the-art open source LLMs, fine-tuning them with labelled FX news articles and then comparing their performance against traditional approaches and alternative models. Furthermore, we tested these fine-tuned LLMs by creating investment strategies based on the sentiment they detect in FX analysis articles with the goal of demonstrating how well these strategies perform in real-world trading scenarios. Our findings indicate that the fine-tuned LLMs outperform the existing methods in terms of both the classification accuracy and trading performance, highlighting their potential for improving FX market sentiment analysis and investment decision-making. |
Keywords: | Large language models, Sentiment analysis, Fine-tuning, Text classification, Natural language processing, Foreign exchange, Financial markets |
JEL: | F31 G12 G15 |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:snb:snbwpa:2025-11 |
By: | Gambara, Matteo; Livieri, Giulia; Pallavicini, Andrea |
Abstract: | Evaluating financial products with early-termination clauses, particularly those with path-dependent structures, is challenging. This paper focuses on Asian options, look-back options, and callable certificates. We will compare regression methods for pricing and computing sensitivities, highlighting modern machine learning techniques against traditional polynomial basis functions. Specifically, we will analyze randomized recurrent and feed-forward neural networks, along with a novel approach using signatures of the underlying price process. For option sensitivities like Delta and Gamma, we will incorporate Chebyshev interpolation. Our findings show that machine learning algorithms often match the accuracy and efficiency of traditional methods for Asian and look-back options, while randomized neural networks are best for callable certificates. Furthermore, we apply Chebyshev interpolation for Delta and Gamma calculations for the first time in Asian options and callable certificates. |
Keywords: | amerasian options; callable certificates; random networks; Chebyshev Greeks; early termination; signature methods |
JEL: | C63 G13 |
Date: | 2025–06–30 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:128600 |
By: | Rehim Kılıç |
Abstract: | This paper fills an important gap in the volatility forecasting literature by comparing a broad suite of machine learning (ML) methods with both linear and nonlinear econometric models using high-frequency realized volatility (RV) data for the S&P 500. We evaluate ARFIMA, HAR, regime-switching HAR models (THAR, STHAR, MSHAR), and ML methods including Extreme Gradient Boosting, deep feed-forward neural networks, and recurrent networks (BRNN, LSTM, LSTM-A, GRU). Using rolling forecasts from 2006 onward, we find that regime-switching models—particularly THAR and STHAR—consistently outperform ML and linear models, especially when predictors are limited. These models also deliver more accurate risk forecasts and higher realized utility. While ML models capture some nonlinear patterns, they offer no consistent advantage over simpler, interpretable alternatives. Our findings highlight the importance of modeling regime changes through transparent econometric tools, especially in real-world applications where predictor availability is sparse and model interpretability is critical for risk management and portfolio allocation. |
Keywords: | Realized volatility; Machine learning; Regime-switching; Nonlinearity; VaR; forecasting |
JEL: | C10 C50 G11 G15 |
Date: | 2025–08–08 |
URL: | https://d.repec.org/n?u=RePEc:fip:fedgfe:2025-61 |
By: | Jeremy Proz; Martin Huber |
Abstract: | Collusion and capacity withholding in electricity wholesale markets are important mechanisms of market manipulation. This study applies a refined machine learning-based cartel detection algorithm to two cartel cases in the Italian electricity market and evaluates its out-of-sample performance. Specifically, we consider an ensemble machine learning method that uses statistical screens constructed from the offer price distribution as predictors for the incidence of collusion among electricity providers in specific regions. We propose novel screens related to the capacity-withholding behavior of electricity providers and find that including such screens derived from the day-ahead spot market as predictors can improve cartel detection. We find that, under complete cartels - where collusion in a tender presumably involves all suppliers - the method correctly classifies up to roughly 95% of tenders in our data as collusive or competitive, improving classification accuracy compared to using only previously available screens. However, when trained on larger datasets including non-cartel members and applying algorithms tailored to detect incomplete cartels, the previously existing screens are sufficient to achieve 98% accuracy, and the addition of our newly proposed capacity-withholding screens does not further improve performance. Overall, this study highlights the promising potential of supervised machine learning techniques for detecting and dismantling cartels in electricity markets. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.09885 |
By: | Chenghao Liu; Aniket Mahanti; Ranesh Naha; Guanghao Wang; Erwann Sbai |
Abstract: | As cryptocurrencies gain popularity, the digital asset marketplace becomes increasingly significant. Understanding social media signals offers valuable insights into investor sentiment and market dynamics. Prior research has predominantly focused on text-based platforms such as Twitter. However, video content remains underexplored, despite potentially containing richer emotional and contextual sentiment that is not fully captured by text alone. In this study, we present a multimodal analysis comparing TikTok and Twitter sentiment, using large language models to extract insights from both video and text data. We investigate the dynamic dependencies and spillover effects between social media sentiment and cryptocurrency market indicators. Our results reveal that TikTok's video-based sentiment significantly influences speculative assets and short-term market trends, while Twitter's text-based sentiment aligns more closely with long-term dynamics. Notably, the integration of cross-platform sentiment signals improves forecasting accuracy by up to 20%. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.15825 |
By: | Arkadiusz Lipiecki; Kaja Bilinska; Nikolaos Kourentzes; Rafal Weron |
Abstract: | We introduce the concept of Temporal Hierarchy Forecasting (THieF) in predicting day-ahead electricity prices and show that reconciling forecasts for hourly products, 2- to 12-hour blocks, and baseload contracts significantly (up to 13%) improves accuracy at all levels. These results remain consistent throughout a challenging 4-year test period (2021-2024) in the German power market and across model architectures, including linear regression, a shallow neural network, gradient boosting, and a state-of-the-art transformer. Given that (i) trading of block products is becoming more common and (ii) the computational cost of reconciliation is comparable to that of predicting hourly prices alone, we recommend using it in daily forecasting practice. |
Keywords: | Electricity price; Temporal Hierarchy Forecasting (THieF); Forecast reconciliation; Regression; Machine learning |
JEL: | C22 C45 C51 C53 Q41 Q47 |
Date: | 2025 |
URL: | https://d.repec.org/n?u=RePEc:ahh:wpaper:worms2506 |
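The reconciliation idea in the entry above can be illustrated with a minimal sketch. The abstract does not say which reconciliation method the authors use, so this example assumes the simplest variant (unweighted, OLS-style projection) and assumes a block price equals the average of its hourly prices. The function name `reconcile_block` and the numbers are hypothetical.

```python
def reconcile_block(block_forecast, hourly_forecasts):
    """Make one block forecast and its hourly forecasts coherent.

    Assumption: the block price is the mean of the hourly prices.
    The base forecasts are projected orthogonally onto the set of
    coherent forecasts (unweighted, OLS-style reconciliation).
    """
    n = len(hourly_forecasts)
    if n == 0:
        raise ValueError("need at least one hourly forecast")
    # Incoherence: distance between the block forecast and the hourly mean.
    gap = block_forecast - sum(hourly_forecasts) / n
    # Constraint vector c = (1, -1/n, ..., -1/n), so c'c = 1 + 1/n.
    scale = gap / (1.0 + 1.0 / n)
    new_block = block_forecast - scale
    new_hourly = [h + scale / n for h in hourly_forecasts]
    return new_block, new_hourly


# Hypothetical base forecasts in EUR/MWh: a 2-hour block and its two hours.
block, hours = reconcile_block(50.0, [40.0, 48.0])
print(block, hours)
```

After reconciliation the block forecast equals the mean of the adjusted hourly forecasts, so forecasts at both levels share information while remaining consistent with each other.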
By: | Ivan Letteri |
Abstract: | The detection of outliers within cryptocurrency limit order books (LOBs) is of paramount importance for comprehending market dynamics, particularly in highly volatile and nascent regulatory environments. This study conducts a comprehensive comparative analysis of robust statistical methods and advanced machine learning techniques for real-time anomaly identification in cryptocurrency LOBs. Within a unified testing environment, named AITA Order Book Signal (AITA-OBS), we evaluate the efficacy of thirteen diverse models to identify which approaches are most suitable for detecting potentially manipulative trading behaviours. An empirical evaluation, conducted via backtesting on a dataset of 26,204 records from a major exchange, demonstrates that the top-performing model, Empirical Covariance (EC), achieves a 6.70% gain, significantly outperforming a standard Buy-and-Hold benchmark. These findings underscore the effectiveness of outlier-driven strategies and provide insights into the trade-offs between model complexity, trade frequency, and performance. This study contributes to the growing corpus of research on cryptocurrency market microstructure by furnishing a rigorous benchmark of anomaly detection models and highlighting their potential for augmenting algorithmic trading and risk management. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.14960 |
By: | Yueyi Wang; Qiyao Wei |
Abstract: | In this study, we wish to showcase the unique utility of large language models (LLMs) in financial semantic annotation and alpha signal discovery. Leveraging a corpus of company-related tweets, we use an LLM to automatically assign multi-label event categories to high-sentiment-intensity tweets. We align these labeled sentiment signals with forward returns over 1-to-7-day horizons to evaluate their statistical efficacy and market tradability. Our experiments reveal that certain event labels consistently yield negative alpha, with Sharpe ratios as low as -0.38 and information coefficients exceeding 0.05, all statistically significant at the 95% confidence level. This study establishes the feasibility of transforming unstructured social media text into structured, multi-label event variables. A key contribution of this work is its commitment to transparency and reproducibility; all code and methodologies are made publicly available. Our results provide compelling evidence that social media sentiment is a valuable, albeit noisy, signal in financial forecasting and underscore the potential of open-source frameworks to democratize algorithmic trading research. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.07408 |
By: | Askitas, Nikos (IZA) |
Abstract: | We examine the uptake of GPT-assisted writing in economics working paper abstracts. Using data from the IZA DP series, we detect a clear stylistic shift after the release of ChatGPT-3.5 in March 2023. This shift is evident in core textual metrics—mean word length, type-token ratio, and readability—and reflects growing convergence with machine-generated writing. While the ChatGPT launch was an exogenous shock, adoption is endogenous: authors choose whether to use AI. To capture this behavioral response, we combine stylometric analysis, machine learning classification, and prompt-based similarity testing. Event-study regressions with fixed effects and placebo checks confirm that the change is abrupt, persistent, and not explained by pre-existing trends. A similarity experiment using OpenAI’s API shows that post-ChatGPT abstracts resemble their GPT-optimized versions more closely than pre-ChatGPT abstracts resemble theirs. A classifier, trained on these variants, flags a growing share of post-March 2023 texts as GPT-like. Rather than suggesting full automation, our findings indicate selective human–AI augmentation. Our framework generalizes to other contexts such as resumes, job ads, legal briefs, research proposals, or programming code. |
Keywords: | AI-assisted writing, linguistic metrics, event study, machine learning, natural language processing (NLP), text analysis, academic writing, GPT adoption, diffusion of technology |
JEL: | C55 C88 O33 C81 L86 J24 |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:iza:izadps:dp18062 |
By: | Jinbo Cai; Wenze Li; Wenjie Wang |
Abstract: | With stakeholder-level in-market data, we conduct a comparative analysis of machine learning (ML) for forecasting electricity prices in Singapore, spanning 15 individual models and 4 ensemble approaches. Our empirical findings justify the three virtues of ML models: (1) the virtue of capturing non-linearity, (2) the complexity (Kelly et al., 2024) and (3) the l2-norm and bagging techniques in a weak factor environment (Shen and Xiu, 2024). Simulation also supports the first virtue. Penalizing prediction correlation improves ensemble performance when individual models are highly correlated. The predictability can be translated into sizable economic gains under the mean-variance framework. We also reveal significant patterns of time-series heterogeneous predictability across macro regimes: predictability is clustered in expansion, volatile market and extreme geopolitical risk periods. Our feature importance results agree with the complex dynamics of Singapore's electricity market after deregulation, yet highlight its relatively supply-driven nature with the continued presence of strong regulatory influences. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.07477 |
By: | Dixon Domfeh; Saeid Safarveisi |
Abstract: | Traditional models for pricing catastrophe (CAT) bonds struggle to capture the complex, relational data inherent in these instruments. This paper introduces CATNet, a novel framework that applies a geometric deep learning architecture, the Relational Graph Convolutional Network (R-GCN), to model the CAT bond primary market as a graph, leveraging its underlying network structure for spread prediction. Our analysis reveals that the CAT bond market exhibits the characteristics of a scale-free network, a structure dominated by a few highly connected and influential hubs. CATNet demonstrates high predictive performance, significantly outperforming a strong Random Forest benchmark. The inclusion of topological centrality measures as features provides a further, significant boost in accuracy. Interpretability analysis confirms that these network features are not mere statistical artifacts; they are quantitative proxies for long-held industry intuition regarding issuer reputation, underwriter influence, and peril concentration. This research provides evidence that network connectivity is a key determinant of price, offering a new paradigm for risk assessment and proving that graph-based models can deliver both state-of-the-art accuracy and deeper, quantifiable market insights. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.10208 |
By: | Guilherme V. Moura; André P. Santos; Hudson S. Torrent |
Abstract: | Machine learning (ML) methods have been successfully employed in identifying variables that can predict the equity premium of individual stocks. In this paper, we investigate if ML can also be helpful in selecting variables relevant for optimal portfolio choice. To address this question, we parameterize minimum-variance portfolio weights as a function of a large pool of firm-level characteristics as well as their second-order and cross-product transformations, yielding a total of 4,610 predictors. We find that the gains from employing ML to select relevant predictors are substantial: minimum-variance portfolios achieve lower risk relative to sparse specifications commonly considered in the literature, especially when non-linear terms are added to the predictor space. Moreover, some of the selected predictors that help decrease portfolio risk also increase returns, leading to minimum-variance portfolios with good performance in terms of Sharpe ratios in some situations. Our evidence suggests that ad-hoc sparsity can be detrimental to the performance of minimum-variance characteristics-based portfolios. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.14986 |
By: | Vidya Sagar G; Shifat Ali; Siddhartha P. Chakrabarty |
Abstract: | This paper presents a machine learning driven framework for sectoral stress testing in the Indian financial market, focusing on financial services, information technology, energy, consumer goods, and pharmaceuticals. Initially, we address the limitations observed in conventional stress testing through dimensionality reduction and latent factor modeling via Principal Component Analysis and Autoencoders. Building on this, we extend the methodology using Variational Autoencoders, which introduces a probabilistic structure to the latent space. This enables Monte Carlo-based scenario generation, allowing for more nuanced, distribution-aware simulation of stressed market conditions. The proposed framework captures complex non-linear dependencies and supports risk estimation through Value-at-Risk and Expected Shortfall. Together, these pipelines demonstrate the potential of Machine Learning approaches to improve the flexibility, robustness, and realism of financial stress testing. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.02011 |
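The risk-estimation step named in the entry above (Value-at-Risk and Expected Shortfall computed from simulated scenarios) can be sketched as follows. This is an illustration only: the scenario generator is a hypothetical Gaussian stand-in for losses decoded from a latent space, and the quantile uses a simple order-statistic convention rather than whatever the authors apply.

```python
import random


def var_es(losses, alpha=0.95):
    """Empirical Value-at-Risk and Expected Shortfall of a loss sample.

    Losses are positive when money is lost. VaR is taken as the order
    statistic at index floor(alpha * n); ES is the mean of all losses
    at or beyond that point.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    ordered = sorted(losses)
    k = min(int(alpha * len(ordered)), len(ordered) - 1)
    var = ordered[k]
    tail = ordered[k:]
    es = sum(tail) / len(tail)
    return var, es


# Hypothetical Monte Carlo scenarios: Gaussian daily losses with 2% volatility.
rng = random.Random(0)
scenario_losses = [rng.gauss(0.0, 0.02) for _ in range(10_000)]
var95, es95 = var_es(scenario_losses, alpha=0.95)
print(f"VaR 95%: {var95:.4f}  ES 95%: {es95:.4f}")
```

Expected Shortfall is never smaller than VaR at the same level, because it averages the losses in the tail that VaR only marks the start of.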
By: | Tianjiao Zhao; Jingrao Lyu; Stokes Jones; Harrison Garber; Stefano Pasquali; Dhagash Mehta |
Abstract: | The field of artificial intelligence (AI) agents is evolving rapidly, driven by the capabilities of Large Language Models (LLMs) to autonomously perform and refine tasks with human-like efficiency and adaptability. In this context, multi-agent collaboration has emerged as a promising approach, enabling multiple AI agents to work together to solve complex challenges. This study investigates the application of role-based multi-agent systems to support stock selection in equity research and portfolio management. We present a comprehensive analysis performed by a team of specialized agents and evaluate their stock-picking performance against established benchmarks under varying levels of risk tolerance. Furthermore, we examine the advantages and limitations of employing multi-agent frameworks in equity analysis, offering critical insights into their practical efficacy and implementation challenges. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.11152 |
By: | Cathy Yi‐Hsuan Chen (University of Glasgow, Adam Smith Business School; Humboldt Universität zu Berlin); Abraham Lioui (EDHEC Business School); O. Scaillet (Swiss Finance Institute - University of Geneva) |
Abstract: | Voluntary carbon disclosure collapses into a paradox of green silence: firms choose to disclose emissions based on strategic incentives (e.g., correcting vendor overestimates), while high emitters may exploit vendor estimation bias. Mirroring Heckman sample selection bias, this self-censorship skews disclosed emissions into non-random samples, distorting climate risk pricing and policy. We bridge the economic problem and machine learning, proposing a Heckman-inspired three-step framework in high-dimensional settings to correct for strategic non-disclosure and ensure variable selection consistency in the presence of sample selection bias. By integrating kernel group lasso (KG-lasso) and double machine learning (DML) from neighbouring firms, i.e., using information from carbon next door, we unveil systematic underestimation: empirical analysis of 3444 unique US firms (2010-2023) rejects the null of no selection bias. Our findings indicate that voluntary disclosure induces adverse selection, where green silence rewards polluters and undermines decarbonization. Underestimation translates to a $2.6 billion shortfall in tax revenues and up to $525 billion hidden social cost of carbon. |
Keywords: | carbon emissions, machine learning, sample selection |
JEL: | C12 C13 C33 C51 C52 C82 Q52 Q54 Q56 Q58 |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2566 |
By: | Igor Halperin |
Abstract: | The proliferation of Large Language Models (LLMs) is challenged by hallucinations, critical failure modes where models generate non-factual, nonsensical or unfaithful text. This paper introduces Semantic Divergence Metrics (SDM), a novel lightweight framework for detecting Faithfulness Hallucinations, that is, events of severe deviation of LLM responses from input contexts. We focus on a specific implementation of these LLM errors, confabulations, defined as responses that are arbitrary and semantically misaligned with the user's query. Existing methods like Semantic Entropy test for arbitrariness by measuring the diversity of answers to a single, fixed prompt. Our SDM framework improves upon this by being more prompt-aware: we test for a deeper form of arbitrariness by measuring response consistency not only across multiple answers but also across multiple, semantically-equivalent paraphrases of the original prompt. Methodologically, our approach uses joint clustering on sentence embeddings to create a shared topic space for prompts and answers. A heatmap of topic co-occurrences between prompts and responses can be viewed as a quantified two-dimensional visualization of the user-machine dialogue. We then compute a suite of information-theoretic metrics to measure the semantic divergence between prompts and responses. Our practical score, $\mathcal{S}_H$, combines the Jensen-Shannon divergence and Wasserstein distance to quantify this divergence, with a high score indicating a Faithfulness Hallucination. Furthermore, we identify the KL divergence KL(Answer || Prompt) as a powerful indicator of Semantic Exploration, a key signal for distinguishing different generative behaviors. These metrics are further combined into the Semantic Box, a diagnostic framework for classifying LLM response types, including the dangerous, confident confabulation. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.10192 |
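Two of the information-theoretic quantities named in the entry above, the Jensen-Shannon divergence and KL(Answer || Prompt), can be sketched on topic distributions. The abstract does not state how the score $\mathcal{S}_H$ weights its components, so this example computes only the two divergences; the topic distributions are made-up values, not data from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits. Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded by 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical topic distributions over three shared topic clusters.
prompt_topics = [0.7, 0.2, 0.1]
answer_topics = [0.1, 0.2, 0.7]

print("JSD:", js_divergence(prompt_topics, answer_topics))
print("KL(Answer || Prompt):", kl_divergence(answer_topics, prompt_topics))
```

The Jensen-Shannon divergence is symmetric, whereas KL(Answer || Prompt) is directional: it grows when the answer puts mass on topics that the prompt barely covers.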
By: | Yuqi Luan |
Abstract: | This study proposes a behaviorally-informed multi-factor stock selection framework that integrates short-cycle technical alpha signals with deep learning. We design a dual-task multilayer perceptron (MLP) that jointly predicts five-day future returns and directional price movements, thereby capturing nonlinear market behaviors such as volume-price divergence, momentum-driven herding, and bottom reversals. The model is trained on 40 carefully constructed factors derived from price-volume patterns and behavioral finance insights. Empirical evaluation demonstrates that the dual-task MLP achieves superior and stable performance across both predictive accuracy and economic relevance, as measured by information coefficient (IC), information ratio (IR), and portfolio backtesting results. Comparative experiments further show that deep learning methods outperform linear baselines by effectively capturing structural interactions between factors. This work highlights the potential of structure-aware deep learning in enhancing multi-factor modeling and provides a practical framework for short-horizon quantitative investment strategies. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.14656 |
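The information coefficient (IC) used as an evaluation metric in the entry above is commonly computed as a correlation between predicted and realized returns. The abstract does not say whether a Pearson or a rank correlation is used; the sketch below assumes the rank (Spearman) variant, and all input numbers are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def rank_ic(predicted, realized):
    """Rank IC: Spearman correlation of predicted vs. realized returns."""
    return pearson(ranks(predicted), ranks(realized))


# Hypothetical cross-section: predicted scores and realized five-day returns.
predicted = [0.03, -0.01, 0.02, 0.00, -0.02]
realized = [0.04, -0.02, 0.01, 0.02, -0.03]
print("rank IC:", rank_ic(predicted, realized))
```

The information ratio (IR) mentioned alongside it is often reported as the mean of per-period ICs divided by their standard deviation, though the abstract does not spell out the definition used.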
By: | Diego Vallarino |
Abstract: | This study develops and empirically validates a Mixture of Experts (MoE) framework for stock price prediction across heterogeneous volatility regimes using real market data. The proposed model combines a Recurrent Neural Network (RNN) optimized for high-volatility stocks with a linear regression model tailored to stable equities. A volatility-aware gating mechanism dynamically weights the contributions of each expert based on asset classification. Using a dataset of 30 publicly traded U.S. stocks spanning diverse sectors, the MoE approach consistently outperforms both standalone models. Specifically, it achieves up to 33% improvement in MSE for volatile assets and 28% for stable assets relative to their respective baselines. Stratified evaluation across volatility classes demonstrates the model's ability to adapt complexity to underlying market dynamics. These results confirm that no single model suffices across market regimes and highlight the advantage of adaptive architectures in financial prediction. Future work should explore real-time gate learning, dynamic volatility segmentation, and applications to portfolio optimization. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.02686 |
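The gating idea in the entry above, where two experts are blended according to an asset's volatility, can be sketched as below. The abstract does not describe the gate's functional form, so the logistic gate, its `threshold` and `sharpness` parameters, and the two expert predictions are all hypothetical placeholders rather than the paper's design.

```python
import math


def gate_weight(volatility, threshold=0.02, sharpness=200.0):
    """Weight in (0, 1) assigned to the high-volatility expert.

    Hypothetical logistic gate: 0.5 at the threshold, approaching 1
    for volatile assets and 0 for stable ones.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (volatility - threshold)))


def moe_predict(volatility, rnn_prediction, linear_prediction):
    """Blend the two expert predictions with the volatility-based gate."""
    w = gate_weight(volatility)
    return w * rnn_prediction + (1.0 - w) * linear_prediction


# Placeholder expert outputs for one asset (e.g. next-day price forecasts).
rnn_out, linear_out = 101.5, 100.2
print(moe_predict(0.05, rnn_out, linear_out))   # volatile: close to rnn_out
print(moe_predict(0.005, rnn_out, linear_out))  # stable: close to linear_out
```

A soft gate like this lets the blend change smoothly with volatility; a hard gate would instead pick exactly one expert per asset class.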
By: | Ca' Zorzi, Michele; Manu, Ana-Simona; Lopardo, Gianluigi |
Abstract: | This paper investigates the economic impact of technological innovation, focusing on generative AI (GenAI) following ChatGPT’s release in November 2022. We propose a novel framework leveraging large language models to analyze earnings call transcripts. Our method quantifies firms’ GenAI exposure and classifies sentiment as opportunity, adoption, or risk. Using panel econometric techniques, we assess GenAI exposure’s impact on S&P 500 firms’ financial performance over 2014-2023. We find two main results. First, GenAI exposure rose sharply after ChatGPT’s release, particularly in IT, Consumer Services, and Consumer Discretionary sectors, coinciding with sentiment shifts toward adoption. Second, GenAI exposure significantly influenced stock market performance. Firms with early and high GenAI exposure saw stronger returns, though earnings expectations improved modestly. Panel regressions show a 1 percentage point increase in GenAI exposure led to a 0.26% rise in quarterly excess returns. Difference-in-Difference estimates indicate 2.4% average quarterly stock price increases following ChatGPT’s release. |
Keywords: | artificial intelligence, ChatGPT, earnings call, equity returns, generative AI |
JEL: | C80 G14 G30 L25 O33 |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:ecb:ecbwps:20253093 |
By: | Aryan Varshney; Venkat Ram Reddy Ganuthula |
Abstract: | This study investigates whether large language models (LLMs) exhibit consistent behavior (signal) or random variation (noise) when screening resumes against job descriptions, and how their performance compares to human experts. Using controlled datasets, we tested three LLMs (Claude, GPT, and Gemini) across contexts (No Company, Firm1 [MNC], Firm2 [Startup], Reduced Context) with identical and randomized resumes, benchmarked against three human recruitment experts. Analysis of variance revealed significant mean differences in four of eight LLM-only conditions and consistently significant differences between LLM and human evaluations (p 0.1), while all LLMs differed significantly from human experts across contexts. Meta-cognition analysis highlighted adaptive weighting patterns that differ markedly from human evaluation approaches. Findings suggest LLMs offer interpretable patterns with detailed prompts but diverge substantially from human judgment, informing their deployment in automated hiring systems. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.08019 |
By: | Orhan Erdem; Ragavi Pobbathi Ashok |
Abstract: | In this paper, we explore how large language models (LLMs) approach financial decision-making by systematically comparing their responses to those of human participants across the globe. We posed a set of commonly used financial decision-making questions to seven leading LLMs, including five models from the GPT series (GPT-4o, GPT-4.5, o1, o3-mini), Gemini 2.0 Flash, and DeepSeek R1. We then compared their outputs to human responses drawn from a dataset covering 53 nations. Our analysis reveals three main results. First, LLMs generally exhibit a risk-neutral decision-making pattern, favoring choices aligned with expected value calculations when faced with lottery-type questions. Second, when evaluating trade-offs between present and future, LLMs occasionally produce responses that appear inconsistent with normative reasoning. Third, when we examine cross-national similarities, we find that the LLMs' aggregate responses most closely resemble those of participants from Tanzania. These findings contribute to the understanding of how LLMs emulate human-like decision behaviors and highlight potential cultural and training influences embedded within their outputs. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.10933 |
By: | Lapo Santarlasci; Armando Rungi; Antonio Zinilli |
Abstract: | This paper introduces Natural Language Processing for identifying "true" green patents from official supporting documents. We start our training on about 12.4 million patents that had been classified as green from previous literature. Thus, we train a simple neural network to enlarge a baseline dictionary through vector representations of expressions related to environmental technologies. After testing, we find that "true" green patents represent about 20% of the total of patents classified as green from previous literature. We show heterogeneity by technological classes, and then check that "true" green patents are about 1% less cited by following inventions. In the second part of the paper, we test the relationship between patenting and a dashboard of firm-level financial accounts in the European Union. After controlling for reverse causality, we show that holding at least one "true" green patent raises sales, market shares, and productivity. If we restrict the analysis to high-novelty "true" green patents, we find that they also yield higher profits. Our findings underscore the importance of using text analyses to gauge finer-grained patent classifications that are useful for policymaking in different domains. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.02287 |
By: | Dimitrios Emmanoulopoulos; Ollie Olby; Justin Lyon; Namid R. Stillman |
Abstract: | Large language models (LLMs) are increasingly deployed in agentic frameworks, in which prompts trigger complex tool-based analysis in pursuit of a goal. While these frameworks have shown promise across multiple domains including in finance, they typically lack a principled model-building step, relying instead on sentiment- or trend-based analysis. We address this gap by developing an agentic system that uses LLMs to iteratively discover stochastic differential equations for financial time series. These models generate risk metrics which inform daily trading decisions. We evaluate our system in both traditional backtests and using a market simulator, which introduces synthetic but causally plausible price paths and news events. We find that model-informed trading strategies outperform standard LLM-based agents, improving Sharpe ratios across multiple equities. Our results show that combining LLMs with agentic model discovery enhances market risk estimation and enables more profitable trading decisions. |
Date: | 2025–07 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2507.08584 |
By: | Qizhao Chen |
Abstract: | This paper presents a dynamic cryptocurrency portfolio optimization strategy that integrates technical indicators and sentiment analysis to enhance investment decision-making. The proposed method employs the 14-day Relative Strength Index (RSI) and 14-day Simple Moving Average (SMA) to capture market momentum, while sentiment scores are extracted from news articles using the VADER (Valence Aware Dictionary and sEntiment Reasoner) model, with compound scores quantifying overall market tone. The large language model Google Gemini is used to further verify the sentiment scores predicted by VADER and give investment decisions. These technical indicator and sentiment signals are incorporated into the expected return estimates before applying mean-variance optimization with constraints on asset weights. The strategy is evaluated through a rolling-window backtest over cryptocurrency market data, with Bitcoin (BTC) and an equal-weighted portfolio of selected cryptocurrencies serving as benchmarks. Experimental results show that the proposed approach achieves a cumulative return of 38.72, substantially exceeding Bitcoin's 8.85 and the equal-weighted portfolio's 21.65 over the same period, and delivers a higher Sharpe ratio (1.1093 vs. 0.8853 and 1.0194, respectively). However, the strategy exhibits a larger maximum drawdown (-18.52%) compared to Bitcoin (-4.48%) and the equal-weighted portfolio (-11.02%), indicating higher short-term downside risk. These results highlight the potential of combining sentiment and technical signals to improve cryptocurrency portfolio performance, while also emphasizing the need to address risk exposure in volatile markets. |
Date: | 2025–08 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2508.16378 |
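The two technical indicators named in the entry above, the 14-day Simple Moving Average and the 14-day Relative Strength Index, can be sketched as follows. RSI has several smoothing variants and the abstract does not say which one is used; this example assumes plain averages of gains and losses over the window, and the price series is made up.

```python
def sma(prices, window=14):
    """Simple moving average of the last `window` prices."""
    if len(prices) < window:
        raise ValueError("not enough prices for the window")
    return sum(prices[-window:]) / window


def rsi(prices, window=14):
    """RSI over the last `window` price changes.

    Uses plain (unsmoothed) averages of gains and losses, which is one
    of several common RSI conventions.
    """
    if len(prices) < window + 1:
        raise ValueError("need window + 1 prices")
    recent = prices[-(window + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / window
    avg_loss = sum(-c for c in changes if c < 0) / window
    if avg_loss == 0:
        # No losses in the window; by the convention used here this
        # also covers a completely flat window.
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Made-up daily closing prices (15 values give 14 daily changes).
closes = [100, 102, 101, 103, 104, 103, 105, 107, 106, 108,
          109, 108, 110, 111, 112]
print("SMA(14):", sma(closes), "RSI(14):", rsi(closes))
```

RSI readings above 70 or below 30 are conventionally read as overbought or oversold; how the paper maps these indicators into expected-return estimates is not specified in the abstract.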