on Big Data
| By: | Chike, Onyedikachi Emmanuel; Badruddoza, Syed; Lyford, Conrad |
| Keywords: | Health Economics and Policy |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:ags:aaea25:360934 |
| By: | Schmidt, Lorenz; Ritter, Matthias; Mußhoff, Oliver; Odening, Martin |
| Keywords: | Agricultural Finance, Farm Management |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:ags:aaea25:360670 |
| By: | Chawla, Parth; Taylor, J. Edward |
| Keywords: | Research and Development/Tech Change/Emerging Technologies |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:ags:aaea25:361223 |
| By: | Benjamin Avanzi; Matthew Lambrianidis; Greg Taylor; Bernard Wong |
| Abstract: | The use of neural networks trained on individual claims data has become increasingly popular in the actuarial reserving literature. We consider how best to input historical payment data into neural network models. Case estimates are also available in the form of a time series, and we extend our analysis to assess their predictive power. In this paper, we compare a feed-forward neural network trained on summarised transactions with a recurrent neural network equipped to analyse a claim's entire payment history and/or case estimate development history. We draw conclusions from training and comparing the performance of the models on multiple comparable, highly complex datasets simulated from SPLICE (Avanzi, Taylor and Wang, 2023). We find evidence that case estimates improve predictions significantly, but that equipping the neural network with memory leads to only meagre improvements. Although the case estimation process and its quality will vary significantly between insurers, we provide a standardised methodology for assessing their value. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.05274 |
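The comparison above lends itself to a compact illustration. Below is a minimal PyTorch sketch (ours, not the authors' SPLICE pipeline; layer sizes and names are illustrative) contrasting a feed-forward net on summarised claim features with an LSTM that reads the full payment and case-estimate history:

```python
# Minimal sketch (not the authors' code): contrast a feed-forward net on
# summarised claim features with an LSTM over the full development history.
import torch
import torch.nn as nn

class SummaryFFNN(nn.Module):
    """Feed-forward net on per-claim summary statistics (e.g. paid-to-date)."""
    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                 nn.Linear(32, 1))
    def forward(self, x):            # x: (batch, n_features)
        return self.net(x)

class HistoryLSTM(nn.Module):
    """LSTM over the payment / case-estimate development time series."""
    def __init__(self, n_channels: int):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, 32, batch_first=True)
        self.head = nn.Linear(32, 1)
    def forward(self, x):            # x: (batch, periods, n_channels)
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])      # predict outstanding claim cost

# Two input channels per period: incremental payments and case estimates.
batch = torch.randn(8, 12, 2)        # 8 claims, 12 development periods
print(HistoryLSTM(n_channels=2)(batch).shape)   # torch.Size([8, 1])
```

Note that the LSTM sees one vector per development period, so adding case estimates to payments amounts to a second input channel.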
| By: | Muriuki, James; Lawani, Abdelaziz |
| Keywords: | Food Security and Poverty |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:ags:aaea25:360881 |
| By: | Kim Ristolainen (Turku School of Economics, University of Turku, Finland) |
| Abstract: | We develop a novel sentiment measure derived from survey data to empirically validate the Minsky–Kindleberger view on financial crises. Using survey data from multiple countries, we decompose beliefs into components explained by public information that are orthogonal to optimal machine beliefs, constructing a framework that isolates sentiment and its dispersion among individuals. We show that deviations from machine-optimized benchmarks arise from systematic misaggregation of public information. The sentiment measure is validated through its predictive relationships with financial markets and belief dynamics consistent with heterogeneous-beliefs asset pricing theory. We extend this sentiment measure historically for a panel of 78 countries using machine learning models trained on BERT embeddings of historical news articles (1903–2020). The backcasted sentiment shows that shocks in median sentiment predict credit booms in the non-tradable corporate sector, which prior research has linked to financial crises, providing the first historically large-scale empirical validation of the Minsky cycle. We further show that sentiment, which is a misaggregation of public information, is influenced by memory-related dynamics, as the time elapsed since major crises and the share of young-to-old people in the population strongly predict surges in optimism even when recent economic developments are controlled for. |
| Keywords: | Survey data, Sentiment, Memory, Machine Learning, Text Data, Credit growth, Financial Crisis |
| JEL: | E44 E51 G01 D84 G41 E32 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:tkk:dpaper:dp173 |
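One way to read the decomposition is as a residual from a machine benchmark. A minimal sketch with simulated data and our own variable names (not the paper's construction; in practice the machine belief would be predicted out-of-sample):

```python
# Minimal sketch (assumed variable names, not the paper's code): treat the
# gap between survey beliefs and a machine-optimised benchmark as sentiment.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                        # public information set
y = X @ rng.normal(size=5) + rng.normal(size=500)    # realised outcome
survey = y + 0.5 * X[:, 0] + rng.normal(size=500)    # human beliefs

# Machine-optimal belief given public info (in-sample here for brevity).
machine = GradientBoostingRegressor().fit(X, y).predict(X)
proj = LinearRegression().fit(machine.reshape(-1, 1), survey)
sentiment = survey - proj.predict(machine.reshape(-1, 1))  # orthogonal residual

print("median sentiment:", np.median(sentiment))
print("dispersion:", sentiment.std())                # belief-disagreement proxy
```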
| By: | Chang Liu |
| Abstract: | The hedge fund industry presents significant challenges for investors due to its opacity and limited disclosure requirements. This pioneering study introduces two major innovations in financial text analysis. First, we apply topic modeling to hedge fund documents, an unexplored domain for automated text analysis, using a unique dataset of over 35,000 documents from 1,125 hedge fund managers. We compare three state-of-the-art methods: Latent Dirichlet Allocation (LDA), Top2Vec, and BERTopic. Our findings reveal that LDA with 20 topics produces the most interpretable results for human users and demonstrates higher robustness in topic assignments when the number of topics varies, while Top2Vec shows superior classification performance. Second, we establish a novel quantitative framework linking document sentiment to fund performance, transforming qualitative information that traditionally required expert interpretation into systematic investment signals. In sentiment analysis, contrary to expectations, the general-purpose DistilBERT outperforms the finance-specific FinBERT in generating sentiment scores, demonstrating superior adaptability to the diverse linguistic patterns found in hedge fund documents, which extend beyond specialized financial news text. Furthermore, sentiment scores derived using DistilBERT in combination with Top2Vec show stronger correlations with subsequent fund performance than other model combinations. These results demonstrate that automated topic modeling and sentiment analysis can effectively process hedge fund documents, providing investors with new data-driven decision support tools. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.06620 |
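For the topic-modeling step, a minimal scikit-learn sketch of the paper's best-performing setting, LDA with 20 topics (stand-in documents, not the hedge fund corpus):

```python
# Minimal sketch, not the study's pipeline: fit a 20-topic LDA and recover
# per-document topic weights.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["fund employs long short equity strategies with moderate leverage",
        "credit exposure hedged with interest rate swaps and options"]
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=20, random_state=0).fit(counts)
topic_mix = lda.transform(counts)        # per-document topic weights
print(topic_mix.shape)                   # (n_docs, 20)
```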
| By: | Zhimin Chen (Nanyang Business School, Nanyang Technological University); Bryan T. Kelly (Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)); Semyon Malamud (Ecole Polytechnique Federale de Lausanne; Centre for Economic Policy Research (CEPR); Swiss Finance Institute) |
| Abstract: | Machine learning (ML) methods are highly flexible, but their ability to approximate the true data-generating process is fundamentally constrained by finite samples. We characterize a universal lower bound, the Limits-to-Learning Gap (LLG), quantifying the unavoidable discrepancy between a model's empirical fit and the population benchmark. Recovering the true population R^2, therefore, requires correcting observed predictive performance by this bound. Using a broad set of variables, including excess returns, yields, credit spreads, and valuation ratios, we find that the implied LLGs are large. This indicates that standard ML approaches can substantially understate true predictability in financial data. We also derive LLG-based refinements to the classic Hansen and Jagannathan (1991) bounds, analyze implications for parameter learning in general-equilibrium settings, and show that the LLG provides a natural mechanism for generating excess volatility. |
| Keywords: | machine learning, asset pricing, predictability, big data, limits to learning, excess volatility, stochastic discount factor, kernel methods |
| JEL: | C13 C32 C55 C58 G12 G17 |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp25106 |
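The correction the abstract describes can be written schematically. The following is our hedged rendering of its logic, not the paper's exact statement:

```latex
% Hedged reading of the abstract (notation ours, not the paper's): the
% population fit is bounded below by the empirical fit plus the gap,
\[
  R^2_{\text{pop}} \;\ge\; R^2_{\text{emp}} + \mathrm{LLG},
  \qquad \mathrm{LLG} \ge 0,
\]
% so a small observed $R^2_{\text{emp}}$ combined with a large LLG is
% consistent with substantial true predictability.
```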
| By: | Pablo Hidalgo; Julio E. Sandubete; Agustín García-García |
| Abstract: | This study investigates the contribution of Intrinsic Mode Functions (IMFs) derived from economic time series to the predictive performance of neural network models, specifically Multilayer Perceptrons (MLP) and Long Short-Term Memory (LSTM) networks. To enhance interpretability, DeepSHAP is applied, which estimates the marginal contribution of each IMF while keeping the rest of the series intact. Results show that the last IMFs, representing long-term trends, are generally the most influential according to DeepSHAP, whereas high-frequency IMFs contribute less and may even introduce noise, as evidenced by improved metrics upon their removal. Differences between MLP and LSTM highlight the effect of model architecture on feature relevance distribution, with LSTM allocating importance more evenly across IMFs. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.12499 |
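A minimal sketch of the pipeline, assuming the PyEMD (EMD-signal) and shap packages are installed (toy series and a small MLP in place of the paper's models):

```python
# Minimal sketch (not the authors' code): decompose a series into IMFs,
# train an MLP on them, and attribute predictions per IMF with DeepSHAP.
import numpy as np
import torch
import torch.nn as nn
import shap
from PyEMD import EMD

t = np.linspace(0, 1, 512)
series = np.sin(40 * t) + 0.5 * np.sin(6 * t) + 0.1 * t  # toy "economic" series
imfs = EMD().emd(series)              # (n_imfs, 512), high to low frequency

# One example per time step: IMF values as features, next value as target.
X = torch.tensor(imfs[:, :-1].T, dtype=torch.float32)
y = torch.tensor(series[1:], dtype=torch.float32).unsqueeze(1)

mlp = nn.Sequential(nn.Linear(X.shape[1], 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(mlp.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = ((mlp(X) - y) ** 2).mean()
    loss.backward()
    opt.step()

explainer = shap.DeepExplainer(mlp, X[:100])      # background sample
shap_vals = explainer.shap_values(X[100:120])     # per-IMF attributions
print(np.abs(shap_vals).mean(axis=0))             # mean |SHAP| per IMF
```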
| By: | Kirchner, Ella; Benami, Elinor; Cecil, Michael; Becker-Reshef, Inbal; Wagner, Josef; Sahajpal, Ritvik |
| Keywords: | International Development |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:ags:aaea25:361023 |
| By: | Adler, Brian; Brown, Anne |
| Abstract: | Housing markets are more complex than a simple supply-demand relationship: prices are set by intertwined market and spatial neighborhood dynamics. Certain cities, like St. Louis, MO, have experienced dramatic population decline marked by extreme vacancy and abandonment. Amidst this decline, St. Louis exhibits neighborhoods with sharp housing shortages and competition mere blocks from others with entrenched vacancy and disinvestment. We use supervised machine learning models to predict housing prices in St. Louis with a diverse feature set that incorporates spatial aspects of vacancy alongside traditional housing amenities. Our results show that proximity to vacancy may impact a home’s value even more than its number of bedrooms. We expect these findings may prompt policymakers to combat vacancy more urgently to maintain neighborhood market stability. |
| Date: | 2026–01–06 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:s9v4u_v1 |
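A minimal sketch of the setup with hypothetical feature names (the toy price function is ours, chosen so that vacancy proximity dominates bedrooms, mirroring the paper's headline finding):

```python
# Minimal sketch (hypothetical columns, not the authors' data): let a tree
# ensemble weigh proximity-to-vacancy against bedrooms.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({
    "bedrooms": rng.integers(1, 6, n),
    "sqft": rng.normal(1500, 400, n),
    "dist_to_vacant_parcel_m": rng.exponential(300, n),
})
# Toy price: vacancy proximity hurts value more than an extra bedroom helps.
price = (150_000 + 8_000 * df["bedrooms"] + 60 * df["sqft"]
         + 40 * df["dist_to_vacant_parcel_m"] + rng.normal(0, 10_000, n))

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(df, price)
print(dict(zip(df.columns, model.feature_importances_.round(3))))
```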
| By: | Agustín M. de los Riscos; Julio E. Sandubete; Diego Carmona-Fernández; León Beleña |
| Abstract: | This study applies Empirical Mode Decomposition (EMD) to the MSCI World index and converts the resulting intrinsic mode functions (IMFs) into graph representations to enable modeling with graph neural networks (GNNs). Using CEEMDAN, we extract nine IMFs spanning high-frequency fluctuations to long-term trends. Each IMF is transformed into a graph using four time-series-to-graph methods: natural visibility, horizontal visibility, recurrence, and transition graphs. Topological analysis shows clear scale-dependent structure: high-frequency IMFs yield dense, highly connected small-world graphs, whereas low-frequency IMFs produce sparser networks with longer characteristic path lengths. Visibility-based methods are more sensitive to amplitude variability and typically generate higher clustering, while recurrence graphs better preserve temporal dependencies. These results provide guidance for designing GNN architectures tailored to the structural properties of decomposed components, supporting more effective predictive modeling of financial time series. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.12526 |
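Of the four mappings, the natural visibility graph is the simplest to state: two observations are linked if the straight line between them clears every intermediate point. A minimal sketch (ours, not the paper's implementation):

```python
# Minimal sketch: build a natural visibility graph from one IMF and
# inspect small-world-style statistics.
import numpy as np
import networkx as nx

def natural_visibility_graph(y: np.ndarray) -> nx.Graph:
    """Connect i<j when every point between them lies below the line (i,j)."""
    g = nx.Graph()
    g.add_nodes_from(range(len(y)))
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            between = np.arange(i + 1, j)
            line = y[j] + (y[i] - y[j]) * (j - between) / (j - i)
            if np.all(y[between] < line):
                g.add_edge(i, j)
    return g

t = np.linspace(0, 4 * np.pi, 200)
imf = np.sin(t) * np.exp(-0.05 * t)       # stand-in low-frequency IMF
g = natural_visibility_graph(imf)
print(nx.average_clustering(g), nx.average_shortest_path_length(g))
```

Adjacent points are always mutually visible, so the graph is connected by construction, which is why the path-length statistic is well defined.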
| By: | Cen, Huang; Wanying, Liao; He, Leng; Sheetal, Abhishek (The Hong Kong Polytechnic University) |
| Abstract: | This paper replicates and extends the study of Li et al. (2025) to investigate the role of feature engineering in machine learning (ML)-based cross-sectional stock return prediction. We construct a 3-tier feature system with 78 effective features, including basic financial ratios, financial change features, and growth quality features, using CRSP and Compustat data. Through a recursive rolling-window approach from 1969 to 2018, we compare the performance of boosted regression trees (BRT), neural networks (NN), and the newly added extreme gradient boosting (XGBoost) models. The results show that XGBoost produces the highest predictive accuracy, since it captures statistical correlations among features efficiently, yet it underperforms in terms of investment return due to its sensitivity to limited feature quality and the gap between statistical fit and economic profitability. By contrast, the BRT model generates the most robust strategy performance, as it is more tolerant of noisy features in an incomplete information environment. Compared with Li et al. (2025), our strategy exhibits a lower Sharpe ratio and an insignificant risk-adjusted alpha, mainly due to the smaller number of features and the different sample period. This paper confirms the core conclusion of the original paper that feature engineering, rather than model complexity, is crucial for ML investment strategies, and it offers empirical guidance for real-time portfolio construction. |
| Date: | 2026–01–05 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:3fh8x_v2 |
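A minimal sketch of the recursive (expanding) rolling-window protocol, with synthetic stand-ins for the 78 features and the test range truncated for brevity:

```python
# Minimal sketch of the recursive-window comparison (synthetic data in
# place of CRSP/Compustat features).
import numpy as np
import xgboost as xgb
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
years = np.repeat(np.arange(1969, 2019), 50)          # 50 "stocks" per year
X = rng.normal(size=(len(years), 78))                 # 3-tier feature stand-in
y = X[:, :5].sum(axis=1) * 0.02 + rng.normal(scale=0.1, size=len(years))

for test_year in range(2010, 2013):                   # truncated for brevity
    fit = years < test_year                           # expanding window
    test = years == test_year
    for name, model in [("XGBoost", xgb.XGBRegressor(n_estimators=200)),
                        ("BRT", GradientBoostingRegressor())]:
        pred = model.fit(X[fit], y[fit]).predict(X[test])
        ic = np.corrcoef(pred, y[test])[0, 1]         # cross-sectional IC
        print(test_year, name, round(ic, 3))
```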
| By: | Jaisal Patel; Yunzhe Chen; Kaiwen He; Keyi Wang; David Li; Kairong Xiao; Xiao-Yang Liu |
| Abstract: | Previous research has reported that large language models (LLMs) demonstrate poor performance on the Chartered Financial Analyst (CFA) exams. However, recent reasoning models have achieved strong results on graduate-level academic and professional examinations across various disciplines. In this paper, we evaluate state-of-the-art reasoning models on a set of mock CFA exams consisting of 980 questions across three Level I exams, two Level II exams, and three Level III exams. Using the same pass/fail criteria from prior studies, we find that most models clear all three levels. The models that pass, ordered by overall performance, are Gemini 3.0 Pro, Gemini 2.5 Pro, GPT-5, Grok 4, Claude Opus 4.1, and DeepSeek-V3.1. Specifically, Gemini 3.0 Pro achieves a record score of 97.6% on Level I. Performance is also strong on Level II, led by GPT-5 at 94.3%. On Level III, Gemini 2.5 Pro attains the highest score with 86.4% on multiple-choice questions while Gemini 3.0 Pro achieves 92.0% on constructed-response questions. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.08270 |
| By: | Sayed Akif Hussain; Chen Qiu-shi; Syed Amer Hussain; Syed Atif Hussain; Asma Komal; Muhammad Imran Khalid |
| Abstract: | This study proposes a novel hybrid deep learning framework that integrates a Large Language Model (LLM) with a Transformer architecture for stock price forecasting. The research addresses a critical theoretical gap in existing approaches that empirically combine textual and numerical data without a formal understanding of their interaction mechanisms. We conceptualise a prompt-based LLM as a mathematically defined signal generator, capable of extracting directional market sentiment and an associated confidence score from financial news. These signals are then dynamically fused with structured historical price features through a noise-robust gating mechanism, enabling the Transformer to adaptively weigh semantic and quantitative information. Empirical evaluations demonstrate that the proposed Hybrid LLM-Transformer model significantly outperforms a Vanilla Transformer baseline, reducing the Root Mean Squared Error (RMSE) by 5.28% (p = 0.003). Moreover, ablation and robustness analyses confirm the model's stability under noisy conditions and its capacity to maintain interpretability through confidence-weighted attention. The findings provide both theoretical and empirical support for a paradigm shift from empirical observation to formalised modelling of LLM-Transformer interactions, paving the way toward explainable, noise-resilient, and semantically enriched financial forecasting systems. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.02878 |
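A minimal PyTorch sketch of the gating idea as we read it (module and dimension names are ours, not the authors' architecture):

```python
# Minimal sketch of confidence-weighted gated fusion of an LLM signal
# (sentiment, confidence) with structured price features.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, d_price: int, d_model: int = 32):
        super().__init__()
        self.price_proj = nn.Linear(d_price, d_model)
        self.text_proj = nn.Linear(2, d_model)    # (sentiment, confidence)
        self.gate = nn.Sequential(nn.Linear(2 * d_model, d_model), nn.Sigmoid())

    def forward(self, price_feats, llm_signal):
        p = self.price_proj(price_feats)
        s = self.text_proj(llm_signal)
        g = self.gate(torch.cat([p, s], dim=-1))  # learns to downweight noise
        return g * s + (1 - g) * p                # fused token for a Transformer

fusion = GatedFusion(d_price=8)
fused = fusion(torch.randn(4, 8), torch.randn(4, 2))
print(fused.shape)    # torch.Size([4, 32]); feed into an encoder stack
```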
| By: | Zhiming Lian |
| Abstract: | Financial named-entity recognition (NER) is an important approach for translating unstructured reports and news into structured knowledge graphs. However, free, easy-to-use large language models (LLMs) often misclassify organisations as people, or miss monetary amounts entirely. This paper takes Meta's Llama 3 8B and applies it to financial NER by combining instruction fine-tuning with Low-Rank Adaptation (LoRA). Each annotated sentence is converted into an instruction-input-output triple, enabling the model to learn task descriptions while fine-tuning small low-rank matrices instead of updating all weights. On a corpus of 1,693 sentences, our method obtains a micro-F1 score of 0.894, benchmarked against Qwen3-8B, Baichuan2-7B, T5, and BERT-Base. We present dataset statistics, describe training hyperparameters, and visualize entity density, learning curves, and evaluation metrics. Our results show that instruction tuning combined with parameter-efficient fine-tuning enables state-of-the-art performance on domain-sensitive NER. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.10043 |
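A minimal sketch of the parameter-efficient setup, assuming the transformers and peft libraries and access to the Llama 3 8B checkpoint; the ranks and target modules below are illustrative, and the instruction-input-output formatting and training loop are omitted:

```python
# Minimal sketch: attach LoRA adapters to Llama 3 8B for instruction
# fine-tuning; only the small low-rank matrices are trained.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()   # tiny fraction of the 8B weights
```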
| By: | Kedidi, Islem; Araujo, Hamilton; Randriamarolo, Marie Rose |
| Abstract: | Most French dairy farms are still exposed to volatile milk prices despite the implementation of the Milk Package, which aims to enforce contractualization and stabilize prices. In this context, our study investigates the effect of asymmetric price volatility on French dairy farms' profitability, an understanding of which is necessary to enhance resilience. We use an APARCH model to measure the volatility of aggregate milk prices, represented by the Production Price Index for Agricultural Products. We focus our analysis on 385 dairy farms observed from 2010 to 2022 in the French Farm Accountancy Data Network (FADN). The evolution of farm profitability, indicated by the return on assets, was assessed using machine learning algorithms. Findings show differentiated effects of milk price volatility on farm profitability. |
| Keywords: | Agricultural Finance, Farm Management |
| Date: | 2025 |
| URL: | https://d.repec.org/n?u=RePEc:ags:aaea25:360682 |
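A minimal sketch of the volatility step using the arch package, whose asymmetric power specification (the `o` and `power` arguments) covers the APARCH family; simulated returns stand in for the price-index series, and the power value is illustrative:

```python
# Minimal sketch (not the authors' estimation): asymmetric power ARCH fit,
# where o=1 adds the leverage term and power=1.5 departs from plain GARCH.
import numpy as np
from arch import arch_model

rng = np.random.default_rng(3)
returns = rng.standard_t(df=6, size=600)       # stand-in PPI return series
res = arch_model(returns, p=1, o=1, q=1, power=1.5).fit(disp="off")
print(res.summary())
cond_vol = res.conditional_volatility          # feeds the profitability step
```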
| By: | Ahmed Khwaja; Sonal Srivastava |
| Abstract: | Dynamic discrete choice (DDC) models have found widespread application in marketing. However, estimating these becomes challenging in "big data" settings with high-dimensional state-action spaces. To address this challenge, this paper develops a Reinforcement Learning (RL)-based two-step ("computationally light") Conditional Choice Simulation (CCS) estimation approach that combines the scalability of machine learning with the transparency, explainability, and interpretability of structural models, which is particularly valuable for counterfactual policy analysis. The method is premised on three insights: (1) the CCS ("forward simulation") approach is a special case of RL algorithms, (2) starting from an initial state-action pair, CCS updates the corresponding value function only after each simulation path has terminated, whereas RL algorithms may update for all the state-action pairs visited along a simulated path, and (3) RL focuses on inferring an agent's optimal policy with known reward functions, whereas DDC models focus on estimating the reward functions presupposing optimal policies. The procedure's computational efficiency over CCS estimation is demonstrated using Monte Carlo simulations with a canonical machine replacement and a consumer food purchase model. Framing CCS estimation of DDC models as an RL problem increases their applicability and scalability to high-dimensional marketing problems while retaining both interpretability and tractability. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.02069 |
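Insight (2) is easy to see in code. A minimal sketch on a toy machine-replacement problem (ours, not the paper's estimator): CCS credits each simulated path to its starting state-action pair only, whereas an RL-style every-visit update would reuse the same path for all pairs it touches:

```python
# Minimal sketch of CCS forward simulation on a toy machine-replacement
# problem with fixed (estimated) choice probabilities.
import numpy as np

rng = np.random.default_rng(4)
AGES, BETA, T = 5, 0.95, 40
ccp = np.full((AGES, 2), 0.5)             # estimated choice probabilities

def flow(age, a):                          # keeping (a=0) pays less as age rises
    return -0.2 * age if a == 0 else -1.5

def simulate(start_age, start_a):
    v, path, age, a = 0.0, [], start_age, start_a
    for t in range(T):
        path.append((age, a, t))
        v += BETA ** t * flow(age, a)
        age = 0 if a == 1 else min(age + 1, AGES - 1)   # replacing resets age
        a = rng.choice(2, p=ccp[age])
    return v, path

V_ccs = np.zeros((AGES, 2))
n_ccs = np.zeros((AGES, 2))
for _ in range(2000):
    s, a = rng.integers(AGES), rng.integers(2)
    v, path = simulate(s, a)
    V_ccs[s, a] += v                       # CCS: update the start pair only;
    n_ccs[s, a] += 1                       # an RL-style update would also
                                           # credit every (age, action) on `path`
print(V_ccs / np.maximum(n_ccs, 1))
```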
| By: | Rainer Michael Rilke (WHU - Otto Beisheim School of Management); Dirk Sliwka (University of Cologne) |
| Abstract: | A large body of research across management, psychology, accounting, and economics shows that subjective performance evaluations are systematically biased: ratings cluster near the midpoint of scales and are often excessively lenient. As organizations increasingly adopt large language models (LLMs) for evaluative tasks, little is known about how these systems perform when assessing human performance. We document that, in the absence of clear objective standards and when individuals are rated independently, LLMs reproduce the familiar patterns of human raters. However, LLMs generate greater dispersion and accuracy when evaluating multiple individuals simultaneously. With noisy but objective performance signals, LLMs provide substantially more accurate evaluations than human raters, as they (i) are less subject to biases arising from concern for the evaluated employee and (ii) make fewer mistakes in information processing, closely approximating rational Bayesian benchmarks. |
| Keywords: | Performance Evaluation, Large Language Models, Signal Objectivity, Algorithmic Judgment, Gen-AI |
| JEL: | J24 J28 M12 M53 |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:ajk:ajkdps:384 |
| By: | Nikoleta Anesti; Edward Hill; Andreas Joseph |
| Abstract: | This paper investigates the ability of Large Language Models (LLMs), specifically GPT-3.5-turbo (GPT), to form inflation perceptions and expectations based on macroeconomic price signals. We compare the LLM's output to household survey data and official statistics, mimicking the information set and demographic characteristics of the Bank of England's Inflation Attitudes Survey (IAS). Our quasi-experimental design exploits the timing of GPT's training cut-off in September 2021 which means it has no knowledge of the subsequent UK inflation surge. We find that GPT tracks aggregate survey projections and official statistics at short horizons. At a disaggregated level, GPT replicates key empirical regularities of households' inflation perceptions, particularly for income, housing tenure, and social class. A novel Shapley value decomposition of LLM outputs suited for the synthetic survey setting provides well-defined insights into the drivers of model outputs linked to prompt content. We find that GPT demonstrates a heightened sensitivity to food inflation information similar to that of human respondents. However, we also find that it lacks a consistent model of consumer price inflation. More generally, our approach could be used to evaluate the behaviour of LLMs for use in the social sciences, to compare different models, or to assist in survey design. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.14306 |
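The Shapley decomposition over prompt content can be sketched generically. Here the LLM is replaced by a stub and the information components are hypothetical names of ours, echoing the abstract's themes:

```python
# Minimal sketch of exact Shapley values over prompt components (stubbed
# model in place of GPT-3.5-turbo; component names and effects are toys).
from itertools import combinations
from math import factorial

COMPONENTS = ["food_inflation", "energy_prices", "demographics"]

def model_output(included: frozenset) -> float:
    """Stub for the LLM's numeric inflation perception given a prompt
    built from the included information components."""
    effects = {"food_inflation": 1.5, "energy_prices": 0.8, "demographics": 0.1}
    return 2.0 + sum(effects[c] for c in included)

def shapley(component: str) -> float:
    n = len(COMPONENTS)
    others = [c for c in COMPONENTS if c != component]
    total = 0.0
    for k in range(n):
        for subset in combinations(others, k):
            s = frozenset(subset)
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (model_output(s | {component}) - model_output(s))
    return total

for c in COMPONENTS:
    print(c, round(shapley(c), 3))   # contribution of each prompt component
```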
| By: | Jakob Bjelac; Victor Chernozhukov; Phil-Adrian Klotz; Jannis Kueck; Theresa M. A. Schmitz |
| Abstract: | In this paper, we extend the Riesz representation framework to causal inference under sample selection, where both treatment assignment and outcome observability are non-random. Formulating the problem in terms of a Riesz representer enables stable estimation and a transparent decomposition of omitted variable bias into three interpretable components: a data-identified scale factor, outcome confounding strength, and selection confounding strength. For estimation, we employ the ForestRiesz estimator, which accounts for selective outcome observability while avoiding the instability associated with direct propensity score inversion. We assess finite-sample performance through a simulation study and show that conventional double machine learning approaches can be highly sensitive to tuning parameters due to their reliance on inverse probability weighting, whereas the ForestRiesz estimator delivers more stable performance by leveraging automatic debiased machine learning. In an empirical application to the gender wage gap in the U.S., we find that our ForestRiesz approach yields larger treatment effect estimates than a standard double machine learning approach, suggesting that ignoring sample selection leads to an underestimation of the gender wage gap. Sensitivity analysis indicates that implausibly strong unobserved confounding would be required to overturn our results. Overall, our approach provides a unified, robust, and computationally attractive framework for causal inference under sample selection. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.08643 |
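Our hedged rendering of the three-component decomposition, with notation of ours in the style of the Riesz-representer sensitivity literature; the paper's exact statement may differ:

```latex
% Hedged reading: the omitted variable bias factors into a data-identified
% scale times the two confounding strengths,
\[
  |\mathrm{bias}| \;\le\; S \times C_{\text{outcome}} \times C_{\text{selection}},
\]
% where $S$ is estimable from the data, and the $C$ terms index how much
% unobservables explain of the outcome and of selection, respectively.
```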
| By: | Giovanni Ballarin; Lyudmila Grigoryeva; Yui Ching Li |
| Abstract: | Model combination is a powerful approach for achieving superior performance compared to selecting a single model. We study both theoretically and empirically the effectiveness of ensembles of Multi-Frequency Echo State Networks (MFESNs), which have been shown to achieve state-of-the-art macroeconomic time series forecasting results (Ballarin et al., 2024a). The Hedge and Follow-the-Leader schemes are discussed, and their online learning guarantees are extended to settings with dependent data. In empirical applications, the proposed Ensemble Echo State Networks demonstrate significantly improved predictive performance relative to individual MFESN models. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.13642 |
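The Hedge scheme itself is a few lines. A minimal sketch with toy losses in place of the MFESN forecasters:

```python
# Minimal sketch of Hedge (exponential weights) over candidate forecasters.
import numpy as np

rng = np.random.default_rng(5)
T, K, eta = 200, 4, 0.5
losses = rng.uniform(size=(T, K)) * np.array([1.0, 0.8, 1.2, 0.9])

w = np.ones(K) / K
combined_loss = []
for t in range(T):
    combined_loss.append(w @ losses[t])    # loss of the weighted ensemble
    w *= np.exp(-eta * losses[t])          # exponential-weights update
    w /= w.sum()

print("ensemble:", np.mean(combined_loss))
print("best single model:", losses.mean(axis=0).min())
```

The paper's contribution is the regret guarantee for such updates when the loss sequence is dependent, which the i.i.d. toy above does not capture.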
| By: | Kieran Wood; Stephen J. Roberts; Stefan Zohren |
| Abstract: | We propose DeePM (Deep Portfolio Manager), a structured deep-learning macro portfolio manager trained end-to-end to maximize a robust, risk-adjusted utility. DeePM addresses three fundamental challenges in financial learning: (1) it resolves the asynchronous "ragged filtration" problem via a Directed Delay (Causal Sieve) mechanism that prioritizes causal impulse-response learning over information freshness; (2) it combats low signal-to-noise ratios via a Macroeconomic Graph Prior, regularizing cross-asset dependence according to economic first principles; and (3) it optimizes a distributionally robust objective where a smooth worst-window penalty serves as a differentiable proxy for Entropic Value-at-Risk (EVaR) - a window-robust utility encouraging strong performance in the most adverse historical subperiods. In large-scale backtests from 2010-2025 on 50 diversified futures with highly realistic transaction costs, DeePM attains net risk-adjusted returns that are roughly twice those of classical trend-following strategies and passive benchmarks, solely using daily closing prices. Furthermore, DeePM improves upon the state-of-the-art Momentum Transformer architecture by roughly fifty percent. The model demonstrates structural resilience across the 2010s "CTA (Commodity Trading Advisor) Winter" and the post-2020 volatility regime shift, maintaining consistent performance through the pandemic, inflation shocks, and the subsequent higher-for-longer environment. Ablation studies confirm that strictly lagged cross-sectional attention, graph prior, principled treatment of transaction costs, and robust minimax optimization are the primary drivers of this generalization capability. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.05975 |
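The smooth worst-window penalty can be sketched as a softmin over rolling-window returns. This is our reading of the idea, not DeePM's objective; window length and temperature are illustrative:

```python
# Minimal sketch: a differentiable penalty concentrating on the most
# adverse historical subperiods.
import torch

def worst_window_penalty(returns: torch.Tensor, window: int = 63,
                         temp: float = 10.0) -> torch.Tensor:
    wr = returns.unfold(0, window, 1).sum(dim=1)   # rolling-window returns
    weights = torch.softmax(-temp * wr, dim=0)     # mass on the worst windows
    return -(weights * wr).sum()                   # smooth, differentiable

pnl = 0.01 * torch.randn(500)
pnl.requires_grad_(True)
loss = -pnl.mean() + 0.5 * worst_window_penalty(pnl)
loss.backward()   # in DeePM the gradient would flow to portfolio weights
```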
| By: | Carrera Gonzalo |
| Abstract: | Gross Domestic Product (GDP) is the best measure for quantifying a nation's flow of wealth. In Argentina, this figure is published with a lag of 70 to 80 days after the quarter closes. As an early indicator, INDEC releases the EMAE, which converges to GDP but takes 50 to 60 days after the reference month. In a context of high instability and uncertainty, this delay limits its usefulness as an input for decision-making. The objective of this work is to anticipate, 30 days ahead of INDEC's publication, the year-on-year variation of the EMAE and of the EMAE excluding agriculture. To this end, nowcasting techniques (econometric and machine learning models) were applied with 44 predictor variables of the Argentine economy. Six methods were tested: an Autoregressive Distributed Lag (ARDL) model, three machine learning models (Lasso, Ridge, and Elastic Net), and two parameter-selection methods (General-to-Specific, GETS, and Global Search Regression, GSR). Lasso yielded the lowest error on two of three metrics and led for both target variables. GETS performed well, while Ridge stood out on the non-agricultural EMAE and ARDL on the EMAE. |
| JEL: | E2 C1 |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:aep:anales:4784 |
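A minimal sketch of the nowcasting step, with synthetic stand-ins for the 44 Argentine predictors and LassoCV tuned on time-series splits to respect temporal ordering:

```python
# Minimal sketch (not the authors' data): Lasso nowcast of the latest month.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(6)
X = rng.normal(size=(150, 44))                        # monthly predictors
beta = np.zeros(44)
beta[:6] = rng.normal(size=6)                         # sparse true signal
y = X @ beta + rng.normal(scale=0.5, size=150)        # EMAE y/y variation

model = LassoCV(cv=TimeSeriesSplit(n_splits=5)).fit(X[:-1], y[:-1])
print("nowcast for the latest month:", model.predict(X[-1:]))
print("active predictors:", int((model.coef_ != 0).sum()))
```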
| By: | Gabriel Saco |
| Abstract: | Double Machine Learning is often justified by nuisance-rate conditions, yet finite-sample reliability also depends on the conditioning of the orthogonal-score Jacobian. This conditioning is typically assumed rather than tracked. When residualized treatment variance is small, the Jacobian is ill-conditioned and small systematic nuisance errors can be amplified, so nominal confidence intervals may look precise yet systematically under-cover. Our main result is an exact identity for the cross-fitted PLR-DML estimator, with no Taylor approximation. From this identity, we derive a stochastic-order bound that separates oracle noise from a conditioning-amplified nuisance remainder and yields a sufficiency condition for root-n inference. We further connect the amplification factor to semiparametric efficiency geometry via the Riesz representer and use a triangular-array framework to characterize regimes as residual treatment variation weakens. These results motivate an out-of-fold diagnostic that summarizes the implied amplification scale. We do not propose universal thresholds. Instead, we recommend reporting the diagnostic alongside cross-learner sensitivity summaries as a fragility assessment, illustrated in simulation and an empirical example. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2512.07083 |
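A minimal sketch of the out-of-fold diagnostic as we read it: in the partially linear model the orthogonal-score Jacobian scales with the residualized treatment variance, so its inverse summarizes the implied amplification:

```python
# Minimal sketch (our rendering, not the paper's code): small residualised
# treatment variance flags an ill-conditioned orthogonal-score Jacobian.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(7)
n = 1000
X = rng.normal(size=(n, 10))
D = X[:, 0] + 0.1 * rng.normal(size=n)     # treatment nearly spanned by X
Y = 0.5 * D + X[:, 1] + rng.normal(size=n)

m_hat = cross_val_predict(RandomForestRegressor(), X, D, cv=5)  # out-of-fold
resid_var = np.var(D - m_hat)              # Jacobian scale in PLR-DML
print("residualised treatment variance:", resid_var)
print("amplification scale ~ 1/resid_var:", 1 / resid_var)
```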
| By: | Gongao Zhang; Haijiang Zeng; Lu Jiang |
| Abstract: | Financial institutions and regulators require systems that integrate heterogeneous data to assess risks from stock fluctuations to systemic vulnerabilities. Existing approaches often treat these tasks in isolation, failing to capture cross-scale dependencies. We propose Uni-FinLLM, a unified multimodal large language model that uses a shared Transformer backbone and modular task heads to jointly process financial text, numerical time series, fundamentals, and visual data. Through cross-modal attention and multi-task optimization, it learns a coherent representation for micro-, meso-, and macro-level predictions. Evaluated on stock forecasting, credit-risk assessment, and systemic-risk detection, Uni-FinLLM significantly outperforms baselines. It raises stock directional accuracy to 67.4% (from 61.7%), credit-risk accuracy to 84.1% (from 79.6%), and macro early-warning accuracy to 82.3%. Results validate that a unified multimodal LLM can jointly model asset behavior and systemic vulnerabilities, offering a scalable decision-support engine for finance. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.02677 |
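A minimal sketch of the shared-backbone, task-head layout (a toy module, far smaller than Uni-FinLLM; cross-modal fusion is reduced here to a pooled token sequence):

```python
# Minimal sketch: one shared encoder, separate heads for the micro-, meso-,
# and macro-level prediction tasks.
import torch
import torch.nn as nn

class SharedBackboneModel(nn.Module):
    def __init__(self, d_model: int = 64):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.heads = nn.ModuleDict({
            "stock_direction": nn.Linear(d_model, 2),    # micro
            "credit_risk": nn.Linear(d_model, 2),        # meso
            "systemic_warning": nn.Linear(d_model, 2),   # macro
        })

    def forward(self, tokens, task: str):
        h = self.backbone(tokens).mean(dim=1)   # pooled multimodal sequence
        return self.heads[task](h)

model = SharedBackboneModel()
logits = model(torch.randn(4, 16, 64), task="credit_risk")
print(logits.shape)          # torch.Size([4, 2])
```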
| By: | Efstratios Manolakis; Christian Bongiorno; Rosario Nunzio Mantegna |
| Abstract: | A new wave of work on covariance cleaning and nonlinear shrinkage has delivered asymptotically optimal analytical solutions for large covariance matrices. The same framework has been generalized to empirical cross-covariance matrices, whose singular value decomposition identifies canonical comovement modes between two asset sets, with singular values quantifying the strength of each mode and providing natural targets for shrinkage. Existing analytical cross-covariance cleaners are derived under strong stationarity and large-sample assumptions, and they typically rely on mesoscopic regularity conditions such as bounded spectra; macroscopic common modes (e.g., a global market factor) violate these conditions. When applied to real equity returns, where dependence structures drift over time and global modes are prominent, we find that these theoretically optimal formulas do not translate into robust out-of-sample performance. We address this gap by designing a random-matrix-inspired neural architecture that operates in the empirical singular-vector basis and learns a nonlinear mapping from empirical singular values to their corresponding cleaned values. By construction, the network can recover the analytical solution as a special case, yet it remains flexible enough to adapt to non-stationary dynamics and mode-driven distortions. Trained on a long history of equity returns, the proposed method achieves a more favorable bias-variance trade-off than purely analytical cleaners and delivers systematically lower out-of-sample cross-covariance prediction errors. Our results demonstrate that combining random-matrix theory with machine learning makes asymptotic theories practically effective in realistic time-varying markets. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.07687 |
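A minimal sketch of the cleaning idea: keep the empirical singular vectors and learn only a map on the singular values (toy network and matrix, not the authors' architecture):

```python
# Minimal sketch: SVD of an empirical cross-covariance, with a small net
# as the learned nonlinear shrinkage of the singular values.
import torch
import torch.nn as nn

def clean_cross_cov(C_emp: torch.Tensor, net: nn.Module) -> torch.Tensor:
    U, s, Vt = torch.linalg.svd(C_emp, full_matrices=False)
    s_clean = net(s.unsqueeze(-1)).squeeze(-1)   # nonlinear shrinkage
    return U @ torch.diag(s_clean) @ Vt

shrinker = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1),
                         nn.Softplus())          # keep cleaned values >= 0
C_emp = torch.randn(30, 20) @ torch.randn(20, 20) / 20   # toy cross-covariance
print(clean_cross_cov(C_emp, shrinker).shape)    # torch.Size([30, 20])
# Training would minimise out-of-sample error against future cross-covariances.
```

Because an identity map on the singular values is in the network's function class, the analytical cleaner can be recovered as a special case, which matches the abstract's construction.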
| By: | Sahaj Raj Malla; Shreeyash Kayastha; Rumi Suwal; Harish Chandra Bhandari; Rajendra Adhikari |
| Abstract: | This study develops a robust machine learning framework for one-step-ahead forecasting of daily log-returns in the Nepal Stock Exchange (NEPSE) Index using the XGBoost regressor. A comprehensive feature set is engineered, including lagged log-returns (up to 30 days) and established technical indicators such as short- and medium-term rolling volatility measures and the 14-period Relative Strength Index. Hyperparameter optimization is performed using Optuna with time-series cross-validation on the initial training segment. Out-of-sample performance is rigorously assessed via walk-forward validation under both expanding and fixed-length rolling window schemes across multiple lag configurations, simulating real-world deployment and avoiding lookahead bias. Predictive accuracy is evaluated using root mean squared error, mean absolute error, coefficient of determination (R-squared), and directional accuracy on both log-returns and reconstructed closing prices. Empirical results show that the optimal configuration, an expanding window with 20 lags, outperforms tuned ARIMA and Ridge regression benchmarks, achieving the lowest log-return RMSE (0.013450) and MAE (0.009814) alongside a directional accuracy of 65.15%. While the R-squared remains modest, consistent with the noisy nature of financial returns, primary emphasis is placed on relative error reduction and directional prediction. Feature importance analysis and visual inspection further enhance interpretability. These findings demonstrate the effectiveness of gradient boosting ensembles in modeling nonlinear dynamics in volatile emerging market time series and establish a reproducible benchmark for NEPSE Index forecasting. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.08896 |
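A minimal sketch of the feature set and expanding-window protocol (toy returns in place of NEPSE data; the Optuna tuning step is omitted):

```python
# Minimal sketch: lagged log-returns, rolling volatility, and a 14-period
# RSI, evaluated with an expanding walk-forward loop.
import numpy as np
import pandas as pd
import xgboost as xgb

rng = np.random.default_rng(8)
r = pd.Series(rng.normal(scale=0.013, size=800))     # daily log-returns

feats = pd.DataFrame({f"lag_{k}": r.shift(k) for k in range(1, 21)})
feats["vol_10"] = r.rolling(10).std().shift(1)       # shift avoids look-ahead
gain = r.clip(lower=0).rolling(14).mean()
loss = (-r.clip(upper=0)).rolling(14).mean()
feats["rsi_14"] = (100 - 100 / (1 + gain / loss)).shift(1)
data = feats.assign(target=r).dropna()

hits = []
for t in range(600, len(data)):                      # expanding window
    train, test = data.iloc[:t], data.iloc[t:t + 1]
    model = xgb.XGBRegressor(n_estimators=100).fit(
        train.drop(columns="target"), train["target"])
    pred = model.predict(test.drop(columns="target"))[0]
    hits.append(np.sign(pred) == np.sign(test["target"].iloc[0]))
print("directional accuracy:", np.mean(hits))
```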
| By: | Paweł Niszczota; Cassandra Grützner |
| Abstract: | The rapid spread of large language models (LLMs) has raised concerns about the social reactions they provoke. Prior research documents negative attitudes toward AI users, but it remains unclear whether such disapproval translates into costly action. We address this question in a two-phase online experiment (N = 491 Phase II participants; Phase I provided targets) where participants could spend part of their own endowment to reduce the earnings of peers who had previously completed a real-effort task with or without LLM support. On average, participants destroyed 36% of the earnings of those who relied exclusively on the model, with punishment increasing monotonically with actual LLM use. Disclosure about LLM use created a credibility gap: self-reported null use was punished more harshly than actual null use, suggesting that declarations of "no use" are treated with suspicion. Conversely, at high levels of use, actual reliance on the model was punished more strongly than self-reported reliance. Taken together, these findings provide the first behavioral evidence that the efficiency gains of LLMs come at the cost of social sanctions. |
| Date: | 2026–01 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2601.09772 |