New Economics Papers on Computational Economics
| By: | Gabriel M. Arantes; Richard F. Pinto; Bruno L. Dalmazo; Eduardo N. Borges; Giancarlo Lucca; Viviane L. D. de Mattos; Fabian C. Cardoso; Rafael A. Berri |
| Abstract: | Binary options trading is often marketed as a field where predictive models can generate consistent profits. However, the inherent randomness of binary options makes price movements highly unpredictable, posing significant challenges for any forecasting approach. This study demonstrates that machine learning algorithms struggle to outperform a simple baseline in predicting binary options movements. Using EUR/USD exchange-rate data from 2021 to 2023, we tested multiple models, including Random Forest, Logistic Regression, Gradient Boosting, and k-Nearest Neighbors (kNN), both before and after hyperparameter optimization. Furthermore, several neural network architectures, including Multi-Layer Perceptrons (MLP) and a Long Short-Term Memory (LSTM) network, were evaluated under different training conditions. Despite these exhaustive efforts, none of the models surpassed the ZeroR baseline accuracy, highlighting the inherent randomness of binary options. These findings reinforce the notion that binary options lack predictable patterns, making them unsuitable for machine learning-based forecasting. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.15960 |
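A minimal sketch of the baseline comparison this abstract describes: scikit-learn's DummyClassifier with strategy="most_frequent" plays the role of ZeroR, and the feature matrix and labels are random placeholders rather than the paper's engineered EUR/USD features.

```python
# Comparing classifiers against a ZeroR (majority-class) baseline.
# X and y are placeholders for engineered features and up/down labels.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))          # placeholder features
y = rng.integers(0, 2, size=5000)        # placeholder up/down labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, shuffle=False)

models = {
    "ZeroR": DummyClassifier(strategy="most_frequent"),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
    "LogReg": LogisticRegression(max_iter=1000),
    "GradBoost": GradientBoostingClassifier(random_state=0),
    "kNN": KNeighborsClassifier(n_neighbors=15),
}
for name, model in models.items():
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name:12s} accuracy = {acc:.3f}")  # on random labels, nothing beats ZeroR systematically
```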
| By: | Jan Schmid; He Cheng |
| Abstract: | The increasing complexity of real estate market forecasting, in combination with the accelerated evolution of machine learning (ML) algorithms, necessitates the optimisation of algorithm selection to reduce computational demands and enhance model accuracy. While numerous studies have examined the performance of individual algorithms, a significant research gap remains concerning the impact of dataset characteristics on algorithmic performance within this specific domain. The present study addresses this gap by undertaking a systematic meta-learning analysis of 54 real estate forecasting studies conducted between 2001 and 2024. The study explores the relationship between dataset characteristics and algorithm performance, focusing on factors such as dataset size, dimensionality, and variable categories. Two models, a decision tree and a random forest, were used to assess the impact of these characteristics on the accuracy of various algorithm categories, including artificial neural networks (ANNs), ensemble methods, and support vector machines (SVMs). The study's findings suggest that the random forest algorithm, when applied to dataset characteristics, serves as a reliable tool for predicting the best-performing algorithm for a given real estate market forecasting dataset. The model attained an average area under the curve (AUC) of 0.98 and an overall accuracy of 88%, underscoring the practical relevance of meta-learning approaches in econometrics and highlighting the potential for further enhancing algorithm selection methodologies in this research domain. This research contributes to the expanding field of automated meta-learning by providing a framework for more efficient and accurate real estate market forecasting. |
| Keywords: | algorithm selection; meta-learning; random forest; real estate forecasting |
| JEL: | R3 |
| Date: | 2025–01–01 |
| URL: | https://d.repec.org/n?u=RePEc:arz:wpaper:eres2025_40 |
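A hedged sketch of the meta-learning setup described above: each row describes one study's dataset by meta-features (size, dimensionality, variable mix), and the label is the best-performing algorithm family. The meta-feature names and data below are illustrative assumptions, not the paper's coding scheme.

```python
# Meta-learning for algorithm selection: a random forest over dataset characteristics.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
meta = pd.DataFrame({
    "n_obs": rng.integers(100, 100_000, size=54),     # dataset size
    "n_features": rng.integers(3, 60, size=54),       # dimensionality
    "share_categorical": rng.uniform(0, 1, size=54),  # variable-category mix
})
best_algo = rng.choice(["ANN", "ensemble", "SVM"], size=54)  # illustrative labels

meta_learner = RandomForestClassifier(n_estimators=500, random_state=1)
scores = cross_val_score(meta_learner, meta, best_algo, cv=5)
print("CV accuracy:", scores.mean())  # the paper reports ~88% accuracy, AUC 0.98
```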
| By: | Yusuke Takahashi (Bank of Japan); Kazuki Otaka (Bank of Japan); Naoya Kato (Bank of Japan) |
| Abstract: | In this article, we present some preliminary analyses in which Large Language Models (LLMs) are used as economic agents in simulations, as an example of utilizing Generative AI in economic analysis. Existing research reports that Generative AI provides responses consistent with predictions suggested in fields like behavioral economics. There are also some studies which have applied Agent-Based Models (ABM) by treating Generative AI as "players" in a market. However, even though Generative AI exhibits behavior similar to actual economic agents, in reality, it is merely outputting statistically consistent responses based on patterns found in its training data. Therefore, whether the results of simulations that treat Generative AI as economic agents are consistent with economic theory depends crucially on the AI's training data. In this article, we conduct simple ABM simulations to demonstrate how Generative AI can be applied, and examine whether its responses are aligned with intuition and economic theory. Our results are consistent with economic theory: (1) consumers adjust their spending in response to real wage fluctuations; and (2) firms find it easier to pass costs on to consumers in a monopoly market compared to a duopoly market. We conclude that it is necessary to continue verifying through other economic analyses whether simulations using Generative AI consistently lead to conclusions congruent with economic theory. |
| Keywords: | Generative AI; Agent-Based Model; Consumer Behavior; Price Setting Behavior |
| JEL: | C63 D11 D40 |
| Date: | 2025–11–13 |
| URL: | https://d.repec.org/n?u=RePEc:boj:bojlab:lab25e01 |
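For illustration only, one way an LLM can stand in for a consumer in an ABM step. `query_llm`, the prompt wording, and the numeric parsing are all hypothetical placeholders; the article does not publish its prompting protocol.

```python
# Hypothetical sketch of an LLM-as-consumer step in an agent-based model.
import re

def query_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real chat-completion client."""
    raise NotImplementedError

def consumer_spending(real_wage: float, price: float) -> float:
    """Ask the LLM agent how much to consume given its real wage and the good's price."""
    prompt = (
        f"You are a household. Your real wage this period is {real_wage:.2f} "
        f"and the price of the consumption good is {price:.2f}. "
        "Reply with a single number: the quantity you choose to buy."
    )
    reply = query_llm(prompt)
    match = re.search(r"[-+]?\d*\.?\d+", reply)  # parse the first number in the reply
    return float(match.group()) if match else 0.0
```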
| By: | Stanislav Selitskiy |
| Abstract: | We investigate a number of Artificial Neural Network architectures (well-known and more "exotic") applied to long-term financial time-series forecasts of indexes on different global markets. The particular interest of this research is the correlation of these indexes' behaviour in terms of cross-training of machine learning algorithms: would training an algorithm on an index from one global market produce similar or even better accuracy when the model is applied to predict another index from a different market? The predominantly positive answer to this question is another argument in favour of Eugene Fama's long-debated Efficient Market Hypothesis. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.08658 |
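A sketch of the cross-training question posed above: fit a forecaster on one index, then score it on another. The MLP and lag-window features are stand-ins, not the paper's architectures.

```python
# Cross-market training: train on index A, evaluate on index B.
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_windows(series, lags=20):
    """Turn a series into (lag-window, next-value) supervised pairs."""
    X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
    return X, series[lags:]

rng = np.random.default_rng(2)
index_a = np.cumsum(rng.normal(size=2000))   # placeholder for one market's index
index_b = np.cumsum(rng.normal(size=2000))   # placeholder for another market's index

Xa, ya = make_windows(index_a)
Xb, yb = make_windows(index_b)

model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
model.fit(Xa, ya)                                # train on market A only
print("own-market R^2:  ", model.score(Xa, ya))
print("cross-market R^2:", model.score(Xb, yb))  # the cross-training question
```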
| By: | Marlon Azinovic-Yang; Jan Zemlicka |
| Abstract: | We develop a deep learning algorithm for approximating functional rational expectations equilibria of dynamic stochastic economies in the sequence space. We use deep neural networks to parameterize equilibrium objects of the economy as a function of truncated histories of exogenous shocks. We train the neural networks to fulfill all equilibrium conditions along simulated paths of the economy. To illustrate the performance of our method, we solve three economies of increasing complexity: the stochastic growth model, a high-dimensional overlapping generations economy with multiple sources of aggregate risk, and finally an economy where households and firms face uninsurable idiosyncratic risk, shocks to aggregate productivity, and shocks to idiosyncratic and aggregate volatility. Furthermore, we show how to design practical neural policy function architectures that guarantee monotonicity of the predicted policies, facilitating the use of the endogenous grid method to simplify parts of our algorithm. |
| Keywords: | deep learning, heterogeneous firms, heterogeneous households, overlapping generations, deep neural networks, global solution method, life-cycle, occasionally binding constraints |
| JEL: | C61 C63 C68 D52 E32 |
| Date: | 2025–10 |
| URL: | https://d.repec.org/n?u=RePEc:cer:papers:wp802 |
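A hedged sketch of one standard construction that guarantees a monotone policy network, in the spirit of the architectures mentioned above: nonnegative weights (via softplus) composed with monotone activations. This is not necessarily the authors' exact design.

```python
# Monotone policy network: nonnegative weights + increasing activations
# make the output nondecreasing in every input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotoneLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.raw_weight = nn.Parameter(torch.randn(d_out, d_in) * 0.1)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x):
        return F.linear(x, F.softplus(self.raw_weight), self.bias)  # weights >= 0

class MonotonePolicy(nn.Module):
    def __init__(self, d_in, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            MonotoneLinear(d_in, hidden), nn.Tanh(),   # tanh is increasing
            MonotoneLinear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

policy = MonotonePolicy(d_in=3)
x = torch.sort(torch.randn(100, 3), dim=0).values  # coordinate-wise nondecreasing rows
y = policy(x)
assert (y[1:] >= y[:-1]).all()  # output is nondecreasing along that path
```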
| By: | Elliot Beck; Franziska Eckert; Linus K\"uhne; Helge Liebert; Rina Rosenblatt-Wisch |
| Abstract: | We introduce a novel indicator that combines machine learning and large language models with traditional statistical methods to track sentiment regarding the economic outlook in Swiss news. The indicator is interpretable and timely, and it significantly improves the accuracy of GDP growth forecasts. Our approach is resource-efficient, modular, and offers a way of benefitting from state-of-the-art large language models even if data are proprietary and cannot be stored or analyzed on external infrastructure - a restriction faced by many central banks and public institutions. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.04299 |
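A minimal sketch of the two-stage idea described above: score articles for economic-outlook sentiment, aggregate into a time index, and add it to an autoregressive GDP-growth benchmark. The scoring function is a placeholder for the paper's ML/LLM pipeline (which runs on in-house infrastructure), and the series are simulated.

```python
# Sentiment index + AR benchmark for GDP growth.
import numpy as np
import statsmodels.api as sm

def score_article(text: str) -> float:
    """Placeholder pipeline stage: return outlook sentiment in [-1, 1]."""
    return 0.0

# Illustrative data: a quarterly sentiment index that leads GDP growth.
rng = np.random.default_rng(3)
sentiment = rng.normal(size=80)                                  # simulated index
gdp_growth = 0.4 * np.roll(sentiment, 1) + rng.normal(scale=0.5, size=80)

# AR(1) benchmark augmented with lagged sentiment.
X = sm.add_constant(np.column_stack([gdp_growth[:-1], sentiment[:-1]]))
model = sm.OLS(gdp_growth[1:], X).fit()
print(model.params)  # a significant sentiment coefficient is the forecast gain
```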
| By: | Matthias Soot; Sabine Horvath; Danielle Warstat; Hans-Berndt Neuner; Alexandra Weitkamp |
| Abstract: | Analyzing the real estate market using modern machine learning (ML) methods is increasingly common. The variables (factors) influencing the real estate market (purchase price or value) often behave non-linearly. For this reason, ML methods seem to outperform the previously established linear regression models, especially when modelling larger datasets from large spatial submarkets or long timespans. However, many approaches in the literature use the same influencing parameters known from multiple linear regression models for the new non-parametric approaches. It remains unclear whether there are further influencing variables that only prove significant in a non-linear model. The selection of influencing factors is understood here as model selection: in this work, we investigate model selection approaches on inhomogeneous German real estate transaction data from Brandenburg, Saxony and Lower Saxony. The aim of the research is improved automatization of model selection starting from raw data. As a functional submarket, we aggregate multi-family houses with apartments to increase the sample size. The dataset has several gaps in explanatory parameters, e.g. living space. Furthermore, the influencing variables differ between apartments and multi-family houses. We therefore develop a method to model this inhomogeneity in a single approach (e.g. factor analysis). We consider Artificial Neural Networks (ANN), Random Forest (RF) and Gradient Boosting (GB) as ML models for which the model selection is performed. We compare the parameters found with those from classical model selection used for a linear approach. |
| Keywords: | Germany; machine learning; model selection |
| JEL: | R3 |
| Date: | 2025–01–01 |
| URL: | https://d.repec.org/n?u=RePEc:arz:wpaper:eres2025_245 |
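One plausible reading of non-linear model selection, sketched with permutation importance under a random forest; the variables and data-generating process are invented for illustration.

```python
# Rank candidate variables by permutation importance under a random forest,
# to catch influences a linear stepwise search would miss.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(4)
n = 1000
X = rng.normal(size=(n, 5))   # e.g. living space, age, plot size, ... (illustrative)
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=n)  # non-linear signal

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)
imp = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
print("variables ranked by RF importance:", ranking)
# Variables 0 and 1 should rank first; the x^2 term has zero linear correlation
# with y, which is exactly the gap between linear and non-linear model selection.
```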
| By: | Sotiris Tsolacos; Tatiana Franus |
| Abstract: | In this paper, we evaluate the performance of various methodologies for forecasting real estate yields. Expected yield changes are a crucial input for valuations and investment strategies. We conduct a comparative study to assess the forecast accuracy of econometric and time series models relative to machine learning algorithms. Our target series include net initial and equivalent yields across key real estate sectors: office, industrial, and retail. The analysis is based on monthly UK data, though the framework can be applied to different contexts, including quarterly data. The econometric and time series models considered include ARMA, ARMAX, stepwise regression, and VAR family models, while the machine learning methods encompass Random Forest, XGBoost, Decision Tree, Gradient Boosting and Support Vector Machines. We utilise a comprehensive set of economic, financial, and survey data to predict yield movements and evaluate forecast performance over three-, six-, and twelve-month horizons. While conventional forecast metrics are calculated, our primary focus is on directional forecasting. The findings have significant practical implications. By capturing directional changes, our assessment aids price discovery in real estate markets. Given that private-market real estate data are reported with a lag - even for monthly data - early signals of price movements are valuable for investors and lenders. This study aims to identify the most successful methods to gauge forthcoming yield movements. |
| Keywords: | directional forecasting; econometric models; Machine Learning; property yields |
| JEL: | R3 |
| Date: | 2025–01–01 |
| URL: | https://d.repec.org/n?u=RePEc:arz:wpaper:eres2025_269 |
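A sketch of the directional evaluation the abstract emphasizes: the hit rate on the sign of the h-step yield change for h = 3, 6, 12 months. The yield series and benchmark forecast are simulated.

```python
# Directional forecast evaluation: sign hit rate over multiple horizons.
import numpy as np

def directional_accuracy(actual: np.ndarray, forecast: np.ndarray, horizon: int) -> float:
    """Share of periods where the forecast gets the sign of the h-step change right."""
    d_actual = np.sign(actual[horizon:] - actual[:-horizon])
    d_forecast = np.sign(forecast[horizon:] - actual[:-horizon])
    return float(np.mean(d_actual == d_forecast))

rng = np.random.default_rng(5)
yields = np.cumsum(rng.normal(scale=0.02, size=120)) + 5.0  # placeholder monthly yields
naive_forecast = np.roll(yields, 1)                         # illustrative benchmark

for h in (3, 6, 12):
    print(f"{h:2d}-month hit rate: {directional_accuracy(yields, naive_forecast, h):.2f}")
```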
| By: | Filippo Gusella; Eugenio Vicario |
| Abstract: | Studies in the Heterogeneous Agent Model (HAM) literature estimate the proportion of fundamentalists and trend followers in the financial market, a proportion that varies with the period analyzed. In this paper, we use a large language model (LLM) to construct a generative agent (GA) that determines the probability of adopting one of the two strategies based on current information. The probabilities of strategy adoption are compared with those in the HAM literature for the S&P 500 index between 1990 and 2020. Our findings suggest that the resulting artificial intelligence (AI) expectations align with those reported in the HAM literature. At the same time, extending the analysis to artificial market data helps us to disentangle the decision-making process of the AI agent. In the artificial market, the results confirm the heterogeneity in expectations but reveal a systematic asymmetry toward fundamentalist behavior. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.08604 |
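For background, the discrete-choice switching rule standard in the HAM literature the paper compares against: strategy fractions follow a logit over recent strategy profits (Brock-Hommes style). The LLM-based generative agent replaces this rule with a probability elicited from current information; the numbers below are illustrative.

```python
# Logit switching between fundamentalist and trend-following strategies.
import numpy as np

def strategy_fractions(profit_fund: float, profit_trend: float, beta: float = 2.0):
    """Logit fractions of fundamentalists vs. trend followers given past profits.
    beta is the intensity of choice: higher beta, faster switching."""
    u = np.array([profit_fund, profit_trend]) * beta
    expu = np.exp(u - u.max())          # numerically stable softmax
    return expu / expu.sum()

n_fund, n_trend = strategy_fractions(profit_fund=0.8, profit_trend=0.5)
print(f"fundamentalists: {n_fund:.2f}, trend followers: {n_trend:.2f}")
```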
| By: | David Autor; Andrew Caplin; Daniel Martin; Philip Marx |
| Abstract: | The cost of error in many high-stakes settings is asymmetric: misdiagnosing pneumonia when absent is an inconvenience, but failing to detect it when present can be life-threatening. Because of this, artificial intelligence (AI) models used to assist such decisions are frequently trained with asymmetric loss functions that incorporate human decision-makers' trade-offs between false positives and false negatives. In two focal applications, we show that this standard alignment practice can backfire. In both cases, it would be better to train the machine learning model with a loss function that ignores the human's objective and then adjust predictions ex post according to that objective. We rationalize this result using an economic model of incentive design with endogenous information acquisition. The key insight from our theoretical framework is that machine classifiers perform not one but two incentivized tasks: choosing how to classify and learning how to classify. We show that while the adjustments engineers use correctly incentivize choosing, they can simultaneously reduce the incentives to learn. Our formal treatment of the problem reveals that methods embraced for their intuitive appeal can in fact misalign human and machine objectives in predictable ways. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.07699 |
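A sketch of the prescription the abstract describes, under standard decision-theoretic assumptions: train with an unweighted log loss so the model focuses on learning, then encode the asymmetric objective ex post as a probability threshold. With false-positive cost c_fp and false-negative cost c_fn, the cost-minimizing cutoff on p(y=1|x) is c_fp / (c_fp + c_fn); the data and costs below are illustrative.

```python
# Symmetric training, asymmetric decision: adjust predictions ex post.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # symmetric loss: just learn
p = clf.predict_proba(X_te)[:, 1]

c_fp, c_fn = 1.0, 10.0                 # e.g. missing pneumonia is 10x worse
threshold = c_fp / (c_fp + c_fn)       # the ex-post adjustment encodes the objective
y_hat = (p >= threshold).astype(int)
cost = c_fp * np.sum((y_hat == 1) & (y_te == 0)) + c_fn * np.sum((y_hat == 0) & (y_te == 1))
print(f"threshold = {threshold:.2f}, total cost = {cost:.0f}")
```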
| By: | Vikram Aggarwal (Google); Jay Kulkarni (xKDR Forum); Aakriti Narang (xKDR Forum); Aditi Mascarenhas (xKDR Forum); Siddarth Raman (xKDR Forum); Ajay Shah (xKDR Forum); Susan Thomas (xKDR Forum) |
| Abstract: | Large Language Models (LLMs) have demonstrated remarkable capabilities in text comprehension, but their ability to process complex, hierarchical tabular data remains underexplored. We present a novel approach to extracting structured data from multi-page government fiscal documents using LLM-based techniques. Applied to large annual fiscal documents from the State of Karnataka in India, our method achieves high accuracy through a multi-stage pipeline that leverages domain knowledge, sequential context, and algorithmic validation. Traditional OCR methods work poorly with errors that are hard to detect. The inherent structure of fiscal tables, with totals at each level of the hierarchy, allows for robust internal validation of the extracted data. We use these hierarchical relationships to create multi-level validation checks. We demonstrate that LLMs can read tables and also process document-specific structural hierarchies, offering a scalable process for converting PDF-based fiscal disclosures into research-ready databases. Our implementation shows promise for broader applications across developing country contexts. |
| JEL: | H6 H7 Y10 |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:anf:wpaper:43 |
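A sketch of the hierarchical validation idea described above: children at each level of a fiscal table must sum to the parent's reported total, giving a built-in check on extracted numbers. The budget structure and tolerance are invented for illustration.

```python
# Multi-level validation: child amounts must sum to each parent's total.
def validate_totals(node: dict, tol: float = 0.5) -> list[str]:
    """Recursively check that child amounts sum to each parent's reported total."""
    errors = []
    children = node.get("children", [])
    if children:
        total = sum(c["amount"] for c in children)
        if abs(total - node["amount"]) > tol:
            errors.append(f"{node['name']}: children sum {total} != total {node['amount']}")
        for c in children:
            errors.extend(validate_totals(c, tol))
    return errors

budget = {
    "name": "Department total", "amount": 300.0,
    "children": [
        {"name": "Scheme A", "amount": 180.0,
         "children": [{"name": "A.1", "amount": 100.0}, {"name": "A.2", "amount": 80.0}]},
        {"name": "Scheme B", "amount": 120.0},
    ],
}
print(validate_totals(budget) or "all levels consistent")
```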
| By: | Krishna Neupane; Igor Griva |
| Abstract: | Corporate insiders have control of material non-public information (MNPI). Occasionally, insiders strategically bypass legal and regulatory safeguards to exploit MNPI in their securities trading. Due to the large volume of transactions, detecting unlawful insider trading and identifying the underlying patterns in insiders' behavior is an arduous task for humans. On the other hand, innovative machine learning architectures have shown promising results for analyzing large-scale, complex data with hidden patterns. One such popular technique is eXtreme Gradient Boosting (XGBoost), a state-of-the-art supervised classifier. We therefore apply XGBoost to the identification and detection of unlawful activities. The results demonstrate that XGBoost can identify unlawful transactions with a high accuracy of 97 percent and can rank the features that play the most important role in detecting fraudulent activities. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.08306 |
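A sketch of the classification setup the abstract names: an XGBoost classifier over engineered trade features, with the fitted model's importance scores ranking the features. The features, labels, and data-generating process are placeholders, not the paper's data.

```python
# XGBoost classification with feature-importance ranking.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n = 2000
X = rng.normal(size=(n, 4))   # e.g. trade size, timing vs. announcement, frequency, role
y = (X[:, 1] + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=n) > 1).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
print("feature importance ranking:", np.argsort(clf.feature_importances_)[::-1])
```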
| By: | Qi Feng; Guang Lin; Purav Matlia; Denny Serdarevic |
| Abstract: | In this paper, we propose a novel data-driven framework for discovering probabilistic laws underlying the Feynman-Kac formula. Specifically, we introduce the first stochastic SINDy method formulated under the risk-neutral probability measure to recover the backward stochastic differential equation (BSDE) from a single pair of stock and option trajectories. Unlike existing approaches to identifying stochastic differential equations - which typically require ergodicity - our framework leverages the risk-neutral measure, thereby eliminating the ergodicity assumption and enabling BSDE recovery from limited financial time series data. Using this algorithm, we are able not only to make forward-looking predictions but also to generate new synthetic data paths consistent with the underlying probabilistic law. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.08606 |
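For background, the sparse-regression core of SINDy-style methods (sequentially thresholded least squares) on a toy deterministic library; the paper's actual contribution, recovering a BSDE under the risk-neutral measure from a single stock/option path, is more involved and not reproduced here.

```python
# Sequentially thresholded least squares (STLSQ), the sparse-regression
# workhorse of SINDy-type identification.
import numpy as np

def stlsq(Theta: np.ndarray, dX: np.ndarray, threshold: float = 0.1, iters: int = 10):
    """Sparse coefficients xi with Theta @ xi ~= dX; small entries pruned each pass."""
    xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if big.any():
            xi[big] = np.linalg.lstsq(Theta[:, big], dX, rcond=None)[0]
    return xi

# Toy example: dx = 2*x - 0.5*x^3 with candidate library [1, x, x^2, x^3].
x = np.linspace(-2, 2, 200)
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
dx = 2 * x - 0.5 * x**3
print(stlsq(Theta, dx))   # ~[0, 2, 0, -0.5]
```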
| By: | Sina Kazemian; Ghazal Farhani; Amirhessam Yazdi |
| Abstract: | We present an uncertainty-aware, physics-informed neural network (PINN) for option pricing that solves the Black-Scholes (BS) partial differential equation (PDE) as a mesh-free, global surrogate over (S, t). The model embeds the BS operator and boundary/terminal conditions in a residual-based objective and requires no labeled prices. For American options, early exercise is handled via an obstacle-style relaxation while retaining the BS residual in the continuation region. To quantify epistemic uncertainty, we introduce an anchored-ensemble fine-tuning stage (AT-PINN) that regularizes each model toward a sampled anchor and yields prediction bands alongside point estimates. On European calls/puts, the approach attains low errors (e.g., MAE ≈ 5×10⁻², RMSE ≈ 7×10⁻², explained variance ≈ 0.999 in representative settings) and tracks ground truth closely across strikes and maturities. For American puts, the method remains accurate (MAE/RMSE on the order of 10⁻¹ with EV ≈ 0.999) and does not exhibit the error accumulation associated with time-marching schemes. Against data-driven baselines (ANN, RNN) and a Kolmogorov-Arnold FINN variant (KAN), our PINN matches or outperforms on accuracy while training more stably; anchored ensembles provide uncertainty bands that align with observed error scales. We discuss design choices (loss balancing, sampling near the payoff kink), limitations, and extensions to higher-dimensional BS settings and alternative dynamics. |
| Date: | 2025–10 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.05519 |
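A sketch of the PDE-residual loss at the heart of a Black-Scholes PINN: automatic differentiation evaluates the BS operator at sampled collocation points (S, t), and the squared residual is minimized. The network size, sampling, and constants are illustrative, and the full objective would add terminal and boundary terms.

```python
# Black-Scholes residual for a PINN, computed with autograd.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
r, sigma = 0.05, 0.2   # risk-free rate and volatility (assumed constants)

def bs_residual(S: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    S = S.requires_grad_(True)
    t = t.requires_grad_(True)
    V = net(torch.stack([S, t], dim=-1)).squeeze(-1)
    V_t = torch.autograd.grad(V.sum(), t, create_graph=True)[0]
    V_S = torch.autograd.grad(V.sum(), S, create_graph=True)[0]
    V_SS = torch.autograd.grad(V_S.sum(), S, create_graph=True)[0]
    # Black-Scholes operator: V_t + 0.5*sigma^2*S^2*V_SS + r*S*V_S - r*V = 0
    return V_t + 0.5 * sigma**2 * S**2 * V_SS + r * S * V_S - r * V

S = torch.rand(256) * 200.0   # collocation points in (S, t)
t = torch.rand(256)
loss = (bs_residual(S, t) ** 2).mean()   # add terminal/boundary terms in full training
loss.backward()
```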
| By: | Dennis Thumm; Luis Ontaneda Mijares |
| Abstract: | Market generators using deep generative models have shown promise for synthetic financial data generation, but existing approaches lack causal reasoning capabilities essential for counterfactual analysis and risk assessment. We propose a Time-series Neural Causal Model VAE (TNCM-VAE) that combines variational autoencoders with structural causal models to generate counterfactual financial time series while preserving both temporal dependencies and causal relationships. Our approach enforces causal constraints through directed acyclic graphs in the decoder architecture and employs the causal Wasserstein distance for training. We validate our method on synthetic autoregressive models inspired by the Ornstein-Uhlenbeck process, demonstrating superior performance in counterfactual probability estimation with L1 distances as low as 0.03-0.10 compared to ground truth. The model enables financial stress testing, scenario analysis, and enhanced backtesting by generating plausible counterfactual market trajectories that respect underlying causal mechanisms. |
| Date: | 2025–11 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2511.04469 |
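A sketch of the causal-constraint idea described above: a decoder layer whose weight matrix is masked by a DAG adjacency matrix, so variable j feeds variable i only when the graph allows the edge. The three-variable chain is invented; the paper's TNCM-VAE additionally handles temporal dependence and trains with a causal Wasserstein distance.

```python
# DAG-masked linear layer: forbidden causal edges are zeroed out.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DAGMaskedLinear(nn.Module):
    def __init__(self, adjacency: torch.Tensor):
        super().__init__()
        self.register_buffer("mask", adjacency.float())   # mask[i, j] = 1 if j -> i
        n = adjacency.shape[0]
        self.weight = nn.Parameter(torch.randn(n, n) * 0.1)
        self.bias = nn.Parameter(torch.zeros(n))

    def forward(self, x):
        return F.linear(x, self.weight * self.mask, self.bias)  # zero forbidden edges

# Illustrative 3-variable chain: x0 -> x1 -> x2.
adj = torch.tensor([[0, 0, 0],
                    [1, 0, 0],
                    [0, 1, 0]])
layer = DAGMaskedLinear(adj)
out = layer(torch.randn(8, 3))
print(out.shape)   # torch.Size([8, 3])
```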
| By: | Athina Karvounaraki (European Commission); Alexis Stevenson (European Commission); Isabelle Labrosse (Science-Metrix); David Campbell (Science-Metrix); Henrik Karlstrøm (NIFU); Eric Iversen (NIFU); Lili Wang (UNU/MERIT, Maastricht University); Ad Notten (UNU/MERIT, Maastricht University) |
| Abstract: | The study examines the surge in GenAI chatbot mentions in the scientific literature, showing a 13-fold increase from November 2022 to December 2023. The use of GenAI chatbots in scientific research is concentrated in ICT and the Applied Sciences, where AI improves research efficiency. Key applications include writing and practical implementation, demonstrating these tools' widespread use in academic writing and research. Nonetheless, the increasing use of AI in research and academia raises concerns about quality assurance and trust. |
| Keywords: | Generative AI, Research, Scientific Literature, Chatbots, ICT, Applied Sciences, Academic Writing, Quality Assurance, Trust Issues |
| JEL: | O32 O38 C18 |
| Date: | 2025–06 |
| URL: | https://d.repec.org/n?u=RePEc:eug:wpaper:ki-01-25-084-en-n |