nep-cmp New Economics Papers
on Computational Economics
Issue of 2023‒09‒25
eighteen papers chosen by



  1. D-TIPO: Deep time-inconsistent portfolio optimization with stocks and options By Kristoffer Andersson; Cornelis W. Oosterlee
  2. Can Machine Learning Catch Economic Recessions Using Economic and Market Sentiments? By Kian Tehranian
  3. Applying Machine Learning Algorithms to Predict the Size of the Informal Economy By Joao Felix; Michel Alexandre; Gilberto Tadeu Lima
  4. Learning to Learn Financial Networks for Optimising Momentum Strategies By Xingyue Pu; Stefan Zohren; Stephen Roberts; Xiaowen Dong
  5. Model-agnostic auditing: a lost cause? By Hansen, Sakina; Loftus, Joshua
  6. Forecasting inflation using disaggregates and machine learning By Gilberto Boaretto; Marcelo C. Medeiros
  7. JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading By Sascha Frey; Kang Li; Peer Nagy; Silvia Sapora; Chris Lu; Stefan Zohren; Jakob Foerster; Anisoara Calinescu
  8. The Potential of Quantum Techniques for Stock Price Prediction By Naman S; Gaurang B; Neel S; Aswath Babu H
  9. "Guinea Pig Trials" Utilizing GPT: A Novel Smart Agent-Based Modeling Approach for Studying Firm Competition and Collusion By Xu Han; Zengqing Wu; Chuan Xiao
  10. Simulation Experiments as a Causal Problem By Tyrel Stokes; Ian Shrier; Russell Steele
  11. Grover Search for Portfolio Selection By A. Ege Yilmaz; Stefan Stettler; Thomas Ankenbrand; Urs Rhyner
  12. Retail Demand Forecasting: A Comparative Study for Multivariate Time Series By Md Sabbirul Haque; Md Shahedul Amin; Jonayet Miah
  13. Opportunities for business use of today's AI models - Rapidly achievable personalization of Large Language Models (like ChatGPT) in times of Industry 5.0 By Reinking, Ernst; Becker, Marco
  14. Linking microblogging sentiments to stock price movement: An application of GPT-4 By Rick Steinert; Saskia Altmann
  15. Analysis of CBDC Narrative of Central Banks using Large Language Models By Andres Alonso-Robisco; Jose Manuel Carbo
  16. Sources of economic policy uncertainty in the euro area: a ready-to-use database By Andrés Azqueta-Gavaldón; Marina Diakonova; Corinna Ghirelli; Javier J. Pérez
  17. American Stories: A Large-Scale Structured Text Dataset of Historical U.S. Newspapers By Melissa Dell; Jacob Carlson; Tom Bryan; Emily Silcock; Abhishek Arora; Zejiang Shen; Luca D'Amico-Wong; Quan Le; Pablo Querubin; Leander Heldring
  18. Effects of Daily News Sentiment on Stock Price Forecasting By S. Srinivas; R. Gadela; R. Sabu; A. Das; G. Nath; V. Datla

  1. By: Kristoffer Andersson; Cornelis W. Oosterlee
    Abstract: In this paper, we propose a machine learning algorithm for time-inconsistent portfolio optimization. The proposed algorithm builds upon neural network based trading schemes, in which the asset allocation at each time point is determined by a neural network. The loss function is given by an empirical version of the objective function of the portfolio optimization problem. Moreover, various trading constraints are naturally fulfilled by choosing appropriate activation functions in the output layers of the neural networks. Beyond this, our main contribution is to add options to the portfolio of risky assets and a risk-free bond, and to use additional neural networks to determine the amount allocated to the options as well as their strike prices. We consider objective functions more in line with the rational preferences of an investor than the classical mean-variance objective, apply realistic trading constraints, and model the assets with a correlated jump-diffusion SDE. With an incomplete market and a more involved objective function, we show that it is beneficial to add options to the portfolio. Moreover, adding options leads to a more stable stock allocation with less demand for drastic re-allocations.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.10556&r=cmp
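    A toy sketch of the output-activation trick described in the abstract above: a softmax output layer makes the allocation automatically long-only and fully invested. Names and shapes are illustrative, not the authors' code.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

def allocate(hidden, weights, bias):
    """Map the network's last hidden layer to portfolio weights that are
    non-negative and sum to one (long-only, fully invested)."""
    return softmax(hidden @ weights + bias)

rng = np.random.default_rng(0)
h = rng.normal(size=8)          # toy hidden-layer activations
W = rng.normal(size=(8, 4))     # output layer for 4 risky assets
w = allocate(h, W, np.zeros(4))
print(w, w.sum())               # weights >= 0, summing to 1.0
```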
  2. By: Kian Tehranian
    Abstract: Quantitative models are an important decision-making tool for policymakers and investors. Predicting an economic recession with high accuracy and reliability would be very beneficial for society. This paper assesses machine learning techniques for predicting economic recessions in the United States using market sentiment and economic indicators (seventy-five explanatory variables) at a monthly frequency from January 1986 to June 2022. To address missing time-series data points, the Autoregressive Integrated Moving Average (ARIMA) method was used to backcast explanatory variables. The analysis began by reducing the high-dimensional dataset to the most important features using the Boruta algorithm and a correlation matrix, and by resolving multicollinearity. Various cross-validated models were then built, spanning both probability regression methods and machine learning techniques, to predict the binary recession outcome. The methods considered are Probit, Logit, Elastic Net, Random Forest, Gradient Boosting, and Neural Network. Finally, the models' performance is discussed based on the confusion matrix, accuracy, and F1 score, along with potential reasons for their weaknesses and robustness.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.16200&r=cmp
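    A minimal sketch of the cross-validated classification step, assuming a monthly feature matrix and binary recession labels are already prepared; the ARIMA backcasting and Boruta selection steps are omitted and the data below is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(438, 10))             # Jan 1986 - Jun 2022, monthly
y = (rng.random(438) < 0.12).astype(int)   # synthetic recession flags

cv = TimeSeriesSplit(n_splits=5)           # respects temporal ordering
model = LogisticRegression(max_iter=1000)  # Logit; swap in other methods
print(cross_val_score(model, X, y, cv=cv, scoring="f1").mean())
```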
  3. By: Joao Felix; Michel Alexandre; Gilberto Tadeu Lima
    Abstract: The use of machine learning models and techniques to predict economic variables has been growing lately, motivated by their better performance when compared to that of linear models. Although linear models have the advantage of considerable interpretive power, efforts have intensified in recent years to make machine learning models more interpretable. In this paper, tests are conducted to determine whether models based on machine learning algorithms have better performance relative to that of linear models for predicting the size of the informal economy. The paper also explores whether the determinants of such size detected as the most important by machine learning models are the same as those detected in the literature based on traditional linear models. For this purpose, observations were collected and processed for 122 countries from 2004 to 2014. Next, eleven models (four linear and seven based on machine learning algorithms) were used to predict the size of the informal economy in these countries. The relative importance of the predictive variables in determining the results yielded by the machine learning algorithms was calculated using Shapley values. The results suggest that (i) models based on machine learning algorithms have better predictive performance than that of linear models and (ii) the main determinants detected through the Shapley values coincide with those detected in the literature using traditional linear models.
    Keywords: Informal economy; machine learning; linear models; Shapley values
    JEL: C52 C53 O17
    Date: 2023–08–28
    URL: http://d.repec.org/n?u=RePEc:spa:wpaper:2023wpecon10&r=cmp
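    A short sketch of ranking predictors by Shapley values, assuming the third-party shap package and a fitted tree ensemble; the data is synthetic rather than the paper's country panel.

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))                      # candidate determinants
y = 2 * X[:, 0] - X[:, 3] + rng.normal(size=300)   # informal-economy size

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)
importance = np.abs(shap_values).mean(axis=0)      # mean |SHAP| per feature
print(importance.argsort()[::-1])                  # most important first
```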
  4. By: Xingyue Pu; Stefan Zohren; Stephen Roberts; Xiaowen Dong
    Abstract: Network momentum provides a novel type of risk premium, which exploits the interconnections among assets in a financial network to predict future returns. However, the current process of constructing financial networks relies heavily on expensive databases and financial expertise, limiting accessibility for small-sized and academic institutions. Furthermore, the traditional approach treats network construction and portfolio optimisation as separate tasks, potentially hindering optimal portfolio performance. To address these challenges, we propose L2GMOM, an end-to-end machine learning framework that simultaneously learns financial networks and optimises trading signals for network momentum strategies. The model of L2GMOM is a neural network with a highly interpretable forward propagation architecture, which is derived from algorithm unrolling. The L2GMOM is flexible and can be trained with diverse loss functions for portfolio performance, e.g. the negative Sharpe ratio. Backtesting on 64 continuous future contracts demonstrates a significant improvement in portfolio profitability and risk control, with a Sharpe ratio of 1.74 across a 20-year period.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.12212&r=cmp
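    A minimal sketch of the negative-Sharpe training loss mentioned in the abstract; the positions and returns below are hypothetical daily arrays.

```python
import numpy as np

def negative_sharpe(positions, returns, periods_per_year=252):
    """Negative annualised Sharpe ratio of a strategy's daily P&L."""
    pnl = (positions * returns).sum(axis=1)    # daily portfolio return
    return -np.sqrt(periods_per_year) * pnl.mean() / (pnl.std() + 1e-8)

rng = np.random.default_rng(3)
returns = rng.normal(0.0002, 0.01, size=(1000, 64))  # 64 futures contracts
positions = np.tanh(rng.normal(size=(1000, 64)))     # bounded trading signals
print(negative_sharpe(positions, returns))           # minimised during training
```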
  5. By: Hansen, Sakina; Loftus, Joshua
    Abstract: Tools for interpretable machine learning (IML) or explainable artificial intelligence (xAI) can be used to audit algorithms for fairness or other desiderata. In a black-box setting without access to the algorithm’s internal structure an auditor may be limited to methods that are model-agnostic. These methods have severe limitations with important consequences for outcomes such as fairness. Among model-agnostic IML methods, visualizations such as the partial dependence plot (PDP) or individual conditional expectation (ICE) plots are popular and useful for displaying qualitative relationships. Although we focus on fairness auditing with PDP/ICE plots, the consequences we highlight generalize to other auditing or IML/xAI applications. This paper questions the validity of auditing in high-stakes settings with contested values or conflicting interests if the audit methods are model-agnostic.
    Keywords: artificial intelligence; black-box auditing; causal models; CEUR Workshop Proceedings (CEUR-WS.org); counterfactual fairness; individual conditional expectation; machine learning; partial dependence plots; supervised learning; visualization
    JEL: C1
    Date: 2023–07–16
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:120114&r=cmp
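    A sketch of the PDP/ICE visualisations the paper examines, using scikit-learn on synthetic data (plotting requires matplotlib; the auditing context is not reproduced).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import PartialDependenceDisplay

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(size=500) > 0).astype(int)

model = GradientBoostingClassifier().fit(X, y)
# kind="both" overlays individual ICE curves on the average partial dependence
PartialDependenceDisplay.from_estimator(model, X, features=[0, 1], kind="both")
```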
  6. By: Gilberto Boaretto; Marcelo C. Medeiros
    Abstract: This paper examines the effectiveness of several forecasting methods for predicting inflation, focusing on aggregating disaggregated forecasts - also known in the literature as the bottom-up approach. Taking the Brazilian case as an application, we consider different disaggregation levels for inflation and employ a range of traditional time series techniques as well as linear and nonlinear machine learning (ML) models to deal with a larger number of predictors. For many forecast horizons, the aggregation of disaggregated forecasts performs just as well as survey-based expectations and models that generate forecasts using the aggregate directly. Overall, ML methods outperform traditional time series models in predictive accuracy, with outstanding performance in forecasting disaggregates. Our results reinforce the benefits of using models in a data-rich environment for inflation forecasting, including aggregating disaggregated forecasts from ML techniques, mainly during volatile periods. From the COVID-19 pandemic onwards, the random forest model based on both aggregate and disaggregated inflation achieves remarkable predictive performance at intermediate and longer horizons.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.11173&r=cmp
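    An illustration of the bottom-up approach: forecast each disaggregate separately, then aggregate with expenditure weights. The weights, lag structure and data are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
n_components, n_lags, T = 5, 12, 200
series = rng.normal(0.4, 0.3, size=(T, n_components))  # component inflation
weights = np.array([0.3, 0.25, 0.2, 0.15, 0.1])        # expenditure weights

forecasts = []
for j in range(n_components):
    # lag matrix: predict month t from the previous n_lags months
    X = np.column_stack([series[i:T - n_lags + i, j] for i in range(n_lags)])
    y = series[n_lags:, j]
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    forecasts.append(model.predict(series[-n_lags:, j].reshape(1, -1))[0])

print(weights @ np.array(forecasts))  # aggregated headline forecast
```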
  7. By: Sascha Frey; Kang Li; Peer Nagy; Silvia Sapora; Chris Lu; Stefan Zohren; Jakob Foerster; Anisoara Calinescu
    Abstract: Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes it is important to have large scale efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. For many applications, there is a requirement for processing multiple books, either for the calibration of ABMs or for the training of RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, with a notably reduced per-message processing time. The implementation of our simulator - JAX-LOB - is based on design choices that aim to best exploit the powers of JAX without compromising on the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages, to provide an example of how one may address an optimal execution problem with reinforcement learning, and to share some preliminary results from end-to-end RL training on GPUs.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.13289&r=cmp
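    A toy illustration of the core idea - vectorising one pure book update across many books with JAX's vmap - not the actual JAX-LOB API.

```python
import jax
import jax.numpy as jnp

def apply_message(book, message):
    """Add a (price-level, signed-quantity) message to one toy book."""
    level, qty = message
    return book.at[level].add(qty)

# the same update compiled and run across thousands of books in parallel
batched_apply = jax.jit(jax.vmap(apply_message))

n_books, n_levels = 1000, 10
books = jnp.zeros((n_books, n_levels))
levels = jax.random.randint(jax.random.PRNGKey(0), (n_books,), 0, n_levels)
qtys = jax.random.normal(jax.random.PRNGKey(1), (n_books,))
books = batched_apply(books, (levels, qtys))
print(books.shape)
```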
  8. By: Naman S; Gaurang B; Neel S; Aswath Babu H
    Abstract: We explored the potential applications of various Quantum Algorithms for stock price prediction by conducting a series of experimental simulations using both Classical and Quantum Hardware. Firstly, we extracted various stock price indicators, such as Moving Averages (MA), Average True Range (ATR), and Aroon, to gain insights into market trends and stock price movements. Next, we employed Quantum Annealing (QA) for feature selection and Principal Component Analysis (PCA) for dimensionality reduction. We then transformed the stock price prediction task into a classification problem, trained the Quantum Support Vector Machine (QSVM) to predict price movements (whether up or down), contrasted its performance with that of classical models, and analyzed its accuracy on datasets formulated using Quantum Annealing and PCA individually. We focused on the stock price prediction and binary classification of stock prices for four different companies, namely Apple, Visa, Johnson and Johnson, and Honeywell, primarily using real-time raw stock price data. We compared various Quantum Computing techniques with their classical counterparts in terms of the accuracy and F-score of the prediction model. Through these experimental simulations, we shed light on the potential advantages and limitations of Quantum Algorithms in stock price prediction and contribute to the growing body of knowledge at the intersection of Quantum Computing and Finance.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.13642&r=cmp
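    A classical counterpart to the pipeline above - PCA for dimensionality reduction, then an SVM classifying up/down moves; the quantum components (QA feature selection, QSVM) are replaced by classical analogues and the features are synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(6)
X = rng.normal(size=(400, 12))            # MA, ATR, Aroon, ... indicators
y = (rng.random(400) > 0.5).astype(int)   # 1 = price moved up

X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False)
model = make_pipeline(PCA(n_components=4), SVC(kernel="rbf")).fit(X_tr, y_tr)
print(f1_score(y_te, model.predict(X_te)))
```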
  9. By: Xu Han; Zengqing Wu; Chuan Xiao
    Abstract: Firm competition and collusion involve complex dynamics, particularly when considering communication among firms. Such issues can be modeled as problems of complex systems, traditionally approached through experiments involving human subjects or agent-based modeling methods. We propose an innovative framework called Smart Agent-Based Modeling (SABM), wherein smart agents, supported by GPT-4 technologies, represent firms, and interact with one another. We conducted a controlled experiment to study firm price competition and collusion behaviors under various conditions. SABM is more cost-effective and flexible compared to conducting experiments with human subjects. Smart agents possess an extensive knowledge base for decision-making and exhibit human-like strategic abilities, surpassing traditional ABM agents. Furthermore, smart agents can simulate human conversation and be personalized, making them ideal for studying complex situations involving communication. Our results demonstrate that, in the absence of communication, smart agents consistently reach tacit collusion, leading to prices converging at levels higher than the Bertrand equilibrium price but lower than monopoly or cartel prices. When communication is allowed, smart agents achieve a higher-level collusion with prices close to cartel prices. Collusion forms more quickly with communication, while price convergence is smoother without it. These results indicate that communication enhances trust between firms, encouraging frequent small price deviations to explore opportunities for a higher-level win-win situation and reducing the likelihood of triggering a price war. We also assigned different personas to firms to analyze behavioral differences and tested variant models under diverse market structures. The findings showcase the effectiveness and robustness of SABM and provide intriguing insights into competition and collusion.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.10974&r=cmp
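    A skeleton of the SABM price-setting loop described above. llm_set_price is a hypothetical stand-in for a GPT-4 call; here it follows a simple undercutting heuristic so the sketch runs offline, drifting toward the Bertrand (marginal-cost) price.

```python
def llm_set_price(own_history, rival_history, cost=1.0):
    """Stand-in for an LLM agent's pricing decision."""
    if not rival_history:
        return 2.0                              # arbitrary opening price
    return max(cost, rival_history[-1] - 0.05)  # undercut the rival slightly

prices_a, prices_b = [], []
for period in range(20):
    pa = llm_set_price(prices_a, prices_b)
    pb = llm_set_price(prices_b, prices_a)
    prices_a.append(pa)
    prices_b.append(pb)

print(prices_a[-1], prices_b[-1])  # prices approach marginal cost
```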
  10. By: Tyrel Stokes; Ian Shrier; Russell Steele
    Abstract: Simulation methods are among the most ubiquitous methodological tools in statistical science. In particular, statisticians often use simulation to explore properties of statistical functionals in models for which developed statistical theory is insufficient, or to assess the finite-sample properties of theoretical results. We show that the design of simulation experiments can be viewed from the perspective of causal intervention on a data generating mechanism, and demonstrate the use of causal tools and frameworks in this context. Our perspective is agnostic to the particular domain of the simulation experiment, which increases the potential impact of our proposed approach. In this paper, we consider two illustrative examples. First, we re-examine a predictive machine learning example from a popular textbook designed to assess the relationship between mean function complexity and the mean-squared error. Second, we discuss a traditional causal inference problem, simulating the effect of unmeasured confounding on estimation, specifically to illustrate bias amplification. In both cases, applying causal principles and using graphical models with parameters and distributions as nodes, in the spirit of influence diagrams, can 1) make precise which estimand the simulation targets, 2) suggest modifications to better attain the simulation goals, and 3) provide scaffolding to discuss performance criteria for a particular simulation design.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.10823&r=cmp
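    A small simulation in the spirit of the paper's first example: intervene on the complexity of the fitted mean function (here, the polynomial degree) and record the resulting test MSE.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(7)

def simulate_mse(degree, n=100, reps=200):
    """Average test MSE of a degree-`degree` fit to a fixed mechanism."""
    errs = []
    for _ in range(reps):
        x = rng.uniform(-1, 1, n)
        y = np.sin(3 * x) + rng.normal(0, 0.3, n)  # true data mechanism
        coefs = P.polyfit(x, y, degree)            # fitted mean function
        x_te = rng.uniform(-1, 1, n)
        y_te = np.sin(3 * x_te) + rng.normal(0, 0.3, n)
        errs.append(np.mean((P.polyval(x_te, coefs) - y_te) ** 2))
    return np.mean(errs)

for degree in (1, 3, 9):                           # the "intervention"
    print(degree, simulate_mse(degree))
```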
  11. By: A. Ege Yilmaz; Stefan Stettler; Thomas Ankenbrand; Urs Rhyner
    Abstract: We present explicit oracles designed to be used in Grover's algorithm to match investor preferences. Specifically, the oracles select portfolios with returns and standard deviations exceeding and falling below certain thresholds, respectively. One potential use case for the oracles is selecting portfolios with the best Sharpe ratios. We have implemented these algorithms using quantum simulators.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.13063&r=cmp
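    The classical predicate such an oracle marks, per the abstract: portfolios whose return exceeds and whose standard deviation falls below given thresholds. The circuits themselves are not reproduced, and asset correlations are ignored for simplicity.

```python
import itertools
import numpy as np

rng = np.random.default_rng(8)
mu = rng.normal(0.08, 0.04, 6)       # expected asset returns
sigma = rng.uniform(0.1, 0.3, 6)     # asset standard deviations

def oracle(selection, r_min=0.07, s_max=0.15):
    """True for equally weighted portfolios meeting both thresholds."""
    idx = np.flatnonzero(selection)
    if idx.size == 0:
        return False
    w = 1 / idx.size
    ret = w * mu[idx].sum()
    std = w * np.sqrt((sigma[idx] ** 2).sum())  # independence assumed
    return ret > r_min and std < s_max

marked = [s for s in itertools.product([0, 1], repeat=6) if oracle(np.array(s))]
print(len(marked), "of", 2 ** 6, "portfolios marked")
```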
  12. By: Md Sabbirul Haque; Md Shahedul Amin; Jonayet Miah
    Abstract: Accurate demand forecasting in the retail industry is a critical determinant of financial performance and supply chain efficiency. As global markets become increasingly interconnected, businesses are turning towards advanced prediction models to gain a competitive edge. However, existing literature mostly focuses on historical sales data and ignores the vital influence of macroeconomic conditions on consumer spending behavior. In this study, we bridge this gap by enriching time series data of customer demand with macroeconomic variables, such as the Consumer Price Index (CPI), Index of Consumer Sentiment (ICS), and unemployment rates. Leveraging this comprehensive dataset, we develop and compare various regression and machine learning models to predict retail demand accurately.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.11939&r=cmp
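    A sketch of enriching a demand series with macro covariates and fitting a regression model; all column names and figures are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(9)
dates = pd.date_range("2018-01-01", periods=60, freq="MS")
demand = pd.DataFrame({"date": dates, "units": rng.poisson(500, 60)})
macro = pd.DataFrame({"date": dates,
                      "cpi": np.linspace(250, 300, 60),
                      "ics": rng.normal(90, 5, 60),
                      "unemployment": rng.normal(5, 1, 60)})

df = demand.merge(macro, on="date")       # enrich demand with macro series
df["units_lag1"] = df["units"].shift(1)   # plus a simple sales lag
df = df.dropna()

X = df[["units_lag1", "cpi", "ics", "unemployment"]]
model = GradientBoostingRegressor().fit(X, df["units"])
print(model.predict(X.tail(1)))
```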
  13. By: Reinking, Ernst; Becker, Marco
    Abstract: The introduction of ChatGPT, one of the best-known Large Language Models, has not only opened a new chapter in the public perception of artificial intelligence – some authors even speak of a new era of (business) informatics – it also heralds the fifth industrial revolution (Industry 5.0). The aim of this working paper is not only to separate hype from reality in the context of artificial intelligence, but also to show the opportunities and perspectives for analyzing unstructured, internal company data. To this end, the authors have developed several prototypes based on their own research work, which form the basis of this working paper.
    Keywords: AI, Industry 5.0, Language Model, LLM, ChatGPT, I5.0
    JEL: M15
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:zbw:esprep:275738&r=cmp
  14. By: Rick Steinert; Saskia Altmann
    Abstract: This paper investigates the potential improvement of the GPT-4 Large Language Model (LLM) over BERT for modeling same-day daily stock price movements of Apple and Tesla in 2017, based on sentiment analysis of microblogging messages. We recorded daily adjusted closing prices and translated them into up-down movements. Sentiment for each day was extracted from messages on the Stocktwits platform using both LLMs. We develop a novel method to engineer a comprehensive prompt for contextual sentiment analysis which unlocks the true capabilities of modern LLMs. This enables us to carefully retrieve sentiments, perceived advantages or disadvantages, and their relevance to the analyzed company. Logistic regression is used to evaluate whether the extracted message contents reflect stock price movements. As a result, GPT-4 exhibited substantial accuracy, outperforming BERT in five out of six months and substantially exceeding a naive buy-and-hold strategy, reaching a peak accuracy of 71.47% in May. The study also highlights the importance of prompt engineering in obtaining desired outputs from GPT-4's contextual abilities. However, the costs of deploying GPT-4 and the need for fine-tuning prompts highlight some practical considerations for its use.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.16771&r=cmp
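    A sketch of the evaluation step: regress daily up/down moves on extracted sentiment scores. The sentiment values are synthetic; the prompt-engineered GPT-4 extraction itself is not reproduced.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(10)
sentiment = rng.uniform(-1, 1, size=(250, 1))  # one score per trading day
up_down = (sentiment[:, 0] + rng.normal(0, 0.8, 250) > 0).astype(int)

split = 200                                    # chronological holdout
model = LogisticRegression().fit(sentiment[:split], up_down[:split])
print(accuracy_score(up_down[split:], model.predict(sentiment[split:])))
```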
  15. By: Andres Alonso-Robisco (Banco de España); Jose Manuel Carbo (Banco de España)
    Abstract: Central banks are increasingly using verbal communication for policymaking, focusing not only on traditional monetary policy, but also on a broad set of topics. One such topic is central bank digital currency (CBDC), which is attracting attention from the international community. The complex nature of this project means that it must be carefully designed to avoid unintended consequences, such as financial instability. We propose the use of different Natural Language Processing (NLP) techniques to better understand central banks’ stance towards CBDC, analyzing a set of central bank discourses from 2016 to 2022. We do this using traditional techniques, such as dictionary-based methods, and two large language models (LLMs), namely BERT and ChatGPT, concluding that LLMs better reflect the stance identified by human experts. In particular, we observe that ChatGPT exhibits a higher degree of alignment because it can capture subtler information than BERT. Our study suggests that LLMs are an effective tool to improve sentiment measurements for policy-specific texts, though they are not infallible and may be subject to new risks, such as higher sensitivity to text length and to prompt engineering.
    Keywords: ChatGPT, BERT, CBDC, digital money
    JEL: G15 G41 E58
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:bde:wpaper:2321&r=cmp
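    A minimal dictionary-based scorer of the kind the paper compares against LLMs; the word lists are illustrative, not the authors' dictionaries.

```python
POSITIVE = {"opportunity", "efficient", "innovation", "benefit"}
NEGATIVE = {"risk", "instability", "concern", "threat"}

def dictionary_stance(text):
    """Net positive-minus-negative word share, in [-1, 1]."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if not words else (pos - neg) / len(words)

speech = "CBDC is an opportunity for efficient payments but carries risk"
print(dictionary_stance(speech))
```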
  16. By: Andrés Azqueta-Gavaldón (Banco de España); Marina Diakonova (Banco de España); Corinna Ghirelli (Banco de España); Javier J. Pérez (Banco de España)
    Abstract: In this paper, we build a publicly available database of economic policy uncertainty (EPU) indicators based on the methodology proposed by Azqueta-Gavaldón, Hirschbühl, Onorante and Saiz (2023), which uses topic modelling techniques to identify distinct components of EPU. This database is regularly updated and can be accessed on the Banco de España’s website. Currently, the dataset covers the four largest countries in the euro area, namely Spain, Italy, France, and Germany. Our data coverage is continually expanding to include more euro area countries. Additionally, we compute the aggregated EPU indexes for the euro area. This comprehensive dataset and the resulting euro area indexes provide valuable tools for researchers, policymakers and analysts to assess and monitor the dynamics of economic policy uncertainty in real time.
    Keywords: economic policy uncertainty, euro area, machine learning, Latent Dirichlet Allocation, word embeddings
    JEL: D80 E20 E66 G18
    Date: 2023–07
    URL: http://d.repec.org/n?u=RePEc:bde:opaper:2315&r=cmp
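    A sketch of the topic-modelling step behind such EPU components, using scikit-learn's LDA on toy documents (the paper applies these techniques to large news corpora).

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["uncertainty over fiscal policy and government budget",
        "trade policy tensions raise uncertainty for exporters",
        "monetary policy and interest rate uncertainty persists"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(k, [terms[i] for i in topic.argsort()[-4:]])  # top words per topic
```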
  17. By: Melissa Dell; Jacob Carlson; Tom Bryan; Emily Silcock; Abhishek Arora; Zejiang Shen; Luca D'Amico-Wong; Quan Le; Pablo Querubin; Leander Heldring
    Abstract: Existing full text datasets of U.S. public domain newspapers do not recognize the often complex layouts of newspaper scans, and as a result the digitized content scrambles texts from articles, headlines, captions, advertisements, and other layout regions. OCR quality can also be low. This study develops a novel, deep learning pipeline for extracting full article texts from newspaper images and applies it to the nearly 20 million scans in the Library of Congress's public domain Chronicling America collection. The pipeline includes layout detection, legibility classification, custom OCR, and association of article texts spanning multiple bounding boxes. To achieve high scalability, it is built with efficient architectures designed for mobile phones. The resulting American Stories dataset provides high quality data that could be used for pre-training a large language model to achieve better understanding of historical English and historical world knowledge. The dataset could also be added to the external database of a retrieval-augmented language model to make historical information - ranging from interpretations of political events to minutiae about the lives of people's ancestors - more widely accessible. Furthermore, structured article texts facilitate using transformer-based methods for popular social science applications like topic classification, detection of reproduced content, and news story clustering. Finally, American Stories provides a massive silver quality dataset for innovating multimodal layout analysis models and other multimodal applications.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.12477&r=cmp
  18. By: S. Srinivas; R. Gadela; R. Sabu; A. Das; G. Nath; V. Datla
    Abstract: Predicting the future price of a stock is an arduous task. However, incorporating additional elements, rather than relying solely on a stock's historical price data, can significantly improve our predictions. Studies have demonstrated that investor sentiment, which is impacted by daily news about the company, can have a significant impact on stock price swings. There are numerous sources from which this information can be obtained, but they are cluttered with a lot of noise, making it difficult to extract the sentiments accurately. Hence, the focus of our research is to design an efficient system to capture the sentiments from news about the NIFTY50 stocks and to investigate how much the financial news sentiment of these stocks affects their prices over time. This paper presents a robust data collection and preprocessing framework to create a news database covering a timeline of around 3.7 years and consisting of almost half a million news articles. We also capture the stock price information for this timeline and create multiple time series datasets that include the sentiment scores from various sections of each article, calculated using different sentiment libraries. Based on this, we fit several LSTM models to forecast the stock prices, with and without the sentiment scores as features, and compare their performances.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.08549&r=cmp
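    A sketch of the LSTM-with-sentiment setup, assuming Keras and synthetic windows; shapes and hyperparameters are illustrative only.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(11)
n, window = 500, 20
# each timestep carries [price, sentiment_score]
X = rng.normal(size=(n, window, 2)).astype("float32")
y = rng.normal(size=(n, 1)).astype("float32")     # next-day price (scaled)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, 2)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
print(model.predict(X[:1], verbose=0))
```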

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.