nep-big New Economics Papers
on Big Data
Issue of 2023‒05‒29
thirty papers chosen by
Tom Coupé
University of Canterbury

  1. The impact of the AI revolution on asset management By Michael Kopp
  2. Hedonic Prices and Quality Adjusted Price Indices Powered by AI By Patrick Bajari; Zhihao Cen; Victor Chernozhukov; Manoj Manukonda; Suhas Vijaykumar; Jin Wang; Ramon Huerta; Junbo Li; Ling Leng; George Monokroussos; Shan Wan
  3. “Making Text Talk”: The Minutes of the Central Bank of Brazil and the Real Economy By Carlos Moreno Pérez; Marco Minozzo
  4. Deep learning techniques for financial time series forecasting: A review of recent advancements: 2020-2022 By Cheng Zhang; Nilam Nur Amir Sjarif; Roslina Binti Ibrahim
  5. UQ for Credit Risk Management: A deep evidence regression approach By Ashish Dhiman
  6. Identifying Financial Crises Using Machine Learning on Textual Data By Mary Chen; Matthew DeHaven; Isabel Kitschelt; Seung Jung Lee; Martin Sicilian
  7. Maximally Machine-Learnable Portfolios By Philippe Goulet Coulombe; Maximilian Gobel
  8. Stock Price Predictability and the Business Cycle via Machine Learning By Li Rong Wang; Hsuan Fu; Xiuyi Fan
  9. Random neural networks for rough volatility By Antoine Jacquier; Zan Zuric
  10. LSTM based Anomaly Detection in Time Series for United States exports and imports By Aggarwal, Sakshi
  11. Learning Volatility Surfaces using Generative Adversarial Networks By Andrew Na; Meixin Zhang; Justin Wan
  12. Optimum Output Long Short-Term Memory Cell for High-Frequency Trading Forecasting By Adamantios Ntakaris; Moncef Gabbouj; Juho Kanniainen
  13. Big data, news diversity and financial market crash By Sabri Boubaker; Zhenya Liu; Ling Zhai
  14. TM-vector: A Novel Forecasting Approach for Market stock movement with a Rich Representation of Twitter and Market data By Faraz Sasani; Ramin Mousa; Ali Karkehabadi; Samin Dehbashi; Ali Mohammadi
  15. Augmented balancing weights as linear regression By David Bruns-Smith; Oliver Dukes; Avi Feller; Elizabeth L. Ogburn
  16. Assessing Text Mining and Technical Analyses on Forecasting Financial Time Series By Ali Lashgari
  17. Construction and Analysis of Uncertainty Indices based on Multilingual Text Representations By Viktoriia Naboka-Krell
  18. "Solving Kolmogorov PDEs without the curse of dimensionality via deep learning and asymptotic expansion with Malliavin calculus" By Akihiko Takahashi; Toshihiro Yamada
  19. Estimating Input Coefficients for Regional Input-Output Tables Using Deep Learning with Mixup By Shogo Fukui
  20. Deep Stock: training and trading scheme using deep learning By Sungwoo Kang
  21. Generative modeling for time series via Schrödinger bridge By Mohamed Hamdouche; Pierre Henry-Labordere; Huyên Pham
  22. Conditional Generative Models for Learning Stochastic Processes By Salvatore Certo; Anh Pham; Nicolas Robles; Andrew Vlasic
  23. Solving Kolmogorov PDEs without the curse of dimensionality via deep learning and asymptotic expansion with Malliavin calculus (Forthcoming in "Partial Differential Equations and Applications")(Revised version of CARF-F-547) By Akihiko Takahashi; Toshihiro Yamada
  24. The Unintended Consequences of Censoring Digital Technology -- Evidence from Italy's ChatGPT Ban By David H. Kreitmeir; Paul A. Raschky
  25. The economic impact of conflict-related and policy uncertainty shocks: the case of Russia By Marina Diakonova; Corinna Ghirelli; Javier J. Pérez; Luis Molina
  26. The information content of conflict, social unrest and policy uncertainty measures for macroeconomic forecasting By Marina Diakonova; Luis Molina; Hannes Mueller; Javier J. Pérez; Cristopher Rauh
  27. A Multi-method Approach to Analyze Australia-China Geopolitical Discourse on YouTube By Adeliyi, Oluwaseyi; Adesoba, Adeola
  28. Effects of Information Overload on Financial Markets: How Much Is Too Much? By Alejandro Bernales; Marcela Valenzuela; Ilknur Zer
  29. Financial Hedging and Risk Compression, A journey from linear regression to neural network By Ali Shirazi; Fereshteh Sadeghi Naieni Fard
  30. Using newspapers for textual indicators: which and how many? By Erik Andres-Escayola; Corinna Ghirelli; Luis Molina; Javier J. Pérez; Elena Vidal

  1. By: Michael Kopp
    Abstract: Recent progress in deep learning, a special form of machine learning, has led to remarkable capabilities machines can now be endowed with: they can read and understand free-flowing text, reason and bargain with human counterparts, translate texts between languages, learn how to make decisions that maximize certain outcomes, etc. Today, machines have revolutionized the detection of cancer, the prediction of protein structures, the design of drugs, the control of nuclear fusion reactors, etc. Although these capabilities are still in their infancy, it seems clear that their continued refinement and application will result in a technological impact on nearly all social and economic areas of human activity, the likes of which we have not seen before. In this article, I share my view of how AI will likely impact asset management in general, and I provide a mental framework that equips readers with a simple criterion to assess whether, and to what degree, a given fund really exploits deep learning and whether a large disruption risk from deep learning exists.
    Date: 2023–04
  2. By: Patrick Bajari; Zhihao Cen; Victor Chernozhukov; Manoj Manukonda; Suhas Vijaykumar; Jin Wang; Ramon Huerta; Junbo Li; Ling Leng; George Monokroussos; Shan Wan
    Abstract: Accurate, real-time measurements of price index changes using electronic records are essential for tracking inflation and productivity in today's economic environment. We develop empirical hedonic models that can process large amounts of unstructured product data (text, images, prices, quantities) and output accurate hedonic price estimates and derived indices. To accomplish this, we generate abstract product attributes, or ``features'', from text descriptions and images using deep neural networks, and then use these attributes to estimate the hedonic price function. Specifically, we convert textual information about the product to numeric features using large language models based on transformers, trained or fine-tuned using product descriptions, and convert the product image to numeric features using a residual network model. To produce the estimated hedonic price function, we use a multi-task neural network trained to predict a product's price in all time periods simultaneously. To demonstrate the performance of this approach, we apply the models to Amazon's data for first-party apparel sales and estimate hedonic prices. The resulting models have high predictive accuracy, with $R^2$ ranging from $80\%$ to $90\%$. Finally, we construct the AI-based hedonic Fisher price index, chained at the year-over-year frequency. We contrast the index with the CPI and other electronic indices.
    Date: 2023–04
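The chained Fisher index mentioned in the abstract is the geometric mean of the Laspeyres and Paasche indices. A minimal sketch of the one-period formula, with toy prices and quantities (hedonic prices would stand in for observed ones; this is not the paper's Amazon data):

```python
from math import sqrt

def fisher_index(p0, p1, q0, q1):
    """Fisher ideal index: geometric mean of the Laspeyres and Paasche indices."""
    laspeyres = sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0))
    paasche = sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1))
    return sqrt(laspeyres * paasche)

# Two products; p are (hedonic) prices and q quantities in periods 0 and 1.
p0, p1 = [10.0, 20.0], [11.0, 19.0]
q0, q1 = [100, 50], [90, 60]
index = fisher_index(p0, p1, q0, q1)
```

Chaining at the year-over-year frequency, as in the abstract, multiplies such one-period indices together.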
  3. By: Carlos Moreno Pérez (Banco de España); Marco Minozzo (University of Verona)
    Abstract: This paper investigates the relationship between the views expressed in the minutes of the meetings of the Central Bank of Brazil’s Monetary Policy Committee (COPOM) and the real economy. It applies various computational linguistic machine learning algorithms to construct measures of the minutes of the COPOM. First, we create measures of the content of the paragraphs of the minutes using Latent Dirichlet Allocation (LDA). Second, we build an uncertainty index for the minutes using Word Embedding and K-Means. Then, we combine these indices to create two topic-uncertainty indices. The first one is constructed from paragraphs with a higher probability of topics related to “general economic conditions”. The second topic-uncertainty index is constructed from paragraphs that have a higher probability of topics related to “inflation” and the “monetary policy discussion”. Finally, we employ a structural VAR model to explore the lasting effects of these uncertainty indices on certain Brazilian macroeconomic variables. Our results show that greater uncertainty leads to a decline in inflation, the exchange rate, industrial production and retail trade in the period from January 2000 to July 2019.
    Keywords: Central Bank of Brazil, monetary policy communication, Latent Dirichlet Allocation, monetary policy uncertainty, Structural Vector Autoregressive model, Word Embedding
    JEL: C32 C45 D83 E52
    Date: 2022–11
  4. By: Cheng Zhang; Nilam Nur Amir Sjarif; Roslina Binti Ibrahim
    Abstract: Forecasting financial time series has long been a challenging problem that has attracted attention from both researchers and practitioners. Statistical and machine learning techniques have both been explored to develop effective forecasting models in the past few decades. With recent developments in deep learning models, financial time series forecasting models have advanced significantly, and these developments are often difficult to keep up with. Hence, we have conducted this literature review to provide a comprehensive assessment of recent research from 2020 to 2022 on deep learning models used to predict prices based on financial time series. Our review presents different data sources and neural network structures, as well as their implementation details. Our goals are to ensure that interested researchers remain up-to-date on recent developments in the field and facilitate the selection of baselines based on models used in prior studies. Additionally, we provide suggestions for future research based on the content in this review.
    Date: 2023–04
  5. By: Ashish Dhiman
    Abstract: Machine learning has invariably found its way into various credit risk applications. Due to the intrinsic nature of credit risk, quantifying the uncertainty of the predicted risk metrics is essential, and applying uncertainty-aware deep learning models to credit risk settings can be very helpful. In this work, we explore the application of a scalable UQ-aware deep learning technique, Deep Evidence Regression, and apply it to predicting Loss Given Default. We contribute to the literature by extending the Deep Evidence Regression methodology to learning target variables generated by a Weibull process and provide the relevant learning framework. We demonstrate the application of our approach to both simulated and real-world data.
    Date: 2023–05
  6. By: Mary Chen; Matthew DeHaven; Isabel Kitschelt; Seung Jung Lee; Martin Sicilian
    Abstract: We use machine learning techniques on textual data to identify financial crises. The onset of a crisis and its duration have implications for real economic activity, and as such can be valuable inputs into macroprudential, monetary, and fiscal policy. The academic literature and the policy realm rely mostly on expert judgment to determine crises, often with a lag. Consequently, crisis durations and the buildup phases of vulnerabilities are usually determined only with the benefit of hindsight. Although we can identify and forecast a portion of crises worldwide to various degrees with traditional econometric techniques and using readily available market data, we find that textual data helps in reducing false positives and false negatives in out-of-sample testing of such models, especially when the crises are considered more severe. Building a framework that is consistent across countries and in real time can benefit policymakers around the world, especially when international coordination is required across different government policies.
    Keywords: Financial crises; Machine learning; Natural language processing
    JEL: C53 C55 G01
    Date: 2023–03–31
  7. By: Philippe Goulet Coulombe (University of Quebec in Montreal); Maximilian Gobel (Bocconi University)
    Abstract: When it comes to stock returns, any form of predictability can bolster risk-adjusted profitability. We develop a collaborative machine learning algorithm that optimizes portfolio weights so that the resulting synthetic security is maximally predictable. Precisely, we introduce MACE, a multivariate extension of Alternating Conditional Expectations that achieves the aforementioned goal by wielding a Random Forest on one side of the equation, and a constrained Ridge Regression on the other. There are two key improvements with respect to Lo and MacKinlay’s original maximally predictable portfolio approach. First, it accommodates any (nonlinear) forecasting algorithm and predictor set. Second, it handles large portfolios. We conduct exercises at the daily and monthly frequency and report significant increases in predictability and profitability using very little conditioning information. Interestingly, predictability is found in bad as well as good times, and MACE successfully navigates the debacle of 2022.
    Date: 2023–04
  8. By: Li Rong Wang; Hsuan Fu; Xiuyi Fan
    Abstract: We study the impacts of business cycles on machine learning (ML) predictions. Using the S&P 500 index, we find that ML models perform worse during most recessions, and the inclusion of recession history or the risk-free rate does not necessarily improve their performance. Investigating recessions where models perform well, we find that they exhibit lower market volatility than other recessions. This implies that the improved performance is not due to the merit of ML methods but rather factors such as effective monetary policies that stabilized the market. We recommend that ML practitioners evaluate their models during both recessions and expansions.
    Date: 2023–04
  9. By: Antoine Jacquier; Zan Zuric
    Abstract: We construct a deep learning-based numerical algorithm to solve path-dependent partial differential equations arising in the context of rough volatility. Our approach is based on interpreting the PDE as a solution to an SPDE, building upon recent insights by Bayer, Qiu and Yao, and on constructing a neural network of reservoir type as originally developed by Gonon, Grigoryeva, Ortega. The reservoir approach allows us to formulate the optimisation problem as a simple least-square regression for which we prove theoretical convergence properties.
    Date: 2023–05
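The reservoir networks referenced above (in the sense of Gonon, Grigoryeva and Ortega) fix the hidden layer at random and train only a linear readout, which is why the optimisation collapses to a least-squares regression. A minimal plain-Python sketch of that idea, on a hypothetical toy regression task rather than the paper's PDE setting:

```python
import math
import random

random.seed(0)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

H = 8  # reservoir size: hidden weights are drawn once and never trained
W = [[random.gauss(0, 1) for _ in range(2)] for _ in range(H)]
b0 = [random.gauss(0, 1) for _ in range(H)]

def features(x):
    """Fixed random tanh features (the 'reservoir')."""
    return [math.tanh(W[h][0] * x[0] + W[h][1] * x[1] + b0[h]) for h in range(H)]

# Toy supervised task (an illustrative target, not the paper's PDE problem).
data = []
for _ in range(200):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, x[0] * x[1]))

# Only the linear readout is trained: a ridge-regularized least-squares fit.
Phi = [features(x) for x, _ in data]
y = [t for _, t in data]
lam = 1e-6
G = [[sum(Phi[i][p] * Phi[i][q] for i in range(len(Phi))) + (lam if p == q else 0.0)
      for q in range(H)] for p in range(H)]
rhs = [sum(Phi[i][p] * y[i] for i in range(len(Phi))) for p in range(H)]
beta = solve(G, rhs)

def predict(x):
    return sum(w * f for w, f in zip(beta, features(x)))
```

Because only `beta` is fitted, training is a single linear solve, which is what makes the theoretical convergence analysis tractable.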
  10. By: Aggarwal, Sakshi
    Abstract: This survey aims to offer a thorough and organized overview of research on anomaly detection, a significant problem that has been studied in various fields and application areas. Some anomaly detection techniques have been tailored for specific domains, while others are more general. Anomaly detection involves identifying unusual patterns or events in a dataset, which is important for a wide range of applications including fraud detection and medical diagnosis. Little research on anomaly detection techniques has been conducted in the fields of economics and international trade. Therefore, this study analyzes the time-series data of United States exports and imports for the period 1992–2022 using an LSTM-based anomaly detection algorithm. Deep learning models, particularly LSTM networks, are becoming increasingly popular in anomaly detection tasks due to their ability to learn complex patterns in sequential data. This paper presents a detailed explanation of the LSTM architecture, including the role of the input, forget, and output gates in processing input vectors and hidden states at each timestep. The LSTM-based anomaly detection approach yields promising results by modelling short-term as well as long-term temporal dependencies.
    Keywords: Anomaly detection, LSTM, Machine learning, Artificial intelligence, economic trade
    JEL: C54 F13 F15
    Date: 2023–04–25
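A common way to turn a forecaster into an anomaly detector, as in the LSTM approach described above, is to flag observations whose forecast residual falls far outside the distribution of past residuals. The sketch below substitutes a rolling-mean forecast for the LSTM (an assumption made for brevity) on a synthetic trade-like series:

```python
import random
from statistics import mean, stdev

random.seed(7)

def detect_anomalies(series, window=12, k=3.0):
    """Flag observations whose one-step forecast residual lies more than
    k standard deviations from the mean of past residuals. A rolling-mean
    forecast stands in for the LSTM forecaster here."""
    residuals, flags = [], []
    for t in range(window, len(series)):
        forecast = mean(series[t - window:t])
        resid = series[t] - forecast
        if len(residuals) >= window:
            mu, sd = mean(residuals), stdev(residuals)
            if abs(resid - mu) > k * sd:
                flags.append(t)
        residuals.append(resid)
    return flags

# Synthetic trending "trade" series with one injected shock at index 30.
series = [100 + 0.5 * t + random.gauss(0, 1) for t in range(48)]
series[30] += 40
anomalies = detect_anomalies(series)
```

The same residual-thresholding logic applies unchanged when the rolling mean is replaced by an LSTM's one-step-ahead prediction.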
  11. By: Andrew Na; Meixin Zhang; Justin Wan
    Abstract: In this paper, we propose a generative adversarial network (GAN) approach for efficiently computing volatility surfaces. The idea is to make use of the special GAN neural architecture so that, on one hand, we can learn volatility surfaces from training data and, on the other hand, enforce no-arbitrage conditions. In particular, the generator network is assisted in training by a discriminator that evaluates whether the generated volatility matches the target distribution. Meanwhile, our framework trains the GAN network to satisfy the no-arbitrage constraints by introducing penalties as regularization terms. The proposed GAN model allows the use of shallow networks, which results in much lower computational cost. In our experiments, we demonstrate the performance of the proposed method by comparing it with state-of-the-art methods for computing implied and local volatility surfaces. We show that our GAN model can outperform artificial neural network (ANN) approaches in terms of accuracy and computational time.
    Date: 2023–04
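One standard no-arbitrage condition that lends itself to a penalty term of the kind the abstract describes is calendar-spread consistency: total implied variance w(k, T) = sigma(k, T)^2 * T must be non-decreasing in maturity T at each fixed strike. A minimal sketch of such a penalty (the grid values are illustrative, and the paper's actual constraint set may differ):

```python
def calendar_arbitrage_penalty(total_variance):
    """Penalty for calendar-spread arbitrage: total implied variance must be
    non-decreasing along the maturity axis. `total_variance[i][j]` is w at
    strike k_i and maturity T_j; each decrease in maturity is penalized.
    As a GAN regularizer, this term would be added to the generator loss."""
    penalty = 0.0
    for row in total_variance:  # fixed strike, maturities in increasing order
        for w_short, w_long in zip(row, row[1:]):
            penalty += max(0.0, w_short - w_long)
    return penalty

# Arbitrage-free grid (variance grows with maturity) -> zero penalty.
good = [[0.04, 0.08, 0.12], [0.05, 0.09, 0.13]]
# One calendar violation at the second strike -> positive penalty.
bad = [[0.04, 0.08, 0.12], [0.05, 0.03, 0.13]]
```

Because the penalty is piecewise linear in the surface values, it is differentiable almost everywhere and can be backpropagated through during GAN training.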
  12. By: Adamantios Ntakaris; Moncef Gabbouj; Juho Kanniainen
    Abstract: High-frequency trading requires fast data processing without information lags for precise stock price forecasting. This high-paced forecasting is usually based on vectors that need to be treated as sequential and time-independent signals due to the time irregularities that are inherent in high-frequency trading. A well-documented and tested method that accounts for these time irregularities is a type of recurrent neural network named the long short-term memory (LSTM) neural network. This type of neural network is formed of cells that perform sequential and stale calculations via gates and states, without knowing whether their order within the cell is optimal. In this paper, we propose a revised and real-time adjusted long short-term memory cell that selects the best gate or state as its final output. Our cell runs under a shallow topology, has a minimal look-back period, and is trained online. This revised cell achieves lower forecasting error than other recurrent neural networks on online high-frequency trading forecasting tasks, such as limit order book mid-price prediction, as tested on two highly liquid US stocks and two less liquid Nordic stocks.
    Date: 2023–04
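For readers unfamiliar with the cell internals being reordered here, a scalar LSTM step with its input, forget and output gates can be written out directly. The weights below are arbitrary placeholders, not a trained model:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_step(x, h_prev, c_prev, p):
    """One timestep of a scalar LSTM cell, with the standard gate order:
    forget gate -> input gate -> cell-state update -> output gate."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate state
    c = f * c_prev + i * g                                   # new cell state
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    h = o * math.tanh(c)                                     # new hidden state
    return h, c, {"forget": f, "input": i, "output": o, "candidate": g}

# Placeholder parameters, all set to 0.5 purely for illustration.
params = {k: 0.5 for k in
          ("wf", "uf", "bf", "wi", "ui", "bi", "wg", "ug", "bg", "wo", "uo", "bo")}
h, c, gates = lstm_cell_step(x=1.0, h_prev=0.0, c_prev=0.0, p=params)
```

The paper's proposal amounts to selecting among these intermediate gate and state values for the cell's final output rather than always emitting `h`.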
  13. By: Sabri Boubaker (Métis Lab EM Normandie - EM Normandie - École de Management de Normandie, VNU - Vietnam National University [Hanoï]); Zhenya Liu (CERGAM - Centre d'Études et de Recherche en Gestion d'Aix-Marseille - AMU - Aix Marseille Université - UTLN - Université de Toulon, Renmin University of China); Ling Zhai (Renmin University of China)
    Abstract: A vast quantity of high-dimensional, unstructured textual news data is produced every day, more than two decades after the launch of the global Internet. These big data have a significant influence on the way that decisions are made in business and finance, due to the cost, scalability, and transparency benefits that they bring. However, few studies have fully exploited big data to analyze changes in news diversity or to predict financial market movements, specifically stock market crashes. Based on modern methods of textual analysis, this paper investigates the relationship between news diversity and financial market crashes by applying the change-point detection approach. The empirical analysis shows that (1) big data is a relatively new and useful tool for assessing financial market movements; (2) there is a relationship between news diversity and financial market movements: news diversity tends to decline when the market falls and volatility soars, and increases when the market is on an upward trend and in recovery; and (3) the multiple structural breaks detected improve the ability to forecast stock price movements. Therefore, changes in news diversity, embedded in big data, can be a useful indicator of financial market crashes and recoveries.
    Keywords: Big data, News diversity, Textual analysis, Change-point, Financial crisis
    Date: 2021–07
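A minimal single-break version of the change-point idea is to scan for the split that maximizes the between-segment mean difference, weighted by segment sizes. The paper detects multiple structural breaks; this sketch handles one break, on toy data standing in for a news-diversity series:

```python
from statistics import mean

def cusum_changepoint(series):
    """Locate a single mean shift: the index t maximizing the squared
    difference between left and right sample means, weighted by
    segment sizes (a least-squares single-break criterion)."""
    n = len(series)
    best_t, best_score = None, -1.0
    for t in range(2, n - 1):
        left, right = series[:t], series[t:]
        score = len(left) * len(right) / n * (mean(left) - mean(right)) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Toy "news diversity" level that drops at index 60 (a crash-like regime shift).
series = [1.0] * 60 + [0.4] * 40
cp = cusum_changepoint(series)
```

Multiple-break procedures apply this kind of criterion recursively or over all admissible segmentations.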
  14. By: Faraz Sasani; Ramin Mousa; Ali Karkehabadi; Samin Dehbashi; Ali Mohammadi
    Abstract: Stock market forecasting has been a challenging problem for many analysts and researchers. Trend analysis, statistical techniques, and movement indicators have traditionally been used to predict stock price movements, but text extraction has emerged as a promising method in recent years. The use of neural networks, especially recurrent neural networks, is abundant in the literature. In most studies, the impact of different users was considered equal or ignored, whereas different users can have different effects. In the current study, we introduce TM-vector and then use this vector to train an IndRNN and ultimately model the behaviour of market users. In the proposed model, TM-vector is trained simultaneously on both the extracted Twitter features and market information. Various factors are used to improve the effectiveness of the proposed forecasting approach, including the characteristics of each individual user, their impact on each other, and their impact on the market, in order to predict market direction more accurately. The Dow Jones 30 index is used in the current work. The accuracy obtained for predicting the daily stock price changes of Apple across the various models is close to 95\%, and the results for the other stocks are also significant. Our results indicate the effectiveness of TM-vector in predicting stock market direction.
    Date: 2023–03
  15. By: David Bruns-Smith; Oliver Dukes; Avi Feller; Elizabeth L. Ogburn
    Abstract: We provide a novel characterization of augmented balancing weights, also known as Automatic Debiased Machine Learning (AutoDML). These estimators combine outcome modeling with balancing weights, which estimate inverse propensity score weights directly. When the outcome and weighting models are both linear in some (possibly infinite) basis, we show that the augmented estimator is equivalent to a single linear model with coefficients that combine the original outcome model coefficients and OLS; in many settings, the augmented estimator collapses to OLS alone. We then extend these results to specific choices of outcome and weighting models. We first show that the combined estimator that uses (kernel) ridge regression for both outcome and weighting models is equivalent to a single, undersmoothed (kernel) ridge regression; this also holds when considering asymptotic rates. When the weighting model is instead lasso regression, we give closed-form expressions for special cases and demonstrate a ``double selection'' property. Finally, we generalize these results to linear estimands via the Riesz representer. Our framework ``opens the black box'' on these increasingly popular estimators and provides important insights into estimation choices for augmented balancing weights.
    Date: 2023–04
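The role of the ridge penalty in the "undersmoothed ridge" characterization is easy to see in the one-regressor case, where the coefficient has a closed form and shrinks monotonically in the penalty. A toy illustration of that shrinkage (not the authors' estimator, just the closed form it builds on):

```python
def ridge_1d(x, y, lam):
    """Closed-form ridge coefficient for a single centered regressor:
    beta = sum(x * y) / (sum(x^2) + lam)."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.0, 2.1, 3.9]      # roughly y = 2x
beta_ols = ridge_1d(x, y, 0.0)       # no shrinkage: plain OLS
beta_smooth = ridge_1d(x, y, 10.0)   # heavy shrinkage toward zero
beta_under = ridge_1d(x, y, 0.1)     # "undersmoothed": close to OLS
```

Undersmoothing corresponds to choosing a smaller penalty than prediction-optimal, keeping the coefficient nearer to OLS; the paper shows the combined outcome-plus-weighting estimator lands at exactly such an undersmoothed fit.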
  16. By: Ali Lashgari
    Abstract: Forecasting financial time series (FTS) is an essential field in finance and economics that anticipates market movements. This paper investigates the accuracy of text mining and technical analyses in forecasting financial time series. It focuses on the S&P500 stock market index during the pandemic, which tracks the performance of the largest publicly traded companies in the US. The study compares two methods of forecasting the future price of the S&P500: text mining, which uses NLP techniques to extract meaningful insights from financial news, and technical analysis, which uses historical price and volume data to make predictions. The study examines the advantages and limitations of both methods and analyzes their performance in predicting the S&P500. The FinBERT model outperforms other models in terms of S&P500 price prediction, as evidenced by its lower RMSE value, and has the potential to revolutionize financial analysis and prediction using financial news data.
    Keywords: ARIMA, BERT, FinBERT, Forecasting Financial Time Series, GARCH, LSTM, Technical Analysis, Text Mining
    JEL: G4 C8
    Date: 2023–04
  17. By: Viktoriia Naboka-Krell (University Giessen)
    Abstract: The work by Baker et al. (2016), who propose a dictionary-based method and estimate the level of economic policy uncertainty (EPU) based on the occurrence of specific terms in ten leading US newspapers, is among the first to demonstrate the potential of text data in economic research. Following this line of research, this paper proposes automated approaches to the construction of EPU indices for different countries based on newspaper texts. First, multilingual fastText word embeddings and BERT text embeddings are used to define relevant EPU key words and EPU-related articles, respectively. Further, multilingual contextualized topic modeling introduced by Bianchi et al. (2021) is performed and EPU-related topics are detected. It is shown that the constructed EPU indices based on fastText embeddings Granger-cause economic activity in all of the considered countries, namely Germany, Russia and Ukraine. Some of the topics uncovered by multilingual contextualized topic modeling have also been shown to Granger-cause economic activity in all of the considered countries.
    Keywords: text-as-data, fastText embeddings, BERT, economic policy uncertainty, natural language processing
    Date: 2023
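The Baker et al. (2016) starting point referenced above counts articles containing at least one term from each of three categories (economy, policy, uncertainty). A minimal sketch of that count, with illustrative term lists rather than the dictionaries or embeddings used in the paper:

```python
def epu_index(articles, economy_terms, policy_terms, uncertainty_terms):
    """Share of articles containing at least one term from each of the
    three EPU categories, scaled to 100. A simplified Baker-et-al.-style
    count; real indices also normalize by newspaper volume over time."""
    def hits(text, terms):
        text = text.lower()
        return any(term in text for term in terms)

    n = sum(1 for a in articles
            if hits(a, economy_terms) and hits(a, policy_terms)
            and hits(a, uncertainty_terms))
    return 100.0 * n / len(articles)

articles = [
    "Uncertainty over fiscal policy weighs on the economy",
    "The economy grew strongly last quarter",
    "New regulation announced; outlook uncertain for economic recovery",
    "Sports roundup: the local team wins again",
]
epu = epu_index(articles,
                economy_terms=["econom"],
                policy_terms=["policy", "regulation", "fiscal"],
                uncertainty_terms=["uncertain"])
```

The paper's contribution is to replace the hand-curated term lists with multilingual embeddings, so that the category matching generalizes across languages.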
  18. By: Akihiko Takahashi (Faculty of Economics, The University of Tokyo); Toshihiro Yamada (Graduate School of Economics, Hitotsubashi University and Japan Science and Technology Agency)
    Abstract: This paper proposes a new spatial approximation method without the curse of dimensionality for solving high-dimensional partial differential equations (PDEs) by using an asymptotic expansion method with a deep learning-based algorithm. In particular, the mathematical justification of the spatial approximation is provided. Numerical examples for high-dimensional Kolmogorov PDEs show the effectiveness of our method.
    Date: 2023–04
  19. By: Shogo Fukui
    Abstract: An input-output table is an important source of data for analyzing the economic situation of a region. Generally, the input-output table for each region in Japan (the regional input-output table) is not always publicly available, so it must be estimated. In particular, various methods have been developed for estimating input coefficients, an important part of the input-output table. Currently, non-survey methods are often used to estimate input coefficients because they require less data and computation, but these methods have problems, such as discarding information and requiring additional data for estimation. In this study, input coefficients are estimated by approximating their generation process with an artificial neural network (ANN), to mitigate the problems of the non-survey methods and to estimate the input coefficients with higher precision. To avoid over-fitting due to the small amount of data used, a data augmentation technique called mixup is introduced to increase the data size by generating virtual regions through region composition and scaling. By comparing the estimated input coefficients with those of Japan as a whole, it is shown that the accuracy of this method is higher and more stable than that of the conventional non-survey methods. In addition, the estimated input coefficients for three Japanese cities are generally close to the published values for each city.
    Date: 2023–05
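Mixup, as used above to generate virtual regions, forms convex combinations of pairs of training examples with a Beta-distributed mixing weight. A minimal sketch (the feature and target values are toy placeholders, not regional input-output data):

```python
import random

random.seed(0)

def mixup(x1, y1, x2, y2, alpha=0.2):
    """mixup augmentation: a convex combination of two training examples,
    with the mixing weight drawn from Beta(alpha, alpha)."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Two "regions": input features and their input-coefficient targets (toy values).
xa, ya = [1.0, 0.0, 3.0], [0.2, 0.5]
xb, yb = [0.0, 2.0, 1.0], [0.4, 0.1]
x_virtual, y_virtual, lam = mixup(xa, ya, xb, yb)
```

With a small `alpha`, the Beta(alpha, alpha) weight concentrates near 0 and 1, so most virtual regions stay close to one of the originals while still smoothing the training distribution.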
  20. By: Sungwoo Kang
    Abstract: Despite the efficient market hypothesis, many studies suggest the existence of inefficiencies in the stock market, leading to the development of techniques to gain above-market returns, known as alpha. Systematic trading has undergone significant advances in recent decades, with deep learning emerging as a powerful tool for analyzing and predicting market behavior. In this paper, we propose a model, inspired by how professional traders operate, that looks at stock prices of the previous 600 days and predicts whether the stock price rises or falls by a certain percentage within the next D days. Our model, called DeepStock, uses ResNet's skip connections and logits to increase the probability of a model in a trading scheme. We test our model on both the Korean and US stock markets and achieve a profit of N\% on the Korean market, which is M\% above the market return, and a profit of A\% on the US market, which is B\% above the market return.
    Date: 2023–04
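The prediction target described above, whether the price rises by a certain percentage within the next D days, corresponds to a simple labeling rule. A sketch of that rule (the threshold and horizon are illustrative choices, not the paper's settings):

```python
def label_rise(prices, t, horizon, threshold):
    """1 if the price rises by at least `threshold` (as a fraction) at any
    point within the next `horizon` days after day t, else 0."""
    base = prices[t]
    future = prices[t + 1:t + 1 + horizon]
    return int(any(p >= base * (1 + threshold) for p in future))

# Toy price series; label the first five days with a 3% / 2-day rule.
prices = [100, 101, 99, 104, 103, 108, 107]
labels = [label_rise(prices, t, horizon=2, threshold=0.03) for t in range(5)]
```

A classifier trained on such labels outputs the rise/fall probabilities that the trading scheme in the abstract then acts on.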
  21. By: Mohamed Hamdouche (LPSM (UMR_8001) - Laboratoire de Probabilités, Statistique et Modélisation - SU - Sorbonne Université - CNRS - Centre National de la Recherche Scientifique - UPCité - Université Paris Cité); Pierre Henry-Labordere (Qube RT); Huyên Pham (LPSM (UMR_8001) - Laboratoire de Probabilités, Statistique et Modélisation - SU - Sorbonne Université - CNRS - Centre National de la Recherche Scientifique - UPCité - Université Paris Cité)
    Abstract: We propose a novel generative model for time series based on Schrödinger bridge (SB) approach. This consists in the entropic interpolation via optimal transport between a reference probability measure on path space and a target measure consistent with the joint data distribution of the time series. The solution is characterized by a stochastic differential equation on finite horizon with a path-dependent drift function, hence respecting the temporal dynamics of the time series distribution. We can estimate the drift function from data samples either by kernel regression methods or with LSTM neural networks, and the simulation of the SB diffusion yields new synthetic data samples of the time series. The performance of our generative model is evaluated through a series of numerical experiments. First, we test with a toy autoregressive model, a GARCH Model, and the example of fractional Brownian motion, and measure the accuracy of our algorithm with marginal and temporal dependencies metrics. Next, we use our SB generated synthetic samples for the application to deep hedging on real-data sets. Finally, we illustrate the SB approach for generating sequence of images.
    Keywords: generative models, time series, Schrödinger bridge, kernel estimation, deep hedging
    Date: 2023–04–07
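The kernel-regression route to estimating the drift from data samples can be illustrated in a simplified Markovian setting. The paper's Schrödinger bridge drift is path-dependent; this sketch estimates a state-dependent drift with a Nadaraya-Watson regression of increments on states, then simulates with an Euler scheme:

```python
import math
import random

random.seed(3)

def nw_drift(x_obs, dx_obs, dt, bandwidth=0.3):
    """Nadaraya-Watson kernel regression of increments on states:
    b(x) ~ E[dX | X = x] / dt, with a Gaussian kernel."""
    def b(x):
        weights = [math.exp(-((x - xi) / bandwidth) ** 2) for xi in x_obs]
        total = sum(weights)
        return sum(w * d for w, d in zip(weights, dx_obs)) / (total * dt)
    return b

# Observed increments from a mean-reverting toy process dX = -X dt + sigma dW.
dt, sigma = 0.1, 0.05
x_obs, dx_obs = [], []
for _ in range(200):
    x = random.uniform(-2, 2)
    x_obs.append(x)
    dx_obs.append(-x * dt + sigma * math.sqrt(dt) * random.gauss(0, 1))

b = nw_drift(x_obs, dx_obs, dt)

# Euler scheme: generate a synthetic path under the estimated drift.
path = [1.5]
for _ in range(30):
    x = path[-1]
    path.append(x + b(x) * dt + sigma * math.sqrt(dt) * random.gauss(0, 1))
```

In the paper the regression conditions on the path history rather than the current state, and the resulting diffusion interpolates between the reference measure and the data distribution.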
  22. By: Salvatore Certo; Anh Pham; Nicolas Robles; Andrew Vlasic
    Abstract: A framework to learn a multi-modal distribution is proposed, denoted as the Conditional Quantum Generative Adversarial Network (C-qGAN). The neural network structure is strictly within a quantum circuit and, as a consequence, is shown to represent a more efficient state preparation procedure than current methods. This methodology has the potential to speed up algorithms, such as Monte Carlo analysis. In particular, after demonstrating the effectiveness of the network in the learning task, the technique is applied to price Asian option derivatives, providing the foundation for further research on other path-dependent options.
    Date: 2023–04
  23. By: Akihiko Takahashi (The University of Tokyo); Toshihiro Yamada (Hitotsubashi University, Japan Science and Technology Agency (JST))
    Abstract: This paper proposes a new spatial approximation method without the curse of dimensionality for solving high-dimensional partial differential equations (PDEs) by using an asymptotic expansion method with a deep learning-based algorithm. In particular, the mathematical justification of the spatial approximation is provided. Numerical examples for high-dimensional Kolmogorov PDEs show the effectiveness of our method.
    Date: 2023–05
  24. By: David H. Kreitmeir; Paul A. Raschky
    Abstract: We analyse the effects of the ban of ChatGPT, a generative pre-trained transformer chatbot, on individual productivity. We first compile data on the hourly coding output of over 8,000 professional GitHub users in Italy and other European countries. Combining these high-frequency data with the sudden announcement of the ban in a difference-in-differences framework, we find that the output of Italian developers decreased by around 50% in the first two business days after the ban and recovered after that. Applying a synthetic control approach to daily Google search and Tor usage data shows that the ban led to a significant increase in the use of censorship-bypassing tools. Our findings show that users swiftly implement strategies to bypass Internet restrictions, but this adaptation activity creates short-term disruptions and hampers productivity.
    Date: 2023–04
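    The 2x2 difference-in-differences comparison at the heart of this design can be illustrated with a short, self-contained sketch. All numbers below are simulated for illustration; the 50% drop is built into the fake data, not taken from the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated daily coding output: treated (Italian) vs. control developers,
# before and after a hypothetical ban date. A 50% drop is built in.
n = 500
base = rng.poisson(10, size=(4, n)).astype(float)
treated_pre, treated_post = base[0], base[1] * 0.5   # ban halves output
control_pre, control_post = base[2], base[3]

def did_estimate(t_pre, t_post, c_pre, c_post):
    """Canonical 2x2 difference-in-differences estimator:
    (treated post - treated pre) - (control post - control pre)."""
    return (t_post.mean() - t_pre.mean()) - (c_post.mean() - c_pre.mean())

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
# effect is close to -5: half of the ~10-commit baseline built into the data
```

    Subtracting the control group's pre/post change nets out shocks common to all developers, which is what lets the design attribute the remaining drop to the ban itself.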
  25. By: Marina Diakonova (Banco de España); Corinna Ghirelli (Banco de España); Javier J. Pérez (Banco de España); Luis Molina (Banco de España)
    Abstract: We show how policy uncertainty and conflict-related shocks impact the dynamics of economic activity (GDP) in Russia. We use alternative indicators of “conflict”, relating to specific aspects of this general concept: geopolitical risk, social unrest, outbreaks of political violence and escalations into internal armed conflict. For policy uncertainty we employ the workhorse economic policy uncertainty (EPU) indicator. We use two distinct but complementary empirical approaches. The first is based on a time series mixed-frequency forecasting model. We show that the indicators provide useful information for forecasting GDP in the short run, even when controlling for a comprehensive set of standard high-frequency macro-financial variables. The second is a structural vector autoregression (SVAR) model. We show that negative shocks to the selected indicators lead to economic slowdown, with a persistent drop in GDP growth and a short-lived but large increase in country risk.
    Keywords: GDP forecasting, natural language processing, social unrest, social conflict, policy uncertainty, geopolitical risk
    JEL: E37 D74 N16
    Date: 2022–11
  26. By: Marina Diakonova (Banco de España); Luis Molina (Banco de España); Hannes Mueller (IAE-CSIC and BSE); Javier J. Pérez (Banco de España); Cristopher Rauh (University of Cambridge)
    Abstract: It is widely accepted that episodes of social unrest, conflict, political tensions and policy uncertainty affect the economy. Nevertheless, the real-time dimension of such relationships is less studied, and it remains unclear how to incorporate them in a forecasting framework. This can be partly explained by a certain divide between the economic and political science contributions in this area, as well as by the traditional lack of availability of high-frequency indicators measuring such phenomena. The latter constraint, though, is becoming less of a limiting factor thanks to the growing production of text-based indicators. In this paper we assemble a dataset of such monthly measures of what we call “institutional instability”, for three representative emerging market economies: Brazil, Colombia and Mexico. We then forecast quarterly GDP by adding these new variables to a standard macro-forecasting model in a mixed-frequency MIDAS framework. Our results strongly suggest that capturing institutional instability, alongside a broad set of standard high-frequency indicators, is useful when forecasting quarterly GDP. We also analyse the relative strengths and weaknesses of the approach.
    Keywords: forecasting, social unrest, social conflict, policy uncertainty, forecasting GDP, natural language processing, geopolitical risk
    JEL: E37 D74 N16
    Date: 2022–09
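    A MIDAS regression handles the frequency mismatch by collapsing each quarter's monthly readings into a single regressor via a low-dimensional lag polynomial. Below is a minimal sketch of the standard exponential Almon weighting; the parameter values are illustrative, not the paper's.

```python
import numpy as np

def exp_almon_weights(theta1, theta2, n_lags):
    """Exponential Almon lag polynomial used in MIDAS regressions:
    w_k proportional to exp(theta1*k + theta2*k^2), normalised to sum to one."""
    k = np.arange(1, n_lags + 1, dtype=float)
    w = np.exp(theta1 * k + theta2 * k ** 2)
    return w / w.sum()

def midas_aggregate(monthly, theta1=0.1, theta2=-0.05):
    """Collapse a monthly series into one regressor per quarter by weighting
    the 3 months of each quarter with Almon weights (assumes len % 3 == 0)."""
    w = exp_almon_weights(theta1, theta2, 3)
    quarters = monthly.reshape(-1, 3)
    # reverse the weights so the latest month of each quarter gets the
    # k=1 (largest) weight
    return quarters @ w[::-1]

q = midas_aggregate(np.arange(12.0))  # 12 monthly readings -> 4 quarters
```

    In estimation, theta1 and theta2 are fitted jointly with the regression coefficients, so an arbitrarily long monthly lag structure costs only two parameters.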
  27. By: Adeliyi, Oluwaseyi; Adesoba, Adeola
    Abstract: In recent years, Australia-China relations have deteriorated since Australia called for an inquiry into the origins of COVID-19 (Peters et al. 2021). US involvement in these events and tensions in the South China Sea have further strained the relationship. Recent studies have analyzed political division on this topic on social media platforms such as Twitter (Stewart et al. 2018). In this paper, we use a multimethod analytical framework to analyze geopolitical discourse between Australia and China on YouTube. We analyze over 900 YouTube channels, 2 million comments, and 11,000 videos from July 2019 through December 2020. Our results show that the COVID-19 topic affected the geopolitical discourse in the short term but was overtaken by trade- and defense-related topics in the long term. Finally, we studied suspicious channel activity and found that the Defense Flash News YouTube channel tried to grow its user engagement statistics inorganically.
    Date: 2022–03–19
  28. By: Alejandro Bernales; Marcela Valenzuela; Ilknur Zer
    Abstract: Motivated by cognitive theories verifying that investors have limited capacity to process information, we study the effects of information overload on stock market dynamics. We construct an information overload index using textual analysis tools on daily data from The New York Times since 1885. We structure our empirical analysis around a discrete-time learning model, which links information overload with asset prices and trading volume when investors are attention constrained. We find that our index is associated with lower trading volume and predicts higher market returns for up to 18 months, even after controlling for standard predictors and other news-based measures. Information overload also affects the cross-section of stock returns: Investors require higher risk premia to hold small, high-beta, highly volatile, and unprofitable stocks. Such findings are consistent with theories emphasizing that information overload increases information and estimation risk and deteriorates investors' decision accuracy amid their limited attention.
    Keywords: Limited attention; Dispersion; Sentiment; Predicting returns; Behavioral biases
    JEL: G40 G41 G12 G14
    Date: 2023–03–09
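    As a toy illustration of a volume-and-diversity news measure of this kind (this is a simple proxy, not the authors' construction from The New York Times archive):

```python
import math
from collections import Counter

def overload_index(headlines):
    """Toy daily information-overload proxy: article volume scaled by the
    Shannon entropy of the day's vocabulary, so that both more news and
    more dispersed news raise the index."""
    tokens = [w.lower() for h in headlines for w in h.split()]
    counts = Counter(tokens)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in counts.values())
    return len(headlines) * entropy

calm = ["markets steady", "markets steady again"]
busy = ["rates rise", "war escalates", "banks wobble", "oil spikes"]
# a busy, topically dispersed day scores higher than a calm, repetitive one
```

    Any real index of this type would also need topic modelling and normalisation over the sample, but the core intuition, more and more varied news per day, is captured by the two ingredients above.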
  29. By: Ali Shirazi; Fereshteh Sadeghi Naieni Fard
    Abstract: Finding the hedge ratios for a portfolio and risk compression are the same mathematical problem. Traditionally, regression is used for this purpose. However, regression has its own limitations. For example, a regression model cannot use highly correlated independent variables because of multicollinearity and the resulting instability of the estimates, nor can it account for the cost of hedging when estimating hedge ratios. We introduce several methods that address these limitations of linear regression while achieving better performance. These models fall broadly into two categories: regularization techniques and common factor analyses. In the regularization techniques, we jointly minimize the variance of the hedged portfolio's profit and loss (PnL) and the size of the hedge ratios, which helps reduce the cost of hedging. These methods can also model the cost of hedging as a function of funding costs, market conditions, and liquidity. In the common factor analyses, we first map the variables into common factors and then find hedge ratios such that the hedged portfolio has no exposure to those factors; the factors can be constructed linearly or nonlinearly. We introduce a modified beta variational autoencoder that constructs common factors nonlinearly to compute hedges. Finally, we introduce a comparison method and generate numerical results for an example.
    Date: 2023–04
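    The regularization idea, minimizing hedged-PnL variance plus a penalty on hedge-ratio sizes, has a closed form in the L2 (ridge) case. Here is a minimal sketch on simulated data; the penalty weight and the data-generating process are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated PnL series: two highly correlated hedge instruments, which
# would make plain OLS hedge ratios unstable (multicollinearity).
n = 1000
f = rng.normal(size=n)
hedges = np.column_stack([f + 0.01 * rng.normal(size=n),
                          f + 0.01 * rng.normal(size=n)])
portfolio = 2.0 * f + 0.1 * rng.normal(size=n)

def ridge_hedge_ratios(pnl, instruments, lam=0.1):
    """Minimise (1/n)||pnl - instruments @ b||^2 + lam * ||b||^2.
    The L2 penalty stabilises the ratios under collinearity and shrinks
    hedge sizes, a crude proxy for penalising hedging costs."""
    X, y = instruments, pnl
    m, k = X.shape
    # normal equations for ridge: (X'X + lam*m*I) b = X'y
    return np.linalg.solve(X.T @ X + lam * m * np.eye(k), X.T @ y)

b = ridge_hedge_ratios(portfolio, hedges)
# the two near-identical instruments receive similar, moderate weights
```

    Plain OLS would split the total hedge between the two collinear columns almost arbitrarily; the penalty forces a stable, symmetric split and leaves the hedged residual with a small fraction of the original variance.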
  30. By: Erik Andres-Escayola (European Central Bank); Corinna Ghirelli (Banco de España); Luis Molina (Banco de España); Javier J. Pérez (Banco de España); Elena Vidal (Banco de España)
    Abstract: This paper investigates the role that two key methodological choices play in the construction of textual indicators: the selection of local versus foreign newspapers and the breadth of the press coverage (i.e. the number of newspapers considered). The large literature in this field is almost silent about the robustness of research results to these two choices. We use as a case study the well-known economic policy uncertainty (EPU) index, taking as examples Latin America and Spain. First, we develop EPU measures based on press with different levels of proximity, i.e. local versus foreign, and corroborate that they deliver broadly similar narratives. Second, we examine the macroeconomic effects of EPU shocks computed using these different sources by means of a structural Bayesian vector autoregression framework and find statistically similar responses. Third, we show that constructing EPU indexes based on only one newspaper may yield biased responses. This suggests that it is important to maximize the breadth of press coverage when building text-based indicators, since this improves the credibility of results. In this regard, our first and second results are good news for researchers, given that they provide a justification for the combined use of a larger amount of data from local and foreign sources.
    Keywords: economic policy uncertainty, textual analysis, press coverage, Latin American economies, business cycles
    JEL: D80 C43 E32 O11
    Date: 2022–10

This nep-big issue is ©2023 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.