nep-big 2021-07-26 papers

on Big Data

Issue of 2021‒07‒26
27 papers chosen by
Tom Coupé
University of Canterbury

National-scale electricity peak load forecasting: Traditional, machine learning, or hybrid model? By Juyong Lee; Youngsang Cho
MegazordNet: combining statistical and machine learning standpoints for time series forecasting By Angelo Garangau Menezes; Saulo Martiello Mastelini
Effectiveness of Artificial Intelligence in Stock Market Prediction based on Machine Learning By Sohrab Mokhtari; Kang K. Yen; Jin Liu
Machine Learning for Financial Forecasting, Planning and Analysis: Recent Developments and Pitfalls By Helmut Wasserbacher; Martin Spindler
Stock price prediction using BERT and GAN By Priyank Sonkiya; Vikas Bajpai; Anukriti Bansal
Crowdsourcing Artificial Intelligence in Africa: Findings from a Machine Learning Contest By Naudé, Wim; Bray, Amy; Lee, Celina
Application of deep reinforcement learning for Indian stock trading automation By Supriya Bajpai
A Sparsity Algorithm with Applications to Corporate Credit Rating By Dan Wang; Zhi Chen; Ionut Florescu
Predicting Exporters with Machine Learning By Francesca Micocci; Armando Rungi
Investor Behavior Modeling by Analyzing Financial Advisor Notes: A Machine Learning Perspective By Cynthia Pagliaro; Dhagash Mehta; Han-Tai Shiao; Shaofei Wang; Luwei Xiong
Deep Learning for Mean Field Games and Mean Field Control with Applications to Finance By René Carmona; Mathieu Laurière
A data-driven explainable case-based reasoning approach for financial risk detection By Li, Wei; Paraschiv, Florentina; Sermpinis, Georgios
Double debiased machine learning nonparametric inference with continuous treatments By Kyle Colangelo; Ying-Ying Lee
Visual Time Series Forecasting: An Image-driven Approach By Naftali Cohen; Srijan Sood; Zhen Zeng; Tucker Balch; Manuela Veloso
Epidemic Exposure, Fintech Adoption, and the Digital Divide By Orkun Saka; Barry Eichengreen; Cevat Giray Aksoy
What future for European robotics? By CHARISI Vasiliki; COMPANO Ramon; DUCH BROWN Nestor; GOMEZ GUTIERREZ Emilia; KLENERT David; LUTZ Michael; MARSCHINSKI Robert; TORRECILLA SALINAS Carlos
Epidemic Exposure, Fintech Adoption, and the Digital Divide By Orkun Saka ⓡ; Barry Eichengreen ⓡ; Cevat Giray Aksoy
Data Sharing Markets By Mohammad Rasouli; Michael I. Jordan
Deep calibration of the quadratic rough Heston model By Mathieu Rosenbaum; Jianfei Zhang
Flexible Covariate Adjustments in Regression Discontinuity Designs By Claudia Noack; Tomasz Olma; Christoph Rothe
Deep Risk Model: A Deep Learning Solution for Mining Latent Risk Factors to Improve Covariance Matrix Estimation By Hengxu Lin; Dong Zhou; Weiqing Liu; Jiang Bian
Numerical approximation of singular Forward-Backward SDEs By Jean-François Chassagneux; Mohan Yang
Tracking ECB's communication: Perspectives and Implications for Financial Markets By FORTES, Roberta; Le Guenedal, Theo
Pseudo-Model-Free Hedging for Variable Annuities via Deep Reinforcement Learning By Wing Fung Chong; Haoen Cui; Yuxuan Li
Emotions in Macroeconomic News and their Impact on the European Bond Market By Sergio Consoli; Luca Tiozzo Pezzoli; Elisa Tosetti
Modelling Clusters From The Ground Up: A Web Data Approach By Stich, Christoph; Tranos, Emmanouil; Nathan, Max
The Role of "Live" in Livestreaming Markets: Evidence Using Orthogonal Random Forest By Ziwei Cong; Jia Liu; Puneet Manchanda

National-scale electricity peak load forecasting: Traditional, machine learning, or hybrid model?

By:	Juyong Lee; Youngsang Cho
Abstract:	As the volatility of electricity demand increases owing to climate change and electrification, the importance of accurate peak load forecasting is increasing. Traditional peak load forecasting has been conducted through time series-based models; however, recently, new models based on machine or deep learning are being introduced. This study performs a comparative analysis to determine the most accurate peak load-forecasting model for Korea, by comparing the performance of time series, machine learning, and hybrid models. Seasonal autoregressive integrated moving average with exogenous variables (SARIMAX) is used for the time series model. Artificial neural network (ANN), support vector regression (SVR), and long short-term memory (LSTM) are used for the machine learning models. SARIMAX-ANN, SARIMAX-SVR, and SARIMAX-LSTM are used for the hybrid models. The results indicate that the hybrid models exhibit significant improvement over the SARIMAX model. The LSTM-based models outperformed the others; the single and hybrid LSTM models did not exhibit a significant performance difference. In the case of Korea's highest peak load in 2019, the predictive power of the LSTM model proved to be greater than that of the SARIMAX-LSTM model. The LSTM, SARIMAX-SVR, and SARIMAX-LSTM models outperformed the current time series-based forecasting model used in Korea. Thus, Korea's peak load-forecasting performance can be improved by including machine learning or hybrid models.
Date:	2021–06
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.06174&r=

MegazordNet: combining statistical and machine learning standpoints for time series forecasting

By:	Angelo Garangau Menezes; Saulo Martiello Mastelini
Abstract:	Forecasting financial time series is considered to be a difficult task due to the chaotic nature of the series. Statistical approaches have shown solid results in some specific problems such as predicting market direction and the price of single stocks; however, with the recent advances in deep learning and big data techniques, promising new options have arisen to tackle financial time series forecasting. Moreover, recent literature has shown that employing a combination of statistics and machine learning may improve forecast accuracy in comparison to single solutions. Taking these aspects into consideration, in this work we propose MegazordNet, a framework that explores statistical features within a financial series combined with a structured deep learning model for time series forecasting. We evaluated our approach by predicting the closing price of stocks in the S&P 500 using different metrics, and we were able to beat single statistical and machine learning methods.
Date:	2021–06
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.01017&r=

Effectiveness of Artificial Intelligence in Stock Market Prediction based on Machine Learning

By:	Sohrab Mokhtari; Kang K. Yen; Jin Liu
Abstract:	This paper tries to address the problem of stock market prediction leveraging artificial intelligence (AI) strategies. The stock market prediction can be modeled based on two principal analyses called technical and fundamental. In the technical analysis approach, the regression machine learning (ML) algorithms are employed to predict the stock price trend at the end of a business day based on the historical price data. In contrast, in the fundamental analysis, the classification ML algorithms are applied to classify the public sentiment based on news and social media. In the technical analysis, the historical price data is exploited from Yahoo Finance, and in fundamental analysis, public tweets on Twitter associated with the stock market are investigated to assess the impact of sentiments on the stock market's forecast. The results show a median performance, implying that with the current technology of AI, it is too soon to claim AI can beat the stock markets.
Date:	2021–06
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.01031&r=

Machine Learning for Financial Forecasting, Planning and Analysis: Recent Developments and Pitfalls

By:	Helmut Wasserbacher; Martin Spindler
Abstract:	This article is an introduction to machine learning for financial forecasting, planning and analysis (FP&A). Machine learning appears well suited to support FP&A with the highly automated extraction of information from large amounts of data. However, because most traditional machine learning techniques focus on forecasting (prediction), we discuss the particular care that must be taken to avoid the pitfalls of using them for planning and resource allocation (causal inference). While the naive application of machine learning usually fails in this context, the recently developed double machine learning framework can address causal questions of interest. We review the current literature on machine learning in FP&A and illustrate in a simulation study how machine learning can be used for both forecasting and planning. We also investigate how forecasting and planning improve as the number of data points increases.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.04851&r=

Stock price prediction using BERT and GAN

By:	Priyank Sonkiya; Vikas Bajpai; Anukriti Bansal
Abstract:	The stock market has been a popular topic of interest in the recent past. The growth in the inflation rate has compelled people to invest in the stock and commodity markets and other areas rather than saving. Further, the ability of Deep Learning models to make predictions on time series data has been proven time and again. Technical analysis of the stock market with the help of technical indicators has been the most common practice among traders and investors. Another aspect is sentiment analysis: the emotion of investors, which indicates their willingness to invest. A variety of techniques have been used by people around the globe involving basic Machine Learning and Neural Networks. Ranging from basic linear regression to advanced neural networks, people have experimented with all possible techniques to predict the stock market. It is evident from recent events how news and headlines affect the stock markets and cryptocurrencies. This paper proposes an ensemble of state-of-the-art methods for predicting stock prices. First, sentiment analysis of the news and headlines for Apple Inc., listed on the NASDAQ, is performed using a version of BERT, which is a pre-trained transformer model by Google for Natural Language Processing (NLP). Afterward, a Generative Adversarial Network (GAN) predicts the stock price for Apple Inc. using the technical indicators, stock indexes of various countries, some commodities, and historical prices along with the sentiment scores. Comparison is done with baseline models such as Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), vanilla GAN, and the Auto-Regressive Integrated Moving Average (ARIMA) model.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.09055&r=

Crowdsourcing Artificial Intelligence in Africa: Findings from a Machine Learning Contest

By:	Naudé, Wim (University College Cork); Bray, Amy (Zindi); Lee, Celina (Zindi)
Abstract:	In this paper, we study the crowdsourcing of innovation in Africa through a data science contest on an intermediated digital platform. We ran a Machine Learning (ML) contest on the continent's largest data science contest platform, Zindi. Contestants were surveyed on their motivations to take part and their perceptions about AI in Africa. In total, 614 contestants submitted 15,832 entries, and 559 responded to the accompanying survey. From the findings, we answered several questions: Who takes part in these contests and why? Who is most likely to win? What are contestants' entrepreneurial aspirations in deploying AI? What are the obstacles they perceive to the greater diffusion of AI in Africa? We conclude that crowdsourcing of AI via data contest platforms offers a potential mechanism to alleviate some of the constraints in the adoption and diffusion of AI in Africa. Recommendations for further research are made.
Keywords:	crowdsourcing, innovation, data science, artificial intelligence, Africa
JEL:	O31 O33 O36 O55
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:iza:izadps:dp14545&r=

Application of deep reinforcement learning for Indian stock trading automation

By:	Supriya Bajpai
Abstract:	In stock trading, feature extraction and trading strategy design are the two important tasks to achieve long-term benefits using machine learning techniques. Several methods have been proposed to design trading strategies by acquiring trading signals to maximize the rewards. In the present paper, the theory of deep reinforcement learning is applied to stock trading strategy and investment decisions in Indian markets. The experiments are performed systematically with three classical Deep Reinforcement Learning models (Deep Q-Network, Double Deep Q-Network and Dueling Double Deep Q-Network) on ten Indian stock datasets. The performance of the models is evaluated and compared.
Date:	2021–05
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2106.16088&r=
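The three model families named in the abstract differ mainly in how they form the learning target and the Q-value. The abstract does not give the paper's exact setup, so the following is only a minimal sketch of the textbook definitions; the function names, the discount factor, and the Q-values for the three actions (sell, hold, buy) are hypothetical.

```python
from typing import List, Sequence


def dqn_target(reward: float, next_q_target: Sequence[float],
               gamma: float, done: bool) -> float:
    """DQN target: bootstrap from the target network's own maximum Q-value."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float, done: bool) -> float:
    """Double DQN target: the online network selects the action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


# Hypothetical next-state Q-values for the actions (sell, hold, buy).
q_online = [0.2, 0.5, 0.4]
q_target = [0.1, 0.3, 0.9]

y_dqn = dqn_target(1.0, q_target, gamma=0.9, done=False)
y_ddqn = double_dqn_target(1.0, q_online, q_target, gamma=0.9, done=False)
q_dueling = dueling_q(1.0, [0.0, 1.0, 2.0])
```

In this toy case the Double DQN target is lower than the DQN target because the online network prefers "hold" while the target network's largest value sits on "buy"; decoupling selection from evaluation in this way is what is meant to reduce overestimation.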

A Sparsity Algorithm with Applications to Corporate Credit Rating

By:	Dan Wang; Zhi Chen; Ionut Florescu
Abstract:	In Artificial Intelligence, interpreting the results of a Machine Learning technique often termed as a black box is a difficult task. A counterfactual explanation of a particular "black box" attempts to find the smallest change to the input values that modifies the prediction to a particular output, other than the original one. In this work we formulate the problem of finding a counterfactual explanation as an optimization problem. We propose a new "sparsity algorithm" which solves the optimization problem, while also maximizing the sparsity of the counterfactual explanation. We apply the sparsity algorithm to provide a simple suggestion to publicly traded companies in order to improve their credit ratings. We validate the sparsity algorithm with a synthetically generated dataset and we further apply it to quarterly financial statements from companies in financial, healthcare and IT sectors of the US market. We provide evidence that the counterfactual explanation can capture the nature of the real statement features that changed between the current quarter and the following quarter when ratings improved. The empirical results show that the higher the rating of a company the greater the "effort" required to further improve credit rating.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.10306&r=
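To make the idea of a sparse counterfactual explanation concrete, here is a minimal sketch that is not the paper's sparsity algorithm. It assumes a hypothetical linear scoring model (label 1 if the score is non-negative) and searches for the smallest change to a single feature that flips the prediction, which is the sparsest possible counterfactual. The weights, inputs, and function names are illustrative assumptions, and features are assumed to be on comparable scales.

```python
from typing import List, Optional, Sequence, Tuple


def score(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Linear score of a hypothetical rating model."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Label 1 (e.g. 'upgrade') if the score is non-negative, else 0."""
    return 1 if score(weights, bias, x) >= 0 else 0


def one_feature_counterfactual(
    weights: Sequence[float], bias: float, x: Sequence[float],
    margin: float = 1e-6,
) -> Optional[Tuple[int, float]]:
    """Return (feature index, change) for the smallest single-feature change
    that flips the prediction, or None if no feature has a non-zero weight."""
    s = score(weights, bias, x)
    # Aim just past the decision boundary, on the side opposite the current label.
    target = margin if s < 0 else -margin
    best: Optional[Tuple[int, float]] = None
    for i, w in enumerate(weights):
        if w == 0:
            continue  # this feature cannot move the score
        delta = (target - s) / w
        if best is None or abs(delta) < abs(best[1]):
            best = (i, delta)
    return best


# Hypothetical standardized financial-statement features and model weights.
weights = [0.5, -2.0, 1.0]
bias = -1.0
x = [1.0, 0.5, 0.2]

idx, delta = one_feature_counterfactual(weights, bias, x)
x_cf: List[float] = list(x)
x_cf[idx] += delta
```

Here the original score is -1.3, and the cheapest single-feature fix is to lower the second feature by about 0.65, since it carries the largest absolute weight.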

Predicting Exporters with Machine Learning

By:	Francesca Micocci (IMT School for Advanced Studies Lucca); Armando Rungi (IMT School for Advanced Studies)
Abstract:	In this contribution, we exploit machine learning techniques to predict out-of-sample firms' ability to export based on the financial accounts of both exporters and non-exporters. Therefore, we show how forecasts can be used as exporting scores, i.e., to measure the distance of non-exporters from export status. For our purpose, we train and test various algorithms on the financial reports of 57,021 manufacturing firms in France in 2010-2018. We find that a Bayesian Additive Regression Tree with Missingness In Attributes (BART-MIA) performs better than other techniques with a prediction accuracy of up to 0.90. Predictions are robust to changes in definitions of exporters and in the presence of discontinuous exporters. Eventually, we argue that exporting scores can be helpful for trade promotion, trade credit, and to assess firms' competitiveness. For example, back-of-the-envelope estimates show that a representative firm with just below-average exporting scores needs up to 44% more cash resources and up to 2.5 times more capital expenses to reach full export status.
Keywords:	exporting; machine learning; trade promotion; trade finance; competitiveness
JEL:	F17 C53 C55 L21 L25
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:ial:wpaper:3/2021&r=

Investor Behavior Modeling by Analyzing Financial Advisor Notes: A Machine Learning Perspective

By:	Cynthia Pagliaro; Dhagash Mehta; Han-Tai Shiao; Shaofei Wang; Luwei Xiong
Abstract:	Modeling investor behavior is crucial to identifying behavioral coaching opportunities for financial advisors. With the help of natural language processing (NLP) we analyze an unstructured (textual) dataset of financial advisors' summary notes, taken after every investor conversation, to gain first ever insights into advisor-investor interactions. These insights are used to predict investor needs during adverse market conditions; thus allowing advisors to coach investors and help avoid inappropriate financial decision-making. First, we perform topic modeling to gain insight into the emerging topics and trends. Based on this insight, we construct a supervised classification model to predict the probability that an advised investor will require behavioral coaching during volatile market periods. To the best of our knowledge, ours is the first work on exploring the advisor-investor relationship using unstructured data. This work may have far-reaching implications for both traditional and emerging financial advisory service models like robo-advising.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.05592&r=

Deep Learning for Mean Field Games and Mean Field Control with Applications to Finance

By:	René Carmona; Mathieu Laurière
Abstract:	Financial markets and more generally macro-economic models involve a large number of individuals interacting through variables such as prices resulting from the aggregate behavior of all the agents. Mean field games have been introduced to study Nash equilibria for such problems in the limit when the number of players is infinite. The theory has been extensively developed in the past decade, using both analytical and probabilistic tools, and a wide range of applications have been discovered, from economics to crowd motion. More recently the interaction with machine learning has attracted a growing interest. This aspect is particularly relevant to solve very large games with complex structures, in high dimension or with common sources of randomness. In this chapter, we review the literature on the interplay between mean field games and deep learning, with a focus on three families of methods. A special emphasis is given to financial applications.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.04568&r=

A data-driven explainable case-based reasoning approach for financial risk detection

By:	Li, Wei; Paraschiv, Florentina; Sermpinis, Georgios
Abstract:	The rapid development of artificial intelligence methods contributes to their wide applications for forecasting various financial risks in recent years. This study introduces a novel explainable case-based reasoning (CBR) approach without a requirement of rich expertise in financial risk. Compared with other black-box algorithms, the explainable CBR system allows a natural economic interpretation of results. Indeed, the empirical results emphasize the interpretability of the CBR system in predicting financial risk, which is essential for both financial companies and their customers. In addition, results show that the proposed automatic design CBR system has a good prediction performance compared to other artificial intelligence methods, overcoming the main drawback of a standard CBR system of highly depending on prior domain knowledge about the corresponding field.
Keywords:	Case-based reasoning, Financial risk detection, Multiple-criteria decision-making, Feature scoring, Particle swarm optimization, Parallel computing
JEL:	C51 C52 C53 C61 C63 D81 G21 G32
Date:	2021
URL:	http://d.repec.org/n?u=RePEc:zbw:irtgdp:2021010&r=

Double debiased machine learning nonparametric inference with continuous treatments

By:	Kyle Colangelo (Institute for Fiscal Studies); Ying-Ying Lee (Institute for Fiscal Studies)
Abstract:	We propose a nonparametric inference method for causal effects of continuous treatment variables, under unconfoundedness and in the presence of high-dimensional or nonparametric nuisance parameters. Our simple kernel-based double debiased machine learning (DML) estimators for the average dose-response function (or the average structural function) and the partial effects are asymptotically normal with nonparametric convergence rates. The nuisance estimators for the conditional expectation function and the conditional density can be nonparametric kernel or series estimators or ML methods. Using doubly robust influence function and cross-fitting, we give tractable primitive conditions under which the nuisance estimators do not affect the first-order large sample distribution of the DML estimators. We implement various ML methods in Monte Carlo simulations and an empirical application on a job training program evaluation to support the theoretical results and demonstrate the usefulness of our DML estimator in practice.
Date:	2019–12–17
URL:	http://d.repec.org/n?u=RePEc:ifs:cemmap:72/19&r=

Visual Time Series Forecasting: An Image-driven Approach

By:	Naftali Cohen; Srijan Sood; Zhen Zeng; Tucker Balch; Manuela Veloso
Abstract:	In this work, we address time-series forecasting as a computer vision task. We capture input data as an image and train a model to produce the subsequent image. This approach results in predicting distributions as opposed to pointwise values. To assess the robustness and quality of our approach, we examine various datasets and multiple evaluation metrics. Our experiments show that our forecasting tool is effective for cyclic data but somewhat less for irregular data such as stock prices. Importantly, when using image-based evaluation metrics, we find our method to outperform various baselines, including ARIMA, and a numerical variation of our deep learning approach.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.01273&r=

Epidemic Exposure, Fintech Adoption, and the Digital Divide

By:	Orkun Saka; Barry Eichengreen; Cevat Giray Aksoy
Abstract:	We ask whether epidemic exposure leads to a shift in financial technology usage within and across countries and if so who participates in this shift. We exploit a dataset combining Gallup World Polls and Global Findex surveys for some 250,000 individuals in 140 countries, merging them with information on the incidence of epidemics and local 3G internet infrastructure. Epidemic exposure is associated with an increase in remote-access (online/mobile) banking and substitution from bank branch-based to ATM-based activity. Using a machine-learning algorithm, we show that heterogeneity in this response centers on the age, income and employment of respondents. Young, high-income earners in full-time employment have the greatest propensity to shift to online/mobile transactions in response to epidemics. These effects are larger for individuals in subnational regions with better ex ante 3G signal coverage, highlighting the role of the digital divide in adaption to new technologies necessitated by adverse external shocks.
Keywords:	epidemics, fintech, banking
JEL:	G20 G59 I10
Date:	2021
URL:	http://d.repec.org/n?u=RePEc:ces:ceswps:_9173&r=

What future for European robotics?

By:	CHARISI Vasiliki (European Commission - JRC); COMPANO Ramon (European Commission - JRC); DUCH BROWN Nestor (European Commission - JRC); GOMEZ GUTIERREZ Emilia (European Commission - JRC); KLENERT David (European Commission - JRC); LUTZ Michael (European Commission - JRC); MARSCHINSKI Robert (European Commission - JRC); TORRECILLA SALINAS Carlos (European Commission - JRC)
Abstract:	Europe is a world-leader in the production of robots. This industry is a key element of the digital transformation of our societies and economies that, combined with Artificial Intelligence (AI), will likely have a tremendous disruptive potential. To explore further the future of the European robotics industry and its related challenges, the Joint Research Centre organized a Science for Policy conference entitled “What future for European Robotics?”, bringing together recognized specialists from industry, academia and policy. This report presents the main conclusions that emerged from the conference.
Keywords:	Robotics, artificial intelligence
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:ipt:iptwpa:jrc125343&r=

Epidemic Exposure, Fintech Adoption, and the Digital Divide

By:	Orkun Saka ⓡ; Barry Eichengreen ⓡ; Cevat Giray Aksoy
Abstract:	We ask whether epidemic exposure leads to a shift in financial technology usage within and across countries and if so who participates in this shift. We exploit a dataset combining Gallup World Polls and Global Findex surveys for some 250,000 individuals in 140 countries, merging them with information on the incidence of epidemics and local 3G internet infrastructure. Epidemic exposure is associated with an increase in remote-access (online/mobile) banking and substitution from bank branch-based to ATM-based activity. Using a machine-learning algorithm, we show that heterogeneity in this response centers on the age, income and employment of respondents. Young, high-income earners in full-time employment have the greatest propensity to shift to online/mobile transactions in response to epidemics. These effects are larger for individuals in subnational regions with better ex ante 3G signal coverage, highlighting the role of the digital divide in adaption to new technologies necessitated by adverse external shocks.
JEL:	G0 G20 G59 I10
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:nbr:nberwo:29006&r=

Data Sharing Markets

By:	Mohammad Rasouli; Michael I. Jordan
Abstract:	With the growing use of distributed machine learning techniques, there is a growing need for data markets that allow agents to share data with each other. Nevertheless, data has unique features that separate it from other commodities, including replicability, cost of sharing, and ability to distort. We study a setup where each agent can be both buyer and seller of data. For this setup, we consider two cases: bilateral data exchange (trading data with data) and unilateral data exchange (trading data with money). We model bilateral sharing as a network formation game and show the existence of a strongly stable outcome under the top agents property by allowing limited complementarity. We propose an ordered match algorithm which can find the stable outcome in O(N^2) (N is the number of agents). For the unilateral sharing, under the assumption of additive cost structure, we construct competitive prices that can implement any social welfare maximizing outcome. Finally, for this setup when agents have private information, we propose a mixed-VCG mechanism which uses zero cost data distortion of data sharing with its isolated impact to achieve budget balance while truthfully implementing socially optimal outcomes to the exact level of budget imbalance of standard VCG mechanisms. Mixed-VCG uses data distortions as data money for this purpose. We further relax the zero cost data distortion assumption by proposing distorted-mixed-VCG. We also extend our model and results to data sharing via incremental inquiries and differential privacy costs.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.08630&r=

Deep calibration of the quadratic rough Heston model

By:	Mathieu Rosenbaum; Jianfei Zhang
Abstract:	The quadratic rough Heston model provides a natural way to encode the Zumbach effect in the rough volatility paradigm. We apply multi-factor approximation and use deep learning methods to build an efficient calibration procedure for this model. We show that the model is able to reproduce very well both SPX and VIX implied volatilities. We typically obtain VIX option prices within the bid-ask spread and an excellent fit of the SPX at-the-money skew. Moreover, we also explain how to use the trained neural networks for hedging with instantaneous computation of hedging quantities.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.01611&r=

Flexible Covariate Adjustments in Regression Discontinuity Designs

By:	Claudia Noack; Tomasz Olma; Christoph Rothe
Abstract:	Empirical regression discontinuity (RD) studies often use covariates to increase the precision of their estimates. In this paper, we propose a novel class of estimators that use such covariate information more efficiently than the linear adjustment estimators that are currently used widely in practice. Our approach can accommodate a possibly large number of either discrete or continuous covariates. It involves running a standard RD analysis with an appropriately modified outcome variable, which takes the form of the difference between the original outcome and a function of the covariates. We characterize the function that leads to the estimator with the smallest asymptotic variance, and show how it can be estimated via modern machine learning, nonparametric regression, or classical parametric methods. The resulting estimator is easy to implement, as tuning parameters can be chosen as in a conventional RD analysis. An extensive simulation study illustrates the performance of our approach.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.07942&r=
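In symbols, the modified-outcome idea described in the abstract can be written as follows. The notation is an assumption made for illustration and is not taken from the paper: Y_i is the outcome, Z_i the covariates, X_i the running variable with cutoff c, and \eta the covariate adjustment function. The second line is the usual sharp RD contrast, applied to the modified outcome instead of Y_i.

```latex
\[
  M_i(\eta) = Y_i - \eta(Z_i)
\]
\[
  \tau(\eta) = \lim_{x \downarrow c} \mathbb{E}\bigl[ M_i(\eta) \mid X_i = x \bigr]
             - \lim_{x \uparrow c} \mathbb{E}\bigl[ M_i(\eta) \mid X_i = x \bigr]
\]
```

Setting \eta to zero recovers the unadjusted RD analysis, and a linear \eta corresponds to a linear adjustment of the kind the abstract compares against.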

Deep Risk Model: A Deep Learning Solution for Mining Latent Risk Factors to Improve Covariance Matrix Estimation

By:	Hengxu Lin; Dong Zhou; Weiqing Liu; Jiang Bian
Abstract:	Modeling and managing portfolio risk is perhaps the most important step to achieve growing and preserving investment performance. Within the modern portfolio construction framework built on Markowitz's theory, the covariance matrix of stock returns is required to model the portfolio risk. Traditional approaches to estimating the covariance matrix are based on human-designed risk factors, which often require tremendous time and effort to design better risk factors to improve the covariance estimation. In this work, we formulate the quest of mining risk factors as a learning problem and propose a deep learning solution to effectively "design" risk factors with neural networks. The learning objective is carefully set to ensure the learned risk factors are effective in explaining stock returns as well as have desired orthogonality and stability. Our experiments on the stock market data demonstrate the effectiveness of the proposed method: our method can obtain 1.9% higher explained variance measured by R^2 and also reduce the risk of a global minimum variance portfolio. Incremental analysis further supports our design of both the architecture and the learning objective.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.05201&r=
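The abstract evaluates covariance estimates partly through a global minimum variance (GMV) portfolio. As a minimal sketch of how an estimated covariance matrix feeds into such a portfolio, the standard fully invested GMV weights are w = Σ⁻¹1 / (1'Σ⁻¹1). The two-asset covariance matrix below is hypothetical, and the small linear solver is only there to keep the example dependency-free.

```python
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def gmv_weights(cov: List[List[float]]) -> List[float]:
    """Global minimum variance weights: inverse covariance times ones, normalized."""
    raw = solve(cov, [1.0] * len(cov))
    total = sum(raw)
    return [v / total for v in raw]


def portfolio_variance(w: List[float], cov: List[List[float]]) -> float:
    """Quadratic form w' cov w."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical covariance matrix of two uncorrelated assets.
cov = [[0.04, 0.0], [0.0, 0.01]]
w = gmv_weights(cov)
var_gmv = portfolio_variance(w, cov)
var_equal = portfolio_variance([0.5, 0.5], cov)
```

With these inputs the lower-variance asset receives 80% of the weight, and the resulting portfolio variance (0.008) is below that of the equal-weighted portfolio (0.0125). A better covariance estimate matters here because the weights depend on the inverse of the matrix.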

Numerical approximation of singular Forward-Backward SDEs

By:	Jean-François Chassagneux; Mohan Yang
Abstract:	In this work, we study the numerical approximation of a class of singular fully coupled forward backward stochastic differential equations. These equations have a degenerate forward component and non-smooth terminal condition. They are used, for example, in the modeling of carbon markets [9] and are linked to scalar conservation laws perturbed by a diffusion. Classical FBSDE methods fail to capture the correct entropy solution to the associated quasi-linear PDE. We introduce a splitting approach that circumvents this difficulty by treating differently the numerical approximation of the diffusion part and the non-linear transport part. Under the structural condition guaranteeing the well-posedness of the singular FBSDEs [8], we show that the splitting method is convergent with a rate of 1/2. We implement the splitting scheme combining non-linear regression based on deep neural networks and conservative finite difference schemes. The numerical tests show very good results in a possibly high-dimensional framework.
Date:	2021–06
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2106.15496&r=

Tracking ECB's communication: Perspectives and Implications for Financial Markets

By:	FORTES, Roberta; Le Guenedal, Theo
Abstract:	This article assesses the communication of the European Central Bank (ECB) using Natural Language Processing (NLP) techniques. We show the evolution of discourse over time and capture the main themes of interest for the central bank that go beyond its traditional mandate of maintaining price stability, enlightening main concerns and themes of discussion among board members. We also built sentiment signals compatible with any form of language, both formal and informal, an important step as the ECB aims to enhance communication with non-expert audiences. In a second step, we measure the impact of the ECB's communication on the EUR/USD exchange rate. We found that our quantitative series, both topics and sentiment, improve financial-linked models consistently in all periods analyzed (2.5\% on average). Meaningful signals comprise a broad range of subjects and vary in time. This suggests that overall ECB's talk matters for asset prices, including themes not directly related to monetary policy. This result is particularly important in a context in which the ECB, as well as other major central banks, are moving towards integrating issues closer to the society into their scope of action, implying that subjects, which were considered peripheral, may become central. This emphasizes the importance for markets to effectively track central banks' communication to improve investment processes.
Keywords:	Quantitative trading, Central Bank, Fixed Income, Exchange Rates, Text mining, NLP, Euro
JEL:	C38 C63 E44 F3 F31 G12
Date:	2020–12
URL:	http://d.repec.org/n?u=RePEc:pra:mprapa:108746&r=

Pseudo-Model-Free Hedging for Variable Annuities via Deep Reinforcement Learning

By:	Wing Fung Chong; Haoen Cui; Yuxuan Li
Abstract:	This paper applies a deep reinforcement learning approach to revisit the hedging problem of variable annuities. Instead of assuming an actuarial and financial dual-market model a priori, the reinforcement learning agent learns how to hedge by collecting anchor-hedging reward signals through interactions with the market. Using the recently developed proximal policy optimization, the pseudo-model-free reinforcement learning agent performs as well as the correct Delta while outperforming the misspecified Deltas. The reinforcement learning agent is also integrated with online learning to demonstrate its fully adaptive capability to the market.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.03340&r=

Emotions in Macroeconomic News and their Impact on the European Bond Market

By:	Sergio Consoli; Luca Tiozzo Pezzoli; Elisa Tosetti
Abstract:	We show how emotions extracted from macroeconomic news can be used to explain and forecast the future behaviour of sovereign bond yield spreads in Italy and Spain. We use a large open-source database, the Global Database of Events, Language and Tone, to construct emotion indicators of bond market affective states. We find that negative emotions extracted from news improve the forecasting power of government yield spread models during distressed periods, even after controlling for the number of negative words present in the text. In addition, stronger negative emotions, such as panic, reveal useful information for predicting changes in spread at the short-term horizon, while milder emotions, such as distress, are useful at longer time horizons. Emotions generated by the Italian political turmoil propagate to the Spanish news, affecting this neighbouring market.
Date:	2021–06
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2106.15698&r=

Modelling Clusters From The Ground Up: A Web Data Approach

By:	Stich, Christoph; Tranos, Emmanouil; Nathan, Max (UCL)
Abstract:	This paper proposes a new methodological framework to identify economic clusters over space and time. We employ a unique open source dataset of geolocated and archived business webpages and interrogate them using Natural Language Processing to build bottom-up classifications of economic activities. We validate our method on an iconic UK tech cluster – Shoreditch, East London. We benchmark our results against existing case studies and administrative data, replicating the main features of the cluster and providing fresh insights. As well as overcoming limitations in conventional industrial classification, our method addresses some of the spatial and temporal limitations of the clustering literature.
Date:	2021–05–02
URL:	http://d.repec.org/n?u=RePEc:osf:socarx:j2w8v&r=

The Role of "Live" in Livestreaming Markets: Evidence Using Orthogonal Random Forest

By:	Ziwei Cong; Jia Liu; Puneet Manchanda
Abstract:	The common belief about the growing medium of livestreaming is that its value lies in its "live" component. In this paper, we leverage data from a large livestreaming platform to examine this belief. We are able to do this because the platform also allows viewers to purchase the recorded version of the livestream. We summarize the value of livestreaming content by estimating how demand responds to price before, on the day of, and after the livestream. We do this by proposing a generalized Orthogonal Random Forest framework. This framework allows us to estimate heterogeneous treatment effects in the presence of high-dimensional confounders whose relationships with the treatment policy (i.e., price) are complex but partially known. We find significant dynamics in the price elasticity of demand over the temporal distance to the scheduled livestreaming day and after. Specifically, demand gradually becomes less price sensitive as the livestreaming day approaches and is inelastic on the livestreaming day. Over the post-livestream period, demand is still sensitive to price, but much less so than in the pre-livestream period. This indicates that the value of livestreaming persists beyond the live component. Finally, we provide suggestive evidence for the likely mechanisms driving our results. These are quality uncertainty reduction for the patterns pre- and post-livestream and the potential of real-time interaction with the creator on the day of the livestream.
Date:	2021–07
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2107.01629&r=
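
The orthogonalization idea behind the framework named in the abstract above can be illustrated with a much simpler linear stand-in. The sketch below is not the authors' generalized Orthogonal Random Forest: it assumes a single observed confounder `w`, linear relationships, and synthetic noiseless data, and it estimates one constant price elasticity by partialling the confounder out of both log price and log demand before regressing one residual on the other. All names and numbers (`ols_fit`, `residuals`, `orthogonalized_elasticity`, `TRUE_ELASTICITY`, the data-generating rules) are hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals of y after removing its linear fit on x."""
    a, b = ols_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def orthogonalized_elasticity(log_q, log_p, w):
    """Regress residualized log demand on residualized log price."""
    rq = residuals(w, log_q)
    rp = residuals(w, log_p)
    return sum(p * q for p, q in zip(rp, rq)) / sum(p * p for p in rp)


# Synthetic data (assumed): the confounder w moves both price and demand.
TRUE_ELASTICITY = -1.5
w = [i / 10 for i in range(20)]
shock = [((7 * i) % 5 - 2) / 10 for i in range(20)]  # price variation unrelated to demand
log_p = [1.0 + 0.8 * wi + si for wi, si in zip(w, shock)]
log_q = [3.0 + TRUE_ELASTICITY * pi + 2.0 * wi for pi, wi in zip(log_p, w)]

naive = ols_fit(log_p, log_q)[1]  # ignores the confounder
theta_hat = orthogonalized_elasticity(log_q, log_p, w)

print("naive slope:             ", naive)
print("orthogonalized elasticity:", theta_hat)
```

In this toy setting the naive regression mixes the effect of price with the effect of the confounder, while the residual-on-residual regression recovers the elasticity used to generate the data. A forest-based version would, roughly speaking, replace the two linear fits with flexible learners and let the estimate vary with covariates.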

This nep-big issue is ©2021 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.

General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.