nep-big New Economics Papers
on Big Data
Issue of 2022‒01‒24
twenty-six papers chosen by
Tom Coupé
University of Canterbury

  1. Using the Google Places API and Google Trends Data to Develop High Frequency Indicators of Economic Activity By Mr. Marco Marini; Mr. Paul A Austin; James Tebrake; Alberto Sanchez; Chima Simpson-Bell
  2. Another Piece of the Puzzle: Adding Swift Data on Documentary Collections to the Short-Term Forecast of World Trade By Mr. Alexei Goumilevski; Narek Ghazaryan; Aneta Radzikowski; Mr. Joannes Mongardini
  3. Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning By Michael Curry; Alexander Trott; Soham Phade; Yu Bai; Stephan Zheng
  4. Market efficiency in the age of big data By Martin, Ian W.R.; Nagel, Stefan
  5. Estimating economic severity of Air Traffic Flow Management regulations By Luis Delgado; Gérald Gurtner; Tatjana Bolić; Lorenzo Castelli
  6. COVID-19 Forecasts via Stock Market Indicators By Yi Liang; James Unwin
  7. A level-set approach to the control of state-constrained McKean-Vlasov equations: application to renewable energy storage and portfolio selection By Maximilien Germain; Huyên Pham; Xavier Warin
  8. Neural Networks for Delta Hedging By Guijin Son; Joocheol Kim
  9. Learning in Random Utility Models Via Online Decision Problems By Emerson Melo
  10. FinTech Development in Greater Manchester: An Overview By Miglo, Anton
  11. Housing Price Prediction Model Selection Based on Lorenz and Concentration Curves: Empirical Evidence from Tehran Housing Market By Mohammad Mirbagherijam
  12. Deep differentiable reinforcement learning and optimal trading By Thibault Jaisson
  13. Using maps to predict economic activity By Imryoung Jeong; Hyunjoo Yang
  14. People-centric Emission Reduction in Buildings: A Data-driven and Network Topology-based Investigation By Debnath, R.; Bardhan, R.; Mohaddes, K.; Shah, D. U.; Ramage, M. H.; Alvarez, R. M.
  15. A Finite Sample Theorem for Longitudinal Causal Inference with Machine Learning: Long Term, Dynamic, and Mediated Effects By Rahul Singh
  16. Deep Quantile and Deep Composite Model Regression By Tobias Fissler; Michael Merz; Mario V. Wüthrich
  17. Denoised Labels for Financial Time-Series Data via Self-Supervised Learning By Yanqing Ma; Carmine Ventre; Maria Polukarov
  18. Differentiating Approach and Avoidance from Traditional Notions of Sentiment in Economic Contexts By Jacob Turton; Ali Kabiri; David Tuckett; Robert Elliott Smith; David P. Vinson
  19. A methodology for linking the Energy-related Policies of the European Green Deal to the 17 SDGs using Machine Learning By Phoebe Koundouri; Nicolaos Theodossiou; Charalampos Stavridis; Stathis Devves; Angelos Plataniotis
  20. Compiling Granular Population Data Using Geospatial Information By Mitterling, Thomas; Fenz, Katharina; Martinez Jr, Arturo; Bulan, Joseph; Addawe, Mildred; Durante, Ron Lester; Martillan, Marymell
  21. Importance sampling for option pricing with feedforward neural networks By Aleksandar Arandjelović; Thorsten Rheinländer; Pavel V. Shevchenko
  22. The Employment in Innovative Enterprises in Europe By Laureti, Lucio; Costantiello, Alberto; Matarrese, Marco Maria; Leogrande, Angelo
  23. Using Neural Networks to Predict Micro-Spatial Economic Growth By Arman Khachiyan; Anthony Thomas; Huye Zhou; Gordon H. Hanson; Alex Cloninger; Tajana Rosing; Amit Khandelwal
  24. Data-driven integration of regularized mean-variance portfolios By Andrew Butler; Roy H. Kwon
  25. Business Closures and (Re)Openings in Real Time Using Google Places By Thibaut Duprey; Daniel E. Rigobon; Philip Schnattinger; Artur Kotlicki; Soheil Baharian; T. R. Hurd
  26. “An application of deep learning for exchange rate forecasting” By Oscar Claveria; Enric Monte; Petar Sorić; Salvador Torra

  1. By: Mr. Marco Marini; Mr. Paul A Austin; James Tebrake; Alberto Sanchez; Chima Simpson-Bell
    Abstract: As the pandemic heightened policymakers’ demand for more frequent and timely indicators to assess economic activity, traditional data collection and compilation methods for producing official indicators fell short, triggering stronger interest in real-time data that can provide early signals of turning points in economic activity. In this paper, we examine how data extracted from the Google Places API and Google Trends can be used to develop high-frequency indicators aligned with the statistical concepts, classifications, and definitions used in producing official measures. The approach is illustrated with Google-data-derived indicators that predicted the GDP trajectories of selected countries well during the early stage of COVID-19. To this end, we developed a methodological toolkit for national compilers interested in using Google data to enhance the timeliness and frequency of economic indicators.
    Keywords: Reopening, COVID-19, High-Frequency Data, Business Register.
    Date: 2021–12–17
    URL: http://d.repec.org/n?u=RePEc:imf:imfwpa:2021/295&r=
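As a stylised illustration of the abstract's core idea, the sketch below aggregates a synthetic monthly search-intensity index to quarterly averages and fits a one-regressor "bridge" equation to nowcast a quarterly indicator. All data and coefficients are synthetic; this is not the IMF toolkit described in the paper.

```python
import random

random.seed(0)

# Synthetic monthly search-intensity index with a mild upward trend.
months = 36
index = [100 + 0.5 * t + random.gauss(0, 5) for t in range(months)]

# Step 1: aggregate the monthly index to quarterly averages.
quarterly_index = [sum(index[q * 3:(q + 1) * 3]) / 3 for q in range(months // 3)]

# Synthetic "observed" quarterly GDP growth that loads on the index.
gdp = [0.2 + 0.03 * x + random.gauss(0, 0.5) for x in quarterly_index]

# Step 2: fit the one-regressor bridge equation gdp_t = a + b * index_t by OLS.
n = len(gdp)
mx = sum(quarterly_index) / n
my = sum(gdp) / n
b = (sum((x - mx) * (y - my) for x, y in zip(quarterly_index, gdp))
     / sum((x - mx) ** 2 for x in quarterly_index))
a = my - b * mx

# Step 3: nowcast the latest quarter from the latest index reading.
nowcast = a + b * quarterly_index[-1]
```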
  2. By: Mr. Alexei Goumilevski; Narek Ghazaryan; Aneta Radzikowski; Mr. Joannes Mongardini
    Abstract: This paper extends earlier research by adding SWIFT data on documentary collections to the short-term forecast of international trade. While SWIFT documentary collections accounted for just over one percent of world trade financing in 2020, they have strong explanatory power to forecast world trade and national trade in selected economies. The informational content from documentary collections helps improve the forecast of world trade, while a horse race with machine learning algorithms shows significant non-linearities between trade and its determinants during the Covid-19 pandemic.
    Keywords: SWIFT; trade forecast; machine learning
    Date: 2021–12–17
    URL: http://d.repec.org/n?u=RePEc:imf:imfwpa:2021/293&r=
  3. By: Michael Curry; Alexander Trott; Soham Phade; Yu Bai; Stephan Zheng
    Abstract: Real economies can be seen as a sequential imperfect-information game with many heterogeneous, interacting strategic agents of various agent types, such as consumers, firms, and governments. Dynamic general equilibrium models are common economic tools to model economic activity, interactions, and outcomes in such systems. However, existing analytical and computational methods struggle to find explicit equilibria when all agents are strategic and interact, while joint learning is unstable and challenging. A key reason, among others, is that the actions of one economic agent may change the reward function of another agent, e.g., a consumer's expendable income changes when firms change prices or governments change taxes. We show that multi-agent deep reinforcement learning (RL) can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types, in economic simulations with many agents, through the use of structured learning curricula and efficient GPU-only simulation and training. Conceptually, our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing, that are commonly used for analytical tractability. Our GPU implementation enables training and analyzing economies with a large number of agents within reasonable time frames, e.g., training completes within a day. We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government that taxes and redistributes. We validate the learned meta-game epsilon-Nash equilibria through approximate best-response analyses, show that RL policies align with economic intuitions, and that our approach is constructive, e.g., by explicitly learning a spectrum of meta-game epsilon-Nash equilibria in open RBC models.
    Date: 2022–01
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2201.01163&r=
  4. By: Martin, Ian W.R.; Nagel, Stefan
    Abstract: Modern investors face a high-dimensional prediction problem: thousands of observable variables are potentially relevant for forecasting. We reassess the conventional wisdom on market efficiency in light of this fact. In our equilibrium model, N assets have cash flows that are linear in J characteristics, with unknown coefficients. Risk-neutral Bayesian investors learn these coefficients and determine market prices. If J and N are comparable in size, returns are cross-sectionally predictable ex post. In-sample tests of market efficiency reject the no-predictability null with high probability, even though investors use information optimally in real time. In contrast, out-of-sample tests retain their economic meaning.
    Keywords: Bayesian learning; high-dimensional prediction problems; return predictability; out-of-sample tests; Starting Grant 639744; Center for Research in Security Prices
    JEL: G14 G12 C11
    Date: 2021–11–27
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:112960&r=
  5. By: Luis Delgado; Gérald Gurtner; Tatjana Bolić; Lorenzo Castelli
    Abstract: The development of trajectory-based operations and the rolling network operations plan in the European air traffic management network implies a move towards more collaborative, strategic flight planning. This opens up the possibility of including additional information in the collaborative decision-making process. With that in mind, we define an indicator of the economic risk of network elements (e.g., sectors or airports) as the expected costs that the elements impose on airspace users due to Air Traffic Flow Management (ATFM) regulations. The definition of the indicator is based on the analysis of historical ATFM regulation data, which provides an indication of the risk of accruing delay. This risk of delay is translated into a monetary risk for the airspace users, creating the new metric of the economic risk of a given airspace element. We then use machine learning techniques to identify the parameters leading to this economic risk. The metric is accompanied by an indication of the accuracy of the delay cost prediction model. Lastly, the economic risk is transformed into a qualitative economic severity classification. The economic risk, and consequently the economic severity, can be estimated for different temporal horizons and time periods, providing an indicator which can be used by Air Navigation Service Providers to identify areas which might need the implementation of strategic measures (e.g., resectorisation or capacity provision change), and by airspace users to consider operating routes which use specific airspace regions.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.11263&r=
  6. By: Yi Liang; James Unwin
    Abstract: Reliable short-term forecasting can provide potentially lifesaving insights into logistical planning and, in particular, into the optimal allocation of resources such as hospital staff and equipment. By reinterpreting COVID-19 daily cases in terms of candlesticks, we are able to apply some of the most popular stock market technical indicators to obtain predictive power over the course of the pandemic. By providing a quantitative assessment of MACD, RSI, and candlestick analyses, we show their statistical significance in making predictions for both stock market data and WHO COVID-19 data. In particular, we show the utility of this novel approach by considering the identification of the beginnings of subsequent waves of the pandemic. Finally, our new methods are used to assess whether current health policies are impacting the growth in new COVID-19 cases.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.06393&r=
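A minimal version of the technical-indicator idea: treat a daily case series like a price series and compute MACD with the standard 12/26/9 stock-market parameters (an assumption, not necessarily the authors' calibration), then flag bullish crossovers as possible new waves. The case series below is synthetic.

```python
def ema(series, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = [series[0]]
    for x in series[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

# Synthetic daily new-case counts: a sequence of rising and falling waves.
cases = [150 + 40 * (t % 60) / 60 * (1 if (t // 60) % 2 == 0 else -1)
         for t in range(180)]

macd_line = [f - s for f, s in zip(ema(cases, 12), ema(cases, 26))]
signal_line = ema(macd_line, 9)

# Bullish crossovers (MACD rising above its signal line) flag possible new waves.
crossovers = [t for t in range(1, len(cases))
              if macd_line[t - 1] <= signal_line[t - 1]
              and macd_line[t] > signal_line[t]]
```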
  7. By: Maximilien Germain (EDF R&D OSIRIS, EDF R&D, EDF, LPSM); Huyên Pham (LPSM, CREST, FiME Lab); Xavier Warin (EDF R&D OSIRIS, EDF R&D, EDF, FiME Lab)
    Abstract: We consider the control of McKean-Vlasov dynamics (or mean-field control) with probabilistic state constraints. We rely on a level-set approach which provides a representation of the constrained problem in terms of an unconstrained one with exact penalization and running maximum or integral cost. The method is then extended to the common noise setting. Our work extends (Bokanowski, Picarelli, and Zidani, SIAM J. Control Optim. 54.5 (2016), pp. 2568--2593) and (Bokanowski, Picarelli, and Zidani, Appl. Math. Optim. 71 (2015), pp. 125--163) to a mean-field setting. The reformulation as an unconstrained problem is particularly suitable for the numerical resolution of the problem, which is achieved with an extension of a machine learning algorithm from (Carmona, Laurière, arXiv:1908.01613, to appear in Ann. Appl. Prob., 2019). A first application concerns the storage of renewable electricity in the presence of mean-field price impact, and another one focuses on a mean-variance portfolio selection problem with probabilistic constraints on the wealth. We also illustrate our approach for a direct numerical resolution of the primal Markowitz continuous-time problem without relying on duality.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.11059&r=
  8. By: Guijin Son; Joocheol Kim
    Abstract: The Black-Scholes model, defined under the assumption of a perfect financial market, theoretically creates a flawless hedging strategy allowing the trader to evade risks in a portfolio of options. However, the concept of a "perfect financial market," which requires zero transaction costs and continuous trading, is challenging to meet in the real world. Despite such widely known limitations, academics have failed to develop alternative models successful enough to be long-established. In this paper, we explore the landscape of Deep Neural Network (DNN) based hedging systems by testing the hedging capacity of the following neural architectures: Recurrent Neural Networks, Temporal Convolutional Networks, Attention Networks, and Span Multi-Layer Perceptron Networks. In addition, we attempt to achieve even more promising results by combining traditional derivative hedging models with DNN-based approaches. Lastly, we construct NNHedge, a deep learning framework that provides seamless pipelines for model development and assessment for the experiments.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.10084&r=
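For context on the benchmark the neural hedgers are measured against, the sketch below simulates classical Black-Scholes delta hedging of a short call with discrete rebalancing (r = 0; all parameters illustrative). The residual P&L shrinks as rebalancing becomes more frequent; it is this discrete-market residual that DNN-based hedgers aim to reduce.

```python
import math
import random

random.seed(6)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, sigma, tau):
    """Black-Scholes call price with r = 0."""
    d1 = (math.log(s / k) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    return s * norm_cdf(d1) - k * norm_cdf(d1 - sigma * math.sqrt(tau))

def bs_delta(s, k, sigma, tau):
    d1 = (math.log(s / k) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1)

def hedging_error(n_steps, s0=100.0, k=100.0, sigma=0.2, t=1.0):
    """Terminal P&L of a delta-hedged short call on one simulated path."""
    dt = t / n_steps
    s, cash, pos = s0, bs_call(s0, k, sigma, t), 0.0   # sell call, bank premium
    for i in range(n_steps):
        new_pos = bs_delta(s, k, sigma, t - i * dt)
        cash -= (new_pos - pos) * s                     # rebalance the hedge
        pos = new_pos
        s *= math.exp(-0.5 * sigma ** 2 * dt + sigma * math.sqrt(dt) * random.gauss(0, 1))
    return cash + pos * s - max(s - k, 0.0)             # liquidate, pay payoff

def rms(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

coarse = rms([hedging_error(12) for _ in range(300)])   # monthly rebalancing
fine = rms([hedging_error(250) for _ in range(300)])    # daily rebalancing
```

With continuous rebalancing the error would vanish; the gap between `coarse` and `fine` is the room in which learned hedging strategies compete.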
  9. By: Emerson Melo
    Abstract: This paper studies the Random Utility Model (RUM) in environments where the decision maker is imperfectly informed about the payoffs associated with each of the alternatives he faces. By embedding the RUM into an online decision problem, we make four contributions. First, we propose a gradient-based learning algorithm and show that a large class of RUMs are Hannan consistent (Hannan, 1957); that is, the average difference between the expected payoffs generated by a RUM and that of the best fixed policy in hindsight goes to zero as the number of periods increases. Second, we show that the class of Generalized Extreme Value (GEV) models can be implemented with our learning algorithm. Examples in the GEV class include the Nested Logit, Ordered, and Product Differentiation models, among many others. Third, we show that our gradient-based algorithm is the dual, in a convex analysis sense, of the Follow the Regularized Leader (FTRL) algorithm, which is widely used in the Machine Learning literature. Finally, we discuss how our approach can incorporate recency bias and be used to implement prediction markets in general environments.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.10993&r=
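The logit special case of the RUM-FTRL duality described above is easy to sketch: softmax choice over cumulative payoffs is FTRL with an entropic regularizer, i.e. the exponential-weights algorithm, whose average regret against the best fixed action vanishes (Hannan consistency). The payoffs and step size `eta` below are illustrative, not from the paper.

```python
import math
import random

random.seed(1)

n_actions, T, eta = 3, 2000, 0.05
cum_payoff = [0.0] * n_actions
expected_alg_payoff = 0.0

for t in range(T):
    # Softmax (multinomial logit) choice probabilities over cumulative payoffs:
    # FTRL with an entropy regularizer, a.k.a. exponential weights.
    weights = [math.exp(eta * c) for c in cum_payoff]
    z = sum(weights)
    probs = [w / z for w in weights]

    # Random payoffs this round; action 0 is best on average.
    payoffs = [random.random() + (0.2 if i == 0 else 0.0) for i in range(n_actions)]

    expected_alg_payoff += sum(p * u for p, u in zip(probs, payoffs))
    cum_payoff = [c + u for c, u in zip(cum_payoff, payoffs)]

# Average regret against the best fixed action in hindsight vanishes as T grows.
avg_regret = (max(cum_payoff) - expected_alg_payoff) / T
```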
  10. By: Miglo, Anton
    Abstract: This article analyzes the patterns of FinTech development in Greater Manchester, UK. Manchester is often called a northern capital of FinTech. We analyze different subsectors of FinTech and find that sectors such as payments, FinTech loans, debt-based, reward-based and real-estate-based crowdfunding, big data analytics, data security, insurtech and regtech are the fastest-growing areas. We also compare the FinTech structure in Manchester with that in London and other major cities in the UK and identify similarities and differences.
    Keywords: FinTech, cryptocurrencies, digital finance, crowdfunding, Fintech in Manchester, data security
    JEL: G00 G10 G32 O33
    Date: 2022–01–02
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:111348&r=
  11. By: Mohammad Mirbagherijam
    Abstract: This study contributes to house price prediction model selection in Tehran City, based on the area between the Lorenz curve (LC) and the concentration curve (CC) of predicted prices, using 206,556 observed transactions over the period from March 21, 2018, to February 19, 2021. Several methods, including generalized linear models (GLM), recursive partitioning and regression trees (RPART), random forest (RF) regression models, and neural network (NN) models, were examined for house price prediction. We used a randomly chosen 90% of the data sample to estimate the parameters of the pricing models and the remaining 10% to test prediction accuracy. Results showed that the area between the LC and CC curves (known as the ABC criterion) of real and predicted prices in the test sample was smaller for the random forest regression model than for the other models under study. Comparison of the calculated ABC criteria leads us to conclude that nonlinear regression models such as RF give accurate predictions of house prices in Tehran City.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.06192&r=
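Our reading of the ABC criterion can be sketched as follows: the Lorenz curve accumulates actual prices ordered by actual price, the concentration curve accumulates them ordered by predicted price, and the gap between the two curves shrinks as the ranking implied by the predictions approaches the true ranking. The data are synthetic and the gap measure is a simple approximation, not the author's code.

```python
import random

random.seed(2)

def cumulative_share(values, order):
    """Cumulative share of `values`, accumulated in the given visiting order."""
    total = sum(values)
    cum, out = 0.0, [0.0]
    for i in order:
        cum += values[i]
        out.append(cum / total)
    return out

def abc(actual, predicted):
    """Mean absolute gap between the Lorenz and concentration curves."""
    n = len(actual)
    lorenz = cumulative_share(actual, sorted(range(n), key=lambda i: actual[i]))
    concentration = cumulative_share(actual, sorted(range(n), key=lambda i: predicted[i]))
    return sum(abs(l - c) for l, c in zip(lorenz, concentration)) / n

actual = [random.lognormvariate(0, 0.5) for _ in range(500)]
good = [a * random.uniform(0.95, 1.05) for a in actual]    # accurate model
bad = [random.lognormvariate(0, 0.5) for _ in range(500)]  # uninformative model

abc_good = abc(actual, good)   # nearly preserves the true ranking: small gap
abc_bad = abc(actual, bad)     # unrelated ranking: large gap
```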
  12. By: Thibault Jaisson
    Abstract: In this article we introduce the differentiable reinforcement learning framework. It is based on the fact that in many reinforcement learning applications, the environment reward and transition functions are not black boxes but known differentiable functions. Incorporating deep learning in this framework we find more accurate and stable solutions than more generic actor critic algorithms. We apply this deep differentiable reinforcement learning (DDRL) algorithm to the problem of optimal trading strategies in various environments where the market dynamics are known. Thanks to the stability of this method, we are able to efficiently find optimal strategies for complex multi-scale market models and for a wide range of environment parameters. This makes it applicable to real life financial signals and portfolio optimization where the expected return has multiple time scales. In the case of a slow and a fast alpha signal, we find that the optimal trading strategy consists in using the fast signal to time the trades associated to the slow signal.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.02944&r=
  13. By: Imryoung Jeong; Hyunjoo Yang
    Abstract: We introduce a novel machine learning approach to leverage historical and contemporary maps to systematically predict economic statistics. Remote sensing data have been used as reliable proxies for local economic activity. However, they have only become available in recent years, thus limiting their applicability for long-term analysis. Historical maps, on the other hand, date back several decades. Our simple algorithm extracts meaningful features from the maps based on their color compositions. The grid-level population predictions by our approach outperform the conventional CNN-based predictions using raw map images. It also predicts population better than other approaches using night light satellite images or land cover classifications as the input for predictions.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.13850&r=
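The color-composition idea can be illustrated with a toy feature extractor that reduces a map tile (here a list of RGB pixels) to the share of pixels falling in a few coarse color bins. The bins, thresholds, and colors below are hypothetical stand-ins for the paper's richer features.

```python
from collections import Counter

def color_bin(rgb):
    """Map an RGB pixel to a coarse, hypothetical map-legend bin."""
    r, g, b = rgb
    if r > 150 and g > 150 and b > 150:
        return "built_up"    # light grey blocks on many map styles
    if g > max(r, b):
        return "vegetation"
    if b > max(r, g):
        return "water"
    return "other"

def features(tile):
    """Color-composition feature vector: share of pixels per bin."""
    counts = Counter(color_bin(px) for px in tile)
    return {k: counts.get(k, 0) / len(tile)
            for k in ("built_up", "vegetation", "water", "other")}

# Two synthetic 100-pixel "tiles": mostly grey (urban) vs mostly green (rural).
urban = features([(200, 200, 200)] * 80 + [(40, 120, 60)] * 20)
rural = features([(200, 200, 200)] * 10 + [(40, 120, 60)] * 90)
```

Feature vectors like `urban` and `rural` would then feed a grid-level population regression.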
  14. By: Debnath, R.; Bardhan, R.; Mohaddes, K.; Shah, D. U.; Ramage, M. H.; Alvarez, R. M.
    Abstract: There is a growing consensus among policymakers that we need a human-centric low-carbon transition. There are few studies on how to do this effectively, especially in the context of emissions reduction in the building sector. It is critical to investigate public sentiment and attitudes towards this aspect of climate action, as the building and construction sector accounts for 40% of global carbon emissions. Our methodology involves a multi-method approach: a data-driven exploration of public sentiment based on 256,717 tweets containing #emission and #building between 2009 and 2021. Using graph theory-led metrics, a network topology-based investigation of hashtag co-occurrences was used to extract highly influential hashtags. Our results show that public sentiment is reactive to global climate policy events. Between 2009 and 2012, #greenbuilding and #emissions were highly influential, shaping the public discourse towards climate action. In 2013-2016, #lowcarbon, #construction and #energyefficiency had high centrality scores; these were replaced by hashtags like #climatetec, #netzero, #climateaction, #circulareconomy, #masstimber and #climatejustice in 2017-2021. The results suggest that the current building emission reduction context emphasises the social and environmental justice dimensions, which is pivotal to effective people-centric policymaking.
    Keywords: Emission, climate change, building, computational social science, people-centric transition, Twitter
    JEL: C63 Q54
    Date: 2022–01–05
    URL: http://d.repec.org/n?u=RePEc:cam:camdae:2202&r=
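A minimal version of the hashtag co-occurrence network: each tweet's hashtags form a clique, and weighted degree counts flag the most connected tags. The toy tweets below stand in for the 256,717 analysed in the paper, and the authors use richer graph-theoretic centrality metrics than plain degree.

```python
from collections import defaultdict
from itertools import combinations

# Toy tweets: each is the set of hashtags it contains.
tweets = [
    {"#emission", "#building", "#netzero"},
    {"#netzero", "#climateaction"},
    {"#building", "#netzero", "#masstimber"},
    {"#emission", "#netzero"},
]

# Each tweet's hashtags form a clique; accumulate weighted degrees.
degree = defaultdict(int)
for tags in tweets:
    for a, b in combinations(sorted(tags), 2):
        degree[a] += 1
        degree[b] += 1

most_central = max(degree, key=degree.get)
```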
  15. By: Rahul Singh
    Abstract: I construct and justify confidence intervals for longitudinal causal parameters estimated with machine learning. Longitudinal parameters include long term, dynamic, and mediated effects. I provide a nonasymptotic theorem for any longitudinal causal parameter estimated with any machine learning algorithm that satisfies a few simple, interpretable conditions. The main result encompasses local parameters defined for specific demographics as well as proximal parameters defined in the presence of unobserved confounding. Formally, I prove consistency, Gaussian approximation, and semiparametric efficiency. The rate of convergence is $n^{-1/2}$ for global parameters, and it degrades gracefully for local parameters. I articulate a simple set of conditions to translate mean square rates into statistical inference. A key feature of the main result is a new multiple robustness to ill posedness for proximal causal inference in longitudinal settings.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.14249&r=
  16. By: Tobias Fissler; Michael Merz; Mario V. Wüthrich
    Abstract: A main difficulty in actuarial claim size modeling is that there is no simple off-the-shelf distribution that simultaneously provides a good distributional model for the main body and the tail of the data. In particular, covariates may have different effects for small and for large claim sizes. To cope with this problem, we introduce a deep composite regression model whose splicing point is given in terms of a quantile of the conditional claim size distribution rather than a constant. To facilitate M-estimation for such models, we introduce and characterize the class of strictly consistent scoring functions for the triplet consisting of a quantile, as well as the lower and upper expected shortfall beyond that quantile. In a second step, this elicitability result is applied to fit deep neural network regression models. We demonstrate the applicability of our approach and its superiority over classical approaches on a real accident insurance data set.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.03075&r=
  17. By: Yanqing Ma; Carmine Ventre; Maria Polukarov
    Abstract: The introduction of electronic trading platforms effectively changed the organisation of traditional systemic trading from quote-driven markets into order-driven markets. Its convenience led to an exponentially increasing amount of financial data, which is however hard to use for the prediction of future prices, due to the low signal-to-noise ratio and the non-stationarity of financial time series. Simpler classification tasks, where the goal is to predict the direction of future price movements via supervised learning algorithms, need sufficiently reliable labels to generalise well. Labelling financial data is however less well defined than in other domains: did the price go up because of noise or because of signal? The existing labelling methods have limited countermeasures against noise and limited effects in improving learning algorithms. This work takes inspiration from image classification in trading and from the success of self-supervised learning. We investigate the idea of applying computer vision techniques to financial time series to reduce noise exposure and hence generate correct labels. We look at label generation as the pretext task of a self-supervised learning approach and compare the naive (and noisy) labels, commonly used in the literature, with the labels generated by a denoising autoencoder for the same downstream classification task. Our results show that our denoised labels improve the performance of the downstream learning algorithm, for both small and large datasets. We further show that the signals we obtain can be used to effectively trade with binary strategies. We suggest that, with the proposed techniques, self-supervised learning constitutes a powerful framework for generating "better" financial labels that are useful for studying the underlying patterns of the market.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.10139&r=
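Illustrative only: the paper denoises with a self-supervised autoencoder, whereas below a simple moving average stands in for the denoiser, to show why naive up/down labels on a noisy series differ from labels computed on a denoised version. The price series is a synthetic trend plus noise.

```python
import random

random.seed(3)

# Synthetic upward-trending price series with additive noise.
price = [0.02 * t + random.gauss(0, 0.2) for t in range(200)]

def direction_labels(series):
    """Naive labels: 1 if the next observation is higher, else 0."""
    return [1 if b > a else 0 for a, b in zip(series, series[1:])]

def moving_average(series, window=10):
    """Trailing moving average as a stand-in denoiser."""
    return [sum(series[max(0, i - window + 1):i + 1])
            / (i - max(0, i - window + 1) + 1) for i in range(len(series))]

naive = direction_labels(price)
denoised = direction_labels(moving_average(price))

# On a trending series, the denoised labels recover the trend far more often.
share_up_naive = sum(naive) / len(naive)
share_up_denoised = sum(denoised) / len(denoised)
```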
  18. By: Jacob Turton; Ali Kabiri; David Tuckett; Robert Elliott Smith; David P. Vinson
    Abstract: There is growing interest in the role of sentiment in economic decision-making. However, most research on the subject has focused on positive and negative valence. Conviction Narrative Theory (CNT) places Approach and Avoidance sentiment (that which drives action) at the heart of real-world decision-making, and argues that it better captures emotion in financial markets. This research, bringing together psychology and machine learning, introduces new techniques to differentiate Approach and Avoidance from positive and negative sentiment on a fundamental level of meaning. It does this by comparing word-lists, previously constructed to capture these concepts in text data, across a large range of semantic features. The results demonstrate that Avoidance in particular is well defined as a separate type of emotion, which is evaluative/cognitive and action-orientated in nature. Refining the Avoidance word-list according to these features improves macroeconomic models, suggesting that they capture the essence of Avoidance and that it plays a crucial role in driving real-world economic decision-making.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.02607&r=
  19. By: Phoebe Koundouri; Nicolaos Theodossiou; Charalampos Stavridis; Stathis Devves; Angelos Plataniotis
    Abstract: The European Green Deal (EGD) was published in December 2019 with the ambition of being Europe's new growth strategy, making it climate neutral by 2050, and ensuring its citizens a sustainable, prosperous, and inclusive future. The energy sector is central to this ambition, as the European Commission's objectives are, among others, to increase the efficiency of energy production by establishing a fully integrated, interconnected, and digitalized EU energy market. The EGD was the starting point for the publication of a large number of Policy and Strategy documents for achieving Sustainability in Europe. One of the first attempts to systematically correlate the policy areas of the European Green Deal with the 17 Sustainable Development Goals (SDGs) was made in the first report of the UN SDSN's Senior Working Group for the Joint Implementation of the SDGs and the EGD, which was published in February 2021, where the EGD framework was linked to each of the 17 SDGs using textual analysis. Building on this methodology, in this chapter we extend the manual linkage of policy texts to SDGs, by using Natural Language Processing and Machine Learning techniques to automate it, focusing on Energy-related documents derived by the EGD.
    Keywords: European Green Deal, Policies, Sustainable Development Goals, Deep Learning, Natural Language Processing, Semantics.
    Date: 2022–01–17
    URL: http://d.repec.org/n?u=RePEc:aue:wpaper:2202&r=
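A toy version of the linkage step: scoring a policy passage against short SDG descriptions by cosine similarity of bag-of-words vectors. The paper's NLP pipeline is far richer; the policy text and one-line SDG summaries below are illustrative.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# Hypothetical one-line SDG summaries and a hypothetical policy passage.
sdgs = {
    "SDG7": "affordable and clean energy for all",
    "SDG13": "urgent action to combat climate change",
    "SDG4": "inclusive and equitable quality education",
}
policy = "an integrated interconnected energy market with clean affordable energy"

scores = {k: cosine(bow(policy), bow(text)) for k, text in sdgs.items()}
best_sdg = max(scores, key=scores.get)
```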
  20. By: Mitterling, Thomas (World Data Lab); Fenz, Katharina (World Data Lab); Martinez Jr, Arturo (Asian Development Bank); Bulan, Joseph (Asian Development Bank); Addawe, Mildred (Asian Development Bank); Durante, Ron Lester (Asian Development Bank); Martillan, Marymell (Asian Development Bank)
    Abstract: Granular spatial information on the distribution of human populations is relevant to a variety of fields such as health, economics, and other areas of public sector planning. This paper applies ensemble methods and aims to assess their applicability to analyzing and forecasting population density at the grid level. In a first step, we use a Random Forest approach to estimate population density in the Philippines and Thailand on a 100-meter by 100-meter grid. Second, we use different specifications of Random Forest and Bayesian model averaging techniques to create forecasts of grid-level population density in three Thai provinces and evaluate their predictive power.
    Keywords: population mapping; big data; random forest estimation; Philippines; Thailand
    JEL: C19 D30 O15
    Date: 2021–12–31
    URL: http://d.repec.org/n?u=RePEc:ris:adbewp:0643&r=
  21. By: Aleksandar Arandjelović; Thorsten Rheinländer; Pavel V. Shevchenko
    Abstract: We study the problem of reducing the variance of Monte Carlo estimators through performing suitable changes of the sampling measure which are induced by feedforward neural networks. To this end, building on the concept of vector stochastic integration, we characterize the Cameron-Martin spaces of a large class of Gaussian measures which are induced by vector-valued continuous local martingales with deterministic covariation. We prove that feedforward neural networks enjoy, up to an isometry, the universal approximation property in these topological spaces. We then prove that sampling measures which are generated by feedforward neural networks can approximate the optimal sampling measure arbitrarily well. We conclude with a numerical study pricing path-dependent European basket and barrier options in the case of Black-Scholes and several stochastic volatility models for the underlying multivariate asset.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.14247&r=
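The variance-reduction problem in its classical form: pricing a deep out-of-the-money call by Monte Carlo under a mean-shifted Gaussian sampling measure, correcting with the likelihood ratio. The paper learns the change of measure with a feedforward network; the constant drift shift `theta` below is a hand-picked stand-in, and all market parameters are illustrative.

```python
import math
import random

random.seed(5)

s0, k, r, sigma, t = 100.0, 150.0, 0.0, 0.2, 1.0

def payoff(z):
    """Discounted call payoff as a function of the standard normal draw z."""
    s_t = s0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
    return math.exp(-r * t) * max(s_t - k, 0.0)

# Shift the sampling mean so terminal prices centre near the strike.
theta = (math.log(k / s0) + 0.5 * sigma ** 2 * t) / (sigma * math.sqrt(t))

n = 20000
plain = [payoff(random.gauss(0, 1)) for _ in range(n)]
shifted = []
for _ in range(n):
    x = random.gauss(theta, 1)
    # Likelihood-ratio correction for sampling from N(theta, 1) instead of N(0, 1).
    shifted.append(payoff(x) * math.exp(-theta * x + 0.5 * theta ** 2))

def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

m_plain, v_plain = mean_var(plain)   # naive estimator: mostly zero payoffs
m_is, v_is = mean_var(shifted)       # importance-sampling estimator
```

Both estimators target the same price, but the shifted measure puts most samples where the payoff is nonzero, so the per-sample variance drops sharply.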
  22. By: Laureti, Lucio; Costantiello, Alberto; Matarrese, Marco Maria; Leogrande, Angelo
    Abstract: In this article we evaluate the determinants of the Employment in Innovative Enterprises in Europe. We use data from the European Innovation Scoreboard of the European Commission for 36 countries in the period 2000-2019 with Panel Data with Fixed Effects, Panel Data with Random Effects, Dynamic Panel, WLS and Pooled OLS. We found that the “Employment in Innovative Enterprises in Europe” is positively associated with “Broadband Penetration in Europe”, “Foreign Controlled Enterprises Share of Value Added”, “Innovation Index”, “Medium and High-Tech Product Exports” and negatively associated to “Basic School Entrepreneurial Education and Training”, “International Co-Publications”, and “Marketing or Organizational Innovators”. Secondly, we perform a cluster analysis with the k-Means algorithm optimized with the Silhouette Coefficient and we found the presence of four different clusters. Finally, we perform a comparison among eight different machine learning algorithms to predict the level of “Employment in Innovative Enterprises” in Europe and we found that the Linear Regression is the best predictor.
    Keywords: Innovation and Invention: Processes and Incentives; Management of Technological Innovation and R&D; Technological Change: Choices and Consequences • Diffusion Processes; Intellectual Property and Intellectual Capital.
    JEL: O30 O31 O32 O33 O34
    Date: 2022–01–01
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:111335&r=
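The clustering step described in the abstract, k-Means with the number of clusters chosen by the Silhouette Coefficient, can be sketched as follows. Synthetic two-dimensional blobs stand in for the European Innovation Scoreboard indicators actually used in the paper; the selection logic is the same.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Plant four well-separated clusters, then let the silhouette criterion
# pick k: fit k-Means for each candidate k and keep the k with the
# highest average silhouette score.
centers = [[0, 0], [6, 6], [0, 6], [6, 0]]
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=0.5, random_state=0)

scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(best_k)  # recovers the 4 planted clusters
```

On real indicator data the silhouette curve is rarely this clean; the paper reports that it settles on four clusters of countries.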
  23. By: Arman Khachiyan; Anthony Thomas; Huye Zhou; Gordon H. Hanson; Alex Cloninger; Tajana Rosing; Amit Khandelwal
    Abstract: We apply deep learning to daytime satellite imagery to predict changes in income and population at high spatial resolution in US data. For grid cells with lateral dimensions of 1.2km and 2.4km (where the average US county has dimension of 55.6km), our model predictions achieve R2 values of 0.85 to 0.91 in levels, which far exceed the accuracy of existing models, and 0.32 to 0.46 in decadal changes, which have no counterpart in the literature and are 3-4 times larger than for commonly used nighttime lights. Our network has wide application for analyzing localized shocks.
    JEL: R0
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:29569&r=
  24. By: Andrew Butler; Roy H. Kwon
    Abstract: Mean-variance optimization (MVO) is known to be highly sensitive to estimation error in its inputs. Recently, norm penalization of MVO programs has proven to be an effective regularization technique that can help mitigate the adverse effects of estimation error. In this paper, we augment the standard MVO program with a convex combination of parameterized $L_1$ and $L_2$ norm penalty functions. The resulting program is a parameterized penalized quadratic program (PPQP) whose primal and dual forms are shown to be constrained quadratic programs (QPs). We make use of recent advances in neural-network architecture for differentiable QPs and present a novel, data-driven stochastic optimization framework for optimizing parameterized regularization structures in the context of the final decision-based MVO problem. The framework is highly flexible and capable of jointly optimizing both prediction and regularization model parameters in a fully integrated manner. We provide several historical simulations using global futures data and highlight the benefits and flexibility of the stochastic optimization approach.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2112.07016&r=
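The $L_2$ half of the penalized program above has a simple closed form, which makes the shrinkage effect easy to see. This sketch covers only that ridge term: the paper's full model adds an $L_1$ penalty (turning the problem into a QP) and learns the penalty weights by differentiating through the QP solution, none of which is reproduced here. The covariance and return inputs below are random placeholders.

```python
import numpy as np

# Ridge-penalized MVO:  min_w  0.5 * w' Sigma w - mu' w + 0.5 * delta * ||w||^2
# First-order condition gives the closed form  w = (Sigma + delta * I)^(-1) mu,
# so increasing delta shrinks every eigen-component of w toward zero.
rng = np.random.default_rng(1)
n_assets = 8
A = rng.standard_normal((n_assets, n_assets))
Sigma = A @ A.T / n_assets + 0.1 * np.eye(n_assets)  # noisy but PD covariance
mu = 0.05 * rng.standard_normal(n_assets)            # estimated expected returns

def ridge_mvo(delta):
    return np.linalg.solve(Sigma + delta * np.eye(n_assets), mu)

w_light = ridge_mvo(0.01)
w_heavy = ridge_mvo(1.0)
print(np.linalg.norm(w_light), np.linalg.norm(w_heavy))
```

Heavier penalization yields smaller, more conservative weights, which is exactly how the norm penalty dampens the effect of estimation error in $\mu$ and $\Sigma$.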
  25. By: Thibaut Duprey; Daniel E. Rigobon; Philip Schnattinger; Artur Kotlicki; Soheil Baharian; T. R. Hurd
    Abstract: We present a new method to measure business opening and closure rates using real-time information from Google Places, the dataset behind the Google Maps service. Our Canadian application confirms the importance of temporary closures and reopenings during the COVID-19 pandemic. Over 50% of the temporarily closed food and retail businesses during the April 2021 lockdown reopened by the end of September. Our estimates align well with the timing of COVID-19 restrictions and are validated by a survey of recently opened businesses. Our framework provides policy-makers with a tool for the timely monitoring of business dynamics.
    Keywords: Firm dynamics; Recent economic and financial developments
    JEL: D22 E32 C55 C81
    Date: 2022–01
    URL: http://d.repec.org/n?u=RePEc:bca:bocawp:22-1&r=
  26. By: Oscar Claveria (AQR-IREA, University of Barcelona); Enric Monte (Polytechnic University of Catalunya); Petar Soric (University of Zagreb); Salvador Torra (Riskcenter-IREA, University of Barcelona)
    Abstract: This paper examines the performance of several state-of-the-art deep learning techniques for exchange rate forecasting: a deep feedforward network, a convolutional network and a long short-term memory network. On the one hand, the configuration of the different architectures is clearly detailed, as well as the tuning of the parameters and the regularisation techniques used to avoid overfitting. On the other hand, we design an out-of-sample forecasting experiment and evaluate the accuracy of the three deep neural networks in predicting the US/UK foreign exchange rate in the days after Brexit took effect. Of the three configurations, we obtain the best results with the deep feedforward architecture. Although all three architectures generate more accurate predictions than the benchmark time-series models, the results vary considerably depending on the specific topology used in each case. These findings hint at the potential of deep learning techniques, but they also highlight the importance of properly configuring, implementing and selecting the topology.
    Keywords: Forecasting, Exchange rates, Deep learning, Deep neural networks, Convolutional networks, Long short-term memory
    JEL: C45 C58 E47 F31 G17
    Date: 2022–01
    URL: http://d.repec.org/n?u=RePEc:aqr:wpaper:202201&r=
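The out-of-sample design described in the abstract, a feedforward network fit on lagged values and evaluated on a held-out tail of the series, can be sketched as follows. A synthetic AR(1) process stands in for the US/UK exchange rate, and the architecture and tuning here are illustrative only, not the paper's configuration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

# Simulate a persistent series (placeholder for the exchange rate).
rng = np.random.default_rng(0)
n, phi = 1200, 0.95
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + 0.1 * rng.standard_normal()

# Supervised framing: predict y[t] from its previous `lags` values.
lags = 5
X = np.column_stack([y[i:n - lags + i] for i in range(lags)])
target = y[lags:]
split = int(0.8 * len(target))  # chronological split: no look-ahead

model = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=3000, random_state=0)
model.fit(X[:split], target[:split])
oos_r2 = r2_score(target[split:], model.predict(X[split:]))
print(round(oos_r2, 3))
```

The chronological train/test split is the essential design choice: shuffling the observations, as in standard cross-validation, would leak future information into the training set and overstate forecast accuracy.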

This nep-big issue is ©2022 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.