nep-big New Economics Papers
on Big Data
Issue of 2020‒05‒11
twenty-six papers chosen by
Tom Coupé
University of Canterbury

  1. Differential Machine Learning By Antoine Savine; Brian Huge
  2. A Time Series Analysis-Based Stock Price Prediction Using Machine Learning and Deep Learning Models By Sidra Mehtab; Jaydip Sen
  3. Best Practices for Artificial Intelligence in Life Sciences Research By Makarov, Vladimir; Stouch, Terry; Allgood, Brandon; Willis, Christopher; Lynch, Nick
  4. Machine Learning Econometrics: Bayesian algorithms and methods By Dimitris Korobilis; Davide Pettenuzzo
  5. Neural Networks and Value at Risk By Alexander Arimond; Damian Borth; Andreas Hoepner; Michael Klawunn; Stefan Weisheit
  6. A machine learning approach to portfolio pricing and risk management for high-dimensional problems By Lucio Fernandez Arjona; Damir Filipović
  7. Tracking the digital footprint in Latin America and the Caribbean: Lessons learned from using big data to assess the digital economy By -
  8. Sequential hypothesis testing in machine learning driven crude oil jump detection By Michael Roberts; Indranil SenGupta
  9. Long short-term memory networks and laglasso for bond yield forecasting: Peeping inside the black box By Manuel Nunes; Enrico Gerding; Frank McGroarty; Mahesan Niranjan
  10. Comparing conventional and machine-learning approaches to risk assessment in domestic abuse cases By Grogger, Jeffrey; Ivandic, Ria; Kirchmaier, Thomas
  11. Hedging and machine learning driven crude oil data analysis using a refined Barndorff-Nielsen and Shephard model By Humayra Shoshi; Indranil SenGupta
  12. ESG2Risk: A Deep Learning Framework from ESG News to Stock Volatility Prediction By Tian Guo; Nicolas Jamet; Valentin Betrix; Louis-Alexandre Piquet; Emmanuel Hauptmann
  13. Environmental Economics and Uncertainty: Review and a Machine Learning Outlook By Ruda Zhang; Patrick Wingo; Rodrigo Duran; Kelly Rose; Jennifer Bauer; Roger Ghanem
  14. Multimarket Contact and Collusion in Online Retail By Poppius, Hampus
  15. A machine learning approach to portfolio pricing and risk management for high-dimensional problems By Lucio Fernandez-Arjona; Damir Filipović
  16. Managing Intelligence: Skilled Experts and AI in Markets for Complex Products By Jonathan Gruber; Benjamin R. Handel; Samuel H. Kina; Jonathan T. Kolstad
  17. A generative adversarial network approach to calibration of local stochastic volatility models By Christa Cuchiero; Wahid Khosrawi; Josef Teichmann
  18. Do Female Role Models Reduce the Gender Gap in Science? Evidence from French High Schools By Breda, Thomas; Grenet, Julien; Monnet, Marion; Van Effenterre, Clémentine
  19. A neural network model for solvency calculations in life insurance By Lucio Fernandez-Arjona
  20. How average is average? Temporal patterns in human behaviour as measured by mobile phone data -- or why chose Thursdays By Marina Toger; Ian Shuttleworth; John Östh
  21. On the Equivalence of Neural and Production Networks By Roy Gernhardt; Bjorn Persson
  22. The technological contest between China and the United States By Toro Hardy, Alfredo
  23. Hedging with Neural Networks By Johannes Ruf; Weiguan Wang
  24. Deep xVA solver -- A neural network based counterparty credit risk management framework By Alessandro Gnoatto; Athena Picarelli; Christoph Reisinger
  25. Flooded cities By Kocornik-Mina, Adriana; McDermott, Thomas K.J.; Michaels, Guy; Rauch, Ferdinand
  26. A Multialternative Neural Decision Process By Simone Cerreia-Vioglio; Fabio Maccheroni; Massimo Marinacci

  1. By: Antoine Savine; Brian Huge
    Abstract: Differential machine learning (ML) extends supervised learning, with models trained on examples of not only inputs and labels, but also differentials of labels with respect to inputs. Differential ML is applicable in all situations where high-quality first-order derivatives with respect to training inputs are available. In the context of financial Derivatives risk management, pathwise differentials are efficiently computed with automatic adjoint differentiation (AAD). Differential ML, combined with AAD, provides extremely effective pricing and risk approximations. We can produce fast pricing analytics in models too complex for closed-form solutions, extract the risk factors of complex transactions and trading books, and effectively compute risk management metrics like reports across a large number of scenarios, backtesting and simulation of hedge strategies, or capital regulations. The article focuses on differential deep learning (DL), arguably the strongest application. Standard DL trains neural networks (NNs) on pointwise examples, whereas differential DL teaches them the shape of the target function, resulting in vastly improved performance, illustrated with a number of numerical examples, both idealized and real-world. In the online appendices, we apply differential learning to other ML models, like classic regression or principal component analysis (PCA), with equally remarkable results. This paper is meant to be read in conjunction with its companion GitHub repo https://github.com/differential-machine-learning, where we posted a TensorFlow implementation, tested on Google Colab, along with examples from the article and additional ones. We also posted appendices covering many practical implementation details not covered in the paper, mathematical proofs, applications to ML models besides neural networks, and extensions necessary for a reliable implementation in production.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.02347&r=all
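    A minimal sketch of the core training idea, on synthetic data and with an assumed PyTorch setup (the authors' own TensorFlow implementation is in the linked repo): the loss penalizes errors in both the predicted values and the predicted derivatives with respect to the inputs.

      import torch

      # toy labels y = x^2 with known pathwise differentials dy/dx = 2x
      torch.manual_seed(0)
      x = torch.linspace(-2.0, 2.0, 256).unsqueeze(1)
      y = x ** 2
      dydx = 2 * x

      net = torch.nn.Sequential(
          torch.nn.Linear(1, 32), torch.nn.Softplus(),  # smooth activation, so derivatives are informative
          torch.nn.Linear(32, 1),
      )
      opt = torch.optim.Adam(net.parameters(), lr=1e-2)

      for step in range(500):
          opt.zero_grad()
          x_in = x.clone().requires_grad_(True)
          y_hat = net(x_in)
          # derivative of the predictions w.r.t. the inputs, kept in the graph
          dydx_hat, = torch.autograd.grad(y_hat.sum(), x_in, create_graph=True)
          # differential ML: fit values and differentials jointly
          loss = ((y_hat - y) ** 2).mean() + ((dydx_hat - dydx) ** 2).mean()
          loss.backward()
          opt.step()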
  2. By: Sidra Mehtab; Jaydip Sen
    Abstract: Prediction of the future movement of stock prices has always been a challenging task for researchers. While advocates of the efficient market hypothesis (EMH) believe that it is impossible to design any predictive framework that can accurately predict the movement of stock prices, there is seminal work in the literature that has clearly demonstrated that the seemingly random movement patterns in the time series of a stock price can be predicted with a high level of accuracy. The design of such predictive models requires the choice of appropriate variables, the right transformation methods for the variables, and tuning of the model parameters. In this work, we present a very robust and accurate framework of stock price prediction that consists of an agglomeration of statistical, machine learning, and deep learning models. We use the daily stock price data, collected at five-minute intervals, of a very well-known company that is listed on the National Stock Exchange (NSE) of India. The granular data is aggregated into three slots in a day, and the aggregated data is used for building and training the forecasting models. We contend that the agglomerative approach of model building, which uses a combination of statistical, machine learning, and deep learning approaches, can very effectively learn from the volatile and random movement patterns in stock price data. We build eight classification and eight regression models based on statistical and machine learning approaches. In addition to these models, a deep learning regression model using a long short-term memory (LSTM) network is also built. Extensive results are presented on the performance of these models, and the results are critically analyzed.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2004.11697&r=all
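    As a rough illustration of the deep learning component, a minimal LSTM regression on a synthetic series, assuming PyTorch (the paper's own models, features and data are not reproduced here):

      import torch

      # predict the next value of a toy series from the previous 8 observations
      torch.manual_seed(0)
      series = torch.sin(torch.linspace(0.0, 20.0, 500))
      X = torch.stack([series[i:i + 8] for i in range(492)]).unsqueeze(-1)  # (N, 8, 1)
      y = series[8:].unsqueeze(-1)                                          # (N, 1)

      lstm = torch.nn.LSTM(input_size=1, hidden_size=16, batch_first=True)
      head = torch.nn.Linear(16, 1)
      opt = torch.optim.Adam(list(lstm.parameters()) + list(head.parameters()), lr=1e-2)

      for epoch in range(100):
          opt.zero_grad()
          out, _ = lstm(X)                 # hidden states for every step: (N, 8, 16)
          pred = head(out[:, -1, :])       # regression head on the last hidden state
          loss = torch.nn.functional.mse_loss(pred, y)
          loss.backward()
          opt.step()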
  3. By: Makarov, Vladimir; Stouch, Terry; Allgood, Brandon; Willis, Christopher; Lynch, Nick
    Abstract: We describe 11 best practices for the successful use of Artificial Intelligence and Machine Learning in pharmaceutical and biotechnology research, at the data, technology, and organizational management levels.
    Date: 2020–04–20
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:eqm9j&r=all
  4. By: Dimitris Korobilis; Davide Pettenuzzo
    Abstract: As the amount of economic and other data generated worldwide increases vastly, a challenge for future generations of econometricians will be to master efficient algorithms for inference in empirical models with large information sets. This Chapter provides a review of popular estimation algorithms for Bayesian inference in econometrics and surveys alternative algorithms developed in machine learning and computing science that allow for efficient computation in high-dimensional settings. The focus is on scalability and parallelizability of each algorithm, as well as their ability to be adopted in various empirical settings in economics and finance.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2004.11486&r=all
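    One canonical example of the algorithms such a review covers is the Gibbs sampler for Bayesian linear regression; a self-contained sketch with conjugate priors and synthetic data (an illustration, not drawn from the chapter):

      import numpy as np

      rng = np.random.default_rng(0)
      n, p = 200, 3
      X = rng.normal(size=(n, p))
      beta_true = np.array([1.0, -2.0, 0.5])
      y = X @ beta_true + rng.normal(scale=0.5, size=n)

      # priors: beta ~ N(0, tau2 * I), sigma2 ~ InvGamma(a0, b0)
      tau2, a0, b0 = 10.0, 2.0, 1.0
      sigma2, draws = 1.0, []
      XtX, Xty = X.T @ X, X.T @ y
      for it in range(2000):
          # draw beta | sigma2, y from its Gaussian full conditional
          V = np.linalg.inv(XtX / sigma2 + np.eye(p) / tau2)
          beta = rng.multivariate_normal(V @ (Xty / sigma2), V)
          # draw sigma2 | beta, y from its inverse-gamma full conditional
          resid = y - X @ beta
          sigma2 = 1.0 / rng.gamma(a0 + n / 2, 1.0 / (b0 + resid @ resid / 2))
          draws.append(beta)
      posterior_mean = np.mean(draws[500:], axis=0)  # discard burn-in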
  5. By: Alexander Arimond; Damian Borth; Andreas Hoepner; Michael Klawunn; Stefan Weisheit
    Abstract: Utilizing a generative regime-switching framework, we perform Monte Carlo simulations of asset returns for Value at Risk threshold estimation. Using equity markets and long-term bonds as test assets in the global, US, Euro area and UK setting, over a sample horizon of up to 1,250 weeks ending in August 2018, we investigate neural networks along three design steps relating to (i) the initialization of the neural network, (ii) the incentive function according to which it is trained, and (iii) the amount of data we feed it. First, we compare neural networks with random seeding to networks that are initialized via estimations from the best-established model (i.e. the hidden Markov model). We find the latter to outperform in terms of the frequency of VaR breaches (i.e. the realized return falling short of the estimated VaR threshold). Second, we balance the incentive structure of the loss function of our networks by adding a second objective to the training instructions, so that the neural networks optimize for accuracy while also aiming to stay within empirically realistic regime distributions (i.e. bull vs. bear market frequencies). In particular, this design feature enables the balanced-incentive recurrent neural network (RNN) to outperform the single-incentive RNN, as well as any other neural network or established approach, by statistically and economically significant levels. Third, we halve our training data set of 2,000 days. We find that our networks, when fed with substantially less data (i.e. 1,000 days), perform significantly worse, which highlights a crucial weakness of neural networks: their dependence on very large data sets ...
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.01686&r=all
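    The evaluation criterion is easy to state in code. A hedged sketch of a VaR breach-frequency backtest on simulated returns (the paper's models and data are not reproduced):

      import numpy as np

      rng = np.random.default_rng(1)
      returns = rng.normal(0.0, 0.02, 1250)   # stand-in for weekly test-asset returns
      window, alpha = 250, 0.05

      breaches = 0
      for t in range(window, len(returns)):
          # rolling empirical 5% quantile as the VaR threshold estimate
          var_t = np.quantile(returns[t - window:t], alpha)
          breaches += returns[t] < var_t       # breach: realized return falls short of VaR
      breach_rate = breaches / (len(returns) - window)  # well-calibrated models stay close to alpha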
  6. By: Lucio Fernandez Arjona (Zurich Insurance Group); Damir Filipović (Ecole Polytechnique Fédérale de Lausanne; Swiss Finance Institute)
    Abstract: We present a general framework for portfolio risk management in discrete time, based on a replicating martingale. This martingale is learned from a finite sample in a supervised setting. The model learns the features necessary for an effective low-dimensional representation, overcoming the curse of dimensionality common to function approximation in high-dimensional spaces. We show results based on polynomial and neural network bases. Both offer superior results to naive Monte Carlo methods and other existing methods like least-squares Monte Carlo and replicating portfolios.
    Keywords: Solvency capital; dimensionality reduction; neural networks; nested Monte Carlo; replicating portfolios.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp2028&r=all
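    A toy version of the supervised learning step, using a polynomial basis and hypothetical scenario data (the paper's replicating-martingale construction is richer than this):

      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.preprocessing import PolynomialFeatures

      rng = np.random.default_rng(0)
      scenarios = rng.normal(size=(5000, 2))   # outer risk-factor scenarios
      # noisy inner estimates of the portfolio value in each scenario
      values = np.maximum(scenarios.sum(axis=1), 0.0) + rng.normal(0.0, 0.1, 5000)

      # fit a low-dimensional proxy on a polynomial basis (a neural network
      # basis would replace PolynomialFeatures in the same pipeline)
      basis = PolynomialFeatures(degree=3)
      proxy = LinearRegression().fit(basis.fit_transform(scenarios), values)
      fitted = proxy.predict(basis.transform(scenarios))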
  7. By: -
    Abstract: This report explores the opportunities for and challenges of the systematic use of publicly available digital data as a tool for formulating public policies for the development of the digital economy in Latin America and the Caribbean. The objective is to share lessons learned in order to advance a research agenda that allows the countries of the region to create alternative measuring tools based on the digital footprint. Using big data techniques, the digital footprint left behind by labour market portals, e-commerce platforms and social media networks offers unprecedented information, both in terms of scope and detail.
    Keywords: KNOWLEDGE-BASED ECONOMY, BIG DATA, DATA PROCESSING, DATA COLLECTION, EMPLOYMENT, LABOUR MARKET, HUMAN RESOURCES, DIGITAL TECHNOLOGY, PRICES, MEDIUM ENTERPRISES, SMALL ENTERPRISES, BUSINESS FINANCING, BROADBAND, INTERNET, INFORMATION TECHNOLOGY, COMMUNICATION TECHNOLOGY, SOCIAL MEDIA, CURRENCY, DEVELOPMENT POLICY
    Date: 2020–04–28
    URL: http://d.repec.org/n?u=RePEc:ecr:col022:45484&r=all
  8. By: Michael Roberts; Indranil SenGupta
    Abstract: In this paper we present a sequential hypothesis test for the detection of a general jump-size distribution. Infinitesimal generators for the corresponding log-likelihood ratios are presented and analyzed. Bounds for the infinitesimal generators in terms of super-solutions and sub-solutions are computed. This is shown to be implementable in relation to various classification problems for a crude oil price data set. Machine and deep learning algorithms are implemented to extract a specific deterministic component from the crude oil data set, and this deterministic component is used to improve the Barndorff-Nielsen and Shephard model, a commonly used stochastic model for derivative and commodity market analysis.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2004.08889&r=all
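    For readers unfamiliar with sequential testing, a minimal sequential probability ratio test between two candidate jump-size distributions, on simulated data (the paper's generator-based analysis goes well beyond this textbook version):

      import numpy as np
      from scipy.stats import norm

      # H0: jump sizes ~ N(0, 1) vs H1: jump sizes ~ N(1, 1)
      rng = np.random.default_rng(0)
      a, b = np.log(0.05 / 0.95), np.log(0.95 / 0.05)   # thresholds for 5% error rates
      llr, decision = 0.0, "undecided"
      for obs in rng.normal(1.0, 1.0, size=1000):        # data generated under H1
          llr += norm.logpdf(obs, 1.0, 1.0) - norm.logpdf(obs, 0.0, 1.0)
          if llr <= a:
              decision = "accept H0"
              break
          if llr >= b:
              decision = "accept H1"
              break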
  9. By: Manuel Nunes; Enrico Gerding; Frank McGroarty; Mahesan Niranjan
    Abstract: Modern decision-making in fixed income asset management benefits from intelligent systems, which involve the use of state-of-the-art machine learning models and appropriate methodologies. We conduct the first study of bond yield forecasting using long short-term memory (LSTM) networks, validating their potential and identifying their memory advantage. Specifically, we model the 10-year bond yield using univariate LSTMs with three input sequences and five forecasting horizons. We compare these with multilayer perceptrons (MLPs), both univariate and with the most relevant features. To demystify the notion of a black box associated with LSTMs, we conduct the first internal study of the model. To this end, we calculate the LSTM signals through time, at selected locations in the memory cell, using sequence-to-sequence architectures, both univariate and multivariate. We then proceed to explain the states' signals using exogenous information, for which we develop the LSTM-LagLasso methodology. The results show that the univariate LSTM model with additional memory is capable of achieving results similar to the multivariate MLP using macroeconomic and market information. Furthermore, shorter forecasting horizons require smaller input sequences, and vice versa. The most remarkable property found consistently in the LSTM signals is the activation/deactivation of units through time, and the specialisation of units by yield range or feature. Those signals are complex but can be explained by exogenous variables. Additionally, some of the relevant features identified via LSTM-LagLasso are not commonly used in forecasting models. In conclusion, our work validates the potential of LSTMs and these methodologies for bonds, providing additional tools for financial practitioners.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.02217&r=all
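    The LagLasso idea, regressing a state signal on lagged exogenous variables with an L1 penalty, can be sketched as follows, with synthetic data standing in for the LSTM memory-cell signals:

      import numpy as np
      from sklearn.linear_model import LassoCV

      rng = np.random.default_rng(0)
      T, k, max_lag = 400, 5, 10
      exog = rng.normal(size=(T, k))              # exogenous candidate explanatory variables
      signal = np.zeros(T)                        # stand-in for an LSTM state signal,
      signal[2:] = 0.8 * exog[:-2, 0]             # driven by variable 0 at lag 2
      signal += rng.normal(0.0, 0.1, T)

      # design matrix of all variable/lag pairs, lags 1..max_lag
      X = np.hstack([exog[max_lag - lag:T - lag, :] for lag in range(1, max_lag + 1)])
      y = signal[max_lag:]
      model = LassoCV(cv=5).fit(X, y)
      selected = np.flatnonzero(model.coef_)      # sparse set of explanatory variable/lag pairs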
  10. By: Grogger, Jeffrey; Ivandic, Ria; Kirchmaier, Thomas
    Abstract: We compare predictions from a conventional protocol-based approach to risk assessment with those based on a machine-learning approach. We first show that the conventional predictions are less accurate than, and have similar rates of negative prediction error as, a simple Bayes classifier that makes use only of the base failure rate. A random forest based on the underlying risk assessment questionnaire does better under the assumption that negative prediction errors are more costly than positive prediction errors. A random forest based on two-year criminal histories does better still. Indeed, adding the protocol-based features to the criminal histories adds almost nothing to the predictive adequacy of the model. We suggest using the predictions based on criminal histories to prioritize incoming calls for service, and devising a more sensitive instrument to distinguish true from false positives that result from this initial screening.
    Keywords: domestic abuse; risk assessment; machine learning
    JEL: K42
    Date: 2020–02–01
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:104159&r=all
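    The two benchmarks being compared are simple to state; a sketch on synthetic data (the study's features, outcomes and cost ratios are not reproduced here):

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      X = rng.normal(size=(2000, 10))   # stand-in for criminal-history features
      y = (X[:, 0] + rng.normal(size=2000) > 1.5).astype(int)   # rare "failure" outcome

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

      # base-rate-only Bayes classifier: predict the majority class for everyone
      base_rate = y_tr.mean()
      bayes_pred = np.full(len(y_te), int(base_rate > 0.5))

      # random forest with asymmetric costs: weigh missed failures more heavily
      rf = RandomForestClassifier(class_weight={0: 1, 1: 5}, random_state=0).fit(X_tr, y_tr)
      rf_pred = rf.predict(X_te)
      # negative prediction error rate: true failures predicted as safe
      fn_rate = ((rf_pred == 0) & (y_te == 1)).mean()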
  11. By: Humayra Shoshi; Indranil SenGupta
    Abstract: In this paper, a refined Barndorff-Nielsen and Shephard (BN-S) model is implemented to find an optimal hedging strategy for commodity markets. The refinement of the BN-S model is obtained with various machine and deep learning algorithms. The refinement leads to the extraction of a deterministic parameter from the empirical data set. The problem is transformed into an appropriate classification problem with a couple of different approaches: the volatility approach and the duration approach. The analysis is applied to the Bakken crude oil data, and the aforementioned deterministic parameter is obtained for a wide range of data sets. With the implementation of this parameter in the refined model, the resulting model performs much better than the classical BN-S model.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2004.14862&r=all
  12. By: Tian Guo; Nicolas Jamet; Valentin Betrix; Louis-Alexandre Piquet; Emmanuel Hauptmann
    Abstract: Incorporating environmental, social, and governance (ESG) considerations into systematic investments has drawn considerable attention recently. In this paper, we focus on ESG events in financial news flow and explore the predictive power of ESG-related financial news on stock volatility. In particular, we develop a pipeline of ESG news extraction, news representation, and Bayesian inference of deep learning models. Experimental evaluation on real data and different markets demonstrates superior predictive performance, as well as the relation of high volatility prediction to stocks with potentially high risk and low return. It also shows the potential of the proposed pipeline as a flexible predictive framework for various textual data and target variables.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.02527&r=all
  13. By: Ruda Zhang; Patrick Wingo; Rodrigo Duran; Kelly Rose; Jennifer Bauer; Roger Ghanem
    Abstract: Economic assessment in environmental science concerns the measurement or valuation of environmental impacts, adaptation, and vulnerability. Integrated assessment modeling is a unifying framework of environmental economics, which attempts to combine key elements of physical, ecological, and socioeconomic systems. Uncertainty characterization in integrated assessment varies by component model: uncertainties associated with mechanistic physical models are often assessed with an ensemble of simulations or Monte Carlo sampling, while uncertainties associated with impact models are evaluated by conjecture or econometric analysis. Manifold sampling is a machine learning technique that constructs a joint probability model of all relevant variables, which may be concentrated on a low-dimensional geometric structure. Compared with traditional density estimation methods, manifold sampling is more efficient, especially when the data are generated by a few latent variables. The manifold-constrained joint probability model helps answer policy-making questions, from prediction to response and prevention. Manifold sampling is applied to assess the risk of offshore drilling in the Gulf of Mexico.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2004.11780&r=all
  14. By: Poppius, Hampus (Department of Economics, Lund University)
    Abstract: When firms meet in multiple markets, they can leverage punishment ability in one market to sustain collusion in another. This is the first paper to test this theory for multiproduct retailers that sell consumer goods online. With data on the universe of consumer goods sold online in Sweden, I estimate that multimarket contact increases prices. To more closely investigate what drives the effect, I employ a machine-learning method to estimate effect heterogeneity. The main finding is that multimarket contact increases prices to a higher extent if there are fewer firms participating in the contact markets, which is one of the theoretical predictions. Previous studies focus on geographical markets, where firms provide a good or service in different locations. I instead define markets as different product markets, where each market is defined by the type of good. This is the first paper to study multimarket contact and collusion with this type of market definition. The effect is stronger than in previously studied settings.
    Keywords: Tacit collusion; pricing; e-commerce; causal machine learning
    JEL: D22 D43 L41 L81
    Date: 2020–04–08
    URL: http://d.repec.org/n?u=RePEc:hhs:lunewp:2020_005&r=all
  15. By: Lucio Fernandez-Arjona (University of Zurich); Damir Filipović (EPFL and Swiss Finance Institute)
    Abstract: We present a general framework for portfolio risk management in discrete time, based on a replicating martingale. This martingale is learned from a finite sample in a supervised setting. The model learns the features necessary for an effective low-dimensional representation, overcoming the curse of dimensionality common to function approximation in high-dimensional spaces. We show results based on polynomial and neural network bases. Both offer superior results to naive Monte Carlo methods and other existing methods like least-squares Monte Carlo and replicating portfolios.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2004.14149&r=all
  16. By: Jonathan Gruber; Benjamin R. Handel; Samuel H. Kina; Jonathan T. Kolstad
    Abstract: In numerous high-stakes markets, skilled experts play a key role in facilitating consumer choice of complex products. New artificial intelligence (AI) technologies are increasingly being used to augment expert decisions. We study the role of technology and expertise in the market for health insurance, where consumer choices are widely known to be sub-optimal. Our analysis leverages the large-scale implementation of an AI-based decision support tool in a private Medicare exchange where consumers are randomized to skilled agents over time. We find that, prior to AI-based technology, skilled experts in this market exhibit the same type of inconsistent behavior found in previous studies of individual choices, costing consumers $1260 on average. The addition of AI-based decision support improves outcomes by $278 on average and substantially reduces heterogeneity in broker performance. Experts efficiently synthesize private information, incorporating AI-based recommendations along dimensions that are well suited to AI (e.g. total expected patient costs), but overruling AI-based recommendations along dimensions for which humans are better suited (e.g. specifics of doctor networks). As a result, switching plans, an ex-post measure of plan satisfaction, is meaningfully lower for agents making AI-based recommendations. While AI is a complement to skill on average, we find that it is a substitute across the skill distribution; lower-quality agents provide better recommendations with AI than the top agents did without it. Overall productivity rises, with the introduction of decision support associated with a 21% reduction in call time for enrollment.
    JEL: I13 J24 L15
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:27038&r=all
  17. By: Christa Cuchiero; Wahid Khosrawi; Josef Teichmann
    Abstract: We propose a fully data-driven approach to calibrating local stochastic volatility (LSV) models, circumventing in particular the ad hoc interpolation of the volatility surface. To achieve this, we parametrize the leverage function by a family of feed-forward neural networks and learn their parameters directly from the available market option prices. This should be seen in the context of neural SDEs and (causal) generative adversarial networks: we generate volatility surfaces by specific neural SDEs, whose quality is assessed by quantifying, in an adversarial manner, distances to market prices. The minimization of the calibration functional relies strongly on a variance reduction technique based on hedging and deep hedging, which is interesting in its own right: it allows one to calculate model prices and model implied volatilities accurately using only small sets of sample paths. For numerical illustration we implement a SABR-type LSV model and conduct a thorough statistical performance analysis on many samples of implied volatility smiles, showing the accuracy and stability of the method.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.02505&r=all
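    Stripped of the stochastic volatility factor, the adversarial weighting and the hedging-based variance reduction, the basic mechanism -- a neural leverage function trained by differentiating through simulated paths -- can be sketched as follows, assuming PyTorch and a single hypothetical market price:

      import torch

      torch.manual_seed(0)
      # leverage function L(t, S) parametrized by a small feed-forward network
      L = torch.nn.Sequential(
          torch.nn.Linear(2, 16), torch.nn.Tanh(),
          torch.nn.Linear(16, 1), torch.nn.Softplus(),
      )
      opt = torch.optim.Adam(L.parameters(), lr=1e-2)
      S0, K, T, n_steps, n_paths = 1.0, 1.0, 1.0, 50, 4096
      dt = T / n_steps
      market_price = torch.tensor(0.10)   # one observed call price, for illustration

      for it in range(200):
          opt.zero_grad()
          S = torch.full((n_paths, 1), S0)
          for i in range(n_steps):
              t = torch.full((n_paths, 1), i * dt)
              sigma = L(torch.cat([t, S], dim=1))    # neural leverage
              dW = torch.randn(n_paths, 1) * dt ** 0.5
              S = S + sigma * S * dW                 # driftless Euler step
          model_price = torch.relu(S - K).mean()     # Monte Carlo call price
          loss = (model_price - market_price) ** 2   # distance to the market price
          loss.backward()
          opt.step()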
  18. By: Breda, Thomas (Paris School of Economics); Grenet, Julien (Paris School of Economics); Monnet, Marion (Paris School of Economics); Van Effenterre, Clémentine (University of Toronto)
    Abstract: This paper, based on a large-scale field experiment, tests whether a one-hour exposure to external female role models with a background in science affects students' perceptions and choice of field of study. Using a random assignment of classroom interventions carried out by 56 female scientists among 20,000 high school students in the Paris Region, we provide the first evidence of the positive impact of external female role models on student enrollment in STEM fields. We show that the interventions increased the share of Grade 12 girls enrolling in selective (male-dominated) STEM programs in higher education, from 11 to 14.5 percent. These effects are driven by high-achieving girls in mathematics. We find limited effects on boys' educational choices in Grade 12, and no effect for students in Grade 10. Evidence from survey data shows that the program raised students' interest in science-related careers and slightly improved their math self-concept. It sharply reduced the prevalence of stereotypes associated with jobs in science and gender differences in abilities, but it made the underrepresentation of women in science more salient. Using machine learning methods, we leverage the diversity of role model profiles to document substantial heterogeneity in the effectiveness of role models and shed light on the channels through which they can influence female students' choice of study. Results suggest that emphasis on the gender theme is less important to the effectiveness of this type of intervention than the ability of role models to convey a positive and more inclusive image of STEM careers.
    Keywords: role models, gender gap, STEM, stereotypes, choice of studies
    JEL: C93 I24 J16
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp13163&r=all
  19. By: Lucio Fernandez-Arjona
    Abstract: Insurance companies make extensive use of Monte Carlo simulations in their capital and solvency models. To overcome the computational problems associated with Monte Carlo simulations, most large life insurance companies use proxy models such as replicating portfolios. In this paper, we present an example based on a variable annuity guarantee, showing the main challenges faced by practitioners in the construction of replicating portfolios: the feature engineering step and subsequent basis function selection problem. We describe how neural networks can be used as a proxy model and how to apply risk-neutral pricing on a neural network to integrate such a model into a market risk framework. The proposed model naturally solves the feature engineering and feature selection problems of replicating portfolios.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.02318&r=all
  20. By: Marina Toger; Ian Shuttleworth; John Östh
    Abstract: Mobile phone data -- with file sizes scaling into terabytes -- easily overwhelm the computational capacity available to some researchers. Moreover, for ethical reasons, data access is often granted only to particular subsets, restricting analyses to cover single days, weeks, or geographical areas. Consequently, it is frequently impossible to set a particular analysis or event in its context and know how typical it is compared to other days, weeks or months. This matters to academic referees questioning research on mobile phone data, and to analysts deciding how to sample, how much data to process, and which events are anomalous. All these issues require an understanding of variability in Big Data to answer the question: how average is average? This paper provides a method, using a large mobile phone dataset, to answer these basic but necessary questions. We show that file size is a robust proxy for the activity level of phone users by profiling the temporal variability of the data at an hourly, daily and monthly level. We then apply time-series analysis to isolate temporal periodicity. Finally, we discuss confidence limits for anomalous events in the data. We recommend an analytical approach to mobile phone data selection which suggests that ideally data should be sampled across days, across working weeks, and across the year, to obtain a representative average. However, where this is impossible, the temporal variability is such that a specific weekday's data can provide a fair picture of other days in their general structure.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.00137&r=all
  21. By: Roy Gernhardt; Bjorn Persson
    Abstract: This paper identifies for the first time the mathematical equivalence between economic networks of Cobb-Douglas agents and Artificial Neural Networks. It explores two implications of this equivalence under general conditions. First, a burgeoning literature has established that network propagation can transform microeconomic perturbations into large aggregate shocks. Neural network equivalence amplifies the magnitude and complexity of this phenomenon. Second, if economic agents adjust their production and utility functions in optimal response to local conditions, market pricing is a sufficient and robust channel for information feedback leading to global, macro-scale learning at the level of the economy as a whole.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.00510&r=all
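    The single-agent identity behind the equivalence is elementary: in logs, a Cobb-Douglas production function is one linear neuron with an exponential read-out.

      import numpy as np

      # y = A * x1^a1 * x2^a2  <=>  log y = log A + a1*log x1 + a2*log x2
      A, a = 2.0, np.array([0.3, 0.7])
      x = np.array([4.0, 9.0])
      y_cobb_douglas = A * np.prod(x ** a)
      y_neuron = np.exp(np.log(A) + a @ np.log(x))   # identical by construction
      assert np.isclose(y_cobb_douglas, y_neuron)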
  22. By: Toro Hardy, Alfredo
    Abstract: China’s proclaimed aim of becoming the world’s leader in science, technology and innovation by the mid-twenty-first century has triggered an intense competition with the United States. The latter, feeling its supremacy in this field threatened, has reacted forcefully. This GLO Discussion Paper examines the nature of this contest, the comparative technological standing of both countries, the pros and cons in this area derived from their respective development models, and the plausible outcomes of this competition.
    Keywords: Artificial Intelligence, China, Market economy, Research & Development, Science & Technology, State-led model, Silicon Valley, United States
    JEL: D78 F01 F52 H52 I25
    Date: 2020
    URL: http://d.repec.org/n?u=RePEc:zbw:glodps:521&r=all
  23. By: Johannes Ruf; Weiguan Wang
    Abstract: We study neural networks as nonparametric estimation tools for the hedging of options. To this end, we design a network, named HedgeNet, that directly outputs a hedging strategy. This network is trained to minimise the hedging error instead of the pricing error. Applied to end-of-day and tick prices of S&P 500 and Euro Stoxx 50 options, the network is able to reduce the mean squared hedging error of the Black-Scholes benchmark significantly. We illustrate, however, that a similar benefit arises by simple linear regressions that incorporate the leverage effect. Finally, we show how a faulty training/test data split, possibly along with an additional 'tagging' of data, leads to a significant overestimation of the outperformance of neural networks.
    Date: 2020–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2004.08891&r=all
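    The distinction between minimizing hedging error and pricing error is concrete; a one-period toy sketch, assuming PyTorch and placeholder prices (not the authors' HedgeNet):

      import torch

      torch.manual_seed(0)
      n, K = 4096, 1.0
      S0 = torch.empty(n, 1).uniform_(0.8, 1.2)                  # today's underlying price
      tau = torch.full((n, 1), 0.1)                              # time to maturity
      S1 = S0 * torch.exp(0.2 * 0.1 ** 0.5 * torch.randn(n, 1))  # next-period price, toy dynamics
      C0 = torch.relu(S0 - K) + 0.02                             # placeholder option price today
      C1 = torch.relu(S1 - K)                                    # option value next period

      # the network outputs a hedge ratio, not a price
      net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
      opt = torch.optim.Adam(net.parameters(), lr=1e-3)
      for step in range(500):
          opt.zero_grad()
          delta = net(torch.cat([S0 / K, tau], dim=1))
          hedge_error = C1 - C0 - delta * (S1 - S0)   # P&L of the hedged position
          loss = (hedge_error ** 2).mean()            # minimize hedging, not pricing, error
          loss.backward()
          opt.step()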
  24. By: Alessandro Gnoatto; Athena Picarelli; Christoph Reisinger
    Abstract: In this paper, we present a novel computational framework for portfolio-wide risk management problems where the presence of a potentially large number of risk factors makes traditional numerical techniques ineffective. The new method utilises a coupled system of BSDEs for the valuation adjustments (xVA) and solves these by a recursive application of a neural network based BSDE solver. This not only makes the computation of xVA for high-dimensional problems feasible, but also produces hedge ratios and dynamic risk measures for xVA, and allows simulations of the collateral account.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.02633&r=all
  25. By: Kocornik-Mina, Adriana; McDermott, Thomas K.J.; Michaels, Guy; Rauch, Ferdinand
    Abstract: Does economic activity relocate away from areas that are at high risk of recurring shocks? We examine this question in the context of floods, which are among the costliest and most common natural disasters. Over the past thirty years, floods worldwide killed more than 500,000 people and displaced over 650,000,000 people. This paper analyzes the effect of large-scale floods, each of which displaced at least 100,000 people, in over 1,800 cities in 40 countries from 2003 to 2008. We conduct our analysis using spatially detailed inundation maps and night lights data spanning the globe’s urban areas, which we use to measure local economic activity. We find that low-elevation areas are about 3-4 times more likely to be hit by large floods than other areas, and yet they concentrate more economic activity per square kilometer. When cities are hit by large floods, these low-elevation areas also sustain damage, but like the rest of the flooded cities they recover rapidly, and economic activity does not move to safer areas. Only in more recently populated urban areas do flooded areas show a larger and more persistent decline in economic activity. Our findings have important policy implications for aid, development and urban planning in a world with rapid urbanization and rising sea levels.
    Keywords: Urbanization; Flooding; Climate change; Urban recovery
    JEL: O18 Q54 R11 R58
    Date: 2020–04–01
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:100031&r=all
  26. By: Simone Cerreia-Vioglio; Fabio Maccheroni; Massimo Marinacci
    Abstract: We introduce an algorithmic decision process for multialternative choice that combines binary comparisons and Markovian exploration. We show that a functional property, transitivity, makes it testable.
    Date: 2020–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2005.01081&r=all

This nep-big issue is ©2020 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.