nep-big New Economics Papers
on Big Data
Issue of 2019‒07‒29
twelve papers chosen by
Tom Coupé
University of Canterbury

  1. Artificial Intelligence, Data, Ethics. An Holistic Approach for Risks and Regulation By Alexis Bogroff; Dominique Guégan
  2. Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach By Alex Burnap; John R. Hauser; Artem Timoshenko
  3. Fair and Unbiased Algorithmic Decision Making: Current State and Future Challenges By Songül Tolan
  4. Socioeconomic and Biophysical Drivers of Cropland Use Intensification in India: Analysis using satellite data and administrative surveys By Arora, Gaurav; Rathore, Tushita; Gupta, Gargi; Anand, Saket
  5. Dirty Density: Air Quality and the Density of American Cities By Felipe Carozzi; Sefi Roth
  6. Bank Net Interest Margin Forecasting and Capital Adequacy Stress Testing by Machine Learning Techniques By Brummelhuis, Raymond; Luo, Zhongmin
  7. Cholesky-ANN models for predicting multivariate realized volatility By Bucci, Andrea
  8. The impact of information and communication technologies on jobs in Africa: a literature review By Melia, Elvis
  9. Multidimensional Diffusion Processes in Dynamic Online Networks By David Easley; Christopher Rojas
  10. Neural network regression for Bermudan option pricing By Bernard Lapeyre; Jérôme Lelong
  11. Deep Reinforcement Learning in Financial Markets By Souradeep Chakraborty
  12. Dreaming machine learning: Lipschitz extensions for reinforcement learning on financial markets By J. M. Calabuig; H. Falciani; E. A. Sánchez-Pérez

  1. By: Alexis Bogroff (University Paris 1 Panthéon-Sorbonne); Dominique Guégan (University Paris 1 Panthéon-Sorbonne; labEx ReFi France; University Ca’ Foscari Venice)
    Abstract: An extensive list of risks relative to big data frameworks and their use through artificial intelligence models is provided, along with measurements and implementable solutions. Bias, interpretability and ethics are studied in depth, with several interpretations from the point of view of developers, companies and regulators. Our reflections suggest that fragmented frameworks increase the risks of model misspecification, opacity and bias in the results. Domain experts and statisticians need to be involved in the whole process, as the business objective must drive each decision from the data-extraction step to the final actionable prediction. We propose a holistic and original approach that takes into account the risks encountered throughout the implementation of systems using artificial intelligence, from the choice of the data and the selection of the algorithm to the decision making.
    Keywords: Artificial Intelligence, Bias, Big Data, Ethics, Governance, Interpretability, Regulation, Risk
    JEL: C4 C5 C6 C8 D8 G28 G38 K2
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:ven:wpaper:2019:19&r=all
  2. By: Alex Burnap; John R. Hauser; Artem Timoshenko
    Abstract: Aesthetics are critically important to market acceptance in many product categories. In the automotive industry in particular, an improved aesthetic design can boost sales by 30% or more. Firms invest heavily in designing and testing new product aesthetics. A single automotive "theme clinic" costs between $100,000 and $1,000,000, and hundreds are conducted annually. We use machine learning to augment human judgment when designing and testing new product aesthetics. The model combines a probabilistic variational autoencoder (VAE) and adversarial components from generative adversarial networks (GAN), along with modeling assumptions that address managerial requirements for firm adoption. We train our model with data from an automotive partner: 7,000 images evaluated by targeted consumers and 180,000 high-quality unrated images. Our model predicts the appeal of new aesthetic designs well: a 38% improvement relative to a baseline and substantial improvement over both conventional machine learning models and pretrained deep learning models. New automotive designs are generated in a controllable manner for the design team to consider, and we empirically verify that they are appealing to consumers. These results, combining human and machine inputs for practical managerial usage, suggest that machine learning offers significant opportunity to augment aesthetic design.
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1907.07786&r=all
  3. By: Songül Tolan (European Commission – JRC)
    Abstract: Machine learning algorithms are now frequently used in sensitive contexts that substantially affect the course of human lives, such as credit lending or criminal justice. This is driven by the idea that ‘objective’ machines base their decisions solely on facts and remain unaffected by human cognitive biases, discriminatory tendencies or emotions. Yet, there is overwhelming evidence showing that algorithms can inherit or even perpetuate human biases in their decision making when they are trained on data that contains biased human decisions. This has led to a call for fairness-aware machine learning. However, fairness is a complex concept, which is also reflected in the attempts to formalize fairness for algorithmic decision making. Statistical formalizations of fairness lead to a long list of criteria that are each flawed (or even harmful) in different contexts. Moreover, inherent tradeoffs between these criteria make it impossible to unify them in one general framework. Thus, fairness constraints in algorithms have to be specific to the domains to which the algorithms are applied. In the future, research in algorithmic decision making systems should be aware of data and developer biases and add a focus on transparency to facilitate regular fairness audits.
    Keywords: fairness, machine learning, algorithmic bias, algorithmic transparency
    Date: 2018–12
    URL: http://d.repec.org/n?u=RePEc:ipt:decwpa:2018-10&r=all
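The statistical formalizations of fairness mentioned in the abstract can be made concrete with a small sketch. The two criteria below (demographic parity and equal opportunity) and the synthetic data are illustrative assumptions, not taken from the paper; the differing base rates across groups illustrate why such criteria cannot, in general, be satisfied simultaneously.

```python
import numpy as np

def demographic_parity_gap(pred, group):
    """|P(pred=1 | group 0) - P(pred=1 | group 1)|."""
    return abs(pred[group == 0].mean() - pred[group == 1].mean())

def equal_opportunity_gap(pred, label, group):
    """Difference in true-positive rates across the two groups."""
    tpr = lambda g: pred[(group == g) & (label == 1)].mean()
    return abs(tpr(0) - tpr(1))

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, n)
# Base rates differ across groups, so a classifier with equal
# true-positive rates still violates demographic parity -- one
# instance of the tradeoffs the paper discusses.
label = (rng.random(n) < np.where(group == 0, 0.3, 0.6)).astype(int)
pred = (rng.random(n) < np.where(label == 1, 0.8, 0.1)).astype(int)

dp = demographic_parity_gap(pred, group)       # large gap
eo = equal_opportunity_gap(pred, label, group) # near zero
```

Here the same predictions look fair under one criterion and unfair under the other, which is why the paper argues fairness constraints must be chosen per domain.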
  4. By: Arora, Gaurav; Rathore, Tushita; Gupta, Gargi; Anand, Saket
    Keywords: Resource /Energy Economics and Policy
    Date: 2019–06–25
    URL: http://d.repec.org/n?u=RePEc:ags:aaea19:291104&r=all
  5. By: Felipe Carozzi; Sefi Roth
    Abstract: We study whether urban density affects the exposure of city dwellers to ambient air pollution using satellite-derived measures of air quality for the contiguous United States. For identification, we rely on an instrumental variable strategy, which induces exogenous variation in density without affecting pollution directly. For this purpose, we use three variables measuring geological characteristics as instruments for density: earthquake risks, soil drainage capacity and the presence of aquifers. We find a positive and statistically significant pollution-density elasticity of 0.13. We also assess the health implications of our findings and find that doubling density in an average city increases annual mortality costs by as much as $630 per capita. Our results suggest that, despite the common claim that denser cities tend to be more environmentally friendly, air pollution exposure is higher in denser cities. This in turn highlights the possible trade-off between reducing global greenhouse gas emissions and preserving environmental quality within cities.
    Keywords: air pollution, urban congestion, density, health
    JEL: Q53 R11 I10
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:cep:cepdps:dp1635&r=all
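The instrumental-variable strategy described above follows the standard two-stage least squares (2SLS) pattern. The sketch below is a minimal illustration on synthetic data (the variable names and the data-generating process are assumptions, not the paper's data): an instrument shifts density without directly affecting pollution, so the second stage recovers the true elasticity where naive OLS is biased by a confounder.

```python
import numpy as np

def two_stage_least_squares(y, x, z):
    """Manual 2SLS with a constant: instrument z for endogenous x.
    Stage 1 projects x on z; stage 2 regresses y on the fitted x."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # stage 1 fit
    X_hat = np.column_stack([np.ones(n), x_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][1]  # stage 2 slope

rng = np.random.default_rng(1)
n = 5_000
z = rng.standard_normal(n)        # e.g. a geology-based instrument
u = rng.standard_normal(n)        # unobserved confounder
# True pollution-density elasticity set to 0.13, as in the paper's estimate.
log_density = 0.8 * z + u + rng.standard_normal(n)
log_pollution = 0.13 * log_density + 1.5 * u + rng.standard_normal(n)

beta_iv = two_stage_least_squares(log_pollution, log_density, z)
beta_ols = np.linalg.lstsq(
    np.column_stack([np.ones(n), log_density]),
    log_pollution, rcond=None)[0][1]  # biased upward by the confounder
```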
  6. By: Brummelhuis, Raymond; Luo, Zhongmin
    Abstract: The 2007-09 financial crisis revealed that investors in the financial market were more concerned about banks' future capital adequacy than about their current capital adequacy. Stress testing promises to complement the regulatory capital adequacy regimes, which assess a bank's current capital adequacy, with the ability to assess its future capital adequacy based on asset losses and incomes projected by the forecasting models of regulators and banks. The effectiveness of stress testing rests on its ability to inform the financial market, which depends on whether or not the market has confidence in the model-projected asset losses and incomes for banks. Post-crisis studies found that stress-test results are uninformative and receive insignificant market reactions; others question their validity on the grounds of the poor forecast accuracy of linear regression models that forecast banking-industry incomes measured by the aggregate Net Interest Margin (NIM). Our study instead focuses on NIM forecasting at the individual-bank level and employs both linear regression and non-linear Machine Learning techniques. First, we present the linear and non-linear Machine Learning regression techniques used in our study. Then, based on out-of-sample tests and literature-recommended forecasting techniques, we compare the NIM forecast accuracy of 162 models based on 11 different regression techniques, finding that some Machine Learning techniques, as well as some linear ones, can achieve significantly higher accuracy than the random-walk benchmark, which invalidates the grounds on which the literature challenges the validity of stress testing. Last, our forecast accuracy comparisons are either consistent with or complement those in the existing forecasting literature.
We believe this paper is the first systematic study of forecasting bank-specific NIM with Machine Learning techniques. It is also a first systematic forecast-accuracy comparison including both linear and non-linear Machine Learning techniques on financial data for a critical real-world problem; it is a multi-step forecasting exercise involving iterative forecasting, rolling origins and recalibration, with a scale-independent forecast accuracy measure; and robust regression proved beneficial for forecasting in the presence of outliers. The paper concludes with policy suggestions and future research directions.
    Keywords: Regression, Machine Learning, Time Series Analysis, Bank Capital, Stress Test, Net Interest Margin, Forecasting, PPNR, CCAR
    JEL: C4 C45 C5 C58 C6 G01
    Date: 2019–03–02
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:94779&r=all
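The rolling-origin, out-of-sample evaluation against a random-walk benchmark mentioned in the abstract can be sketched as follows. The AR(1)-style model and the synthetic NIM series are illustrative stand-ins for the paper's 11 regression techniques and bank-level data, not its actual models.

```python
import numpy as np

def rolling_origin_mae(series, fit_predict, min_train=8):
    """Expanding-window (rolling-origin) one-step-ahead evaluation.
    At each origin t, refit on series[:t] and forecast series[t];
    returns the mean absolute error over all origins."""
    errors = []
    for t in range(min_train, len(series)):
        forecast = fit_predict(series[:t])
        errors.append(abs(series[t] - forecast))
    return float(np.mean(errors))

# Random-walk benchmark: next period equals the last observation.
random_walk = lambda history: history[-1]

# A simple AR(1)-style alternative fitted by least squares, a
# stand-in for the paper's richer regression / ML models.
def ar1(history):
    x, y = history[:-1], history[1:]
    phi = np.dot(x, y) / np.dot(x, x)
    return phi * history[-1]

rng = np.random.default_rng(0)
# Synthetic mean-reverting "NIM" series (illustrative only).
nim = [2.0]
for _ in range(60):
    nim.append(0.9 * nim[-1] + 0.2 + 0.05 * rng.standard_normal())
nim = np.array(nim)

mae_rw = rolling_origin_mae(nim, random_walk)
mae_ar = rolling_origin_mae(nim, ar1)
```

Each candidate model is scored the same way, and accuracy relative to `mae_rw` is what the abstract's random-walk comparison refers to.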
  7. By: Bucci, Andrea
    Abstract: Accurately forecasting multivariate volatility plays a crucial role in the financial industry. The Cholesky-Artificial Neural Network specification presented here provides a twofold advantage for this task. On the one hand, the use of the Cholesky decomposition ensures positive definite forecasts. On the other hand, artificial neural networks make it possible to specify nonlinear relations without any particular distributional assumption. Out-of-sample comparisons reveal that artificial neural networks are not able to strongly outperform the competing models. However, long-memory-detecting networks, such as the nonlinear autoregressive network with exogenous inputs (NARX) and the long short-term memory (LSTM) network, show improved forecast accuracy with respect to existing econometric models.
    Keywords: Neural Networks; Machine Learning; Stock market volatility; Realized Volatility
    JEL: C22 C45 C53 G17
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:95137&r=all
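The Cholesky device the abstract describes can be sketched in a few lines: forecasts are produced in the space of Cholesky-factor elements, and the reconstructed matrix L L' is positive (semi-)definite by construction, whatever errors the forecasting model makes. The 3x3 matrix and the perturbation below are illustrative assumptions, not the paper's data.

```python
import numpy as np

def to_cholesky_vector(cov):
    """Map a covariance matrix to the stacked lower-triangular
    elements of its Cholesky factor (the space in which to forecast)."""
    L = np.linalg.cholesky(cov)
    return L[np.tril_indices_from(L)]

def from_cholesky_vector(vec, dim):
    """Rebuild a covariance matrix from forecast Cholesky elements.
    L @ L.T is positive (semi-)definite by construction, no matter
    what values the forecasting model produced."""
    L = np.zeros((dim, dim))
    L[np.tril_indices(dim)] = vec
    return L @ L.T

# Illustrative 3x3 realized covariance matrix.
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
vec = to_cholesky_vector(cov)

# Pretend a network's forecast perturbed every element: the
# reconstructed matrix is still a valid covariance matrix.
noisy = vec + 0.01
sigma_hat = from_cholesky_vector(noisy, 3)
eigvals = np.linalg.eigvalsh(sigma_hat)  # all strictly positive
```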
  8. By: Melia, Elvis
    Abstract: In the past two decades, Africa has experienced a wave of mobile telephony and the early stages of internet connectivity. This paper summarises recent empirical research findings on the impact that information and communication technologies (ICTs) have had on jobs in Africa, be it in creating new jobs, destroying old jobs, or changing the quality of existing jobs in terms of productivity, incomes, or working conditions. The paper discusses various channels through which ICTs can affect jobs: In theory, they have the potential to allow for text-based service platforms that can help farmers and small and medium-sized enterprises (SMEs) become more productive or gain better access to market information; mobile money has the potential to give the most vulnerable workers more independence and security; and the internet could allow women, in particular, to increase their incomes and independence. This literature review examines what rigorous empirical evidence actually exists to corroborate these claims. Most of the studies reviewed do indeed find positive effects of ICTs on jobs (or related variables) in Africa. On the basis of these findings, the paper reviews policy options for those interested in job creation in Sub-Saharan Africa. The paper concludes by highlighting that these positive findings may exist in parallel with negative structural dynamics that are more difficult to measure. Also, the review's findings, while positive across the board, should be seen as specific to ICTs in the 2000s and 2010s and cannot easily be extrapolated to the much newer Fourth Industrial Revolution technologies (such as machine learning, blockchain technologies, big data analytics and platform economies), which may produce entirely different dynamics.
    Keywords: Digitalisation
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:zbw:diedps:32019&r=all
  9. By: David Easley (Cornell University; Cornell University and EIEF); Christopher Rojas (Cornell University)
    Abstract: We develop a dynamic matched sample estimation algorithm to distinguish peer influence and homophily effects on item adoption decisions in dynamic networks, with numerous items diffusing simultaneously. We infer preferences using a machine learning algorithm applied to previous adoption decisions, and we match agents using those inferred preferences. We show that ignoring previous adoption decisions leads to significantly overestimating the role of peer influence in the diffusion of information, mistakenly confounding influence-based contagion with diffusion driven by common preferences. Our matching-on-preferences algorithm with machine learning reduces the relative effect of peer influence on item adoption decisions in this network significantly more than matching on earlier adoption decisions, as well as on other observable characteristics. We also show significant and intuitive heterogeneity in the relative effect of peer influence.
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:eie:wpaper:1912&r=all
  10. By: Bernard Lapeyre (CERMICS, MATHRISK); Jérôme Lelong (LJK)
    Abstract: The pricing of Bermudan options amounts to solving a dynamic programming principle, in which the main difficulty, especially in high dimension, comes from the computation of the conditional expectations involved in the continuation value. These conditional expectations are classically computed by regression techniques on a finite-dimensional vector space. In this work, we study neural network approximations of conditional expectations. We prove the convergence of the well-known Longstaff and Schwartz algorithm when the standard least-squares regression is replaced by a neural network approximation.
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1907.06474&r=all
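The Longstaff and Schwartz algorithm the abstract builds on can be sketched with its classical least-squares regression step; the paper's contribution is proving convergence when this regression is replaced by a neural network. The parameters and the cubic polynomial basis below are common textbook choices, assumed for illustration, not taken from the paper.

```python
import numpy as np

def bermudan_put_ls(s0=100.0, strike=100.0, r=0.05, sigma=0.2,
                    maturity=1.0, n_steps=10, n_paths=50_000, seed=0):
    """Longstaff-Schwartz lower-bound price of a Bermudan put.
    The continuation value is regressed on polynomial features of
    the spot; the paper replaces this regression with a neural
    network and proves convergence of the resulting algorithm."""
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    # Simulate geometric Brownian motion paths under the risk-neutral measure.
    z = rng.standard_normal((n_paths, n_steps))
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1)
    s = s0 * np.exp(log_paths)

    payoff = np.maximum(strike - s[:, -1], 0.0)  # cashflow at maturity
    for t in range(n_steps - 2, -1, -1):
        payoff *= np.exp(-r * dt)                # discount one step back
        itm = strike - s[:, t] > 0               # regress in-the-money paths only
        if itm.sum() > 0:
            x = s[itm, t]
            basis = np.vander(x / strike, 4)     # cubic polynomial basis
            coef, *_ = np.linalg.lstsq(basis, payoff[itm], rcond=None)
            continuation = basis @ coef          # estimated continuation value
            exercise = strike - x
            payoff[itm] = np.where(exercise > continuation,
                                   exercise, payoff[itm])
    return float(np.exp(-r * dt) * payoff.mean())

price = bermudan_put_ls()
```

Swapping `np.linalg.lstsq` on the polynomial basis for a fitted neural network yields the variant the paper analyses.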
  11. By: Souradeep Chakraborty
    Abstract: In this paper we explore the use of deep reinforcement learning algorithms to automatically generate consistently profitable, robust, uncorrelated trading signals in any general financial market. To do this, we present a novel Markov decision process (MDP) model of financial trading markets. We review and propose various modifications to existing approaches and explore different techniques to succinctly capture market dynamics. We then use deep reinforcement learning to enable the agent (the algorithm) to learn on its own how to take profitable trades in any market, while suggesting various methodological changes and leveraging the unique representation of the financial MDP (FMDP) to tackle the primary challenges faced in similar work. Our experimental results show that the model extends easily to two very different financial markets and delivers robust positive performance in all conducted experiments.
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1907.04373&r=all
  12. By: J. M. Calabuig; H. Falciani; E. A. Sánchez-Pérez
    Abstract: We develop a new topological structure for the construction of a reinforcement learning model in the framework of financial markets. It is based on Lipschitz-type extensions of reward functions defined on metric spaces. Using some known states of a dynamical system that represents the evolution of a financial market, we use our technique to simulate new states, which we call "dreams". These new states are used to feed a learning algorithm designed to improve the investment strategy.
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1907.05697&r=all
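A standard way to make a Lipschitz-type extension concrete is the McShane-Whitney formula f(x) = min_i (f(x_i) + L·d(x, x_i)), which extends rewards known at sample states to arbitrary new states. The abstract does not give the paper's exact construction, so the sketch below should be read as an assumed illustration of the general device, with made-up states and rewards.

```python
import numpy as np

def mcshane_extension(known_states, known_rewards, lipschitz_const):
    """Return f(x) = min_i (f_i + L * d(x, x_i)), the largest
    L-Lipschitz function (Euclidean metric) agreeing with the known
    rewards, provided the data itself is L-Lipschitz."""
    known_states = np.asarray(known_states, dtype=float)
    known_rewards = np.asarray(known_rewards, dtype=float)

    def f(x):
        d = np.linalg.norm(known_states - np.asarray(x, dtype=float), axis=1)
        return float(np.min(known_rewards + lipschitz_const * d))
    return f

# Rewards observed at a few market states (illustrative 2-d features).
states = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
rewards = [1.0, 2.0, 0.5]
f = mcshane_extension(states, rewards, lipschitz_const=2.0)

# The extension interpolates the known states exactly and assigns
# a reward to any simulated ("dreamed") state in between.
dream_reward = f([0.5, 0.5])
```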

This nep-big issue is ©2019 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.