nep-big New Economics Papers
on Big Data
Issue of 2017‒10‒29
thirteen papers chosen by
Tom Coupé
University of Canterbury

  1. BIG Data - BIG Gains? Understanding the Link Between Big Data Analytics and Innovation By Niebel, Thomas; Rasel, Fabienne; Viete, Steffen
  2. Sharing responsibility with a machine By Strobel, Christina; Kirchkamp, Oliver
  3. Pushing Crime Around the Corner? Estimating Experimental Impacts of Large-Scale Security Interventions By Christopher Blattman; Donald Green; Daniel Ortega; Santiago Tobón
  4. Illuminating Economic Development in Indigenous Communities By Donna Feir; Rob Gillezeau; Maggie Jones
  5. The view from space: Theory-based time-varying distances in the gravity model By Hinz, Julian
  6. Telecommunications Policy: An evaluation of 40 years' research history By Kwon, Youngsun; Kwon, Joungwon
  7. Effects of the central bank’s communications in Colombia. A content analysis By Luis E. Arango; Javier Pantoja; Carlos Velásquez
  8. Geometric Learning and Filtering in Finance By Anastasia Kratsios; Cody B. Hyndman
  9. Model economic phenomena with CART and Random Forest algorithms By Benjamin David
  10. Arbitrage-Free Regularization By Anastasia Kratsios; Cody B. Hyndman
  11. Sequential Design and Spatial Modeling for Portfolio Tail Risk Measurement By Michael Ludkovski; James Risk
  12. Asymptotic Expansion as Prior Knowledge in Deep Learning Method for high dimensional BSDEs By Masaaki Fujii; Akihiko Takahashi; Masayuki Takahashi
  13. "Asymptotic Expansion as Prior Knowledge in Deep Learning Method for high dimensional BSDEs" By Masaaki Fujii; Akihiko Takahashi; Masayuki Takahashi

  1. By: Niebel, Thomas; Rasel, Fabienne; Viete, Steffen
    Abstract: This paper analyzes the relationship between firms’ use of big data analytics and their innovative performance for product innovations. Since big data technologies enable new information practices, they create new decision-making possibilities, which firms can use to realize innovations. Using German firm-level data, we find suggestive evidence that big data analytics matters for the likelihood of becoming a product innovator as well as for the market success of the firms’ product innovations. The regression analysis reveals that firms which make use of big data have a higher likelihood of realizing product innovations as well as a higher innovation intensity. Interestingly, the results are of equal magnitude in the manufacturing and services industries. The results support the view that big data analytics has the potential to enable innovation.
    Keywords: Big data, data-driven decision-making, innovation, product innovation, firm-level data
    JEL: D22 L20 O33
    Date: 2017
  2. By: Strobel, Christina; Kirchkamp, Oliver
    Abstract: Increasingly, the partner in a joint decision is not another human but a machine. Here we ask whether a machine partner affects our responsibility, our perception of the choice, and our choice itself differently from a human partner. We use a modified dictator game with two joint decision makers: either two humans, or one human and one machine. We find a strong treatment effect on perceived responsibility. We find, however, only a small and insignificant effect on actual choices.
    JEL: C91 C92
    Date: 2017
  3. By: Christopher Blattman; Donald Green; Daniel Ortega; Santiago Tobón
    Abstract: Bogotá intensified state presence to make high-crime streets safer. We show that spillovers outweighed direct effects on security. We randomly assigned 1,919 “hot spot” streets to eight months of doubled policing, increased municipal services, both, or neither. Spillovers in dense networks cause “fuzzy clustering,” and we show valid hypothesis testing requires randomization inference. State presence improved security on hot spots. But data from all streets suggest that intensive policing pushed property crime around the corner, with ambiguous impacts on violent crime. Municipal services had positive but imprecise spillovers. These results contrast with prior studies concluding policing has positive spillovers.
    JEL: C93 K42 O10
    Date: 2017–10
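The randomization inference the authors call for can be sketched as a permutation test: re-draw the treatment assignment under the sharp null of no effect and compare the observed difference in means to the permutation distribution. The data below are synthetic and purely illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hot-spot data: treated streets see fewer crimes on average.
n_treated, n_control = 100, 100
treated = rng.permutation(np.array([1] * n_treated + [0] * n_control))
crimes = rng.poisson(5 - 1.0 * treated)

def diff_in_means(y, t):
    return y[t == 1].mean() - y[t == 0].mean()

observed = diff_in_means(crimes, treated)

# Randomization inference: re-randomize assignment under the sharp null
# of no treatment effect and recompute the test statistic each time.
perm_stats = np.array([
    diff_in_means(crimes, rng.permutation(treated))
    for _ in range(5000)
])
p_value = np.mean(np.abs(perm_stats) >= abs(observed))
print(f"observed diff = {observed:.2f}, randomization p = {p_value:.3f}")
```

In a clustered design like the paper's, the re-randomization would respect the actual assignment mechanism (hot spots within networks) rather than a simple shuffle.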
  4. By: Donna Feir (Department of Economics, University of Victoria); Rob Gillezeau (Department of Economics, University of Victoria); Maggie Jones (Department of Economics, Queen's University)
    Abstract: There are over 1,000 First Nations and Inuit communities in Canada. However, the most comprehensive public data source on economic activity, the Community Well-Being (CWB) database, only includes consistent data for 357 of these communities every five years between 1991 and 2011. We propose an alternative measure of economic well-being that is available annually since 1992 for all First Nations, Inuit, and non-Indigenous communities in Canada: nighttime light density from satellites. Nighttime light data have been used by development economists to measure economic activity elsewhere and have been shown to be a flexible alternative to traditional measures of economic activity. We find that nighttime light density is a useful proxy for per capita income in the Canadian context and provide evidence of sample selection issues with the pre-existing indicators of well-being in First Nations and Inuit communities. We suggest that using nighttime light density overcomes the biased selection of communities into the CWB samples and thus may present a more complete picture of economic activity in Canada.
    Keywords: Light density, nighttime light density, Indigenous peoples, Economic development, Community Well-Being Index
    JEL: I15 J15 J24
    Date: 2017–10–23
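The proxy logic in this paper, regressing income on light density in logs and using the fitted line to impute income where only satellite data exist, can be sketched on synthetic data. The elasticity and noise levels below are illustrative assumptions, not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic communities: log income drives log light density with noise.
n = 500
log_income = rng.normal(10.0, 0.8, n)
log_lights = 0.3 * log_income + rng.normal(0.0, 0.2, n)

# OLS of log income on log lights: the fitted line lets us impute income
# for communities with satellite coverage but no survey data.
slope, intercept = np.polyfit(log_lights, log_income, 1)
imputed = intercept + slope * log_lights
r2 = 1 - np.var(log_income - imputed) / np.var(log_income)
print(f"slope = {slope:.2f}, R^2 = {r2:.2f}")
```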
  5. By: Hinz, Julian
    Abstract: I compute distances used in the gravity model of international trade that improve the existing measures along multiple lines and help remedy the border puzzle by up to 50%. I derive a trade cost aggregation that is agnostic to the underlying gravity framework while taking into account the economic geography of countries. Using this method I compute bilateral and internal country distances, making use of nightlight satellite imagery for information on the economic geography of countries.
    JEL: F10 F14
    Date: 2017
  6. By: Kwon, Youngsun; Kwon, Joungwon
    Abstract: Telecommunications Policy (TP) marked its 40-year milestone in 2016. At this juncture, this paper applies text analysis to keyword frequency data derived from the abstracts of papers published over the past 40 years, to sketch the big picture of the key concepts that constitute the journal's research subjects. Using keywords and bibliographic data, the paper calculates key research indexes to track how the journal's research focus has changed over time. We find that differences in research performance across research subjects within three continents have declined over time, although wide differences across subjects within nations remain, and that the mid-1990s marked a watershed dividing the 40-year history into two parts.
    Keywords: Telecommunications policy,Key research area,Key research index,Theil index,Text analysis,Bibliographic analysis
    Date: 2017
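The Theil index listed in the keywords is a standard concentration measure; applied here it would quantify how unevenly research output is spread across subjects. A minimal computation on hypothetical paper counts:

```python
import numpy as np

def theil_index(x):
    """Theil T index: 0 under perfect equality, up to ln(n) when one
    unit holds everything."""
    x = np.asarray(x, dtype=float)
    share = x / x.mean()
    return float(np.mean(share * np.log(share)))

# Hypothetical paper counts across five research subjects.
equal = [20, 20, 20, 20, 20]
skewed = [80, 10, 5, 3, 2]
print(theil_index(equal), theil_index(skewed))
```

A declining Theil index over time would correspond to the convergence across subjects that the paper reports.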
  7. By: Luis E. Arango (Banco de la República de Colombia); Javier Pantoja; Carlos Velásquez
    Abstract: We carry out a reading analysis that consists of two elements. First, we examine the coherence between monetary policy actions and press releases. Here we find that inflation and growth are significant themes in the adoption of policy measures between September 2004 and March 2016. Moreover, when inflation and economic growth are both rising, monetary policy becomes tighter; nevertheless, the coefficients on economic activity are always greater than those on inflation. Second, the monetary authority goes beyond explanation in its press releases: there are traces of forward guidance in a number of communications, with different degrees of commitment. We also assess whether Colombia’s Central Bank uses its communications as a complementary monetary policy tool and estimate the effectiveness of this strategy. To do so, we use a machine learning technique to uncover the semantic structure of the central bank’s communications. This technique allows us to extract semantic factors that are used in a structural VAR to identify and measure the impact of these communications on inflation expectations. Our results indicate that Colombia’s Central Bank uses communications as a monetary policy tool and that this strategy influences market inflation expectations.
    JEL: C4 E4 E5
    Keywords: text mining, content analysis, latent semantic analysis, central bank’s communications
    Date: 2017–10
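The latent-semantic-analysis step the keywords refer to is, at its core, a truncated SVD of a term-document matrix: the leading singular directions are the "semantic factors" that could then feed a structural VAR. The tiny matrix and factor count below are illustrative assumptions, not the authors' data.

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = press releases,
# entries = raw term frequencies (real work would typically use TF-IDF).
terms = ["inflation", "growth", "rates", "exports"]
X = np.array([
    [3, 0, 2, 0],   # inflation
    [1, 0, 3, 0],   # growth
    [2, 1, 2, 1],   # rates
    [0, 4, 0, 3],   # exports
], dtype=float)

# Latent semantic analysis: truncated SVD of the term-document matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_factors = (np.diag(s[:k]) @ Vt[:k]).T  # documents in k-dim semantic space
print(doc_factors.round(2))
```

Each row of `doc_factors` places one press release in a low-dimensional semantic space; its time series over releases is what a VAR could take as input.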
  8. By: Anastasia Kratsios; Cody B. Hyndman
    Abstract: We develop a method for incorporating relevant non-Euclidean geometric information into a broad range of classical filtering and statistical or machine learning algorithms. We apply these techniques to approximate the solution of the non-Euclidean filtering problem to arbitrary precision. We then extend the particle filtering algorithm to compute our asymptotic solution to arbitrary precision. Moreover, we find explicit error bounds measuring the discrepancy between our locally triangulated filter and the true theoretical non-Euclidean filter. Our methods are motivated by certain fundamental problems in mathematical finance. In particular we apply these filtering techniques to incorporate the non-Euclidean geometry present in stochastic volatility models and optimal Markowitz portfolios. We also extend Euclidean statistical or machine learning algorithms to non-Euclidean problems by using the local triangulation technique, which we show improves the accuracy of the original algorithm. We apply the local triangulation method to obtain improvements of the (sparse) principal component analysis and the principal geodesic analysis algorithms and show how these improved algorithms can be used to parsimoniously estimate the evolution of the shape of forward-rate curves. While focused on financial applications, the non-Euclidean geometric techniques presented in this paper can be employed to provide improvements to a range of other statistical or machine learning algorithms and may be useful in other areas of application.
    Date: 2017–10
  9. By: Benjamin David
    Abstract: The aim of this paper is to highlight the advantages of algorithmic methods for quantitatively oriented economic research. We describe four typical problems involved in econometric modeling, namely the choice of explanatory variables, of a functional form, of a probability distribution, and of the interactions to include in a model. We detail how these problems can be solved by using "CART" and "Random Forest" algorithms in a context of rapidly increasing data availability. We base our analysis on two examples, the identification of growth drivers and the prediction of growth cycles. More generally, we also discuss the fields of application of these methods, which come from a machine-learning framework, by underlining their potential for economic applications.
    Keywords: decision trees, CART, Random Forest
    JEL: C4 C18 C38
    Date: 2017
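The two algorithms named above can be sketched from scratch: a CART tree picks the split that most reduces squared error, and a random forest bags many such trees and averages their predictions. For brevity the trees here are depth-1 stumps without per-split feature subsampling, and the "growth driver" data are synthetic, so this is a minimal sketch of the idea rather than a full implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_stump(X, y):
    """CART with depth 1: best (feature, threshold) split by squared error."""
    best_sse, best = np.inf, None
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], np.linspace(0.05, 0.95, 20)):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            sse = ((left - left.mean()) ** 2).sum() \
                + ((right - right.mean()) ** 2).sum()
            if sse < best_sse:
                best_sse, best = sse, (j, t, left.mean(), right.mean())
    return best

def predict_stump(stump, X):
    j, t, left_mean, right_mean = stump
    return np.where(X[:, j] <= t, left_mean, right_mean)

def fit_forest(X, y, n_trees=50):
    """Random-forest-style ensemble: fit each stump on a bootstrap sample."""
    rows = [rng.integers(0, len(y), len(y)) for _ in range(n_trees)]
    return [fit_stump(X[i], y[i]) for i in rows]

def predict_forest(trees, X):
    return np.mean([predict_stump(t, X) for t in trees], axis=0)

# Hypothetical "growth driver" data: output jumps when x0 crosses 0.5.
X = rng.uniform(0.0, 1.0, (300, 3))
y = np.where(X[:, 0] > 0.5, 2.0, 0.0) + rng.normal(0.0, 0.1, 300)
preds = predict_forest(fit_forest(X, y), X)
rmse = float(np.sqrt(np.mean((preds - y) ** 2)))
print(f"in-sample RMSE: {rmse:.2f}")
```

Note how the threshold split is found by search rather than assumed: this is exactly why the paper argues these methods sidestep the choice of functional form.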
  10. By: Anastasia Kratsios; Cody B. Hyndman
    Abstract: We introduce a path-dependent geometric framework which generalizes the HJM modeling approach to a wide variety of other asset classes. A machine learning regularization framework is developed with the objective of removing arbitrage opportunities from models within this general framework. The regularization method relies on minimal deformations of a model subject to a path-dependent penalty that detects arbitrage opportunities. We prove that the solution of this regularization problem is independent of the arbitrage-penalty chosen, subject to a fixed information loss functional. In addition to the general properties of the minimal deformation, we also consider several explicit examples. This paper is focused on placing machine learning methods in finance on a sound theoretical basis and the techniques developed to achieve this objective may be of interest in other areas of application.
    Date: 2017–10
  11. By: Michael Ludkovski; James Risk
    Abstract: We consider calculation of capital requirements when the underlying economic scenarios are determined by simulatable risk factors. In the respective nested simulation framework, the goal is to estimate portfolio tail risk, quantified via VaR or TVaR of a given collection of future economic scenarios representing factor levels at the risk horizon. Traditionally, evaluating portfolio losses of an outer scenario is done by computing a conditional expectation via inner-level Monte Carlo and is computationally expensive. We introduce several inter-related machine learning techniques to speed up this computation, in particular by properly accounting for the simulation noise. Our main workhorse is an advanced Gaussian Process (GP) regression approach which uses nonparametric spatial modeling to efficiently learn the relationship between the stochastic factors defining scenarios and corresponding portfolio value. Leveraging this emulator, we develop sequential algorithms that adaptively allocate inner simulation budgets to target the quantile region. The GP framework also yields better uncertainty quantification for the resulting VaR/TVaR estimators that reduces bias and variance compared to existing methods. We illustrate the proposed strategies with two case-studies in two and six dimensions.
    Date: 2017–10
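The emulator workflow described above (run the expensive inner simulation at a few design points, fit a Gaussian process that accounts for the simulation noise, then read tail risk off the cheap predictions) can be sketched in one dimension. The risk factor, kernel length-scale, noise level, and toy loss function below are illustrative assumptions, not the paper's setup, and the sequential budget-allocation step is omitted.

```python
import numpy as np

rng = np.random.default_rng(3)

def rbf(a, b, ell=0.3):
    """Squared-exponential kernel on 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

# Outer scenarios: one risk factor; the true loss is smooth in it.
scenarios = rng.normal(0.0, 1.0, 1000)

def true_loss(z):
    return z ** 2 + z   # stand-in for the costly inner valuation

# Run the "inner simulation" at only 30 design points, with noise.
noise_sd = 0.2
design = np.linspace(-3.0, 3.0, 30)
noisy_obs = true_loss(design) + rng.normal(0.0, noise_sd, design.size)

# GP regression: posterior mean at all outer scenarios, with the
# simulation noise entering through the diagonal nugget term.
K = rbf(design, design) + noise_sd ** 2 * np.eye(design.size)
alpha = np.linalg.solve(K, noisy_obs)
predicted_loss = rbf(scenarios, design) @ alpha

# Tail risk from the emulated losses: 99% VaR as an empirical quantile.
var_99 = np.quantile(predicted_loss, 0.99)
print(f"emulated 99% VaR: {var_99:.2f}")
```

Only 30 inner runs stand in for what nested simulation would do at all 1,000 scenarios; the paper's sequential designs then concentrate further runs near the quantile region.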
  12. By: Masaaki Fujii (Quantitative Finance Course, Graduate School of Economics, The University of Tokyo); Akihiko Takahashi (Quantitative Finance Course, Graduate School of Economics, The University of Tokyo); Masayuki Takahashi (Quantitative Finance Course, Graduate School of Economics, The University of Tokyo)
    Abstract: We demonstrate that the use of asymptotic expansion as prior knowledge in the “deep BSDE solver”, which is a deep learning method for high dimensional BSDEs proposed by Weinan E, Han & Jentzen (2017), drastically reduces the loss function and accelerates the speed of convergence. We illustrate the technique and its implications using Bergman’s model with different lending and borrowing rates and a class of quadratic-growth BSDEs.
    Date: 2017–10
  13. By: Masaaki Fujii (Faculty of Economics, The University of Tokyo); Akihiko Takahashi (Faculty of Economics, The University of Tokyo); Masayuki Takahashi (Graduate School of Economics, The University of Tokyo)
    Abstract: We demonstrate that the use of asymptotic expansion as prior knowledge in the "deep BSDE solver", which is a deep learning method for high dimensional BSDEs proposed by Weinan E, Han & Jentzen (2017), drastically reduces the loss function and accelerates the speed of convergence. We illustrate the technique and its implications using Bergman's model with different lending and borrowing rates and a class of quadratic-growth BSDEs.
    Date: 2017–10

This nep-big issue is ©2017 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at <>. For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.