nep-big New Economics Papers
on Big Data
Issue of 2018-04-02
four papers chosen by
Tom Coupé
University of Canterbury

  1. Can Media and Text Analytics Provide Insights into Labour Market Conditions in China? By Jeannine Bailliu; Xinfen Han; Mark Kruger; Yu-Hsien Liu; Sri Thanabalasingam
  2. The effect of big data on recommendation quality: The example of internet search By Schaefer, Maximilian; Sapi, Geza; Lorincz, Szabolcs
  3. Does Deforestation Increase Malaria Prevalence? Evidence from Satellite Data and Health Surveys By Sebastian Bauhoff; Jonah Busch
  4. Big Data aus wettbewerbs- und ordnungspolitischer Perspektive By Haucap, Justus

  1. By: Jeannine Bailliu; Xinfen Han; Mark Kruger; Yu-Hsien Liu; Sri Thanabalasingam
    Abstract: The official Chinese labour market indicators have been seen as problematic, given their small cyclical movement and their only-partial capture of the labour force. In our paper, we build a monthly Chinese labour market conditions index (LMCI) using text analytics applied to mainland Chinese-language newspapers over the period from 2003 to 2017. We use a supervised machine learning approach by training a support vector machine classification model. The information content and the forecast ability of our LMCI are tested against official labour market activity measures in wage and credit growth estimations. Surprisingly, one of our findings is that the much-maligned official labour market indicators do contain information. However, their information content is not robust and, in many cases, our LMCI can provide forecasts that are significantly superior. Moreover, regional disaggregation of the LMCI illustrates that labour conditions in the export-oriented coastal region are sensitive to export growth, while those in inland regions are not. This suggests that text analytics can, indeed, be used to extract useful labour market information from Chinese newspaper articles.
    Keywords: Econometric and statistical methods, International topics, Labour markets
    JEL: C38 E24 E27
    Date: 2018
  2. By: Schaefer, Maximilian; Sapi, Geza; Lorincz, Szabolcs
    Abstract: Are there economies of scale to data in internet search? This paper is the first to use real search engine query logs to empirically investigate how data drives the quality of internet search results. We find evidence that the quality of search results improves with more data on previous searches. Moreover, our results indicate that the type of data matters as well: personalized information is particularly valuable as it massively increases the speed of learning. We also provide some evidence that factors not directly related to data, such as the general quality of the applied algorithms, play an important role. The suggested methods to disentangle the effect of data from other factors driving the quality of search results can be applied to assess the returns to data in various recommendation systems in e-commerce, including product and information search. We also discuss the managerial, privacy, and competition policy implications of our findings.
    Keywords: Big Data,Recommendation quality,Internet search,E-Commerce,Economies of Scale,Search engines
    Date: 2018
  3. By: Sebastian Bauhoff (Center for Global Development); Jonah Busch (Center for Global Development)
    Abstract: Deforestation has been found to increase malaria risk in some settings, while a growing number of studies have found that deforestation increases malaria prevalence in humans, suggesting that in some cases forest conservation might belong in a portfolio of anti-malarial interventions. However, previous studies of deforestation and malaria prevalence were based on a small number of countries and observations, commonly using cross-sectional analyses of less-than-ideal forest data at the aggregate jurisdictional level. In this paper we combine fourteen years of high-resolution satellite data on forest loss with individual-level survey data on malaria in more than 60,000 rural children in 17 countries in Africa, and fever in more than 470,000 rural children in 41 countries in Latin America, Africa, and Asia. Adhering to methods that we pre-specified in a pre-analysis plan, we tested ex-ante hypotheses derived from previous literature. We did not find that deforestation increases malaria prevalence nor that intermediate levels of forest cover have higher malaria prevalence. Our findings differ from most previous empirical studies, which found that deforestation is associated with greater malaria prevalence in other contexts. We speculate that this difference may be because deforestation in Africa is largely driven by the slow expansion of subsistence or smallholder agriculture for domestic use by long-time residents in stable socio-economic settings rather than by rapid clearing for market-driven agricultural exports by new frontier migrants as in Latin America and Asia. Our results imply that at least in Africa anti-malarial efforts should focus on other proven interventions such as bed nets, spraying, and housing improvements. Forest conservation efforts should focus on securing other benefits of forests, including carbon storage, biodiversity habitat, clean water provision, and other goods and services.
    Keywords: Africa, pre-analysis plan, public health, Sustainable Development Goals
    JEL: C21 C23 I18 Q23
    Date: 2018-03-22
  4. By: Haucap, Justus
    Abstract: This article examines the role of data as a competitive factor and analyses where the regulatory framework needs to be adapted. It first discusses the current antitrust treatment of the collection and processing of data, along with the possible need for antitrust reform, and then turns to the implications of data-driven pricing. This covers dynamic as well as personalized data-driven pricing and pricing by means of algorithms. The article also addresses the opportunities and challenges of the sharing economy as an example of new data-driven business models, before turning to broadband expansion and digital entrepreneurship. In addition, it discusses changes in mobility, the literary sector, and the media industry as examples.
    Date: 2018

This nep-big issue is ©2018 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.