nep-big New Economics Papers
on Big Data
Issue of 2018‒01‒01
four papers chosen by
Tom Coupé
University of Canterbury

  1. On monitoring development using high resolution satellite images By Potnuru Kishen Suraj; Ankesh Gupta; Makkunda Sharma; Sourabh Bikash Paul; Subhashis Banerjee
  2. Listening to Chaotic Whispers: A Deep Learning Framework for News-oriented Stock Trend Prediction By Ziniu Hu; Weiqing Liu; Jiang Bian; Xuanzhe Liu; Tie-Yan Liu
  3. How do the EM Central Bank talk? A Big Data approach to the Central Bank of Turkey By Joaquin Iglesias; Alvaro Ortiz; Tomasa Rodrigo
  4. Forecasting Tourist Arrivals in Prague: Google Econometrics By Zeynalov, Ayaz

  1. By: Potnuru Kishen Suraj; Ankesh Gupta; Makkunda Sharma; Sourabh Bikash Paul; Subhashis Banerjee
    Abstract: We develop a predictive machine-learning-based tool for accurate regression of development and socio-economic indicators from high-resolution daytime satellite imagery. The indicators are derived from Census 2011 [The Ministry of Home Affairs, Government of India, 2011] and NFHS-4 [The Ministry of Health and Family Welfare, Government of India, 2016] survey data. We use a deep convolutional neural network to build a model for accurate prediction of asset indicators from satellite images. We show that the direct regression of asset indicators is more accurate than transfer learning through night light data, which is a popular proxy for economic development used worldwide. We use the asset prediction model for accurate transfer learning of other socio-economic and health indicators which are not intuitively related to observable features in satellite images. The tool can be extended to monitor the progress of development of a region over time, and to flag potential anomalies, such as dissimilar outcomes due to different policy interventions in a geographic region, by detecting sharp spatial discontinuities in the regression output.
    Date: 2017–12
  2. By: Ziniu Hu; Weiqing Liu; Jiang Bian; Xuanzhe Liu; Tie-Yan Liu
    Abstract: Stock trend prediction plays a critical role in seeking maximized profit from stock investment. However, precise trend prediction is very difficult because of the highly volatile and non-stationary nature of the stock market. The explosion of information on the Internet, together with advances in natural language processing and text mining techniques, has enabled investors to unveil market trends and volatility from online content. Unfortunately, the quality, trustworthiness and comprehensiveness of online content related to the stock market vary drastically, and a large portion consists of low-quality news, comments, or even rumors. To address this challenge, we imitate the learning process of human beings facing such chaotic online news, driven by three principles: sequential content dependency, diverse influence, and effective and efficient learning. In this paper, to capture the first two principles, we design Hybrid Attention Networks to predict the stock trend based on the sequence of recent related news. Moreover, we apply the self-paced learning mechanism to imitate the third principle. Extensive experiments on real-world stock market data demonstrate the effectiveness of our approach.
    Date: 2017–12
  3. By: Joaquin Iglesias; Alvaro Ortiz; Tomasa Rodrigo
    Abstract: We apply natural language processing (NLP), or computational linguistics, to the analysis of the communication policy (i.e., statements and minutes) of the Central Bank of Turkey (CBRT). While previous literature has focused on developed countries, we extend the NLP analysis to the central banks of emerging markets using the Dynamic Topic Modelling approach.
    Keywords: Working Paper, Central Banks, Digital economy, Economic Analysis, Emerging Economies, Turkey
    JEL: E52 E58
    Date: 2017–12
  4. By: Zeynalov, Ayaz
    Abstract: It is expected that what people are searching for today is predictive of what they have done recently or will do in the near future. This study analyzes the reliability of Google search data in predicting tourist arrivals and overnight stays in Prague. Three differently weighted weekly mixed-data sampling (MIDAS) models, an ARIMA(1,1,1) model with monthly Google Trends information, and a model without an informative Google Trends variable have been evaluated. The main objective was to assess whether Google Trends information is useful for forecasting tourist arrivals and overnight stays in Prague, as well as whether higher-frequency (weekly) data outperform same-frequency data methods. The results of the study indicate that Google Trends has undeniable potential to offer more accurate forecasting, particularly for tourism. Forecasts of the indicators using weekly MIDAS-Beta models for tourist arrivals and weekly MIDAS-Almon models for overnight stays outperformed ARIMA with monthly Google Trends and models without Google Trends. The results confirm that predictions based on Google searches are advantageous for policy makers and businesses operating in the tourism sector.
    Keywords: Google trends, Mixed-data sampling, forecasting, tourism
    JEL: C53 E17 L83
    Date: 2017–12–01

This nep-big issue is ©2018 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.