nep-big New Economics Papers
on Big Data
Issue of 2018‒03‒19
thirteen papers chosen by
Tom Coupé
University of Canterbury

  1. The Impact of Big Data on Firm Performance: An Empirical Investigation By Patrick Bajari; Victor Chernozhukov; Ali Hortaçsu; Junichi Suzuki
  2. Measuring Retail Trade Using Card Transactional Data By Diego Bodas; Juan Ramon Garcia; Juan Murillo; Matias Pacce; Tomasa Rodrigo; Juan de Dios Romero; Pep Ruiz; Camilo Ulloa; Heribert Valero
  3. Automation, skills use and training By Glenda Quintini
  4. Analysis of Financial Credit Risk Using Machine Learning By Jacky C. K. Chow
  5. Algorithmic Collusion in Cournot Duopoly Market: Evidence from Experimental Economics By Nan Zhou; Li Zhang; Shijian Li; Zhijian Wang
  6. Credit Risk Analysis using Machine and Deep Learning models By Peter Addo; Dominique Guegan; Bertrand Hassani
  7. Decision Sciences, Economics, Finance, Business, Computing, and Big Data: Connections By Chia-Lin Chang; Michael McAleer; Wing-Keung Wong
  8. "Bitcoin technical trading with artificial neural network" By Masafumi Nakano; Akihiko Takahashi; Soichiro Takahashi
  9. Radial Basis Functions Neural Networks for Nonlinear Time Series Analysis and Time-Varying Effects of Supply Shocks By KANAZAWA, Nobuyuki
  10. The Bank of Canada 2015 Retailer Survey on the Cost of Payment Methods: Nonresponse By Stan Hatko
  11. Strategy for Excavating Best Practice of Scientific and Cultural Contents By Jaeho Lee; Jaekwoun Shim; Hyunkyung Shin; Gayoung Lee; Junggyu Lee
  12. Bitcoin technical trading with artificial neural network By Masafumi Nakano; Akihiko Takahashi; Soichiro Takahashi
  13. "Bitcoin technical trading with artificial neural network" By Masafumi Nakano; Akihiko Takahashi; Soichiro Takahashi

  1. By: Patrick Bajari; Victor Chernozhukov; Ali Hortaçsu; Junichi Suzuki
    Abstract: In academic and policy circles, there has been considerable interest in the impact of “big data” on firm performance. We examine how the amount of data affects the accuracy of machine-learned models of weekly retail product forecasts, using a proprietary data set obtained from Amazon. We examine the accuracy of forecasts in two relevant dimensions: the number of products (N) and the number of time periods for which a product is available for sale (T). Theory suggests diminishing returns to larger N and T, with relative forecast errors diminishing at rate 1/√N + 1/√T. Empirical results indicate gains in forecast accuracy in the T dimension: as more data become available for a particular product, demand forecasts for that product improve over time, though with diminishing returns to scale. In contrast, we find an essentially flat N effect across the various lines of merchandise: with a few exceptions, expansion in the number of retail products within a category does not appear to be associated with improvements in forecast performance. We do find that the firm’s overall forecast performance, controlling for N and T effects across product lines, has improved over time, suggesting gradual gains from the introduction of new models and improved technology.
    JEL: C53 L81
    Date: 2018–02
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:24334&r=big
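    The 1/√N + 1/√T rate quoted above can be illustrated with a toy Monte Carlo exercise. The sketch below is not from the paper; it assumes a single product with normally distributed weekly demand and shows that the error of a naive sample-mean forecast shrinks roughly like σ/√T as the sales history lengthens.

```python
# Toy illustration (not from the paper): the error of a naive sample-mean
# demand forecast shrinks roughly like sigma/sqrt(T) as a product's sales
# history grows, mirroring the 1/sqrt(T) term in the theoretical rate.
import numpy as np

rng = np.random.default_rng(0)
true_mean = 20.0    # hypothetical mean weekly demand for one product
sigma = 5.0         # week-to-week demand noise

for T in (4, 16, 64, 256):
    sq_errors = []
    for _ in range(2000):                       # Monte Carlo replications
        history = rng.normal(true_mean, sigma, size=T)
        forecast = history.mean()               # naive sample-mean forecast
        sq_errors.append((forecast - true_mean) ** 2)
    rmse = np.sqrt(np.mean(sq_errors))
    print(f"T={T:4d}  RMSE={rmse:.3f}  sigma/sqrt(T)={sigma/np.sqrt(T):.3f}")
```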
  2. By: Diego Bodas; Juan Ramon Garcia; Juan Murillo; Matias Pacce; Tomasa Rodrigo; Juan de Dios Romero; Pep Ruiz; Camilo Ulloa; Heribert Valero
    Abstract: In this paper we present a high-dimensionality Retail Trade Index (RTI) constructed to nowcast the retail trade sector economic performance in Spain, using Big Data sources and techniques. The data are the footprints of BBVA clients from their credit or debit card transactions at Spanish point of sale (PoS) terminals.
    Keywords: Working Paper, Economic Analysis, Spain
    JEL: C32 C81 E21
    Date: 2018–03
    URL: http://d.repec.org/n?u=RePEc:bbv:wpaper:1803&r=big
  3. By: Glenda Quintini
    Abstract: This study focuses on the risk of automation and its interaction with training and the use of skills at work. Building on the expert assessment carried out by Carl Frey and Michael Osborne in 2013, the paper estimates the risk of automation for individual jobs based on the Survey of Adult Skills (PIAAC). The analysis improves on other international estimates of the individual risk of automation by using a more disaggregated occupational classification and by identifying the same automation bottlenecks that emerged from the experts’ discussion. Hence, it aligns more closely with the initial assessment of the automation potential arising from developments in machine learning. Furthermore, the study applies the same methodology to national data from Germany and the United Kingdom, providing insights into the robustness of the results. The risk of automation is estimated for the 32 OECD countries that have participated in the Survey of Adult Skills (PIAAC) so far. Beyond the share of jobs likely to be significantly disrupted by the automation of production and services, the emphasis is placed on the characteristics of these jobs and of the workers who hold them. The risk is also assessed against the use of ICT at work and the role of training in helping workers transition to new career opportunities.
    JEL: J20 J21 J23 J24
    Date: 2018–03–08
    URL: http://d.repec.org/n?u=RePEc:oec:elsaab:202-en&r=big
  4. By: Jacky C. K. Chow
    Abstract: Corporate insolvency can have a devastating effect on the economy. With an increasing number of companies expanding overseas to capitalize on foreign resources, a multinational corporate bankruptcy can disrupt the world's financial ecosystem. Corporations do not fail instantaneously; objective measures and rigorous analysis of qualitative (e.g. brand) and quantitative (e.g. econometric factors) data can help identify a company's financial risk. Gathering and storing data about a corporation have become less difficult with recent advancements in communication and information technologies. The remaining challenge lies in mining the information about a company's health hidden in these vast amounts of data and using it to forecast insolvency, so that managers and stakeholders have time to react. In recent years, machine learning has become a popular field in big data analytics because of its success in learning complicated models. Methods such as support vector machines, adaptive boosting, artificial neural networks, and Gaussian processes can be used to recognize patterns in the data, with a high degree of accuracy, that may not be apparent to human analysts. This thesis studied corporate bankruptcy of manufacturing companies in Korea and Poland using experts' opinions and financial measures, respectively. Using publicly available datasets, several machine learning methods were applied to learn the relationship between a company's current state and its fate in the near future. Results showed that predictions with accuracy greater than 95% were achievable with any of the machine learning techniques when informative features such as experts' assessments were used. However, when purely financial factors were used to predict whether or not a company will go bankrupt, the correlation was not as strong.
    Date: 2018–02
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1802.05326&r=big
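    A minimal sketch of the kind of comparison described above, using the four classifier families named in the abstract on simulated, imbalanced data. The real Korean and Polish datasets are not reproduced here, and all model settings are scikit-learn defaults chosen purely for illustration.

```python
# Hedged sketch: comparing the classifier families named in the abstract
# (SVM, AdaBoost, neural network, Gaussian process) on synthetic data.
# The features and labels are simulated, not the Korean/Polish datasets.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.gaussian_process import GaussianProcessClassifier

# imbalanced classes, loosely mimicking the rarity of bankruptcies
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           weights=[0.9, 0.1], random_state=1)

models = {
    "SVM": SVC(),
    "AdaBoost": AdaBoostClassifier(),
    "Neural net": MLPClassifier(max_iter=1000),
    "Gaussian process": GaussianProcessClassifier(),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:18s} 5-fold accuracy: {acc:.3f}")
```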
  5. By: Nan Zhou; Li Zhang; Shijian Li; Zhijian Wang
    Abstract: Algorithmic collusion is an emerging concern in the current age of artificial intelligence, and whether it constitutes a credible threat remains debated. In this paper, we propose an algorithm that can extort its human rival into colluding in a Cournot duopoly market. In experiments, we show that the algorithm successfully extorts its human rival and earns a higher profit in the long run, while the human rival ends up fully colluding with the algorithm. As a result, social welfare declines rapidly and persistently. Both in theory and in experiment, our work confirms that algorithmic collusion can be a credible threat. In application, we hope that the framework, the algorithm design and the experimental environment illustrated in this work can serve as an incubator or test bed for researchers and policymakers addressing emerging algorithmic collusion.
    Date: 2018–02
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1802.08061&r=big
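    The welfare claim can be made concrete with a textbook linear Cournot duopoly. The sketch below uses illustrative demand and cost parameters, not the paper's experimental settings, to compare the Cournot-Nash outcome with full collusion: collusion raises each firm's profit but lowers output, consumer surplus and total welfare.

```python
# Textbook linear Cournot duopoly: P = a - b*(q1 + q2), constant marginal cost c.
# Illustrative parameters only; not the paper's algorithm or experiment.
a, b, c = 100.0, 1.0, 10.0

def outcome(q_each):
    Q = 2 * q_each
    price = a - b * Q
    profit_each = (price - c) * q_each
    consumer_surplus = 0.5 * b * Q ** 2   # area under linear demand above price
    return Q, price, profit_each, consumer_surplus

q_nash = (a - c) / (3 * b)        # Cournot-Nash quantity per firm
q_collusive = (a - c) / (4 * b)   # each firm produces half the monopoly quantity

for label, q in (("Cournot-Nash", q_nash), ("Full collusion", q_collusive)):
    Q, p, pi, cs = outcome(q)
    print(f"{label:15s} Q={Q:5.1f}  price={p:5.1f}  "
          f"profit/firm={pi:7.1f}  consumer surplus={cs:7.1f}")
```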
  6. By: Peter Addo (Lead Data Scientist - SNCF Mobilité); Dominique Guegan (UP1 - Université Panthéon-Sorbonne, Labex ReFi - UP1 - Université Panthéon-Sorbonne, University of Ca’ Foscari [Venice, Italy], CES - Centre d'économie de la Sorbonne - CNRS - Centre National de la Recherche Scientifique - UP1 - Université Panthéon-Sorbonne, IPAG - IPAG Business School - Ipag); Bertrand Hassani (Labex ReFi - UP1 - Université Panthéon-Sorbonne, Capgemini Consulting [Paris])
    Abstract: Owing to the technology associated with Big Data, data availability and computing power, most banks and lending financial institutions are renewing their business models. Credit risk prediction, monitoring, model reliability and effective loan processing are key to decision-making and transparency. In this work, we build binary classifiers based on machine and deep learning models, using real data to predict loan default probability. The ten most important features from these models are selected and then used in the modelling process to test the stability of the binary classifiers by comparing performance on separate data. We observe that tree-based models are more stable than models based on multilayer artificial neural networks. This raises several questions about the intensive use of deep learning systems in enterprises.
    Keywords: Credit risk, Financial regulation, Data Science, Big Data, Deep learning
    Date: 2018–02
    URL: http://d.repec.org/n?u=RePEc:hal:cesptp:halshs-01719983&r=big
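    A hedged sketch of the workflow the abstract describes, on synthetic data rather than the authors' loan book: fit a tree-based model, keep its ten most important features, then refit both a tree-based model and a multilayer neural network on those features and compare out-of-sample AUC.

```python
# Hedged sketch of the workflow described in the abstract: fit a tree-based
# model and a multilayer neural network on loan-default style data, keep the
# ten most important features from the tree model, refit and compare AUC on
# held-out data. The data here are synthetic, not the authors' loan data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=5000, n_features=40, n_informative=12,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
top10 = np.argsort(forest.feature_importances_)[-10:]   # indices of the top-10 features

models = {
    "random forest (top 10)": RandomForestClassifier(n_estimators=300, random_state=0),
    "MLP (top 10)": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr[:, top10], y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te[:, top10])[:, 1])
    print(f"{name:25s} AUC = {auc:.3f}")
```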
  7. By: Chia-Lin Chang (National Chung Hsing University); Michael McAleer (Asia University, University of Sydney Business School, Erasmus University Rotterdam); Wing-Keung Wong (Asia University, China Medical University Hospital, Hang Seng Management College)
    Abstract: This paper provides a review of some of the connecting literature in Decision Sciences, Economics, Finance, Business, Computing, and Big Data. We then discuss research related to the six cognate disciplines. Academics could develop theoretical models and subsequent econometric and statistical models to estimate the parameters in the associated models. Moreover, they could then conduct simulations to examine whether the estimators or statistics in the new theories on estimation and hypothesis testing have small size and high power. Thereafter, academics and practitioners could apply their theories to analyze interesting problems and issues in the six disciplines and other cognate areas.
    Keywords: Decision sciences; economics; finance; business; computing; big data; theoretical models; econometric and statistical models; applications
    JEL: A10 G00 G31 O32
    Date: 2018–03–14
    URL: http://d.repec.org/n?u=RePEc:tin:wpaper:20180024&r=big
  8. By: Masafumi Nakano (Graduate School of Economics, The University of Tokyo); Akihiko Takahashi (Faculty of Economics, The University of Tokyo); Soichiro Takahashi (Graduate School of Economics, The University of Tokyo)
    Abstract: This paper explores Bitcoin trading based on artificial neural networks for return prediction. In particular, our deep learning method successfully discovers trading signals through a seven-layer neural network for given input data of technical indicators, which are calculated from the past time series of Bitcoin returns over 15-minute intervals. Under feasible settings of execution costs, the numerical experiments demonstrate that our approach significantly improves the performance of a buy-and-hold strategy. Notably, our model performs well in a challenging period from December 2017 to January 2018, during which Bitcoin suffered substantial negative returns.
    URL: http://d.repec.org/n?u=RePEc:tky:fseres:2017cf1078&r=big
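    The authors' seven-layer architecture and Bitcoin data are not reproduced here, so the sketch below is only a stand-in: it builds simple momentum and volatility indicators from a simulated 15-minute return series, trains a small multilayer perceptron to predict the sign of the next return, and compares a long-or-flat strategy (ignoring execution costs) with buy-and-hold.

```python
# Illustrative stand-in (not the authors' model or data): simple technical
# indicators from a simulated 15-minute return series feed a small multilayer
# perceptron that predicts the sign of the next return; the strategy is long
# when the predicted up-probability exceeds 0.5 and flat otherwise (no costs).
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)
returns = pd.Series(rng.normal(0.0, 0.005, size=5000))   # stand-in for 15-min BTC returns

features = pd.DataFrame({
    "mom_4":  returns.rolling(4).sum(),    # 1-hour momentum
    "mom_16": returns.rolling(16).sum(),   # 4-hour momentum
    "vol_16": returns.rolling(16).std(),   # short-horizon volatility
}).shift(1)                                # only information available before each bar
up = (returns > 0).astype(int)

data = pd.concat([features, up.rename("up")], axis=1).dropna()
split = int(len(data) * 0.7)
train, test = data.iloc[:split], data.iloc[split:]
cols = ["mom_4", "mom_16", "vol_16"]

clf = MLPClassifier(hidden_layer_sizes=(32, 16, 8), max_iter=1000, random_state=0)
clf.fit(train[cols], train["up"])

prob_up = clf.predict_proba(test[cols])[:, 1]
test_returns = returns.loc[test.index].to_numpy()
strategy_return = ((prob_up > 0.5) * test_returns).sum()
print(f"buy-and-hold: {test_returns.sum():.4f}   long-or-flat strategy: {strategy_return:.4f}")
```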
  9. By: KANAZAWA, Nobuyuki
    Abstract: I propose a flexible nonlinear method for studying the time series properties of macroeconomic variables. In particular, I focus on a class of Artificial Neural Networks (ANN) called the Radial Basis Functions (RBF). To assess the validity of the RBF approach in the macroeconomic time series analysis, I conduct a Monte Carlo experiment using the data generated from a nonlinear New Keynesian (NK) model. I find that the RBF estimator can uncover the structure of the nonlinear NK model from the simulated data whose length is as small as 300 periods. Finally, I apply the RBF estimator to the quarterly US data and show that the response of the macroeconomic variables to a positive supply shock exhibits a substantial time variation. In particular, the positive supply shocks are found to have significantly weaker expansionary effects during the zero lower bound periods as well as periods between 2003 and 2004. The finding is consistent with a basic NK model, which predicts that the higher real interest rate due to the monetary policy inaction weakens the effects of supply shocks.
    Keywords: Neural Networks, Radial Basis Functions, Zero Lower Bound, Supply Shocks
    JEL: C45 E31
    Date: 2018–03
    URL: http://d.repec.org/n?u=RePEc:hit:hiasdp:hias-e-64&r=big
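    A minimal illustration of a Gaussian radial basis functions approximator, with centres on a fixed grid and output weights fit by ordinary least squares. It is not the paper's estimator or its New Keynesian data, just a sketch of the RBF idea.

```python
# Hedged sketch of a Gaussian radial basis function (RBF) approximator:
# centres taken from a grid, widths fixed, output weights fit by ordinary
# least squares. Illustrative only; not the paper's estimator or its data.
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-3, 3, 300))
y = np.tanh(2 * x) + 0.1 * rng.normal(size=x.size)    # a simple nonlinear target

centres = np.linspace(-3, 3, 20)                      # RBF centres on a grid
width = 0.5

def design(x):
    # each column is a Gaussian bump exp(-(x - c)^2 / (2 * width^2))
    return np.exp(-((x[:, None] - centres[None, :]) ** 2) / (2 * width ** 2))

Phi = design(x)
weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # linear-in-weights fit

x_new = np.array([-2.0, 0.0, 2.0])
print(design(x_new) @ weights)                        # fitted values at new points
```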
  10. By: Stan Hatko
    Abstract: Nonresponse is a considerable challenge in the Retailer Survey on the Cost of Payment Methods conducted by the Bank of Canada in 2015. There are two types of nonresponse in this survey: unit nonresponse, in which a business does not reply to the survey at all, and item nonresponse, in which a business does not respond to particular questions within the survey. Both types may introduce bias when computing statistics such as means and weighted totals for different variables. This technical report analyzes methods for addressing nonresponse in the survey data. Unit nonresponse is addressed through response-probability adjustment, in which response probabilities are modelled using logistic regression (a clustering approach for the unit response probabilities is also considered) and are used in the construction of a set of survey weights. Item nonresponse is addressed through imputation, in which the gradient boosting machine (GBM) and extreme gradient boosting (XGBoost) algorithms are used to predict missing values for variables of interest.
    Keywords: Central bank research
    JEL: C81 C83
    Date: 2017
    URL: http://d.repec.org/n?u=RePEc:bca:bocatr:107&r=big
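    A hedged sketch of the two adjustments described above, on simulated data: response probabilities modelled with logistic regression and inverted into survey weights for unit nonresponse, and a gradient boosting regressor (standing in here for the GBM/XGBoost models in the report) used to impute item nonresponse.

```python
# Hedged sketch on synthetic data: (1) unit nonresponse handled by modelling
# response probabilities with logistic regression and reweighting respondents
# by the inverse probability; (2) item nonresponse handled by imputing missing
# values with gradient boosting (a stand-in for the report's GBM/XGBoost).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 2000
frame = pd.DataFrame({
    "employees": rng.lognormal(2.0, 1.0, n),   # frame variable known for all units
    "region": rng.integers(0, 4, n),
})
# larger retailers are less likely to respond (simulated, not survey data)
p_respond = 1 / (1 + np.exp(-(1.5 - 0.2 * np.log(frame["employees"]))))
responded = rng.uniform(size=n) < p_respond

# (1) response-probability model -> inverse-probability weights for respondents
prop_model = LogisticRegression().fit(frame[["employees", "region"]], responded)
p_hat = prop_model.predict_proba(frame[["employees", "region"]])[:, 1]
weights = 1.0 / p_hat[responded]

# simulated survey variable, observed only for respondents and with item gaps
cost = 50 + 3 * np.log(frame["employees"]) + rng.normal(0, 2, n)
cost[~responded] = np.nan
item_missing = responded & (rng.uniform(size=n) < 0.2)
cost[item_missing] = np.nan

# (2) impute item nonresponse from frame variables with gradient boosting
obs = responded & ~item_missing
imputer = GradientBoostingRegressor().fit(frame.loc[obs, ["employees", "region"]], cost[obs])
cost[item_missing] = imputer.predict(frame.loc[item_missing, ["employees", "region"]])

weighted_mean = np.average(cost[responded], weights=weights)
print(f"weighted mean cost estimate: {weighted_mean:.2f}")
```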
  11. By: Jaeho Lee (Gyeongin National University of Education); Jaekwoun Shim (Korea University); Hyunkyung Shin (Gachon University); Gayoung Lee (Korea Foundation for the Advancement of Science and Creativity); Junggyu Lee (Korea Foundation for the Advancement of Science and Creativity)
    Abstract: Since the 1960s, the Korean Government has been carrying out various projects aimed at the popularization of science. These science popularization projects can be classified by era as follows: the 1960s were the beginning stage, the 1970s and 1980s the formation stage, and the period from the 1990s onward can be regarded as the extension stage of scientific and cultural activities. The Korean government's science popularization project reached a major turning point in 2016, driven above all by the new paradigm of the Fourth Industrial Revolution. Under these circumstances, the Korean government is in the process of establishing a strategy for developing, disseminating, and managing scientific and cultural contents suitable for the Fourth Industrial Revolution era. In order to achieve the project's goal successfully, we benchmarked best practices of scientific and cultural contents. The benchmarking strategy for excavating best practices of scientific and cultural contents proceeded step by step as follows. First, benchmarking countries were selected. To this end, R&D investment as a share of GDP, national science and technology innovation competency rankings, and national brand rankings were considered together. Five benchmarking countries were selected on these criteria: the United States, Japan, Germany, the United Kingdom, and China. Second, we proposed an analytical framework for excavating best practices in scientific and cultural contents. The proposed framework was designed to analyze the service types of contents, the country of production, the field of study, the user types of the contents service, technological areas related to the Fourth Industrial Revolution, and learning value. In this paper, the technical fields related to the Fourth Industrial Revolution were classified as follows: AR (Augmented Reality)/VR (Virtual Reality)/MR (Mixed Reality), Artificial Intelligence (AI), ICBM (Internet of Things, Cloud, Big Data, Mobile), robots, and so on. (This research was supported by the Korea Foundation for the Advancement of Science and Creativity (KOFAC).)
    Keywords: science popularization, scientific and cultural contents, the Fourth Industrial Revolution
    JEL: I29
    Date: 2017–10
    URL: http://d.repec.org/n?u=RePEc:sek:iacpro:5908316&r=big
  12. By: Masafumi Nakano (Graduate School of Economics, University of Tokyo); Akihiko Takahashi (Graduate School of Economics, University of Tokyo); Soichiro Takahashi (Graduate School of Economics, University of Tokyo)
    Abstract: This paper explores Bitcoin trading based on artificial neural networks for return prediction. In particular, our deep learning method successfully discovers trading signals through a seven-layer neural network for given input data of technical indicators, which are calculated from the past time series of Bitcoin returns over 15-minute intervals. Under feasible settings of execution costs, the numerical experiments demonstrate that our approach significantly improves the performance of a buy-and-hold strategy. Notably, our model performs well in a challenging period from December 2017 to January 2018, during which Bitcoin suffered substantial negative returns.
    URL: http://d.repec.org/n?u=RePEc:cfi:fseres:cf430&r=big
  13. By: Masafumi Nakano (Graduate School of Economics, The University of Tokyo); Akihiko Takahashi (Faculty of Economics, The University of Tokyo); Soichiro Takahashi (Graduate School of Economics, The University of Tokyo)
    Abstract: This paper explores Bitcoin trading based on artificial neural networks for return prediction. In particular, our deep learning method successfully discovers trading signals through a seven-layer neural network for given input data of technical indicators, which are calculated from the past time series of Bitcoin returns over 15-minute intervals. Under feasible settings of execution costs, the numerical experiments demonstrate that our approach significantly improves the performance of a buy-and-hold strategy. Notably, our model performs well in a challenging period from December 2017 to January 2018, during which Bitcoin suffered substantial negative returns.
    URL: http://d.repec.org/n?u=RePEc:tky:fseres:2018cf1078&r=big

This nep-big issue is ©2018 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.