nep-big New Economics Papers
on Big Data
Issue of 2018‒12‒10
eight papers chosen by
Tom Coupé
University of Canterbury

  1. Term Structure Models During the Global Financial Crisis: A Parsimonious Text Mining Approach By Kiyohiko G. Nishimura; Seisho Sato; Akihiko Takahashi
  2. Lagged correlation-based deep learning for directional trend change prediction in financial time series By Ben Moews; J. Michael Herrmann; Gbenga Ibikunle
  3. Machine learning in algorithmic trading strategy optimization - implementation and efficiency By Przemysław Ryś; Robert Ślepaczuk
  4. Spatial Inefficiencies in Africa’s Trade Network By Tilman Graff
  5. Model Averaging and its Use in Economics By Steel, Mark F. J.
  6. Investments in big data analytics and firm performance: an empirical investigation of direct and mediating effects By Elisabetta Raguseo; Claudio Vitari
  7. Technological Singularity: A connectomics perspective By Adam Fedyniuk
  8. WIC Participation and Relative Quality of Household Food Purchases: Evidence from FoodAPS By Di Fang; Michael R. Thomsen; Rodolfo M. Nayga, Jr.; Aaron M. Novotny

  1. By: Kiyohiko G. Nishimura (National Graduate Institute for Policy Studies (GRIPS) and CARF, University of Tokyo); Seisho Sato (Graduate School of Economics and CARF, University of Tokyo); Akihiko Takahashi (Graduate School of Economics and CARF, University of Tokyo)
    Abstract: This work develops and estimates a three-factor term structure model with explicit sentiment factors in a period including the global financial crisis, when market confidence was said to erode considerably. It uses a large text dataset of real-time, relatively high-frequency market news and takes account of the difficulties in incorporating market sentiment into the models. To the best of our knowledge, this is the first attempt to use this category of data in term-structure models. Although market sentiment or market confidence is often regarded as an important driver of asset markets, it is not explicitly incorporated in traditional empirical factor models for daily yield curve data because it is unobservable. To overcome this problem, we use a text mining approach to generate observable variables which are driven by otherwise unobservable sentiment factors. Then, applying the Monte Carlo filter as a filtering method in a state space Bayesian filtering approach, we estimate the dynamic stochastic structure of these latent factors from the observable variables they drive. As a result, the three-factor model with text mining is able to distinguish (1) a spread-steepening factor, which is driven by pessimists’ views and explains the spreads related to ultra-long term yields, from (2) a spread-flattening factor, which is driven by optimists’ views and influences the long and medium term spreads. Also, the three-factor model with text mining fits the observed yields better than the model without text mining. Moreover, we collect market participants’ views about specific spreads in the term structure and find that the movements of the identified sentiment factors are consistent with the market participants’ views, and thus with market sentiment.
    Date: 2018–11
  2. By: Ben Moews; J. Michael Herrmann; Gbenga Ibikunle
    Abstract: Trend change prediction in complex systems with a large number of noisy time series is a problem with many applications for real-world phenomena, with stock markets as a notoriously difficult-to-predict example of such systems. We approach predictions of directional trend changes via complex lagged correlations between these series, excluding any information about the target series from the respective inputs to achieve predictions purely based on such correlations with other series. We propose the use of deep neural networks that employ step-wise linear regressions with exponential smoothing in the preparatory feature engineering for this task, with regression slopes as trend strength indicators for a given time interval. We apply this method to historical stock market data from 2011 to 2016 as a use case example of lagged correlations between large numbers of time series that are heavily influenced by externally arising new information as a random factor. The results demonstrate the viability of the proposed approach, with state-of-the-art accuracies, tests of statistical significance for additional validation, and important implications for modern financial economics.
    Date: 2018–11
  3. By: Przemysław Ryś (Quantitative Finance Research Group, Faculty of Economic Sciences, University of Warsaw); Robert Ślepaczuk (Quantitative Finance Research Group, Faculty of Economic Sciences, University of Warsaw)
    Abstract: The main aim of this paper was to formulate and analyze machine learning methods tailored to the specifics of strategy parameter optimization. The most important problems are the sensitivity of strategy performance to small parameter changes and numerous local extrema distributed irregularly over the solution space. The methods were designed to significantly shorten computation time without a substantial loss of strategy quality. The efficiency of the methods was compared for three different pairs of assets in the case of a moving averages crossover system. The methods operated on the in-sample data, containing 20 years of daily prices between 1998 and 2017. The problem was presented for three sets of two-asset portfolios. In the first case, a strategy was trading on the SPX and DAX index futures, in the second on the AAPL and MSFT stocks, and finally, in the third case, on the HGF and CBF commodities futures. The major hypothesis verified in this thesis is that machine learning methods select strategies with an evaluation criterion near the highest one, but in significantly lower execution time than the Exhaustive Search.
    Keywords: machine learning, algorithm, trading, investment, automatization, strategy, optimization, differential evolutionary method, cross-validation, overfitting
    JEL: C4 C45 C61 C15 G14 G17
    Date: 2018
  4. By: Tilman Graff
    Abstract: Are roads in Africa connecting the right places to promote beneficial trade? I assess the efficiency of transport networks for every country in Africa. Using rich data from satellites and online routing services, I simulate optimal trade flows over a comprehensive grid of more than 70,000 links covering the entire continent. I employ a recently established framework from the optimal transport in economics literature to maximise over the space of networks and find the optimal road system for every African state. Where would the social planner ideally build new roads and which roads are superfluous in promoting trade? My simulations predict that the entire continent would gain more than 1.1% of total welfare from better organising its national road systems. Comparing current and optimal networks, I then construct a novel dataset of local network inefficiency for more than 10,000 African grid cells. I analyse roots of the substantial imbalances present in this dataset. I find that colonial infrastructure projects from more than a century ago still persist in significantly skewing trade networks towards a sub-optimal equilibrium. Areas close to former colonial railroads have about 1.7% too much welfare given their position in the network. I also find evidence for regional favouritism, as the birthplaces of African leaders are overequipped with unnecessary roads. Lastly, I uncover a descriptive relationship whereby large transport infrastructure projects from The World Bank are not allocated to regions most in need of additional roads.
    Date: 2018
  5. By: Steel, Mark F. J.
    Abstract: The method of model averaging has become an important tool to deal with model uncertainty, for example in situations where a large number of different theories exist, as is common in economics. Model averaging is a natural and formal response to model uncertainty in a Bayesian framework, and most of the paper deals with Bayesian model averaging. The important role of the prior assumptions in these Bayesian procedures is highlighted. In addition, frequentist model averaging methods are discussed. Numerical methods to implement these methods are explained, and I point the reader to some freely available computational resources. The main focus is on uncertainty regarding the choice of covariates in normal linear regression models, but the paper also covers other, more challenging, settings, with particular emphasis on sampling models commonly used in economics. Applications of model averaging in economics are reviewed and discussed in a wide range of areas, including growth economics, production modelling, finance and forecasting macroeconomic quantities.
    Keywords: Bayesian methods; Model uncertainty; Normal linear model; Prior specification; Robustness
    JEL: C11 C15 C20 C52 O47
    Date: 2017–09–19
  6. By: Elisabetta Raguseo (Polito - Politecnico di Torino [Torino]); Claudio Vitari (MTS - Management Technologique et Strategique - Grenoble École de Management (GEM))
    Date: 2017–11–06
  7. By: Adam Fedyniuk (Nicolaus Copernicus University)
    Abstract: There are many definitions, approaches and models of technological singularity. In most cases it can be summarized as “changes in the mode of human life, which gives appearance of approaching some essential singularity in the history of human race, beyond which human affairs, as we know them, could not continue”. When considering the possibility of technological singularity in the form of the emergence of superintelligence, we also find some variety in its facets, such as self-improving technology, accelerating change or, simply put, intelligence explosion. These forms vary in the antecedents that define the initial state of affairs that would become the foundation of the arrival of singularity. The debate concerning this hypothetical phenomenon can be polarizing, with no consensus on the horizon. Even when we take cognitive science as the basis for formulating the possible paths by which technological progress could result in a singularity, there can be stern critique. Still, with a balance between enthusiasm and critique, rooted in a constructive approach to this idea, we can make a viable attempt at better understanding and predicting what the future may hold for the human race. Comparably, with the advent of connectomics and the advancement of studies on large-scale networks, we can make an even more detailed attempt at explaining and modelling the possible emergence of artificial general intelligence, especially because of how we can define and model the emergent properties that such biological organisations possess. The methods with which we can view, analyze and discover properties of hierarchical structures can lead the way to a more detailed view of cognition. There is also the possibility of emulating it on a more robust platform (a self-improving array of integrated circuits, adiabatic processors and similar). We have the possibility to design AI that will spark a paradigm shift for future research and our understanding of mind.
    Keywords: technological singularity, connectomics, network theory, complexity, emergence
    JEL: O31 O33 D85
    Date: 2018–10
  8. By: Di Fang; Michael R. Thomsen; Rodolfo M. Nayga, Jr.; Aaron M. Novotny
    Abstract: We examine the effect of the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) on the quality of household food purchases using the National Household Food Acquisition and Purchase Survey (FoodAPS) and propensity score matching. A healthy purchasing index (HPI) is used to measure nutritional quality of household food purchases. WIC foods explain the improvement in quality of food purchases, not self-selection of more nutrition-conscious households into the program. The improvement in purchase quality was driven entirely by WIC participating households who redeemed WIC foods during the interview week. There was no significant difference between WIC-participants who did not redeem WIC foods and eligible non-participants. In this sample, there is no evidence that lack of access to clinics has adverse effects on participation nor is there evidence that HPI depends on supermarket access. A supervised machine learning process supports our main conclusion on the importance of WIC foods.
    JEL: C21 D1 I1 I3 I38
    Date: 2018–11

This nep-big issue is ©2018 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.