nep-big New Economics Papers
on Big Data
Issue of 2018‒10‒15
fifteen papers chosen by
Tom Coupé
University of Canterbury

  1. Digitalization and development cooperation: An assessment of the debate and its implications for policy By Heimerl, Veronika; Raza, Werner
  2. Deep Factor Model By Kei Nakagawa; Takumi Uchida; Tomohisa Aoshima
  3. The future of valuation: An integrated academic and practical view on the impact of digitalisation on real estate valuation By Florian Hackelberg; Matthias Kirsten
  4. Semi-supervised Text Regression with Conditional Generative Adversarial Networks By Tao Li; Xudong Liu; Shihan Su
  5. Failing to engage? Big data, smart cities and the built environment sector: an analysis of international case studies By Tim Dixon; Martin Sexton; Jorn Van De Wetering
  6. Racing With or Against the Machine? Evidence from Europe By Terry Gregory; Anna Salomons; Ulrich Zierahn
  7. An Artificial Neural Network Approach to Acreage-Share Modeling By Ramsey, Steven M.; Bergtold, Jason S.; Heier Stamm, Jessica
  8. Term Structure Models During the Global Financial Crisis: A Parsimonious Text Mining Approach By Kiyohiko G. Nishimura; Seisho Sato; Akihiko Takahashi
  9. Complex Interactions and Strategic Pricing of Brand-Level Nut Products in the United States: A Graph Theoretic Approach By Cheng, Guo; Dharmasena, Senarath
  10. On normalization and algorithm selection for unsupervised outlier detection By Sevvandi Kandanaarachchi; Mario A Munoz; Rob J Hyndman; Kate Smith-Miles
  11. Agro-Climatic Data by County (ACDC): Methods and Data Generating Processes By Yun, Seong Do; Gramig, Benjamin M.
  12. Innovation growth clusters: Lessons from the industrial revolution By DUDLEY, Leonard; RAUH, Christopher
  13. The Equilibrium Effects of Information Deletion: Evidence from Consumer Credit Markets By Andres Liberman; Christopher Neilson; Luis Opazo; Seth Zimmerman
  14. Food Insecurity, Poverty, Unemployment and Obesity in the United States: Effect of (Not) Considering Back-Door Paths in Policy Modeling By Dharmasena, Senarath; Bessler, David A.
  15. Predictable biases in macroeconomic forecasts and their impact across asset classes By Félix, Luiz; Kräussl, Roman; Stork, Philip

  1. By: Heimerl, Veronika; Raza, Werner
    Abstract: Digitalization technologies, such as automation, robotization, artificial intelligence and Big Data are increasingly shaping economic processes. In public discourse, extreme outlooks are widespread. Either digital technologies will provide the solution to most contemporary problems, or dystopian scenarios claim that digital technologies make human labor redundant for production processes, resulting in soaring unemployment rates and widespread social disintegration. Against this backdrop, research on the impacts of digitalization on developing countries is still at an early stage. This briefing paper provides a critical summary of the literature on the challenges and potentials of digitalization for developing economies. A sober account of the historical evidence suggests that both euphoria and dystopian views are misplaced. The major policy challenge for development cooperation will consist in supporting LDC governments in their efforts to manage the effects of the economic and social transition process brought about by digitalization.
    Keywords: digitalization,automation,robotization,technological revolution,development cooperation,developing countries
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:zbw:oefseb:19&r=big
  2. By: Kei Nakagawa; Takumi Uchida; Tomohisa Aoshima
    Abstract: We propose to represent a return model and risk model in a unified manner with deep learning, which is a representative model that can express a nonlinear relationship. Although deep learning performs quite well, it has significant disadvantages such as a lack of transparency and limitations to the interpretability of the prediction. This is prone to practical problems in terms of accountability. Thus, we construct a multifactor model by using interpretable deep learning. We implement deep learning as a return model to predict stock returns with various factors. Then, we present the application of layer-wise relevance propagation (LRP) to decompose attributes of the predicted return as a risk model. By applying LRP to an individual stock or a portfolio basis, we can determine which factor contributes to prediction. We call this model a deep factor model. We then perform an empirical analysis on the Japanese stock market and show that our deep factor model has better predictive capability than the traditional linear model or other machine learning methods. In addition, we illustrate which factor contributes to prediction.
    Date: 2018–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1810.01278&r=big
  3. By: Florian Hackelberg; Matthias Kirsten
    Abstract: Background: In recent years, digitalisation has been introduced in the real estate industry and enjoys a growing popularity in the field of real estate valuation in Germany. This trend has already changed the valuation practices of both large real estate service companies and small valuation companies. In addition, new players enter the real estate market with numerous valuation start-ups and online real estate platforms. Given the massive transition that digitalisation has already brought to other industry sectors, it is clear that the shift in real estate valuation has only just begun. This leads to the question of what the “future of the valuation” could look like and what is already available on the market. Area of research and questions to be addressed: When it comes to digitalisation and valuation, the following questions are currently raised among the key issues by real estate academics and practitioners alike and should be addressed during the meeting: In what way might digitalisation change the valuation process in the future? How will digitalisation affect the critical value determination of the human appraiser? Will digital processes, artificial intelligence and big data completely replace the experience and market knowledge of the appraisers? What does the field of work of the surveyor of the future look like (e.g. one part technology, one part science and one part human)? What are the limitations of automated real estate valuation, and which areas may remain as the “classical” valuation? Prof. Dr. Florian Hackelberg MRICS, Professor for Real Estate Valuation at the HAWK University of Applied Sciences and Arts in Germany, will introduce the outlined topic from an academic perspective, while M.Sc. Dipl.-Ing. (FH) Matthias Kirsten MRICS, valuation expert of Value AG - the valuation group, will share his view on the latest trends and developments in the market.
    Keywords: Appraisal; Artificial Intelligence; Big data; Digitalisation; Valuation
    JEL: R3
    Date: 2018–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2018_56&r=big
  4. By: Tao Li; Xudong Liu; Shihan Su
    Abstract: The enormous volume of online textual information provides intriguing opportunities for understanding social and economic semantics. In this paper, we propose a novel text regression model based on a conditional generative adversarial network (GAN), with an attempt to associate textual data and social outcomes in a semi-supervised manner. Besides its promising predictive potential, the model has two advantages: (i) it works with unbalanced datasets of limited labelled data, which align with real-world scenarios; and (ii) predictions are obtained by an end-to-end framework, without explicitly selecting high-level representations. Finally, we point out related datasets for experiments and future research directions.
    Date: 2018–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1810.01165&r=big
  5. By: Tim Dixon; Martin Sexton; Jorn Van De Wetering
    Abstract: We live in an increasingly urbanised world. Currently more than 50 per cent of the world’s population lives in cities, and this is set to grow to 70% by 2050. Recently we have seen an increasing focus on information and communications technology (ICT) to argue the case for ‘smart cities’. This places a strong emphasis on an ICT-led and a ‘data-driven’ future, which also positions the development of new products, processes, organisational methods and markets at the heart of the continued ambition for urban economic growth. The interconnected agendas of smart cities and big data and open data, on the face of it, provide bold and exciting opportunities for the built environment professions. But, what in reality will those opportunities be, and what are the challenges? This research, conducted from 2015-2016, seeks to address those questions and focuses on the city level. The research focuses on a technocratic approach to use of data in smart cities, and how we can make this accessible to built environment stakeholders. We explore the extent to which the built environment sector is engaging with the smart city at ‘programme’ scale (i.e. city-wide) and ‘project’ scale (i.e. urban data platform and other applications). To do this we compare four smart city programmes to pose three primary research questions: How have smart city programmes and projects evolved in these cities? What has shaped this evolution? What is the nature and extent of the built environment sector’s role in such programmes and projects? The research consisted of interviews in four case studies in Bristol, Milton Keynes, Amsterdam and Taipei and a UK expert workshop.
    Keywords: Big data; New Technology; Open Data; Smart City; Urban Studies
    JEL: R3
    Date: 2018–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2018_332&r=big
  6. By: Terry Gregory; Anna Salomons; Ulrich Zierahn
    Abstract: A fast-growing literature shows that digital technologies are displacing labor from routine tasks, raising concerns that labor is racing against the machine. We develop a task-based framework to estimate the aggregate labor demand and employment effects of routine-replacing technological change (RRTC), along with the underlying mechanisms. We show that while RRTC has indeed had strong displacement effects in the European Union between 1999 and 2010, it has simultaneously created new jobs through increased product demand, outweighing displacement effects and resulting in net employment growth. However, we also show that this finding depends on the distribution of gains from technological progress.
    Keywords: labor demand, employment, routine-replacing technological change, tasks, local demand spillovers
    JEL: E24 J23 J24 O33
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:ces:ceswps:_7247&r=big
  7. By: Ramsey, Steven M.; Bergtold, Jason S.; Heier Stamm, Jessica
    Keywords: Research Methods/Econometrics/Stats, Production Economics, Food and Agricultural Policy Analysis
    Date: 2018–06–20
    URL: http://d.repec.org/n?u=RePEc:ags:aaea18:274395&r=big
  8. By: Kiyohiko G. Nishimura (National Graduate Institute for Policy Studies (GRIPS) and CARF, University of Tokyo); Seisho Sato (Graduate School of Economics and CARF, University of Tokyo); Akihiko Takahashi (Graduate School of Economics and CARF, University of Tokyo)
    Abstract: This work develops and estimates a three-factor term structure model with explicit sentiment factors in a period including the global financial crisis, where market confidence was said to erode considerably. It utilizes a large text dataset of real-time, relatively high-frequency market news and takes account of the difficulties in incorporating market sentiment into the models. To the best of our knowledge, this is the first attempt to use this category of data in term-structure models. Although market sentiment or market confidence is often regarded as an important driver of asset markets, it is not explicitly incorporated in traditional empirical factor models for daily yield curve data because it is unobservable. To overcome this problem, we use a text mining approach to generate observable variables which are driven by otherwise unobservable sentiment factors. Then, applying the Monte Carlo filter as a filtering method in a state space Bayesian filtering approach, we estimate the dynamic stochastic structure of these latent factors from observable variables driven by these latent variables. As a result, the three-factor model with text mining is able to distinguish (1) a spread-steepening factor which is driven by pessimists’ view and explains the spreads related to ultra-long term yields from (2) a spread-flattening factor which is driven by optimists’ view and influences the long and medium term spreads. Also, the three-factor model with text mining fits the observed yields better than the model without text mining. Moreover, we collect market participants’ views about specific spreads in the term structure and find that the movements of the identified sentiment factors are consistent with the market participants’ views, and thus market sentiment.
    Date: 2018–10
    URL: http://d.repec.org/n?u=RePEc:cfi:fseres:cf446&r=big
  9. By: Cheng, Guo; Dharmasena, Senarath
    Abstract: Nuts such as almonds, pecans, walnuts, and pistachios are available in the U.S. market in different forms and brands. There are well-known national brands as well as not-so well-known private label and store brands. Nut producing firms compete for market share and strategically price, brand, advertise and position products in the market. Conventional brand-level analysis of such markets is achieved through calculation of market power and price cost margins assuming the presence of pure strategy Bertrand-Nash Equilibrium in prices. This is supported by a set of prior assumptions with regards to the structure of the market and oftentimes these are too restrictive, because pricing decisions are made in a complex multivariate situation with numerous interactions between variables that determine the prices and prices themselves. In this study, using 2015 Nielsen scanner data for nut products, complex causal relationships among brand level prices are estimated using cutting-edge machine learning algorithms. Also within this method, the concept of Markov Blankets is used to identify specific brands that are immediately important for a given brand. Several national brands were identified as a direct cause of the price of store brands. Even though store brands were associated with the highest market share, they had no influence on any other brands’ pricing decision and strategy.
    Keywords: Agribusiness, Industrial Organization, Marketing
    Date: 2018–01–16
    URL: http://d.repec.org/n?u=RePEc:ags:saea18:266567&r=big
  10. By: Sevvandi Kandanaarachchi; Mario A Munoz; Rob J Hyndman; Kate Smith-Miles
    Abstract: This paper demonstrates that the performance of various outlier detection methods depends sensitively on both the data normalization schemes employed, as well as characteristics of the datasets. Recasting the challenge of understanding these dependencies as an algorithm selection problem, we perform the first instance space analysis of outlier detection methods. Such analysis enables the strengths and weaknesses of unsupervised outlier detection methods to be visualized and insights gained into which method and normalization scheme should be selected to obtain the most likely best performance for a given dataset.
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:msh:ebswps:2018-16&r=big
  11. By: Yun, Seong Do; Gramig, Benjamin M.
    Abstract: Due to the recent popularity of raster imagery data (high resolution grid cell data), the demand for weather, soil/land and related data for research and applied decision support is increasing rapidly. Agro-Climatic Data by County (ACDC) is designed to provide the most widely-used variables extracted from the most popular high resolution gridded data sources to end users of agro-climatic variables who may not be equipped to process large geospatial datasets from multiple publicly available sources that are provided in different data formats and spatial scales. Annual county level crop yield data in USDA NASS for 1981-2015 are provided for corn, soybeans, upland cotton and winter wheat yields, and customizable growing degree days (GDDs) and cumulative precipitation for two groups of months (March-August and April-October) to capture different growing season periods for the crops from the PRISM weather data. Soil characteristic data in gSSURGO are also included for each county in the data set. All weather and soil data are processed using NLCD land cover/land use data to exclude data for land that is not being used for non-forestry agricultural uses. This paper explains the numerical and geocomputational methods and data generating processes employed in ACDC.
    Keywords: Production Economics, Research Methods/ Statistical Methods, Resource /Energy Economics and Policy
    Date: 2018–01–16
    URL: http://d.repec.org/n?u=RePEc:ags:saea18:266575&r=big
  12. By: DUDLEY, Leonard; RAUH, Christopher
    Abstract: Over three centuries ago, a new technology suddenly increased the amount and frequency of available information. Might such «Big Data» have disrupted the causal relationships linking economic growth and innovation? Previous research has affirmed that a society’s economic success during the Industrial Revolution depended on its institutions. Here we examine the hypothesis that by allowing people to cooperate more easily with one another, language standardization raised a society’s rate of innovation. As a result, the region could attract the resources needed to grow more rapidly. Empirical tests with 117 innovations and 251 Western cities suggest that the presence of a standardized tongue helps to explain the burst of innovation and growth observed between 1700 and 1850. Moreover, once one has accounted for language standardization, institutional quality has little further power to explain economic progress.
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:mtl:montde:2018-14&r=big
  13. By: Andres Liberman; Christopher Neilson; Luis Opazo; Seth Zimmerman
    Abstract: This paper exploits a large-scale natural experiment to study the equilibrium effects of information restrictions in credit markets. In 2012, Chilean credit bureaus were forced to stop reporting defaults for 2.8 million individuals (21% of the adult population). We show that the effects of information deletion on aggregate borrowing and total surplus are theoretically ambiguous and depend on the pre-deletion demand and cost curves for defaulters and non-defaulters. Using panel data on the universe of bank borrowers in Chile combined with the deleted registry information, we implement machine learning techniques to measure changes in lenders' cost predictions following deletion. Deletion reduces (raises) predicted costs the most for poorer defaulters (non-defaulters) with limited borrowing histories. Using a difference-in-differences design, we find that individuals exposed to increases in predicted costs reduce borrowing by 6.4%, while those exposed to decreases raise borrowing by 11.8% following the deletion, for a 3.5% aggregate drop in borrowing. Using the difference-in-difference estimates as inputs into the theoretical framework, we find evidence that deletion reduced aggregate welfare under a variety of assumptions about lenders' pricing strategies.
    JEL: D14 D82 G20
    Date: 2018–09
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:25097&r=big
  14. By: Dharmasena, Senarath; Bessler, David A.
    Abstract: The causes and consequences of food environment factors such as food insecurity, poverty, unemployment and obesity in the United States are complex. Once causality patterns with regards to these variables are identified, it is important to recognize front-door (Pearl, 2000) and back-door paths (Pearl, 2000) associated with these variables to make sensible and credible policy decisions. These policy interventions are known as performing do-Calculus (Pearl 2000, Spirtes et al., 2000) in causality literature. In this study we use the complex interactions of four food environment variables in the United States (food insecurity, poverty, unemployment and obesity) estimated using artificial intelligence and directed acyclic graphs by Dharmasena, Bessler and Capps (2016) and perform several policy interventions, recognizing front-door and back-door paths. Such policy simulations are vital for agencies not only to design appropriate policies for food assistance, poverty alleviation, combating food insecurity and obesity, but also to recognize effects of policy prior to the desired intervention. Preliminary analysis shows that there are two front-door paths from income to food insecurity, via poverty and via unemployment. Also, there is a front-door path from poverty to food insecurity, while there is an important back-door path from poverty to food insecurity via unemployment.
    Keywords: Agricultural and Food Policy, Food Consumption/Nutrition/Food Safety, Food Security and Poverty, Health Economics and Policy, Research Methods/ Statistical Methods
    Date: 2018–01–16
    URL: http://d.repec.org/n?u=RePEc:ags:saea18:266569&r=big
  15. By: Félix, Luiz; Kräussl, Roman; Stork, Philip
    Abstract: This paper investigates how biases in macroeconomic forecasts are associated with economic surprises and market responses across asset classes around US data announcements. We find that the skewness of the distribution of economic forecasts is a strong predictor of economic surprises, suggesting that forecasters behave strategically (rational bias) and possess private information. Our results also show that consensus forecasts of US macroeconomic releases embed anchoring. Under these conditions, both economic surprises and the returns of assets that are sensitive to macroeconomic conditions are predictable. Our findings indicate that local equities and bond markets are more predictable than foreign markets, currencies and commodities. Economic surprises are found to link to asset returns very distinctively through the stages of the economic cycle, whereas they strongly depend on economic releases being inflation- or growth-related. Yet, when forecasters fail to correctly forecast the direction of economic surprises, regret becomes a relevant cognitive bias to explain asset price responses. We find that the behavioral and rational biases encountered in US economic forecasting also exist in Continental Europe, the United Kingdom and Japan, albeit to a lesser extent.
    Keywords: anchoring,rational bias,economic surprises,predictability,stocks,bonds,currencies,commodities,machine learning
    JEL: G14 F47 E44
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:zbw:cfswop:596&r=big
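
As an illustration of the quantities named in the abstract of paper 15, the following minimal Python sketch computes the skewness of a cross-section of forecasts and the resulting economic surprise. It is not the authors' method: the forecast numbers are made up, the function names are hypothetical, and defining the surprise as the released figure minus the median forecast is an assumption (the abstract does not say how the consensus is measured).

```python
from statistics import mean, median, pstdev


def forecast_skewness(forecasts):
    """Population skewness: third central moment divided by sigma cubed."""
    n = len(forecasts)
    if n < 3:
        raise ValueError("need at least three forecasts")
    mu = mean(forecasts)
    sigma = pstdev(forecasts)
    if sigma == 0:
        return 0.0  # all forecasters agree, so there is no asymmetry
    return sum((f - mu) ** 3 for f in forecasts) / (n * sigma ** 3)


def economic_surprise(actual, forecasts):
    """Assumed definition: released value minus the median (consensus) forecast."""
    return actual - median(forecasts)


# Hypothetical analyst forecasts for one data release (illustrative only).
forecasts = [180, 185, 190, 190, 195, 200, 240]
skew = forecast_skewness(forecasts)
surprise = economic_surprise(210, forecasts)
print(f"skewness = {skew:.3f}, surprise = {surprise}")
```

In this made-up example one forecaster sits far above the rest, so the skewness is positive, and the hypothetical release of 210 lands above the median forecast of 190. The abstract's claim is that, across many releases, the sign and size of the skewness help predict the surprise; testing that would require a regression on real forecast data, which this sketch does not attempt.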

This nep-big issue is ©2018 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.