nep-big New Economics Papers
on Big Data
Issue of 2018‒04‒30
nine papers chosen by
Tom Coupé
University of Canterbury

  1. Economic predictions with big data: the illusion of sparsity By Giannone, Domenico; Lenza, Michele; Primiceri, Giorgio E.
  2. Using Massive Online Choice Experiments to Measure Changes in Well-being By Erik Brynjolfsson; Felix Eggers; Avinash Gannamaneni
  3. Automatic image analysis and real estate By David Koch; Matthias Zeppelzauer; Miroslav Despotovic; Mario Döller
  4. Forest Degradation and Economic Growth in Nepal, 2003–2010 By Jean-Marie Baland; François Libois; Dilip Mookherjee
  5. R2 bounds for predictive models: what univariate properties tell us about multivariate predictability By Stephen Wright; James Mitchell; Donald Robertson
  6. “A regional perspective on the accuracy of machine learning forecasts of tourism demand based on data characteristics” By Oscar Claveria; Enric Monte; Salvador Torra
  7. Opportunities of external data within companies in promoting new business and sustainability on the real estate and construction sector By Antti Säynäjoki
  8. Managerial implications in real time information system applied for a paper mill By Amaury Gayet; Sylvain Rubat Du Mérac
  9. Big Data im Controlling: Chancen und Risiken By Tröbs, Marcel; Mengen, Andreas

  1. By: Giannone, Domenico (Federal Reserve Bank of New York); Lenza, Michele (European Central Bank and ECARES); Primiceri, Giorgio E. (Northwestern University, CEPR, and NBER)
    Abstract: We compare sparse and dense representations of predictive models in macroeconomics, microeconomics, and finance. To deal with a large number of possible predictors, we specify a prior that allows for both variable selection and shrinkage. The posterior distribution does not typically concentrate on a single sparse or dense model, but on a wide set of models. A clearer pattern of sparsity can only emerge when models of very low dimension are strongly favored a priori.
    Keywords: model selection; shrinkage; high dimensional data
    JEL: C11 C53
    Date: 2018–04–01
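The hierarchical prior described in the abstract (variable selection combined with shrinkage) can be sketched in a few lines. This is a toy illustration of the prior's structure only, not the authors' estimator; the parameter values are hypothetical.

```python
import random

def draw_coefficients(k, a=1.0, b=1.0, gamma=0.5, rng=None):
    """Draw one set of k regression coefficients from a
    spike-and-slab-style prior: inclusion probability q ~ Beta(a, b),
    each coefficient included with probability q (variable selection)
    and, if included, drawn from N(0, gamma^2) (shrinkage)."""
    rng = rng or random.Random()
    q = rng.betavariate(a, b)  # prior probability that a predictor matters
    betas = []
    for _ in range(k):
        if rng.random() < q:                     # selection step
            betas.append(rng.gauss(0.0, gamma))  # shrinkage step
        else:
            betas.append(0.0)                    # excluded predictor
    return betas

draws = [draw_coefficients(30, rng=random.Random(i)) for i in range(1000)]
# Across prior draws the number of non-zero coefficients varies widely,
# i.e. the prior does not commit to a single sparse or dense pattern.
sizes = [sum(1 for b in d if b != 0.0) for d in draws]
```

Because q itself is random, the implied model size is spread over the whole range rather than concentrated, mirroring the "illusion of sparsity" point in the abstract.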
  2. By: Erik Brynjolfsson; Felix Eggers; Avinash Gannamaneni
    Abstract: GDP and derived metrics (e.g., productivity) have been central to understanding economic progress and well-being. In principle, the change in consumer surplus (compensating expenditure) provides a superior, and more direct, measure of the change in well-being, especially for digital goods, but in practice, it has been difficult to measure. We explore the potential of massive online choice experiments to measure consumers’ willingness to accept compensation for losing access to various digital goods and thereby estimate the consumer surplus generated from these goods. We test the robustness of the approach and benchmark it against established methods, including incentive compatible choice experiments that require participants to give up Facebook for a certain period in exchange for compensation. The proposed choice experiments show convergent validity and are massively scalable. Our results indicate that digital goods have created large gains in well-being that are missed by conventional measures of GDP and productivity. By periodically querying a large, representative sample of goods and services, including those which are not priced in existing markets, changes in consumer surplus and other new measures of well-being derived from these online choice experiments have the potential for providing cost-effective supplements to existing national income and product accounts.
    JEL: E01 O0 O4
    Date: 2018–04
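A minimal sketch of how a willingness-to-accept figure can be read off a price-list choice experiment; the interpolation scheme and the data are purely illustrative, not the paper's estimator.

```python
def median_wta(offers, accept_rates):
    """Interpolate the compensation level at which half of respondents
    would accept giving up a digital good: a simple median
    willingness-to-accept estimate from binary accept/reject choices
    at a grid of offered payments (illustrative only)."""
    points = list(zip(offers, accept_rates))
    for (p0, r0), (p1, r1) in zip(points, points[1:]):
        if r0 < 0.5 <= r1:
            # linear interpolation between the bracketing offers
            return p0 + (0.5 - r0) * (p1 - p0) / (r1 - r0)
    return None

# hypothetical data: share accepting to give up a service for a month
offers = [1, 5, 10, 20, 50]
rates  = [0.05, 0.20, 0.40, 0.65, 0.95]
wta = median_wta(offers, rates)   # bracketed between 10 and 20
```

Summing such estimates over users gives a (toy) consumer-surplus figure of the kind the abstract proposes as a supplement to GDP.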
  3. By: David Koch; Matthias Zeppelzauer; Miroslav Despotovic; Mario Döller
    Abstract: The year of construction (age) of a property, as well as its period of construction, has an essential influence on the structure and the value of a building. Current automatic classification models of properties apply hedonic approaches that are mostly based on location (address). An additional automatic classification based on the age and/or period of construction in real estate valuations is still missing. Driven by this observation, our aim is to undertake fundamental research in the interdisciplinary fields of image analysis and real estate valuation in order to develop novel automatic visual analysis methods for estimating the age or period of construction of buildings. We employ photographs showing the facades of family houses to predict the period of construction and the coarse age, as well as the region the building resides in. Image analysis has a long research tradition and is today applied in many different domains. In the real estate domain, however, the major focus of image analysis lies in satellite image analysis for the classification of land cover; detailed building information cannot be extracted from satellite images. In contrast to existing approaches, we employ unconstrained photographs of buildings (e.g. taken by brokers, owners and real estate experts) as an input to visually extract information about the building. For this purpose our business partner provides a large database of real estate valuations. These valuations contain detailed object property descriptions such as year of construction, condition, amenities, address, value, etc., as well as several images per object in different views. In our presentation we first give a literature overview of existing papers that combine real estate (building) research and image analysis. We then present first results from our research, obtained by applying the method of Lee, Maisonneuve, Crandall, Efros, & Sivic (2015); these first results are satisfactory. Source: Lee, S., Maisonneuve, N., Crandall, D., Efros, A. A., & Sivic, J. (2015, April 24). Linking Past to Present: Discovering Style in Two Centuries of Architecture. IEEE International Conference on Computational Photography.
    Keywords: literature overview; facade segmentation; image analysis; year of construction
    JEL: R3
    Date: 2017–07–01
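To illustrate only the classification step (not the visual feature extraction the authors develop), here is a nearest-neighbour sketch over hypothetical facade feature vectors with hypothetical construction-period labels.

```python
import math

# Toy stand-ins for visual facade features (e.g. edge/texture statistics);
# the labels are hypothetical construction periods. This sketches a generic
# classification step, not the method of Lee et al. (2015) used in the paper.
train = [
    ([0.9, 0.1, 0.2], "pre-1945"),
    ([0.5, 0.6, 0.4], "1945-1980"),
    ([0.2, 0.9, 0.8], "post-1980"),
]

def predict_period(features):
    """1-nearest-neighbour over labelled facade feature vectors."""
    return min(train, key=lambda t: math.dist(t[0], features))[1]

period = predict_period([0.85, 0.15, 0.25])
```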
  4. By: Jean-Marie Baland (CRED - Centre de Recherche en Economie du Developpement - Facultés Universitaires Notre Dame de la Paix (FUNDP) - Namur, CEPR - Center for Economic Policy Research - CEPR, BREAD); François Libois (PJSE - Paris Jourdan Sciences Economiques - UP1 - Université Panthéon-Sorbonne - ENS Paris - École normale supérieure - Paris - INRA - Institut National de la Recherche Agronomique - EHESS - École des hautes études en sciences sociales - ENPC - École des Ponts ParisTech - CNRS - Centre National de la Recherche Scientifique, PSE - Paris School of Economics, CRED - Centre de Recherche en Economie du Developpement - Facultés Universitaires Notre Dame de la Paix (FUNDP) - Namur); Dilip Mookherjee (BU - Boston University [Boston], BREAD)
    Abstract: We investigate the relation between economic growth, household firewood collection and forest conditions in Nepal between 2003 and 2010. Co-movements in these are examined at the household and village levels, combining satellite imagery and household (Nepal Living Standard Measurement Survey) data. Projections of the impact of economic growth based on Engel curves turn out to be highly inaccurate: forest conditions remained stable despite considerable growth in household consumption and income. Firewood collections at the village level remained stable, as effects of demographic growth were offset by substantial reductions in per-household collections. Households substituted firewood by alternative energy sources, particularly when livestock and farm based occupations declined in importance. Engel curve specifications which include household productive assets (a proxy for occupational patterns) provide more accurate predictions. Hence structural changes accompanying economic growth play an important role in offsetting adverse environmental consequences of growth.
    Keywords: Deforestation,Growth,Environmental Kuznets Curve,Nepal
    Date: 2018–04
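The kind of naive Engel-curve projection that the abstract finds to be highly inaccurate can be sketched as a bivariate regression of firewood collection on log consumption; the specification and the household data below are hypothetical.

```python
import math

def ols(x, y):
    """Simple OLS intercept/slope for a bivariate Engel curve:
    firewood_i = a + b * log(consumption_i) + e_i (illustrative)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

# hypothetical household data: consumption levels vs firewood collected
logc = [math.log(c) for c in (50, 100, 200, 400)]
fw   = [120, 110, 95, 80]
a, b = ols(logc, fw)
projected = a + b * math.log(800)  # out-of-sample projection under growth
```

Projecting along such a curve predicts falling firewood use as consumption grows; the paper's point is that versions omitting household productive assets get the magnitudes badly wrong.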
  5. By: Stephen Wright (Birkbeck, University of London); James Mitchell (Warwick Business School); Donald Robertson (University of Cambridge)
    Abstract: A longstanding puzzle in macroeconomic forecasting has been that a wide variety of multivariate models have struggled to out-predict univariate models consistently. We seek an explanation for this puzzle in terms of population properties. We derive bounds for the predictive R2 of the true, but unknown, multivariate model from univariate ARMA parameters alone. These bounds can be quite tight, implying little forecasting gain even if we knew the true multivariate model. We illustrate using CPI inflation data.
    JEL: C22 C32 C53 E37
    Date: 2018–04
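A small worked example of the nesting logic behind such bounds, under the assumption of an AR(1) process: the univariate predictive R2 is a lower bound on the multivariate R2, because the univariate model is nested in any multivariate model containing lagged y. The paper's contribution is the non-trivial upper bounds, which are not reproduced here.

```python
def ar1_predictive_r2(phi, sigma2=1.0):
    """One-step predictive R^2 of an AR(1): y_t = phi*y_{t-1} + e_t.
    var(y) = sigma2 / (1 - phi**2); the one-step forecast-error
    variance is sigma2, so R^2 = 1 - sigma2/var(y) = phi**2.
    This lower-bounds the predictive R^2 of any multivariate model
    that nests the univariate one (illustrative computation only)."""
    var_y = sigma2 / (1.0 - phi ** 2)
    return 1.0 - sigma2 / var_y

r2 = ar1_predictive_r2(0.5)
```

For a weakly persistent series (phi = 0.5) even the true multivariate model cannot have a one-step R2 below 0.25; the paper's upper bounds work in the opposite direction and are the source of the "little forecasting gain" result.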
  6. By: Oscar Claveria (AQR-IREA, University of Barcelona); Enric Monte (Polytechnic University of Catalunya (UPC)); Salvador Torra (Riskcenter-IREA, University of Barcelona)
    Abstract: In this work we assess the role of data characteristics in the accuracy of machine learning (ML) tourism forecasts from a spatial perspective. First, we apply a seasonal-trend decomposition procedure based on non-parametric regression to isolate the different components of the time series of international tourism demand to all Spanish regions. This approach allows us to compute a set of measures to describe the features of the data. Second, we analyse the performance of several ML models in a recursive multiple-step-ahead forecasting experiment. In a third step, we rank all seventeen regions according to their characteristics and the obtained forecasting performance, and use the rankings as the input for a multivariate analysis to evaluate the interactions between time series features and the accuracy of the predictions. By means of dimensionality reduction techniques we summarise all the information into two components and project all Spanish regions into perceptual maps. We find that entropy and dispersion show a negative relation with accuracy, while the effect of other data characteristics on forecast accuracy is heavily dependent on the forecast horizon.
    Keywords: STL decomposition; non-parametric regression; time series features; forecast accuracy; machine learning; tourism demand; regional analysis
    JEL: C45 C51 C53 C63 E27 L83
    Date: 2018–04
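The recursive multiple-step-ahead scheme mentioned in the abstract can be sketched with a toy AR(1) standing in for the machine-learning models compared in the paper; the data and model are purely illustrative.

```python
def fit_ar1(y):
    """Least-squares AR(1) coefficient:
    phi = sum(y_t * y_{t-1}) / sum(y_{t-1}^2)."""
    num = sum(a * b for a, b in zip(y[1:], y[:-1]))
    den = sum(b * b for b in y[:-1])
    return num / den

def recursive_forecast(y, h):
    """Recursive multiple-step-ahead scheme: each one-step forecast is
    fed back in as an input for the next step (here with a toy AR(1)
    in place of the ML models used in the paper)."""
    phi = fit_ar1(y)
    path, last = [], y[-1]
    for _ in range(h):
        last = phi * last  # one-step forecast, reused recursively
        path.append(last)
    return path

y = [1.0, 0.5, 0.25, 0.125, 0.0625]  # exact AR(1) path with phi = 0.5
fc = recursive_forecast(y, 3)
```

The same recursion applies whatever the one-step model is, which is why forecast errors compound with the horizon and why the abstract finds horizon-dependent effects of the data characteristics.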
  7. By: Antti Säynäjoki
    Abstract: Digitalization is a current trend across all sectors, expected to provide new business opportunities, resource efficiency and climate change mitigation. One of the key new resources brought by digitalization is improved knowledge of clients or users through new data. This knowledge is usually created with business intelligence, i.e. data gathering and analysis. Business intelligence has been the most important success factor for companies in several industries, in particular information technology. Data is gathered, distributed and analysed by platforms, which also promote external innovation. In the most successful cases this has led to thriving ecosystems where owners, users and complementors innovate and develop content for the platform in cooperation with each other. However, similar platforms and ecosystems have not yet emerged on a large scale in the real estate and construction (REC) sector. In the REC sector, smart buildings provide an excellent opportunity to gather data on conditions, indoor environment and user behaviour. This data can be used for developing current products and services as well as new ones. However, currently only a small share of companies in the REC sector actually have access to the data provided by smart buildings, and the data is seldom distributed outside company borders. Although the commercialization of external data is currently a hot topic in the academic literature, companies in the REC sector have not yet been utilizing it. External distribution of the data would also enable companies that do not have access to the data to develop their products and services accordingly. Large-scale distribution and utilization of data would promote cost and resource efficiency in current products and services as well as new business opportunities. In addition to building-specific data, big data generated by a large number of smart buildings would rapidly provide valuable information, for example, to the building design, construction material manufacturing, energy production and insurance sectors, to name a few. The transformation to external data commercialization and distribution would require some stakeholder reorganization in the sector. The roles in ecosystems benefiting from data distribution have already been discussed in the literature. Learning from the roles and platforms in other industries, the REC sector should be able to better benefit from the new opportunities of digitalization.
    Keywords: Big data; business intelligence; Digitalization; platform; smart building
    JEL: R3
    Date: 2017–07–01
  8. By: Amaury Gayet (CEROS - Centre d'Etudes et de Recherches sur les Organisations et la Stratégie - UPN - Université Paris Nanterre); Sylvain Rubat Du Mérac
    Abstract: Current management control methods calculate costs ex post, so the measurement of the variances that trigger corrective actions comes late. The faster processing made possible by real-time sensors and big data allows variances to be measured in real time and corrective action to be taken immediately. Our case study presents the conceptual data model applied: it is designed to manage these informational flows through a suitable management tool, which takes the form of a dynamic cross-tabulation (pivot table). We also develop the managerial implications linked to this tool, and our results explain the contribution of the tool to the decision-making process. In conclusion, we explain why real time appears to be a major accounting innovation.
    Date: 2016–06–15
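A minimal sketch of the real-time variance measurement idea: compare an actual unit cost streamed from sensors against the standard cost and flag deviations immediately rather than ex post. The function name, tolerance and figures are hypothetical, not the authors' tool.

```python
def variance_alert(standard_cost, actual_cost, tolerance=0.05):
    """Real-time cost-variance check: compute the relative gap between
    the actual unit cost (from sensor data) and the standard cost, and
    flag it when the gap exceeds a tolerance so corrective action can
    be taken immediately (illustrative sketch only)."""
    gap = (actual_cost - standard_cost) / standard_cost
    return ("ALERT" if abs(gap) > tolerance else "OK"), gap

status, gap = variance_alert(standard_cost=10.0, actual_cost=10.8)
```

In a streaming setting this check would run on every sensor reading, which is the shift from ex-post cost calculation to real-time control that the abstract describes.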
  9. By: Tröbs, Marcel; Mengen, Andreas
    Abstract: Companies already face the challenge of fishing the information relevant to them out of a vast sea of data and of tapping new data sources. The controller, as the navigator of corporate management, is therefore more in demand than ever to set the right course for management on these topics. At the same time, big data offers controllers new possibilities and tools for performing their tasks better (cf. Gadatsch, 2013, pp. 23, 28). But do controllers possess the competence required to master these new tools? Or is there even a danger that the precision of modern analysis tools will soon make the controlling function superfluous? To answer these questions, this paper sets out to determine what effects big data will have on the work of the controller. In particular, the identification and analysis of the opportunities and risks of the associated developments serves to formulate suitable recommendations for action for controllers on the topic of big data.
    Date: 2018

This nep-big issue is ©2018 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.