nep-big New Economics Papers
on Big Data
Issue of 2021‒03‒29
34 papers chosen by
Tom Coupé
University of Canterbury

  1. Using Machine Learning and Qualitative Interviews to Design a Five-Question Women's Agency Index By Jayachandran, Seema; Biradavolu, Monica; Cooper, Jan
  2. Demand for AI skills in jobs: Evidence from online job postings By Mariagrazia Squicciarini; Heike Nachtigall
  3. The power of text-based indicators in forecasting the Italian economic activity By Valentina Aprigliano; Simone Emiliozzi; Gabriele Guaitoli; Andrea Luciani; Juri Marcucci; Libero Monteforte
  4. Estimating the Long-Term Effects of Novel Treatments By Keith Battocchi; Eleanor Dillon; Maggie Hei; Greg Lewis; Miruna Oprescu; Vasilis Syrgkanis
  5. Deep Prediction Of Investor Interest: a Supervised Clustering Approach By Baptiste Barreau; Laurent Carlier; Damien Challet
  6. Learning from revisions: a tool for detecting potential errors in banks' balance sheet statistical reporting By Francesco Cusano; Giuseppe Marinelli; Stefano Piermattei
  7. Sentiment analysis of the Spanish Financial Stability Report By Ángel Iván Moreno Bernal; Carlos González Pedraz
  8. Online Learning with Radial Basis Function Networks By Gabriel Borrageiro; Nick Firoozye; Paolo Barucca
  9. The novel Artificial Neural Network assisted models: A review By Srivastav, Bhanu
  10. Big data environments and decision-making: The time trial stage of the 2017 Tour de France By Jordan Vazquez Llana; Cécile Godé; Jean-Fabrice Lebraty
  11. Uncovering the Hidden Effort Problem By Azi Ben-Rephael; Bruce I. Carlin; Zhi Da; Ryan D. Israelsen
  12. Sample Calibration of the Online CFM Survey By Marie-Hélène Felt; David Laferrière
  13. Big data in macroeconomic analysis By Ademmer, Martin; Beckmann, Joscha; Bode, Eckhardt; Boysen-Hogrefe, Jens; Funke, Manuel; Hauber, Philipp; Heidland, Tobias; Hinz, Julian; Jannsen, Nils; Kooths, Stefan; Söder, Mareike; Stamer, Vincent; Stolzenburg, Ulrich
  14. Data-Driven Incentive Alignment in Capitation Schemes By Mark Braverman; Sylvain Chassang
  15. Forecasting the Stability and Growth Pact compliance using Machine Learning By Kea Baret; Amélie Barbier-Gauchard; Theophilos Papadimitriou
  16. Measuring Inequality using Geospatial Data By Jaqueson K. Galimberti; Stefan Pichler; Regina Pleninger
  17. Man Versus Machine? Self-Reports Versus Algorithmic Measurement of Publications By Xuan Jiang; Wan-Ying Chang; Bruce A. Weinberg
  18. A new approach for evaluation of the economic impact of decentralized electrification projects By Jean-Claude Berthélemy; Mathilde Maurel
  19. Mitigating the impact of bad rainy seasons in poor agricultural regions to tackle deforestation By Antoine Leblois
  20. A Growth Model of the Data Economy By Maryam Farboodi; Laura Veldkamp
  21. Big data environments and decision making: The time trial stage of the 2017 Tour de France By Jordan Vazquez Llana; Cécile Godé; Jean-Fabrice Lebraty
  22. The Voice of Monetary Policy By Yuriy Gorodnichenko; Tho Pham; Oleksandr Talavera
  23. Gendered cities: Studying urban gender bias through street names By Oto-Peralías, Daniel; Gutiérrez Mora, Dolores
  24. Seeing Beyond the Trees: Using Machine Learning to Estimate the Impact of Minimum Wages on Labor Market Outcomes By Doruk Cengiz; Arindrajit Dube; Attila S. Lindner; David Zentler-Munro
  25. Evidence-Based Policy Learning By Jann Spiess; Vasilis Syrgkanis
  26. Liability for workplace incidents involving the use of artificial intelligence: contemporary challenges for the legislator By Andreeva, Andriyana; Yolova, Galina
  27. ICO Analysts By Andreas Barth; Valerie Laturnus; Sasan Mansouri; Alexander F. Wagner
  28. Price setting in Chile: Micro evidence from consumer on-line prices during the social outbreak and Covid-19 By J. Peña; E. Prades
  29. Learning about Farming: Innovation and Social Networks in a Resettled Community in Brazil By Margherita Comola; Carla Inguaggiato; Mariapia Mendola
  30. Comparing classic time series models and the LSTM recurrent neural network: An application to S&P 500 stocks By Javier Oliver Muncharaz
  31. Exploiting payments to track Italian economic activity: the experience at Banca d’Italia By Valentina Aprigliano; Guerino Ardizzi; Alessia Cassetta; Alessandro Cavallero; Simone Emiliozzi; Alessandro Gambini; Nazzareno Renzi; Roberta Zizza
  32. Price setting in Chile: Micro evidence from consumer on-line prices during the social outbreak and Covid-19. By Jennifer Peña; Elvira Prades
  33. Facial emotion expressions in human-robot interaction: A survey By Rawal, Niyati; Stock-Homburg, Ruth
  34. Letter to Advisory Committee on Data for Evidence Building By Paul Decker

  1. By: Jayachandran, Seema (Northwestern University); Biradavolu, Monica (QualAnalytics); Cooper, Jan (Harvard University)
    Abstract: We propose a new method to design a short survey measure of a complex concept such as women's agency. The approach combines mixed-methods data collection and machine learning. We select the best survey questions based on how strongly correlated they are with a "gold standard" measure of the concept derived from qualitative interviews. In our application, we measure agency for 209 women in Haryana, India, first, through a semi-structured interview and, second, through a large set of close-ended questions. We use qualitative coding methods to score each woman's agency based on the interview, which we treat as her true agency. To identify the close-ended questions most predictive of the "truth," we apply statistical algorithms that build on LASSO and random forest but constrain how many variables are selected for the model (five in our case). The resulting five-question index is as strongly correlated with the coded qualitative interview as is an index that uses all of the candidate questions. This approach of selecting survey questions based on their statistical correspondence to coded qualitative interviews could be used to design short survey modules for many other latent constructs.
    Keywords: women's empowerment, survey design, feature selection, psychometrics
    JEL: C83 D13 J16 O12
    Date: 2021–03
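The selection step this abstract describes (keep only the few questions whose combined index best tracks a gold-standard score) can be sketched with a simple greedy search. This is an illustrative stand-in for the authors' LASSO- and random-forest-based procedures, not their actual algorithm, and all names here are hypothetical.

```python
import numpy as np

def select_questions(X, y, k=5):
    """Greedily pick k survey questions whose unweighted sum index is most
    strongly correlated with the gold-standard score y.
    X: (n_respondents, n_questions) array of question responses.
    Returns the list of selected column indices."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        best_j, best_r = None, -np.inf
        for j in remaining:
            # Candidate index: sum of already-selected questions plus question j
            index = X[:, selected + [j]].sum(axis=1)
            r = abs(np.corrcoef(index, y)[0, 1])
            if r > best_r:
                best_j, best_r = j, r
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

Run on the full candidate battery, this returns the k questions whose short index correlates most strongly with the coded interview score.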
  2. By: Mariagrazia Squicciarini; Heike Nachtigall
    Abstract: This report presents new evidence about occupations requiring artificial intelligence (AI)-related competencies, based on online job posting data and previous work on identifying and measuring developments in AI. It finds that the total number of AI-related jobs increased over time in the four countries considered – Canada, Singapore, the United Kingdom and the United States – and that a growing number of jobs require multiple AI-related skills. Skills related to communication, problem solving, creativity and teamwork gained relative importance over time, as did complementary software-related and AI-specific competencies. As expected, many AI-related jobs are posted in categories such as “professionals” and “technicians and associate professionals”, though AI-related skills are in demand, to varying degrees, across almost all sectors of the economy. In all countries considered, the sectors “Information and Communication”, “Financial and Insurance Activities” and “Professional, Scientific and Technical Activities” are the most AI job-intensive.
    Keywords: Digital, Employment, Science & Technology
    Date: 2021–03–25
  3. By: Valentina Aprigliano (Bank of Italy); Simone Emiliozzi (Bank of Italy); Gabriele Guaitoli (University of Warwick); Andrea Luciani (Bank of Italy); Juri Marcucci (Bank of Italy); Libero Monteforte (Ufficio Parlamentare di Bilancio, Bank of Italy)
    Abstract: Can we use newspaper articles to forecast economic activity? Our answer is yes and, to this end, we propose a new economic dictionary in Italian with valence shifters, which we apply to a corpus of about two million articles from four popular newspapers. We produce a set of high-frequency text-based sentiment and policy uncertainty indicators (TESI and TEPU, respectively), which are constantly updated, never revised, and computed both for the whole economy and for specific sectors or economic topics. To test the predictive power of our text-based indicators, we propose two forecasting exercises. First, using Bayesian Model Averaging (BMA) techniques, we show that our monthly text-based indicators greatly reduce the uncertainty surrounding short-term forecasts of the main macroeconomic aggregates, especially during recessions. Second, we employ these indices in a weekly GDP growth tracker, achieving sizeable gains in forecasting accuracy in both normal and turbulent times.
    Keywords: Forecasting, Text Mining, Sentiment, Economic Policy Uncertainty, Big data, BMA.
    JEL: C11 C32 C43 C52 C55 E52 E58
    Date: 2021–03
  4. By: Keith Battocchi; Eleanor Dillon; Maggie Hei; Greg Lewis; Miruna Oprescu; Vasilis Syrgkanis
    Abstract: Policy makers typically face the problem of wanting to estimate the long-term effects of novel treatments while only having historical data on older treatment options. We assume access to a long-term dataset where only past treatments were administered and a short-term dataset where novel treatments have been administered. We propose a surrogate-based approach in which we assume that the long-term effect is channeled through a multitude of available short-term proxies. Our work combines three major recent techniques in the causal machine learning literature (surrogate indices, dynamic treatment effect estimation and double machine learning) in a unified pipeline. We show that our method is consistent and provides root-n asymptotically normal estimates under a Markovian assumption on the data and the observational policy. We use a dataset from a major corporation that includes customer investments over a three-year period to create a semi-synthetic data distribution in which the major qualitative properties of the real dataset are preserved. We evaluate the performance of our method and discuss the practical challenges of deploying our formal methodology and how to address them.
    Date: 2021–03
  5. By: Baptiste Barreau (MICS - Mathématiques et Informatique pour la Complexité et les Systèmes - CentraleSupélec, BNPP CIB GM Lab - BNP Paribas CIB Global Markets Data & AI Lab); Laurent Carlier (BNPP CIB GM Lab - BNP Paribas CIB Global Markets Data & AI Lab); Damien Challet (MICS - Mathématiques et Informatique pour la Complexité et les Systèmes - CentraleSupélec)
    Abstract: We propose a novel deep learning architecture suitable for the prediction of investor interest for a given asset in a given timeframe. This architecture performs both investor clustering and modelling at the same time. We first verify its superior performance on a simulated scenario inspired by real data and then apply it to a large proprietary database from BNP Paribas Corporate and Institutional Banking.
    Keywords: clustering,investor activity prediction,deep learning,neural networks,mixture of experts
    Date: 2021–01–07
  6. By: Francesco Cusano (Bank of Italy); Giuseppe Marinelli (Bank of Italy); Stefano Piermattei (Bank of Italy)
    Abstract: Ensuring and disseminating high-quality data is crucial for central banks to adequately support monetary analysis and the related decision-making process. In this paper we develop a machine learning process for identifying errors in banks’ supervisory reports on loans to the private sector employed in the Bank of Italy’s statistical production of Monetary and Financial Institutions’ (MFI) Balance Sheet Items (BSI). In particular, we model a “Revisions Adjusted – Quantile Regression Random Forest” (RA–QRRF) algorithm in which the predicted acceptance regions of the reported values are calibrated through an individual “imprecision rate” derived from the entire history of each bank’s reporting errors and revisions collected by the Bank of Italy. The analysis shows that our RA-QRRF approach returns very satisfying results in terms of error detection, especially for the loans to the households sector, and outperforms well-established alternative outlier detection procedures based on probit and logit models.
    Keywords: banks, balance sheet items, outlier detection, machine learning
    JEL: C63 C81 G21
    Date: 2021–03
  7. By: Ángel Iván Moreno Bernal (Banco de España); Carlos González Pedraz (Banco de España)
    Abstract: This paper presents a text mining application that extracts information from financial texts and uses it to create sentiment indices. In particular, the analysis focuses on the Spanish versions of the Banco de España’s Financial Stability Reports from 2002 to 2019 and on the press reaction to these reports. To calculate the indices, we created a Spanish dictionary of words with positive, negative or neutral connotations, to the best of our knowledge the first within the context of financial stability. The robustness of the indices is analysed by applying them to different sections of the Report and by using different variations of the dictionary and of the definition of the index. Finally, sentiment is also measured for press reports in the days following the publication of the Report. The results show that the list of words collected in the reference dictionary represents a robust sample for estimating the sentiment of these texts. This tool constitutes a valuable methodology for analysing the repercussions of financial stability reports, while objectively quantifying the sentiment conveyed in them.
    Keywords: text mining, sentiment analysis, natural language processing, central bank communications, financial stability
    JEL: C82 G28
    Date: 2020–07
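A dictionary-based net-tone index of the kind described above can be sketched in a few lines. The word lists here are hypothetical stand-ins for the paper's financial-stability dictionary, and the scoring rule (net hits over total hits) is one common convention, not necessarily the authors' exact definition.

```python
def sentiment_index(text, positive, negative):
    """Net tone: (positive hits - negative hits) / total dictionary hits.
    Returns 0.0 when the text contains no dictionary words."""
    words = text.lower().split()
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    hits = pos + neg
    return (pos - neg) / hits if hits else 0.0

# Hypothetical dictionary entries (a real dictionary is far larger)
POSITIVE = {"solido", "crecimiento", "estable"}
NEGATIVE = {"riesgo", "crisis", "incertidumbre"}
```

Applied section by section, such an index yields one sentiment series per section of the Report, which is how the robustness checks in the abstract can be organised.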
  8. By: Gabriel Borrageiro; Nick Firoozye; Paolo Barucca
    Abstract: We investigate the benefits of feature selection, nonlinear modelling and online learning for forecasting financial time series. We consider the sequential and continual learning sub-genres of online learning. Through empirical experimentation, involving long-term forecasting of daily sampled cross-asset futures and short-term forecasting of minutely sampled cash currency pairs, we find that online learning techniques outperform offline learning ones. We also find that, within the subset of models we use, sequential learning in time with online ridge regression provides the best next-step-ahead forecasts, while continual learning with an online radial basis function network provides the best multi-step-ahead forecasts. We combine the benefits of both in a precision-weighted ensemble of the forecast errors and find superior forecast performance overall.
    Date: 2021–03
  9. By: Srivastav, Bhanu
    Abstract: Neural networks are one of the methods of artificial intelligence, founded on the biological nervous system's capacity to learn from examples. They are used to solve problems that cannot be adequately modeled with conventional techniques: a neural structure can learn, adapt, predict and classify, and its predictive power often exceeds that of standard mathematical estimation models. Neural networks have therefore been applied in many different fields. This paper reviews the most recent advances in applying artificial neural networks. The reviewed studies were extracted in 2021 from the Web of Science database maintained by Clarivate Analytics. We find that, among the various applications of ANNs, applications related to Covid-19 are on the rise.
    Keywords: ANN; Covid-19; Dust; Gas; Organic richness
    JEL: I1 I10 Q49 Y80
    Date: 2021–02–08
  10. By: Jordan Vazquez Llana (UJML - Université Jean Moulin - Lyon 3 - Université de Lyon); Cécile Godé (CRET-LOG - Centre de Recherche sur le Transport et la Logistique - AMU - Aix Marseille Université); Jean-Fabrice Lebraty (UJML - Université Jean Moulin - Lyon 3 - Université de Lyon)
    Abstract: As Godé and Vazquez (2017) demonstrate, French National Police forces frequently encounter unexpected situations that require rapid decision-making (Godé, 2016). Big data environments are likely to affect police officers' decision-making processes. The question we ask here is: "How do public safety experts make decisions in big data environments?" This research focuses on one event in particular: the time trial stage of the 2017 Tour de France. On 21 July 2017, the city of Marseille hosted the Tour de France riders for a time trial stage; up to 300,000 people were expected for the event. In order to coordinate the police patrols and the various riot police (C.R.S.) companies in the field, the teams of the Information and Command Centre (C.I.C.) of the Marseille Police were able to rely on numerous technologies that constituted their big data environment. This big data environment allows decision-makers to identify situations in a changing context, to reassess unfamiliar situations, and to consider fallback options to secure the actions of the teams on the ground.
    Date: 2021–03–11
  11. By: Azi Ben-Rephael; Bruce I. Carlin; Zhi Da; Ryan D. Israelsen
    Abstract: We use machine learning to analyze minute-by-minute Bloomberg online status data and study how the effort provision of top executives in public corporations affects firm value. While executives likely spend most of their time doing other activities, Bloomberg usage data allows us to characterize their work habits. We document a positive effect of effort on unexpected earnings, cumulative abnormal returns following firm earnings announcements, and credit default swap spreads. We form long-short, calendar-time, effort portfolios and show that they earn significant average daily returns. Finally, we revisit several agency issues that have received attention in the prior academic literature on executive compensation.
    JEL: D22 D82 G32 M52
    Date: 2021–02
  12. By: Marie-Hélène Felt; David Laferrière
    Abstract: The Bank of Canada’s Currency Department has used the Canadian Financial Monitor (CFM) survey since 2009 to track Canadians’ cash usage, payment card ownership and usage, and the adoption of payment innovations. A new online CFM survey was launched in 2018. Because it uses non-probability sampling for data collection, selection bias is very likely. We outline various methods for obtaining survey weights and discuss the associated conditions necessary for these weights to eliminate selection bias. In the end, we obtain calibration weights for the 2018 and 2019 online CFM samples. Our final weights improve upon the default weights provided by the survey company in several ways: (i) we choose the calibration variables based on a fully documented selection procedure that employs machine learning techniques; (ii) we use very up-to-date calibration totals; (iii) for each survey year we obtain two sets of weights, one for the full yearly sample of CFM respondents, the other for the sub-sample of CFM respondents who also filled in the methods-of-payment module of the survey.
    Keywords: Econometric and statistical methods
    JEL: C C8 C81 C83
    Date: 2020
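The calibration-weighting idea in this entry can be illustrated with a minimal raking (iterative proportional fitting) sketch: weights are adjusted until the weighted sample matches known population margins. This is a generic illustration, not the Bank of Canada's actual procedure, and the variable names and margins are hypothetical.

```python
import numpy as np

def rake_weights(sample, margins, n_iter=50):
    """Raking / iterative proportional fitting.
    sample: dict of variable -> array of category codes, one per respondent.
    margins: dict of variable -> {category: population share}.
    Returns weights (mean 1) whose weighted category shares match the margins."""
    n = len(next(iter(sample.values())))
    w = np.ones(n)
    for _ in range(n_iter):
        for var, target in margins.items():
            codes = sample[var]
            for cat, share in target.items():
                mask = codes == cat
                current = w[mask].sum() / w.sum()
                if current > 0:
                    # Scale this category's weights toward its population share
                    w[mask] *= share / current
    return w / w.mean()
```

Cycling through the margins repeatedly lets each adjustment be refined after the others change the totals; with feasible margins the weighted shares converge to the targets.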
  13. By: Ademmer, Martin; Beckmann, Joscha; Bode, Eckhardt; Boysen-Hogrefe, Jens; Funke, Manuel; Hauber, Philipp; Heidland, Tobias; Hinz, Julian; Jannsen, Nils; Kooths, Stefan; Söder, Mareike; Stamer, Vincent; Stolzenburg, Ulrich
    Abstract: The term big data covers new data sources that are unconventional compared with standard economic statistics. They are very extensive and available in a very timely manner and at high frequency. However, these new data exhibit great variety and complexity, because they are not collected for the analysis of economic questions but rather arise as a by-product of various applications. Against this background, the authors present the fields of application and the potential of various big data sources within a comparable framework. They also point out possible future uses of big data that cannot yet be exploited, for example because the necessary data are not yet systematically collected or recorded. They conclude that, in many fields of application, big data will above all be used as a complement to conventional economic statistics.
    Keywords: big data, macroeconomic analysis, business cycle, business cycle in Germany, machine learning
    Date: 2021
  14. By: Mark Braverman; Sylvain Chassang
    Abstract: This paper explores whether Big Data, taking the form of extensive high dimensional records, can reduce the cost of adverse selection by private service providers in government-run capitation schemes, such as Medicare Advantage. We argue that using data to improve the ex ante precision of capitation regressions is unlikely to be helpful. Even if types become essentially observable, the high dimensionality of covariates makes it infeasible to precisely estimate the cost of serving a given type: Big Data makes types observable, but not necessarily interpretable. This gives an informed private operator scope to select types that are relatively cheap to serve. Instead, we argue that data can be used to align incentives by forming unbiased and non-manipulable ex post estimates of a private operator’s gains from selection.
    JEL: C55 D82 H51 I11 I13
    Date: 2021–02
  15. By: Kea Baret (BETA - Bureau d'Économie Théorique et Appliquée - UL - Université de Lorraine - UNISTRA - Université de Strasbourg - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement); Amélie Barbier-Gauchard (BETA - Bureau d'Économie Théorique et Appliquée - UL - Université de Lorraine - UNISTRA - Université de Strasbourg - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement); Theophilos Papadimitriou (DUTH - Democritus University of Thrace)
    Abstract: Since the reinforcement of the Stability and Growth Pact (1996), the European Commission has closely monitored public finances in the EU member states. A country's failure to comply with the 3% limit rule on the public deficit triggers an audit. In this paper, we present a machine learning based forecasting model for compliance with the 3% limit rule. To do so, we use data spanning the period from 2006 to 2018 (a turbulent period including the Global Financial Crisis and the Sovereign Debt Crisis) for the 28 EU member states. A set of eight features is identified as predictors from 141 variables through a feature selection procedure. The forecasting is performed using Support Vector Machines (SVM). The proposed model reached 91.7% forecasting accuracy and outperformed the Logit model that we used as a benchmark.
    Keywords: Fiscal Rules,Fiscal Compliance,Stability and Growth Pact,Machine learning
    Date: 2021–01–26
  16. By: Jaqueson K. Galimberti (Auckland University of Technology); Stefan Pichler (ETH Zurich, Switzerland); Regina Pleninger (ETH Zurich, Switzerland)
    Abstract: The main challenge in studying economic inequality is limited data availability, which is particularly problematic in developing countries. We construct a measure of economic inequality for 234 countries/territories from 1992 to 2013 using satellite data on night lights and gridded population data. Key methodological innovations include the use of varying levels of data aggregation, and a calibration of the lights-prosperity relationship to match traditional inequality measures based on income data. We obtain a measure that is significantly correlated with cross-country variation in income inequality. We provide three applications of the data in the fields of health economics and international finance. Our results show that light- and income-based inequality measures lead to similar results in terms of cross-country correlations, but not for the dynamics of inequality within countries. Namely, we find that the light-based inequality measure can capture more enduring features of economic activity that are not directly captured by income.
    Keywords: Nighttime lights, inequality, gridded population
    JEL: D63 E01 I14 O11 O47 O57
    Date: 2021–03
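The kind of light-based inequality measure described in this entry can be sketched as a population-weighted Gini over grid cells of light per capita. This is a generic illustration under assumed inputs, not the authors' calibrated procedure (their key step is calibrating the lights-prosperity relationship against income data).

```python
import numpy as np

def gini(light, pop):
    """Population-weighted Gini of light-per-capita across grid cells.
    light, pop: float arrays of total light emission and population per cell."""
    per_capita = light / pop
    order = np.argsort(per_capita)
    p = pop[order] / pop.sum()      # population shares, poorest cells first
    l = light[order] / light.sum()  # light shares in the same order
    cum_l = np.cumsum(l)
    # Trapezoidal area under the Lorenz curve; Gini = 1 - 2 * area
    prev = np.concatenate(([0.0], cum_l[:-1]))
    area = np.sum((cum_l + prev) / 2 * p)
    return 1 - 2 * area
```

With light proportional to population the measure is 0 (perfect equality), and it rises toward 1 as light concentrates in a small share of the population.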
  17. By: Xuan Jiang; Wan-Ying Chang; Bruce A. Weinberg
    Abstract: This paper uses newly available data from Web of Science on publications matched to researchers in the Survey of Doctorate Recipients to compare scientific publication records collected by surveys and by algorithmic approaches. We aim to illustrate the different types of measurement error in self-reported and machine-generated data by estimating how publication measures from the two approaches relate to career outcomes (e.g., salaries, placements, and faculty rankings). We find that the potential biases in the self-reports are smaller relative to the algorithmic data. Moreover, the errors in the two approaches are quite intuitive: the measurement errors of the algorithmic data are mainly due to the accuracy of matching, which primarily depends on the frequency of names and on the data available to make matches, while the noise in self-reports is expected to increase over a career as researchers’ publication records become more complex, harder to recall, and less immediately relevant for career progress. This paper provides methodological suggestions for evaluating the quality and advantages of the two approaches to data construction. It also provides guidance on how to use the new linked data.
    JEL: C26 J24 J3 O31
    Date: 2021–02
  18. By: Jean-Claude Berthélemy (UP1 - Université Paris 1 Panthéon-Sorbonne, FERDI - Fondation pour les Etudes et Recherches sur le Développement International); Mathilde Maurel (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, FERDI - Fondation pour les Etudes et Recherches sur le Développement International)
    Abstract: This paper proposes a new methodology for evaluating off-grid electrification projects, based upon Nighttime Light (NTL) observations obtained by combining Defense Meteorological Satellite Program (DMSP) data and Visible Infrared Imaging Radiometer Suite (VIIRS) data. The methodology consists of comparing NTL data before and after the implementation of the projects. The projects are selected from FERDI's Collaborative Smart Mapping of Mini-grid Action (CoSMMA) analysis, which documents existing project evaluations reported in published papers. Such reported evaluations are of uneven quality, and few of them meet scientific standards. Our results suggest that our new methodology can help to fill this gap. For each project, we compute the NTL deviation with respect to its counterfactual, which provides us with a proxy for the off-grid electricity-induced rate of NTL growth.
    Keywords: Decentralized electrification,sustainable development,impact assessment,Nighttime Light,DMSP,VIIRS
    Date: 2021–03–02
  19. By: Antoine Leblois (CEE-M - Centre d'Economie de l'Environnement - Montpellier - UMR 5211 - UM - Université de Montpellier - CNRS - Centre National de la Recherche Scientifique - Montpellier SupAgro - Institut national d’études supérieures agronomiques de Montpellier - Institut Agro - Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement)
    Abstract: Land use changes are known to account for over 20% of human greenhouse gas emissions, and tree cover losses can significantly influence land-climate dynamics. Land-climate feedbacks have been identified and evaluated for a long time. However, in addition to the direct effect of climate change on forest biomes, recent sparse evidence has shown that land use changes may increase as a result of weather shocks. In Western and Central Africa, agriculture is the main source of income and employment for rural populations. Economies rely on agricultural production, which is largely rainfed, and therefore dependent predominantly upon seasonal rainfall. In this article, I explore the impact of seasonal rainfall quality on deforestation, by combining high-resolution remotely-sensed annual tree cover loss, land cover, human activity and daily rainfall data. I show that in poor regions that are mainly reliant on rainfed agriculture, a bad rainy season leads to large deforestation shocks. These shocks notably depend on the proportion of agricultural land and on the remoteness of the areas in question, as remoteness determines the ability to import food and the existence of alternative income sources. In areas with significant forest cover, a short rainfall season leads to a 15% increase in deforestation. In unconnected areas with small proportions of crop area, the increase in deforestation reaches 20%. Findings suggest that a refined understanding of the land use changes caused by rainfall shocks might be used to improve the design and effectiveness of development, adaptation and conservation policies.
    Keywords: deforestation,rainfall shocks,West Africa
    Date: 2021–01–14
  20. By: Maryam Farboodi; Laura Veldkamp
    Abstract: The rise of information technology and big data analytics has given rise to "the new economy." But are its economics new? This article constructs a growth model where firms accumulate data, instead of capital. We incorporate three key features of data: 1) data is a by-product of economic activity; 2) data is information used for prediction; and 3) uncertainty reduction enhances firm profitability. The model can explain why data-intensive goods or services, like apps, are given away for free, why many new entrants are unprofitable and why some of the biggest firms in the economy profit primarily from selling data. While our transition dynamics differ from those of traditional growth models, the long run still features diminishing returns. Just like accumulating capital, accumulating predictive data, by itself, cannot sustain long-run growth.
    JEL: O3 O4
    Date: 2021–02
  21. By: Jordan Vazquez Llana; Cécile Godé (CRET-LOG - Centre de Recherche sur le Transport et la Logistique - AMU - Aix Marseille Université); Jean-Fabrice Lebraty
    Abstract: As demonstrated by Godé and Vazquez (2017), French National Police teams often encounter unexpected events (Godé, 2016) which compel them to make quick decisions. Big data environments can have an impact on their decision-making processes. The research question of this article is: "How are public safety decisions taken in big data environments?" This research focuses on a specific event: the time trial stage of the 2017 Tour de France. The city of Marseille hosted the famous cyclists on July 21st, 2017 during this special stage of the popular annual French cycling race: up to 300,000 spectators were expected. In order to coordinate the numerous police patrols, the decision-makers of the Center for Information and Command (CIC) were able to rely on the set of technologies that constitutes their big data environment. This new informational context is exploited by police decision-makers to identify risky situations, reassess a situation when an unexpected event occurs, and secure the operations of the teams on the ground.
    Keywords: decision-making,intuition,big data environment,unexpected events,police,2017 Tour de France
    Date: 2021–03–11
  22. By: Yuriy Gorodnichenko (University of California, Berkeley); Tho Pham (University of Reading); Oleksandr Talavera (University of Birmingham)
    Abstract: We develop a deep learning model to detect emotions embedded in press conferences after the meetings of the Federal Open Market Committee and examine the influence of the detected emotions on financial markets. We find that, after controlling for the Fed's actions and the sentiment in policy texts, positive tone in the voices of Fed Chairs leads to statistically significant and economically large increases in share prices. In other words, how policy messages are communicated can move the stock market. In contrast, the bond market appears to take few vocal cues from the Chairs. Our results provide implications for improving the effectiveness of central bank communications.
    Keywords: monetary policy, communication, voice, emotion, text sentiment, stock market, bond market.
    JEL: E31 E58 G12 D84
    Date: 2021–02
  23. By: Oto-Peralías, Daniel (Universidad Pablo de Olavide); Gutiérrez Mora, Dolores
    Abstract: This paper uses text analysis to measure gender bias in cities through street names. Focusing on the case of Spain, we collect data on 15 million street names to analyze gender inequality in urban toponyms. For each Spanish municipality and each year from 2001 to 2020, we calculate the percentage of streets with female names over the total number of streets with male and female names. Our results reveal a strong gender imbalance in Spanish cities: the percentage of streets named after women over the total named after men and women is only 12% in 2020. We also observe substantial differences across Spanish regions; for new streets, gender bias is lower but still far from parity. The second part of the paper analyzes the correlation of our indicator with the cultural factor it is supposed to capture, with the results suggesting that it constitutes a useful cultural measure of gender inequality at the city level. This research has policy implications: given the strong symbolic power attributed to street names, it helps quantify a relevant phenomenon that has so far been elusive to measure.
    Date: 2021–03–05
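    The indicator described in this abstract is a simple ratio of female-named streets to all gendered street names. As a minimal illustrative sketch (not the authors' code; the name sets and classification rule here are invented placeholders, while the paper classifies 15 million street names with a more elaborate procedure):

    ```python
    def female_street_share(street_names, female_names, male_names):
        """Percentage of gendered streets that are named after women.

        street_names: list of street name strings.
        female_names / male_names: sets of given names used to classify a street.
        Streets matching neither set are excluded from the denominator,
        mirroring the paper's ratio over male- and female-named streets only.
        """
        female = sum(any(n in s for n in female_names) for s in street_names)
        male = sum(any(n in s for n in male_names) for s in street_names)
        total = female + male
        return 100.0 * female / total if total else float("nan")

    streets = ["Calle Maria Zambrano", "Avenida Pablo Iglesias",
               "Calle Federico Garcia Lorca", "Plaza Clara Campoamor"]
    share = female_street_share(streets, {"Maria", "Clara"}, {"Pablo", "Federico"})
    # 2 female-named streets out of 4 gendered streets -> 50.0
    ```

    Computed per municipality and per year, this ratio yields the panel indicator the paper correlates with other measures of gender inequality.
    
    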
  24. By: Doruk Cengiz; Arindrajit Dube; Attila S. Lindner; David Zentler-Munro
    Abstract: We assess the effect of the minimum wage on labor market outcomes such as employment, unemployment, and labor force participation for most workers affected by the policy. We apply modern machine learning tools to construct demographically-based treatment groups capturing around 75% of all minimum wage workers, a major improvement over the literature, which has focused on fairly narrow subgroups where the policy has a large bite (e.g., teens). Exploiting 172 prominent minimum wage increases between 1979 and 2019, we find a very clear increase in the average wages of workers in these groups following a minimum wage increase, while there is little evidence of employment loss. Moreover, we find no indication that the minimum wage has a negative effect on the unemployment rate, labor force participation, or labor market transitions. We also detect no employment or participation responses even for sub-groups that are likely to have a high extensive-margin labor supply elasticity, such as teens, older workers, or single mothers. Overall, these findings provide little evidence of changing search effort in response to a minimum wage increase.
    JEL: J08 J2 J3 J38 J8 J88
    Date: 2021–01
  25. By: Jann Spiess; Vasilis Syrgkanis
    Abstract: The past years have seen the development and deployment of machine-learning algorithms to estimate personalized treatment-assignment policies from randomized controlled trials. Yet such algorithms for the assignment of treatment typically optimize expected outcomes without taking into account that treatment assignments are frequently subject to hypothesis testing. In this article, we explicitly take significance testing of the effect of treatment-assignment policies into account, and consider assignments that optimize the probability of finding a subset of individuals with a statistically significant positive treatment effect. We provide an efficient implementation using decision trees, and demonstrate its gain over selecting subsets based on positive (estimated) treatment effects. Compared to standard tree-based regression and classification tools, this approach tends to yield substantially higher power in detecting subgroups with positive treatment effects.
    Date: 2021–03
  26. By: Andreeva, Andriyana; Yolova, Galina
    Abstract: The report examines, in general terms, the question of liability in labour law in cases of accidents connected to the use of artificial intelligence. The analysis addresses the traditional labour-law institution of liability in its contemporary light, namely the question of responsibility for workplace accidents, in the context of the challenges generated by the incorporation of automated systems into the working process. The emphasis is on the regulatory norms and principles laid down in European documents, which outline the trends in the development of this institution. Based on the analysis, conclusions are drawn and trends in the development of the institution under the new conditions of the digital revolution are identified.
    Keywords: legal liability in labour law, workplace accidents, artificial intelligence, financial liability of the employer
    JEL: K31
    Date: 2020
  27. By: Andreas Barth (Goethe University Frankfurt - Department of Finance); Valerie Laturnus (Goethe University Frankfurt - Department of Finance); Sasan Mansouri (Goethe University Frankfurt - Department of Finance); Alexander F. Wagner (University of Zurich - Department of Banking and Finance; Centre for Economic Policy Research (CEPR); European Corporate Governance Institute (ECGI); Swiss Finance Institute)
    Abstract: Freelancing human experts play an important role in Initial Coin Offerings (ICOs). Expert ratings partially reflect the reciprocal network of ICO members and analysts. Ratings predict ICO success, but highly imperfectly. Favorably rated ICOs tend to fail when more ratings reciprocate prior ratings. Failure despite strong ratings is also frequent when analysts have a history of optimism, and when reviews strike a particularly positive tone. These findings help illuminate the workings of ICOs for funding new ventures, and the rich data also yield insights pertinent to the literature on equity analysts and rating agencies.
    Keywords: Analysts, Asymmetric Information, FinTech, Initial Coin Offering (ICO)
    JEL: G14 G24 L26 D82 D83
    Date: 2021–03
  28. By: J. Peña; E. Prades
    Abstract: In this paper we analyze price-setting behavior in Chile using data scraped from the public websites of the main retailers, including supermarkets, a pharmacy retailer and car dealerships. Data collection started in July 2019, and the dataset covers two major recent events: (1) the social outbreak and (2) the declaration of a state of emergency due to Covid-19; both episodes disrupted the economy. With information on product varieties accounting for 22% of the CPI basket, we document several empirical findings on price-setting behavior in terms of stickiness, that is, the frequency, implied duration and size of price adjustments. We find that, despite facing large shocks, prices adjusted very little, less frequently and by smaller amounts than before these two events. We also find a reduction in online product variety availability, a feature also documented during natural disasters such as earthquakes. The reduction in product availability poses additional difficulties for constructing CPI indexes and for properly capturing price rigidities, which are relevant for monetary policy.
    Date: 2021–03
  29. By: Margherita Comola (University Paris-Saclay and Paris School of Economics); Carla Inguaggiato (University of Bern, Centre for Development and Environment); Mariapia Mendola (University of Milano-Bicocca and IZA)
    Abstract: We study the role of social learning in the diffusion of cash crops in a resettled village economy in northeastern Brazil. We combine detailed geo-coded data on farming plots with dyadic data on social ties among settlers, and we leverage natural exogenous variation in network formation induced by the land occupation movement and the agrarian reform. Using longitudinal data on farming decisions over 15 years, we find consistent evidence of significant peer effects in the decision to farm new cash fruits (pineapple and passion fruit). Our results suggest that social diffusion is heterogeneous along observed plot and crop characteristics, i.e. farmers growing water-sensitive crops are more likely to respond to the actions of peers with similar water access conditions.
    Keywords: Technology Adoption, Agrarian Reform, Social Networks, Peer Effects, Brazil
    JEL: C45 D85 J15 O33 Q15
    Date: 2021–02–09
  30. By: Javier Oliver Muncharaz (Universidad Politécnica de Valencia)
    Abstract: In the financial literature, there is great interest in the prediction of stock prices. Stock prediction is necessary for the creation of different investment strategies, both speculative and hedging ones. The application of neural networks has brought a change in the creation of predictive models. In this paper, we analyze the capacity of recurrent neural networks, in particular the long short-term memory (LSTM) recurrent neural network, as opposed to classic time series models such as exponential smoothing (ETS) and ARIMA. These models have been estimated for 284 stocks from the S&P 500 stock market index, comparing the MAE obtained from their predictions. The results confirm a significant reduction in prediction errors when LSTM is applied. These results are consistent with other similar studies of stocks included in other stock market indices, as well as other financial assets such as exchange rates.
    Keywords: S&P 500, long short-term memory network, recurrent neural network, ARIMA
    Date: 2020
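    The models in this paper are ranked by mean absolute error (MAE) on their price forecasts. As a minimal sketch of that comparison (the numbers below are made up for illustration, not the paper's data, which covers 284 S&P 500 stocks):

    ```python
    def mae(actual, predicted):
        """Mean absolute error between realized prices and model forecasts."""
        return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

    # Hypothetical one-step-ahead forecasts for a single stock.
    actual = [101.0, 102.5, 101.8, 103.2]
    arima_pred = [100.0, 101.0, 103.0, 102.0]
    lstm_pred = [100.8, 102.2, 102.0, 103.0]

    # The model with the lower MAE wins for this stock; the paper repeats
    # this comparison across all 284 stocks in the sample.
    print(mae(actual, arima_pred))  # 1.225
    print(mae(actual, lstm_pred))   # 0.225
    ```

    Aggregating such per-stock MAE comparisons across the index is what supports the paper's conclusion that LSTM reduces prediction errors relative to ETS and ARIMA.
    
    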
  31. By: Valentina Aprigliano (Bank of Italy); Guerino Ardizzi (Bank of Italy); Alessia Cassetta (Bank of Italy); Alessandro Cavallero (Bank of Italy); Simone Emiliozzi (Bank of Italy); Alessandro Gambini (Bank of Italy); Nazzareno Renzi (Bank of Italy); Roberta Zizza (Bank of Italy)
    Abstract: This paper provides an overview of how information on payments has been recently exploited by Banca d’Italia staff for the purposes of tracking economic activity and forecasting. In particular, the payment data used for this work are drawn from the payment systems managed by Banca d’Italia (BI-COMP and TARGET2) and from the Anti-Money Laundering Aggregate Reports submitted by banks and by Poste Italiane to the Banca d’Italia’s Financial Intelligence Unit (Unità di Informazione Finanziaria, UIF). We show that indicators drawn from these sources can improve forecasting accuracy; in particular, those available at a higher frequency have proved crucial to properly assessing the state of the economy during the pandemic. Moreover, these indicators make it possible to assess changes in agents’ behaviour, notably with reference to payment habits, and, thanks to their granularity, to delve deeper into the macroeconomic trends, exploring heterogeneity by sector and geography.
    Keywords: short term forecasting, high-frequency data, payment systems, TARGET2, money laundering, COVID-19
    JEL: C53 E17 E27 E32 E37 E42
    Date: 2021–03
  32. By: Jennifer Peña (Central Bank of Chile); Elvira Prades (Banco de España)
    Abstract: In this paper we analyze price-setting behavior in Chile using data scraped from the public websites of the main retailers, including supermarkets, a pharmacy retailer and car dealerships. Data collection started in July 2019, and the dataset covers two major recent events: (1) the social outbreak and (2) the declaration of a state of emergency due to Covid-19; both episodes disrupted the economy. With information on product varieties accounting for 22% of the CPI basket, we document several empirical findings on price-setting behavior in terms of stickiness, that is, the frequency, implied duration and size of price adjustments. We find that, despite facing large shocks, prices adjusted very little, less frequently and by smaller amounts than before these two events. We also find a reduction in online product variety availability, a feature also documented during natural disasters such as earthquakes. The reduction in product availability poses additional difficulties for constructing CPI indexes and for properly capturing price rigidities, which are relevant for monetary policy.
    Keywords: on-line price data, CPI, prices stickiness, retail distribution
    JEL: E01 E31 L81
    Date: 2021–01
  33. By: Rawal, Niyati; Stock-Homburg, Ruth
    Date: 2021
  34. By: Paul Decker
    Abstract: In this letter to the Advisory Committee on Data for Evidence Building, Paul Decker describes the challenges and opportunities inherent in using data and evidence to support informed decision making.
    Keywords: evidence building, data, decision making

This nep-big issue is ©2021 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at . For comments, please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.