nep-big New Economics Papers
on Big Data
Issue of 2022‒11‒14
25 papers chosen by
Tom Coupé
University of Canterbury

  1. Applying Machine Learning and Geolocation Techniques to Social Media Data (Twitter) to Develop a Resource for Urban Planning By Milusheva,Svetoslava Petkova; Marty,Robert Andrew; Bedoya Arguelles,Guadalupe; Williams,Sarah Elizabeth; Resor,Elizabeth Landsdowne; Legovini,Arianna
  2. Learners in the loop: hidden human skills in machine intelligence By Paola Tubaro
  3. DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions By Fernando Moreno-Pino; Stefan Zohren
  4. How the risk of job automation in the UK has changed over time By Darke, Matthew James
  5. Classification based credit risk analysis: The case of Lending Club By Aadi Gupta; Priya Gulati; Siddhartha P. Chakrabarty
  6. Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments By Heiler, Phillip; Knaus, Michael C.
  7. Do Workfare Programs Live Up to Their Promises? Experimental Evidence from Côte d’Ivoire By Bertrand,Marianne; Crepon,Bruno Jacques Jean Philippe; Marguerie,Alicia Charlene; Premand,Patrick
  8. From Rules to Regs: A Structural Topic Model of Collusion Research By W. Benedikt Schmal
  9. Welfare estimations from imagery. A test of domain experts' ability to rate poverty from visual inspection of satellite imagery By Wahab Ibrahim; Ola Hall
  10. Machine Learning in International Trade Research: Evaluating the Impact of Trade Agreements By Breinlich,Holger; Corradi,Valentina; Rocha,Nadia; Ruta,Michele; Santos Silva,J.M.C.; Zylkin,Tom
  11. Understanding the Requirements for Surveys to Support Satellite-Based Crop Type Mapping: Evidence from Sub-Saharan Africa By Azzari,George; Jain,Shruti; Jeffries,Graham; Kilic,Talip; Murray,Siobhan
  12. Artificial Intelligence, Ethics, and Intergenerational Responsibility By Victor Klockmann; Alicia von Schenk; Marie Claire Villeval
  13. "AI, Skill, and Productivity: The Case of Taxi Drivers" By Kyogo Kanazawa; Daiji Kawaguchi; Hitoshi Shigeoka; Yasutora Watanabe
  14. Using Twitter to Evaluate the Perception of Service Delivery in Data-Poor Environments By Braley,Alia Anne; Fraiberger,Samuel Paul; Tas,Emcet Oktay
  15. Social Media and Newsroom Production Decisions By Julia Cagé; Nicolas Hervé; Béatrice Mazoyer
  16. Central Bank Mandates and Monetary Policy Stances: through the Lens of Federal Reserve Speeches By Bertsch, Christoph; Hull, Isaiah; Lumsdaine, Robin L.; Zhang, Xin
  17. ESPAREL. A look at the relationship between population and territory in Spain in historical perspective By Francisco J. Beltran Tapia; Alfonso Diez Minguela; Victor Fernandez Modrego; Alicia Gomez Tello; Julio Martinez-Galarraga; Daniel A. Tirado Fabregat
  18. Research of an optimization model for servicing a network of ATMs and information payment terminals By G. A. Nigmatulin; O. B. Chaganova
  19. A Clustering Algorithm for Correlation Quickest Hub Discovery Mixing Time Evolution and Random Matrix Theory By Alejandro Rodriguez Dominguez; David Stynes
  20. DyFEn: Agent-Based Fee Setting in Payment Channel Networks By Kiana Asgari; Aida Afshar Mohammadian; Mojtaba Tefagh
  21. The role of central bank communication in inflation-targeting Eastern European emerging economies By Valerio Astuti; Alessio Ciarlone; Alberto Coco
  22. Development Research at High Geographic Resolution: An Analysis of Night Lights, Firms, and Poverty in India Using the SHRUG Open Data Platform By Asher,Sam; Lunt,Tobias; Matsuura,Ryu; Novosad,Paul Michael
  23. Effect of typhoons on economic activities in Vietnam: Evidence using satellite imagery By Etienne ESPAGNE; Yen Boi HA; Kenneth HOUNGBEDJI; Thanh NGO-DUC
  24. Tracking Economic Activity in Response to the COVID-19 Crisis Using Nighttime Lights — The Case of Morocco By Roberts,Mark
  25. Lights Out? COVID-19 Containment Policies and Economic Activity By Beyer,Robert Carl Michael; Jain,Tarun; Sinha,Sonalika

  1. By: Milusheva,Svetoslava Petkova; Marty,Robert Andrew; Bedoya Arguelles,Guadalupe; Williams,Sarah Elizabeth; Resor,Elizabeth Landsdowne; Legovini,Arianna
    Abstract: With all the recent attention focused on big data, it is easy to overlook that basic vital statistics remain difficult to obtain in most of the world. This project set out to test whether an openly available dataset (Twitter) could be transformed into a resource for urban planning and development. The hypothesis is tested by creating road traffic crash location data, which are scarce in most resource-poor environments but essential for addressing the number one cause of mortality for children over age five and young adults. The research project scraped 874,588 traffic-related tweets in Nairobi, Kenya, applied a machine learning model to capture the occurrence of a crash, and developed an improved geoparsing algorithm to identify its location. The project geolocated 32,991 crash reports in Twitter for 2012-20 and clustered them into 22,872 unique crashes to produce one of the first crash maps for Nairobi. A motorcycle delivery service was dispatched in real time to verify a subset of crashes, showing 92 percent accuracy. Using a spatial clustering algorithm, portions of the road network (less than 1 percent) were identified where 50 percent of the geolocated crashes occurred. Even with limitations in the representativeness of the data, the results can provide urban planners with useful information to target road safety improvements where resources are limited.
    Keywords: ICT Applications,Disease Control & Prevention,Public Health Promotion,Road Safety,Intelligent Transport Systems,Transport Services,Crime and Society
    Date: 2020–12–04
  2. By: Paola Tubaro (CREST - Centre de Recherche en Économie et Statistique - ENSAI - Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] - X - École polytechnique - ENSAE Paris - École Nationale de la Statistique et de l'Administration Économique - CNRS - Centre National de la Recherche Scientifique, LSQ - Laboratoire de sociologie quantitative - Centre de Recherche en Économie et STatistique (CREST), MSH Paris-Saclay - Maison des Sciences de l'Homme - Paris Saclay - UVSQ - Université de Versailles Saint-Quentin-en-Yvelines - Université Paris-Saclay - CNRS - Centre National de la Recherche Scientifique - ENS Paris Saclay - Ecole Normale Supérieure Paris-Saclay, LISN - Laboratoire Interdisciplinaire des Sciences du Numérique - CentraleSupélec - Université Paris-Saclay - CNRS - Centre National de la Recherche Scientifique, TAU - TAckling the Underspecified - Inria Saclay - Ile de France - Inria - Institut National de Recherche en Informatique et en Automatique - LISN - Laboratoire Interdisciplinaire des Sciences du Numérique - CentraleSupélec - Université Paris-Saclay - CNRS - Centre National de la Recherche Scientifique)
    Abstract: Today's artificial intelligence, largely based on data-intensive machine learning algorithms, relies heavily on the digital labour of invisibilized and precarized humans-in-the-loop who perform multiple functions of data preparation, verification of results, and even impersonation when algorithms fail. Using original quantitative and qualitative data, the present article shows that these workers are highly educated, engage significant (sometimes advanced) skills in their activity, and earnestly learn alongside machines. However, the loop is one in which human workers are at a disadvantage as they experience systematic misrecognition of the value of their competencies and of their contributions to technology, the economy, and ultimately society. This situation hinders negotiations with companies, shifts power away from workers, and challenges the traditional balancing role of the salary institution.
    Keywords: misrecognition,Spanish-speaking countries,Digital labour platforms,artificial intelligence,skills,learning
    Date: 2022
  3. By: Fernando Moreno-Pino; Stefan Zohren
    Abstract: Volatility forecasts play a central role among equity risk measures. Besides traditional statistical models, modern forecasting techniques based on machine learning can readily be employed when treating volatility as a univariate, daily time-series. However, econometric studies have shown that increasing the number of daily observations with high-frequency intraday data helps to improve predictions. In this work, we propose DeepVol, a model based on Dilated Causal Convolutions to forecast day-ahead volatility using high-frequency data. We show that dilated convolutional filters are ideally suited to extract relevant information from intraday financial data, thereby naturally mimicking (via a data-driven approach) the econometric models that incorporate realised measures of volatility into the forecast. This allows us to take advantage of the abundance of intraday observations and to avoid the limitations of models that use daily data, such as model misspecification or manually designed handcrafted features, whose design involves optimising a trade-off between accuracy and computational efficiency and leaves models prone to poor adaptation to changing circumstances. In our analysis, we use two years of intraday data from the NASDAQ-100 to evaluate DeepVol's performance. The reported empirical results suggest that the proposed deep learning-based approach learns global features from high-frequency data, achieving more accurate predictions than traditional methodologies and yielding more appropriate risk measures.
    Date: 2022–09
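As a rough illustration of the dilated causal convolutions at the heart of DeepVol, the sketch below implements a single 1-D dilated causal filter in NumPy. This is not the paper's architecture; the kernel weights and the dilation are illustrative. The point is that the output at time t depends only on x[t], x[t-d], x[t-2d], ..., never on future observations.

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """1-D causal convolution with dilation: the output at t depends only
    on x[t], x[t-d], x[t-2d], ... (the series is left-padded with zeros)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([
        sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

# Stacking layers with dilations 1, 2, 4, ... doubles the receptive field
# at each layer, which is how such models cover long intraday windows cheaply.
x = np.arange(8, dtype=float)
y = dilated_causal_conv(x, w=np.array([1.0, 1.0]), dilation=2)  # y[t] = x[t] + x[t-2]
```

With this kernel, each output is the sum of the current value and the value two steps back, and the first outputs see only zero padding, which is exactly the causality property the abstract emphasises.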
  4. By: Darke, Matthew James (University of Warwick)
    Abstract: Developments in Artificial Intelligence and Machine Learning technologies have had massive implications for labour automation. This paper builds on the task-based methodology first adopted by Frey and Osborne (2013) to predict how the risk of automation evolved in the UK labour market between 2012 and 2017, using data from the UK Skills and Employment Survey. The analysis accounts for technological progress, making use of two sets of experts’ assessments for 70 occupations. The probability of automation is predicted for each individual using a set of self-reported job skills. The paper finds that the proportion of jobs at high risk of automation has risen from 10.6% to 23.4%, and that this is largely due to better technology rather than changing job skill requirements. It also identifies sectors experiencing the greatest increase in automation risk between the two periods and, in contrast, those which appear complementary to technology, drawing on occupational case studies as evidence.
    Keywords: Employment ; Skills Demand ; Technology
    JEL: J01 ; J21 ; J24 ; J62 ; O33
    Date: 2022
  5. By: Aadi Gupta; Priya Gulati; Siddhartha P. Chakrabarty
    Abstract: In this paper, we perform a credit risk analysis on data from past loan applicants of the company Lending Club. The analysis uses exploratory data analysis and machine learning classification algorithms, namely Logistic Regression and the Random Forest Algorithm. We further use the calculated probability of default to design a credit derivative, based on the idea of a Credit Default Swap, to hedge against an event of default. The results on the test set are presented using various performance measures.
    Date: 2022–10
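A minimal sketch of one of the two classifiers the abstract names, logistic regression, fitted by plain gradient descent on synthetic data. The Lending Club data and the authors' actual pipeline are not reproduced here; the two features and all coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for loan-applicant features (e.g. an income score and
# a debt-to-income score); labels follow a known logistic model, 1 = default.
n = 2000
X = rng.normal(size=(n, 2))
true_w, true_b = np.array([-1.5, 2.0]), -0.5
p = 1 / (1 + np.exp(-(X @ true_w + true_b)))
y = rng.binomial(1, p)

# Logistic regression by gradient descent on the average log-loss.
w, b = np.zeros(2), 0.0
for _ in range(500):
    q = 1 / (1 + np.exp(-(X @ w + b)))   # current predicted P(default)
    w -= 0.5 * (X.T @ (q - y)) / n
    b -= 0.5 * (q - y).mean()

pd_hat = 1 / (1 + np.exp(-(X @ w + b)))  # estimated probability of default
acc = ((pd_hat > 0.5) == y).mean()
```

The fitted `pd_hat` is the per-applicant probability of default that, in the paper's setup, feeds the pricing of the credit-default-swap-style derivative.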
  6. By: Heiler, Phillip (Aarhus University); Knaus, Michael C. (University of Tübingen)
    Abstract: Binary treatments are often ex-post aggregates of multiple treatments or can be disaggregated into multiple treatment versions. Thus, effects can be heterogeneous due to either effect or treatment heterogeneity. We propose a decomposition method that uncovers masked heterogeneity, avoids spurious discoveries, and evaluates treatment assignment quality. The estimation and inference procedure based on double/debiased machine learning allows for high-dimensional confounding, many treatments and extreme propensity scores. Our applications suggest that heterogeneous effects of smoking on birthweight are partially due to different smoking intensities and that gender gaps in Job Corps effectiveness are largely explained by differences in vocational training.
    Keywords: causal inference, causal machine learning, double machine learning, heterogeneous treatment effects, overlap, treatment versions
    JEL: C14 C21
    Date: 2022–09
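The double/debiased machine learning machinery the abstract builds on can be illustrated by its simplest special case: Robinson-style partialling-out with linear nuisance models on simulated data. The paper's actual estimator, cross-fitting scheme, and data are not reproduced; the data-generating process below is invented, with a true treatment effect of 2.0.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: the confounder x shifts both treatment d and outcome y.
n = 5000
x = rng.normal(size=n)
d = 0.8 * x + rng.normal(size=n)
y = 2.0 * d + 3.0 * x + rng.normal(size=n)

# Partialling-out: residualise y and d on x with (here, linear) nuisance
# fits, then regress residual on residual. In double ML the two nuisance
# regressions would be flexible ML models combined with cross-fitting.
y_res = y - np.polyval(np.polyfit(x, y, 1), x)
d_res = d - np.polyval(np.polyfit(x, d, 1), x)
theta = (d_res @ y_res) / (d_res @ d_res)   # close to the true effect 2.0

naive = (d @ y) / (d @ d)                   # ignores x, biased upward
```

The contrast between `theta` and `naive` is the basic point: orthogonalising out the confounders is what lets the decomposition in the paper attribute heterogeneity to effects rather than to selection.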
  7. By: Bertrand,Marianne; Crepon,Bruno Jacques Jean Philippe; Marguerie,Alicia Charlene; Premand,Patrick
    Abstract: Workfare programs are one of the most popular social protection and employment policy instruments in the developing world. They evoke the promise of efficient targeting, as well as immediate and lasting impacts on participants’ employment, earnings, skills and behaviors. This paper evaluates contemporaneous and post-program impacts of a public works intervention in Côte d’Ivoire. The program was randomized among urban youths who self-selected to participate and provided seven months of employment at the formal minimum wage. Randomized subsets of beneficiaries also received complementary training on basic entrepreneurship or job search skills. During the program, results show limited impacts on the likelihood of employment, but a shift toward wage jobs, higher earnings and savings, as well as changes in work habits and behaviors. Fifteen months after the program ended, savings stock remain higher, but there are no lasting impacts on employment or behaviors, and only limited impacts on earnings. Machine learning techniques are applied to assess whether program targeting can improve. Significant heterogeneity in impacts on earnings is found during the program but not post-program. Departing from self-targeting improves performance: a range of practical targeting mechanisms achieve impacts close to a machine learning benchmark by maximizing contemporaneous impacts without reducing post-program impacts. Impacts on earnings remain substantially below program costs even under improved targeting.
    Keywords: Labor Policies,Rural Labor Markets,Employment and Unemployment,Labor Markets
    Date: 2021–04–05
  8. By: W. Benedikt Schmal
    Abstract: Collusive practices of firms continue to be a major threat to competition and consumer welfare. Academic research on this topic aims at understanding the economic drivers and behavioral patterns of cartels, among others, to guide competition authorities on how to tackle them. Utilizing topical machine learning techniques in the domain of natural language processing enables me to analyze the publications on this issue over more than 20 years in a novel way. Coming from a stylized oligopoly-game theory focus, researchers recently turned toward empirical case studies of bygone cartels. Uni- and multivariate time series analyses reveal that the latter did not supersede the former but filled a gap the decline in rule-based reasoning has left. Together with a tendency towards monocultures in topics covered and an endogenous constriction of the topic variety, the course of cartel research has changed notably: The variety of subjects included has grown, but the pluralism in economic questions addressed is in decline. It remains to be seen whether this will benefit or harm the cartel detection capabilities of authorities in the future.
    Date: 2022–10
  9. By: Wahab Ibrahim; Ola Hall
    Abstract: The present study uses domain experts to estimate welfare levels and indicators from high-resolution satellite imagery. We use the wealth quintiles from the 2015 Tanzania DHS dataset as ground truth data. We analyse the performance of the visual estimation of relative wealth at the cluster level and compare these with wealth rankings from the DHS survey of 2015 for that country using correlations, ordinal regressions and multinomial logistic regressions. Of the 608 clusters, 115 received the same ratings from human experts and the independent DHS rankings. For 59 percent of the clusters, experts' ratings were slightly lower. On the one hand, significant positive predictors of wealth are the presence of modern roofs and wider roads. For instance, the log odds of receiving a rating in a higher quintile on the wealth rankings is 0.917 points higher on average for clusters with buildings with slate or tile roofing compared to those without. On the other hand, significant negative predictors included poor road coverage, low to medium greenery coverage, and low to medium building density. Other key predictors from the multinomial regression model include settlement structure and farm sizes. These findings are significant to the extent that these correlates of wealth and poverty are visually readable from satellite imagery and can be used to train machine learning models in poverty predictions. Using these features for training will contribute to more transparent ML models and, consequently, explainable AI.
    Date: 2022–10
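For readers less used to the ordinal-regression scale, the reported 0.917 log-odds coefficient for slate or tile roofing converts to an odds ratio in one line (only the 0.917 figure comes from the abstract; the rest is arithmetic):

```python
import math

# A log-odds coefficient of 0.917 means clusters with slate/tile roofs have
# about 2.5 times the odds of a higher wealth-quintile rating, holding the
# model's other predictors fixed.
log_odds = 0.917
odds_ratio = math.exp(log_odds)  # about 2.5
```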
  10. By: Breinlich,Holger; Corradi,Valentina; Rocha,Nadia; Ruta,Michele; Santos Silva,J.M.C.; Zylkin,Tom
    Abstract: Modern trade agreements contain a large number of provisions besides tariff reductions, in areas as diverse as services trade, competition policy, trade-related investment measures, or public procurement. Existing research has struggled with overfitting and severe multicollinearity problems when trying to estimate the effects of these provisions on trade flows. This paper builds on recent developments in the machine learning and variable selection literature to propose novel data-driven methods for selecting the most important provisions and quantifying their impact on trade flows. The proposed methods have the advantage of not requiring ad hoc assumptions on how to aggregate individual provisions and offer improved selection accuracy over the standard lasso. The analysis finds that provisions related to technical barriers to trade, antidumping, trade facilitation, subsidies, and competition policy are associated with enhancing the trade-increasing effect of trade agreements.
    Keywords: International Trade and Trade Rules,De Facto Governments,Economics and Finance of Public Institution Development,State Owned Enterprise Reform,Public Sector Administrative and Civil Service Reform,Democratic Government,Competition Policy,Competitiveness and Competition Policy,Trade Facilitation,Health and Sanitation
    Date: 2021–04–13
  11. By: Azzari,George; Jain,Shruti; Jeffries,Graham; Kilic,Talip; Murray,Siobhan
    Abstract: With the surge in publicly available high-resolution satellite imagery, satellite-based monitoring of smallholder agricultural outcomes is gaining momentum. This paper provides recommendations on how large-scale household surveys should be conducted to generate the data needed to train models for satellite-based crop type mapping in smallholder farming systems. The analysis focuses on maize cultivation in Malawi and Ethiopia, and leverages rich, georeferenced plot-level data from national household surveys that were conducted in 2018–20 and that are integrated with Sentinel-2 satellite imagery and complementary geospatial data. To identify the approach to survey data collection that yields optimal data for training remote sensing models, 26,250 in silico experiments are simulated within a machine learning framework. The best model is then applied to map seasonal maize cultivation from 2016 to 2019 at 10-meter resolution in both countries. The analysis reveals that smallholder plots with maize cultivation can be identified with up to 75 percent accuracy. However, the predictive accuracy varies with the approach to georeferencing plot locations and the number of observations in the training data. Collecting full plot boundaries or complete plot corner points provides the best quality of information for model training. Classification performance peaks with slightly less than 60 percent of the training data. Seemingly small erosion in accuracy under less preferable approaches to georeferencing plots results in total area under maize cultivation being overestimated by 0.16 to 0.47 million hectares (8 to 24 percent) in Malawi.
    Keywords: Food Security,Labor & Employment Law,Climate Change and Agriculture,Crops and Crop Management Systems,Natural Disasters,Trade Facilitation
    Date: 2021–04–01
  12. By: Victor Klockmann (Goethe-University Frankfurt am Main, University of Würzburg = Universität Würzburg , Max Planck Institute for Human Development - Max-Planck-Gesellschaft); Alicia von Schenk (Goethe-University Frankfurt am Main, University of Würzburg = Universität Würzburg , Max Planck Institute for Human Development - Max-Planck-Gesellschaft); Marie Claire Villeval (GATE Lyon Saint-Étienne - Groupe d'analyse et de théorie économique - ENS Lyon - École normale supérieure - Lyon - UL2 - Université Lumière - Lyon 2 - UCBL - Université Claude Bernard Lyon 1 - Université de Lyon - UJM - Université Jean Monnet [Saint-Étienne] - Université de Lyon - CNRS - Centre National de la Recherche Scientifique)
    Abstract: In the future, artificially intelligent algorithms will make more and more decisions on behalf of humans that involve humans' social preferences. They can learn these preferences through the repeated observation of human behavior in social encounters. In such a context, do individuals adjust the selfishness or prosociality of their behavior when it is common knowledge that their actions produce various externalities through the training of an algorithm? In an online experiment, we let participants' choices in dictator games train an algorithm. Thereby, they create an externality on future decision making of an intelligent system that affects future participants. We show that individuals who are aware of the consequences of their training on the payoffs of a future generation behave more prosocially, but only when they bear the risk of being harmed themselves by future algorithmic choices. In that case, the externality of artificial intelligence training increases the share of egalitarian decisions in the present.
    Keywords: Artificial Intelligence,Morality,Prosociality,Generations,Externalities
    Date: 2022
  13. By: Kyogo Kanazawa (Faculty of Economics, The University of Tokyo); Daiji Kawaguchi (Faculty of Economics, The University of Tokyo, RIETI and IZA); Hitoshi Shigeoka (Faculty of Economics, The University of Tokyo, Simon Fraser University, IZA, and NBER); Yasutora Watanabe (Faculty of Economics, The University of Tokyo)
    Abstract: We examine the impact of Artificial Intelligence (AI) on productivity in the context of taxi drivers. The AI we study assists drivers with finding customers by suggesting routes along which the demand is predicted to be high. We find that AI improves drivers' productivity by shortening the cruising time, and such gain is accrued only to low-skilled drivers, narrowing the productivity gap between high- and low-skilled drivers by 14%. The result indicates that AI's impact on human labor is more nuanced and complex than a job displacement story, which was the primary focus of existing studies.
    Date: 2022–10
  14. By: Braley,Alia Anne; Fraiberger,Samuel Paul; Tas,Emcet Oktay
    Abstract: Evaluating service delivery needs in data-poor environments presents a particularly difficult problem for policymakers. The places where the need for social services is most acute are often the very same places where assessing policy interventions is the most challenging. This paper uses Twitter data to gain insights into service delivery needs in a data-poor environment. Specifically, it examines the development priorities of citizens in the northwestern region of Pakistan between 2007 and 2020, using natural language processing (NLP) techniques and sentiment analysis of 9.5 million tweets generated by 20,000 unique Twitter users. The analysis reveals that service delivery priorities in this context are centered on access to education, healthcare, food, and clean water. The findings provide baseline data for future on-the-ground research and development initiatives. In addition, the methodology used in this paper demonstrates both current resources and areas in need of future work in the use of NLP techniques in analyzing social media data in other contexts.
    Keywords: ICT Applications,Hydrology,Food Security,Nutrition,Educational Sciences,Information Technology
    Date: 2021–03–10
  15. By: Julia Cagé (ECON - Département d'économie (Sciences Po) - Sciences Po - Sciences Po - CNRS - Centre National de la Recherche Scientifique, CEPR - Center for Economic Policy Research - CEPR); Nicolas Hervé (INA - Institut National de l'Audiovisuel); Béatrice Mazoyer (Médialab - Médialab (Sciences Po) - Sciences Po - Sciences Po)
    Abstract: Social media affects not only the way we consume news, but also the way news is produced, including by traditional media outlets. In this paper, we study the propagation of information from social media to mainstream media, and investigate whether news editors' editorial decisions are influenced by the popularity of news stories on social media. To do so, we build a novel dataset including a representative sample of all the tweets produced in French between August 1st 2018 and July 31st 2019 (1.8 billion tweets, around 70% of all tweets in French) and the content published online by 200 mainstream media outlets. We then develop novel algorithms to identify and link events on social and mainstream media. To isolate the causal impact of popularity, we rely on the structure of the Twitter network and propose a new instrument based on the interaction between measures of user centrality and "social media news pressure" at the time of the event. We show that story popularity has a positive effect on media coverage, and that this effect varies depending on the media outlets' characteristics, in particular on whether they use a paywall. Finally, we investigate consumers' reaction to a surge in social media popularity. Our findings shed new light on our understanding of how editors decide on the coverage for stories, and question the welfare effects of social media.
    Keywords: Internet,Information spreading,News editors,Network analysis,Social media,Twitter,Text analysis
    Date: 2022–05–31
  16. By: Bertsch, Christoph (Research Department, Central Bank of Sweden); Hull, Isaiah (Finance Department, BI Norwegian Business School); Lumsdaine, Robin L. (Kogod School of Business, American University; Erasmus University Rotterdam; National Bureau of Economic Research (NBER); Tinbergen Institute; Center for Financial Stability); Zhang, Xin (BIS Innovation Hub Nordic Centre)
    Abstract: When does the Federal Reserve deviate from its dual mandate of pursuing the economic goals of maximum employment and price stability and what are the consequences? We assemble the most comprehensive collection of Federal Reserve speeches to date and apply state-of-the-art natural language processing methods to extract a variety of textual features from each paragraph of each speech. We find that the periodic emergence of non-dual mandate related discussions is an important determinant of time-variations in the historical conduct of monetary policy with implications for asset returns. The period from mid-1996 to late 2010 stands out as the time with the narrowest focus on balancing the dual mandate. Prior to the 1980s there was outsized attention to employment and output growth considerations, while non-dual-mandate discussions centered around financial stability considerations emerged after the Great Financial Crisis. Forward-looking financial stability concerns are a particularly important driver of a less accommodative monetary policy stance when Fed officials link these concerns to monetary policy, rather than changes in banking regulation. Conversely, discussions about current financial crises and monetary policy in the context of inflation-employment themes are associated with a more accommodative policy stance.
    Keywords: Natural Language Processing; Machine Learning; Central Bank Communication; Financial Stability; Zero Shot Classification; Extractive Question Answering; Semantic Textual Similarity
    JEL: C63 D84 E32 E70
    Date: 2022–10–01
  17. By: Francisco J. Beltran Tapia (Norwegian University of Science and Technology); Alfonso Diez Minguela (Universitat de Valencia); Victor Fernandez Modrego (Universitat de Valencia); Alicia Gomez Tello (Universitat de Valencia); Julio Martinez-Galarraga (Universitat de Barcelona); Daniel A. Tirado Fabregat (Universitat de Valencia)
    Abstract: This document presents ESPAREL (“España, del Antiguo Régimen al Estado Liberal”), a project in the field of digital humanities. The main objective of ESPAREL has been to generate a spatial data infrastructure (SDI) that allows linking the territorial structure of the Ancien Régime with that of the Liberal State at the end of the 19th century and with the current one, linking the existing population entities in (1) the Census of 1787 (CP1787), (2) the Nomenclator of Spain of 1887 (NE1887) and (3) the Basic General Nomenclator of Spain (NGBE). Firstly, the NE1887 (106,491 population entities) was digitised and converted into data format using optical character recognition (OCR) techniques and machine learning algorithm programming. The main entities of the NE1887 were then linked to the existing entities (NGBE), and given that the NGBE includes the geographical coordinates of the entities, this made it possible to geolocate the NE1887, opening the door to its processing by means of Geographical Information Systems (GIS). Once this work had been carried out, CP1787 (20,236 entities organised into towns, villages, places, hamlets, etc.) was linked to this database. The results of this project, which can be consulted openly on the ESPAREL platform, will allow progress to be made in a number of areas of historical research. These include the study of changes in settlement patterns over time and the depopulation that has taken place in a significant part of Spain. By way of example, the second part of the text presents a case study, based on the Comunitat Valenciana, which, by going beyond the municipalities, shows the possibilities offered by ESPAREL to improve our knowledge of the origins of depopulation, with a level of territorial detail not achieved until now.
    Keywords: digital humanities, population entities, nomenclator, census of 1787
    JEL: C8 J1 H1 N9 O1 R1
    Date: 2022–10
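Linking the 1887 nomenclator entries to modern NGBE entities is, at its core, a record-linkage problem. Here is a toy fuzzy-matching sketch in the spirit of that step; the place names and the 0.8 similarity cutoff are invented, and ESPAREL's actual matching algorithms are not reproduced.

```python
import difflib

# Hypothetical 1887 spellings (often lacking accents) and a modern gazetteer.
ne1887 = ["Villanueva de la Serena", "Sant Cugat del Valles", "Alcala de Henares"]
ngbe = ["Alcalá de Henares", "Sant Cugat del Vallès", "Villanueva de la Serena", "Madrid"]

def link(name, candidates, cutoff=0.8):
    """Return the best candidate above the similarity cutoff, else None."""
    match = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return match[0] if match else None

links = {name: link(name, ngbe) for name in ne1887}
```

Accent and spelling drift between historical and modern names is exactly what a similarity threshold absorbs; unmatched names (`None`) would go to manual review.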
  18. By: G. A. Nigmatulin; O. B. Chaganova
    Abstract: The steadily high demand for cash contributes to the expansion of the network of bank payment terminals. To optimize the amount of cash in payment terminals, it is necessary to minimize the cost of servicing them and ensure that there are no excess funds in the network. The purpose of this work is to create a cash management system in the network of payment terminals. The article discusses the solution to the problem of determining the optimal amount of funds to be loaded into the terminals, and the collection frequency that makes it possible to earn additional income by investing the released funds. The paper presents the results of predicting daily cash withdrawals at ATMs using a triple exponential smoothing model, a recurrent neural network with long short-term memory, and a model of singular spectrum analysis. These forecasting models allowed us to obtain a sufficient level of correct forecasts with good accuracy and completeness. The results of forecasting cash withdrawals were used to build a discrete optimal control model, which was used to develop an optimal schedule for adding funds to the payment terminal. It is shown that the efficiency and reliability of the proposed model are higher than those of the classical Baumol-Tobin inventory management model: when tested on the time series of three ATMs, the discrete optimal control model never exhausted its funds and earned on average 30% more than the classical model.
    Date: 2022–10
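A minimal sketch of the first forecasting model the abstract names, triple (Holt-Winters) exponential smoothing, applied to a daily withdrawal series. This is not the paper's implementation; the smoothing parameters, season length, and forecast horizon are illustrative defaults.

```python
def holt_winters_additive(series, season_len, alpha=0.3, beta=0.1, gamma=0.2, horizon=7):
    """Additive Holt-Winters: maintains level, trend, and seasonal components."""
    # Initialise level, trend, and seasonal terms from the first two seasons.
    level = sum(series[:season_len]) / season_len
    trend = (sum(series[season_len:2 * season_len]) -
             sum(series[:season_len])) / season_len ** 2
    seasonal = [x - level for x in series[:season_len]]

    for t, x in enumerate(series):
        s = seasonal[t % season_len]
        last_level = level
        level = alpha * (x - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (x - level) + (1 - gamma) * s

    # Forecast `horizon` steps ahead from the end of the series.
    n = len(series)
    return [level + (h + 1) * trend + seasonal[(n + h) % season_len]
            for h in range(horizon)]
```

On a withdrawal series with a stable weekly pattern, the forecast reproduces the pattern shifted by the estimated trend; in practice one would tune alpha, beta, and gamma on held-out data before comparing against the LSTM and singular spectrum analysis alternatives.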
  19. By: Alejandro Rodriguez Dominguez; David Stynes
    Abstract: We present a geometric version of Quickest Change Detection (QCD) and Quickest Hub Discovery (QHD) tests in correlation structures that allows us to include and combine new information with distance metrics. The topic falls within the scope of sequential, nonparametric, high-dimensional QCD and QHD, where state-of-the-art settings have developed global and local summary statistics from asymptotic Random Matrix Theory (RMT) to detect changes in random matrix law. These settings work only for uncorrelated pre-change variables. With our geometric version of the tests via clustering, we test the hypothesis that state-of-the-art settings for QHD can be improved by combining QCD and QHD simultaneously and by including information about the pre-change time-evolution of correlations. We can work with correlated pre-change variables and test whether the time-evolution of correlations improves performance. We prove test consistency and design test hypotheses based on clustering performance. We apply this solution to financial time series correlations. Future developments on this topic are highly relevant in finance for risk management, portfolio management, and market shock forecasting, where they can save billions of dollars for the global economy. We introduce the Diversification Measure Distribution (DMD) for modeling the time-evolution of correlations as a function of individual variables, which consists of a Dirichlet-Multinomial distribution built from a distance matrix of rolling correlations with a threshold. Finally, we are able to verify all these hypotheses.
    Date: 2022–10
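One building block of the abstract, a thresholded distance matrix of rolling correlations, can be sketched as follows. This is an illustrative reading of the construction, not the paper's code: correlations are mapped to the standard distance d = sqrt(2(1 - rho)), and for each variable we count how many others fall within a threshold, producing the per-variable counts that a Dirichlet-Multinomial model could be fit to. Window and threshold values are assumptions.

```python
import numpy as np

def rolling_corr_neighbor_counts(returns, window=60, threshold=0.5):
    """For each rolling window over a (T, n) return matrix, count for each
    variable how many others lie within `threshold` in correlation distance
    d = sqrt(2 * (1 - rho))."""
    T, n = returns.shape
    counts = []
    for start in range(0, T - window + 1):
        corr = np.corrcoef(returns[start:start + window].T)
        # Clip to avoid tiny negative arguments from floating-point error.
        dist = np.sqrt(np.maximum(2.0 * (1.0 - corr), 0.0))
        np.fill_diagonal(dist, np.inf)          # exclude self-distance
        counts.append((dist < threshold).sum(axis=1))
    return np.array(counts)                     # shape: (T - window + 1, n)
```

Two perfectly correlated series sit at distance 0 and so count each other as neighbours, while an independent series sits near sqrt(2) and is excluded at any moderate threshold.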
  20. By: Kiana Asgari; Aida Afshar Mohammadian; Mojtaba Tefagh
    Abstract: In recent years, the development of easy-to-use learning environments has greatly accelerated the implementation and reproducible benchmarking of reinforcement learning algorithms. In this article, we introduce the Dynamic Fee learning Environment (DyFEn), an open-source model of a real-world financial network that provides a testbed for evaluating different reinforcement learning techniques. To illustrate the promise of DyFEn, we present a challenging problem: simultaneous multi-channel dynamic fee setting for off-chain payment channels. This problem is well known in the Bitcoin Lightning Network and has no effective solutions. Specifically, we report the empirical results of several commonly used deep reinforcement learning methods on this dynamic fee setting task as a baseline for further experiments. To the best of our knowledge, this work proposes the first virtual learning environment based on a simulation of blockchain and distributed ledger technologies, unlike many others which are based on physics simulations or game platforms.
    Date: 2022–10
  21. By: Valerio Astuti (Banca d'Italia); Alessio Ciarlone (Banca d'Italia); Alberto Coco (Banca d'Italia)
    Abstract: In this paper, we analyze whether central bank communication can be an additional tool to provide guidance on monetary policy, drive private agents’ inflation expectations and financial asset prices in the main countries of Central and Eastern Europe. By applying natural language processing techniques to monetary policy statements and minutes, we first derive a series of salient topics on which central bank communications focused over the last two decades, and then develop indices of tone to gauge their respective degrees of hawkishness (dovishness) about the economic outlook. By using these indices in an econometric set-up, we find that a more hawkish (dovish) tone – reflecting a more positive (negative) assessment of the economic outlook – anticipates a more restrictive (accommodative) monetary policy decision, raises (lowers) short-term inflation expectations of private sector agents, increases (reduces) market interest rates across different maturities, and drives share prices down (up). Overall, our analysis suggests that communication may be a complementary and effective monetary policy tool available to central banks in emerging economies.
    Keywords: central banks, communication, natural language processing, Taylor rule, inflation expectations, financial markets, CEE-3
    JEL: C22 C25 C45 E44 E52 E58
    Date: 2022–10
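The tone indices described in the abstract can be illustrated with a minimal dictionary-based sketch: count hawkish and dovish terms in a statement and normalise their difference. The word lists below are illustrative placeholders, not the paper's lexicon, and real implementations typically handle negation and multi-word phrases.

```python
# Illustrative word lists only -- not the lexicon used in the paper.
HAWKISH = {"inflation", "overheating", "tighten", "tightening", "hike", "restrictive"}
DOVISH = {"easing", "accommodative", "stimulus", "slowdown", "cut", "downside"}

def tone_index(text):
    """Return a tone score in [-1, 1]: +1 fully hawkish, -1 fully dovish."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    h = sum(w in HAWKISH for w in words)
    d = sum(w in DOVISH for w in words)
    return 0.0 if h + d == 0 else (h - d) / (h + d)
```

A statement such as "Rising inflation calls for a restrictive tightening stance." scores +1, while one dominated by easing and stimulus language scores -1; time series of such scores are what the econometric step of the paper relates to policy decisions, expectations, and asset prices.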
  22. By: Asher,Sam; Lunt,Tobias; Matsuura,Ryu; Novosad,Paul Michael
    Abstract: The SHRUG is an open data platform describing multidimensional socioeconomic development across 600,000 villages and towns in India. This paper presents three illustrative analyses only possible with high-resolution data. First, it confirms that nighttime lights are highly significant proxies for population, employment, per-capita consumption, and electrification at very local levels. However, elasticities between night lights and these variables are far lower in time series than in cross section, and vary widely across context and level of aggregation. Next, this study shows that the distribution of manufacturing employment across villages follows a power law: the majority of rural Indians have considerably less access to manufacturing employment than is suggested by aggregate data. Third, a poverty mapping exercise explores local heterogeneity in living standards and estimates the potential targeting improvement from allocating programs at the village rather than the district level. The SHRUG can serve as a model for open high-resolution data in developing countries.
    Keywords: Energy Policies & Economics, Business Cycles and Stabilization Policies, General Manufacturing, Plastics & Rubber Industry, Pulp & Paper Industry, Textiles, Apparel & Leather Industry, Construction Industry, Common Carriers Industry, Food & Beverage Industry, Inequality, ICT Policy and Strategies, ICT Legal and Regulatory Framework
    Date: 2021–02–09
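The elasticities the abstract refers to are conventionally estimated by regressing log outcomes on log night-light intensity, so that the slope is the elasticity. A minimal ordinary-least-squares sketch under that standard setup (variable names and data are illustrative, not SHRUG's):

```python
import math

def log_log_elasticity(lights, outcome):
    """OLS slope of log(outcome) on log(lights); the slope is the elasticity."""
    x = [math.log(v) for v in lights]
    y = [math.log(v) for v in outcome]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Slope = cov(x, y) / var(x).
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) /
            sum((xi - mx) ** 2 for xi in x))
```

Run separately on cross-sectional and within-unit (time-series) variation, the same estimator yields the differing elasticities the paper documents; if outcome = c * lights^b exactly, the function recovers b.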
  23. By: Etienne ESPAGNE; Yen Boi HA; Kenneth HOUNGBEDJI; Thanh NGO-DUC
    Abstract: This paper investigates the effect of typhoons on economic activity in Vietnam. During the period covered by our analysis, 1992-2013, we observe 63 typhoons affecting different locations of the country in different years with varying intensity. Using the intensity of nightlight from satellite imagery as a proxy for the level of economic activity, we study how nighttime light brightness varies across locations that were differently affected by tropical cyclones. The results suggest that typhoons have on average dimmed the nighttime luminosity of the places hit by 5 ± 5.8% or 8 ± 7.8%, depending on the specification used.
    Keywords: Vietnam
    JEL: Q
    Date: 2022–10–13
  24. By: Roberts,Mark
    Abstract: Over the past decade, nighttime lights have become a widely used proxy for measuring economic activity. This paper examines the potential for high frequency nighttime lights data to provide “near real-time” tracking of the economic impacts of the COVID-19 crisis in Morocco. At the national level, there exists a strong correlation between quarterly movements in Morocco’s overall nighttime light intensity and movements in its real GDP. This finding supports the use of lights data to track the economic impacts of the COVID-19 crisis at higher temporal frequencies and at the subnational level, for which GDP data are unavailable. Consistent with large economic impacts of the crisis, Morocco experienced a large drop in the overall intensity of its lights in March 2020, from which it has subsequently struggled to recover, following the country’s first COVID-19 case and the introduction of strict lockdown measures. At the subnational level, while all regions shared in March’s national decline in nighttime light intensity, Rabat – Salé – Kénitra, Tanger – Tetouan – Al Hoceima, and Fès – Meknès suffered much larger declines than others. Since then, the relative effects of the COVID-19 shock across regions have largely persisted. Overall, the results suggest that, at least for Morocco, changes in nighttime lights can help to detect the timing of changes in the direction of real GDP, but caution is needed in using lights data to derive precise quantitative estimates of changes in real GDP.
    Keywords: Disaster Management, Social Risk Management, Hazard Risk Management, Industrial Economics, Economic Theory & Research, Economic Growth, Food Security, Inequality
    Date: 2021–02–04
  25. By: Beyer,Robert Carl Michael; Jain,Tarun; Sinha,Sonalika
    Abstract: This paper estimates the impact of a differential relaxation of COVID-19 containment policies on aggregate economic activity in India. Following a uniform national lockdown, the Government of India classified all districts into three zones with varying containment measures in May 2020. Using a difference-in-differences approach, the paper estimates the impact of these restrictions on nighttime light intensity, a standard high-frequency proxy for economic activity. To conduct this analysis, pandemic-era, district-level data from a range of novel sources are combined -- monthly nighttime lights from global satellites; Facebook’s mobility data from individual smartphone locations; and high-frequency, household-level survey data on income and consumption, supplemented with data from the Indian Census and the Reserve Bank of India. The analysis finds that nighttime light intensity in May was 12.4 percent lower for districts with the most severe restrictions and 1.7 percent lower for districts with intermediate restrictions, compared with districts with the least restrictions. The differences were largest in May, when the different policies were in place, and slowly tapered in June and July. Restricted mobility and lower household income are plausible channels for these results. Stricter containment measures had larger impacts in districts with greater population density of older residents, as well as more services employment and bank credit.
    Keywords: Public Health Promotion,Transport Services,Health Care Services Industry
    Date: 2020–11–30
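The difference-in-differences design the abstract describes reduces, in its simplest 2x2 form, to comparing pre/post changes in treated versus control groups. A minimal sketch of that cell-means estimator (the data below are invented for illustration, not the paper's district figures):

```python
def did_estimate(data):
    """data: iterable of (group, period, outcome) tuples with group in
    {'treated', 'control'} and period in {'pre', 'post'}.
    Returns the 2x2 difference-in-differences estimate from cell means."""
    cells = {}
    for group, period, outcome in data:
        cells.setdefault((group, period), []).append(outcome)
    mean = lambda values: sum(values) / len(values)
    return ((mean(cells[("treated", "post")]) - mean(cells[("treated", "pre")]))
            - (mean(cells[("control", "post")]) - mean(cells[("control", "pre")])))
```

With district-month nightlight intensities as outcomes, a negative estimate corresponds to the paper's finding that stricter containment zones saw larger declines relative to the least-restricted zones; the paper's regression version additionally absorbs district and time fixed effects.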

This nep-big issue is ©2022 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at . For comments, please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject line; otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.