nep-cmp New Economics Papers
on Computational Economics
Issue of 2021‒04‒26
twenty-one papers chosen by

  1. Estimating complex ecological variables at high resolution in heterogeneous terrain using multivariate matching algorithms By Renne, Rachel; Schlaepfer, Daniel; Palmquist, Kyle; Lauenroth, William; Bradford, John
  2. Aiding Long-Term Investment Decisions with XGBoost Machine Learning Model By Ekaterina Zolotareva
  3. Accuracies of Model Risks in Finance using Machine Learning By Berthine Nyunga Mpinda; Jules Sadefo Kamdem; Salomey Osei; Jeremiah Fadugba
  4. Accuracies of some Learning or Scoring Models for Credit Risk Measurement By Salomey Osei; Berthine Nyunga Mpinda; Jules Sadefo Kamdem; Jeremiah Fadugba
  5. Interpretability in deep learning for finance: a case study for the Heston model By Damiano Brigo; Xiaoshan Huang; Andrea Pallavicini; Haitz Saez de Ocariz Borde
  6. “Nowcasting and forecasting GDP growth with machine-learning sentiment indicators” By Oscar Claveria; Enric Monte; Salvador Torra
  7. Advances in the Agent-Based Modeling of Economic and Social Behavior By Steinbacher, Mitja; Raddant, Matthias; Karimi, Fariba; Camacho-Cuena, Eva; Alfarano, Simone; Iori, Giulia; Lux, Thomas
  8. By Denis Shibitov; Mariam Mamedli
  9. Automatic Double Machine Learning for Continuous Treatment Effects By Sylvia Klosin
  10. Predicting Inflation with Neural Networks By Paranhos, Livia
  11. Deep Reinforcement Learning in a Monetary Model By Mingli Chen; Andreas Joseph; Michael Kumhof; Xinlei Pan; Rui Shi; Xuan Zhou
  12. CATE meets ML - The Conditional Average Treatment Effect and Machine Learning By Daniel Jacob
  13. Assessing the Impact of COVID-19 on Trade: a Machine Learning Counterfactual Analysis By Dueñas, Marco; Ortiz, Víctor; Riccaboni, Massimo; Serti, Francesco
  14. A Machine Learning Approach to Analyze and Support Anti-Corruption Policy By Elliott Ash; Sergio Galletta; Tommaso Giommoni
  15. Combining microsimulation and optimization to identify optimal flexible tax-transfer rule By Ugo Colombino; Nizamul Islam
  16. Neural basis expansion analysis with exogenous variables: Forecasting electricity prices with NBEATSx By Kin G. Olivares; Cristian Challu; Grzegorz Marcjasz; Rafal Weron; Artur Dubrawski
  17. Calibrating an adaptive Farmer-Joshi agent-based model for financial markets By Ivan Jericevich; Murray McKechnie; Tim Gebbie
  18. A Structural Model of a Multitasking Salesforce: Multidimensional Incentives and Plan Design By Minkyung Kim; K. Sudhir; Kosuke Uetake
  19. The decision to enrol in higher education By Hügle, Dominik
  20. Applications of Machine Learning in Mental Healthcare By Davcheva, Elena
  21. Micro-Estimates of Wealth for all Low- and Middle-Income Countries By Guanghua Chi; Han Fang; Sourav Chatterjee; Joshua E. Blumenstock

  1. By: Renne, Rachel; Schlaepfer, Daniel; Palmquist, Kyle; Lauenroth, William; Bradford, John
    Abstract: 1. Simulation models are valuable tools for estimating ecosystem structure and function under various climatic and environmental conditions and disturbance regimes, and are particularly relevant for investigating the potential impacts of climate change on ecosystems. However, because computational requirements can restrict the number of feasible simulations, they are often run at coarse scales or for representative points. These results can be difficult to use in decision-making, particularly in topographically complex regions. 2. We present methods for interpolating multivariate and time-series simulation output to high-resolution maps. First, we developed a method for applying k-means clustering to optimize the selection of simulation sites so as to maximize the area represented for a given number of simulations. Then, we used multivariate matching to interpolate simulation results to high-resolution maps for the represented area. The methods rely on a user-defined set of matching variables that are assigned weights such that matched sites will be within a prescribed range for each variable. We demonstrate the methods with case studies using an individual-based plant simulation model to illustrate site selection and an ecosystem water balance simulation model for interpolation. 3. For the site-selection case study, our approach optimized the location of 200 simulation sites and accurately represented 96% of a large study area (1.12 x 10^6 km^2) at a 30-arcsecond resolution. For the interpolation case study, we generated high-resolution (30-arcsecond) maps across 4.38 x 10^6 km^2 of drylands in western North America from simulated sites representing a 10 x 10 km grid. Our estimates of interpolation errors using leave-one-out cross-validation were low (<10% of the range of each variable). 4. Our point selection and interpolation methods provide a means of generating high-resolution maps of complex simulation output (e.g., multivariate and time-series) at scales relevant for local conservation planning and can help resolve the effects of topography that are lost in simulations at coarse scales or for representative points. These methods are flexible and allow the user to identify relevant matching criteria for an area of interest to balance quality of matching with areal coverage to enhance inference and decision-making in heterogeneous terrain.
    Date: 2021–04–16
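The site-selection step described above pairs k-means clustering with multivariate matching. As a rough illustration of the clustering idea only (not the authors' code; the `kmeans` helper and the toy coordinates below are invented for this sketch), a minimal pure-Python version might look like:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: returns k centroids representing the input points.

    Each centroid can stand in for one 'representative simulation site'
    chosen to cover part of the study area.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # assign each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        # move each centroid to the mean of its assigned points
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = tuple(sum(xs) / len(xs) for xs in zip(*members))
    return centroids

# two well-separated groups of candidate sites (toy coordinates)
sites = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9)]
centers = kmeans(sites, k=2)
```

In the paper's setting the points would be grid cells described by the matching variables, and the number of clusters would equal the simulation budget.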
  2. By: Ekaterina Zolotareva
    Abstract: The ability to identify stock market trends has obvious advantages for investors. Buying stock on an upward trend (as well as selling it in case of downward movement) results in profit. Accordingly, the start and end points of the trend are the optimal points for entering and leaving the market. The research concentrates on recognizing long-term upward and downward stock market trends. The key results are obtained with the use of gradient boosting algorithms, XGBoost in particular. The raw data is represented by time series of basic stock market quotes, with periods labelled by experts as Trend or Flat. Features are then obtained via various data transformations, aiming to capture implicit factors resulting in a change of stock direction. Modelling is done in two stages: stage one detects the endpoints of tendencies (i.e., sliding windows), and stage two recognizes the tendency itself inside the window. The research addresses issues such as imbalanced datasets and contradicting labels, as well as the need for specific quality metrics to ensure practical applicability. The model can be used to design an investment strategy, though further research in feature engineering and fine calibration is required. This paper is the full text of the research presented at the 20th International Conference on Artificial Intelligence and Soft Computing (ICAISC 2021).
    Date: 2021–04
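The two-stage design above, sliding windows first and trend classification inside each window, rests on window-level feature engineering. A minimal sketch of that first step (the `window_features` helper and its three features are hypothetical stand-ins, not the paper's actual feature set):

```python
def window_features(prices, width):
    """Slide a window over closing prices and derive simple trend features.

    Hypothetical features (total return, normalized range, up-move ratio)
    stand in for the engineered inputs a boosted-tree classifier would see.
    """
    feats = []
    for i in range(len(prices) - width + 1):
        w = prices[i:i + width]
        ret = (w[-1] - w[0]) / w[0]                 # total return over the window
        rng = (max(w) - min(w)) / w[0]              # normalized high-low range
        ups = sum(b > a for a, b in zip(w, w[1:]))  # number of up-moves
        feats.append((ret, rng, ups / (width - 1)))
    return feats

feats = window_features([100, 101, 103, 102, 105, 107], width=4)
```

Each feature row would then be labelled (Trend or Flat) and fed to a classifier such as XGBoost.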
  3. By: Berthine Nyunga Mpinda; Jules Sadefo Kamdem (MRE - Montpellier Recherche en Economie - UM - Université de Montpellier); Salomey Osei; Jeremiah Fadugba
    Abstract: There is increasing interest in using Artificial Intelligence (AI) and machine learning techniques to enhance risk management, from credit risk to operational risk. Moreover, recent applications of machine learning models in risk management have proved efficient. That notwithstanding, while machine learning techniques can have considerable benefits, they can also introduce risks of their own when the models are wrong. Therefore, machine learning models must be tested and validated before they can be used. The aim of this work is to explore some existing machine learning models for operational risk by comparing their accuracies. Because a model should add value and reduce risk, particular attention is paid to how to evaluate its performance, robustness and limitations. After applying existing machine learning and deep learning methods to operational risk, particularly the risk of fraud, we compared the accuracies of these models based on the following metrics: accuracy, F1-score, AUROC and precision. We also used quantitative validation such as back-testing and stress-testing to analyze the performance of the models on historical data, and their sensitivity to extreme but plausible scenarios such as the Covid-19 period. Our results show that logistic regression outperforms all deep learning models considered for fraud detection.
    Keywords: Machine Learning, Model Risk, Credit Card Fraud, Decision Support, Stress-Testing
    Date: 2021–04–07
  4. By: Salomey Osei (AMMI - African Masters of Machine Intelligence); Berthine Nyunga Mpinda (AMMI - African Masters of Machine Intelligence); Jules Sadefo Kamdem (MRE - Montpellier Recherche en Economie - UM - Université de Montpellier); Jeremiah Fadugba (AMMI - African Masters of Machine Intelligence)
    Abstract: Given the role played by banks in the financial system, their risks are subject to regulatory attention, and credit risk is one of the major financial risks faced by banks. Under Basel I to III, banks are responsible for implementing a credit risk strategy. Nowadays, machine learning techniques have attracted considerable interest for applications in financial institutions, from investors and researchers alike. Hence, in this paper we discuss the existing literature, shedding light on a number of techniques, and examine machine learning models for credit risk, focusing on the Multi-Layer Perceptron (MLP) and Convolutional Neural Networks (CNN). Different performance tests of these models, such as back-testing and stress-testing, were carried out using Home Credit historical data and simulated data, respectively. We find that the MLP and CNN models predict well, with back-testing accuracies of 91% and 67% respectively. To test our models in stress and extreme scenarios, we consider generated imbalanced data with 80% defaults and 20% non-defaults. Using the same models trained on Home Credit data, we perform a stress-test on the simulated data and find that the MLP model did not perform well compared to the CNN model, with an accuracy of 43% as against the 89% obtained during training. Thus, the CNN model performed better in stressed situations on accuracy as well as other metrics such as ROC AUC, recall, and precision.
    Keywords: Model Accuracy, Machine Learning, Credit Risk, Basel III, Risk Management
    Date: 2021–03
  5. By: Damiano Brigo; Xiaoshan Huang; Andrea Pallavicini; Haitz Saez de Ocariz Borde
    Abstract: Deep learning is a powerful tool whose applications in quantitative finance are growing every day. Yet, artificial neural networks behave as black boxes and this hinders validation and accountability processes. Being able to interpret the inner functioning and the input-output relationship of these networks has become key for the acceptance of such tools. In this paper we focus on the calibration process of a stochastic volatility model, a subject recently tackled by deep learning algorithms. We analyze the Heston model in particular, as this model's properties are well known, resulting in an ideal benchmark case. We investigate the capability of local strategies and global strategies coming from cooperative game theory to explain the trained neural networks, and we find that global strategies such as Shapley values can be effectively used in practice. Our analysis also highlights that Shapley values may help choose the network architecture, as we find that fully-connected neural networks perform better than convolutional neural networks in predicting and interpreting the relationship between Heston model prices and parameters.
    Date: 2021–04
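Shapley values, the cooperative-game attribution method the abstract refers to, average a feature's marginal contribution over all coalitions of the other features. A toy exact computation, assuming a two-feature additive "model" invented for illustration (real applications would use a sampling-based library on the trained network rather than this brute-force loop):

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values for a coalition value function.

    `value` plays the role of the model's output on a subset of features;
    the Shapley value of each feature is its average marginal contribution.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                # weight of coalitions of this size in the Shapley formula
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (value(set(S) | {p}) - value(set(S)))
        phi[p] = total
    return phi

# toy additive 'model': output is 3*a + 1*b, so attributions should be 3 and 1
v = lambda S: 3.0 * ('a' in S) + 1.0 * ('b' in S)
phi = shapley_values(['a', 'b'], v)
```

The exact formula is exponential in the number of features, which is why practical tools approximate it by sampling.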
  6. By: Oscar Claveria (AQR-IREA, University of Barcelona); Enric Monte (Polytechnic University of Catalunya); Salvador Torra (Riskcenter-IREA, University of Barcelona)
    Abstract: We apply the two-step machine-learning method proposed by Claveria et al. (2021) to generate country-specific sentiment indicators that provide estimates of year-on-year GDP growth rates. In the first step, by means of genetic programming, business and consumer expectations are evolved to derive sentiment indicators for 19 European economies. In the second step, the sentiment indicators are iteratively re-computed and combined each period to forecast yearly growth rates. To assess the performance of the proposed approach, we design two out-of-sample experiments: a nowcasting exercise in which we recursively generate estimates of GDP at the end of each quarter using the latest survey data available, and an iterative forecasting exercise for different forecast horizons. We find that forecasts generated with the sentiment indicators outperform those obtained with time series models. These results show the potential of the methodology as a predictive tool.
    Keywords: Forecasting, Economic growth, Business and consumer expectations, Symbolic regression, Evolutionary algorithms, Genetic programming
    JEL: C51 C55 C63 C83 C93
    Date: 2021–02
  7. By: Steinbacher, Mitja; Raddant, Matthias; Karimi, Fariba; Camacho-Cuena, Eva; Alfarano, Simone; Iori, Giulia; Lux, Thomas
    Abstract: In this survey we discuss advances in the agent-based modeling of economic and social systems. We present the state of the art in the heuristic design of agents and the connections to results from laboratory experiments on agent behavior. We further discuss how large-scale social and economic systems can be modeled and highlight novel methods and data sources. Finally, we present an overview of estimation techniques used to calibrate and validate agent-based models.
    Keywords: agent-based models, heuristic design, model calibration, behavioral economics, computational social science, computational economics
    JEL: B41 C60 D90 G17 L20
    Date: 2021–01
  8. By: Denis Shibitov (Bank of Russia, Russian Federation); Mariam Mamedli (Bank of Russia, Russian Federation)
    Abstract: We show how the forecasting performance of models varies when certain inaccuracies in the pseudo real-time experiment take place. We consider the case of Russian CPI forecasting and estimate several models on not seasonally adjusted data vintages. Particular attention is paid to the availability of the variables at the moment of the forecast: we take into account the release timing of the series and the corresponding release delays in order to reconstruct forecasting in real time. In a series of experiments, we quantify how each of these issues affects the out-of-sample error. We illustrate that neglecting the release timing generally lowers the errors. The same is true for the use of seasonally adjusted data. The impact of the data vintages depends on the model and forecasting period. The overall effect of all three inaccuracies varies from 8% to 17% depending on the forecasting horizon. This means that the actual forecasting error can be significantly underestimated when an inaccurate pseudo real-time experiment is run. We underline the need to take these aspects into account when real-time forecasting is considered.
    Keywords: inflation, pseudo real-time forecasting, data vintages, machine learning, neural networks.
    JEL: C14 C45 C51 C53
    Date: 2021–04
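The release-timing issue above can be made concrete: at any forecast date, only observations whose publication delay has already elapsed are usable. A small sketch of such vintage-aware alignment (the `available_at` helper and the toy series are assumptions for illustration, not the authors' setup):

```python
def available_at(forecast_date, series):
    """Return the latest observation of each series usable at forecast time,
    honouring publication delays (pseudo real-time alignment).

    `series` maps a name to (observations, delay_in_periods), where
    observations[t] refers to period t and only becomes visible at t + delay.
    """
    snapshot = {}
    for name, (obs, delay) in series.items():
        usable = [x for t, x in enumerate(obs) if t + delay <= forecast_date]
        snapshot[name] = usable[-1] if usable else None
    return snapshot

# CPI is published with a one-period delay, industrial production with two
snap = available_at(5, {"cpi": ([0.4, 0.5, 0.3, 0.6, 0.2, 0.1], 1),
                        "ip":  ([1.0, 1.1, 0.9, 1.2, 1.3, 1.4], 2)})
```

Ignoring the delays (setting them to zero) would hand the model data it could not actually have had, which is the kind of inaccuracy the paper quantifies.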
  9. By: Sylvia Klosin
    Abstract: In this paper, we introduce and prove asymptotic normality for a new nonparametric estimator of continuous treatment effects. Specifically, we estimate the average dose-response function - the expected value of an outcome of interest at a particular level of the treatment. We draw on tools from both the double debiased machine learning (DML) and the automatic double machine learning (ADML) literatures to construct our estimator. Our estimator employs a novel debiasing method that leads to desirable theoretical stability and balancing properties. In simulations, our estimator performs well compared to current methods.
    Date: 2021–04
  10. By: Paranhos, Livia (University of Warwick)
    Abstract: This paper applies neural network models to forecast inflation. The use of a particular recurrent neural network, the long short-term memory model, or LSTM, that summarizes macroeconomic information into common components is a major contribution of the paper. Results from an exercise with US data indicate that the estimated neural nets usually present better forecasting performance than standard benchmarks, especially at long horizons. The LSTM in particular is found to outperform the traditional feed-forward network at long horizons, suggesting an advantage of the recurrent model in capturing the long-term trend of inflation. This finding can be rationalized by the so-called long memory of the LSTM, which incorporates relatively old information in the forecast as long as accuracy is improved, while economizing on the number of estimated parameters. Interestingly, the neural nets containing macroeconomic information capture well the features of inflation during and after the Great Recession, possibly indicating a role for nonlinearities and macro information in this episode. The estimated common components used in the forecast seem able to capture the business cycle dynamics, as well as information on prices.
    Keywords: forecasting ; inflation ; neural networks ; deep learning ; LSTM model
    Date: 2021
  11. By: Mingli Chen; Andreas Joseph; Michael Kumhof; Xinlei Pan; Rui Shi; Xuan Zhou
    Abstract: We propose using deep reinforcement learning to solve dynamic stochastic general equilibrium models. Agents are represented by deep artificial neural networks and learn to solve their dynamic optimisation problem by interacting with the model environment, of which they have no a priori knowledge. Deep reinforcement learning offers a flexible yet principled way to model bounded rationality within this general class of models. We apply our proposed approach to a classical model from the adaptive learning literature in macroeconomics which looks at the interaction of monetary and fiscal policy. We find that, contrary to adaptive learning, the artificially intelligent household can solve the model in all policy regimes.
    Date: 2021–04
  12. By: Daniel Jacob
    Abstract: For treatment effects - one of the core issues in modern econometric analysis - prediction and estimation are two sides of the same coin. As it turns out, machine learning methods are the tool for generalized prediction models. Combined with econometric theory, they allow us to estimate not only the average but a personalized treatment effect - the conditional average treatment effect (CATE). In this tutorial, we give an overview of novel methods, explain them in detail, and apply them via Quantlets in real data applications. As two empirical examples, we study the effect that microcredit availability has on the amount of money borrowed and whether 401(k) pension plan eligibility has an impact on net financial assets. The presented toolbox of methods contains meta-learners, like the Doubly-Robust, R-, T- and X-learner, and methods that are specially designed to estimate the CATE, like the causal BART and the generalized random forest. In both the microcredit and the 401(k) examples, we find a positive treatment effect for all observations but conflicting evidence of treatment effect heterogeneity. An additional simulation study, where the true treatment effect is known, allows us to compare the different methods and to observe patterns and similarities.
    Date: 2021–04
  13. By: Dueñas, Marco; Ortiz, Víctor; Riccaboni, Massimo; Serti, Francesco
    Abstract: By interpreting exporters’ dynamics as a complex learning process, this paper constitutes the first attempt to investigate the effectiveness of different Machine Learning (ML) techniques in predicting firms’ trade status. We focus on the probability of Colombian firms surviving in the export market under two different scenarios: a COVID-19 setting and a non-COVID-19 counterfactual situation. By comparing the resulting predictions, we estimate the individual treatment effect of the COVID-19 shock on firms’ outcomes. Finally, we use recursive partitioning methods to identify subgroups with differential treatment effects. We find that, besides the temporal dimension, the main factors predicting treatment heterogeneity are interactions between firm size and industry.
    Keywords: Machine Learning; International Trade; COVID-19
    JEL: F14 F17 D22 L25
    Date: 2021–04
  14. By: Elliott Ash; Sergio Galletta; Tommaso Giommoni
    Abstract: Can machine learning support better governance? In the context of Brazilian municipalities, 2001-2012, we have access to detailed accounts of local budgets and audit data on the associated fiscal corruption. Using the budget variables as predictors, we train a tree-based gradient-boosted classifier to predict the presence of corruption in held-out test data. The trained model, when applied to new data, provides a prediction-based measure of corruption that can be used for new empirical analysis or to support policy responses. We validate the empirical usefulness of this measure by replicating and extending some previous empirical evidence on corruption issues in Brazil. We then explore how the predictions can be used to support policies toward corruption. Our policy simulations show that, relative to the status quo policy of random audits, a targeted policy guided by the machine predictions could detect almost twice as many corrupt municipalities for the same audit rate. Similar gains can be achieved for a politically neutral targeting policy that equalizes audit rates across political parties.
    Keywords: algorithmic decision-making, corruption policy, local public finance
    JEL: D73 E62 K14 K42
    Date: 2021
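The targeting gain reported above can be illustrated with a toy comparison of audits guided by predicted corruption scores versus random audits at the same rate (the `audit_gain` helper and the example scores are invented for this sketch, not the paper's data):

```python
def audit_gain(scores, labels, audit_rate):
    """Compare corrupt municipalities detected under targeted vs random audits.

    Targeted audits inspect the highest-scoring fraction of municipalities;
    random audits are expected to detect corrupt ones in proportion
    to the audit rate.
    """
    n = len(scores)
    k = int(n * audit_rate)
    ranked = sorted(range(n), key=lambda i: -scores[i])
    targeted = sum(labels[i] for i in ranked[:k])        # hits when targeting
    random_expected = audit_rate * sum(labels)           # expected random hits
    return targeted, random_expected

# toy predicted risk scores and true corruption labels for 8 municipalities
t, r = audit_gain([0.9, 0.8, 0.2, 0.7, 0.1, 0.3, 0.6, 0.05],
                  [1, 1, 0, 1, 0, 0, 0, 0], audit_rate=0.25)
```

In this toy case targeting detects 2 corrupt municipalities at a 25% audit rate, against an expected 0.75 under random audits, mirroring the roughly twofold gain the paper reports.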
  15. By: Ugo Colombino; Nizamul Islam
    Abstract: We use a behavioural microsimulation model embedded in a numerical optimization procedure in order to identify optimal (social welfare maximizing) tax-transfer rules. We consider the class of tax-transfer rules consisting of a universal basic income and a tax defined by a 4th-degree polynomial, applied to total taxable household income. A microeconometric model of the household, which simulates household labour supply decisions, is embedded in a numerical routine in order to identify – within the class defined above – the tax-transfer rule that maximizes a social welfare function. We present the results for five European countries: France, Italy, Luxembourg, Spain and the United Kingdom. For most values of the inequality aversion parameter, the optimized rules provide a higher social welfare than the current rule, with the exception of Luxembourg. In France, Italy and Luxembourg the optimized rules are significantly different from the current ones and are close to a negative income tax or a universal basic income with a flat tax rate. In Spain and the UK, the optimized rules are instead close to the current rule. With the exception of Spain, the optimal rules are slightly disequalizing and the social welfare gains are due to efficiency gains. Nonetheless, the poverty gap index tends to be lower under the optimized regime.
    Keywords: empirical optimal taxation, microsimulation, microeconometrics, evaluation of tax-transfer rules.
    Date: 2020
  16. By: Kin G. Olivares; Cristian Challu; Grzegorz Marcjasz; Rafal Weron; Artur Dubrawski
    Abstract: We extend the neural basis expansion analysis (NBEATS) to incorporate exogenous factors. The resulting method, called NBEATSx, improves on a well-performing deep learning model, extending its capabilities by including exogenous variables and allowing it to integrate multiple sources of useful information. To showcase the utility of the NBEATSx model, we conduct a comprehensive study of its application to electricity price forecasting (EPF) tasks across a broad range of years and markets. We observe state-of-the-art performance, significantly improving the forecast accuracy by nearly 20% over the original NBEATS model, and by up to 5% over other well-established statistical and machine learning methods specialized for these tasks. Additionally, the proposed neural network has an interpretable configuration that can structurally decompose time series, visualizing the relative impact of trend and seasonal components and revealing the modeled processes' interactions with exogenous factors.
    Keywords: Deep learning; NBEATS and NBEATSx models; Interpretable neural network; Time series decomposition; Fourier series; Electricity price forecasting
    JEL: C22 C32 C45 C51 C53 Q41 Q47
    Date: 2021–04–19
  17. By: Ivan Jericevich; Murray McKechnie; Tim Gebbie
    Abstract: We replicate the contested calibration of the Farmer and Joshi agent-based model of financial markets using a genetic algorithm and a Nelder-Mead with threshold accepting algorithm, following Fabretti. The novelty of the Farmer-Joshi model is that the dynamics are driven by trade entry and exit thresholds alone. We recover the known claim that some important stylized facts observed in financial markets cannot easily be found under calibration -- in particular those relating to the auto-correlations in the absolute values of the price fluctuations, and sufficient kurtosis. However, rather than concerns relating to the calibration method, what is novel here is that we extend the Farmer-Joshi model to include agent adaptation, using a Brock and Hommes approach to strategy fitness based on trading strategy profitability. We call this an adaptive Farmer-Joshi model: the model allows trading agents to switch between strategies by favouring strategies that have been more profitable over some period of time determined by a free parameter fixing the profit-monitoring time horizon. In the adaptive model we are able to calibrate and recover additional stylized facts, despite apparent degeneracies. This is achieved by combining the interactions of trade entry levels with trade strategy switching. We use this to argue that for low-frequency trading across days, as calibrated to daily sampled data, feedbacks can be accounted for by strategy die-out based on intermediate-term profitability; we find that the average trade-monitoring horizon is approximately two to three months (or 40 to 60 days) of trading.
    Date: 2021–04
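The Brock and Hommes switching mechanism mentioned above lets agents favour recently profitable strategies through a logit choice rule. A minimal sketch (the function and the parameter values are illustrative assumptions, not the calibrated model):

```python
from math import exp

def switch_probabilities(profits, beta):
    """Brock-Hommes style strategy switching.

    Agents choose among strategies with probabilities proportional to
    exp(beta * profit); beta is the intensity of choice. beta = 0 gives
    uniform choice, large beta makes agents pile into the best strategy.
    """
    weights = [exp(beta * p) for p in profits]
    z = sum(weights)
    return [w / z for w in weights]

# recent profits of three trading strategies (toy numbers)
probs = switch_probabilities([0.02, 0.05, -0.01], beta=50.0)
```

With the profit-monitoring horizon as a free parameter, the profits fed into this rule would be averaged over the last 40 to 60 trading days, matching the horizon the calibration recovers.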
  18. By: Minkyung Kim (UNC Chapel Hill Kenan-Flagler Business School); K. Sudhir (Cowles Foundation & School of Management, Yale University); Kosuke Uetake (School of Management, Yale University)
    Abstract: The paper broadens the focus of empirical research on salesforce management to include multitasking settings with multidimensional incentives, where salespeople have private information about customers. This allows us to ask novel substantive questions around multidimensional incentive design and job design while managing the costs and benefits of private information. To this end, the paper introduces the first structural model of a multitasking salesforce in response to multidimensional incentives. The model also accommodates (i) dynamic intertemporal tradeoffs in effort choice across the tasks and (ii) the salesperson's private information about customers. We apply our model in a rich empirical setting in microfinance and illustrate how to address various identification and estimation challenges. We extend two-step estimation methods used for unidimensional compensation plans by embedding a flexible machine learning (random forest) model in the first-stage multitasking policy function estimation, within an iterative procedure that accounts for salesperson heterogeneity and private information. Estimates reveal two latent segments of salespeople: a “hunter” segment that is more efficient in loan acquisition and a “farmer” segment that is more efficient in loan collection. Counterfactuals reveal heterogeneous effects: hunters’ private information hurts the firm as they engage in adverse selection; farmers’ private information helps the firm as they use it to better collect loans. The payoff complementarity induced by multiplicative incentive aggregation softens adverse specialization by hunters relative to additive aggregation, but hurts performance among farmers. Overall, task specialization in job design for hunters (acquisition) and farmers (collection) hurts the firm, as the harm from adverse selection overwhelms the efficiency gain.
    Keywords: Salesforce compensation, Multitasking, Multidimensional incentives, Job design, Private information, Adverse selection
    JEL: C61 J33 L11 L23 L14 M31 M52 M55
    Date: 2019–09
  19. By: Hügle, Dominik
    Abstract: In this paper, I analyze how the higher education decision of young adults in Germany depends on their expected future earnings. For this, I estimate a microeconometric model in which individuals maximize life-time utility by choosing whether or not to enter higher education. To forecast individual life cycles in terms of employment, earnings, and family formation under higher education and its alternative, vocational training, I use a dynamic microsimulation model and regression techniques. I take into account that while individuals generally choose between two options, higher education and vocational training, they are aware of multiple potential realizations under both options, such as leaving higher education with a bachelor degree or taking up higher education after first having earned a vocational degree. Using the estimates from the decision model, I simulate the introduction of different tuition fee and graduate tax scenarios. I find that the impact of these education policies on the higher education decision is limited and only few individuals would change their educational decisions as a reaction to these policies.
    Keywords: Educational choice, Higher education, Dynamic microsimulation
    JEL: C53 I23
    Date: 2021
  20. By: Davcheva, Elena
    Abstract: This thesis summarizes three studies in the area of machine learning applications within mental healthcare, specifically in the areas of treatment and diagnostics. Mental healthcare is challenging to provide worldwide because of a stark rise in the demand for services. Traditional healthcare structures cannot keep up with the demand, and information systems have the potential to fill this gap. The thesis explores online mental health forums as a digital mental health platform and the possibility of automating treatments and diagnostics based on user-shared information.
    Date: 2021
  21. By: Guanghua Chi; Han Fang; Sourav Chatterjee; Joshua E. Blumenstock
    Abstract: Many critical policy decisions, from strategic investments to the allocation of humanitarian aid, rely on data about the geographic distribution of wealth and poverty. Yet many poverty maps are out of date or exist only at very coarse levels of granularity. Here we develop the first micro-estimates of wealth and poverty that cover the populated surface of all 135 low- and middle-income countries (LMICs) at 2.4km resolution. The estimates are built by applying machine learning algorithms to vast and heterogeneous data from satellites, mobile phone networks, topographic maps, as well as aggregated and de-identified connectivity data from Facebook. We train and calibrate the estimates using nationally representative household survey data from 56 LMICs, then validate their accuracy using four independent sources of household survey data from 18 countries. We also provide confidence intervals for each micro-estimate to facilitate responsible downstream use. These estimates are provided free for public use in the hope that they enable targeted policy response to the COVID-19 pandemic, provide the foundation for new insights into the causes and consequences of economic development and growth, and promote responsible policymaking in support of the Sustainable Development Goals.
    Date: 2021–04

General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.