nep-cmp New Economics Papers
on Computational Economics
Issue of 2021‒02‒01
thirty-one papers chosen by
Stan Miles
Thompson Rivers University

  1. Sequential Deep Learning for Credit Risk Monitoring with Tabular Financial Data By Jillian M. Clements; Di Xu; Nooshin Yousefi; Dmitry Efimov
  2. Deep reinforcement learning for portfolio management By Gang Huang; Xiaohua Zhou; Qingyang Song
  3. Modelling Consumption and Constructing Long-Term Baselines in Final Demand By Ho, Mun; Britz, Wolfgang; Delzeit, Ruth; Leblanc, Florian; Roson, Roberto; Schuenemann, Franziska; Weitzel, Matthias
  4. Development of cloud, digital technologies and the introduction of chip technologies By Ali R. Baghirzade
  5. Search for Profits and Business Fluctuations: How Banks' Behaviour Explain Cycles? By Emanuele Ciola; Edoardo Gaffeo; Mauro Gallegati
  6. Linking Global CGE models with Sectoral Models to Generate Baseline Scenarios: Approaches, Challenges, and Opportunities By Delzeit, Ruth; Beach, Robert; Bibas, Ruben; Britz, Wolfgang; Chateau, Jean; Freund, Florian; Lefevre, Julien; Schuenemann, Franziska; Sulser, Timothy; Valin, Hugo; van Ruijven, Bas; Weitzel, Matthias; Willenbockel, Dirk; Wojtowicz, Krzysztof
  7. The potential local and regional impacts of COVID-19 in New Zealand: with a focus on tourism By Laëtitia, Leroy de Morel; Glen, Wittwer; Christina, Leung; Dion, Gämperle
  8. Exploring Narrative Economics: An Agent-Based-Modeling Platform that Integrates Automated Traders with Opinion Dynamics By Kenneth Lomas; Dave Cliff
  10. Assessing the Importance of an Attribute in a Demand System: Structural Model versus Machine Learning By Badruddoza, Syed; Amin, Modhurima; McCluskey, Jill
  10. Residential and Industrial Energy Efficiency Improvement: A Dynamic General Equilibrium Analysis of the Rebound Effect By Sondes Kahouli; Xavier Pautrel
  11. Developing Transportation Response Strategies for Wildfire Evacuations via an Empirically Supported Traffic Simulation of Berkeley, California By Zhao, Bingyu PhD; Wong, Stephen D PhD
  12. Adversarial Estimation of Riesz Representers By Victor Chernozhukov; Whitney Newey; Rahul Singh; Vasilis Syrgkanis
  13. Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem By Mochen Yang; Edward McFowland III; Gordon Burtch; Gediminas Adomavicius
  14. Non-Manipulable Machine Learning: The Incentive Compatibility of Lasso By Mehmet Caner; Kfir Eliaz
  15. An expectation-maximization algorithm for the exponential-generalized inverse Gaussian regression model with varying dispersion and shape for modelling the aggregate claim amount By Tzougas, George; Jeong, Himchan
  16. Model of cunning agents By Mateusz Denys
  17. Trade sentiment and the stock market: new evidence based on big data textual analysis of Chinese media By Marlene Amstad; Leonardo Gambacorta; Chao He; Dora Xia
  18. In brief...Tackling domestic violence using machine learning By Jeffrey Grogger; Ria Ivandic; Tom Kirchmaier
  19. Whose Advice Counts More – Man or Machine? An Experimental Investigation of AI-based Advice Utilization By Mesbah, Neda; Tauchert, Christoph; Buxmann, Peter
  20. Completing the Market: Generating Shadow CDS Spreads by Machine Learning By Nan Hu; Jian Li; Alexis Meyer-Cirkel
  21. The Value Added of Machine Learning to Causal Inference: Evidence from Revisited Studies By Anna Baiardi; Andrea A. Naghi
  22. Performance Analysis of Hospitals in Australia and its Peers: A Systematic Review By Zhichao Wang; Valentin Zelenyuk
  23. Measuring national happiness with music By Benetos, Emmanouil; Ragano, Alessandro; Sgroi, Daniel; Tuckwell, Anthony
  24. Towards robust and speculation-reduction real estate pricing models based on a data-driven strategy By Vladimir Vargas-Calderón; Jorge E. Camargo
  25. Parenting Types By Rauh, C.; Renée, L.
  26. Using Payments Data to Nowcast Macroeconomic Variables During the Onset of COVID-19 By James Chapman; Ajit Desai
  27. Machine Learning and Perceived Age Stereotypes in Job Ads: Evidence from an Experiment By Ian Burn; Daniel Firoozi; Daniel Ladd; David Neumark
  28. What do bankruptcy prediction models tell us about banking regulation? Evidence from statistical and learning approaches By Pierre Durand; Gaëtan Le Quang
  29. Alternative Methods for Studying Consumer Payment Choice By Oz Shy
  30. Text-based recession probabilities By Ferrari, Massimo; Le Mezo, Helena
  31. Consumer Demand Estimation By Merino Troncoso, Carlos

  1. By: Jillian M. Clements; Di Xu; Nooshin Yousefi; Dmitry Efimov
    Abstract: Machine learning plays an essential role in preventing financial losses in the banking industry. Perhaps the most pertinent prediction task that can result in billions of dollars in losses each year is the assessment of credit risk (i.e., the risk of default on debt). Today, much of the gains from machine learning to predict credit risk are driven by gradient boosted decision tree models. However, these gains begin to plateau without the addition of expensive new data sources or highly engineered features. In this paper, we present our attempts to create a novel approach to assessing credit risk using deep learning that does not rely on new model inputs. We propose a new credit card transaction sampling technique to use with deep recurrent and causal convolution-based neural networks that exploits long historical sequences of financial data without costly resource requirements. We show that our sequential deep learning approach using a temporal convolutional network outperformed the benchmark non-sequential tree-based model, achieving significant financial savings and earlier detection of credit risk. We also demonstrate the potential for our approach to be used in a production environment, where our sampling technique allows for sequences to be stored efficiently in memory and used for fast online learning and inference.
    Date: 2020–12
  2. By: Gang Huang; Xiaohua Zhou; Qingyang Song
    Abstract: The objective of this paper is to verify that deep reinforcement learning, a current cutting-edge artificial intelligence technology, can be applied to portfolio management. We improve on the existing deep reinforcement learning portfolio model and introduce several innovations. Unlike many previous studies of discrete trading signals in portfolio management, we allow the agent to short in a continuous action space, design an arbitrage mechanism based on Arbitrage Pricing Theory, and redesign the activation function for acquiring action vectors; in addition, we redesign the neural networks for reinforcement learning with reference to deep neural networks that process image data. In experiments, we apply our model to several randomly selected portfolios, which include the CSI300 index, representing the market's rate of return, and randomly selected constituents of the CSI500. The experimental results show that no matter which stocks we select for our portfolios, we can almost always obtain a higher return than the market itself. That is to say, we can beat the market using deep reinforcement learning.
    Date: 2020–12
  3. By: Ho, Mun; Britz, Wolfgang; Delzeit, Ruth; Leblanc, Florian; Roson, Roberto; Schuenemann, Franziska; Weitzel, Matthias
    Abstract: Modelling and projecting consumption, investment and government demand by detailed commodities in CGE models poses many data and methodological challenges. We review the state of knowledge of modelling consumption of commodities (price and income elasticities and demographics), as well as the historical trends that we should be able to explain. We then discuss the current approaches taken in CGE models to project the trends in demand at various levels of commodity disaggregation. We examine the pros and cons of the various approaches to adjust parameters over time or using functions of time and suggest a research agenda to improve modelling and projection. We compare projections out to 2050 using LES, CES and AIDADS functions in the same CGE model to illustrate the size of the differences. In addition, we briefly discuss the allocation of total investment and government demand to individual commodities.
    Keywords: Consumption demand systems,Long-term baseline,CGE models
    JEL: D12 D58
    Date: 2020
  4. By: Ali R. Baghirzade
    Abstract: Hardly any other area of research has recently attracted as much attention as machine learning (ML), thanks to the rapid advances in artificial intelligence (AI). This publication provides a short introduction to practical concepts and methods of machine learning, to open problems and emerging research questions, and an overview of the participants, the application areas, and the socio-economic framework conditions of the research. In expert circles, ML is regarded as a key technology for modern artificial intelligence techniques, which is why AI and ML are often used interchangeably, especially in an economic context. Machine learning and, in particular, deep learning (DL) open up entirely new possibilities in automatic language processing, image analysis, medical diagnostics, process management and customer management. One of the important aspects in this article is chipization. Due to the rapid development of digitalization, the number of applications will continue to grow as digital technologies advance. In the future, machines will increasingly provide results that are important for decision making. To this end, it is important to ensure the safety, reliability and sufficient traceability of automated decision-making processes from the technological side. At the same time, it is necessary to ensure that ML applications are compatible with legal issues such as responsibility and liability for algorithmic decisions, as well as technically feasible. Formulating and implementing such regulation is an important and complex task that requires an interdisciplinary approach. Last but not least, public acceptance is critical to the continued diffusion of machine learning processes in applications. This requires widespread public discussion and the involvement of various social groups.
    Date: 2020–12
  5. By: Emanuele Ciola (Department of Management, Universita' Politecnica delle Marche (Italy)); Edoardo Gaffeo (Department of Economics and Management, Universita' degli Studi di Trento (Italy).); Mauro Gallegati (Department of Management, Universita' Politecnica delle Marche (Italy))
    Abstract: This paper develops and estimates a macroeconomic model of real-financial markets interactions in which the behaviour of banks generates endogenous business cycles. We do so in the context of a computational agent-based framework, where the channelling of funds from depositors to investors occurring through intermediaries is subject to information and matching frictions. Since banks compete in both deposit and credit markets, the whole dynamic is driven by endogenous fluctuations in their profits. In particular, we assume that intermediaries adopt a simple learning process, which consists of copying the strategy of the most profitable competitors while setting their interest rates. Accordingly, the emergence of strategic complementarity - mainly due to the accumulation of information capital - leads to periods of sustained growth followed by sharp recessions in the simulated economy.
    Keywords: Agent-based macroeconomics, Simulation-based estimation, Intermediaries behaviour, Business cycles
    JEL: C15 C51 C63 E32 E44
    Date: 2021–01
  6. By: Delzeit, Ruth; Beach, Robert; Bibas, Ruben; Britz, Wolfgang; Chateau, Jean; Freund, Florian; Lefevre, Julien; Schuenemann, Franziska; Sulser, Timothy; Valin, Hugo; van Ruijven, Bas; Weitzel, Matthias; Willenbockel, Dirk; Wojtowicz, Krzysztof
    Abstract: When modeling medium and long-term challenges we need a reference path of economic development (the so-called baseline). Because sectoral models often offer a more fundamental understanding of future developments for specific sectors, many CGE modeling teams have adopted approaches for linking their models to sectoral models to generate baselines. Linked models include agricultural sector, energy sector, biophysical and macroeconomic models. We systematically compare and discuss approaches of linking CGE models to sectoral models for the baseline calibration procedure and discuss challenges and best practices. We identify different types of linking approaches which we divide into a) one-way, and b) two-way linking. These two types of linking approaches are then analyzed with respect to the degree of consistency of the linkage, the information exchanged, as well as compromises in aggregations and definitions. Based on our assessment, we discuss challenges and conclude with suggestions for best practices and research recommendations.
    Keywords: Computable general equilibrium models, Model linking, Baseline scenario, Partial equilibrium model
    JEL: C68 D58
    Date: 2020
  7. By: Laëtitia, Leroy de Morel (New Zealand Institute of Economic Research); Glen, Wittwer (CoPS); Christina, Leung (New Zealand Institute of Economic Research); Dion, Gämperle (New Zealand Institute of Economic Research)
    Abstract: We use our CGE model to assess the potential impacts of COVID-19 on the New Zealand economy and its regions. Our regional CGE model is used to run three scenarios (phases) based on the different alert levels (1-4) imposed by the New Zealand Government. For each phase, we mostly focus on restrictions applied to the entry and movements of people as well as on labour and capital temporarily rendered idle due to the isolation and social distancing measures. We do not explicitly model any fiscal response, in order to establish the baseline against which fiscal policies can be assessed.
    Keywords: CGE modelling; COVID-19; Tourism; New Zealand economy
    JEL: C68 Z30
    Date: 2020–08–13
  8. By: Kenneth Lomas; Dave Cliff
    Abstract: In seeking to explain aspects of real-world economies that defy easy understanding when analysed via conventional means, Nobel Laureate Robert Shiller has since 2017 introduced and developed the idea of Narrative Economics, where observable economic factors such as the dynamics of prices in asset markets are explained largely as a consequence of the narratives (i.e., the stories) heard, told, and believed by participants in those markets. Shiller argues that otherwise irrational and difficult-to-explain behaviors, such as investors participating in highly volatile cryptocurrency markets, are best explained and understood in narrative terms: people invest because they believe, because they hold heartfelt opinions, about the future prospects of the asset, and they tell themselves and others stories (narratives) about those beliefs and opinions. In this paper we describe what is, to the best of our knowledge, the first ever agent-based modelling platform that allows for the study of issues in narrative economics. We have created this by integrating and synthesizing research in two previously separate fields: opinion dynamics (OD), and agent-based computational economics (ACE) in the form of minimally-intelligent trader-agents operating in accurately modelled financial markets. We show here for the first time how long-established models in OD and in ACE can be brought together to enable the experimental study of issues in narrative economics, and we present initial results from our system. The program-code for our simulation platform has been released as freely-available open-source software on GitHub, to enable other researchers to replicate and extend our work.
    Date: 2020–12
  9. By: Badruddoza, Syed (Washington State University); Amin, Modhurima (Washington State University); McCluskey, Jill (Washington State University)
    Abstract: Firms can prioritize among the product attributes based on consumer valuations using market-level data. However, a structural estimation of market demand is challenging, especially when the data are updating in real-time and instrumental variables are scarce. We find evidence that Random Forests (RF)—a machine-learning algorithm—can detect consumers’ sensitivity to product attributes similar to the structural framework of Berry-Levinsohn-Pakes (BLP). Sensitivity to an attribute is measured by the absolute value of its coefficient. We check the RF’s capacity to rank the attributes when prices are endogenous, coefficients are random, and instrumental or demographic variables are unavailable. In our simulations, the BLP estimates correlate with the RF importance factor in ranking (68%) and magnitude (79%), and the rates increase with the sample size. Consumer sensitivity to endogenous variables (price) and variables with random coefficients is overestimated by the RF approach, but the ranking of variables with non-random coefficients matches BLP’s coefficients in 96% of cases. These estimates are derived from RF without parameter tuning and are therefore conservative. We conclude that machine learning does not replace the structural framework but provides firms with a sensible idea of consumers’ ranking of product attributes.
    Keywords: Machine-Learning; Random Forests; Demand Estimation; BLP; Discrete Choice.
    JEL: C55 D11 Q11
    Date: 2019–12–04
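The attribute-ranking comparison above can be mimicked in a toy simulation (illustrative only — this is not the authors' BLP setup, and all variable names here are hypothetical): when the outcome is linear in exogenous attributes, a random forest's impurity-based importances typically recover the same ranking as the absolute coefficients.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))                    # three product attributes
beta = np.array([3.0, 2.0, 1.0])               # "consumer sensitivities"
y = X @ beta + rng.normal(scale=0.5, size=n)   # linear demand-style outcome

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# rank attributes by RF importance and by true |coefficient|
rf_rank = list(np.argsort(rf.feature_importances_)[::-1])
true_rank = list(np.argsort(np.abs(beta))[::-1])
```

With a clean linear signal like this, both rankings put attribute 0 first and attribute 2 last; the paper's point is that the agreement degrades once prices are endogenous or coefficients are random.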
  10. By: Sondes Kahouli (Université de Bretagne Occidentale); Xavier Pautrel (Université d’Angers)
    Abstract: The aim of this paper is to investigate bi-directional spillovers into residential and industrial sectors induced by energy efficiency improvement (EEI) in both the short- and long-term, and the impact of nesting structure as well as the size of elasticities of substitution of production and utility functions on the magnitude and the transitional dynamic of the rebound effect. Developing a dynamic general equilibrium model, we demonstrate that residential EEIs spill over into the industrial sector through the labor supply channel and industrial EEIs spill over into the residential sector through the conventional income channel. Numerical simulations calibrated on the U.S. suggest that not taking into account these spillover effects could lead to misestimating the rebound effect, especially of residential sector EEIs. We also demonstrate how the size and the duration of the rebound effect depend on the value of elasticities of substitution. In particular, the elasticity of substitution between energy and non-energy consumption in household utility and the elasticity of substitution between physical capital and labor in production play a major role. Numerical simulations suggest that alternative sets of values for the elasticities of substitution may give sizably different patterns of rebound effects in both the short- and long-term. In policy terms, our results suggest that energy efficiency policies should be implemented simultaneously with rebound effect offsetting policies by considering short- and long-term economy-wide feedbacks. As a consequence, they call for revisiting debates about which type of policy pathway is more effective in mitigating the rebound effect.
    Keywords: Energy Efficiency, Rebound Effect, Transitional Dynamics, Residential Energy Consumption, Industrial Energy Consumption
    JEL: D58 Q43
    Date: 2020–12
  11. By: Zhao, Bingyu PhD; Wong, Stephen D PhD
    Abstract: Government agencies must make rapid and informed decisions in wildfires to safely evacuate people. However, current evacuation simulation tools for resource-strapped agencies largely fail to compare possible transportation responses or incorporate empirical evidence from past wildfires. Consequently, we employ online survey data from evacuees of the 2017 Northern California Wildfires (n=37), the 2017 Southern California Wildfires (n=175), and the 2018 Carr Wildfire (n=254) to inform a policy-oriented traffic evacuation simulation model. We test our simulation for a hypothetical wildfire evacuation in the wildland urban interface (WUI) of Berkeley, California. We focus on variables including fire speed, departure time distribution, towing of items, transportation mode, GPS-enabled rerouting, phased evacuations (i.e., allowing higher-risk residents to leave earlier), and contraflow (i.e., switching all lanes away from danger). We found that reducing household vehicles (i.e., to 1 vehicle per household) and increasing GPS-enabled rerouting (e.g., 50% participation) lowered exposed vehicles (i.e., total vehicles in the fire frontier) by over 50% and evacuation time estimates (ETEs) by about 30% from baseline. Phased evacuations with a suitable time interval reduced exposed vehicles most significantly (over 90%) but produced a slightly longer ETE. Both contraflow (on limited links due to resource constraints) and slowing fire speed were effective in lowering exposed vehicles (around 50%), but not ETEs. Extended contraflow can reduce both exposed vehicles and ETEs. We recommend agencies develop a communication and parking plan to reduce evacuating vehicles, create and communicate a phased evacuation plan, and build partnerships with GPS-routing services.
    Keywords: Engineering, Evacuations, Traffic Simulation, California Wildfires, Transportation Policy, Behavior, Contraflow, Phased Evacuations
    Date: 2021–01–01
  12. By: Victor Chernozhukov; Whitney Newey; Rahul Singh; Vasilis Syrgkanis
    Abstract: We provide an adversarial approach to estimating Riesz representers of linear functionals within arbitrary function spaces. We prove oracle inequalities based on the localized Rademacher complexity of the function space used to approximate the Riesz representer and the approximation error. These inequalities imply fast finite sample mean-squared-error rates for many function spaces of interest, such as high-dimensional sparse linear functions, neural networks and reproducing kernel Hilbert spaces. Our approach offers a new way of estimating Riesz representers with a plethora of recently introduced machine learning techniques. We show how our estimator can be used in the context of de-biasing structural/causal parameters in semi-parametric models, for automated orthogonalization of moment equations and for estimating the stochastic discount factor in the context of asset pricing.
    Date: 2020–12
  13. By: Mochen Yang; Edward McFowland III; Gordon Burtch; Gediminas Adomavicius
    Abstract: Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest from available data, followed by the inclusion of those variables into an econometric framework, with the objective of estimating causal effects. Recent work highlights that, because the predictions from machine learning models are inevitably imperfect, econometric analyses based on the predicted variables are likely to suffer from bias due to measurement error. We propose a novel approach to mitigate these biases, leveraging the ensemble learning technique known as the random forest. We propose employing random forest not just for prediction, but also for generating instrumental variables to address the measurement error embedded in the prediction. The random forest algorithm performs best when comprised of a set of trees that are individually accurate in their predictions, yet which also make 'different' mistakes, i.e., have weakly correlated prediction errors. A key observation is that these properties are closely related to the relevance and exclusion requirements of valid instrumental variables. We design a data-driven procedure to select tuples of individual trees from a random forest, in which one tree serves as the endogenous covariate and the other trees serve as its instruments. Simulation experiments demonstrate the efficacy of the proposed approach in mitigating estimation biases and its superior performance over three alternative methods for bias correction.
    Date: 2020–12
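A minimal sketch of the mechanics described above (variable names hypothetical): one tree of a fitted forest supplies the mined, error-ridden covariate and the remaining trees supply a candidate instrument, combined via two-stage least squares. The paper's actual contribution — a data-driven procedure for selecting tree tuples with weakly correlated prediction errors — is deliberately skipped here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
latent = X[:, 0] + 0.5 * X[:, 1]                   # variable to be "mined"
y = 2.0 * latent + rng.normal(size=n)              # outcome; true effect = 2
labels = latent + rng.normal(scale=0.5, size=n)    # noisy training target

rf = RandomForestRegressor(n_estimators=50, max_depth=4,
                           random_state=0).fit(X, labels)
tree_preds = np.column_stack([t.predict(X) for t in rf.estimators_])

x_endog = tree_preds[:, 0]               # one tree: mined covariate (with error)
z_inst = tree_preds[:, 1:].mean(axis=1)  # other trees: candidate instrument

# two-stage least squares with an intercept
Z = np.column_stack([np.ones(n), z_inst])
stage1 = Z @ np.linalg.lstsq(Z, x_endog, rcond=None)[0]
W = np.column_stack([np.ones(n), stage1])
beta_iv = np.linalg.lstsq(W, y, rcond=None)[0][1]
```

Without the tree-selection step, the instrument's error can still be correlated with the covariate's (all trees share the same noisy labels), which is exactly the relevance/exclusion tension the paper's procedure is designed to resolve.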
  14. By: Mehmet Caner; Kfir Eliaz
    Abstract: We consider situations where a user feeds her attributes to a machine learning method that tries to predict her best option based on a random sample of other users. The predictor is incentive-compatible if the user has no incentive to misreport her covariates. Focusing on the popular Lasso estimation technique, we borrow tools from high-dimensional statistics to characterize sufficient conditions that ensure that Lasso is incentive compatible in large samples. In particular, we show that incentive compatibility is achieved if the tuning parameter is kept above some threshold. We present simulations that illustrate how this can be done in practice.
    Date: 2021–01
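The sparsity channel behind the result above can be seen in a small sketch (illustrative, not the paper's formal condition): once the tuning parameter is large enough for Lasso to zero out a covariate's coefficient, misreporting that covariate cannot move the user's prediction at all.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=n)  # only covariate 0 matters

loose = Lasso(alpha=0.01).fit(X, y)   # small penalty
tight = Lasso(alpha=1.0).fit(X, y)    # large penalty: sparse model

# a user cannot change her prediction by misreporting a zeroed covariate
x_true = rng.normal(size=p)
x_lie = x_true.copy()
dropped = np.flatnonzero(tight.coef_ == 0)[0]  # a covariate Lasso ignores
x_lie[dropped] = 100.0                         # wild misreport
same = tight.predict(x_true[None]) == tight.predict(x_lie[None])
```

The equality is exact: a zero coefficient multiplies whatever value the user reports, so the fitted value is unchanged, removing any incentive to lie about that covariate.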
  15. By: Tzougas, George; Jeong, Himchan
    Abstract: This article presents the Exponential–Generalized Inverse Gaussian regression model with varying dispersion and shape. The EGIG is a general distribution family which, under the adopted modelling framework, can provide the appropriate level of flexibility to fit moderate costs with high frequencies and heavy-tailed claim sizes, as they both represent significant proportions of the total loss in non-life insurance. The model’s implementation is illustrated by a real data application which involves fitting claim size data from a European motor insurer. The maximum likelihood estimation of the model parameters is achieved through a novel Expectation Maximization (EM)-type algorithm that is computationally tractable and is demonstrated to perform satisfactorily.
    Keywords: Exponential–Generalized Inverse Gaussian Distribution; EM Algorithm; regression models for the mean; dispersion and shape parameters; non-life insurance; heavy-tailed losses
    JEL: C1
    Date: 2021–01–08
  16. By: Mateusz Denys
    Abstract: A numerical agent-based spin model of financial markets, based on the Potts model from statistical mechanics and featuring a novel interpretation of the spin variable (as regards financial-market models), is presented. In this model, the value of the spin variable is only the agent's opinion concerning the current market situation, which the agent communicates to his nearest neighbors. The agent's action (i.e., buying, selling, or staying inactive) is instead connected with a change of the spin variable. Hence, the agents in this model can be considered cunning: they encourage their neighbors to buy stocks when they themselves have an opportunity to sell them, and encourage their neighbors to sell stocks when they have an opportunity to buy them. Predictions of the model are in good agreement with empirical data from various real-life financial markets. The model reproduces the shape of the usual and absolute-value autocorrelation functions of returns as well as the distribution of times between superthreshold losses.
    Date: 2020–12
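The central "action = change of spin" idea can be sketched with a stripped-down, Ising-style toy (two opinion states instead of the paper's Potts setup; all parameters and names are illustrative): the spin is the opinion an agent communicates to neighbors, while the trade is the *change* of that spin.

```python
import numpy as np

rng = np.random.default_rng(2)
n_agents, n_steps, beta = 100, 500, 1.0
spin = rng.choice([-1, 1], size=n_agents)  # communicated opinion, not action
trades = []
for _ in range(n_steps):
    i = rng.integers(n_agents)
    # local field from nearest neighbours on a ring (Ising-style update)
    h = spin[(i - 1) % n_agents] + spin[(i + 1) % n_agents]
    p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
    new = 1 if rng.random() < p_up else -1
    # the agent's trade is the change of opinion: +2 buy, -2 sell, 0 inactive
    trades.append(new - spin[i])
    spin[i] = new
trades = np.asarray(trades, dtype=float)
```

An agent who flips from bullish (+1) to bearish (-1) thus sells into neighbors it had previously encouraged to buy — the "cunning" behaviour the abstract describes, though reproducing the paper's stylized facts requires its full Potts specification.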
  17. By: Marlene Amstad; Leonardo Gambacorta; Chao He; Dora Xia
    Abstract: Trade tensions between China and the US have played an important role in swinging global stock markets, but the effects are difficult to quantify. We develop a novel trade sentiment index (TSI) based on textual analysis and machine learning applied to a big data pool that assesses the positive or negative tone of the Chinese media coverage, and evaluate its capacity to explain the behaviour of 60 global equity markets. We find the TSI to contribute around 10% of model capacity to explain stock price variability from January 2018 to June 2019 in countries that are more exposed to the China-US value chain. Most of the contribution is given by the tone extracted from social media (9%), while that obtained from traditional media explains only a modest part of stock price variability (1%). No equity market benefits from the China-US trade war, and Asian markets tend to be more negatively affected. In particular, we find that sectors most affected by tariffs, such as information technology-related ones, are particularly sensitive to the tone of trade tension coverage.
    Keywords: stock returns, trade, sentiment, big data, neural network, machine learning
    JEL: F13 F14 G15 D80 C45 C55
    Date: 2021–01
  18. By: Jeffrey Grogger; Ria Ivandic; Tom Kirchmaier
    Abstract: Artificial intelligence could help to protect victims of domestic violence, according to research by Jeffrey Grogger, Ria Ivandic and Tom Kirchmaier.
    Keywords: covid-19,domestic abuse,crime
    Date: 2020–07
  19. By: Mesbah, Neda; Tauchert, Christoph; Buxmann, Peter
    Date: 2021–01–05
  20. By: Nan Hu; Jian Li; Alexis Meyer-Cirkel
    Abstract: We compared the predictive performance of a series of machine learning and traditional methods for monthly CDS spreads, using firms’ accounting-based, market-based and macroeconomic variables for the period 2006 to 2016. We find that ensemble machine learning methods (Bagging, Gradient Boosting and Random Forest) strongly outperform other estimators, and Bagging particularly stands out in terms of accuracy. Traditional credit risk models using OLS techniques have the lowest out-of-sample prediction accuracy. The results suggest that non-linear machine learning methods, especially the ensemble methods, add considerable value to existing credit risk prediction accuracy and enable CDS shadow pricing for companies missing those securities.
    Keywords: Credit default swap; Machine learning; Credit risk; Credit ratings; Stock markets; firm; default probability; failure intensity; firm size proxy
    Date: 2019–12–27
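The headline comparison above can be mimicked on synthetic data (illustrative only — the paper uses firms' accounting, market and macro variables, none of which appear here): a bagged-tree ensemble captures non-linear structure that an OLS-style linear model misses entirely.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(1500, 2))
# a purely interactive "spread": no linear term for OLS to latch onto
y = X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=1500)
Xtr, Xte, ytr, yte = X[:1000], X[1000:], y[:1000], y[1000:]

ols = LinearRegression().fit(Xtr, ytr)
bag = BaggingRegressor(n_estimators=100, random_state=0).fit(Xtr, ytr)

mse_ols = np.mean((ols.predict(Xte) - yte) ** 2)
mse_bag = np.mean((bag.predict(Xte) - yte) ** 2)
```

On this data the bagged trees achieve a much lower out-of-sample MSE than OLS, mirroring (in spirit, not magnitude) the ranking the paper reports for CDS spread prediction.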
  21. By: Anna Baiardi; Andrea A. Naghi
    Abstract: A new and rapidly growing econometric literature is making advances in the problem of using machine learning methods for causal inference questions. Yet, the empirical economics literature has not started to fully exploit the strengths of these modern methods. We revisit influential empirical studies with causal machine learning methods and identify several advantages of using these techniques. We show that these advantages and their implications are empirically relevant and that the use of these methods can improve the credibility of causal analysis.
    Date: 2021–01
  22. By: Zhichao Wang (School of Economics and Centre for Efficiency and Productivity Analysis (CEPA) at The University of Queensland, Australia); Valentin Zelenyuk (School of Economics and Centre for Efficiency and Productivity Analysis (CEPA) at The University of Queensland, Australia)
    Abstract: Research about the productivity and efficiency of hospitals in providing healthcare services has developed substantially in the last few decades. How has this topic developed in Australia and in its peer countries and regions that share a similar healthcare system? In this article, we conduct a systematic review and a series of bibliometric analyses of the research about the efficiency of hospitals, which are the core organizations in the healthcare system, in order to obtain a broad perspective of this topic in Australia and its peers. Among other methods, a random forests model was trained to evaluate the impact of features of an article on the scientific influence of the research. We used bibliometric data in Scopus from 1970 to 2020 and extracted the review pool by a peer-review process. Besides identifying the productive authors and most cited publication sources, the bibliometric analysis also indicated a shifting of topics over time. Through the training process of the random forests classification model, the most influential features of an article were also identified.
    Keywords: Performance analysis, efficiency, Australia, hospital, systematic review, bibliometric analysis, random forests, machine learning
    JEL: C14 C61 D24 I11
    Date: 2021–01
  23. By: Benetos, Emmanouil (Queen Mary University of London and The Alan Turing Institute.); Ragano, Alessandro (University College Dublin.); Sgroi, Daniel (University of Warwick, ESRC CAGE Centre and IZA Bonn.); Tuckwell, Anthony (University of Warwick and ESRC CAGE Centre.)
    Abstract: We propose a new measure for national happiness based on the emotional content of a country’s most popular songs. Using machine learning to detect the valence of the UK’s chart-topping song of each year since the 1970s, we find that it reliably predicts the leading survey-based measure of life satisfaction. Moreover, we find that music valence is better able to predict life satisfaction than a recently-proposed measure of happiness based on the valence of words in books (Hills et al., 2019). Our results have implications for the role of music in society, and at the same time validate a new use of music as a measure of public sentiment.
    Keywords: subjective wellbeing; life satisfaction; national happiness; music information retrieval; machine learning
    JEL: N30 Z11 Z13
    Date: 2021
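The core empirical check here — does a yearly song-valence series track the survey-based life-satisfaction series? — can be illustrated with a plain Pearson correlation. The two series below are made-up numbers for illustration, not the paper's data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical yearly series (invented, not the paper's data):
valence      = [0.40, 0.55, 0.35, 0.60, 0.50]  # estimated song valence
satisfaction = [6.9,  7.3,  6.7,  7.4,  7.1]   # survey life satisfaction

r = pearson(valence, satisfaction)
```

A correlation near 1 on such toy data would correspond to the paper's claim that valence "reliably predicts" the survey measure; the paper itself uses a proper predictive model rather than a raw correlation.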
  24. By: Vladimir Vargas-Calderón; Jorge E. Camargo
    Abstract: In many countries, real estate appraisal rests on conventional methods that rely on appraisers' abilities to collect data, interpret it, and model the price of a property. With the growing use of online real estate platforms and the large amount of information found therein, it becomes possible to overcome many drawbacks of conventional pricing models, such as subjectivity, cost, and unfairness. In this paper we propose a data-driven real estate pricing model based on machine learning methods that estimates prices while reducing human bias. We test the model on 178,865 flat listings from Bogotá collected from 2016 to 2020. Results show that the proposed state-of-the-art model is robust and accurate in estimating real estate prices. This case study serves as an incentive for local governments in developing countries to discuss and build real estate pricing models based on large data sets, increasing fairness for all real estate market stakeholders and reducing price speculation.
    Date: 2020–11
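One of the simplest data-driven pricing baselines such a study might compare against is a nearest-neighbour estimator: price a flat at the average of its most similar listings. The listings, features, and prices below are invented for illustration and are not from the Bogotá data set.

```python
def knn_price(listings, query, k=3):
    """Estimate a flat's price as the mean price of its k most similar
    listings, ranked by squared Euclidean distance over feature vectors."""
    ranked = sorted(listings,
                    key=lambda item: sum((a - b) ** 2
                                         for a, b in zip(item[0], query)))
    neighbours = ranked[:k]
    return sum(price for _, price in neighbours) / k

# Hypothetical listings: ((area_m2, rooms), price in millions of COP).
listings = [
    ((50, 2), 180), ((55, 2), 195), ((60, 3), 220),
    ((90, 3), 330), ((95, 4), 350), ((100, 4), 365),
]

estimate = knn_price(listings, (58, 2), k=3)
```

The query flat's three nearest listings are the small ones, so the estimate lands in their price range rather than near the large, expensive flats.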
  25. By: Rauh, C.; Renée, L.
    Abstract: In this paper we measure parenting behavior through unsupervised machine learning in a panel following children from ages 5 to 29 months. The algorithm classifies parents into two distinct behavioral types: "active" and "laissez-faire". Parents of the active type tend to respond to their children's expressions and describe features of the environment to them, while parents of the laissez-faire type are less likely to engage with their children. We find that parents' types are persistent over time and systematically related to socio-economic characteristics. Moreover, children of active parents see their human capital improve relative to children of laissez-faire parents.
    Keywords: Parenting styles, human capital, latent Dirichlet allocation, inequality, machine learning
    Date: 2021–01–22
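The paper uses latent Dirichlet allocation to recover the two types. As a much simpler stand-in for the idea of discovering behavioural types without labels, here is a two-cluster k-means sketch on made-up interaction-count vectors (k-means is an illustrative substitute, not the paper's method, and the data are invented).

```python
import random

def kmeans(points, k=2, iters=20, seed=0):
    """Plain Lloyd's algorithm: assign points to nearest center,
    recompute centers, repeat; returns a cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            groups[j].append(p)
        centers = [tuple(sum(dim) / len(g) for dim in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return [min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points]

# Made-up vectors: (responses to child per hour, descriptions per hour).
parents = [(9, 7), (8, 8), (10, 6),   # "active"-like behaviour
           (1, 2), (2, 1), (1, 1)]    # "laissez-faire"-like behaviour

labels = kmeans(parents, k=2)
```

On well-separated toy data the two behavioural groups fall into distinct clusters, mirroring the two-type classification the paper reports.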
  26. By: James Chapman; Ajit Desai
    Abstract: The COVID-19 pandemic and the resulting public health mitigation measures have caused large-scale economic disruptions globally. During this time, there is an increased need to predict the macroeconomy's short-term dynamics to ensure the effective implementation of fiscal and monetary policy. However, economic prediction during a crisis is challenging because the unprecedented economic impact undermines the reliability of traditional linear models estimated on lagged data. We help address these challenges by using timely retail payments system data in linear and nonlinear machine learning models. We find that, compared to a benchmark, our model achieves a roughly 15 to 45% reduction in root mean square error when used for macroeconomic nowcasting during the global financial crisis. For nowcasting during the COVID-19 shock, our model predictions are much closer to the official estimates.
    Keywords: Econometric and statistical methods; Payment clearing and settlement systems
    JEL: C55 E52
    Date: 2021–01
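The headline metric here, "a 15 to 45% reduction in root mean square error", is just the percent drop in RMSE relative to a benchmark forecast. A minimal sketch on invented numbers (not the paper's estimates):

```python
import math

def rmse(pred, actual):
    """Root mean square error of a forecast series."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def rmse_reduction_pct(benchmark_pred, model_pred, actual):
    """Percent drop in RMSE of a candidate model relative to a benchmark."""
    b, m = rmse(benchmark_pred, actual), rmse(model_pred, actual)
    return 100.0 * (b - m) / b

# Toy numbers for illustration only:
actual    = [1.0, 2.0, 3.0, 4.0]
benchmark = [2.0, 3.0, 4.0, 5.0]   # off by 1.0 everywhere -> RMSE 1.0
candidate = [1.5, 2.5, 3.5, 4.5]   # off by 0.5 everywhere -> RMSE 0.5

reduction = rmse_reduction_pct(benchmark, candidate, actual)
```

Halving every error halves the RMSE, giving a 50% reduction on this toy series; the paper reports 15 to 45% against its own benchmark.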
  27. By: Ian Burn; Daniel Firoozi; Daniel Ladd; David Neumark
    Abstract: We explore whether ageist stereotypes in job ads are detectable using machine learning methods measuring the linguistic similarity of job-ad language to ageist stereotypes identified by industrial psychologists. We then conduct an experiment to evaluate whether this language is perceived as biased against older workers. We find that language classified by the machine learning algorithm as closely related to ageist stereotypes is perceived as ageist by experimental subjects. The scores assigned to the language related to ageist stereotypes are larger when responses are incentivized by rewarding participants for guessing how other respondents rated the language. These methods could potentially help enforce anti-discrimination laws by using job ads to predict or identify employers more likely to be engaging in age discrimination.
    JEL: J14 J71 K31
    Date: 2021–01
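Measuring the "linguistic similarity of job-ad language to ageist stereotypes" can be illustrated, in its crudest form, with bag-of-words cosine similarity between an ad and a stereotype lexicon. The paper's actual method and lexicon may differ substantially (modern approaches typically use word embeddings); the lexicon and ads below are invented.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists under bag-of-words counts."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical stereotype lexicon and two toy ads (not the study's materials):
stereotype = "set in his ways resistant to change slow with technology".split()
ad_flagged = "we want candidates who are not resistant to change".split()
ad_neutral = "we hire engineers who build reliable data pipelines".split()

s_flagged = cosine_similarity(ad_flagged, stereotype)
s_neutral = cosine_similarity(ad_neutral, stereotype)
```

The ad that echoes stereotype vocabulary scores higher than the one that shares no words with the lexicon, which is the kind of ranking the algorithm's "closely related to ageist stereotypes" classification relies on.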
  28. By: Pierre Durand; Gaëtan Le Quang
    Abstract: Prudential regulation is supposed to strengthen financial stability and banks' resilience to new economic shocks. We tackle this issue by evaluating the impact of leverage, capital, and liquidity ratios on banks' default probability. To this end, we apply logistic regression, random forest classification, and artificial neural networks to US and European samples over the 2000-2018 period. Our results are based on 4,707 banks in the US and 3,529 banks in Europe, of which 454 and 205, respectively, defaulted. We show that, in the US sample, capital and equity ratios have a strong negative impact on default probability. The liquidity ratio has a positive effect, which can be explained by the low returns associated with liquid assets. Overall, our investigation suggests that fewer prudential rules and a higher leverage ratio should reinforce the banking system's resilience. Because Europe lacks an official list of failed banks, our findings on that sample are harder to interpret.
    Keywords: Banking regulation ; Capital requirements ; Basel III ; Logistic ; Statistical learning classification ; Bankruptcy prediction models.
    JEL: C44 G21 G28
    Date: 2021
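The simplest of the three classifiers the paper uses, logistic regression, can be sketched in a few lines of pure Python. The capital-ratio and default data below are made up solely to reproduce the qualitative finding (higher capital, lower default probability); they are not the paper's sample.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_1d(x, y, lr=0.5, epochs=5000):
    """One-feature logistic regression fitted by batch gradient descent
    on the log-likelihood; returns slope w and intercept b."""
    w, b = 0.0, 0.0
    n = len(y)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * xi + b) - yi) * xi for xi, yi in zip(x, y)) / n
        grad_b = sum((sigmoid(w * xi + b) - yi) for xi, yi in zip(x, y)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Invented data: capital ratio vs. default indicator (1 = failed).
capital_ratio = [0.02, 0.03, 0.04, 0.10, 0.12, 0.15]
default       = [1,    1,    1,    0,    0,    0]

w, b = fit_logistic_1d(capital_ratio, default)
```

A negative fitted slope w means the predicted default probability falls as the capital ratio rises, the sign of the effect the paper reports for the US sample.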
  29. By: Oz Shy
    Abstract: Applying machine learning techniques to consumer diary survey data, the author of this working paper examines methods for studying consumer payment choice. These techniques, especially when paired with regression analyses, provide useful information for understanding and predicting the payment choices consumers make.
    Keywords: studying consumer payment choice; point of sale; statistical learning; machine learning
    JEL: C19 E42
    Date: 2020–06–23
  30. By: Ferrari, Massimo; Le Mezo, Helena
    Abstract: This paper proposes a new methodology based on textual analysis to forecast U.S. recessions. Specifically, the paper develops an index in the spirit of Baker et al. (2016) and Caldara and Iacoviello (2018) that tracks developments in U.S. real activity. When used in a standard recession probability model, the index outperforms the yield-curve-based forecast, a standard method of forecasting recessions, at medium horizons of up to 8 months. Moreover, the index contains information, not included in yield data, that is useful for understanding recession episodes. When included as an additional control alongside the slope of the yield curve, it improves forecast accuracy by 5% to 30% depending on the horizon. These results are robust to a number of checks, including changes to the estimation method, the definition of recessions, and controlling for asset purchases by major central banks. Yield and textual analysis data also outperform other popular leading indicators of the U.S. business cycle, such as PMIs, consumer surveys, and employment data.
    Keywords: forecast, textual analysis, U.S. recessions
    JEL: E17 E47 E37 C25 C53
    Date: 2021–01
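Baker-et-al.-style newspaper indices, in their simplest form, track the share of articles containing terms from a tracked vocabulary. The paper's index is more elaborate, but the basic mechanics can be sketched as follows, on an invented toy corpus:

```python
def activity_index(articles, terms):
    """Share (in percent) of articles containing at least one tracked term --
    the simplest form of a keyword-based textual activity index."""
    vocab = {t.lower() for t in terms}
    hits = sum(1 for text in articles if vocab & set(text.lower().split()))
    return 100.0 * hits / len(articles)

# Toy corpus of invented headlines:
articles = [
    "factory output contracts as layoffs spread",
    "markets rally on strong earnings",
    "recession fears grow among manufacturers",
    "new stadium opens downtown",
]

idx = activity_index(articles, {"recession", "layoffs", "contracts"})
```

Two of the four toy headlines contain tracked terms, so the index reads 50; a monthly series of such readings is the kind of input a recession probability model can take alongside the yield-curve slope.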
  31. By: Merino Troncoso, Carlos
    Abstract: In this chapter I review the main methodologies used in economics for demand estimation, focusing on recent trends such as the structural approach and machine learning techniques. The literature is extensive, so space limitations allow only a summarized view of each theory; the interested reader will find a comprehensive bibliography at the end of the chapter for extensions and examples. A further difficulty is that economics rests heavily on mathematics, statistics, and econometrics, which cannot themselves be reviewed within this chapter; I therefore refer to specific texts, and an appendix gives the reader a brief summary of the main concepts. Demand is usually the first step in the study of a market: intuitively, suppliers only start production when they identify consumer interest in a particular good. All the models reviewed try to address the problems that have traditionally hampered demand estimation: identification, endogeneity, and simultaneity. There is no perfect solution to them; each model has its advantages and limitations and rests on assumptions that are often unrealistic, so any model is in all cases only an approximation of demand.
    Keywords: Demand, Consumer, Elasticity, Random Coefficient Model
    JEL: L40 L44
    Date: 2021–01–07
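The simplest starting point such a survey builds from is the log-log demand regression, whose slope is the own-price elasticity. A minimal sketch on synthetic data generated with a known elasticity of -1.5 (the structural and machine learning methods the chapter reviews go well beyond this, and say nothing about identification or endogeneity):

```python
import math

def ols_slope(x, y):
    """Slope of a simple (one-regressor) OLS fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

# Synthetic demand data generated from q = 200 * p**(-1.5),
# so the true own-price elasticity is -1.5 by construction.
prices = [1.0, 1.5, 2.0, 2.5, 3.0]
quantities = [200.0 * p ** -1.5 for p in prices]

elasticity = ols_slope([math.log(p) for p in prices],
                       [math.log(q) for q in quantities])
```

Because log q is exactly linear in log p here, OLS recovers the elasticity precisely; with real data, the identification, endogeneity, and simultaneity problems the chapter discusses are exactly what make this naive regression unreliable.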

This nep-cmp issue is ©2021 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.