nep-cmp New Economics Papers
on Computational Economics
Issue of 2019‒03‒18
eleven papers chosen by



  1. Shapley regressions: a framework for statistical inference on machine learning models By Joseph, Andreas
  2. Shapley regressions: A framework for statistical inference on machine learning models By Andreas Joseph
  3. A micro-simulation model of irrigation farms in the southern Murray-Darling Basin By Huong Dinh; Manannan Donoghoe; Neal Hughes; Tim Goesch
  4. Nowcasting Recessions using the SVM Machine Learning Algorithm By Alexander James; Yaser S. Abu-Mostafa; Xiao Qiao
  5. Spatial inequality, geography and economic activity By Sandra Achten; Christian Leßmann
  6. Capital misallocation and secular stagnation By Andrea Caggese; Ander Pérez-Orive
  7. Pro et Contra of Agriculture Land Reform in South Africa: A Policy Brief By Mkhabela, T.; Ntombela, S.; Mazibuko, N.
  8. Derivative of a Conic Problem with a Unique Solution By Enzo Busseti; Walaa M. Moursi; Stephen Boyd
  9. Financial Applications of Gaussian Processes and Bayesian Optimization By Joan Gonzalvez; Edmond Lezmi; Thierry Roncalli; Jiali Xu
  10. 'Whatever it Takes' to Change Belief: Evidence from Twitter By Michael Stiefel; Rémi Vivès
  11. Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data By John M. Abowd; Joelle Abramowitz; Margaret C. Levenstein; Kristin McCue; Dhiren Patki; Trivellore Raghunathan; Ann M. Rodgers; Matthew D. Shapiro; Nada Wasi

  1. By: Joseph, Andreas (Bank of England)
    Abstract: Machine learning models often excel in the accuracy of their predictions but are opaque due to their non-linear and non-parametric structure. This makes statistical inference challenging and disqualifies them from many applications where model interpretability is crucial. This paper proposes the Shapley regression framework as an approach for statistical inference on non-linear or non-parametric models. Inference is performed based on the Shapley value decomposition of a model, a pay-off concept from cooperative game theory. I show that universal approximators from machine learning are estimation consistent and introduce hypothesis tests for individual variable contributions, model bias and parametric functional forms. The inference properties of state-of-the-art machine learning models — like artificial neural networks, support vector machines and random forests — are investigated using numerical simulations and real-world data. The proposed framework is unique in the sense that it is identical to the conventional case of statistical inference on a linear model if the model is linear in parameters. This makes it a well-motivated extension to more general models and strengthens the case for the use of machine learning to inform decisions.
    Keywords: Machine learning; statistical inference; Shapley values; numerical simulations; macroeconomics; time series
    JEL: C45 C52 C71 E47
    Date: 2019–03–08
    URL: http://d.repec.org/n?u=RePEc:boe:boeewp:0784&r=all
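The abstract's linear benchmark can be illustrated in a few lines. This is an informal sketch on synthetic data with independent regressors, not the paper's implementation: for a model linear in parameters, the Shapley attributions reduce to beta_k * (x_k - mean(x_k)), and regressing y on those attributions ("Shapley regression") recovers surrogate coefficients close to one.

```python
# Illustrative sketch (synthetic data, not the paper's code): Shapley
# decomposition of a linear-in-parameters model and the resulting
# "Shapley regression", whose surrogate coefficients should be near 1.
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 3
X = rng.normal(size=(n, k))
beta = np.array([1.5, -2.0, 0.5])
y = X @ beta + 0.1 * rng.normal(size=n)

# Fit the underlying linear model (a stand-in for a generic ML model).
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# For a linear model with independent features, the exact Shapley value of
# feature k at point x is beta_k * (x_k - mean(x_k)).
phi = (X - X.mean(axis=0)) * beta_hat      # n x k matrix of attributions

# Shapley regression: regress y on the attributions (plus an intercept).
Z = np.column_stack([np.ones(n), phi])
coef = np.linalg.lstsq(Z, y, rcond=None)[0]
print(np.round(coef[1:], 2))               # surrogate coefficients, near 1
```

In the linear case this reproduces conventional inference, which is the sense in which the framework extends it to more general models.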
  2. By: Andreas Joseph
    Abstract: Machine learning models often excel in the accuracy of their predictions but are opaque due to their non-linear and non-parametric structure. This makes statistical inference challenging and disqualifies them from many applications where model interpretability is crucial. This paper proposes the Shapley regression framework as an approach for statistical inference on non-linear or non-parametric models. Inference is performed based on the Shapley value decomposition of a model, a pay-off concept from cooperative game theory. I show that universal approximators from machine learning are estimation consistent and introduce hypothesis tests for individual variable contributions, model bias and parametric functional forms. The inference properties of state-of-the-art machine learning models - like artificial neural networks, support vector machines and random forests - are investigated using numerical simulations and real-world data. The proposed framework is unique in the sense that it is identical to the conventional case of statistical inference on a linear model if the model is linear in parameters. This makes it a well-motivated extension to more general models and strengthens the case for the use of machine learning to inform decisions.
    Date: 2019–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1903.04209&r=all
  3. By: Huong Dinh; Manannan Donoghoe; Neal Hughes; Tim Goesch
    Abstract: This paper presents a farm level irrigation microsimulation model of the southern Murray-Darling Basin. The model leverages detailed ABARES survey data to estimate a series of input demand and output supply equations, derived from a normalised quadratic profit function. The parameters from this estimation are then used to simulate the impact on total cost, revenue and profit of a hypothetical 30 per cent increase in the price of water. The model is still under development, with several potential improvements suggested in the conclusion. This is a working paper, provided for the purpose of receiving feedback on the analytical approach to improve future iterations of the microsimulation model.
    Date: 2019–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1903.05781&r=all
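The estimation step cannot be reproduced from the abstract, but the simulation logic can be sketched. With a normalised quadratic profit function, Hotelling's lemma gives the netput supply equations, so a hypothetical 30 per cent water price rise maps directly into changes in water use and profit. All parameters below are illustrative assumptions, not ABARES estimates.

```python
# Stylised sketch of the simulation step: pi(p) = a'p + 0.5 p'Bp, so by
# Hotelling's lemma netput supplies are x(p) = a + Bp (outputs positive,
# inputs negative). Parameters are invented for illustration only.
import numpy as np

# Netputs: [output, water input]; B symmetric and positive semidefinite
# so the profit function is convex in prices.
a = np.array([80.0, -40.0])
B = np.array([[0.5, -0.3],
              [-0.3, 0.4]])

def netputs(p):
    return a + B @ p

def profit(p):
    return a @ p + 0.5 * p @ B @ p

p0 = np.array([10.0, 2.0])            # baseline output and water prices
p1 = p0 * np.array([1.0, 1.3])        # 30 per cent increase in water price

d_profit = profit(p1) - profit(p0)
print(f"water use: {-netputs(p0)[1]:.1f} -> {-netputs(p1)[1]:.1f}")
print(f"profit change: {d_profit:.2f}")
```

Under these assumed parameters the shock reduces both water use and profit, which is the qualitative pattern the microsimulation quantifies farm by farm.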
  4. By: Alexander James; Yaser S. Abu-Mostafa; Xiao Qiao
    Abstract: We introduce a novel application of Support Vector Machines (SVM), an important Machine Learning algorithm, to determine the beginning and end of recessions in real time. Nowcasting, "forecasting" a condition about the present time because the full information about it is not available until later, is key for recessions, which are only determined months after the fact. We show that SVM has excellent predictive performance for this task, and we provide implementation details to facilitate its use in similar problems in economics and finance.
    Date: 2019–02
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1903.03202&r=all
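The nowcasting setup can be sketched as a standard classification problem. The synthetic indicators below are placeholders for the real-time macroeconomic series the paper uses, and the kernel and tuning choices are not the authors'.

```python
# Minimal sketch on synthetic data: an SVM maps monthly indicator readings
# to a recession/expansion label. Features and labels are simulated.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n = 400
# Two synthetic indicators: growth-like signals that run lower in recessions.
recession = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) + np.where(recession[:, None] == 1, -1.5, 1.5)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X[:300], recession[:300])
accuracy = clf.score(X[300:], recession[300:])
print(f"out-of-sample accuracy: {accuracy:.2f}")
```

In the paper's application the labels are the official recession dates, which arrive with a lag, while the features are available in real time; that lag is what makes this a nowcasting rather than a standard classification exercise.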
  5. By: Sandra Achten; Christian Leßmann
    Abstract: We study the effect of spatial inequality on economic activity. Given that the relationship is highly simultaneous in nature, we use exogenous variation in geographic features to construct an instrument for spatial inequality that is independent of man-made factors. Inequality measures and instruments are calculated based on grid-level data for existing countries as well as for artificial countries. In the construction of the instrumental variable, we use both a parametric regression analysis and a random forest classification algorithm. Our IV regressions show a significant negative relationship between spatial inequality and economic activity. This result holds if we control for country-level averages of different geographic variables. Therefore, we conclude that geographic heterogeneity is an important determinant of economic activity.
    Keywords: regional inequality, spatial inequality, economic activity, development, geography, machine learning
    JEL: R12 O15
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:ces:ceswps:_7547&r=all
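The two-step construction, a random-forest first stage on geography followed by an IV regression, can be sketched on synthetic data. The data-generating process and variables below are illustrative assumptions, not the paper's grid-level measures.

```python
# Stylised sketch: predict spatial inequality from geography alone with a
# random forest (using out-of-bag predictions as the instrument), then
# estimate the effect of inequality on activity by IV. Data are simulated;
# the true causal effect is set to -1.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 1000
geography = rng.normal(size=(n, 3))        # e.g. ruggedness, climate spread
u = rng.normal(size=n)                     # unobserved confounder
inequality = geography @ np.array([0.8, -0.5, 0.3]) + 0.5 * u \
    + rng.normal(size=n)
activity = -1.0 * inequality + 1.0 * u + rng.normal(size=n)

# First stage: geography-only prediction (exogenous by assumption).
rf = RandomForestRegressor(n_estimators=200, oob_score=True,
                           random_state=0).fit(geography, inequality)
z = rf.oob_prediction_                     # instrument: predicted inequality

# IV estimate: cov(z, activity) / cov(z, inequality), should be near -1.
beta_iv = np.cov(z, activity)[0, 1] / np.cov(z, inequality)[0, 1]
print(f"IV estimate of the effect: {beta_iv:.2f}")
```

Using out-of-bag predictions keeps each observation's own noise out of its instrument value, which is one simple way to avoid overfitting contaminating the first stage.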
  6. By: Andrea Caggese; Ander Pérez-Orive
    Abstract: Because of low interest rates, the widespread emergence of intangible technologies in recent decades may have significantly hurt output growth, even when these technologies replaced considerably less productive tangible technologies. After a shift toward intangible capital in production, the corporate sector becomes a net saver because intangible capital has a low collateral value. Firms' ability to purchase intangible capital is impaired by low interest rates because low rates slow down the accumulation of savings and increase the price of capital, worsening capital misallocation. Our model simulations reproduce key trends in the U.S. in the period from 1980 to 2015.
    Keywords: Intangible capital, borrowing constraints, capital reallocation, secular stagnation
    JEL: E22 E43 E44
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:upf:upfgen:1637&r=all
  7. By: Mkhabela, T.; Ntombela, S.; Mazibuko, N.
    Abstract: The simulation results presented in this paper provide nuanced policy options for land redistribution in South Africa in the face of the looming expropriation of land without compensation. The simulation was carried out through a Computable General Equilibrium approach using the modified University of Pretoria General Equilibrium Model (UPGEM), which is solved with the GEMPACK software. The simulation revealed that there will be adjustment costs regardless of the option(s) chosen. The Inclusive Scenario emerged as the most suitable policy option in terms of minimal adjustment costs while allowing the sector to continue to grow, albeit at a lower rate than under the status quo.
    Keywords: Land Economics/Use
    Date: 2018–09–25
    URL: http://d.repec.org/n?u=RePEc:ags:aeas18:284782&r=all
  8. By: Enzo Busseti; Walaa M. Moursi; Stephen Boyd
    Abstract: We view a conic optimization problem that has a unique solution as a map from its data to its solution. If sufficient regularity conditions hold at a solution point, namely that the implicit function theorem applies to the normalized residual function of [Busseti et al, 2018], the problem solution map is differentiable. We obtain the derivative, in the form of an abstract linear operator. This applies to any convex optimization problem in conic form, while a previous result [Amos et al, 2016] studied strictly convex quadratic programs. Such differentiable problems can be used, for example, in machine learning, control, and related areas, as a layer in an end-to-end learning and control procedure, for backpropagation. We accompany this note with a lightweight Python implementation which can handle problems with the cone constraints commonly used in practice.
    Date: 2019–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1903.05753&r=all
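The note's general conic machinery is not reproduced here, but the underlying idea, implicit differentiation of the optimality conditions, can be sketched on a special case: an equality-constrained QP, where the KKT system makes the derivative of the solution map explicit.

```python
# Sketch of the implicit-function-theorem idea on an equality-constrained QP
# (a special case of a conic problem): differentiate x*(b) for
#   minimize 0.5 x'Px + q'x  subject to  Ax = b.
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 2
M = rng.normal(size=(n, n))
P = M @ M.T + n * np.eye(n)          # positive definite, so x* is unique
q = rng.normal(size=n)
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

K = np.block([[P, A.T], [A, np.zeros((m, m))]])   # KKT matrix

def solve(b):
    # KKT conditions: [P A'; A 0] [x; nu] = [-q; b]
    sol = np.linalg.solve(K, np.concatenate([-q, b]))
    return sol[:n]

# Implicit function theorem: d[x; nu]/db = K^{-1} [0; I], so dx*/db is the
# top-right n x m block of the KKT inverse.
dxdb = np.linalg.inv(K)[:n, n:]

# Check against central finite differences.
eps = 1e-6
fd = np.column_stack([(solve(b + eps * e) - solve(b - eps * e)) / (2 * eps)
                      for e in np.eye(m)])
print(np.allclose(dxdb, fd, atol=1e-5))  # True
```

The paper's contribution is to make this work for general cone constraints, where the residual map replaces the smooth KKT system used in this sketch.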
  9. By: Joan Gonzalvez; Edmond Lezmi; Thierry Roncalli; Jiali Xu
    Abstract: In the last five years, the financial industry has been impacted by the emergence of digitalization and machine learning. In this article, we explore two methods that have undergone rapid development in recent years: Gaussian processes and Bayesian optimization. Gaussian processes can be seen as a generalization of Gaussian random vectors and are associated with the development of kernel methods. Bayesian optimization is an approach for performing derivative-free global optimization in low dimensions, and uses Gaussian processes to locate the global maximum of a black-box function. The first part of the article reviews these two tools and shows how they are connected. In particular, we focus on Gaussian process regression, which is the core of Bayesian machine learning, and the issue of hyperparameter selection. The second part is dedicated to two financial applications. We first consider the modeling of the term structure of interest rates. More precisely, we test the fitting method and compare the GP prediction with the random walk model. The second application is the construction of trend-following strategies, in particular the online estimation of trend and covariance windows.
    Date: 2019–03
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1903.04841&r=all
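Gaussian process regression, which the article treats as the core tool, can be sketched in plain NumPy. The squared-exponential kernel and fixed hyperparameters below are illustrative choices, not the article's calibration (which selects them by marginal likelihood).

```python
# Minimal GP regression: posterior mean and covariance at test points under a
# squared-exponential kernel with fixed (assumed) hyperparameters.
import numpy as np

def rbf(a, b, length=0.5, variance=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length**2)

rng = np.random.default_rng(4)
x_train = np.linspace(0, 2 * np.pi, 15)
y_train = np.sin(x_train) + 0.05 * rng.normal(size=15)
x_test = np.linspace(0, 2 * np.pi, 50)

noise = 0.05 ** 2
K = rbf(x_train, x_train) + noise * np.eye(15)
K_s = rbf(x_test, x_train)

# Standard GP posterior: mean = K_s K^{-1} y, cov = K_ss - K_s K^{-1} K_s'.
alpha = np.linalg.solve(K, y_train)
mean = K_s @ alpha
cov = rbf(x_test, x_test) - K_s @ np.linalg.solve(K, K_s.T)

err = float(np.max(np.abs(mean - np.sin(x_test))))
print(f"max abs fit error: {err:.3f}")
```

In Bayesian optimization the same posterior mean and variance feed an acquisition function that decides where to evaluate the black-box function next.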
  10. By: Michael Stiefel (Department of Economics, University of Zurich); Rémi Vivès (Aix-Marseille Univ., CNRS, EHESS, Centrale Marseille, AMSE)
    Abstract: The sovereign debt literature emphasizes the possibility of avoiding a self-fulfilling default crisis if markets anticipate that the central bank will act as lender of last resort. This paper investigates the extent to which changes in belief about an intervention of the European Central Bank (ECB) explain the sudden reduction of government bond spreads for the distressed countries in summer 2012. We study Twitter data and extract belief using machine learning techniques. We find evidence of strong increases in the perceived likelihood of ECB intervention and show that those increases explain subsequent decreases in the bond spreads of the distressed countries.
    Keywords: self-fulfilling default crisis, unconventional monetary policy, Twitter data
    JEL: E44 E58 D83 F34
    Date: 2019–02
    URL: http://d.repec.org/n?u=RePEc:aim:wpaimx:1907&r=all
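The belief-extraction step can be sketched as a supervised text classifier. The toy tweets and labels below are invented for illustration; the paper's corpus, labels, and actual classifier are not reproduced, only the general approach.

```python
# Toy sketch: label tweets as expecting / not expecting ECB intervention,
# fit a bag-of-words classifier, and score new tweets. All text is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "draghi says ecb will do whatever it takes",
    "ecb ready to intervene in bond markets",
    "expect ecb bond purchases soon",
    "ecb will act as lender of last resort",
    "no sign the ecb will step in",
    "ecb intervention looks unlikely now",
    "doubt the ecb buys any bonds",
    "ecb stays on the sidelines again",
]
believes_intervention = [1, 1, 1, 1, 0, 0, 0, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(tweets, believes_intervention)

prob = clf.predict_proba(["markets expect the ecb to intervene"])[0, 1]
print(f"P(intervention belief) = {prob:.2f}")
```

Aggregating such scores over time yields a belief series that can then be related to movements in sovereign bond spreads.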
  11. By: John M. Abowd; Joelle Abramowitz; Margaret C. Levenstein; Kristin McCue; Dhiren Patki; Trivellore Raghunathan; Ann M. Rodgers; Matthew D. Shapiro; Nada Wasi
    Abstract: This paper illustrates an application of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across firms is highly asymmetric. To address these difficulties, this paper uses a supervised machine learning model to probabilistically link survey respondents in the Health and Retirement Study (HRS) with employers and establishments in the Census Business Register (BR) to create a new data source which we call the CenHRS. Multiple imputation is used to propagate uncertainty from the linkage step into subsequent analyses of the linked data. The linked data reveal new evidence that survey respondents’ misreporting and selective nonresponse about employer characteristics are systematically correlated with wages.
    Keywords: Probabilistic record linkage; survey data; administrative data; multiple imputation; measurement error; nonresponse
    Date: 2019–03
    URL: http://d.repec.org/n?u=RePEc:cen:wpaper:19-08&r=all
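The linkage problem can be sketched with a simple score-and-threshold matcher. This toy example uses invented employer names and a standard-library string similarity in place of the paper's supervised model; sub-threshold cases are left unlinked, which is the uncertainty the multiple-imputation step then propagates.

```python
# Toy sketch of probabilistic record linkage: score survey-reported employer
# names against register names, keep the best match above a threshold, and
# treat the rest as unlinked. Names are invented.
from difflib import SequenceMatcher

register = ["Acme Manufacturing Inc", "Bayside Medical Center",
            "Cooper Tools LLC"]
reported = ["acme manufactring", "bayside medical ctr", "delta airlines"]

def score(a, b):
    # Simple character-level similarity in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

links = []
for name in reported:
    best = max(register, key=lambda r: score(name, r))
    s = score(name, best)
    links.append((name, best if s >= 0.6 else None, round(s, 2)))

for row in links:
    print(row)
```

A supervised model, as in the paper, replaces the single similarity score with a learned match probability over many comparison features, but the accept/reject/impute logic is analogous.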

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.