nep-big New Economics Papers
on Big Data
Issue of 2017‒08‒27
ten papers chosen by
Tom Coupé
University of Canterbury

  1. Inside Job or Deep Impact? Using Extramural Citations to Assess Economic Scholarship By Joshua Angrist; Pierre Azoulay; Glenn Ellison; Ryan Hill; Susan Feng Lu
  2. Fake News in Social Networks By Christoph Aymanns; Jakob Foerster; Co-Pierre Georg
  3. Heterogeneous Employment Effects of Job Search Programmes: A Machine Learning Approach By Knaus, Michael C.; Lechner, Michael; Strittmatter, Anthony
  4. Improving the forecasts of European regional banks' profitability with machine learning algorithms By Haskamp, Ulrich
  5. Forecasting day-ahead electricity prices in Europe: the importance of considering market integration By Jesus Lago; Fjo De Ridder; Peter Vrancx; Bart De Schutter
  6. Importance of the long-term seasonal component in day-ahead electricity price forecasting revisited: Neural network models By Grzegorz Marcjasz; Bartosz Uniejewski; Rafal Weron
  7. On the Use of the Lasso for Instrumental Variables Estimation with Some Invalid Instruments By Windmeijer, F.; Farbmacher, H.; Davies, N.; Davey Smith, G.
  8. DGM: A deep learning algorithm for solving partial differential equations By Justin Sirignano; Konstantinos Spiliopoulos
  9. Heterogeneous Employment Effects of Job Search Programmes: A Machine Learning Approach By Knaus, Michael C.; Lechner, Michael; Strittmatter, Anthony
  10. Special Report: The Potential of Wastewater Testing for Public Health and Safety By Aparna Keshaviah; Editor

  1. By: Joshua Angrist; Pierre Azoulay; Glenn Ellison; Ryan Hill; Susan Feng Lu
    Abstract: Does academic economic research produce material of scientific value, or are academic economists writing only for clients and peers? Is economics scholarship uniquely insular? We address these questions by quantifying interactions between economics and other disciplines. Changes in the impact of economic scholarship are measured here by the way other disciplines cite us. We document a clear rise in the extramural influence of economic research, while also showing that economics is increasingly likely to reference other social sciences. A breakdown of extramural citations by economics fields shows broad field impact. Differentiating between theoretical and empirical papers classified using machine learning, we see that much of the rise in economics’ extramural influence reflects growth in citations to empirical work. This parallels a growing share of empirical cites within economics. At the same time, the disciplines of computer science and operations research are mostly influenced by economic theory.
    JEL: A11 A12 A13 A14 B41 C18
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:23698&r=big
  2. By: Christoph Aymanns; Jakob Foerster; Co-Pierre Georg
    Abstract: We model the spread of news as a social learning game on a network. Agents can either endorse or oppose a claim made in a piece of news, which itself may be either true or false. Agents base their decision on a private signal and their neighbors' past actions. Given these inputs, agents follow strategies derived via multi-agent deep reinforcement learning and receive utility from acting in accordance with the veracity of claims. Our framework yields strategies with agent utility close to a theoretical, Bayes optimal benchmark, while remaining flexible to model re-specification. Optimized strategies allow agents to correctly identify most false claims, when all agents receive unbiased private signals. However, an adversary's attempt to spread fake news by targeting a subset of agents with a biased private signal can be successful. Even more so when the adversary has information about agents' network position or private signal. When agents are aware of the presence of an adversary they re-optimize their strategies in the training stage and the adversary's attack is less effective. Hence, exposing agents to the possibility of fake news can be an effective way to curtail the spread of fake news in social networks. Our results also highlight that information about the users' private beliefs and their social network structure can be extremely valuable to adversaries and should be well protected.
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1708.06233&r=big
  3. By: Knaus, Michael C.; Lechner, Michael; Strittmatter, Anthony
    Abstract: We systematically investigate the effect heterogeneity of job search programmes for unemployed workers. To investigate possibly heterogeneous employment effects, we combine non-experimental causal empirical models with Lasso-type estimators. The empirical analyses are based on rich administrative data from Swiss social security records. We find considerable heterogeneities only during the first six months after the start of training. Consistent with previous results of the literature, unemployed persons with fewer employment opportunities profit more from participating in these programmes. Furthermore, we also document heterogeneous employment effects by residence status. Finally, we show the potential of easy-to-implement programme participation rules for improving average employment effects of these active labour market programmes.
    Keywords: active labour market policy; conditional average treatment effects; individualized treatment effects; Machine Learning
    JEL: C21 H43 J68
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:12224&r=big
  4. By: Haskamp, Ulrich
    Abstract: Regional banks such as savings and cooperative banks are widespread in continental Europe. In the aftermath of the financial crisis, however, they had problems maintaining their profitability, which is an important quantitative indicator for the health of a bank and the banking sector overall. We use a large data set of bank-level balance sheet items and regional economic variables to forecast profitability for about 2,000 regional banks. Machine learning algorithms are able to beat traditional estimators such as ordinary least squares as well as autoregressive models in forecasting performance.
    Keywords: profitability; regional banking; forecasting; machine learning
    JEL: C53 G21
    Date: 2017
    URL: http://d.repec.org/n?u=RePEc:zbw:rwirep:705&r=big
  5. By: Jesus Lago; Fjo De Ridder; Peter Vrancx; Bart De Schutter
    Abstract: Motivated by the increasing integration among electricity markets, in this paper we propose three different methods to incorporate market integration in electricity price forecasting and to improve the predictive performance. First, we propose a deep neural network that considers features from connected markets to improve the predictive accuracy in a local market. To measure the importance of these features, we propose a novel feature selection algorithm that, by using Bayesian optimization and functional analysis of variance, analyzes the effect of the features on the algorithm performance. In addition, using market integration, we propose a second model that, by simultaneously predicting prices from two markets, improves even further the forecasting accuracy. Finally, we present a third model to predict the probability of price spikes; then, we use it as an input in the other two forecasters to detect spikes. As a case study, we consider the electricity market in Belgium and the improvements in forecasting accuracy when using various French electricity features. In detail, we show that the three proposed models lead to improvements that are statistically significant. Particularly, due to market integration, predictive accuracy is improved from 15.7% to 12.5% sMAPE (symmetric mean absolute percentage error). In addition, we also show that the proposed feature selection algorithm is able to perform a correct assessment, i.e. to discard the irrelevant features.
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1708.07061&r=big
  6. By: Grzegorz Marcjasz; Bartosz Uniejewski; Rafal Weron
    Abstract: In day-ahead electricity price forecasting the daily and weekly seasonalities are always taken into account, but the long-term seasonal component was believed to add unnecessary complexity and was ignored in most studies. The recent introduction of the Seasonal Component AutoRegressive (SCAR) modeling framework has changed this viewpoint. However, the latter is based on linear models estimated using Ordinary Least Squares. Here we show that considering non-linear neural network-type models with the same inputs as the corresponding SCAR model can lead to a yet better performance. While individual Seasonal Component Artificial Neural Network (SCANN) models are generally worse than the corresponding SCAR-type structures, we provide empirical evidence that committee machines of SCANN networks can significantly outperform the latter.
    Keywords: Electricity spot price; Forecasting; Day-ahead market; Long-term seasonal component; Neural network; Committee machine
    JEL: C14 C22 C45 C51 C53 Q47
    Date: 2017–07–29
    URL: http://d.repec.org/n?u=RePEc:wuu:wpaper:hsc1703&r=big
  7. By: Windmeijer, F.; Farbmacher, H.; Davies, N.; Davey Smith, G.
    Abstract: We investigate the behaviour of the Lasso for selecting invalid instruments in linear instrumental variables models for estimating causal effects of exposures on outcomes, as proposed recently by Kang, Zhang, Cai and Small (2016, Journal of the American Statistical Association). Invalid instruments are such that they fail the exclusion restriction and enter the model as explanatory variables. We show that for this setup, the Lasso may not select all invalid instruments in large samples if they are relatively strong. Consistent selection also depends on the correlation structure of the instruments. We propose a median estimator that is consistent when less than 50% of the instruments are invalid, but its consistency does not depend on the relative strength of the instruments or their correlation structure. This estimator can therefore be used for adaptive Lasso estimation. The methods are applied to a Mendelian randomisation study to estimate the causal effect of BMI on diastolic blood pressure using data on individuals from the UK Biobank, with 96 single nucleotide polymorphisms as potential instruments for BMI.
    Keywords: causal inference; instrumental variables estimation; invalid instruments; Lasso; Mendelian randomisation
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:yor:hectdg:17/22&r=big
  8. By: Justin Sirignano; Konstantinos Spiliopoulos
    Abstract: High-dimensional PDEs have been a longstanding computational challenge. We propose a deep learning algorithm similar in spirit to Galerkin methods, using a deep neural network instead of linear combinations of basis functions. The PDE is approximated with a deep neural network, which is trained on random batches of spatial points to satisfy the differential operator and boundary conditions. The algorithm is mesh-less, which is key since meshes become infeasible in higher dimensions. Instead of forming a mesh, sequences of spatial points are randomly sampled. We implement the approach for American options (a type of free-boundary PDE which is widely used in finance) in up to 100 dimensions. We call the algorithm a "Deep Galerkin Method (DGM)".
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1708.07469&r=big
  9. By: Knaus, Michael C.; Lechner, Michael; Strittmatter, Anthony
    Abstract: We systematically investigate the effect heterogeneity of job search programmes for unemployed workers. To investigate possibly heterogeneous employment effects, we combine non-experimental causal empirical models with Lasso-type estimators. The empirical analyses are based on rich administrative data from Swiss social security records. We find considerable heterogeneities only during the first six months after the start of training. Consistent with previous results of the literature, unemployed persons with fewer employment opportunities profit more from participating in these programmes. Furthermore, we also document heterogeneous employment effects by residence status. Finally, we show the potential of easy-to-implement programme participation rules for improving average employment effects of these active labour market programmes.
    Keywords: Machine learning, individualized treatment effects, conditional average treatment effects, active labour market policy
    JEL: J68 H43 C21
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:usg:econwp:2017:11&r=big
  10. By: Aparna Keshaviah; Editor
    Abstract: This report synthesizes research and recommendations from Mathematica’s symposium on “The Potential of Wastewater Testing for Public Health and Safety.”
    Keywords: wastewater, testing, substance use, opioids, Arnold Foundation, Advanced analytics, Machine learning, Public health, public safety
    JEL: I
    URL: http://d.repec.org/n?u=RePEc:mpr:mprres:5a867fbc382040a1af74f957b565fd98&r=big
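
Editor's note on item 5: the abstract reports forecasting accuracy as sMAPE (symmetric mean absolute percentage error) but does not spell out the formula. The sketch below assumes one common definition, the mean of |y - f| / ((|y| + |f|) / 2) expressed in percent; the paper itself may use a different variant. The function name `smape` and the toy prices are hypothetical and serve only as illustration.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.

    Assumed definition (one of several in use):
        100 / N * sum(|y - f| / ((|y| + |f|) / 2))
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")

    total = 0.0
    for y, f in zip(actual, forecast):
        denom = (abs(y) + abs(f)) / 2.0
        # Convention chosen here: a term with y == f == 0 contributes zero error.
        if denom != 0.0:
            total += abs(y - f) / denom
    return 100.0 * total / len(actual)


# Toy day-ahead prices (hypothetical numbers, not taken from the paper).
prices = [40.0, 55.0, 60.0]
predicted = [42.0, 50.0, 63.0]
print(round(smape(prices, predicted), 2))
```

Lower values indicate better forecasts, so the change from 15.7% to 12.5% reported in the abstract is an improvement.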

This nep-big issue is ©2017 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.