on Big Data |
By: | Joshua Angrist; Pierre Azoulay; Glenn Ellison; Ryan Hill; Susan Feng Lu |
Abstract: | Does academic economic research produce material of scientific value, or are academic economists writing only for clients and peers? Is economics scholarship uniquely insular? We address these questions by quantifying interactions between economics and other disciplines. Changes in the impact of economic scholarship are measured here by the way other disciplines cite us. We document a clear rise in the extramural influence of economic research, while also showing that economics is increasingly likely to reference other social sciences. A breakdown of extramural citations by economics fields shows broad field impact. Differentiating between theoretical and empirical papers classified using machine learning, we see that much of the rise in economics’ extramural influence reflects growth in citations to empirical work. This parallels a growing share of empirical cites within economics. At the same time, the disciplines of computer science and operations research are mostly influenced by economic theory. |
JEL: | A11 A12 A13 A14 B41 C18 |
Date: | 2017–08 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:23698&r=big |
By: | Christoph Aymanns; Jakob Foerster; Co-Pierre Georg |
Abstract: | We model the spread of news as a social learning game on a network. Agents can either endorse or oppose a claim made in a piece of news, which itself may be either true or false. Agents base their decision on a private signal and their neighbors' past actions. Given these inputs, agents follow strategies derived via multi-agent deep reinforcement learning and receive utility from acting in accordance with the veracity of claims. Our framework yields strategies with agent utility close to a theoretical, Bayes optimal benchmark, while remaining flexible to model re-specification. Optimized strategies allow agents to correctly identify most false claims when all agents receive unbiased private signals. However, an adversary's attempt to spread fake news by targeting a subset of agents with a biased private signal can be successful, even more so when the adversary has information about agents' network position or private signal. When agents are aware of the presence of an adversary, they re-optimize their strategies in the training stage and the adversary's attack is less effective. Hence, exposing agents to the possibility of fake news can be an effective way to curtail the spread of fake news in social networks. Our results also highlight that information about the users' private beliefs and their social network structure can be extremely valuable to adversaries and should be well protected. |
Date: | 2017–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1708.06233&r=big |
By: | Knaus, Michael C.; Lechner, Michael; Strittmatter, Anthony |
Abstract: | We systematically investigate the effect heterogeneity of job search programmes for unemployed workers. To investigate possibly heterogeneous employment effects, we combine non-experimental causal empirical models with Lasso-type estimators. The empirical analyses are based on rich administrative data from Swiss social security records. We find considerable heterogeneities only during the first six months after the start of training. Consistent with previous results of the literature, unemployed persons with fewer employment opportunities profit more from participating in these programmes. Furthermore, we also document heterogeneous employment effects by residence status. Finally, we show the potential of easy-to-implement programme participation rules for improving average employment effects of these active labour market programmes. |
Keywords: | active labour market policy; conditional average treatment effects; individualized treatment effects; Machine Learning |
JEL: | C21 H43 J68 |
Date: | 2017–08 |
URL: | http://d.repec.org/n?u=RePEc:cpr:ceprdp:12224&r=big |
By: | Haskamp, Ulrich |
Abstract: | Regional banks such as savings and cooperative banks are widespread in continental Europe. In the aftermath of the financial crisis, however, they had problems maintaining their profitability, which is an important quantitative indicator of the health of a bank and the banking sector overall. We use a large data set of bank-level balance sheet items and regional economic variables to forecast profitability for about 2,000 regional banks. Machine learning algorithms are able to beat traditional estimators such as ordinary least squares as well as autoregressive models in forecasting performance. |
Keywords: | profitability, regional banking, forecasting, machine learning |
JEL: | C53 G21 |
Date: | 2017 |
URL: | http://d.repec.org/n?u=RePEc:zbw:rwirep:705&r=big |
By: | Jesus Lago; Fjo De Ridder; Peter Vrancx; Bart De Schutter |
Abstract: | Motivated by the increasing integration among electricity markets, in this paper we propose three different methods to incorporate market integration in electricity price forecasting and to improve the predictive performance. First, we propose a deep neural network that considers features from connected markets to improve the predictive accuracy in a local market. To measure the importance of these features, we propose a novel feature selection algorithm that, by using Bayesian optimization and functional analysis of variance, analyzes the effect of the features on the algorithm performance. In addition, using market integration, we propose a second model that, by simultaneously predicting prices from two markets, further improves the forecasting accuracy. Finally, we present a third model to predict the probability of price spikes; then, we use it as an input in the other two forecasters to detect spikes. As a case study, we consider the electricity market in Belgium and the improvements in forecasting accuracy when using various French electricity features. In detail, we show that the three proposed models lead to improvements that are statistically significant. In particular, due to market integration, predictive accuracy is improved from 15.7% to 12.5% sMAPE (symmetric mean absolute percentage error). In addition, we show that the proposed feature selection algorithm is able to perform a correct assessment, i.e. to discard the irrelevant features. |
Date: | 2017–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1708.07061&r=big |
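Note on the sMAPE metric quoted in the abstract above: the abstract does not state which variant of sMAPE the authors use. The sketch below assumes one common definition, the mean of |actual - forecast| divided by the average of |actual| and |forecast|, expressed in percent. The function name `smape` and the handling of the zero-denominator case are illustrative assumptions, not taken from the paper.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.

    Assumed definition (one common variant, not confirmed by the abstract):
        100 / N * sum(|a - f| / ((|a| + |f|) / 2))
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        if denom == 0.0:
            # Both values are zero: treat the term as zero error (assumption).
            continue
        total += abs(a - f) / denom
    return 100.0 * total / len(actual)
```

Under this definition a lower value means a better forecast, which is why the move from 15.7% to 12.5% reported in the abstract is an improvement.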
By: | Grzegorz Marcjasz; Bartosz Uniejewski; Rafal Weron |
Abstract: | In day-ahead electricity price forecasting the daily and weekly seasonalities are always taken into account, but the long-term seasonal component was believed to add unnecessary complexity and was ignored in most studies. The recent introduction of the Seasonal Component AutoRegressive (SCAR) modeling framework has changed this viewpoint. However, the latter is based on linear models estimated using Ordinary Least Squares. Here we show that considering non-linear neural network-type models with the same inputs as the corresponding SCAR model can lead to even better performance. While individual Seasonal Component Artificial Neural Network (SCANN) models are generally worse than the corresponding SCAR-type structures, we provide empirical evidence that committee machines of SCANN networks can significantly outperform the latter. |
Keywords: | Electricity spot price; Forecasting; Day-ahead market; Long-term seasonal component; Neural network; Committee machine |
JEL: | C14 C22 C45 C51 C53 Q47 |
Date: | 2017–07–29 |
URL: | http://d.repec.org/n?u=RePEc:wuu:wpaper:hsc1703&r=big |
By: | Windmeijer, F.; Farbmacher, H.; Davies, N.; Davey Smith, G. |
Abstract: | We investigate the behaviour of the Lasso for selecting invalid instruments in linear instrumental variables models for estimating causal effects of exposures on outcomes, as proposed recently by Kang, Zhang, Cai and Small (2016, Journal of the American Statistical Association). Invalid instruments are those that fail the exclusion restriction and enter the model as explanatory variables. We show that for this setup, the Lasso may not select all invalid instruments in large samples if they are relatively strong. Consistent selection also depends on the correlation structure of the instruments. We propose a median estimator that is consistent when less than 50% of the instruments are invalid, but its consistency does not depend on the relative strength of the instruments or their correlation structure. This estimator can therefore be used for adaptive Lasso estimation. The methods are applied to a Mendelian randomisation study to estimate the causal effect of BMI on diastolic blood pressure using data on individuals from the UK Biobank, with 96 single nucleotide polymorphisms as potential instruments for BMI. |
Keywords: | causal inference; instrumental variables estimation; invalid instruments; Lasso; Mendelian randomisation |
Date: | 2017–08 |
URL: | http://d.repec.org/n?u=RePEc:yor:hectdg:17/22&r=big |
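Note on the median estimator mentioned in the abstract above: the abstract does not spell out the construction. The sketch below assumes the usual form of such an estimator, in which each instrument gives its own ratio estimate (reduced-form coefficient divided by first-stage coefficient) and the final estimate is the median of those ratios. The function name `median_iv_estimate` and its argument names are hypothetical, and the coefficients are taken as given rather than estimated here.

```python
from statistics import median


def median_iv_estimate(reduced_form, first_stage):
    """Median of per-instrument ratio estimates (illustrative sketch).

    reduced_form: coefficients of the outcome on each instrument
    first_stage:  coefficients of the exposure on each instrument
    Both are assumed to be pre-estimated; this is not the paper's code.
    """
    if len(reduced_form) != len(first_stage) or len(reduced_form) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = []
    for big_gamma, small_gamma in zip(reduced_form, first_stage):
        if small_gamma == 0:
            raise ValueError("first-stage coefficient must be non-zero")
        ratios.append(big_gamma / small_gamma)
    # If fewer than half of the instruments are invalid, the median
    # ratio comes from a valid instrument in large samples.
    return median(ratios)
```

For example, with five instruments of which two are invalid, `median_iv_estimate([2.0, 2.0, 2.0, 5.0, 9.0], [1.0, 1.0, 1.0, 1.0, 1.0])` returns the value shared by the three valid instruments.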
By: | Justin Sirignano; Konstantinos Spiliopoulos |
Abstract: | High-dimensional PDEs have been a longstanding computational challenge. We propose a deep learning algorithm similar in spirit to Galerkin methods, using a deep neural network instead of linear combinations of basis functions. The PDE is approximated with a deep neural network, which is trained on random batches of spatial points to satisfy the differential operator and boundary conditions. The algorithm is mesh-less, which is key since meshes become infeasible in higher dimensions. Instead of forming a mesh, sequences of spatial points are randomly sampled. We implement the approach for American options (a type of free-boundary PDE which is widely used in finance) in up to 100 dimensions. We call the algorithm a "Deep Galerkin Method (DGM)". |
Date: | 2017–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1708.07469&r=big |
By: | Knaus, Michael C.; Lechner, Michael; Strittmatter, Anthony |
Abstract: | We systematically investigate the effect heterogeneity of job search programmes for unemployed workers. To investigate possibly heterogeneous employment effects, we combine non-experimental causal empirical models with Lasso-type estimators. The empirical analyses are based on rich administrative data from Swiss social security records. We find considerable heterogeneities only during the first six months after the start of training. Consistent with previous results of the literature, unemployed persons with fewer employment opportunities profit more from participating in these programmes. Furthermore, we also document heterogeneous employment effects by residence status. Finally, we show the potential of easy-to-implement programme participation rules for improving average employment effects of these active labour market programmes. |
Keywords: | Machine learning, individualized treatment effects, conditional average treatment effects, active labour market policy |
JEL: | J68 H43 C21 |
Date: | 2017–08 |
URL: | http://d.repec.org/n?u=RePEc:usg:econwp:2017:11&r=big |
By: | Aparna Keshaviah; Editor |
Abstract: | This report synthesizes research and recommendations from Mathematica’s symposium on “The Potential of Wastewater Testing for Public Health and Safety." |
Keywords: | wastewater, testing, substance use, opioids, Arnold Foundation, Advanced analytics, Machine learning, Public health, public safety |
JEL: | I |
URL: | http://d.repec.org/n?u=RePEc:mpr:mprres:5a867fbc382040a1af74f957b565fd98&r=big |