NEP: New Economics Papers on Big Data
By: | Peter Tillmann (Justus-Liebig-University Giessen); Andreas Walter (Justus-Liebig-University Giessen) |
Abstract: | The present paper studies the consequences of conflicting narratives for the transmission of monetary policy shocks. We focus on conflict between the presidents of the ECB and the Bundesbank, the main protagonists of monetary policy in the euro area, who often disagreed on policy over the past two decades. This conflict received much attention on financial markets. We use over 900 speeches of both institutions' presidents since 1999 and quantify the tone conveyed in the speeches and the divergence of tone between the two presidents. We find (i) a drop towards more negative tone in 2009 for both institutions and (ii) a large divergence of tone after 2009. ECB communication becomes persistently more optimistic and less uncertain than the Bundesbank's after 2009, and this gap widens after the SMP, OMT and APP announcements. We show that long-term interest rates respond less strongly to a monetary policy shock if ECB-Bundesbank communication is more cacophonous than on average, in which case the ECB loses its ability to drive the slope of the yield curve. The weaker transmission under high divergence reflects a muted adjustment of the expectations component of long-term rates. |
Keywords: | Central bank communication, diverging tones, speeches, text analysis, monetary transmission |
JEL: | E52 E43 E32 |
Date: | 2018 |
URL: | http://d.repec.org/n?u=RePEc:mar:magkse:201820&r=big |
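The tone-quantification step lends itself to a compact illustration. Below is a minimal sketch of dictionary-based tone scoring and a monthly divergence series, assuming toy word lists in place of a real sentiment lexicon (the abstract does not name one) and hypothetical input tuples:

```python
import re
from collections import defaultdict

POSITIVE = {"improve", "strong", "confidence", "recovery", "stable"}  # toy lexicon
NEGATIVE = {"risk", "crisis", "weak", "uncertainty", "decline"}       # toy lexicon

def tone(text):
    """Net tone: (positive - negative) word counts over total words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(len(words), 1)

def monthly_divergence(speeches):
    """speeches: iterable of (month, institution, text); institution is
    'ECB' or 'BuBa'. Returns {month: |mean ECB tone - mean BuBa tone|}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for month, inst, text in speeches:
        buckets[month][inst].append(tone(text))
    mean = lambda xs: sum(xs) / len(xs)
    return {m: abs(mean(b["ECB"]) - mean(b["BuBa"]))
            for m, b in buckets.items() if "ECB" in b and "BuBa" in b}
```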
By: | José Igor Morlanes |
Abstract: | We extend the empirical results published in article "Empirical Evidence on Arbitrage by Changing the Stock Exchange" by means of machine learning and advanced econometric methodologies based on Smooth Transition Regression models and Artificial Neural Networks. |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1806.01070&r=big |
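The abstract only names the model families; as background, the logistic smooth transition regression (LSTAR), the workhorse of the STR family, takes the form below (generic notation, not necessarily the paper's specification):

```latex
% LSTAR: coefficients shift smoothly between two regimes as the transition
% variable s_t crosses the threshold c; \gamma controls the speed of transition.
y_t = \phi_1' x_t + G(s_t;\gamma,c)\,\phi_2' x_t + \varepsilon_t,
\qquad
G(s_t;\gamma,c) = \bigl(1 + e^{-\gamma (s_t - c)}\bigr)^{-1}, \quad \gamma > 0
```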
By: | Chiovelli, Giorgio; Michalopoulos, Stelios; Papaioannou, Elias |
Abstract: | Landmine contamination affects the lives of millions in many conflict-ridden countries long after the cessation of hostilities. Yet, little research exists on its impact on post-conflict recovery. In this study, we explore the economic consequences of landmine clearance in Mozambique, the only country that has moved from "heavily-contaminated" in 1992 to "mine-free" status in 2015. First, we compile a dataset detailing the evolution of clearance, collecting thousands of reports from the numerous demining actors. Second, we exploit the timing of demining to assess its impact on local economic activity, as reflected in satellite images of light density at night. The analysis reveals a moderate positive association that masks sizeable heterogeneity. Economic activity responds strongly to clearance of the transportation network, trade hubs, and more populous areas, while the demining-development association is weak in rural areas of low population density. Third, recognizing that landmine removal reconfigured the accessibility to the transportation infrastructure, we apply a "market-access" approach to quantify both its direct and indirect effects. The market-access estimates reveal substantial improvements in aggregate economic activity. The market-access benefits of demining are also present in localities without any contamination. Fourth, counterfactual policy simulations project considerable gains had the fragmented process of clearance in Mozambique been centrally coordinated, prioritizing clearance of the colonial transportation routes. |
Keywords: | Civil War; infrastructure network; landmines; post-conflict recovery; Trade |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:cpr:ceprdp:13021&r=big |
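The "market-access" approach the authors apply is standard in the trade-infrastructure literature; a common first-order version (notation ours; the paper's exact specification may differ) is:

```latex
% First-order market access of locality i: population N_j of all other
% localities discounted by bilateral travel costs \tau_{ij}, which demining
% lowers; \theta is a trade elasticity.
MA_i \;=\; \sum_{j \neq i} \tau_{ij}^{-\theta}\, N_j
```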
By: | Carlos Pedro Gonçalves |
Abstract: | An artificial agent for predicting financial risk and returns is built with a modular cognitive system comprised of interconnected recurrent neural networks. The agent learns to predict financial returns and to predict the squared deviation around these predicted returns. These two expectations are used to build a volatility-sensitive interval prediction for financial returns, which is evaluated on three major financial indices and shown to predict returns with a success rate above 80% in interval prediction in both training and testing, calling into question the Efficient Market Hypothesis. The agent is introduced as an example of a class of artificial intelligent systems equipped with a Modular Networked Learning cognitive system, defined as an integrated networked system of machine learning modules, where each module constitutes a functional unit trained for a specific task that solves a subproblem of a complex main problem expressed as a network of linked subproblems. In the case of neural networks, such a system functions as a form of "artificial brain", where each module is like a specialized brain region comprised of a neural network with a specific architecture. |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1806.05876&r=big |
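The two-module design is concrete enough to sketch. A minimal version with one LSTM predicting the return and a second predicting the squared deviation around it, combined into an interval, might look as follows (layer sizes, the lookback window and the interval width k are assumptions, not the paper's architecture):

```python
import numpy as np
from tensorflow.keras import layers, models

WINDOW = 20  # lookback length (assumption)

def make_rnn():
    return models.Sequential([
        layers.Input(shape=(WINDOW, 1)),
        layers.LSTM(16),
        layers.Dense(1),
    ])

# r: daily returns; placeholder noise stands in for a real index series
r = 0.01 * np.random.randn(1000)
X = np.stack([r[i:i + WINDOW] for i in range(len(r) - WINDOW)])[..., None]
y = r[WINDOW:]

# Module 1 learns the expected return...
mean_net = make_rnn()
mean_net.compile(optimizer="adam", loss="mse")
mean_net.fit(X, y, epochs=5, verbose=0)
mu = mean_net.predict(X, verbose=0).ravel()

# ...module 2 learns the squared deviation around module 1's prediction
var_net = make_rnn()
var_net.compile(optimizer="adam", loss="mse")
var_net.fit(X, (y - mu) ** 2, epochs=5, verbose=0)
sd = np.sqrt(np.clip(var_net.predict(X, verbose=0).ravel(), 0.0, None))

# Volatility-sensitive interval and its hit rate
k = 2.0
hit = np.mean((y >= mu - k * sd) & (y <= mu + k * sd))
print(f"interval hit rate: {hit:.1%}")
```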
By: | Nariyasu YAMAZAWA |
Abstract: | We present a procedure for analyzing current business conditions and forecasting the GDP growth rate with quantitative text analysis. We use text data from the Economy Watchers Survey conducted by the Cabinet Office of Japan. We extract words from 190 thousand sentences and construct time series data by counting each word's appearance rate every month. The analysis consists of four parts: (1) visualizing appearance rates with graphs, (2) correlation analysis, (3) principal component analysis, and (4) forecasting the GDP growth rate. First, we draw graphs of the appearance rates of words that are influenced by business conditions; the graphs clearly show the effect of policy on business conditions. Second, by computing correlation coefficients, we construct lists of words whose appearance rates correlate with business conditions, as well as lists of words that are inversely correlated with them. Third, we extract principal components from the 150 most frequent words and find that the first principal component moves together with business conditions. Finally, we forecast the quarterly real GDP growth rate with the text data and find that forecast accuracy improves when the text data are added, showing that text data contain useful information for GDP forecasting. |
Date: | 2018–03 |
URL: | http://d.repec.org/n?u=RePEc:esj:esridp:345&r=big |
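The four-step procedure maps naturally onto a short pipeline. The sketch below assumes hypothetical files watcher_survey.csv, bci.csv and gdp.csv and pre-tokenized text (the survey is in Japanese, so tokenization, e.g. with MeCab, would come first):

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# One row per survey sentence, columns ["month", "text"] (hypothetical file)
df = pd.read_csv("watcher_survey.csv")

# (1) monthly appearance rate of every word
rates = (df.assign(word=df["text"].str.split())
           .explode("word")
           .groupby(["month", "word"]).size()
           .unstack(fill_value=0))
rates = rates.div(rates.sum(axis=1), axis=0)

# (2) correlation of each word's rate with a business-conditions index,
#     aligned by month; sorting exposes direct and inverse word lists
bci = pd.read_csv("bci.csv", index_col="month")["bci"]
word_corr = rates.corrwith(bci).sort_values()

# (3) first principal component of the 150 most frequent words
top = rates[rates.sum().nlargest(150).index]
pc1 = pd.Series(PCA(n_components=1).fit_transform(top).ravel(),
                index=pd.PeriodIndex(rates.index, freq="M"))

# (4) quarterly GDP-growth regression augmented with the text factor
gdp = pd.read_csv("gdp.csv", index_col="quarter")["growth"]
gdp.index = pd.PeriodIndex(gdp.index, freq="Q")
xq = pc1.groupby(pc1.index.asfreq("Q")).mean()
fit = LinearRegression().fit(xq.loc[gdp.index].to_frame(), gdp)
```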
By: | Greg Kirczenow; Ali Fathi; Matt Davison |
Abstract: | This paper studies the application of machine learning to extracting market-implied features from historical risk-neutral corporate bond yields. We consider the example of a hypothetical illiquid fixed income market. After choosing a surrogate liquid market, we apply the Denoising Autoencoder algorithm from the field of computer vision and pattern recognition to learn the features of the missing yield parameters from the historically implied data of the instruments traded in the chosen liquid market. Finally, the performance of the trained machine learning algorithm is compared with that of a point-in-time two-dimensional interpolation algorithm known as the Thin Plate Spline. |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1806.01731&r=big |
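A compact sketch of the comparison is possible with standard tools: a denoising autoencoder trained on the surrogate market's curves versus a thin plate spline over the (date, tenor) surface. Network shape, noise level and the placeholder data are all assumptions:

```python
import numpy as np
from tensorflow.keras import layers, models
from scipy.interpolate import Rbf

# Placeholder yield history: rows = dates, cols = tenors (real data assumed)
tenors = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 10.0])
dates = np.arange(100.0)
Y = 0.02 + 0.01 * np.log1p(tenors) + 0.001 * np.random.randn(100, len(tenors))

# Denoising autoencoder: corrupt inputs, reconstruct the clean curve
ae = models.Sequential([
    layers.Input(shape=(len(tenors),)),
    layers.GaussianNoise(0.002),         # corruption applied during training
    layers.Dense(3, activation="relu"),  # low-dimensional curve features
    layers.Dense(len(tenors)),
])
ae.compile(optimizer="adam", loss="mse")
ae.fit(Y, Y, epochs=50, verbose=0)
ae_fill = ae.predict(Y[-1:], verbose=0)[0, 3]   # reconstructed 5y point

# Benchmark: point-in-time thin plate spline over the (date, tenor) surface
D, T = np.meshgrid(dates, tenors, indexing="ij")
mask = np.ones_like(Y, dtype=bool)
mask[-1, 3] = False                     # pretend the last 5y quote is missing
tps = Rbf(D[mask], T[mask], Y[mask], function="thin_plate")
tps_fill = tps(dates[-1], tenors[3])
print(ae_fill, tps_fill, Y[-1, 3])
```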
By: | Akash Malhotra |
Abstract: | A measure of the relative importance of variables is often desired by researchers when the explanatory aspects of econometric methods are of interest. To this end, the author briefly reviews the limitations of conventional econometrics in constructing a reliable measure of variable importance, highlights the relative stature of explanatory and predictive analysis in economics, and notes the emergence of fruitful collaborations between econometrics and computer science. Learning lessons from both, the author proposes a hybrid approach based on conventional econometrics and the advanced machine learning (ML) algorithms otherwise used in predictive analytics. The purpose of this article is two-fold: to propose a hybrid approach to assessing relative importance and demonstrate its applicability in addressing policy priority issues, with the example of food inflation in India; and, more broadly, to introduce the possibility of a conflation of ML and conventional econometrics to an audience of researchers in economics and the social sciences in general. |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1806.04517&r=big |
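One way to read the proposed hybrid is: keep an econometric model for interpretable coefficients while borrowing a model-agnostic importance measure from ML. A sketch under that reading, with invented variables rather than the paper's Indian food-inflation data:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(300, 3)),
                 columns=["rainfall", "fuel_price", "msp"])  # hypothetical drivers
y = 0.5 * X["fuel_price"] + 0.2 * X["rainfall"]**2 + rng.normal(scale=0.1, size=300)

# Conventional econometrics: OLS coefficients with standard errors
ols = sm.OLS(y, sm.add_constant(X)).fit()
print(ols.summary().tables[1])

# ML side: permutation importance also captures nonlinear contributions
gbm = GradientBoostingRegressor().fit(X, y)
imp = permutation_importance(gbm, X, y, n_repeats=20, random_state=0)
for name, score in zip(X.columns, imp.importances_mean):
    print(f"{name:>10}: {score:.3f}")
```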
By: | Antonio Lima; Hasan Bakhshi |
Abstract: | Rapid technological, social and economic change is having significant impacts on the nature of jobs. In fast-changing environments it is crucial that policymakers have a clear and timely picture of the labour market. Policymakers use standardised occupational classifications, such as the Office for National Statistics' Standard Occupational Classification (SOC) in the UK, to analyse the labour market. These permit the occupational composition of the workforce to be tracked on a consistent and transparent basis over time and across industrial sectors. However, such systems are by their nature costly to maintain, slow to adapt and not very flexible. For that reason, additional tools are needed. At the same time, policymakers around the world are revisiting how active skills development policies can be used to equip workers with the capabilities needed to meet the new labour market realities. There is in parallel a desire for a more granular understanding of what skills combinations are required by occupations, in part so that policymakers are better sighted on how individuals can redeploy these skills as and when employer demands change further. In this paper, we investigate the possibility of complementing traditional occupational classifications with more flexible methods centred around employers' characterisations of the skills and knowledge requirements of occupations as presented in job advertisements. We use data science methods to classify job advertisements as STEM or non-STEM (Science, Technology, Engineering and Mathematics) and creative or non-creative, based on the content of ads in a database of UK job ads posted online belonging to the Boston-based job market analytics company Burning Glass Technologies. In doing so, we first characterise each SOC code in terms of its skill make-up; this step allows us to describe each SOC skillset as a mathematical object that can be compared with other skillsets. Then we develop a classifier that predicts the SOC code of a job based on its required skills. Finally, we develop two classifiers that decide whether a job vacancy is STEM/non-STEM and creative/non-creative, based again on its skill requirements. |
Keywords: | labour demand, occupational classification, online job adverts, big data, machine learning, STEM, STEAM, creative economy |
JEL: | C18 J23 J24 |
Date: | 2018–07 |
URL: | http://d.repec.org/n?u=RePEc:nsr:escoed:escoe-dp-2018-07&r=big |
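The final classification step reduces to a standard supervised text pipeline over skill strings. A toy sketch (the Burning Glass data is proprietary, so the ads and labels here are invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each "document" is the skill list of one job ad (toy examples)
ads = [
    "python machine learning statistics",
    "copywriting social media branding",
    "circuit design embedded c testing",
    "customer service scheduling retail",
]
labels = [1, 0, 1, 0]  # 1 = STEM, 0 = non-STEM

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(ads, labels)
print(clf.predict(["data engineering sql spark"]))  # -> [1] on this toy set
```

The same TF-IDF representation also serves the paper's earlier step: each SOC code's aggregated skill vector becomes a mathematical object that can be compared with others, e.g. by cosine similarity.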
By: | Miruna Oprescu; Vasilis Syrgkanis; Zhiwei Steven Wu |
Abstract: | We study the problem of estimating heterogeneous treatment effects from observational data, where the treatment policy on the collected data was determined by potentially many confounding observable variables. We propose the orthogonal random forest, an algorithm that combines orthogonalization, a technique that effectively removes the confounding effect in two-stage estimation, with generalized random forests [Athey et al., 2017], a flexible method for estimating treatment effect heterogeneity. We prove a consistency rate result for our estimator in the partially linear regression model, and en route we provide a consistency analysis for a general framework of performing generalized method of moments (GMM) estimation. We also provide a comprehensive empirical evaluation of our algorithms, and show that they consistently outperform baseline approaches. |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1806.03467&r=big |
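The orthogonalization idea is the part that sketches cleanly: residualize both outcome and treatment on the confounders with flexible first-stage learners, then regress residual on residual. This is plain cross-fitted partialling-out, not the full forest-based localization of the paper:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
n = 2000
W = rng.normal(size=(n, 5))                # confounders
T = W[:, 0] + rng.normal(size=n)           # treatment depends on W
theta = 1.5                                # true effect (homogeneous here)
Y = theta * T + 2 * W[:, 0] + rng.normal(size=n)

# First stage: cross-fitted predictions of Y and T from the confounders
y_hat = cross_val_predict(RandomForestRegressor(), W, Y, cv=5)
t_hat = cross_val_predict(RandomForestRegressor(), W, T, cv=5)

# Second stage: effect from residual-on-residual regression
y_res, t_res = Y - y_hat, T - t_hat
theta_hat = (t_res @ y_res) / (t_res @ t_res)
print(f"estimated effect: {theta_hat:.2f} (true {theta})")
```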
By: | Guilhem Fabre (EHESS - Ecole des Hautes Etudes en Sciences Sociales) |
Date: | 2018–06–19 |
URL: | http://d.repec.org/n?u=RePEc:hal:wpaper:halshs-01818508&r=big |
By: | Zhan Gao; Zhentao Shi |
Abstract: | Economists specify high-dimensional models to address heterogeneity in empirical studies with complex big data. Estimation of these models calls for optimization techniques that can handle a large number of parameters. Convex problems can be solved effectively in modern statistical programming languages. We complement Koenker and Mizera (2014)'s work on the numerical implementation of convex optimization, with a focus on high-dimensional econometric estimators. In particular, we replicate the simulation exercises in Su, Shi, and Phillips (2016) and Shi (2016) to show the robust performance of convex optimization across platforms. Combining R and the convex solver MOSEK achieves faster speed with accuracy equivalent to the original papers. The convenience and reliability of convex optimization in R make it easy to turn new ideas into prototypes. |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1806.10423&r=big |
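The paper's point is that estimators posed as convex programs run reliably on modern solvers; the authors work in R with MOSEK. As a language-neutral illustration of posing an econometric estimator as a convex program, here is the Lasso written directly in Python's cvxpy (any conic solver applies):

```python
import cvxpy as cp
import numpy as np

# Simulated sparse regression problem
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Lasso as a convex program: least squares plus an l1 penalty
b = cp.Variable(p)
lam = 0.5
problem = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ b) / (2 * n)
                                 + lam * cp.norm1(b)))
problem.solve()                  # dispatched to an installed conic solver
print(np.round(b.value, 2))
```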
By: | Tadas Limba (Mykolas Romeris University); Aurimas Šidlauskas (Mykolas Romeris University) |
Abstract: | In view of the changes taking place in society, social progress, and the achievements of science and technology, the protection of fundamental rights must be strengthened. The aim of the article is to analyse the principles and peculiarities of the safe management of personal data in social networks. The article uses the methods of document analysis, scientific literature review, case study, and generalization. Consumers themselves decide how much and what kind of information to make public on the Facebook social network. In order to use third-party applications, users must confirm at the time of authorization that they agree to give access to their personal data; otherwise the service will not be provided. Personal data of a Facebook user comprise his/her public profile, including the user's photo, age, gender, and other public information; a list of friends; e-mail; time zone records; birthday; photos; hobbies, etc. Which personal data are requested from the user depends on the third-party application. Analysis of the legal protection of personal data in internet social networks reveals that it is limited to the international and European Union legal regulation of the protection of personal data in online social networks. Users who make a large amount of personal information publicly available on the Facebook social network must decide whether they want to share that information with third parties in order to use their services (applications). The article presents a model of the interaction between the user and third-party applications, along with an analysis of risks and recommendations to ensure the security of the user's personal data. |
Keywords: | security of the data, social network, personal data, third-party applications |
Date: | 2018–03–30 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:hal-01773973&r=big |
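The user/third-party interaction model can be pictured as a consent-scoped data store. A purely hypothetical sketch, with category names echoing the abstract's examples:

```python
class User:
    def __init__(self, data):
        self.data = data    # category -> value
        self.grants = {}    # app name -> set of consented categories

    def authorize(self, app, scopes):
        """At authorization time the user consents to specific categories."""
        self.grants[app] = set(scopes)

    def fetch(self, app, category):
        """Third-party read: allowed only within the consented scope."""
        if category not in self.grants.get(app, set()):
            raise PermissionError(f"{app}: no consent for '{category}'")
        return self.data[category]

u = User({"email": "a@b.example", "birthday": "1990-01-01",
          "friends": ["x", "y"]})
u.authorize("quiz_app", ["email", "friends"])
print(u.fetch("quiz_app", "email"))   # served
# u.fetch("quiz_app", "birthday")     # raises PermissionError
```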
By: | Ash, Elliott; Chen, Daniel L. |
Abstract: | Recent work in natural language processing represents language objects (words and documents) as dense vectors that encode the relations between those objects. This paper explores the application of these methods to legal language, with the goal of understanding judicial reasoning and the relations between judges. In an application to federal appellate courts, we show that these vectors encode information that distinguishes courts, time, and legal topics. The vectors do not reveal spatial distinctions in terms of political party or law school attended, but they do highlight generational differences across judges. We conclude the paper by outlining a range of promising future applications of these methods. |
Date: | 2018–07 |
URL: | http://d.repec.org/n?u=RePEc:tse:iastwp:32766&r=big |
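A minimal sketch of the dense-vector representation the paper applies to legal language, using gensim's Doc2Vec; the two toy documents stand in for a corpus of federal appellate opinions:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy stand-ins for opinions, tagged by court and year
opinions = [
    ("5th-circuit-1990", "the statute of limitations bars the claim"),
    ("9th-circuit-2005", "we review the district court ruling de novo"),
]
corpus = [TaggedDocument(text.split(), [tag]) for tag, text in opinions]
model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=20)

# Document vectors can then be compared across courts, time, or judges
vec = model.dv["5th-circuit-1990"]
print(model.dv.most_similar("5th-circuit-1990"))
```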
By: | Ash, Elliott; Chen, Daniel L. |
Abstract: | Recent work in natural language processing represents language objects (words and documents) as dense vectors that encode the relations between those objects. This paper explores the application of these methods to legal language, with the goal of understanding judicial reasoning and the relations between judges. In an application to federal appellate courts, we show that these vectors encode information that distinguishes courts, time, and legal topics. The vectors do not reveal spatial distinctions in terms of political party or law school attended, but they do highlight generational differences across judges. We conclude the paper by outlining a range of promising future applications of these methods. |
Date: | 2018–07 |
URL: | http://d.repec.org/n?u=RePEc:tse:wpaper:32764&r=big |
By: | Benjamin W. Pugsley; Peter Sedlacek; Vincent Sterk |
Abstract: | Only half of all startups survive past the age of five, and surviving businesses grow at vastly different speeds. Using micro data on employment in the population of U.S. businesses, we estimate that the lion's share of these differences is driven by ex-ante heterogeneity across firms, rather than by ex-post shocks. We embed such heterogeneity in a firm dynamics model and study how ex-ante differences shape the distribution of firm size, "up-or-out" dynamics, and the associated gains in aggregate output. "Gazelles" - a small subset of startups with particularly high growth potential - emerge as key drivers of these outcomes. Analyzing changes in the distribution of ex-ante firm heterogeneity over time reveals that the birth rate and growth potential of gazelles have declined, creating substantial aggregate losses. |
Keywords: | Firm Dynamics, Startups, Macroeconomics, Big Data |
JEL: | D22 E23 E24 |
Date: | 2018–06 |
URL: | http://d.repec.org/n?u=RePEc:cen:wpaper:18-30&r=big |
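The ex-ante versus ex-post distinction can be illustrated with a toy simulation: firms draw a permanent growth type and transitory shocks, and the variance of long-run size is split between the two sources. All parameters are invented, not the paper's estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, T = 5000, 10
mu = rng.normal(0.05, 0.10, n_firms)        # ex-ante growth types
shocks = rng.normal(0, 0.05, (n_firms, T))  # ex-post shocks
log_size = (mu[:, None] + shocks).cumsum(axis=1)

# Share of final-size dispersion attributable to permanent types
final = log_size[:, -1]
share = np.var(T * mu) / np.var(final)
print(f"ex-ante share of size dispersion: {share:.0%}")
```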