|
on Big Data |
By: | Peter Tillmann (Justus-Liebig-University Giessen, Germany); Andreas Walter (Justus-Liebig-University Giessen, Germany) |
Abstract: | The present paper studies the consequences of conflicting narratives for the transmission of monetary policy shocks. We focus on conflict between the presidents of the ECB and the Bundesbank, the main protagonists of monetary policy in the euro area, who often disagreed on policy over the past two decades. This conflict received much attention in financial markets. We use over 900 speeches of both institutions’ presidents since 1999 and quantify the tone conveyed in speeches and the divergence of tone between the two presidents. We find (i) a drop towards more negative tone in 2009 for both institutions and (ii) a large divergence of tone after 2009. The ECB communication becomes persistently more optimistic and less uncertain than the Bundesbank’s after 2009, and this gap widens after the SMP, OMT and APP announcements. We show that long-term interest rates respond less strongly to a monetary policy shock if ECB-Bundesbank communication is more cacophonous than on average, in which case the ECB loses its ability to drive the slope of the yield curve. The weaker transmission under high divergence reflects a muted adjustment of the expectations component of long-term rates. |
Keywords: | Central bank communication, diverging tones, speeches, text analysis, monetary transmission |
JEL: | E52 E43 E32 |
URL: | http://d.repec.org/n?u=RePEc:cth:wpaper:gru_2018_009&r=all |
By: | Samuel Bazzi; Robert A. Blair; Christopher Blattman; Oeindrila Dube; Matthew Gudgeon; Richard Merton Peck |
Abstract: | Policymakers can take actions to prevent local conflict before it begins, if such violence can be accurately predicted. We examine the two countries with the richest available sub-national data: Colombia and Indonesia. We assemble two decades of fine-grained violence data by type, alongside hundreds of annual risk factors. We predict violence one year ahead with a range of machine learning techniques. Models reliably identify persistent, high-violence hot spots. Violence is not simply autoregressive, as detailed histories of disaggregated violence perform best. Rich socio-economic data also substitute well for these histories. Even with such unusually rich data, however, the models poorly predict new outbreaks or escalations of violence. "Best case" scenarios with panel data fall short of workable early-warning systems. |
JEL: | C52 C53 D74 |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:25980&r=all |
By: | Kate Bundorf; Maria Polyakova; Ming Tai-Seale |
Abstract: | Algorithms increasingly assist consumers in making their purchase decisions across a variety of markets; yet little is known about how humans interact with algorithmic advice. We examine how algorithmic, personalized information affects consumer choice among complex financial products using data from a randomized, controlled trial of decision support software for choosing health insurance plans. The intervention significantly increased plan switching, cost savings, time spent choosing a plan, and choice process satisfaction, particularly when individuals were exposed to an algorithmic expert recommendation. We document systematic selection - individuals who would have responded to treatment the most were the least likely to participate. A model of consumer decision-making suggests that our intervention affected consumers’ signals about both product features (learning) and utility weights (interpretation). |
JEL: | D1 D12 D8 D81 D82 D83 D9 D90 D91 G22 H51 I13 |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:25976&r=all |
By: | Maria Glenski; Tim Weninger; Svitlana Volkova |
Abstract: | Social media signals have been successfully used to develop large-scale predictive and anticipatory analytics, for example to forecast stock market prices and influenza outbreaks. Recently, social data has been explored to forecast price fluctuations of cryptocurrencies, which are a novel disruptive technology with significant political and economic implications. In this paper we leverage and contrast the predictive power of social signals, specifically user behavior and communication patterns, from multiple social platforms, GitHub and Reddit, to forecast prices for three cryptocurrencies with high developer and community interest: Bitcoin, Ethereum, and Monero. We evaluate the performance of neural network models that rely on long short-term memory units (LSTMs) trained on historical price data and social data against price-only LSTMs and baseline autoregressive integrated moving average (ARIMA) models, commonly used to predict stock prices. Our results not only demonstrate that social signals reduce error when forecasting daily coin price, but also show that the language used in comments within the official communities on Reddit (r/Bitcoin, r/Ethereum, and r/Monero) is the best predictor overall. We observe that models are more accurate in forecasting price one day ahead for Bitcoin (4% root mean squared percent error) compared to Ethereum (7%) and Monero (8%). |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.00558&r=all |
By: | Joshua Zoen Git Hiew; Xin Huang; Hao Mou; Duan Li; Qi Wu; Yabo Xu |
Abstract: | Traditional sentiment construction in finance relies heavily on the dictionary-based approach, with a few exceptions using simple machine learning techniques such as the Naive Bayes classifier. While the current literature has not yet exploited the rapid advances in natural language processing, we construct in this research a text-based sentiment index using BERT, a novel model recently developed by Google, for three actively traded individual stocks in the Hong Kong market that are hotly discussed on Weibo.com. On the one hand, we demonstrate a significant improvement from applying BERT in sentiment analysis when compared with existing models. On the other hand, by combining it with two existing methods commonly used to build sentiment indices in the financial literature, i.e., the option-implied and market-implied approaches, we propose a more general and comprehensive framework for financial sentiment analysis, and further provide convincing evidence for the predictability of individual stock returns for the above three stocks using an LSTM (which provides a nonlinear mapping), in contrast to the dominant econometric methods in sentiment influence analysis, which are all linear regressions in nature. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.09024&r=all |
By: | Sebastian MP (Indian Institute of Management, Kozhikode) |
Abstract: | Smart healthcare technologies are widely in use for the prevention and early diagnosis of diseases and are instrumental in transforming conventional medical care into patient-centric care. However, traditional hospitals cannot be entirely replaced by home health systems; rather, they are being forced to become smart. The future smart hospitals are expected to have artificial intelligence (AI) tools for performing patient diagnosis and robots for performing surgeries. The physicians will have the managing role, which could be performed through a touchscreen. This paper explores the challenges and opportunities associated with smart hospitals, and how they contribute to the objective of quality healthcare for everyone. The methodology used for the research is a literature review. Machines lack common sense and blindly do what human beings instruct them to do. Thus, in spite of the digitalization and technology transformation of healthcare processes, we cannot have hospitals without the human element. |
Keywords: | AI, EHR, IoT, machine learning, smart healthcare, smart hospital, wearables |
Date: | 2019–03 |
URL: | http://d.repec.org/n?u=RePEc:iik:wpaper:315&r=all |
By: | Li-Chun Zhang |
Abstract: | Purchase data from retail chains provide proxy measures of private household expenditure on items that are the most troublesome to collect in the traditional expenditure survey. Due to the sheer amount of proxy data, the bias due to coverage and selection errors completely dominates the variance. We develop tests for bias based on audit sampling, which makes use of available survey data that cannot be linked to the proxy data source at the individual level. However, audit sampling fails to yield a meaningful mean squared error estimate, because the sampling variance is too large compared to the bias of the big data estimate. We propose a novel accuracy measure that is applicable in such situations. This can provide a necessary part of the statistical argument for the uptake of big data sources in place of traditional survey sampling. An application to a disaggregated food price index is used to demonstrate the proposed approach. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.11208&r=all |
By: | Küfeoğlu, S.; Liu, G.; Anaya, K.; Pollitt, M. |
Abstract: | This paper reviews digitalisation in the energy sector by looking at the business models of 40 interesting new start-up energy companies from around the world. These start-ups have been facilitated by the rise of distributed generation, much of it intermittent in nature. We review Artificial Intelligence (AI), Machine Learning, Deep Learning and Blockchain applications in the energy sector. We discuss the rise of prosumers and small-scale renewable generation, highlighting the role of Feed-in-Tariffs (FITs), the Distribution System Platform concept and the potential for Peer-to-Peer (P2P) trading. Our aim is to help energy regulators calibrate their support for new business models. |
Keywords: | Feed-in tariff, Distribution System Platform, Peer-to-Peer, Blockchain |
JEL: | L94 |
Date: | 2019–06–25 |
URL: | http://d.repec.org/n?u=RePEc:cam:camdae:1956&r=all |
By: | Wenhang Bao; Xiao-yang Liu |
Abstract: | Liquidation is the process of selling a large number of shares of one stock sequentially within a given time frame, taking into consideration the costs arising from market impact and a trader's risk aversion. The main challenge in optimizing liquidation is to find an appropriate modeling system that can incorporate the complexities of the stock market and generate practical trading strategies. In this paper, we propose to use a multi-agent deep reinforcement learning model, which captures high-level complexities better than various other machine learning methods, such that agents can learn how to make the best selling decisions. First, we theoretically analyze the Almgren and Chriss model and extend its fundamental mechanism so it can be used as the multi-agent trading environment. Our work builds the foundation for future multi-agent environment trading analysis. Secondly, we analyze the cooperative and competitive behaviours between agents by adjusting the reward functions for each agent, which overcomes the limitation of single-agent reinforcement learning algorithms. Finally, we simulate trading and develop an optimal trading strategy with practical constraints by using a reinforcement learning method, which shows the capabilities of reinforcement learning methods in solving realistic liquidation problems. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.11046&r=all |
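For orientation on the Almgren and Chriss model mentioned in the abstract above, the following Python sketch computes the textbook continuous-time liquidation trajectory for a single trader, x(t) = X sinh(κ(T - t)) / sinh(κT) with κ = sqrt(λσ²/η). It is not the authors' multi-agent environment; the function name and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def almgren_chriss_holdings(total_shares, horizon, n_steps,
                            risk_aversion, volatility, temp_impact):
    """Remaining holdings at n_steps + 1 equally spaced times.

    Uses the continuous-time limit of the single-trader Almgren-Chriss
    solution. All three model parameters must be strictly positive.
    """
    # kappa sets the urgency: higher risk aversion means faster selling.
    kappa = math.sqrt(risk_aversion * volatility ** 2 / temp_impact)
    times = [horizon * k / n_steps for k in range(n_steps + 1)]
    return [
        total_shares * math.sinh(kappa * (horizon - t)) / math.sinh(kappa * horizon)
        for t in times
    ]

# Hypothetical inputs: one million shares sold over five periods.
holdings = almgren_chriss_holdings(
    total_shares=1_000_000, horizon=5.0, n_steps=5,
    risk_aversion=1e-6, volatility=0.95, temp_impact=2.5e-6,
)
# Shares sold in each period (difference of consecutive holdings).
trades = [a - b for a, b in zip(holdings, holdings[1:])]
```

With a positive risk aversion the schedule is front-loaded: each period sells fewer shares than the one before, trading impact cost against price risk.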
By: | Diego Bodas (Mapfre); Juan R. García López (BBVA Research); Tomasa Rodrigo López (BBVA Research); Pep Ruiz de Aguirre (BBVA Research); Camilo A. Ulloa (BBVA Research); Juan Murillo Arias (BBVA data & analytics); Juan de Dios Romero Palop (BBVA data & analytics); Heribert Valero Lapaz (BBVA data & analytics); Matías J. Pacce (Banco de España) |
Abstract: | In this paper we present a high-dimensionality Retail Trade Index (RTI) constructed to nowcast the retail trade sector economic performance in Spain, using Big Data sources and techniques. The data are the footprints of BBVA clients from their credit or debit card transactions at Spanish point of sale (PoS) terminals. The resulting indexes have been found to be robust when compared with the Spanish RTI, regional RTI (Spain’s autonomous regions), and RTI by retailer type (distribution classes) published by the National Statistics Institute (INE). We also went one step further, computing the monthly indexes for the provinces and sectors of activity and the daily general index, by obtaining timely, detailed information on retail sales. Finally, we analyzed the high-frequency consumption dynamics using BBVA retailer behavior and a structural time series model. |
Keywords: | retail sales, big data, electronic payments, consumption, structural time series model |
JEL: | C32 C81 E21 |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:bde:wpaper:1921&r=all |
By: | Bradley J. Pillay; Absalom E. Ezugwu |
Abstract: | The prediction of stock prices is an important task in economics, investment and financial decision-making. For several decades it has spurred many researchers to design stock price predictive models. In this paper, the symbiotic organisms search algorithm, a new metaheuristic algorithm, is employed as an efficient method for training feedforward neural networks (FFNN). The training process is used to build a better stock price predictive model. The Straits Times Index, Nikkei 225, NASDAQ Composite, S&P 500, and Dow Jones Industrial Average indices were utilized as time series data sets for training and testing the proposed predictive model. Three evaluation measures, namely Root Mean Squared Error, Mean Absolute Percentage Error and Mean Absolute Deviation, are used to compare the results of the implemented model. The computational results revealed that the hybrid Symbiotic Organisms Search Algorithm exhibited outstanding predictive performance when compared to the hybrid Particle Swarm Optimization, Genetic Algorithm, and ARIMA based models. The new model is a promising predictive technique for high-dimensional nonlinear time series data that are difficult to capture with traditional models. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.10121&r=all |
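The three evaluation measures named in the abstract above can be illustrated with a minimal Python sketch. The function names and the sample numbers are hypothetical, and Mean Absolute Deviation is taken here to mean the average absolute forecast error, which is an assumption about the authors' usage rather than something the abstract states.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)

def mad(actual, predicted):
    """Mean absolute deviation, read here as the mean absolute forecast error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical index levels and forecasts, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [99.0, 103.0, 100.0, 107.0]

scores = {
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
    "MAD": mad(actual, predicted),
}
```

RMSE penalizes large misses more heavily than MAD, while MAPE is scale-free, which is why studies comparing indices at very different levels often report all three.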
By: | Lechner, Michael; Okasa, Gabriel |
Abstract: | In econometrics so-called ordered choice models are popular when interest is in the estimation of the probabilities of particular values of categorical outcome variables with an inherent ordering, conditional on covariates. In this paper we develop a new machine learning estimator based on the random forest algorithm for such models without imposing any distributional assumptions. The proposed Ordered Forest estimator provides a flexible estimation method of the conditional choice probabilities that can naturally deal with nonlinearities in the data, while taking the ordering information explicitly into account. In addition to common machine learning estimators, it enables the estimation of marginal effects as well as conducting inference thereof and thus providing the same output as classical econometric estimators based on ordered logit or probit models. An extensive simulation study examines the finite sample properties of the Ordered Forest and reveals its good predictive performance, particularly in settings with multicollinearity among the predictors and nonlinear functional forms. An empirical application further illustrates the estimation of the marginal effects and their standard errors and demonstrates the advantages of the flexible estimation compared to a parametric benchmark model. |
Keywords: | Ordered choice models, random forests, probabilities, marginal effects, machine learning |
JEL: | C14 C25 C40 |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:usg:econwp:2019:08&r=all |
By: | Sam Ganzfried; Max Chiswick |
Abstract: | Poker is a large complex game of imperfect information, which has been singled out as a major AI challenge problem. Recently there has been a series of breakthroughs culminating in agents that have successfully defeated the strongest human players in two-player no-limit Texas hold 'em. The strongest agents are based on algorithms for approximating Nash equilibrium strategies, which are stored in massive binary files and unintelligible to humans. A recent line of research has explored approaches for extrapolating knowledge from strong game-theoretic strategies that can be understood by humans. This would be useful when humans are the ultimate decision maker and allow humans to make better decisions from massive algorithmically-generated strategies. Using techniques from machine learning we have uncovered a new simple, fundamental rule of poker strategy that leads to a significant improvement in performance over the best prior rule and can also easily be applied by human players. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.09895&r=all |
By: | Kiyohiko G. Nishimura (National Graduate Institute for Policy Studies (GRIPS) and The University of Tokyo); Seisho Sato (Faculty of Economics, The University of Tokyo); Akihiko Takahashi (Faculty of Economics, The University of Tokyo) |
Abstract: | This work develops and estimates a three-factor term structure model with explicit sentiment factors in a period including the global financial crisis, where market confidence was said to erode considerably. It utilizes a large text data set of real-time, relatively high-frequency market news and takes account of the difficulties in incorporating market sentiment into the models. To the best of our knowledge, this is the first attempt to use this category of data in term-structure models. Although market sentiment or market confidence is often regarded as an important driver of asset markets, it is not explicitly incorporated in traditional empirical factor models for daily yield curve data because it is unobservable. To overcome this problem, we use a text mining approach to generate observable variables which are driven by otherwise unobservable sentiment factors. Then, applying the Monte Carlo filter as a filtering method in a state space Bayesian filtering approach, we estimate the dynamic stochastic structure of these latent factors from observable variables driven by these latent variables. As a result, the three-factor model with text mining is able to distinguish (1) a spread-steepening factor which is driven by pessimists' views and explains the spreads related to ultra-long term yields from (2) a spread-flattening factor which is driven by optimists' views and influences the long and medium term spreads. Also, the three-factor model with text mining fits the observed yields better than the model without text mining. Moreover, we collect market participants' views about specific spreads in the term structure and find that the movements of the identified sentiment factors are consistent with the market participants' views, and thus market sentiment. |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:tky:fseres:2018cf1101&r=all |
By: | Philip ME Garboden (Department of Urban and Regional Planning, University of Hawai‘i at Manoa) |
Abstract: | This chapter considers the types of Big Data that have proven useful for macroeconomic forecasting. It first presents the various definitions of Big Data, proposing one we believe is most useful for forecasting. The literature on both the opportunities and challenges of Big Data is presented. It then proposes a taxonomy of the types of Big Data: 1) Financial Market Data; 2) E-Commerce and Credit Cards; 3) Mobile Phones; 4) Search; 5) Social Media Data; 6) Textual Data; 7) Sensors and the Internet of Things; 8) Transportation Data; 9) Other Administrative Data. Noteworthy studies are described throughout. |
Keywords: | big data, data sources |
JEL: | C80 |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:hae:wpaper:2019-3&r=all |
By: | Joseph Staudt; Yifang Wei; Lisa Singh; Shawn Klimek; J. Bradford Jensen; Andrew L. Baer |
Abstract: | Between the 2007 and 2012 Economic Censuses (EC), the count of franchise-affiliated establishments declined by 9.8%. One reason for this decline was a reduction in resources that the Census Bureau was able to dedicate to the manual evaluation of survey responses in the franchise section of the EC. Extensive manual evaluation in 2007 resulted in many establishments, whose survey forms indicated they were not franchise-affiliated, being recoded as franchise-affiliated. No such evaluation could be undertaken in 2012. In this paper, we examine the potential of using external data harvested from the web in combination with machine learning methods to automate the process of evaluating responses to the franchise section of the 2017 EC. Our method allows us to quickly and accurately identify and recode establishments that have been mistakenly classified as not being franchise-affiliated, increasing the unweighted number of franchise-affiliated establishments in the 2017 EC by 22%-42%. |
JEL: | C81 L8 |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:cen:wpaper:19-20&r=all |
By: | Catherine Doz (Paris School of Economics and University Paris); Peter Fuleky (Department of Economics, University of Hawaii at Manoa, UHERO) |
Abstract: | Dynamic factor models are parsimonious representations of relationships among time series variables. With the surge in data availability, they have proven to be indispensable in macroeconomic forecasting. This chapter surveys the evolution of these models from their pre-big-data origins to the large-scale models of recent years. We review the associated estimation theory, forecasting approaches, and several extensions of the basic framework. |
Keywords: | dynamic factor models, big data, two-step estimation, time domain, frequency domain, structural breaks |
JEL: | C32 C38 C53 |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:hae:wpaper:2019-4&r=all |
By: | Bi, Huixin (Federal Reserve Bank of Kansas City); Traum, Nora |
Abstract: | This paper examines how newspaper reporting affects government bond prices during the U.S. state default of the 1840s. Using unsupervised machine learning algorithms, the paper first constructs novel "fiscal information indices" for state governments based on U.S. newspapers at the time. The impact of the indices on government bond prices varied over time. Before the crisis, the entry of new western states into the bond market spurred competition: more state-specific fiscal news imposed downward pressure on bond prices for established states in the market. During the crisis, more state-specific fiscal information increased (lowered) bond prices for states with sound (unsound) fiscal policy. |
Keywords: | Sovereign Default; Information; Fiscal Policy |
JEL: | E62 H30 N41 |
Date: | 2019–06–01 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedkrw:rwp19-04&r=all |
By: | Filipe R. Campante; Davin Chor; Bingjing Li |
Abstract: | We study how adverse economic shocks influence political outcomes in authoritarian regimes in strong states, by examining the 2013-2015 export slowdown in China. We exploit detailed customs data and the variation they reveal about Chinese prefectures’ underlying exposure to the global trade slowdown, in order to implement a shift-share instrumental variables strategy. Prefectures that experienced a more severe export slowdown witnessed a significant increase in incidents of labor strikes. This was accompanied by a heightened emphasis in such prefectures on upholding domestic stability, as evidenced from: (i) textual analysis measures we constructed from official annual work reports using machine-learning algorithms; and (ii) data we gathered on local fiscal expenditures channelled towards public security uses and social spending. The central government was subsequently more likely to replace the party secretary in prefectures that saw a high level of “excess strikes”, above what could be predicted from the observed export slowdown, suggesting that local leaders were held to account on yardsticks related to political stability. |
JEL: | D73 D74 F10 F14 F16 H10 J52 P26 |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:25925&r=all |
By: | Yuki Higuchi (Graduate School of Economics, Nagoya City University); Nobuhiko Fuwa (Graduate School of Public Policy, The University of Tokyo); Kei Kajisa (School of International Politics, Economics and Communication, Aoyama Gakuin University); Takahiro Sato (Faculty of Agriculture and Life Science, Hirosaki University); Yasuyuki Sawada (Faculty of Economics, The University of Tokyo) |
Abstract: | Aid from local governments can play a critical role as a risk-coping device in a postdisaster situation if the recipients have been properly targeted. Combining (i) satellite images (objective information on flood damage), (ii) administrative records (objective information on aid receipt), and (iii) sui generis survey data (self-reported information on damage assessment and aid receipt) on a large-scale flooding in the Philippines, we analyze the accuracy of disaster aid targeting and self-reporting bias in flood damage and aid receipt. We find that damage is over-reported while aid receipt is under-reported, and as a result, the estimated targeting accuracy based on self-reported information is substantially downward-biased. |
Date: | 2018–12 |
URL: | http://d.repec.org/n?u=RePEc:tky:fseres:2018cf1107&r=all |
By: | Yuki Higuchi (Graduate School of Economics, Nagoya City University); Nobuhiko Fuwa (Graduate School of Public Policy, The University of Tokyo); Kei Kajisa (School of International Politics, Economics and Communication, Aoyama Gakuin University); Takahiro Sato (Faculty of Agriculture and Life Science, Hirosaki University); Yasuyuki Sawada (Faculty of Economics, The University of Tokyo) |
Abstract: | Aid from local governments can play a critical role as a risk-coping device in a postdisaster situation if the recipients have been properly targeted. Combining (i) satellite images (objective information on flood damage), (ii) administrative records (objective information on aid receipt), and (iii) sui generis survey data (self-reported information on damage assessment and aid receipt) on a large-scale flooding in the Philippines, we analyze the accuracy of disaster aid targeting and self-reporting bias in flood damage and aid receipt. We find that damage is over-reported while aid receipt is under-reported, and as a result, the estimated targeting accuracy based on self-reported information is substantially downward-biased. |
Date: | 2018–12 |
URL: | http://d.repec.org/n?u=RePEc:tky:fseres:2018cf1106&r=all |