nep-big 2019-12-23 papers

on Big Data

Issue of 2019‒12‒23
sixteen papers chosen by
Tom Coupé
University of Canterbury

Wage Indexation and Jobs. A Machine Learning Approach By Gert Bijnens; Shyngys Karimov; Jozef Konings
How Much Information Do Monetary Policy Committees Disclose? Evidence from the FOMC's Minutes and Transcripts By Apel, Mikael; Blix Grimaldi, Marianna; Hull, Isaiah
Text Selection By Bryan T. Kelly; Asaf Manela; Alan Moreira
Towards using responsible artificial intelligence in product recommender systems in marketing By Christine Balagué; El Mehdi Rochd
Does High Frequency Social Media Data Improve Forecasts of Low Frequency Consumer Confidence Measures? By Steven F. Lehrer; Tian Xie; Tao Zeng
From Gutenberg to Google: The Internet Is Adopted Earlier if Ancestors Had Advanced Information Technology in 1500 AD By Ljunge, Martin
Competitive Imperfect Price Discrimination and Market Power By Paul Belleflamme; Wing Man Wynne Lam; Wouter Vergote
Promise and Peril in the Smart City: Local Government in the Age of Digital Urbanism By John Lorinc
Digital Transformation in Transport, Construction, Energy, Government and Public Administration By BALDINI Gianmarco; BARBONI Marcello; BONO Flavio; DELIPETREV Blagoj; DUCH BROWN Nestor; FERNANDEZ MACIAS Enrique; GKOUMAS Konstantinos; JOOSSENS Elisabeth; KALPAKA Anna; NEPELSKI Daniel; NUNES DE LIMA Maria; PAGANO Andrea; PRETTICO Giuseppe; SANCHEZ MARTIN Jose Ignacio; SOBOLEWSKI Maciej; TRIAILLE Jean Paul; TSAKALIDIS Anastasios; URZI BRANCATI Maria Cesira
Exporting and productivity as part of the growth process: Causal evidence from a data-driven structural VAR By Tommaso Ciarli; Alex Coad; Alessio Moneta
Recalibrating the Reported Returns to Agricultural R&D: What if We All Heeded Griliches? By Rao, Xudong; Hurley, Terrance M.; Pardey, Philip G.
Valuing Private Equity Strip by Strip By Arpit Gupta; Stijn Van Nieuwerburgh
Changing Fortunes: Long-Termism—G-Zero, Artificial Intelligence and Debt By Stephen S. Poloz
CRAN R Package ‘bayesvl: Visually Learning the Graphical Structure of Bayesian Networks and Performing MCMC with 'Stan'' By Vuong, Quan-Hoang; La, Viet-Phuong; Ho, Toan Manh
Global trends towards urban street-network sprawl By Barrington-Leigh, Christopher Paul; Millard-Ball, Adam
Human vs. Machine: Disposition Effect Among Algorithmic and Human Day-traders By Karolis Liaudinskas

Wage Indexation and Jobs. A Machine Learning Approach

By:	Gert Bijnens; Shyngys Karimov; Jozef Konings
Abstract:	In 2015 Belgium suspended the automatic wage indexation for a period of 12 months in order to boost competitiveness and increase employment. This paper uses a novel, machine learning based approach to construct a counterfactual experiment. This artificial counterfactual allows us to analyze the employment impact of suspending the indexation mechanism. We find a positive impact on employment of 0.5 percent which corresponds to a labor demand elasticity of -0.25. This effect is more pronounced for manufacturing firms, where the impact on employment can reach 2 percent, which corresponds to a labor demand elasticity of -1.
Keywords:	labor demand, wage elasticity, counterfactual analysis, artificial control, machine learning
Date:	2019–11–27
URL:	http://d.repec.org/n?u=RePEc:ete:vivwps:643831&r=all
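
A back-of-the-envelope reading of the figures in the abstract above: a labor demand elasticity is the percent change in employment divided by the percent change in the wage cost, so each reported pair of employment effect and elasticity implies a wage-cost change. The abstract does not state that change; the function name below is hypothetical and the result is only what the reported numbers imply arithmetically.

```python
def implied_wage_change(employment_change_pct, elasticity):
    """Percent wage-cost change implied by an employment change and an elasticity.

    Uses the definition elasticity = (% change in employment) / (% change in wage).
    """
    return employment_change_pct / elasticity


# Figures taken from the abstract: +0.5% employment with elasticity -0.25 (all firms),
# and up to +2% employment with elasticity -1 (manufacturing).
overall = implied_wage_change(0.5, -0.25)
manufacturing = implied_wage_change(2.0, -1.0)

print(overall, manufacturing)
```

Both pairs imply the same wage-cost change of about minus 2 percent relative to the counterfactual, so the two reported elasticities are consistent with a single shock of that size.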

How Much Information Do Monetary Policy Committees Disclose? Evidence from the FOMC's Minutes and Transcripts

By:	Apel, Mikael (Monetary Policy Department, Central Bank of Sweden); Blix Grimaldi, Marianna (Swedish National Debt Office); Hull, Isaiah (Research Department, Central Bank of Sweden)
Abstract:	The purpose of central bank minutes is to give an account of monetary policy meeting discussions to outside observers, thereby enabling them to draw informed conclusions about future policy. However, minutes are by necessity a shortened and edited representation of a broader discussion. Consequently, they may omit information that is predictive of future policy decisions. To investigate this, we compare the information content of the FOMC's minutes and transcripts, focusing on three dimensions which are likely to be excluded from the minutes: 1) the committee's degree of hawkishness; 2) the chairperson's degree of hawkishness; and 3) the level of agreement between committee members. We measure committee and chairperson hawkishness with a novel dictionary that is constructed using the FOMC's minutes and transcripts. We measure agreement by performing deep transfer learning, a technique that involves training a deep learning model on one set of documents - U.S. congressional debates - and then making predictions on another: FOMC transcripts. Our findings suggest that transcripts are more informative than minutes and heightened committee agreement typically precedes policy rate increases.
Keywords:	Central Bank Communication; Monetary Policy; Machine Learning
JEL:	D71 D83 E52 E58
Date:	2019–11–01
URL:	http://d.repec.org/n?u=RePEc:hhs:rbnkwp:0381&r=all

Text Selection

By:	Bryan T. Kelly; Asaf Manela; Alan Moreira
Abstract:	Text data is ultra-high dimensional, which makes machine learning techniques indispensable for textual analysis. Text is often selected—journalists, speechwriters, and others craft messages to target their audiences’ limited attention. We develop an economically motivated high dimensional selection model that improves learning from text (and from sparse counts data more generally). Our model is especially useful when the choice to include a phrase is more interesting than the choice of how frequently to repeat it. It allows for parallel estimation, making it computationally scalable. A first application revisits the partisanship of US congressional speech. We find that earlier spikes in partisanship manifested in increased repetition of different phrases, whereas the upward trend starting in the 1990s is due to entirely distinct phrase selection. Additional applications show how our model can backcast, nowcast, and forecast macroeconomic indicators using newspaper text, and that it substantially improves out-of-sample fit relative to alternative approaches.
JEL:	C1 C4 C55 C58 E17 G12 G17
Date:	2019–11
URL:	http://d.repec.org/n?u=RePEc:nbr:nberwo:26517&r=all

Towards using responsible artificial intelligence in product recommender systems in marketing

By:	Christine Balagué (MMS - Département Management, Marketing et Stratégie - IMT - Institut Mines-Télécom [Paris] - TEM - Télécom Ecole de Management - IMT-BS - Institut Mines-Télécom Business School, LITEM - Laboratoire en Innovation, Technologies, Economie et Management - UEVE - Université d'Évry-Val-d'Essonne - IMT-BS - Institut Mines-Télécom Business School); El Mehdi Rochd (MMS - Département Management, Marketing et Stratégie - IMT - Institut Mines-Télécom [Paris] - TEM - Télécom Ecole de Management - IMT-BS - Institut Mines-Télécom Business School, LITEM - Laboratoire en Innovation, Technologies, Economie et Management - UEVE - Université d'Évry-Val-d'Essonne - IMT-BS - Institut Mines-Télécom Business School)
Abstract:	Most product recommender systems in marketing are based on artificial intelligence algorithms using machine learning or deep learning techniques. One of the current challenges for companies is to avoid negative effects of these product recommender systems on customers (or prospects), such as unfairness, bias, discrimination, opacity, and opinions encapsulated in the implemented recommender system algorithms. This research focuses on the fairness challenge. We first review the literature on the importance and challenges of using ethical algorithms. Second, we define the fairness concept and present the reasons why it is important for companies to address this issue in marketing. Third, we present the different methodologies used in recommender system algorithms. Using a dataset from the entertainment industry, we measure algorithmic fairness for each methodology and compare the results. Finally, we improve on the existing methods by proposing a new product recommender system that aims to increase fairness relative to previous methods without compromising recommendation performance.
Keywords:	Recommender systems,Ethics,Algorithms,Fairness
Date:	2019
URL:	http://d.repec.org/n?u=RePEc:hal:journl:hal-02332033&r=all

Does High Frequency Social Media Data Improve Forecasts of Low Frequency Consumer Confidence Measures?

By:	Steven F. Lehrer; Tian Xie; Tao Zeng
Abstract:	Social media data presents challenges for forecasters since one must convert text into data and deal with issues related to these measures being collected at different frequencies and volumes than traditional financial data. In this paper, we use a deep learning algorithm to measure sentiment within Twitter messages on an hourly basis and introduce a new method to undertake MIDAS that allows for a weaker discounting of historical data and is well-suited for this new data source. To evaluate the performance of our approach relative to alternative MIDAS strategies, we conduct an out-of-sample forecasting exercise for the consumer confidence index with both traditional econometric strategies and machine learning algorithms. Irrespective of the estimator used to conduct forecasts, our results show that (i) including consumer sentiment measures from Twitter greatly improves forecast accuracy, and (ii) there are substantial gains from our proposed MIDAS procedure relative to common alternatives.
JEL:	C58 G17
Date:	2019–11
URL:	http://d.repec.org/n?u=RePEc:nbr:nberwo:26505&r=all

From Gutenberg to Google: The Internet Is Adopted Earlier if Ancestors Had Advanced Information Technology in 1500 AD

By:	Ljunge, Martin (Research Institute of Industrial Economics (IFN))
Abstract:	Individuals with ancestry from countries with advanced information technology in 1500 AD, such as movable type and paper, adopt the internet faster than those with less advanced ancestry. The analysis illustrates persistence over five centuries in information technology adoption in European and U.S. populations. The results hold when excluding the most and least advanced ancestries, and when accounting for additional deep roots of development. Historical information technology is a better predictor of internet adoption than current development. A machine learning procedure supports the findings. Human capital is a plausible channel as 1500 AD information technology predicts early 20th century school enrollment, which predicts 21st century internet adoption. A three-stage model including human capital around 1990 yields similar results.
Keywords:	Internet; Technology diffusion; Information technology; Intergenerational transmission; Printing press
JEL:	D13 D83 J24 N70 O33 Z13
Date:	2019–12–18
URL:	http://d.repec.org/n?u=RePEc:hhs:iuiwop:1312&r=all

Competitive Imperfect Price Discrimination and Market Power

By:	Paul Belleflamme; Wing Man Wynne Lam; Wouter Vergote
Abstract:	Two duopolists compete in price on the market for a homogeneous product. They can ‘profile’ consumers, i.e., identify their valuations with some probability. If both firms can profile consumers but with different abilities, then they achieve positive expected profits at equilibrium. This provides a rationale for firms to (partially and unequally) share data about consumers, or for data brokers to sell different customer analytics to competing firms. Consumers prefer that both firms profile exactly the same set of consumers, or that only one firm profiles consumers, as this entails marginal cost pricing (so does a policy requiring list prices to be public). Otherwise, more protective privacy regulations have ambiguous effects on consumer surplus.
Keywords:	price discrimination, price dispersion, Bertrand competition, privacy, big data
JEL:	D11 D18 L12 L86
Date:	2019
URL:	http://d.repec.org/n?u=RePEc:ces:ceswps:_7964&r=all

Promise and Peril in the Smart City: Local Government in the Age of Digital Urbanism

By:	John Lorinc (Spacing magazine)
Abstract:	In the past few years, a growing number of urbanists, planners, technology companies, and governance experts have started to use the term “smart city.” Some define smart cities in terms of using emerging and established technologies to improve the performance of municipal systems. Others take a more expansive view that embeds these new systems in a broader vision of urban regions characterized by innovation-based economic activity, a highly educated labour force, and policy-making that leverages these new technologies to confront stubborn urban problems. The market for smart-city technologies – such as cutting-edge networked sensors, big-data repositories, powerful analytics software, and smart grids – has gathered momentum, as leading technology suppliers develop products and services geared to this domain. Entire new communities are being developed using smart-city systems, in some cases as proof-of-concept living labs. Yet the rapid adoption of consumer and security technologies that do not fall under the conventional “smart city” definition also has far-reaching impacts on municipal systems (such as housing, transportation, and policing), including those that have benefited from new smart-city systems. These include ride- and apartment-sharing apps, autonomous vehicles, and data-driven law enforcement or predictive policing applications. In other words, the emerging challenge facing municipal policymakers is to determine the degree of investment or procurement in purpose-built smart-city technologies while adapting regulatory and governance systems to respond to changes arising from the adoption of services such as Airbnb and Uber. At the same time, policymakers must consider some unfamiliar issues in responding to smart-city developments, including equity, privacy, algorithmic bias, and data governance.
This Forum paper draws on the insights and professional experiences of four individuals with informed perspectives on these questions: Tracey Cook, Executive Director, Municipal Licensing and Standards, City of Toronto; Pamela Robinson, Associate Professor, School of Urban and Regional Planning, Ryerson University; Peter Sloly, Partner and National Security and Justice Lead, Deloitte Canada; and Zachary Spicer, Visiting Researcher, Institute on Municipal Finance and Governance. The report concludes by observing that policymakers must be smart when thinking about the smart city trend and ensure that technologies are not adopted for their promised efficiencies only.
Keywords:	smart city, municipal policy, local governance, digital urbanism
Date:	2018–06
URL:	http://d.repec.org/n?u=RePEc:mfg:iforum:08&r=all

Digital Transformation in Transport, Construction, Energy, Government and Public Administration

By:	BALDINI Gianmarco (European Commission - JRC); BARBONI Marcello (European Commission - JRC); BONO Flavio (European Commission - JRC); DELIPETREV Blagoj (European Commission - JRC); DUCH BROWN Nestor (European Commission - JRC); FERNANDEZ MACIAS Enrique (European Commission - JRC); GKOUMAS Konstantinos (European Commission - JRC); JOOSSENS Elisabeth (European Commission - JRC); KALPAKA Anna (European Commission - JRC); NEPELSKI Daniel (European Commission - JRC); NUNES DE LIMA Maria (European Commission - JRC); PAGANO Andrea (European Commission - JRC); PRETTICO Giuseppe (European Commission - JRC); SANCHEZ MARTIN Jose Ignacio (European Commission - JRC); SOBOLEWSKI Maciej (European Commission - JRC); TRIAILLE Jean Paul (European Commission - JRC); TSAKALIDIS Anastasios (European Commission - JRC); URZI BRANCATI Maria Cesira (European Commission - JRC)
Abstract:	This report provides an analysis of digital transformation (DT) in a selection of policy areas covering transport, construction, energy, and digital government and public administration. DT refers in the report to the profound changes that are taking place in all sectors of the economy and society as a result of the uptake and integration of digital technologies in every aspect of human life. Digital technologies are having increasing impacts on the way of living, of working, on communication, and on social interaction of a growing share of the population. DT is expected to be a strategic policy area for a number of years to come and there is an urgent need to be able to identify and address current and future challenges for the economy and society, evaluating impact and identifying areas requiring policy intervention. Because of the very wide range of interrelated domains to be considered when analysing DT, a multidisciplinary approach was adopted to produce this report, involving experts from different domains. For each of the four sectors that are covered, the report presents an overview of DT, DT enablers and barriers, its economic and social impacts, and concludes with the way forward for policy and future research.
Keywords:	Digital transformation, Construction, Transport, Energy, Digital Government
Date:	2019–12
URL:	http://d.repec.org/n?u=RePEc:ipt:iptwpa:jrc116179&r=all

Exporting and productivity as part of the growth process: Causal evidence from a data-driven structural VAR

By:	Tommaso Ciarli; Alex Coad; Alessio Moneta
Abstract:	This paper introduces a little known category of estimators - Linear Non-Gaussian vector autoregression models that are acyclic or cyclic - imported from the machine learning literature, to revisit a well-known debate. Does exporting increase firm productivity? Or is it only more productive firms that remain in the export market? We focus on a relatively well-studied country (Chile) and on already-exporting firms (i.e. the intensive margin of exporting). We explicitly look at the co-evolution of productivity and growth, and attempt to ascertain both contemporaneous and lagged causal relationships. Our findings suggest that exporting does not have any causal influence on the other variables. Instead, export seems to be determined by other dimensions of firm growth. With respect to learning by exporting (LBE), we find no evidence that export growth causes productivity growth within the period and very little evidence that exporting growth has a causal effect on subsequent TFP growth.
Keywords:	Productivity; Exporting; Learning-by-exporting; Causality; Structural VAR; Independent Component Analysis.
Date:	2019–12–20
URL:	http://d.repec.org/n?u=RePEc:ssa:lemwps:2019/39&r=all

Recalibrating the Reported Returns to Agricultural R&D: What if We All Heeded Griliches?

By:	Rao, Xudong; Hurley, Terrance M.; Pardey, Philip G.
Abstract:	Zvi Griliches’ seminal analysis of hybrid corn spawned a large literature seeking to quantify and demonstrate the value of agricultural research and development (R&D) investments. The most important metric for quantifying the rate of return to R&D emerging from this literature is the internal rate of return (IRR), even though Griliches was skeptical of its usefulness as a metric in this context. An alternative metric, also reported by Griliches but not as commonly used in the subsequent returns-to-research literature, is the benefit-cost ratio (BCR). We assess how the implications of the returns to agricultural R&D literature may have differed if the BCR had become the standard rather than the IRR. We reveal that the IRR and BCR produce substantially different rankings of agricultural R&D projects; differences that persist even under substantial commodity and geographical aggregations of the BCR and IRR estimates. The median across 2,627 reported IRRs is 37.5 percent per year. Using data gleaned from 492 research evaluation studies, we developed and deployed a methodology to impute 2,126 BCRs (median of 5.4) and modified internal rates of returns, MIRRs (16.4 percent per year) assuming a uniform 10 percent per year discount rate and a 30-year research timeline.
Keywords:	Agricultural and Food Policy, Agricultural Finance, Research Methods/ Statistical Methods
Date:	2019–12
URL:	http://d.repec.org/n?u=RePEc:ags:umaesp:298430&r=all
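
The relationship between the median BCR and the median MIRR reported in the abstract above can be illustrated with a short sketch. It assumes the textbook conversion in which benefits are reinvested, and costs discounted, at the same rate over a fixed horizon, so that MIRR = (1 + discount rate) x BCR^(1/years) - 1. The abstract does not spell out the formula, so this is an assumption, and the function name is hypothetical.

```python
def mirr_from_bcr(bcr, discount_rate, years):
    """Modified internal rate of return implied by a benefit-cost ratio.

    Assumes benefits are reinvested and costs discounted at `discount_rate`
    over a horizon of `years` (an assumption, not confirmed by the abstract).
    """
    return (1.0 + discount_rate) * bcr ** (1.0 / years) - 1.0


# Inputs from the abstract: median BCR of 5.4, 10 percent discount rate, 30-year timeline.
median_mirr = mirr_from_bcr(5.4, 0.10, 30)

print(f"{median_mirr:.1%}")  # prints 16.4%
```

Under these assumptions the conversion reproduces the 16.4 percent per year figure, well below the median reported IRR of 37.5 percent per year.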

Valuing Private Equity Strip by Strip

By:	Arpit Gupta; Stijn Van Nieuwerburgh
Abstract:	We propose a new valuation method for private equity investments. First, we construct a cash-flow replicating portfolio for the private investment, applying Machine Learning techniques on cash-flows on various listed equity and fixed income instruments. The second step values the replicating portfolio using a flexible asset pricing model that accurately prices the systematic risk in bonds of different maturities and a broad cross-section of equity factors. The method delivers a measure of the risk-adjusted profit earned on a PE investment and a time series for the expected return on PE fund categories. We apply the method to buyout, venture capital, real estate, and infrastructure funds, among others. Accounting for horizon-dependent risk and exposure to a broad cross-section of equity factors results in negative average risk-adjusted profits. Substantial cross-sectional variation and persistence in performance suggest some funds outperform. We also find declining expected returns on PE funds in the later part of the sample.
JEL:	G00 G11 G12 G23 G32 R30 R51
Date:	2019–11
URL:	http://d.repec.org/n?u=RePEc:nbr:nberwo:26514&r=all

Changing Fortunes: Long-Termism—G-Zero, Artificial Intelligence and Debt

By:	Stephen S. Poloz
Abstract:	This paper discusses three long-term forces that are acting on the global economy and their implications for companies and policy-makers: * the transition in geopolitics away from a global order based on international co-operation, or “deglobalization”; * the spread of new technology, particularly artificial intelligence, through the “fourth industrial revolution”; and * the steady buildup of debt—public and private—in most countries. Deglobalization leads to reduced investment and the deconstruction of global value chains, which will reduce global potential economic growth and living standards. The fourth industrial revolution will foster a period of stronger productivity growth and low inflation, accompanied by significant labour market disruptions. High and growing debt levels raise a range of risks associated with financial vulnerabilities. As well, the coincident rise in populism with doubts about the value of central bank independence risks an alignment of incentives between governments and highly indebted households, favouring a return to inflationary policies in the future. The paper concludes with a list of inferences and long-term policy implications. It was developed from a talk first delivered at the Spruce Meadows Changing Fortunes Round Table in Calgary, Alberta, in September 2019.
Keywords:	Financial stability; International topics; Monetary Policy; Trade Integration; Uncertainty and monetary policy
JEL:	E63 F02 F15 F53 F6 H O11 O33
Date:	2019–12
URL:	http://d.repec.org/n?u=RePEc:bca:bocadp:19-12&r=all

CRAN R Package ‘bayesvl: Visually Learning the Graphical Structure of Bayesian Networks and Performing MCMC with 'Stan''

By:	Vuong, Quan-Hoang; La, Viet-Phuong; Ho, Toan Manh (Thanh Tay University Hanoi)
Abstract:	Reference manual for R package "bayesvl: Visually Learning the Graphical Structure of Bayesian Networks and Performing MCMC with 'Stan'" developed by Vuong Quan Hoang and Viet Phuong La. The package is published in The Comprehensive R Archive Network (CRAN). For more information: https://cran.r-project.org/web/packages/bayesvl/index.html
Date:	2019–05–23
URL:	http://d.repec.org/n?u=RePEc:osf:osfxxx:94fh6&r=all

Global trends towards urban street-network sprawl

By:	Barrington-Leigh, Christopher Paul (McGill University); Millard-Ball, Adam
Abstract:	We present the first global time series of street-network sprawl — that is, sprawl as measured through the local connectivity of the street network. Using high-resolution data from OpenStreetMap and a satellite-derived time series of urbanization, we compute and validate changes over time in multidimensional street connectivity measures based on graph-theoretic and geographic concepts. We report on global, national, and city-level trends since 1975 in the Street-Network Disconnectedness Index (SNDi), based on every mapped node and edge in the world. Streets in new developments in 90% of the 134 most populous countries have become less connected since 1975, while just 29% show an improving trend since 2000. The same period saw a near doubling in the relative frequency of a street-network type characterized by high circuity, typical of gated communities. We identify persistence in street-network sprawl, indicative of path-dependent processes. Specifically, cities and countries with low connectivity in recent years also had relatively low preexisting connectivity in our earliest time period. We discuss implications for policy intervention in road building in new and expanding cities as a top priority for sustainable urban development.
Date:	2019–04–23
URL:	http://d.repec.org/n?u=RePEc:osf:osfxxx:2cp5u&r=all

Human vs. Machine: Disposition Effect Among Algorithmic and Human Day-traders

By:	Karolis Liaudinskas
Abstract:	Can humans achieve rationality, as defined by the expected utility theory, by automating their decision making? We use millisecond-stamped transaction-level data from the Copenhagen Stock Exchange to estimate the disposition effect – the tendency to sell winning but not losing stocks – among algorithmic and human professional day-traders. We find that: (1) the disposition effect is substantial among humans but virtually zero among algorithms; (2) this difference is not fully explained by rational explanations and is, at least partially, attributed to prospect theory, realization utility and beliefs in mean-reversion; (3) the disposition effect harms trading performance, which further deems such behavior irrational.
Keywords:	disposition effect, algorithmic trading, financial markets, rationality, automation
JEL:	D8 D91 G11 G12 G23 O3
Date:	2019–11
URL:	http://d.repec.org/n?u=RePEc:bge:wpaper:1133&r=all
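
For readers unfamiliar with how a disposition effect is quantified, one widely used measure (Odean's proportion of gains realized minus proportion of losses realized) can be sketched as follows. The abstract does not say which estimator the paper uses, so this is an illustration of the concept only; the function name and all counts are invented for the example.

```python
def disposition_effect(realized_gains, paper_gains, realized_losses, paper_losses):
    """Proportion of gains realized (PGR) minus proportion of losses realized (PLR).

    A positive value means winners are sold more readily than losers.
    """
    pgr = realized_gains / (realized_gains + paper_gains)
    plr = realized_losses / (realized_losses + paper_losses)
    return pgr - plr


# Invented counts: a trader who sells 30 of 40 winning positions
# but only 10 of 40 losing positions.
human_like = disposition_effect(30, 10, 10, 30)

# Invented counts: a trader who sells winners and losers at the same rate.
algorithm_like = disposition_effect(20, 20, 20, 20)

print(human_like, algorithm_like)
```

A value near zero, as in the second case, corresponds to the "virtually zero" disposition effect the abstract reports for algorithms.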

This nep-big issue is ©2019 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.

General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.