nep-big New Economics Papers
on Big Data
Issue of 2019‒11‒25
twenty-six papers chosen by
Tom Coupé
University of Canterbury

  1. Hello, World: Artificial intelligence and its use in the public sector By Jamie Berryhill; Kévin Kok Heang; Rob Clogher; Keegan McBride
  2. Machine Learning and Causality: The Impact of Financial Crises on Growth By Andrew J Tiffin
  3. Slave to the Algorithm? Why a 'right to an explanation' is probably not the remedy you are looking for By Edwards, Lilian; Veale, Michael
  4. Using Data Derived from Cellular Phone Locations to Estimate Visitation to Natural Areas: An Application to Water Recreation in New England, USA. By Merrill, Nathaniel; Atkinson, Sarina F.; Mulvaney, Kate K.; Mazzotta, Marisa J.; Bousquin, Justin
  5. Clarity, Surprises, and Further Questions in the Article 29 Working Party Draft Guidance on Automated Decision-Making and Profiling By Veale, Michael; Edwards, Lilian
  6. Bottom-up Leading Macroeconomic Indicators: An Application to Non-Financial Corporate Defaults using Machine Learning By Tyler Pike; Horacio Sapriza; Thomas Zimmermann
  7. A Coherent Framework for Predicting Emerging Market Credit Spreads with Support Vector Regression By Gary S. Anderson; Alena Audzeyeva
  8. From Transactions Data to Economic Statistics: Constructing Real-time, High-frequency, Geographic Measures of Consumer Spending By Aditya Aladangady; Shifrah Aron-Dine; Wendy E. Dunn; Laura Feiveson; Paul Lengermann; Claudia R. Sahm
  9. Introductory Remarks : a speech at "Nontraditional Data, Machine Learning, and Natural Language Processing in Macroeconomics," a research conference sponsored by the Federal Reserve Board, Washington, D.C., October 1, 2019. By Clarida, Richard H.
  10. Predicting Indian stock market using the psycho-linguistic features of financial news By B. Shravan Kumar; Vadlamani Ravi; Rishabh Miglani
  11. Latent Heterogeneity in the Marginal Propensity to Consume By Lewis, Daniel J.; Melcangi, Davide; Pilossoph, Laura
  12. Is Positive Sentiment in Corporate Annual Reports Informative? Evidence from Deep Learning By Mehran Azimi; Anup Agrawal
  13. Machine Learning et nouvelles sources de données pour le scoring de crédit By Christophe HURLIN; Christophe PERIGNON
  14. The Behavioral Economics of Artificial Intelligence: Lessons from Experiments with Computer Players By Christoph March
  15. Improving the Accuracy of Economic Measurement with Multiple Data Sources: The Case of Payroll Employment Data By Tomaz Cajner; Leland Crane; Ryan Decker; Adrian Hamins-Puertolas; Christopher J. Kurz
  16. Technological Progress and Monetary Policy: Managing the Fourth Industrial Revolution By Stephen S. Poloz
  18. An Unethical Optimization Principle By Nicholas Beale; Heather Battey; Anthony C. Davison; Robert S. MacKay
  19. Neural networks for option pricing and hedging: a literature review By Johannes Ruf; Weiguan Wang
  20. EQC and extreme weather events (part 2): Measuring the impact of insurance on New Zealand landslip, storm and flood recovery using nightlights By Sally Owen; Ilan Noy; Jacob Pástor-Paz; David Fleming
  21. The future of UK Carbon pricing: Artificial Intelligence and the Emissions Trading System By Ojo, Marianne
  22. Many Average Partial Effects: with an Application to Text Regression By Harold D. Chiang
  23. Index Tracking with Cardinality Constraints: A Stochastic Neural Networks Approach By Yu Zheng; Bowei Chen; Timothy M. Hospedales; Yongxin Yang
  24. What if We All Worked Gigs in the Cloud? The Economic Relevance of Digital Labour Platforms By Steven Engels; Monika Sherwood
  25. Using textual analysis to identify merger participants: Evidence from the U.S. banking industry By Katsafados, Apostolos G.; Androutsopoulos, Ion; Chalkidis, Ilias; Fergadiotis, Emmanouel; Leledakis, George N.; Pyrgiotakis, Emmanouil G.
  26. Mining the Automotive Industry: A Network Analysis of Corporate Positioning and Technological Trends By Stoehr, Niklas; Braesemann, Fabian; Zhou, Shi

  1. By: Jamie Berryhill; Kévin Kok Heang; Rob Clogher; Keegan McBride
    Abstract: Artificial Intelligence (AI) is an area of research and technology application that can have a significant impact on public policies and services in many ways. In just a few years, it is expected that the potential will exist to free up nearly one-third of public servants’ time, allowing them to shift from mundane tasks to high-value work. Governments can also use AI to design better policies and make better decisions, improve communication and engagement with citizens and residents, and improve the speed and quality of public services. While the potential benefits of AI are significant, attaining them is not an easy task. Government use of AI trails that of the private sector; the field is complex and has a steep learning curve; and the purpose of, and context within, government are unique and present a number of challenges.
    Date: 2019–11–21
  2. By: Andrew J Tiffin
    Abstract: Machine learning tools are well known for their success in prediction. But prediction is not causation, and causal discovery is at the core of most questions concerning economic policy. Recently, however, the literature has focused more on issues of causality. This paper gently introduces some leading work in this area, using a concrete example—assessing the impact of a hypothetical banking crisis on a country’s growth. By enabling consideration of a rich set of potential nonlinearities, and by allowing individually-tailored policy assessments, machine learning can provide an invaluable complement to the skill set of economists within the Fund and beyond.
    Date: 2019–11–01
  3. By: Edwards, Lilian; Veale, Michael
    Abstract: Cite as Lilian Edwards and Michael Veale, 'Slave to the Algorithm? Why a 'right to an explanation' is probably not the remedy you are looking for' (2017) 16 Duke Law and Technology Review 18–84. (First posted on SSRN 24 May 2017) Algorithms, particularly machine learning (ML) algorithms, are increasingly important to individuals’ lives, but have caused a range of concerns revolving mainly around unfairness, discrimination and opacity. Transparency in the form of a “right to an explanation” has emerged as a compellingly attractive remedy since it intuitively promises to “open the black box” to promote challenge, redress, and hopefully heightened accountability. Amidst the general furore over algorithmic bias we describe, any remedy in a storm has looked attractive. However, we argue that a right to an explanation in the EU General Data Protection Regulation (GDPR) is unlikely to present a complete remedy to algorithmic harms, particularly in some of the core “algorithmic war stories” that have shaped recent attitudes in this domain. Firstly, the law is restrictive, unclear, or even paradoxical concerning when any explanation-related right can be triggered. Secondly, even navigating this, the legal conception of explanations as “meaningful information about the logic of processing” may not be provided by the kind of ML “explanations” computer scientists have developed, partially in response. ML explanations are restricted both by the type of explanation sought, the dimensionality of the domain and the type of user seeking an explanation. However, “subject-centric” explanations (SCEs) focussing on particular regions of a model around a query show promise for interactive exploration, as do explanation systems based on learning a model from outside rather than taking it apart (pedagogical vs decompositional explanations) in dodging developers' worries of IP or trade secrets disclosure.
Based on our analysis, we fear that the search for a “right to an explanation” in the GDPR may be at best distracting, and at worst nurture a new kind of “transparency fallacy.” But all is not lost. We argue that other parts of the GDPR related (i) to the right to erasure ("right to be forgotten") and the right to data portability; and (ii) to privacy by design, Data Protection Impact Assessments and certification and privacy seals, may have the seeds we can use to make algorithms more responsible, explicable, and human-centred.
    Date: 2017–11–18
  4. By: Merrill, Nathaniel; Atkinson, Sarina F.; Mulvaney, Kate K.; Mazzotta, Marisa J.; Bousquin, Justin
    Abstract: We introduce and validate the use of commercially-available datasets of human mobility based on cell phone locational data to estimate visitation to natural areas. By combining this data with on-the-ground observations of visitation to water recreation areas in New England, we fit a model to estimate daily visitation for four months to over 500 sites. The results show the potential for this new big data source of human mobility to overcome limitations in traditional methods of estimating visitation and to provide consistent information at policy-relevant scales. The high-resolution information in both space and time provided by cell phone location derived data creates opportunities for developing next-generation models of human interactions with the natural environment. However, the opaque and rapidly developing methods for processing locational information by the data providers required a calibration and validation against data collected by traditional means to confidently reproduce the desired estimates of visitation.
    Date: 2019–11–08
  5. By: Veale, Michael; Edwards, Lilian
    Abstract: Cite as: Michael Veale and Lilian Edwards, 'Clarity, Surprises, and Further Questions in the Article 29 Working Party Draft Guidance on Automated Decision-Making and Profiling' (forthcoming) Computer Law and Security Review The new Article 29 Data Protection Working Party’s draft guidance on automated decision-making and profiling seeks to clarify the European data protection (DP) law’s little-used right to prevent automated decision-making, as well as the provisions around profiling more broadly, in the run-up to the General Data Protection Regulation. In this paper, we analyse these new guidelines in the context of recent scholarly debates and technological concerns. They foray into the less-trodden areas of bias and non-discrimination, the significance of advertising, the nature of “solely” automated decisions, impacts upon groups and the inference of special categories of data — at times, appearing more to be making or extending rules than to be interpreting them. At the same time, they provide only partial clarity — and perhaps even some extra confusion — around both the much discussed “right to an explanation” and the apparent prohibition on significant automated decisions concerning children. The Working Party appear to feel less mandated to adjudicate in these conflicts between the recitals and the enacting articles than to explore altogether new avenues. Nevertheless, the directions they choose to explore are particularly important ones for the future governance of machine learning and artificial intelligence in Europe and beyond.
    Date: 2017–11–18
  6. By: Tyler Pike; Horacio Sapriza; Thomas Zimmermann
    Abstract: This paper constructs a leading macroeconomic indicator from microeconomic data using recent machine learning techniques. Using tree-based methods, we estimate probabilities of default for publicly traded non-financial firms in the United States. We then use the cross-section of out-of-sample predicted default probabilities to construct a leading indicator of non-financial corporate health. The index predicts real economic outcomes such as GDP growth and employment up to eight quarters ahead. Impulse responses validate the interpretation of the index as a measure of financial stress.
    Keywords: Corporate Default ; Early Warning Indicators ; Economic Activity ; Machine Learning
    JEL: C53 E32 G33
    Date: 2019–09–20
  7. By: Gary S. Anderson; Alena Audzeyeva
    Abstract: We propose a coherent framework using support vector regression (SVR) for generating and ranking a set of high quality models for predicting emerging market sovereign credit spreads. Our framework adapts a global optimization algorithm employing an hv-block cross-validation metric, pertinent for models with serially correlated economic variables, to produce robust sets of tuning parameters for SVR kernel functions. In contrast to previous approaches identifying a single "best" tuning parameter setting, a task that is pragmatically improbable to achieve in many applications, we proceed with a collection of tuning parameter candidates, employing the Model Confidence Set test to select the most accurate models from the collection of promising candidates. Using bond credit spread data for three large emerging market economies and an array of input variables motivated by economic theory, we apply our framework to identify relatively small sets of SVR models with superior out-of-sample forecasting performance. Benchmarking our SVR forecasts against random walk and conventional linear model forecasts provides evidence for the notably superior forecasting accuracy of SVR-based models. In contrast to routinely used linear model benchmarks, the SVR-based models can generate accurate forecasts using only a small set of input variables limited to the country-specific credit-spread-curve factors, lending some support to the rational expectation theory of the term structure in the context of emerging market credit spreads. Consequently, our evidence indicates a better ability of highly flexible SVR to capture investor expectations about future spreads reflected in today's credit spread curve.
    Keywords: Support vector machine regressions ; Out-of-sample predictability ; Sovereign credit spreads ; Machine learning ; Emerging markets ; Model confidence set
    JEL: G17 F15 G15 F34 F17 C53
    Date: 2019–10–17
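The hv-block cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual construction in which each validation block covers `v` observations on either side of a centre point, and a further `h` observations on each side are dropped from the training set so that serially correlated neighbours do not leak into it. The function name `hv_block_splits` is hypothetical, and restricting centres to points where the full validation block fits is a simplifying assumption.

```python
def hv_block_splits(n, v, h):
    """Yield (train, validation) index lists for hv-block cross-validation.

    For each centre i, the validation block is i-v..i+v. A further h
    observations on each side are dropped, so the training indices are
    separated from the validation block by a gap of h points.
    """
    # Assumption: only centres where the full validation block fits are used.
    for i in range(v, n - v):
        val = list(range(i - v, i + v + 1))
        lo, hi = i - v - h, i + v + h
        train = [t for t in range(n) if t < lo or t > hi]
        yield train, val


splits = list(hv_block_splits(n=10, v=1, h=1))
train, val = splits[4]  # centre i = 5
print(val)    # [4, 5, 6]
print(train)  # [0, 1, 2, 8, 9]
```

Each candidate tuning-parameter setting would then be scored by averaging the validation error over these splits.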
  8. By: Aditya Aladangady; Shifrah Aron-Dine; Wendy E. Dunn; Laura Feiveson; Paul Lengermann; Claudia R. Sahm
    Abstract: Access to timely information on consumer spending is important to economic policymakers. The Census Bureau's monthly retail trade survey is a primary source for monitoring consumer spending nationally, but it is not well suited to study localized or short-lived economic shocks. Moreover, lags in the publication of the Census estimates and subsequent, sometimes large, revisions diminish its usefulness for real-time analysis. Expanding the Census survey to include higher frequencies and subnational detail would be costly and would add substantially to respondent burden. We take an alternative approach to fill these information gaps. Using anonymized transactions data from a large electronic payments technology company, we create daily estimates of retail spending at detailed geographies. Our daily estimates are available only a few days after the transactions occur, and the historical time series are available from 2010 to the present. When aggregated to the national level, the pattern of monthly growth rates is similar to the official Census statistics. We discuss two applications of these new data for economic analysis: First, we describe how our monthly spending estimates are useful for real-time monitoring of aggregate spending, especially during the government shutdown in 2019, when Census data were delayed and concerns about the economy spiked. Second, we show how the geographic detail allowed us to quantify in real time the spending effects of Hurricanes Harvey and Irma in 2017.
    Keywords: Big data ; Consumer spending ; Macroeconomic forecasting
    Date: 2019–08
  9. By: Clarida, Richard H. (Board of Governors of the Federal Reserve System (U.S.))
    Date: 2019–10–01
  10. By: B. Shravan Kumar; Vadlamani Ravi; Rishabh Miglani
    Abstract: Financial forecasting using news articles is an emerging field. In this paper, we propose hybrid intelligent models for stock market prediction using psycholinguistic variables (LIWC and TAALES) extracted from news articles as predictor variables. For prediction, we employ various intelligent techniques such as Multilayer Perceptron (MLP), Group Method of Data Handling (GMDH), General Regression Neural Network (GRNN), Random Forest (RF), Quantile Regression Random Forest (QRRF), Classification and Regression Tree (CART) and Support Vector Regression (SVR). We experiment on data for the stocks of 12 companies listed on the Bombay Stock Exchange (BSE). We apply chi-squared and maximum relevance and minimum redundancy (MRMR) feature selection techniques to the psycho-linguistic features obtained from the news articles. After extensive experimentation, using the Diebold-Mariano test, we conclude that GMDH and GRNN are statistically the best techniques, in that order, with respect to the MAPE and NRMSE values.
    Date: 2019–11
  11. By: Lewis, Daniel J. (Federal Reserve Bank of New York); Melcangi, Davide (Federal Reserve Bank of New York); Pilossoph, Laura (Federal Reserve Bank of New York)
    Abstract: We estimate the distribution of marginal propensities to consume (MPCs) using a new approach based on the fuzzy C-means algorithm (Dunn 1973; Bezdek 1981). The algorithm generalizes the K-means methodology of Bonhomme and Manresa (2015) to allow for uncertain group assignment and to recover unobserved heterogeneous effects in cross-sectional and short panel data. We extend the fuzzy C-means approach from the cluster means case to a fully general regression setting and derive asymptotic properties of the corresponding estimators by showing that the problem admits a generalized method of moments (GMM) formulation. We apply the estimator to the 2008 tax rebate and household consumption data, exploiting the randomized timing of disbursements. We find a considerable degree of heterogeneity in MPCs, which varies by consumption good, and provide evidence on their observable determinants, without requiring ex ante assumptions about such relationships. Our aggregated heterogeneous results suggest that the partial equilibrium consumption response to the stimulus was twice as large as what is implied by homogeneous estimates.
    Keywords: marginal propensity to consume; consumption; tax rebate; heterogeneous treatment effects; machine learning; clustering; C-means; K-means
    JEL: D12 D91 E21 E32 E62
    Date: 2019–11–01
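For readers unfamiliar with the fuzzy C-means algorithm this abstract builds on, the following is a minimal sketch of the classic cluster-means case only, on scalar data. It does not reproduce the authors' regression extension or their GMM formulation. The function name `fuzzy_c_means_1d`, the toy data, the starting centres, and the fuzzifier `m = 2` are all illustrative assumptions.

```python
def fuzzy_c_means_1d(xs, centers, m=2.0, iters=50):
    """Classic fuzzy C-means on scalar data (cluster-means case only)."""
    eps = 1e-12  # avoids division by zero when a point sits on a centre
    for _ in range(iters):
        # Membership step: u[i][j] = 1 / sum_k (d_ij / d_ik)^(2/(m-1)),
        # so each point holds a soft (uncertain) assignment to every group.
        u = []
        for x in xs:
            d = [abs(x - c) + eps for c in centers]
            u.append([1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d)
                      for dj in d])
        # Centre step: weighted mean of the data with weights u^m.
        centers = [
            sum(u[i][j] ** m * xs[i] for i in range(len(xs)))
            / sum(u[i][j] ** m for i in range(len(xs)))
            for j in range(len(centers))
        ]
    return centers, u


# Hypothetical toy data with two well-separated groups.
xs = [0.0, 0.2, 0.1, 9.9, 10.0, 10.1]
centers, u = fuzzy_c_means_1d(xs, centers=[1.0, 8.0])
print(centers)  # centres close to 0.1 and 10.0
print(u[0])     # memberships of the first point; they sum to 1
```

Unlike K-means, no observation is assigned to a single group: the membership weights in `u` carry the uncertainty about group assignment that the abstract refers to.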
  12. By: Mehran Azimi; Anup Agrawal
    Abstract: We use a novel text classification approach from deep learning to more accurately measure sentiment in a large sample of 10-Ks. In contrast to most prior literature, we find that positive, and negative, sentiment predicts abnormal return and abnormal trading volume around 10-K filing date and future firm fundamentals and policies. Our results suggest that the qualitative information contained in corporate annual reports is richer than previously found. Both positive and negative sentiments are informative when measured accurately, but they do not have symmetric implications, suggesting that a net sentiment measure advocated by prior studies would be less informative.
    JEL: C81 G10 G14 G30
    Date: 2019–08–21
  13. By: Christophe HURLIN; Christophe PERIGNON
    Date: 2019
  14. By: Christoph March
    Abstract: Artificial intelligence (AI) is starting to pervade economic and social life, rendering strategic interactions with artificial agents more and more common. At the same time, experimental economic research has increasingly employed computer players to advance our understanding of strategic interaction in general. What can this strand of research teach us about an AI-shaped future? I review 90 experimental studies using computer players. I find that, in a nutshell, humans act more selfishly and more rationally in the presence of computer players, and they are often able to exploit these players. Still, many open questions prevail.
    Keywords: experiment, robots, computer players, survey
    JEL: C90 C92 O33
    Date: 2019
  15. By: Tomaz Cajner; Leland Crane; Ryan Decker; Adrian Hamins-Puertolas; Christopher J. Kurz
    Abstract: This paper combines information from two sources of U.S. private payroll employment to increase the accuracy of real-time measurement of the labor market. The sources are the Current Employment Statistics (CES) from BLS and microdata from the payroll processing firm ADP. We briefly describe the ADP-derived data series, compare it to the BLS data, and describe an exercise that benchmarks the data series to an employment census. The CES and the ADP employment data are each derived from roughly equal-sized samples. We argue that combining CES and ADP data series reduces the measurement error inherent in both data sources. In particular, we infer "true" unobserved payroll employment growth using a state-space model and find that the optimal predictor of the unobserved state puts approximately equal weight on the CES and ADP-derived series. Moreover, the estimated state contains information about future readings of payroll employment.
    Keywords: Big data ; Economic measurement ; Labor market ; State-space models
    JEL: J2 J11 C53 C55 C81
    Date: 2019–09–05
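The intuition behind the "approximately equal weight" result can be seen in a much simpler static setting than the paper's state-space model. The sketch below is only an illustration under stated assumptions: two unbiased estimates of the same quantity with independent errors, combined by inverse-variance weighting. The function name `combine` and all numbers are hypothetical and are not taken from the paper.

```python
def combine(est_a, var_a, est_b, var_b):
    """Minimum-variance linear combination of two unbiased, independent estimates."""
    w_a = var_b / (var_a + var_b)  # inverse-variance weight on estimate A
    w_b = 1.0 - w_a
    combined = w_a * est_a + w_b * est_b
    combined_var = var_a * var_b / (var_a + var_b)
    return combined, combined_var, (w_a, w_b)


# Hypothetical monthly job-growth readings (thousands) with equal error variances.
combined, combined_var, weights = combine(180.0, 400.0, 220.0, 400.0)
print(weights)       # (0.5, 0.5)
print(combined)      # 200.0
print(combined_var)  # 200.0
```

When the two error variances are equal, the weights are equal and the combined error variance is half that of either source, which is the sense in which pooling two similarly noisy series reduces measurement error.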
  16. By: Stephen S. Poloz
    Abstract: This paper looks at the implications for monetary policy of the widespread adoption of artificial intelligence and machine learning, which is sometimes called the “fourth industrial revolution.” The paper reviews experiences from the previous three industrial revolutions, developing a template of shared characteristics: * new technology displaces workers; * investor hype linked to the new technology leads to financial excesses; * new types of jobs are created; * productivity and potential output rise; * prices and inflation fall; and * real debt burdens increase, which can provoke crises when asset prices crash. The experience of the Federal Reserve during 1995–2006 is particularly instructive. The paper uses the Bank of Canada’s main structural model, ToTEM (Terms-of-Trade Economic Model), to replicate that experience and consider options for monetary policy. Under a Taylor rule, monetary policy may allow growth to run as long as inflation remains subdued, easing the burden of adjustment on those workers directly affected by the new technology, while macroprudential policies help check financial excesses. This argues for a family of Taylor rules enhanced by the addition of financial stability considerations.
    Keywords: Economic models; Financial stability; Monetary policy framework; Uncertainty and monetary policy
    JEL: C5 E3 O11 O33
    Date: 2019–11
  17. By: Alevtina Repina (National Research University Higher School of Economics)
    Abstract: Artificial intelligence is having a transformative effect on the business world. The legal services industry, among others, is susceptible to these transformations, but being a part of the legal system, it adopts novelties more slowly than other service-based industries. The issue of AI acceptance for legal services is widely discussed in Russia. The opportunities and threats of AI implementation are the subjects of academic research, business enquiries, experts' assessments, and professional community discussions. Still, all those pieces of evidence are biased by the objectives of the specific research and the methodology used, and most have little or no empirical data on which to ground conclusions. The absence of empirical evidence on the state of the art of AI in legal services and on users’ expectations of AI implementation hinders further research on various topics, from legal firms’ management and legal innovations to the lawyering process and access to justice. This paper confirms expert opinions regarding AI technologies and their implementations for legal services, suggesting the cooperation of lawyer and AI in legal service rendering rather than competition. Russian lawyers appear to have experience of using very advanced AI solutions, including those that are unavailable directly on the Russian market. The expectations of lawyers as users of AI technologies could be described as uncertain, which means that further extension of AI implementation is still a disputable issue.
    Keywords: artificial intelligence, legal services, users, expectations, technology adoption, survey, Russia
    JEL: O33 O14 D22
    Date: 2019
  18. By: Nicholas Beale; Heather Battey; Anthony C. Davison; Robert S. MacKay
    Abstract: If an artificial intelligence aims to maximise risk-adjusted return, then under mild conditions it is disproportionately likely to pick an unethical strategy unless the objective function allows sufficiently for this risk. Even if the proportion ${\eta}$ of available unethical strategies is small, the probability ${p_U}$ of picking an unethical strategy can become large; indeed unless returns are fat-tailed ${p_U}$ tends to unity as the strategy space becomes large. We define an Unethical Odds Ratio Upsilon (${\Upsilon}$) that allows us to calculate ${p_U}$ from ${\eta}$, and we derive a simple formula for the limit of ${\Upsilon}$ as the strategy space becomes large. We give an algorithm for estimating ${\Upsilon}$ and ${p_U}$ in finite cases and discuss how to deal with infinite strategy spaces. We show how this principle can be used to help detect unethical strategies and to estimate ${\eta}$. Finally we sketch some policy implications of this work.
    Date: 2019–11
  19. By: Johannes Ruf; Weiguan Wang
    Abstract: Neural networks have been used as a nonparametric method for option pricing and hedging since the early 1990s. Well over a hundred papers have been published on this topic. This note intends to provide a comprehensive review. Papers are compared in terms of input features, output variables, benchmark models, performance measures, data partition methods, and underlying assets. Furthermore, related work and regularisation techniques are discussed.
    Date: 2019–11
  20. By: Sally Owen (Victoria University of Wellington); Ilan Noy (Victoria University of Wellington); Jacob Pástor-Paz (Motu Economic and Public Policy Research); David Fleming (Motu Economic and Public Policy Research)
    Abstract: Climate change is predicted to make extreme weather events worse and more frequent in many places around the world. In New Zealand, the Earthquake Commission (EQC) was created to provide insurance for earthquakes. In some circumstances, however, homeowners affected by extreme weather events can also make claims to the EQC – for landslip, storm or flood events. In this paper, we explore the impact of this public natural hazard insurance on community recovery from weather-related events. We do this by using a proxy for short-term economic recovery: satellite imagery of average monthly night-time radiance. Linking these night-time light data to precipitation data records, we compare houses which experienced damage from extreme rainfall episodes to those that suffered no damage even though they experienced extreme rainfall. Using data from three recent intense storms, we find that households which experienced damage, and were paid in a timely manner by EQC, did not fare any worse than households that suffered no damage from these extreme events. This finding suggests that EQC insurance is serving its stated purpose by protecting households from the adverse impact of extreme weather events.
    Keywords: climate change, extreme weather, public insurance, recovery, New Zealand
    JEL: Q15 Q10 Q17 Q02
    Date: 2019–11
  21. By: Ojo, Marianne
    Abstract: As well as highlighting factors which should be taken into consideration in the design of a UK Emissions Trading System, this paper aims to address, in particular, the question of how “in the absence of historical emissions data, the regulator is able to make an environmentally robust assessment of the eligibility and emissions target of a new entrant for the Small Emitter Opt-Out or the Ultra-Small Emitters Exemption, without undermining the environmental integrity of the system”.
    Keywords: Emissions Trading System; Artificial Intelligence; Vertical Integration; Block chain systems; Sustainable Development; energy; climate, environment; Ultra-Small Emitters Exemption; trade relationships; transparency; information disclosure
    JEL: E6 E62 F1 F17 F18 G3 G38 K2 M4
    Date: 2019–07
  22. By: Harold D. Chiang
    Abstract: We study estimation, pointwise and simultaneous inference, and confidence intervals for many average partial effects of lasso Logit. Focusing on high-dimensional cluster-sampling environments, we propose a new average partial effect estimator and explore its asymptotic properties. Practical penalty choices compatible with our asymptotic theory are also provided. The proposed estimator allows for valid inference without requiring the oracle property. We provide easy-to-implement algorithms for cluster-robust high-dimensional hypothesis testing and construction of simultaneously valid confidence intervals using a multiplier cluster bootstrap. We apply the proposed algorithms to the text regression model of Wu (2018) to examine the presence of gendered language on the internet.
    JEL: C23 C25 C55
    Date: 2019–10–06
  23. By: Yu Zheng; Bowei Chen; Timothy M. Hospedales; Yongxin Yang
    Abstract: Partial (replication) index tracking is a popular passive investment strategy. It aims to replicate the performance of a given index by constructing a tracking portfolio which contains some constituents of the index. The tracking error optimisation is quadratic and NP-hard when taking the $\ell_0$ constraint into account, so it is usually solved by heuristic methods such as evolutionary algorithms. This paper introduces a simple, efficient and scalable connectionist model as an alternative. We propose a novel reparametrisation method and then solve the optimisation problem with stochastic neural networks. The proposed approach is examined with S&P 500 index data for more than 10 years and compared with widely used index tracking approaches such as forward and backward selection and the largest market capitalisation methods. The empirical results show our model achieves excellent performance. Compared with the benchmarked models, our model has the lowest tracking error across a range of portfolio sizes. Meanwhile, it offers comparable performance to the others on secondary criteria such as volatility, Sharpe ratio and maximum drawdown.
    Date: 2019–11
  24. By: Steven Engels; Monika Sherwood
    Abstract: This paper explores the increasing diffusion of digital labour platforms, i.e. online software which facilitates the interaction between buyers and sellers of paid labour services through matching algorithms and structured information exchange. Although the phenomenon itself has only recently started to develop, its prevalence is rapidly increasing. We illustrate the various forms digital labour platforms can take, frame the issues they raise in the broader debate on digitalisation and succinctly describe the various angles from which the Commission services have so far approached digital labour platforms in analytical and policy work. The paper also explores the impact the rapid growth of the considered platforms could potentially have on the wider economy and raises three sets of relevant economic policy questions, focusing on: • the contribution of digital labour platforms to overall labour market functioning (including wages) and productivity; • the possible impact of digital labour platforms on macro-economic aggregates such as GDP and total employment at EU and Member State level; • the impact of the growing participation in the labour markets intermediated by online platforms on public finances.
    JEL: J01 E24
    Date: 2019–06
  25. By: Katsafados, Apostolos G.; Androutsopoulos, Ion; Chalkidis, Ilias; Fergadiotis, Emmanouel; Leledakis, George N.; Pyrgiotakis, Emmanouil G.
    Abstract: In this paper, we use the sentiment of annual reports to gauge the likelihood of a bank to participate in a merger transaction. We conduct our analysis on a sample of annual reports of listed U.S. banks over the period 1997 to 2015, using the Loughran and McDonald’s lists of positive and negative words for our textual analysis. We find that a higher frequency of positive (negative) words in a bank’s annual report relates to a higher probability of becoming a bidder (target). Our results remain robust to the inclusion of bank-specific control variables in our logistic regressions.
    Keywords: Textual analysis; text sentiment; bank mergers and acquisitions; acquisition likelihood
    JEL: G00 G17 G21 G34
    Date: 2019–11
  26. By: Stoehr, Niklas; Braesemann, Fabian; Zhou, Shi
    Abstract: The digital transformation is driving revolutionary innovations and new market entrants threaten established sectors of the economy such as the automotive industry. Following the need for monitoring shifting industries, we present a network-centred analysis of car manufacturer web pages. Solely exploiting publicly-available information, we construct large networks from web pages and hyperlinks. The network properties disclose the internal corporate positioning of the three largest automotive manufacturers, Toyota, Volkswagen and Hyundai, with respect to innovative trends and their international outlook. We tag web pages concerned with topics like e-mobility & environment or autonomous driving, and investigate their relevance in the network. Toyota and Hyundai are concerned with e-mobility throughout large parts of their web page network; Volkswagen devotes more specialized sections to it, but reveals a strong focus on autonomous driving. Sentiment analysis on individual web pages uncovers a relationship between page linking and use of positive language, particularly with respect to innovative trends. Web pages of the same country domain form clusters of different size in the network that reveal strong correlations with sales market orientation. Our approach is highly transparent, reproducible and data driven, and could be used to gain complementary insights into innovative strategies of firms and competitive landscapes.
    Date: 2019–10–09

This nep-big issue is ©2019 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.