nep-big New Economics Papers
on Big Data
Issue of 2021‒05‒10
24 papers chosen by
Tom Coupé
University of Canterbury

  1. Trade sentiment and the stock market: new evidence based on big data textual analysis of Chinese media By Amstad, Marlene; Gambacorta, Leonardo; He, Chao; Xia, Fan Dora
  2. Machine Collaboration By Qingfeng Liu; Yang Feng
  3. Business analytics meets artificial intelligence: Assessing the demand effects of discounts on Swiss train tickets By Martin Huber; Jonas Meier; Hannes Wallimann
  4. Detecting bid-rigging coalitions in different countries and auction formats By David Imhof; Hannes Wallimann
  5. Human Biographical Record (HBR) By Nekoei, Arash; Sinn, Fabian
  6. Learning Bermudans By Riccardo Aiolfi; Nicola Moreni; Marco Bianchetti; Marco Scaringi; Filippo Fogliani
  7. Algorithm is Experiment: Machine Learning, Market Design, and Policy Eligibility Rules By Yusuke Narita; Kohei Yata
  8. The Gender Pay Gap Revisited with Big Data: Do Methodological Choices Matter? By STRITTMATTER, Anthony; Wunsch, Conny
  9. Stock Price Forecasting in Presence of Covid-19 Pandemic and Evaluating Performances of Machine Learning Models for Time-Series Forecasting By Navid Mottaghi; Sara Farhangdoost
  10. Algorithm is Experiment: Machine Learning, Market Design, and Policy Eligibility Rules By Yusuke Narita; Kohei Yata
  11. Automatic Debiased Machine Learning via Neural Nets for Generalized Linear Regression By Victor Chernozhukov; Whitney K. Newey; Victor Quintas-Martinez; Vasilis Syrgkanis
  12. The Impact of COVID-19 on Conspiracy Attitudes and Risk Perception in Italy: an Infodemiological Survey through Google Trends. By Rovetta, Alessandro
  13. Household Savings and Monetary Policy under Individual and Aggregate Stochastic Volatility By Gorodnichenko, Yuriy; Maliar, Lilia; Maliar, Serguei; Naubert, Christopher
  14. Methods for small area population forecasts: state-of-the-art and research needs By Wilson, Thomas; Grossman, Irina; Alexander, Monica; Rees, Philip; Temple, Jeromey
  15. Using machine learning and qualitative interviews to design a five-question women's agency index By Biradavolu, Monica; Cooper, Jan; Jayachandran, Seema
  16. Can Machine Learning Catch the COVID-19 Recession? By Goulet Coulombe, Philippe; Marcellino, Massimiliano; Stevanovic, Dalibor
  17. MRC-LSTM: A Hybrid Approach of Multi-scale Residual CNN and LSTM to Predict Bitcoin Price By Qiutong Guo; Shun Lei; Qing Ye; Zhiyang Fang
  18. Using Machine Learning to Analyze Climate Change Technology Transfer (CCTT) By Kulkarni, Shruti
  19. Optimal Targeting in Fundraising: A Machine-Learning Approach By Tobias Cagala; Ulrich Glogowsky; Johannes Rincke; Anthony Strittmatter
  20. AI Watch Index. Policy relevant dimensions to assess Europe’s performance in artificial intelligence By Montserrat Lopez-Cobo; Riccardo Righi; Sofia Samoili; Miguel Vazquez-Prada Baillet; Melisande Cardona; Giuditta De-Prato
  21. Turn, turn, turn: A digital history of German historiography, 1950-2019 By Wehrheim, Lino; Jopp, Tobias Alexander; Spoerer, Mark
  22. The Voice of Monetary Policy By Gorodnichenko, Yuriy; Pham, Tho; Talavera, Oleksandr
  23. Nowcasting with Large Bayesian Vector Autoregressions By Cimadomo, Jacopo; Giannone, Domenico; Lenza, Michele; Monti, Francesca; Sokol, Andrej
  24. Artificial Intelligence, Globalization, and Strategies for Economic Development By Korinek, Anton; Stiglitz, Joseph E

  1. By: Amstad, Marlene; Gambacorta, Leonardo; He, Chao; Xia, Fan Dora
    Abstract: Trade tensions between China and the US have played an important role in swinging global stock markets, but their effects are difficult to quantify. We develop a novel trade sentiment index (TSI) based on textual analysis and machine learning applied to a big data pool, which assesses the positive or negative tone of Chinese media coverage, and we evaluate its capacity to explain the behaviour of 60 global equity markets. We find the TSI contributes around 10% of the model's capacity to explain stock price variability from January 2018 to June 2019 in countries that are more exposed to the China-US value chain. Most of the contribution comes from the tone extracted from social media (9%), while the tone obtained from traditional media explains only a modest part of stock price variability (1%). No equity market benefits from the China-US trade war, and Asian markets tend to be more negatively affected. In particular, we find that the sectors most affected by tariffs, such as those related to information technology, are particularly sensitive to the tone of trade tension coverage.
    Keywords: Big Data; Machine Learning; neural network; sentiment; Stock returns; Trade
    JEL: C45 C55 D80 F13 F14 G15
    Date: 2021–01
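The simplest building block behind text-based sentiment indices like the TSI is lexicon-based tone scoring. A minimal sketch, with illustrative word lists rather than the paper's actual lexicon or machine learning model:

```python
# Toy lexicon-based tone scorer: score = (positives - negatives) / matches.
# The word sets are invented for illustration, not taken from the paper.
POSITIVE = {"agreement", "progress", "deal", "resolve", "optimism"}
NEGATIVE = {"tariff", "retaliation", "escalation", "dispute", "sanction"}

def tone_score(text: str) -> float:
    """Return a tone in [-1, 1]; 0.0 when no lexicon word matches."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total
```

In practice such scores would be averaged over many articles per day and aggregated into a daily index before relating them to equity returns.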
  2. By: Qingfeng Liu; Yang Feng
    Abstract: We propose a new ensemble framework for supervised learning, named machine collaboration (MaC), based on a collection of base machines for prediction tasks. Different from bagging/stacking (a parallel & independent framework) and boosting (a sequential & top-down framework), MaC is a type of circular & interactive learning framework. The circular & interactive feature helps the base machines to transfer information circularly and update their own structures and parameters accordingly. The theoretical result on the risk bound of the estimator based on MaC shows that circular & interactive feature can help MaC reduce the risk via a parsimonious ensemble. We conduct extensive experiments on simulated data and 119 benchmark real data sets. The results of the experiments show that in most cases, MaC performs much better than several state-of-the-art methods, including CART, neural network, stacking, and boosting.
    Date: 2021–05
  3. By: Martin Huber; Jonas Meier; Hannes Wallimann
    Abstract: We assess the demand effects of discounts on train tickets issued by the Swiss Federal Railways, the so-called `supersaver tickets', based on machine learning, a subfield of artificial intelligence. Considering a survey-based sample of buyers of supersaver tickets, we investigate which customer- or trip-related characteristics (including the discount rate) predict buying behavior, namely: booking a trip otherwise not realized by train, buying a first- rather than second-class ticket, or rescheduling a trip (e.g. away from rush hours) when being offered a supersaver ticket. Predictive machine learning suggests that customers' age, demand-related information for a specific connection (like departure time and utilization), and the discount level permit forecasting buying behavior to a certain extent. Furthermore, we use causal machine learning to assess the impact of the discount rate on rescheduling a trip, which seems relevant in light of capacity constraints at rush hours. Assuming that (i) the discount rate is quasi-random conditional on our rich set of characteristics and (ii) the buying decision increases weakly monotonically in the discount rate, we identify the discount rate's effect among `always buyers', who would have traveled even without a discount, based on our survey that asks about customer behavior in the absence of discounts. We find that, on average, increasing the discount rate by one percentage point increases the share of rescheduled trips by 0.16 percentage points among always buyers. Investigating effect heterogeneity across observables suggests that the effects are higher for leisure travelers and during peak hours when controlling for several other characteristics.
    Date: 2021–05
  4. By: David Imhof; Hannes Wallimann
    Abstract: We propose an original application of screening methods using machine learning to detect collusive groups of firms in procurement auctions. As a methodological innovation, we calculate coalition-based screens by forming coalitions of bidders in tenders to flag bid-rigging cartels. Using Swiss, Japanese and Italian procurement data, we investigate the effectiveness of our method in different countries and auction settings, in our case first-price sealed-bid and mean-price sealed-bid auctions. We correctly classify 90% of the collusive and competitive coalitions when applying four machine learning algorithms: lasso, support vector machine, random forest, and the super learner ensemble method. Finally, we find that coalition-based screens for the variance and the uniformity of bids are in all cases the most important predictors according to the random forest.
    Date: 2021–05
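One of the variance-based screens mentioned in the abstract can be sketched as the coefficient of variation (CV) of bids within a coalition: rigged coalitions tend to produce implausibly low bid dispersion. A minimal sketch, with an illustrative threshold and toy bids rather than the paper's calibration:

```python
# Coefficient-of-variation screen for one coalition's bids in one tender.
# A very low CV flags suspiciously close bids; the 0.03 threshold and the
# example data are assumptions for illustration, not from the paper.
from statistics import mean, pstdev

def cv_screen(bids: list[float]) -> float:
    """Coefficient of variation: bid dispersion relative to the mean bid."""
    return pstdev(bids) / mean(bids)

def flag_coalition(bids: list[float], threshold: float = 0.03) -> bool:
    """Flag a coalition whose bids are implausibly close together."""
    return cv_screen(bids) < threshold
```

In the paper's setting, such screens are computed for many candidate coalitions per tender and fed as features into the machine learning classifiers.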
  5. By: Nekoei, Arash; Sinn, Fabian
    Abstract: We construct a new dataset of more than seven million notable individuals across recorded human history, the Human Biographical Record (HBR). With Wikidata as the backbone, HBR adds further information from various digital sources, including Wikipedia in all 292 languages. Machine learning and text analysis combine the sources and extract information on date and place of birth and death, gender, occupation, education, and family background. This paper discusses HBR's construction and its completeness, coverage, and accuracy, as well as its strengths and weaknesses relative to prior datasets. HBR is the first part of a larger project, the Human Record Project, which we briefly introduce.
    Keywords: Big Data; economic history; Machine Learning
    Date: 2021–02
  6. By: Riccardo Aiolfi; Nicola Moreni; Marco Bianchetti; Marco Scaringi; Filippo Fogliani
    Abstract: American and Bermudan-type financial instruments are often priced with specific Monte Carlo techniques whose efficiency critically depends on the effective dimensionality of the problem and the available computational power. In our work we focus on Bermudan Swaptions, well-known interest rate derivatives embedded in callable debt instruments or traded in the OTC market for hedging or speculation purposes, and we adopt an original pricing approach based on Supervised Learning (SL) algorithms. In particular, we link the price of a Bermudan Swaption to its natural hedges, i.e. the underlying European Swaptions, and other sound financial quantities through SL non-parametric regressions. We test different algorithms, from linear models to decision tree-based models and Artificial Neural Networks (ANN), analyzing their predictive performance. All the SL algorithms prove to be reliable and fast, allowing us to overcome the computational bottleneck of standard Monte Carlo simulations; the best performing algorithms for our problem turn out to be Ridge, ANN and Gradient Boosted Regression Tree. Moreover, using feature importance techniques, we are able to rank the most important driving factors of a Bermudan Swaption price, confirming that the value of the maximum underlying European Swaption is the prevailing feature.
    Date: 2021–05
  7. By: Yusuke Narita (Cowles Foundation, Yale University); Kohei Yata (Yale University)
    Abstract: Algorithms produce a growing portion of decisions and recommendations both in policy and business. Such algorithmic decisions are natural experiments (conditionally quasi-randomly assigned instruments) since the algorithms make decisions based only on observable input variables. We use this observation to develop a treatment-effect estimator for a class of stochastic and deterministic algorithms. Our estimator is shown to be consistent and asymptotically normal for well-defined causal effects. A key special case of our estimator is a high-dimensional regression discontinuity design. The proofs use tools from differential geometry and geometric measure theory, which may be of independent interest. The practical performance of our method is first demonstrated in a high-dimensional simulation resembling decision-making by machine learning algorithms. Our estimator has smaller mean squared errors than alternative estimators. We finally apply our estimator to evaluate the effect of the Coronavirus Aid, Relief, and Economic Security (CARES) Act, where more than $10 billion worth of relief funding is allocated to hospitals via an algorithmic rule. The estimates suggest that the relief funding has little effect on COVID-19-related hospital activity levels. Naive OLS and IV estimates exhibit substantial selection bias.
    Date: 2021–04
  8. By: STRITTMATTER, Anthony; Wunsch, Conny
    Abstract: The vast majority of existing studies that estimate the average unexplained gender pay gap use unnecessarily restrictive linear versions of the Blinder-Oaxaca decomposition. Using a notably rich and large data set of 1.7 million employees in Switzerland, we investigate how the methodological improvements made possible by such big data affect estimates of the unexplained gender pay gap. We study the sensitivity of the estimates with regard to i) the availability of observationally comparable men and women, ii) model flexibility when controlling for wage determinants, and iii) the choice of different parametric and semi-parametric estimators, including variants that make use of machine learning methods. We find that these three factors matter greatly. Blinder-Oaxaca estimates of the unexplained gender pay gap decline by up to 39% when we enforce comparability between men and women and use a more flexible specification of the wage equation. Semi-parametric matching yields estimates that, when compared with the Blinder-Oaxaca estimates, are up to 50% smaller and also less sensitive to the way wage determinants are included.
    Keywords: Common Support; Gender Inequality; Gender pay gap; Machine Learning; Matching estimator; Model specification
    JEL: C21 J31
    Date: 2021–02
  9. By: Navid Mottaghi; Sara Farhangdoost
    Abstract: With the heightened volatility in stock prices during the COVID-19 pandemic, the need for price forecasting has become more critical. We investigate the forecast performance of four models, including Long Short-Term Memory, XGBoost, Autoregression, and Last Value, on the stock prices of Facebook, Amazon, Tesla, Google, and Apple during the COVID-19 pandemic to understand the accuracy and predictability of the models in this highly volatile period. To train the models, the data for all stocks are split into train and test datasets. The test dataset spans January 2020 to April 2021, covering the COVID-19 pandemic period. The results show that the Autoregression and Last Value models have higher accuracy in predicting stock prices because of the strong correlation between the previous day's and the next day's price. Additionally, the results suggest that the machine learning models (Long Short-Term Memory and XGBoost) do not perform as well as Autoregression models when the market experiences high volatility.
    Date: 2021–05
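The two simplest benchmarks in such comparisons can be sketched directly: a Last Value forecast, and a one-lag autoregression fit by ordinary least squares. A minimal sketch, assuming a plain AR(1) stands in for the paper's autoregression (the prices below are invented):

```python
# Last Value and AR(1) one-step-ahead forecasts for a univariate series.
def last_value_forecast(series: list[float]) -> float:
    """Tomorrow's forecast is simply today's value."""
    return series[-1]

def ar1_forecast(series: list[float]) -> float:
    """Fit y_t = a + b * y_{t-1} by least squares, then predict one step."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a + b * series[-1]
```

When consecutive prices are highly correlated, as the abstract notes, these naive baselines are hard for more complex models to beat.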
  10. By: Yusuke Narita (Massachusetts Institute of Technology); Kohei Yata (Yale University)
    Abstract: Algorithms produce a growing portion of decisions and recommendations both in policy and business. Such algorithmic decisions are natural experiments (conditionally quasi-randomly assigned instruments) since the algorithms make decisions based only on observable input variables. We use this observation to develop a treatment-effect estimator for a class of stochastic and deterministic algorithms. Our estimator is shown to be consistent and asymptotically normal for well-defined causal effects. A key special case of our estimator is a high-dimensional regression discontinuity design. The proofs use tools from differential geometry and geometric measure theory, which may be of independent interest. The practical performance of our method is first demonstrated in a high-dimensional simulation resembling decision-making by machine learning algorithms. Our estimator has smaller mean squared errors than alternative estimators. We finally apply our estimator to evaluate the effect of the Coronavirus Aid, Relief, and Economic Security (CARES) Act, where more than $10 billion worth of relief funding is allocated to hospitals via an algorithmic rule. The estimates suggest that the relief funding has little effect on COVID-19-related hospital activity levels. Naive OLS and IV estimates exhibit substantial selection bias.
    Keywords: natural experiment, treatment effects, geometric measure theory, COVID-19
    JEL: D70 C90 H51
    Date: 2021–04
  11. By: Victor Chernozhukov; Whitney K. Newey; Victor Quintas-Martinez; Vasilis Syrgkanis
    Abstract: We give debiased machine learners of parameters of interest that depend on generalized linear regressions, i.e., regressions that make the residual orthogonal to the regressors. The parameters of interest include many causal and policy effects. We give neural net learners of the bias correction that are automatic in that they depend only on the object of interest and the regression residual. Convergence rates are given for these neural nets and for more general learners of the bias correction. We also give conditions for asymptotic normality and consistent asymptotic variance estimation of the learner of the object of interest.
    Date: 2021–04
  12. By: Rovetta, Alessandro (Mensana srls)
    Abstract: Background: The novel coronavirus disease (COVID-19) caused the worst international crisis since World War II. Italy was one of the countries most affected by both the pandemic and the related infodemic. The success of anti-COVID-19 strategies and future public health policies in Italy depends on containing fake news and disseminating correct information. Objective: The aim of this paper is to analyze the impact of COVID-19 on the conspiracy attitudes and risk perception of Italian web users. Methods: Google Trends was used to monitor users' web interest in specific topics, such as conspiracy hypotheses, vaccine side effects, and pollution/climate change. The keywords adopted to represent these topics were mined from – an Italian website specialized in detecting online hoaxes – and Google Trends suggestions (i.e., related topics and related queries). Relative search volumes for the periods 2016-2020 (pre-COVID-19) and 2020-2021 (post-COVID-19) were compared through the percentage difference (∆_%) and Welch's t-test (t). When data series were not stationary, other ad hoc criteria were used. Trend slopes were assessed through Sen's slope (SS). The significance thresholds were indicatively set at P=.05 and t=1.9. Results: The COVID-19 pandemic drastically reinforced Italian netizens' conspiracy attitudes (Δ_%∈[60,288],t∈[6,12]). Regional web interest in conspiracy-related queries increased and became more homogeneous compared to the pre-COVID-19 period ((RSV) ̅=80±2.8,t_min=1.8,Δ_(min%)=+12.4,min ∆_(SD%)=-25.8). Besides, a growing trend in web interest in the infodemic YouTube channel "ByoBlu" was highlighted. Web interest in fake news increased more than that in anti-hoax services (t_1=11.3 vs t_2=4.5,Δ_1=+157.6 vs Δ_2=+84.7). Equivalently, web interest in vaccine side effects exceeded that in pollution/climate change (SS_vac=0.22,P<.001 vs SS_pol=0.05,P<.001,∆_%=+296.4).
    Conclusions: COVID-19 has given a significant boost to online conspiracy attitudes in Italy. In particular, average web interest in conspiracy hypotheses has increased and become more uniform across regions. The pandemic accelerated an already growing trend in users' interest in some fake news sources, including the 500,000-subscriber YouTube channel "ByoBlu" (canceled for disinformation in March 2021). The risk perception related to COVID-19 vaccines has been so distorted that vaccine side effect-related queries outweighed those relating to pollution and climate change, which are much more urgent issues. Based on these findings, it is necessary that the Italian authorities implement more effective infoveillance systems, and that mass media communication be less sensationalistic and more consistent with the available scientific evidence. In this context, Google Trends can be used to monitor users' responses to specific infodemiological countermeasures. Further research is needed to understand the psychological mechanisms that regulate risk perception.
    Keywords: COVID-19, fake news, Google Trends, infodemiology, Italy, risk perception
    Date: 2021–04–24
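The pre- vs post-pandemic comparison of search volumes rests on Welch's t statistic, which allows unequal variances between the two samples. A minimal sketch with toy numbers, not the paper's Google Trends series (in practice `scipy.stats.ttest_ind` with `equal_var=False` computes this):

```python
# Welch's t statistic for two independent samples with unequal variances.
from math import sqrt

def welch_t(a: list[float], b: list[float]) -> float:
    """t = (mean(a) - mean(b)) / sqrt(var(a)/n_a + var(b)/n_b)."""
    def mean(x: list[float]) -> float:
        return sum(x) / len(x)
    def var(x: list[float]) -> float:  # unbiased sample variance
        m = mean(x)
        return sum((v - m) ** 2 for v in x) / (len(x) - 1)
    return (mean(a) - mean(b)) / sqrt(var(a) / len(a) + var(b) / len(b))
```

Large positive values of t for post- vs pre-COVID-19 search volumes would indicate the kind of significant increase the abstract reports.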
  13. By: Gorodnichenko, Yuriy; Maliar, Lilia; Maliar, Serguei; Naubert, Christopher
    Abstract: In this paper, we study household consumption-saving and portfolio choices in a heterogeneous-agent economy with sticky prices and time-varying total factor productivity and idiosyncratic stochastic volatility. Agents can save through liquid bonds and illiquid capital and shares. With rich heterogeneity at the household level, we are able to quantify the impact of uncertainty across the income and wealth distribution. Our results help us identify who wins and who loses during periods of heightened individual and aggregate uncertainty. To study the importance of heterogeneity in understanding the transmission of economic shocks, we use a deep learning algorithm. Our method preserves non-linearities, which is essential for understanding the pricing decisions for illiquid assets.
    Keywords: deep learning; HANK; Heterogeneous Agents; Machine Learning; neural network
    Date: 2021–02
  14. By: Wilson, Thomas (The University of Melbourne); Grossman, Irina; Alexander, Monica; Rees, Philip; Temple, Jeromey
    Abstract: Small area population forecasts are widely used by government and business for a variety of planning, research and policy purposes, and often influence major investment decisions. Yet the toolbox of small area population forecasting methods and techniques is modest relative to that for national and large subnational regional forecasting. In this paper we assess the current state of small area population forecasting, and suggest areas for further research. The paper provides a review of the literature on small area population forecasting methods published over the period 2001-2020. The key themes covered by the review are: extrapolative and comparative methods, simplified cohort-component methods, model averaging and combining, incorporating socio-economic variables and spatial relationships, ‘downscaling’ and disaggregation approaches, linking population with housing, estimating and projecting small area component input data, microsimulation, machine learning, and forecast uncertainty. Several avenues for further research are then suggested, including more work on model averaging and combining, developing new forecasting methods for situations which current models cannot handle, quantifying uncertainty, exploring methodologies such as machine learning and spatial statistics, creating user-friendly tools for practitioners, and understanding more about how forecasts are used.
    Date: 2021–04–28
  15. By: Biradavolu, Monica; Cooper, Jan; Jayachandran, Seema
    Abstract: We propose a new method to design a short survey measure of a complex concept such as women's agency. The approach combines mixed-methods data collection and machine learning. We select the best survey questions based on how strongly correlated they are with a "gold standard" measure of the concept derived from qualitative interviews. In our application, we measure agency for 209 women in Haryana, India, first, through a semi-structured interview and, second, through a large set of close-ended questions. We use qualitative coding methods to score each woman's agency based on the interview, which we treat as her true agency. To identify the close-ended questions most predictive of the "truth," we apply statistical algorithms that build on LASSO and random forest but constrain how many variables are selected for the model (five in our case). The resulting five-question index is as strongly correlated with the coded qualitative interview as is an index that uses all of the candidate questions. This approach of selecting survey questions based on their statistical correspondence to coded qualitative interviews could be used to design short survey modules for many other latent constructs.
    Keywords: feature selection; psychometrics; Survey Design; Women's Empowerment
    JEL: C83 D13 J16 O12
    Date: 2021–03
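The core idea of the question-selection step can be sketched very simply: rank candidate survey questions by the absolute correlation of their responses with the interview-based "gold standard" agency score, and keep only the top five. This is a simplified stand-in for the paper's constrained LASSO / random-forest procedure (the data and question names below are invented):

```python
# Cap-k question selection by absolute Pearson correlation with a
# gold-standard score derived from coded qualitative interviews.
from math import sqrt

def corr(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def select_questions(questions: dict[str, list[float]],
                     gold: list[float], k: int = 5) -> list[str]:
    """Keep the k questions most correlated (in absolute value) with gold."""
    ranked = sorted(questions, key=lambda q: -abs(corr(questions[q], gold)))
    return ranked[:k]
```

Unlike this marginal-correlation sketch, LASSO and random-forest approaches also account for redundancy among questions, which is why the paper's constrained variants are preferable in practice.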
  16. By: Goulet Coulombe, Philippe; Marcellino, Massimiliano; Stevanovic, Dalibor
    Abstract: Based on evidence gathered from a newly built large macroeconomic data set for the UK, labeled UK-MD and comparable to similar datasets for the US and Canada, it seems the most promising avenue for forecasting during the pandemic is to allow for general forms of nonlinearity by using machine learning (ML) methods. But not all nonlinear ML methods are alike. For instance, some cannot extrapolate (like regular trees and forests) and some can (when complemented with linear dynamic components). This and other crucial aspects of ML-based forecasting in unprecedented times are studied in an extensive pseudo-out-of-sample exercise.
    Date: 2021–03
  17. By: Qiutong Guo; Shun Lei; Qing Ye; Zhiyang Fang
    Abstract: Bitcoin, one of the major cryptocurrencies, presents great opportunities and challenges, with its tremendous potential returns accompanied by high risks. The high volatility of Bitcoin and the complex factors affecting it make the study of effective price forecasting methods of great practical importance to financial investors and researchers worldwide. In this paper, we propose a novel approach called MRC-LSTM, which combines a Multi-scale Residual Convolutional neural network (MRC) and a Long Short-Term Memory (LSTM) network to predict the Bitcoin closing price. Specifically, the multi-scale residual module is based on one-dimensional convolution, which is not only capable of adaptively detecting features of different time scales in multivariate time series, but also enables the fusion of these features. The LSTM has the ability to learn long-term dependencies in series and is widely used in financial time series forecasting. By combining these two methods, the model is able to obtain highly expressive features and efficiently learn the trends and interactions of multivariate time series. In the study, the impact of external factors such as macroeconomic variables and investor attention on the Bitcoin price is considered in addition to the trading information of the Bitcoin market. We performed experiments to predict the daily closing price of Bitcoin (USD), and the experimental results show that MRC-LSTM significantly outperforms a variety of other network structures. Furthermore, we conduct additional experiments on two other cryptocurrencies, Ethereum and Litecoin, to further confirm the effectiveness of MRC-LSTM in short-term forecasting for multivariate time series of cryptocurrencies.
    Date: 2021–05
  18. By: Kulkarni, Shruti
    Abstract: The objective of the present paper is to review the current state of climate change technology transfer. This research proposes a method for analyzing climate change technology transfer using patent analysis and topic modeling. A collection of climate change patent data from patent databases is used as input to group patents into several topics relevant for climate change mitigation using a topic exploration model. The research questions we want to address are: how have patenting activities changed over time in climate change mitigation technology (CCMT) patents? And who are the technological leaders? Investigating these questions can reveal the technological landscape in climate change-related technologies at the international level. We propose a hybrid Latent Dirichlet Allocation (LDA) approach for topic modelling and for identifying relationships between terms and topics related to CCMT, enabling better visualizations of the underlying intellectual property dynamics. Further, a predictive model for CCTT is proposed using techniques such as social network analysis (SNA) and regression analysis. A competitor analysis is also proposed to identify countries with a similar patent landscape. The projected results are expected to facilitate the transfer process associated with existing and emerging climate change technologies and to improve technology cooperation between governments.
    Date: 2020–04–25
  19. By: Tobias Cagala; Ulrich Glogowsky; Johannes Rincke; Anthony Strittmatter
    Abstract: Ineffective fundraising lowers the resources charities can use for goods provision. We combine a field experiment and a causal machine-learning approach to increase a charity’s fundraising effectiveness. The approach optimally targets fundraising to individuals whose expected donations exceed solicitation costs. Among past donors, optimal targeting substantially increases donations (net of fundraising costs) relative to benchmarks that target everybody or no one. In contrast, individuals who were previously asked but never donated should not be targeted. Further, the charity requires only publicly available geospatial information to realize the gains from targeting. We conclude that charities not engaging in optimal targeting waste resources.
    Keywords: Fundraising; charitable giving; gift exchange; targeting; optimal policy learning; individualized treatment rules
    JEL: C93 D64 H41 L31 C21
    Date: 2021–04
  20. By: Montserrat Lopez-Cobo (European Commission - JRC); Riccardo Righi (European Commission - JRC); Sofia Samoili (European Commission - JRC); Miguel Vazquez-Prada Baillet (European Commission - JRC); Melisande Cardona (European Commission - JRC); Giuditta De-Prato (European Commission - JRC)
    Abstract: This report illustrates and follows the steps to build an AI Watch Index from the collection of outputs of AI Watch analyses. It identifies a set of indicators suitable to provide a comprehensive and balanced overview of the various topics addressed by the research activities carried out within AI Watch. The aim of this index is to provide quantitative indicators on several policy relevant dimensions in order to assess the performance and positioning of the EU and its Member States in Artificial Intelligence. The report first describes a suitable approach for identifying indicators. The methodology aims to identify, select, and collect a number of policy relevant indicators that allow, as much as possible, cross-country and temporal comparability in the evolving international economic, social, industrial and research landscape of artificial intelligence. Then, the report proposes a list of indicators, thoroughly described and organised along the main dimensions covered by AI Watch’s activity.
    Keywords: artificial intelligence, quantitative indicator, index, ai ecosystem, ai industry, ai research and development, ai technology, societal aspects of ai, ai in public sector
    Date: 2021–04
  21. By: Wehrheim, Lino; Jopp, Tobias Alexander; Spoerer, Mark
    Abstract: The increasing availability of digital text collections and the corresponding establishment of methods for computer-assisted analysis open up completely new perspectives on historical textual sources. In this paper, we use the possibilities of text mining to investigate the history of German historiography. The aim of the paper is to use topic models, i.e. methods of automated content analysis, to explore publication trends within German historiography since the end of World War II and, thus, to gain data-based insights into the history of the discipline. For this purpose, we evaluate a text corpus consisting of more than 9,000 articles from eleven leading historiographical journals. The following questions are addressed: (1) Which research subjects mattered, and to what extent did this change over time? (2) To what extent does this change reflect historiographical paradigm shifts, or 'turns'? (3) Do the data allow us to map the emergence of these turns, i.e., can we periodize/historicize them? (4) Which of the proclaimed turns mattered, in the sense that they are actually reflected in the research themes we find, and which did not?
    Keywords: German historiography, cultural turn, digital history, topic modelling
    JEL: B40 N01
    Date: 2021
  22. By: Gorodnichenko, Yuriy; Pham, Tho; Talavera, Oleksandr
    Abstract: We develop a deep learning model to detect emotions embedded in press conferences after the meetings of the Federal Open Market Committee and examine the influence of the detected emotions on financial markets. We find that, after controlling for the Fed's actions and the sentiment in policy texts, positive tone in the voices of Fed Chairs leads to statistically significant and economically large increases in share prices. In other words, how policy messages are communicated can move the stock market. In contrast, the bond market appears to take few vocal cues from the Chairs. Our results provide implications for improving the effectiveness of central bank communications.
    Keywords: Bond market; communication; Emotion; monetary policy; Stock market; text sentiment; voice
    JEL: D84 E31 E58 G12
    Date: 2021–03
  23. By: Cimadomo, Jacopo; Giannone, Domenico; Lenza, Michele; Monti, Francesca; Sokol, Andrej
    Abstract: Monitoring economic conditions in real time, or nowcasting, and Big Data analytics share some challenges, sometimes called the three "Vs". Indeed, nowcasting is characterized by the use of a large number of time series (Volume), the complexity of the data covering various sectors of the economy, with different frequencies and precision and asynchronous release dates (Variety), and the need to incorporate new information continuously and in a timely manner (Velocity). In this paper, we explore three alternative routes to nowcasting with Bayesian Vector Autoregressive (BVAR) models and find that they can effectively handle the three Vs by producing, in real time, accurate probabilistic predictions of US economic activity and a meaningful narrative by means of scenario analysis.
    Keywords: Big Data; business cycles; Mixed frequency; Nowcasting; Real time; Scenario analysis
    JEL: C01 C33 C53 E32 E37
    Date: 2021–02
  24. By: Korinek, Anton; Stiglitz, Joseph E
    Abstract: Progress in artificial intelligence and related forms of automation technologies threatens to reverse the gains that developing countries and emerging markets have experienced from integrating into the world economy over the past half century, aggravating poverty and inequality. The new technologies tend to be labor-saving and resource-saving, and to give rise to winner-takes-all dynamics that advantage developed countries. We analyze the economic forces behind these developments and describe economic policies that would mitigate the adverse effects on developing and emerging economies while leveraging the potential gains from technological advances. We also describe reforms to our global system of economic governance that would share the benefits of AI more widely with developing countries.
    Keywords: artificial intelligence; inequality; labor-saving progress; terms-of-trade losses
    JEL: D63 F63 O25 O32
    Date: 2021–02

This nep-big issue is ©2021 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.