nep-big New Economics Papers
on Big Data
Issue of 2022‒09‒05
twenty-six papers chosen by
Tom Coupé
University of Canterbury

  1. Machine learning using Stata/Python By Giovanni Cerulli
  2. Long Story Short: Omitted Variable Bias in Causal Machine Learning By Victor Chernozhukov; Carlos Cinelli; Whitney Newey; Amit Sharma; Vasilis Syrgkanis
  3. Estimating Consumer Segments and Choices from Limited Information: The Application of Machine Learning Methods By Qin, Fei; Wu, Steven Y.
  4. Application of machine learning models and interpretability techniques to identify the determinants of the price of bitcoin By José Manuel Carbó; Sergio Gorjón
  5. Sitting Next to a Dropout - Academic Success of Students with More Educated Peers By Daniel Goller; Andrea Diem; Stefan C. Wolter
  6. Machine Learning Methods for Inflation Forecasting in Brazil: new contenders versus classical models By Wagner Piazza Gaglianone; Gustavo Silva Araujo
  7. Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification By Mostafa Shabani; Dat Thanh Tran; Juho Kanniainen; Alexandros Iosifidis
  8. Using Machine Learning to Capture Heterogeneity in Trade Agreements By Baier, Scott; Regmi, Narendra
  9. Predicciones agregadas de pobreza con información a escala micro y macro: evaluación, diagnóstico y propuestas By Sosa Escudero, Walter; Cornejo, Magdalena
  10. Help Really Wanted? The Impact of Age Stereotypes in Job Ads on Applications from Older Workers By Ian Burn; Daniel Firoozi; Daniel Ladd; David Neumark
  11. Multi-horizon Forecasts of Agricultural Commodity Prices using Deep Learning By Bora, Siddhartha S.; Katchova, Ani
  12. Assortment Optimization with Customer Choice Modeling in a Crowdfunding Setting By Fatemeh Nosrat
  13. The Targeted Assignment of Incentive Schemes By Saskia Opitz; Dirk Sliwka; Timo Vogelsang; Tom Zimmermann
  14. Artificial Intelligence : up to here ... and on again By Aarts, Emile
  15. Skills and employment transitions in Brazil By Adamczyk, Willian; Ehrl, Philipp; Monasterio, Leonardo
  16. Revitalising the Silk Road: Evidence from Railway Infrastructure Investments in Northwest China By Yu, Lamont Bo; Tran, Trang My; Lee, Wang-Sheng
  17. Solving the optimal stopping problem with reinforcement learning: an application in financial option exercise By Leonardo Kanashiro Felizardo; Elia Matsumoto; Emilio Del-Moral-Hernandez
  18. Learning Financial Networks with High-frequency Trade Data By Kara Karpman; Sumanta Basu; David Easley
  19. StockBot: Using LSTMs to Predict Stock Prices By Shaswat Mohanty; Anirudh Vijay; Nandagopan Gopakumar
  20. On Deep Generative Modeling in Economics: An Application with Public Procurement Data By Marcelin Joanis; Andrea Lodi; Igor Sadoune
  21. Mean Convergence, Combinatorics, and Grade-Point Averages By Waddell, Glen R.; McDonough, Robert
  22. Forecasting Algorithms for Causal Inference with Panel Data By Jacob Goldin; Julian Nyarko; Justin Young
  23. Targeted bidders in government tenders By Cappelletti, Matilde; Giuffrida, Leonardo M.
  24. Sentimiento en el Informe de Estabilidad Financiera del Banco Central de Chile By J. Sebastián Becerra; Alejandra Cruces
  25. Development of "Alternative Data Consumption Index": Nowcasting Private Consumption Using Alternative Data By Tomohiro Okubo; Koji Takahashi; Haruhiko Inatsugu; Masato Takahashi
  26. The Impact of Retail Investors Sentiment on Conditional Volatility of Stocks and Bonds By Elroi Hadad; Haim Kedar-Levy

  1. By: Giovanni Cerulli (IRcRES, Rome)
    Abstract: Two related Stata modules, r_ml_stata and c_ml_stata, are presented for
    Date: 2022–07–03
  2. By: Victor Chernozhukov; Carlos Cinelli; Whitney Newey; Amit Sharma; Vasilis Syrgkanis
    Abstract: We derive general, yet simple, sharp bounds on the size of the omitted variable bias for a broad class of causal parameters that can be identified as linear functionals of the conditional expectation function of the outcome. Such functionals encompass many of the traditional targets of investigation in causal inference studies, such as, for example, (weighted) average of potential outcomes, average treatment effects (including subgroup effects, such as the effect on the treated), (weighted) average derivatives, and policy effects from shifts in covariate distribution -- all for general, nonparametric causal models. Our construction relies on the Riesz-Frechet representation of the target functional. Specifically, we show how the bound on the bias depends only on the additional variation that the latent variables create both in the outcome and in the Riesz representer for the parameter of interest. Moreover, in many important cases (e.g., average treatment effects and average derivatives) the bound is shown to depend on easily interpretable quantities that measure the explanatory power of the omitted variables. Therefore, simple plausibility judgments on the maximum explanatory power of omitted variables (in explaining treatment and outcome variation) are sufficient to place overall bounds on the size of the bias. Furthermore, we use debiased machine learning to provide flexible and efficient statistical inference on learnable components of the bounds. Finally, empirical examples demonstrate the usefulness of the approach.
    JEL: C14 C21 C31
    Date: 2022–07
  3. By: Qin, Fei; Wu, Steven Y.
    Keywords: Marketing, Research Methods/Statistical Methods, Agribusiness
    Date: 2022–08
  4. By: José Manuel Carbó (Banco de España); Sergio Gorjón (Banco de España)
    Abstract: So-called cryptocurrencies are becoming more popular by the day, with a total market capitalization that exceeded $3 trillion at its peak in 2021. Bitcoin has emerged as the most popular among them, with a total valuation that reached an all-time high of $68,000 in November 2021. However, its price has historically been subject to large and abrupt fluctuations, as the sudden drop in the months that followed once again proved. Since bitcoin looks all set to continue growing while largely concentrating its activity in unregulated environments, concerns have been raised among authorities all over the world about its potential impact on financial stability, monetary policy, and the integrity of the financial system. As a result, building a sound and proper regulatory and supervisory framework to address these challenges hinges upon achieving a better understanding of both the critical underlying factors that influence the formation of bitcoin prices and the stability of such factors over time. In this article we analyse which variables determine the price at which bitcoin is traded on the most relevant exchanges. To this end, we use a flexible machine learning model, specifically a Long Short Term Memory (LSTM) neural network, to establish the price of bitcoin as a function of a number of economic, technological and investor attention variables. Our LSTM model replicates reasonably well the behaviour of the price of bitcoin over different periods of time. We then use an interpretability technique known as SHAP to understand which features most influence the LSTM outcome. We conclude that the importance of the different variables in bitcoin price formation changes substantially over the period analysed. Moreover, we find that not only does their influence vary, but also that new explanatory factors often seem to appear over time that, at least for the most part, were initially unknown.
    Keywords: Bitcoin, machine learning, LSTM, interpretability techniques
    JEL: C40 C45 G12 G15
    Date: 2022–04
  5. By: Daniel Goller; Andrea Diem; Stefan C. Wolter
    Abstract: We investigate the impact of the presence of university dropouts on the academic success of first-time students. Our identification strategy relies on quasi-random variation in the proportion of returning dropouts. The estimated average zero effect of dropouts on first-time students’ success masks treatment heterogeneity and non-linearities. First, we find negative effects on the academic success of their new peers from dropouts re-enrolling in the same subject and, conversely, positive effects of dropouts changing subjects. Second, using causal machine learning methods, we find that the effects vary nonlinearly with different treatment intensities and prevailing treatment levels.
    Keywords: university dropouts, peer effects, better prepared students, causal machine learning
    JEL: A23 C14 I23
    Date: 2022
  6. By: Wagner Piazza Gaglianone; Gustavo Silva Araujo
    Abstract: In this paper, we explore machine learning (ML) methods to improve inflation forecasting in Brazil. An extensive out-of-sample forecasting exercise is designed with multiple horizons, a large database of 501 series, and 50 forecasting methods, including new machine learning techniques proposed here, traditional econometric models and forecast combination methods. We also provide tools to identify the key variables to predict inflation, thus helping to open the ML black box. Despite the evidence of no universal best model, the results indicate machine learning methods can, in numerous cases, outperform traditional econometric models in terms of mean-squared error. Moreover, the results indicate the existence of nonlinearities in the inflation dynamics, which are relevant to forecast inflation. The set of top forecasts often includes forecast combinations, tree-based methods (such as random forest and xgboost), breakeven inflation, and survey-based expectations. Altogether, these findings offer a valuable contribution to macroeconomic forecasting, especially for Brazilian inflation.
    Date: 2022–07
  7. By: Mostafa Shabani; Dat Thanh Tran; Juho Kanniainen; Alexandros Iosifidis
    Abstract: Deep Learning models have become dominant in tackling financial time-series analysis problems, overturning conventional machine learning and statistical methods. Most often, a model trained for one market or security cannot be directly applied to another market or security due to differences inherent in the market conditions. In addition, as the market evolves through time, it is necessary to update the existing models or train new ones when new data is made available. This scenario, which is inherent in most financial forecasting applications, naturally raises the following research question: How to efficiently adapt a pre-trained model to a new set of data while retaining performance on the old data, especially when the old data is not accessible? In this paper, we propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities and adapt it to achieve high performance in new ones. In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed, and this knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data. The auxiliary connections are constrained to be of low rank. This not only allows us to rapidly optimize for the new task but also reduces the storage and run-time complexity during the deployment phase. The efficiency of our approach is empirically validated in the stock mid-price movement prediction problem using a large-scale limit order book dataset. Experimental results show that our approach enhances prediction performance as well as reduces the overall number of network parameters.
    Date: 2022–07
  8. By: Baier, Scott; Regmi, Narendra (Mercury Publication)
    Abstract: Abstract not available.
    Date: 2021–03–25
  9. By: Sosa Escudero, Walter; Cornejo, Magdalena
    Abstract: This paper discusses and reviews several alternatives for forecasting poverty in a number of Latin American countries. The starting point is the baseline model developed by CEPAL (ECLAC), from which variants are generated that explore novel strategies associated with machine learning techniques. A panel is constructed for 12 countries in the region between 2000 and 2019, and a comparative analysis is carried out of the projections of aggregate poverty and extreme poverty rates. Different poverty forecasting alternatives are evaluated that seek to exploit the micro-macro nature of the data, the temporal dynamics of the series, the heterogeneity of the panel, and the use of machine learning techniques that make it possible to deal with the complexity of the models. Predictive performance was evaluated both at the aggregate level and across groups of individuals (i.e., women, the unemployed and young people).
    Date: 2022–07–25
  10. By: Ian Burn; Daniel Firoozi; Daniel Ladd; David Neumark
    Abstract: Correspondence studies have found evidence of age discrimination in callback rates for older workers, but less is known about whether job advertisements can themselves shape the age composition of the applicant pool. We construct job ads for administrative assistant, retail, and security guard jobs, using language from real job ads collected in a prior large-scale correspondence study (Neumark et al., 2019a). We modify the job-ad language to randomly vary whether or not the job ad includes ageist language regarding age-related stereotypes. Our main analysis relies on machine learning methods to design job ads based on the semantic similarity between phrases in job ads and age-related stereotypes. In contrast to a correspondence study in which job searchers are artificial and researchers study the responses of real employers, in our research the job ads are artificial and we study the responses of real job searchers. We find that job-ad language related to ageist stereotypes, even when the language is not blatantly or specifically age-related, deters older workers from applying for jobs. The change in the age distribution of applicants is large, with significant declines in the average and median age, the 75th percentile of the age distribution, and the share of applicants over 40. Based on these estimates and those from the correspondence study, and the fact that we use real-world ageist job-ad language, we conclude that job-ad language that deters older workers from applying for jobs can have roughly as large an impact on hiring of older workers as direct age discrimination in hiring.
    JEL: J14 J6 J7 J78
    Date: 2022–07
  11. By: Bora, Siddhartha S.; Katchova, Ani
    Keywords: Marketing, Agricultural Finance, Agricultural and Food Policy
    Date: 2022–08
  12. By: Fatemeh Nosrat
    Abstract: Crowdfunding, which is the act of raising funds from a large number of people's contributions, is among the most popular research topics in economic theory. Because crowdfunding platforms (CFPs) have facilitated the process of raising funds by offering several features, we should take their existence and survival in the marketplace into account. In this study, we investigated the significant role of platform features in a customer behavioral choice model. In particular, we proposed a multinomial logit model to describe the customers' (backers') behavior in a crowdfunding setting. We proceed by discussing the revenue-sharing model in these platforms. For this purpose, we conclude that an assortment optimization problem could be of major importance in order to maximize the platforms' revenue. We were able to derive a reasonable amount of data in some cases and implement two well-known machine learning methods, multivariate regression and classification, to predict the best assortments the platform could offer to every arriving customer. We compared the results of these two methods and investigated how well they perform in all cases.
    Date: 2022–07
  13. By: Saskia Opitz (University of Cologne, Faculty of Management, Economics and Social Sciences, Department of Corporate Development); Dirk Sliwka (University of Cologne, Faculty of Management, Economics and Social Sciences, Department of Corporate Development); Timo Vogelsang (Frankfurt School of Finance & Management, Department of Accounting); Tom Zimmermann (University of Cologne, Faculty of Management, Economics and Social Sciences, Department of Corporate Development)
    Abstract: A central question in designing optimal policies concerns the assignment of individuals with different observable characteristics to different treatments. We study this question in the context of increasing workers’ performance by using targeted incentives based on measurable worker characteristics. To do so, we ran two large-scale experiments. The key results are that (i) performance can be predicted by accurately measured personality traits, (ii) a machine learning algorithm can detect such heterogeneity in worker responses to different schemes, and (iii) a targeted assignment of schemes to individual workers increases performance in a second experiment significantly above the level achieved by the single best scheme.
    Keywords: Randomized Controlled Trial, Incentives, Heterogeneity, Treatment Effects, Selection, Algorithm
    JEL: C21 C93 M52
    Date: 2022–08
  14. By: Aarts, Emile (Tilburg University, School of Economics and Management)
    Date: 2022
  15. By: Adamczyk, Willian; Ehrl, Philipp; Monasterio, Leonardo
    Abstract: This paper analyses employment transitions and workers’ skills in Brazil using a random sample from the universe of formal labour contracts covering the period from 2003 to 2018. We develop a novel procedure to derive a measure of occupational distance and internationally comparable skill measures from occupations’ task descriptions in the country under analysis based on machine learning and natural language processing methods, but without usual ad hoc classifications. Our findings confirm that workers who use non-routine cognitive skills intensively experience the highest employment growth rates and wages. Their labour market exit risk is relatively low, occupational and sectoral changes are least common and, in the case of occupational switching, non-routine cognitive workers tend to find occupations that are higher-paid and closer in terms of their task content. Against the same characteristics, routine and non-routine manual workers are worse off in the labour market. Overall, there have been signs of routine-biased technological change and employment polarization since the 2014 Brazilian economic crisis.
    Keywords: employment, cognitive skill, occupational change
    Date: 2022
  16. By: Yu, Lamont Bo (University of Macau); Tran, Trang My (Monash University); Lee, Wang-Sheng (Monash University)
    Abstract: China’s Belt and Road Initiative was introduced in 2013 to revitalise the Silk Road and promote economic development and integration. This paper investigates the economic effects of the opening of the only high-speed rail (HSR) line in northwest China which connects China’s northwestern provinces along this Silk Road land route. We use a recently developed machine-learning extended nightlight data series from 2000 to 2019 and employ the ridge augmented synthetic control method (Ben-Michael et al., 2021) to assess the effects of the HSR line connection on economic activity along this Silk Road land route. We further propose an algorithm that helps automate the donor pool selection process while ensuring optimal pre-treatment fitness. Our results show that there are winners and losers from the opening of the Lanzhou–Urumqi HSR line. While there is some indication of the role that HSR can help play in making progress towards breaking through the Hu Huanyong Line, a geographical demarcation in China that is of vast economic significance, not all counties benefited from the opening of the HSR line.
    Keywords: high-speed railway, augmented synthetic control, Hu Huanyong Line
    JEL: O22 R11 R58
    Date: 2022–07
  17. By: Leonardo Kanashiro Felizardo; Elia Matsumoto; Emilio Del-Moral-Hernandez
    Abstract: The optimal stopping problem is a category of decision problems with a specific constrained configuration. It is relevant to various real-world applications such as finance and management. To solve the optimal stopping problem, state-of-the-art algorithms in dynamic programming, such as the least-squares Monte Carlo (LSMC), are employed. This type of algorithm relies on path simulations using only the last price of the underlying asset as a state representation. Also, the LSMC was designed for option valuation, where risk-neutral probabilities can be employed to account for uncertainty. However, the general optimal stopping problem goals may not fit the requirements of the LSMC showing auto-correlated prices. We employ a data-driven method that uses Monte Carlo simulation to train and test artificial neural networks (ANN) to solve the optimal stopping problem. Using ANN to solve decision problems is not entirely new. We propose a different architecture that uses convolutional neural networks (CNN) to deal with the dimensionality problem that arises when we transform the whole history of prices into a Markovian state. We present experiments that indicate that our proposed architecture improves results over the previous implementations under specific simulated time series function sets. Lastly, we employ our proposed method to compare the optimal exercise of the financial options problem with the LSMC algorithm. Our experiments show that our method can capture more accurate exercise opportunities when compared to the LSMC. We obtain a much higher expected payoff (an improvement above 974%) from these exercise policies under the many Monte Carlo simulations that used the real-world return database on the out-of-sample (test) data.
    Date: 2022–07
  18. By: Kara Karpman; Sumanta Basu; David Easley
    Abstract: Financial networks are typically estimated by applying standard time series analyses to price-based economic variables collected at low-frequency (e.g., daily or monthly stock returns or realized volatility). These networks are used for risk monitoring and for studying information flows in financial markets. High-frequency intraday trade data sets may provide additional insights into network linkages by leveraging high-resolution information. However, such data sets pose significant modeling challenges due to their asynchronous nature, nonlinear dynamics, and nonstationarity. To tackle these challenges, we estimate financial networks using random forests. The edges in our network are determined by using microstructure measures of one firm to forecast the sign of the change in a market measure (either realized volatility or returns kurtosis) of another firm. We first investigate the evolution of network connectivity in the period leading up to the U.S. financial crisis of 2007-09. We find that the networks have the highest density in 2007, with high degree connectivity associated with Lehman Brothers in 2006. A second analysis into the nature of linkages among firms suggests that larger firms tend to offer better predictive power than smaller firms, a finding qualitatively consistent with prior works in the market microstructure literature.
    Date: 2022–08
  19. By: Shaswat Mohanty; Anirudh Vijay; Nandagopan Gopakumar
    Abstract: The evaluation of the financial markets to predict their behaviour has been attempted using a number of approaches, to make smart and profitable investment decisions. Owing to the highly non-linear trends and inter-dependencies, it is often difficult to develop a statistical approach that elucidates the market behaviour entirely. To this end, we present a long short-term memory (LSTM) based model that leverages the sequential structure of the time-series data to provide an accurate market forecast. We then develop a decision-making StockBot that buys/sells stocks at the end of the day with the goal of maximizing profits. We successfully demonstrate an accurate prediction model, as a result of which our StockBot can outpace the market and can strategize for gains that are ~15 times higher than the most aggressive ETFs in the market.
    Date: 2022–07
  20. By: Marcelin Joanis; Andrea Lodi; Igor Sadoune
    Abstract: We propose a solution based on deep generative modeling to the problem of sampling synthetic instances of public procurement auctions from observed data. Our contribution is twofold. First, we overcome the challenges inherent to the replication of multi-level structures commonly seen in auction data, and second, we provide a specific validation procedure to evaluate the faithfulness of the resulting synthetic distributions. More generally, we argue that the generation of reliable artificial data accounts for research design improvements in applications ranging from inference to simulation crafting. In that regard, applied and social sciences can benefit from generative methods that alleviate the hardship of artificial sampling from highly-structured qualitative distributions, so characteristic of real-world data. As we dive deep into the technicalities of such algorithms, this paper can also serve as a general guideline in the context of density estimation for discrete distributions.
    Date: 2022–07
  21. By: Waddell, Glen R. (University of Oregon); McDonough, Robert (University of Oregon)
    Abstract: While comparing students across large differences in GPA follows one's intuition that higher GPAs correlate positively with higher-performing students, this need not be the case locally. Grade-point averaging is fundamentally a combinatorics problem, and thereby challenges inference based on local comparisons—this is especially true when students have experienced only small numbers of classes. While the effect of combinatorics diminishes in larger numbers of classes, mean convergence then has us jeopardize local comparability as GPA better delineates students of different ability. Given these two characteristics in decoding GPA, we discuss the advantages of machine-learning approaches to identifying treatment in educational settings.
    Keywords: GPA, grades, program evaluation, random forest, regression discontinuity
    JEL: I21 I26 C21
    Date: 2022–07
  22. By: Jacob Goldin; Julian Nyarko; Justin Young
    Abstract: Conducting causal inference with panel data is a core challenge in social science research. Advances in forecasting methods can facilitate this task by more accurately predicting the counterfactual evolution of a treated unit had treatment not occurred. In this paper, we draw on a newly developed deep neural architecture for time series forecasting (the N-BEATS algorithm). We adapt this method from conventional time series applications by incorporating leading values of control units to predict a "synthetic" untreated version of the treated unit in the post-treatment period. We refer to the estimator derived from this method as SyNBEATS, and find that it significantly outperforms traditional two-way fixed effects and synthetic control methods across a range of settings. We also find that SyNBEATS attains comparable or more accurate performance relative to more recent panel estimation methods such as matrix completion and synthetic difference in differences. Our results highlight how advances in the forecasting literature can be harnessed to improve causal inference in panel settings.
    Date: 2022–08
  23. By: Cappelletti, Matilde; Giuffrida, Leonardo M.
    Abstract: A set-aside restricts participation in procurement contests to targeted firms. Despite being widely used, its effects on actual competition and contract outcomes are ambiguous. We pool a decade of US federal procurement data to shed light on this empirical question using a two-stage approach. To circumvent the lack of exogenous variation in our data, as a first step we draw on random forest techniques to calculate the likelihood of a tender being set aside. We then estimate the effect of restricted tenders on pre- and post-award outcomes using an inverse probability weighting regression adjustment. Set-asides prompt more firms to bid - that is, the increase in targeted bidders more than offsets the loss of untargeted bidders. During the execution phase, set-aside contracts incur higher cost overruns and delays. The more restrictive the set-aside, the stronger these effects. In a subset of our data we leverage an expected spike in set-aside spending and we find no evidence of better performance by winners over a ten-year period.
    Keywords: small businesses, set-aside, competition, procurement, public contracts, random forest, firm dynamics
    JEL: D22 H32 H57 L25
    Date: 2022
  24. By: J. Sebastián Becerra; Alejandra Cruces
    Abstract: The purpose of the Financial Stability Report (FSR) is to report, on a semi-annual basis, recent macroeconomic and financial events that could affect the financial stability of the Chilean economy, such as the evolution of the indebtedness of the main credit users, the performance of the capital market, and the capacity of the financial system and the international financial position to adapt adequately to adverse economic situations. Together with the above, the FSR presents the policies and measures aimed at the normal functioning of the financial system, in order to promote knowledge and public debate on these issues. In this work, a methodology of text mining and sentiment analysis is proposed to estimate the tone of the FSR. Two products are generated from this work, a financial stability dictionary in Spanish and a Financial Sentiment Index. Based on OLS estimates, it is observed that a more optimistic tone of the FSR is in line with higher economic and credit activity, lower volatility in local and foreign financial asset prices, a more capitalized banking sector and lower political and economic uncertainty.
    Date: 2021–11
  25. By: Tomohiro Okubo (Bank of Japan); Koji Takahashi (Bank of Japan); Haruhiko Inatsugu (Bank of Japan); Masato Takahashi (Bank of Japan)
    Abstract: In the field of macroeconomic analysis, there has recently been a growing interest in "alternative data" or nontraditional data whose information sources differ from those of existing statistics. Using alternative data that become available in a timely manner, this paper aims to capture developments in Japan's private consumption at the macro level earlier than existing statistics. We construct the "Alternative Data Consumption Index" (ALC) by combining three types of alternative data: (1) credit card transaction data (JCB Consumption NOW); (2) point-of-sale (POS) data (METI POS and GfK); and (3) spending records obtained from a personal financial management service (Money Forward). We nowcast the Consumption Activity Index (CAI), which is compiled and released by the Bank of Japan, using the ALC. With respect to timeliness, the ALC has a significant advantage over the CAI; the ALC for the month is available in the middle of the following month, approximately 3 weeks earlier than the release of the CAI. Our findings show that the ALC is generally accurate in nowcasting the CAI and thus aggregate consumption developments. It also accurately captures the substantial changes in consumption activities caused by the spread of COVID-19 since spring 2020. Overall, the results suggest that alternative data can capture macro-level consumption activity promptly and accurately, making them a powerful tool for understanding economic conditions.
    Keywords: Nowcasting; Alternative Data; Private Consumption
    JEL: C49 E21 E27
  26. By: Elroi Hadad; Haim Kedar-Levy
    Abstract: We measure bond and stock conditional return volatility as a function of changes in sentiment, proxied by six indicators from the Tel Aviv Stock Exchange. We find that changes in sentiment affect conditional volatilities at different magnitudes and often in an opposite manner in the two markets, subject to market states. We are the first to measure the conditional volatility of bonds as a function of retail investors' sentiment, thanks to a unique dataset of corporate bond returns from a limit-order-book with highly active retail traders. This market structure differs from the prevalent OTC platforms, where institutional investors are active yet less prone to sentiment.
    Date: 2022–08

This nep-big issue is ©2022 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
For comments, please write to the director of NEP, Marco Novarese. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.