nep-big New Economics Papers
on Big Data
Issue of 2021‒08‒16
thirty papers chosen by
Tom Coupé
University of Canterbury

  1. Machine Learning and Factor-Based Portfolio Optimization By Thomas Conlon; John Cotter; Iason Kynigakis
  2. Economic Recession Prediction Using Deep Neural Network By Zihao Wang; Kun Li; Steve Q. Xia; Hongfu Liu
  3. Realised Volatility Forecasting: Machine Learning via Financial Word Embedding By Eghbal Rahimikia; Stefan Zohren; Ser-Huang Poon
  4. Machine Learning Classification Methods and Portfolio Allocation: An Examination of Market Efficiency By Yang Bai; Kuntara Pukthuanthong
  5. Application of classification algorithms for the assessment of confirmation to quality remarks By Fabio Zambuto; Simona Arcuti; Roberto Sabatini; Daniele Zambuto
  6. Feature importance recap and stacking models for forex price prediction By Yunze Li; Yanan Xie; Chen Yu; Fangxing Yu; Bo Jiang; Matloob Khushi
  7. The market notices published by the Italian Stock Exchange: a machine learning approach for the selection of the relevant ones By Marta Bernardini; Paolo Massaro; Francesca Pepe; Francesco Tocco
  8. Credit scoring using neural networks and SURE posterior probability calibration By Matthieu Garcin; Samuel Stéphan
  9. Why East Asian students perform better in mathematics than their peers: An investigation using a machine learning approach By Hanol Lee; Jong-Wha Lee
  10. Analyse du marché du travail à l’aide des données de Google Trends By Hugo Couture; Dalibor Stevanovic
  11. LocalGLMnet: interpretable deep learning for tabular data By Ronald Richman; Mario V. Wüthrich
  12. A Hybrid Learning Approach to Detecting Regime Switches in Financial Markets By Peter Akioyamen; Yi Zhou Tang; Hussien Hussien
  13. Factor Representation and Decision Making in Stock Markets Using Deep Reinforcement Learning By Zhaolu Dong; Shan Huang; Simiao Ma; Yining Qian
  14. Bad machines corrupt good morals By Nils Köbis; Jean-François Bonnefon; Iyad Rahwan
  15. Using Satellite Imagery and Deep Learning to Evaluate the Impact of Anti-Poverty Programs By Luna Yue Huang; Solomon M. Hsiang; Marco Gonzalez-Navarro
  16. Relational Graph Neural Networks for Fraud Detection in a Super-App environment By Jaime D. Acevedo-Viloria; Luisa Roa; Soji Adeshina; Cesar Charalla Olazo; Andrés Rodríguez-Rey; Jose Alberto Ramos; Alejandro Correa-Bahnsen
  17. Nighttime Light Intensity and Child Health Outcomes in Bangladesh By Mohammad Rafiqul Islam; Masud Alam; Munshi Naser İbne Afzal
  18. The Role of Social Movements, Coalitions, and Workers in Resisting Harmful Artificial Intelligence and Contributing to the Development of Responsible AI By Susan von Struensee
  19. Graph-Based Learning for Stock Movement Prediction with Textual and Relational Data By Qinkai Chen; Christian-Yann Robert
  20. Hedging with linear regressions and neural networks By Ruf, Johannes; Wang, Weiguan
  21. The financial market impact of ECB monetary policy press conferences - a text based approach By Parle, Conor
  22. Neural network approximation for superhedging prices By Francesca Biagini; Lukas Gonon; Thomas Reitsam
  23. Deep equal risk pricing of financial derivatives with non-translation invariant risk measures By Alexandre Carbonneau; Frédéric Godin
  24. How Do Workers Adjust When Firms Adopt New Technologies? By Genz, Sabrina; Gregory, Terry; Janser, Markus; Lehmer, Florian; Matthes, Britta
  25. Temporal-Relational Hypergraph Tri-Attention Networks for Stock Trend Prediction By Chaoran Cui; Xiaojie Li; Juan Du; Chunyun Zhang; Xiushan Nie; Meng Wang; Yilong Yin
  26. Automated Identification of Climate Risk Disclosures in Annual Corporate Reports By David Friederich; Lynn H. Kaack; Alexandra Luccioni; Bjarne Steffen
  27. Estimating the effects of universal transfers: new ML approach and application to labor supply reaction to child benefits By Filip Premik
  28. Data-driven mergers and personalization By Zhijun Chen; Chongwoo Choe; Jiajia Cong; Noriaki Matsushima
  29. Financial literacy and individual success: Lebanese framework modeling By Bachir El Murr; Genane Youness; Hala Gharib; Mayssaa Daher
  30. Cross-border Data Regulation for Digital Platforms: Data Privacy and Security By Serzo, Aiken Larisa O.

  1. By: Thomas Conlon; John Cotter; Iason Kynigakis
    Abstract: We examine machine learning and factor-based portfolio optimization. We find that factors based on autoencoder neural networks exhibit a weaker relationship with commonly used characteristic-sorted portfolios than popular dimensionality reduction techniques. Machine learning methods also lead to covariance and portfolio weight structures that diverge from simpler estimators. Minimum-variance portfolios using latent factors derived from autoencoders and sparse methods outperform simpler benchmarks in terms of risk minimization. These effects are amplified for investors with an increased sensitivity to risk-adjusted returns, during high volatility periods or when accounting for tail risk.
    Date: 2021–07
  2. By: Zihao Wang; Kun Li; Steve Q. Xia; Hongfu Liu
    Abstract: We investigate the effectiveness of different machine learning methodologies in predicting economic cycles. We identify the deep learning methodology of Bi-LSTM with Autoencoder as the most accurate model to forecast the beginning and end of economic recessions in the U.S. We adopt commonly-available macro and market-condition features to compare the ability of different machine learning models to generate good predictions both in-sample and out-of-sample. The proposed model is flexible and dynamic when both predictive variables and model coefficients vary over time. It provided good out-of-sample predictions for the past two recessions and early warning about the COVID-19 recession.
    Date: 2021–07
  3. By: Eghbal Rahimikia; Stefan Zohren; Ser-Huang Poon
    Abstract: We develop FinText, a novel, state-of-the-art, financial word embedding from Dow Jones Newswires Text News Feed Database. Incorporating this word embedding in a machine learning model produces a substantial increase in volatility forecasting performance on days with volatility jumps for 23 NASDAQ stocks from 27 July 2007 to 18 November 2016. A simple ensemble model, combining our word embedding and another machine learning model that uses limit order book data, provides the best forecasting performance for both normal and jump volatility days. Finally, we use Integrated Gradients and SHAP (SHapley Additive exPlanations) to make the results more 'explainable' and the model comparisons more transparent.
    Date: 2021–08
  4. By: Yang Bai; Kuntara Pukthuanthong
    Abstract: We design a novel framework to examine market efficiency through out-of-sample (OOS) predictability. We frame the asset pricing problem as a machine learning classification problem and construct classification models to predict return states. The prediction-based portfolios beat the market with significant OOS economic gains. We measure prediction accuracies directly. For each model, we introduce a novel application of the binomial test to assess the accuracy of 3.34 million return state predictions. The tests show that our models can extract useful content from historical information to predict future return states. We provide unique economic insights about OOS predictability and machine learning models.
    Date: 2021–08
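    As a rough illustration of the binomial accuracy test mentioned in item 4, the sketch below computes a one-sided exact binomial p-value for the null hypothesis that a classifier's hit rate is no better than chance. The function name, the 50% null rate, and the example counts are assumptions made for illustration; they are not taken from the paper.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided exact p-value: P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1.0 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical example: 60 correct return-state predictions out of 100.
p_value = binomial_p_value(60, 100)
```

    With millions of predictions, as in the abstract, the exact sum becomes slow and numerically fragile, so a normal approximation would likely be the more practical choice.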
  5. By: Fabio Zambuto (Bank of Italy); Simona Arcuti (Bank of Italy); Roberto Sabatini (Bank of Italy); Daniele Zambuto
    Abstract: In the context of the data quality management of supervisory banking data, the Bank of Italy receives a significant number of data reports at various intervals from Italian banks. If any anomalies are found, a quality remark is sent back, questioning the data submitted. This process can lead to the bank in question confirming or revising the data it previously transmitted. We propose an innovative methodology, based on text mining and machine learning techniques, for the automatic processing of the data confirmations received from banks. A classification model is employed to predict whether these confirmations should be accepted or rejected based on the reasons provided by the reporting banks, the characteristics of the validation quality checks, and reporting behaviour across the banking system. The model was trained on past cases already labelled by data managers and its performance was assessed against a set of cross-checked cases that were used as gold standard. The empirical findings show that the methodology predicts the correct decisions on recurrent data confirmations and that the performance of the proposed model is comparable to that of data managers currently engaged in data analysis.
    Keywords: supervisory banking data, data quality management, machine learning, text mining, latent dirichlet allocation, gradient boosting.
    JEL: C18 C81 G21
    Date: 2021–07
  6. By: Yunze Li; Yanan Xie; Chen Yu; Fangxing Yu; Bo Jiang; Matloob Khushi
    Abstract: Forex is the largest market in terms of quantitative trading. Traditionally, traders rely on technical analysis of historical data to make trading decisions. With the development of artificial intelligence, deep learning plays an increasingly important role in forex forecasting, and how to use deep learning models to predict future prices is the primary concern of most researchers. Such predictions not only help investors and traders make decisions but can also be used in auto-trading systems. In this article, we propose a novel approach to feature selection called 'feature importance recap', which combines the feature importance scores from tree-based models with the performance of deep learning models. A stacking model is also developed to further improve performance. Our results show that a proper feature selection approach can significantly improve model performance and that, for financial data, some features have high importance scores in many models. The results of the stacking model indicate that combining the predictions of several models and feeding them into a neural network can further improve performance.
    Date: 2021–07
  7. By: Marta Bernardini (Bank of Italy); Paolo Massaro (Bank of Italy); Francesca Pepe (Bank of Italy); Francesco Tocco (Bank of Italy)
    Abstract: Bank of Italy data managers check the market notices published daily by the Italian Stock Exchange (Borsa Italiana) and select those of interest to update the Bank of Italy's Securities Database. This activity is time-consuming and prone to errors should a data manager overlook a relevant notice. In this paper we describe the implementation of a supervised model to automatically select the market notices. The model outperforms the manual approach used by data managers and can therefore be implemented in the regular process to update the Securities Database.
    Keywords: machine learning, Securities Database, automatic selection, Italian Stock Exchange
    JEL: C18 C81 G23
    Date: 2021–07
  8. By: Matthieu Garcin (ESILV - Ecole Supérieure d'Ingénieurs Léonard de Vinci); Samuel Stéphan (ESILV - Ecole Supérieure d'Ingénieurs Léonard de Vinci, SAMM - Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) - UP1 - Université Paris 1 Panthéon-Sorbonne)
    Abstract: In this article we compare the performance of a logistic regression and a feed-forward neural network for credit scoring purposes. Our results show that the logistic regression gives quite good results on the dataset and that the neural network can slightly improve the performance. We also consider different sets of features in order to assess their importance in terms of prediction accuracy. We find that temporal features (i.e. repeated measures over time) can be an important source of information, resulting in an increase in the overall model accuracy. Finally, we introduce a new technique for the calibration of predicted probabilities based on Stein's unbiased risk estimate (SURE). This calibration technique can be applied to very general calibration functions. In particular, we detail this method for the sigmoid function as well as for the Kumaraswamy function, which includes the identity as a particular case. We show that stacking the SURE calibration technique with the classical Platt method can improve the calibration of predicted probabilities.
    Keywords: Deep learning,credit scoring,calibration,SURE
    Date: 2021–07–15
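    The two calibration function families named in item 8 can be sketched as follows. This is only an illustration of the function shapes: it does not implement the SURE-based fitting, the "Kumaraswamy function" is assumed here to be the Kumaraswamy cumulative distribution function, and the parameter names `a` and `b` are illustrative.

```python
import math


def sigmoid_calibration(score: float, a: float, b: float) -> float:
    """Platt-style map of a raw score to a probability: 1 / (1 + exp(a*score + b))."""
    return 1.0 / (1.0 + math.exp(a * score + b))


def kumaraswamy_calibration(p: float, a: float, b: float) -> float:
    """Kumaraswamy CDF on [0, 1]: 1 - (1 - p**a)**b, with a, b > 0."""
    return 1.0 - (1.0 - p**a) ** b


# With a = b = 1 the Kumaraswamy map leaves the probability unchanged.
identity_check = kumaraswamy_calibration(0.3, 1.0, 1.0)
```

    Setting both shape parameters to 1 recovers the identity, which matches the abstract's remark that the identity is a particular case.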
  9. By: Hanol Lee; Jong-Wha Lee
    Abstract: Using a machine learning approach, we attempt to identify the school-, student-, and country-related factors that predict East Asian students’ higher PISA mathematics scores compared to their international peers. We identify student- and school-related factors, such as metacognition–assess credibility, mathematics learning time, early childhood education and care, grade repetition, school type and size, class size, and student behavior hindering learning, as important predictors of the higher average mathematics scores of East Asian students. Moreover, country-level factors, such as the proportion of youth not in education, training, or employment and the number of R&D researchers, are also found to have high predicting power. The results also highlight the nonlinear and complex relationships between educational inputs and outcomes.
    Keywords: education, East Asia, machine learning, mathematics test score, PISA
    JEL: C53 C55 I21 J24 O1
    Date: 2021–07
  10. By: Hugo Couture; Dalibor Stevanovic
    Abstract: In this report, we evaluate the relevance of weekly Google search query data for current- and next-month prediction of several labour market variables in Canada and Quebec. Several types of mixed-frequency models are considered and their performance is evaluated in an out-of-sample forecasting exercise spanning the period 2014M09 - 2019M09. Google Trends data improve the accuracy of forecasts of the employment rate, hours worked and the unemployment rate. The availability of these data at high frequency is crucial. Their contribution is especially important during the first two weeks of the month, that is, when Labour Force Survey data are not yet available for the previous month.
    Keywords: Forecasting, Macroeconomics, Job market, Google Trends, Machine Learning
    JEL: C53 C55 E37
    Date: 2021–08–02
  11. By: Ronald Richman; Mario V. Wüthrich
    Abstract: Deep learning models have gained great popularity in statistical modeling because they lead to very competitive regression models, often outperforming classical statistical models such as generalized linear models. The disadvantage of deep learning models is that their solutions are difficult to interpret and explain, and variable selection is not easily possible because deep learning models solve feature engineering and variable selection internally in a nontransparent way. Inspired by the appealing structure of generalized linear models, we propose a new network architecture that shares features with generalized linear models but provides superior predictive power, benefiting from the art of representation learning. This new architecture allows for variable selection of tabular data and for interpretation of the calibrated deep learning model. In fact, our approach provides an additive decomposition in the spirit of Shapley values and integrated gradients.
    Date: 2021–07
  12. By: Peter Akioyamen (Western University); Yi Zhou Tang (Western University); Hussien Hussien (Western University)
    Abstract: Financial markets are of much interest to researchers due to their dynamic and stochastic nature. With their relations to world populations, global economies and asset valuations, understanding, identifying and forecasting trends and regimes are highly important. Attempts have been made to forecast market trends by employing machine learning methodologies, while statistical techniques have been the primary methods used in developing market regime switching models used for trading and hedging. In this paper we present a novel framework for the detection of regime switches within the US financial markets. Principal component analysis is applied for dimensionality reduction and the k-means algorithm is used as a clustering technique. Using a combination of cluster analysis and classification, we identify regimes in financial markets based on publicly available economic data. We display the efficacy of the framework by constructing and assessing the performance of two trading strategies based on detected regimes.
    Date: 2021–08
  13. By: Zhaolu Dong; Shan Huang; Simiao Ma; Yining Qian
    Abstract: Deep reinforcement learning is a branch of unsupervised learning in which an agent learns to act based on the environment state in order to maximize its total reward. Deep reinforcement learning provides a good opportunity to model the complexity of portfolio choice in high-dimensional, data-driven environments by leveraging the powerful representations of deep neural networks. In this paper, we build a portfolio management system using direct deep reinforcement learning to make optimal portfolio choices periodically among S&P 500 underlying stocks by learning a good factor representation (as input). The results show that effective learning of market conditions and optimal portfolio allocations can significantly outperform the market average.
    Date: 2021–08
  14. By: Nils Köbis (Center for Humans and Machines); Jean-François Bonnefon (UT1 - Université de Toulouse 1 Capitole); Iyad Rahwan (Center for Humans and Machines)
    Abstract: Machines powered by Artificial Intelligence (AI) are now influencing the behavior of humans in ways that are both like and unlike the ways humans influence each other. In light of recent research showing that other humans can exert a strong corrupting influence on people's ethical behavior, worry emerges about the corrupting power of AI agents. To estimate the empirical validity of these fears, we review the available evidence from behavioral science, human-computer interaction, and AI research. We propose that the main social roles through which both humans and machines can influence ethical behavior are (a) role model, (b) advisor, (c) partner, and (d) delegate. When AI agents become influencers (role models or advisors), their corrupting power may not exceed (yet) the corrupting power of humans. However, AI agents acting as enablers of unethical behavior (partners or delegates) have many characteristics that may let people reap unethical benefits while feeling good about themselves, indicating good reasons for worry. Based on these insights, we outline a research agenda that aims at providing more behavioral insights for better AI oversight.
    Keywords: machine behavior,behavioral ethics,corruption,artificial intelligence
    Date: 2021–06
  15. By: Luna Yue Huang; Solomon M. Hsiang; Marco Gonzalez-Navarro
    Abstract: The rigorous evaluation of anti-poverty programs is key to the fight against global poverty. Traditional approaches rely heavily on repeated in-person field surveys to measure program effects. However, this is costly, time-consuming, and often logistically challenging. Here we provide the first evidence that we can conduct such program evaluations based solely on high-resolution satellite imagery and deep learning methods. Our application estimates changes in household welfare in a recent anti-poverty program in rural Kenya. Leveraging a large literature documenting a reliable relationship between housing quality and household wealth, we infer changes in household wealth based on satellite-derived changes in housing quality and obtain consistent results with the traditional field-survey based approach. Our approach generates inexpensive and timely insights on program effectiveness in international development programs.
    JEL: C8 H0 O1 O22 Q0 R0
    Date: 2021–07
  16. By: Jaime D. Acevedo-Viloria; Luisa Roa; Soji Adeshina; Cesar Charalla Olazo; Andrés Rodríguez-Rey; Jose Alberto Ramos; Alejandro Correa-Bahnsen
    Abstract: Large digital platforms create environments where different types of user interactions are captured; these relationships offer a novel source of information for fraud detection problems. In this paper we propose a framework of relational graph convolutional network methods for fraudulent behaviour prevention in the financial services of a Super-App. To this end, we apply the framework to different heterogeneous graphs of users, devices, and credit cards, and finally use an interpretability algorithm for graph neural networks to determine the most important relations for the user classification task. Our results show that there is added value in considering models that take advantage of the alternative data of the Super-App and the interactions found in its high connectivity, further demonstrating how they can be leveraged for better decisions and fraud detection strategies.
    Date: 2021–07
  17. By: Mohammad Rafiqul Islam; Masud Alam; Munshi Naser İbne Afzal
    Abstract: This study examines the impact of nighttime light intensity on child health outcomes in Bangladesh. We use nighttime light intensity as a proxy measure of urbanization and argue that the higher the intensity of nighttime light, the higher the degree of urbanization, which positively affects child health outcomes. In econometric estimation, we employ a methodology that combines parametric and non-parametric approaches using the Gradient Boosting Machine (GBM), K-Nearest Neighbors (KNN), and Bootstrap Aggregating, which originate from machine learning. Based on our benchmark estimates, findings show that a one standard deviation increase in nighttime light intensity is associated with a 1.515 rise in the weight-for-age Z-score after controlling for several variables. The maximum increases in the weight-for-height and height-for-age scores range from 5.35 to 7.18 units. Generalized additive models, used to probe the benchmark estimates further, also show a robust positive relationship between nighttime light intensity and children's health outcomes. Finally, we develop an economic model that supports the empirical findings of this study that the marginal effect of urbanization on children's nutritional outcomes is strictly positive.
    Date: 2021–08
  18. By: Susan von Struensee
    Abstract: There is mounting public concern over the influence that AI-based systems have in our society. Coalitions in all sectors are acting worldwide to resist harmful applications of AI. From indigenous people addressing the lack of reliable data, to smart city stakeholders, to students protesting the academic relationships with sex trafficker and MIT donor Jeffrey Epstein, the questionable ethics and values of those heavily investing in and profiting from AI are under global scrutiny. There are biased, wrongful, and disturbing assumptions embedded in AI algorithms that could get locked in without intervention. Our best human judgment is needed to contain AI's harmful impact. Perhaps one of the greatest contributions of AI will be to make us ultimately understand how important human wisdom truly is in life on earth.
    Date: 2021–07
  19. By: Qinkai Chen; Christian-Yann Robert
    Abstract: Predicting stock prices from textual information is a challenging task due to the uncertainty of the market and the difficulty of understanding natural language from a machine's perspective. Previous research focuses mostly on sentiment extraction based on single news items. However, stocks on the financial market can be highly correlated: a news item regarding one stock can quickly impact the prices of other stocks. To take this effect into account, we propose a new stock movement prediction framework: Multi-Graph Recurrent Network for Stock Forecasting (MGRN). This architecture makes it possible to combine the textual sentiment from financial news with multiple kinds of relational information extracted from other financial data. Through an accuracy test and a trading simulation on the stocks in the STOXX Europe 600 index, we demonstrate that our model performs better than other benchmarks.
    Date: 2021–07
  20. By: Ruf, Johannes; Wang, Weiguan
    Abstract: We study neural networks as nonparametric estimation tools for the hedging of options. To this end, we design a network, named HedgeNet, that directly outputs a hedging strategy. This network is trained to minimize the hedging error instead of the pricing error. Applied to end-of-day and tick prices of S&P 500 and Euro Stoxx 50 options, the network is able to reduce the mean squared hedging error of the Black-Scholes benchmark significantly. However, a similar benefit arises from simple linear regressions that incorporate the leverage effect.
    Keywords: benchmarking; Black-Scholes; data leakage; hedging error; leverage effect; statistical hedging
    JEL: J1 C1
    Date: 2021–06–30
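    To make the idea of hedging by linear regression in item 20 concrete, here is a minimal sketch that picks the hedge ratio minimizing the mean squared hedging error, which amounts to a least-squares regression through the origin of option price changes on underlying price changes. The paper's regressions, which incorporate the leverage effect, are richer than this; the function names and the data below are hypothetical.

```python
def hedge_ratio(option_changes, underlying_changes):
    """Least-squares hedge ratio h minimizing sum((dC - h * dS)**2)."""
    numerator = sum(dc * ds for dc, ds in zip(option_changes, underlying_changes))
    denominator = sum(ds * ds for ds in underlying_changes)
    return numerator / denominator


def mean_squared_hedging_error(option_changes, underlying_changes, h):
    """Average squared residual of the hedged position, dC - h * dS."""
    errors = [dc - h * ds for dc, ds in zip(option_changes, underlying_changes)]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical one-period price changes for the underlying and the option.
d_underlying = [1.0, -2.0, 0.5, 1.5, -1.0]
d_option = [0.55, -0.9, 0.2, 0.8, -0.6]

h = hedge_ratio(d_option, d_underlying)
mse = mean_squared_hedging_error(d_option, d_underlying, h)
```

    Any other ratio gives a mean squared hedging error at least as large on the same data, which is the sense in which the regression coefficient acts as a statistical hedge.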
  21. By: Parle, Conor (Central Bank of Ireland)
    Abstract: Using methods from natural language processing, I create two measures of the monetary policy tilt of the ECB, entitled the “Hawk-Dove Indices”, that outline the beliefs of the ECB on the current state of the economy and the outlook for growth and inflation. These measures closely track interest rate expectations over the tightening and loosening cycle, can provide a useful measure of monetary policy tilt at zero lower bound episodes, and contain information about the state of the economy. I exploit the time lag between decision announcements and the ECB’s monetary policy press conference to assess the immediate financial market impact of changes in communication within the press conference, free from the effects of the shock from the monetary policy decision. Consistent with the literature on the information channel of monetary policy, I find a non-negligible positive (negative) effect on stock prices of a more hawkish (dovish) tone in the press conference, indicating that the ECB reveals “private information” during these press conferences, and that market participants internalise this as good (bad) news regarding the future state of the economy, rather than internalising a future potential increase (decrease) in interest rates. This effect is stronger prior to the introduction of formal forward guidance, suggesting that ECB communication has been less surprising to markets since then.
    Keywords: Monetary policy, communication, machine learning, natural language processing, event study, information effects
    JEL: E52 E58 C55
    Date: 2021–05
  22. By: Francesca Biagini; Lukas Gonon; Thomas Reitsam
    Abstract: This article examines neural network-based approximations for the superhedging price process of a contingent claim in a discrete time market model. First we prove that the $\alpha$-quantile hedging price converges to the superhedging price at time $0$ for $\alpha$ tending to $1$, and show that the $\alpha$-quantile hedging price can be approximated by a neural network-based price. This provides a neural network-based approximation for the superhedging price at time $0$ and also the superhedging strategy up to maturity. To obtain the superhedging price process for $t>0$, by using the Doob decomposition it is sufficient to determine the process of consumption. We show that it can be approximated by the essential supremum over a set of neural networks. Finally, we present numerical results.
    Date: 2021–07
  23. By: Alexandre Carbonneau; Frédéric Godin
    Abstract: The use of non-translation invariant risk measures within the equal risk pricing (ERP) methodology for the valuation of financial derivatives is investigated. The ability to move beyond the class of convex risk measures considered in several prior studies provides more flexibility within the pricing scheme. In particular, suitable choices for the risk measure embedded in the ERP framework such as the semi-mean-square-error (SMSE) are shown herein to alleviate the price inflation phenomenon observed under Tail Value-at-Risk based ERP as documented for instance in Carbonneau and Godin (2021b). The numerical implementation of non-translation invariant ERP is performed through deep reinforcement learning, where a slight modification is applied to the conventional deep hedging training algorithm (see Buehler et al., 2019) so as to enable obtaining a price through a single training run for the two neural networks associated with the respective long and short hedging strategies. The accuracy of the neural network training procedure is shown in simulation experiments not to be materially impacted by such modification of the training algorithm.
    Date: 2021–07
  24. By: Genz, Sabrina (Institute for Employment Research (IAB), Nuremberg); Gregory, Terry (IZA); Janser, Markus (Institute for Employment Research (IAB), Nuremberg); Lehmer, Florian (Institute for Employment Research (IAB), Nuremberg); Matthes, Britta (Institute for Employment Research (IAB), Nuremberg)
    Abstract: We investigate how workers adjust to firms' investments into new digital technologies, including artificial intelligence, augmented reality, or 3D printing. For this, we collected novel data that links survey information on firms' technology adoption to administrative social security data. We then compare individual outcomes between workers employed at technology adopters relative to non-adopters. Depending on the type of technology, we find evidence for improved employment stability, higher wage growth, and increased cumulative earnings in response to digital technology adoption. These beneficial adjustments seem to be driven by technologies used by service providers rather than manufacturers. However, the adjustments do not occur equally across worker groups: IT-related expert jobs with non-routine analytic tasks benefit most from technological upgrading, coinciding with highly complex job requirements, but not necessarily with more academic skills.
    Keywords: technological change, artificial intelligence, employment stability, wages
    JEL: J23 J31 J62
    Date: 2021–08
  25. By: Chaoran Cui; Xiaojie Li; Juan Du; Chunyun Zhang; Xiushan Nie; Meng Wang; Yilong Yin
    Abstract: Predicting the future price trends of stocks is a challenging yet intriguing problem given its critical role in helping investors make profitable decisions. In this paper, we present a collaborative temporal-relational modeling framework for end-to-end stock trend prediction. The temporal dynamics of stocks are first captured with an attention-based recurrent neural network. Then, unlike existing studies that rely on the pairwise correlations between stocks, we argue that stocks are naturally connected as a collective group, and introduce hypergraph structures to jointly characterize the stock group-wise relationships of industry-belonging and fund-holding. A novel hypergraph tri-attention network (HGTAN) is proposed to augment the hypergraph convolutional networks with a hierarchical organization of intra-hyperedge, inter-hyperedge, and inter-hypergraph attention modules. In this manner, HGTAN adaptively determines the importance of nodes, hyperedges, and hypergraphs during the information propagation among stocks, so that the potential synergies between stock movements can be fully exploited. Extensive experiments on real-world data demonstrate the effectiveness of our approach. Also, the results of investment simulation show that our approach can achieve a more desirable risk-adjusted return. The data and codes of our work have been released at
    Date: 2021–07
  26. By: David Friederich; Lynn H. Kaack; Alexandra Luccioni; Bjarne Steffen
    Abstract: It is important for policymakers to understand which financial policies are effective in increasing climate risk disclosure in corporate reporting. We use machine learning to automatically identify disclosures of five different types of climate-related risks. For this purpose, we have created a dataset of over 120 manually-annotated annual reports by European firms. Applying our approach to reporting of 337 firms over the last 20 years, we find that risk disclosure is increasing. Disclosure of transition risks grows more dynamically than physical risks, and there are marked differences across industries. Country-specific dynamics indicate that regulatory environments potentially have an important role to play for increasing disclosure.
    Date: 2021–08
  27. By: Filip Premik (Group for Research in Applied Economics (GRAPE))
    Abstract: This paper evaluates the effects of introducing a universal child benefit program on female labor supply. Large-scale government interventions affect economic outcomes through different channels, whose effects vary in magnitude and direction. In order to account for this feature, I develop a model in which a woman decides whether to participate in the labor market in a given period. I show how to use the resulting decision rules to explain flows in aggregate labor supply and simulate counterfactual paths of the labor force. My framework combines the flexibility of reduced-form approaches with the appealing structure of dynamic discrete choice models. The model is estimated nonparametrically using recent advances in machine learning methods. The results indicate a 2-4 percentage point drop in the labor force among eligible females, mainly driven by changes in women's perceived trade-offs and beliefs that discouraged inflows.
    Keywords: child benefits, labor supply, program evaluation, difference-in-difference estimation, covariate balancing propensity score
    JEL: C21 C23 I38 J22
    Date: 2021
  28. By: Zhijun Chen; Chongwoo Choe; Jiajia Cong; Noriaki Matsushima
    Abstract: This paper studies tech mergers that involve a large volume of consumer data. The merger links the markets for data collection and data application through a consumption synergy. The merger-specific efficiency gains exist in the market for data application due to the consumption synergy and data-enabled personalization. Prices fall in the market for data collection due to the merged firm's incentives to expand its outreach in the market for data application. But in the market for data application, prices generally rise as the efficiency gains are extracted away through personalized pricing, rather than being passed on to consumers. When the consumption synergy is large enough, the merger can result in monopolization of both markets, with further consumer harm when stand-alone competitors exit in the long run. We discuss policy implications including various merger remedies.
    Date: 2021–11
  29. By: Bachir El Murr (Université Libanaise); Genane Youness (CEDRIC - Centre d'études et de recherche en informatique et communications - ENSIIE - Ecole Nationale Supérieure d'Informatique pour l'Industrie et l'Entreprise - CNAM - Conservatoire National des Arts et Métiers [CNAM]); Hala Gharib; Mayssaa Daher
    Abstract: This paper sheds light on the role that financial literacy features may play among other determinants of individual success. A survey is conducted on a random sample of household members, using individual perception as the assessment criterion for financial literacy status and career success. A non-parametric method, ctree, and a semi-parametric method, multivariate logistic regression with interaction using random forest, are used. The two models are built to perform supervised learning classification. They are validated through 10-fold cross-validation to assess their capability to predict key factors of individual success, including financial literacy features. The results show that personal and socioeconomic factors do not have any noticeable impact on professional success. The current educational system seems to offer only light insight into the professional perspectives of individuals. Financial literacy factors
    Keywords: financial literacy, individual success, ctree, multivariate logistic regression, Random Forest
    Date: 2021–06–29
  30. By: Serzo, Aiken Larisa O.
    Abstract: The rise of digital platforms necessarily entails the processing of personal data between platforms and their users. More than enabling the delivery of services by the platforms, data shared by users has become increasingly valuable, as various businesses are able to leverage their access to data in order to create and upsell other services. However, the ability of platforms to engage in cross-border transactions or operations is affected by the stringent requirements of data protection laws, coupled with the divergent regulations among jurisdictions. With the Philippines as an example, this paper points out the salient points in existing data protection regulations and the impact of these principles on both platforms and data subjects. Comments to this paper are welcome within 60 days from date of posting. Email
    Keywords: regulatory reform, data privacy, digital platforms, data sharing
    Date: 2020

This nep-big issue is ©2021 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.