nep-cmp New Economics Papers
on Computational Economics
Issue of 2020‒08‒17
twenty papers chosen by

  1. Artificial Neural Networks Performance in WIG20 Index Options Pricing By Maciej Wysocki; Robert Ślepaczuk
  2. Predicting prices of S&P500 index using classical methods and recurrent neural networks By Mateusz Kijewski; Robert Ślepaczuk
  3. Grounded reality meets machine learning: A deep-narrative analysis framework for energy policy research By Debnath, R.; Darby, S.; Bardhan, R.; Mohaddes, K.; Sunikka-Blank, M.
  4. The Pandemics in Artificial Society: Agent-Based Model to Reflect Strategies on COVID-19 By Situngkir, Hokky; Lumbantobing, Andika Bernad
  5. Towards better understanding of complex machine learning models using Explainable Artificial Intelligence (XAI) - case of Credit Scoring modelling By Marta Kłosok; Marcin Chlebus
  6. Tax-Aware Portfolio Construction via Convex Optimization By Nicholas Moehle; Mykel J. Kochenderfer; Stephen Boyd; Andrew Ang
  7. Deep neural network for optimal retirement consumption in defined contribution pension system By Wen Chen; Nicolas Langrené
  8. Pricing equity-linked life insurance contracts with multiple risk factors by neural networks By Karim Barigou; Lukasz Delong
  9. The hard problem of prediction for conflict prevention By Hannes Mueller; Christopher Rauh
  10. All the bottles in one basket? Diversification and product portfolio composition By Friberg, Richard
  11. Pricing equity-linked life insurance contracts with multiple risk factors by neural networks By Karim Barigou; Lukasz Delong
  12. Solving High-Order Portfolios via Successive Convex Approximation Algorithms By Rui Zhou; Daniel P. Palomar
  13. Applications of artificial intelligence technologies on mental health research during COVID-19 By Hossain, Md Mahbub; McKyer, E. Lisako J.; Ma, Ping
  14. Monte-Carlo Simulation Studies in Survey Statistics – An Appraisal By Jan Pablo Burgard; Patricia Dörr; Ralf Münnich
  15. Building(s and) cities: delineating urban areas with a machine learning algorithm By Daniel Arribas-Bel; Miquel-Àngel Garcia-López; Elisabet Viladecans-Marsal
  16. Choosing between explicit cartel formation and tacit collusion – An experiment By Maximilian Andres; Lisa Bruttel; Jana Friedrichsen
  17. Measuring uncertainty at the regional level using newspaper text By Christopher Rauh
  18. The potential influence of machine learning and data science on the future of economics: Overview of highly-cited research By Deshpande, Advait
  19. Mind the gap! Machine learning, ESG metrics and sustainable investment By Ariel Lanza; Enrico Bernardini; Ivan Faiella
  20. Grade Expectations: How well can we predict future grades based on past performance? By Jake Anders; Catherine Dilnot; Lindsey Macmillan; Gill Wyness

  1. By: Maciej Wysocki (Quantitative Finance Research Group; Faculty of Economic Sciences, University of Warsaw); Robert Ślepaczuk (Quantitative Finance Research Group; Faculty of Economic Sciences, University of Warsaw)
    Abstract: In this paper the performance of artificial neural networks in option pricing is analyzed and compared with the results obtained from the Black–Scholes–Merton model based on historical volatility. The results are compared using various error metrics calculated separately across three moneyness classes. A market data-driven approach is taken in order to train and test the neural network on real-world data from the Warsaw Stock Exchange. The artificial neural network does not provide more accurate option prices: the Black–Scholes–Merton model turned out to be more precise and robust to various market conditions. In addition, the bias of the forecasts obtained from the neural network differs significantly between moneyness states.
    Keywords: option pricing, machine learning, artificial neural networks, implied volatility, supervised learning, index options, Black–Scholes–Merton model
    JEL: C4 C14 C45 C53 C58 G13
    Date: 2020
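The Black–Scholes–Merton benchmark referred to above can be sketched in a few lines. This is a generic textbook implementation (ignoring dividends), not the authors' code, and the example inputs are invented:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF expressed via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bsm_call(s, k, r, sigma, t):
    """Black-Scholes-Merton price of a European call.
    s: spot, k: strike, r: risk-free rate, sigma: volatility, t: years to expiry."""
    d1 = (log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)

# At-the-money call, 20% vol, zero rates, one year to expiry.
print(round(bsm_call(100, 100, 0.0, 0.2, 1.0), 4))  # ≈ 7.9656
```

In the paper, sigma would be replaced by an estimate of historical volatility for each option.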
  2. By: Mateusz Kijewski (Quantitative Finance Research Group; Faculty of Economic Sciences, University of Warsaw); Robert Ślepaczuk (Quantitative Finance Research Group; Faculty of Economic Sciences, University of Warsaw)
    Abstract: This study implements algorithmic investment strategies with buy/sell signals based on classical methods and a recurrent neural network model (LSTM). The research compares the performance of investment algorithms on the S&P 500 index time series covering 20 years of data, from 2000 to 2020. This paper presents an approach for dynamic optimization of parameters during the backtesting process by using a rolling training-testing window. Every method was tested for robustness to changes in parameters and evaluated with appropriate performance statistics, e.g. the Information Ratio and Maximum Drawdown. A combination of signals from different methods was stable and outperformed the Buy & Hold benchmark, doubling its returns at the same level of risk. A detailed sensitivity analysis revealed that the classical methods, which used a rolling training-testing window, were significantly more robust to changes in parameters than the LSTM model, in which hyperparameters were selected heuristically.
    Keywords: machine learning, recurrent neural networks, long short-term memory model, time series analysis, algorithmic investment strategies, systematic transactional systems, technical analysis, ARIMA model
    JEL: C4 C14 C45 C53 C58 G13
    Date: 2020
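The rolling training-testing window described above can be sketched as a walk-forward index generator. The window lengths below are illustrative assumptions, not the paper's settings:

```python
def rolling_windows(n_obs, train_len, test_len):
    """Yield (train_range, test_range) index pairs for a walk-forward backtest:
    each model is re-fitted on the most recent `train_len` observations and
    evaluated on the following `test_len` observations, after which the
    window rolls forward by `test_len`."""
    start = 0
    while start + train_len + test_len <= n_obs:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += test_len

# Example: 10 years of daily data, re-optimised yearly on a 3-year lookback
# (assuming 252 trading days per year).
splits = list(rolling_windows(2520, 756, 252))
print(len(splits))  # 7 walk-forward steps
```

Out-of-sample segments never overlap with the data the parameters were fitted on, which is what makes the sensitivity analysis meaningful.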
  3. By: Debnath, R.; Darby, S.; Bardhan, R.; Mohaddes, K.; Sunikka-Blank, M.
    Abstract: Text-based data sources like narratives and stories have become increasingly popular as critical insight generators in energy research and social science. However, their implications for policy application usually remain superficial and fail to fully exploit the state-of-the-art resources the digital era holds for text analysis. This paper illustrates the potential of deep-narrative analysis in energy policy research using text analysis tools from the cutting-edge domain of computational social science, notably topic modelling. We argue that a nested application of topic modelling and grounded theory in narrative analysis promises advances in areas where manual-coding-driven narrative analysis has traditionally struggled with directionality biases, scaling, systematisation and repeatability. The nested application of topic modelling and grounded theory goes beyond the frequentist approach of narrative analysis and introduces insight-generation capabilities based on the probability distribution of words and topics in a text corpus. In this manner, our proposed methodology deconstructs the corpus and enables the analyst to answer research questions based on the foundational elements of the text data structure. We verify theoretical compatibility through a meta-analysis of a state-of-the-art bibliographic database on energy policy, narratives and computational social science. Furthermore, we establish a proof-of-concept using a narrative-based case study on energy externalities in slum rehabilitation housing in Mumbai, India. We find that the nested application addresses the literature gap on the need for multidisciplinary methodologies that can systematically include qualitative evidence in policymaking.
    Keywords: energy policy, narratives, topic modelling, computational social science, text analysis, methodological framework
    JEL: Q40 Q48 R28
    Date: 2020–07–14
  4. By: Situngkir, Hokky; Lumbantobing, Andika Bernad
    Abstract: Various social policies and strategies have been deliberated and used within many countries to handle the COVID-19 pandemic. Some of those basic ideas are strongly related to the understanding of human social interactions and the nature of disease transmission and spread. In this paper, we present an agent-based approach to model epidemiological phenomena as well as interventions upon them. We elaborate on micro-social structures such as social-psychological factors and distributed ruling behaviors to grow an artificial society in which the interactions among agents may exhibit the spreading of the virus. Capturing policies and strategies during the pandemic, four types of intervention are also applied in the society. Emergent macro-properties of the epidemic are derived from sets of simulations, leading to comparisons of each policy/strategy's effectiveness.
    Keywords: COVID-19, coronavirus disease, policy, pandemic, social simulations, artificial society, agent-based modeling.
    JEL: C9 C99 H89 I1 I18 R0 Z18
    Date: 2020–07–26
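A minimal agent-based sketch of the kind of simulation described above, assuming a toy S-I-R state machine and a single "reduced contacts" intervention. All parameter values are invented and the model is far simpler than the paper's micro-social structures:

```python
import random

def simulate(n_agents=500, contacts_per_day=8, p_transmit=0.05,
             recovery_days=10, days=60, contact_scale=1.0, seed=1):
    """Minimal agent-based S-I-R simulation. `contact_scale` < 1 mimics a
    social-distancing intervention by cutting each agent's daily contacts."""
    rng = random.Random(seed)
    state = ["S"] * n_agents          # S = susceptible, I = infected, R = recovered
    sick_days = [0] * n_agents
    for i in range(5):                # seed the outbreak with a few cases
        state[i] = "I"
    for _ in range(days):
        infected = [i for i, s in enumerate(state) if s == "I"]
        for i in infected:
            # Each infected agent meets random others and may transmit.
            for _ in range(int(contacts_per_day * contact_scale)):
                j = rng.randrange(n_agents)
                if state[j] == "S" and rng.random() < p_transmit:
                    state[j] = "I"
            sick_days[i] += 1
            if sick_days[i] >= recovery_days:
                state[i] = "R"
    return sum(s != "S" for s in state)  # total ever infected

print(simulate(), simulate(contact_scale=0.5))
```

Comparing the emergent attack rate with and without the intervention is the agent-based analogue of the policy comparisons the paper performs.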
  5. By: Marta Kłosok (Faculty of Economic Sciences, University of Warsaw); Marcin Chlebus (Faculty of Economic Sciences, University of Warsaw)
    Abstract: In recent years many scientific journals have widely explored the topic of machine learning interpretability. The topic matters because the application of Artificial Intelligence is growing rapidly and its excellent performance holds huge potential for many fields, yet analysts implementing intelligent systems face barriers, the biggest of which is explaining why the model made a certain prediction. This work addresses methods for understanding a black-box from both the global and the local perspective. Numerous model-agnostic methods aimed at interpreting black-box model behavior and the predictions generated by these complex structures are analyzed. Among them are: Permutation Feature Importance, Partial Dependence Plots, Individual Conditional Expectation curves, Accumulated Local Effects, techniques approximating predictions of the black-box for single observations with surrogate models (interpretable white-boxes), and the Shapley values framework. Our survey leads toward the question of the extent to which the presented tools enhance model transparency. All of the frameworks are examined in practice on a credit-default use case. The overview shows that each of the methods has some limitations, but overall almost all of the summarized techniques produce reliable explanations and contribute to the transparency and accountability of decision systems.
    Keywords: machine learning, explainable Artificial Intelligence, visualization techniques, model interpretation, variable importance
    JEL: C25
    Date: 2020
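Of the techniques listed, Permutation Feature Importance is the simplest to sketch: shuffle one feature and measure how much the model's accuracy drops. The toy "black box" and data below are invented for illustration:

```python
import random

def permutation_importance(predict, X, y, col, rng):
    """Accuracy drop after shuffling one feature column: the larger the drop,
    the more the model relies on that feature."""
    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)
    base = accuracy(X)
    shuffled = [row[col] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
    return base - accuracy(X_perm)

rng = random.Random(0)
# Toy data: the label depends only on feature 0; feature 1 is pure noise.
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if row[0] > 0 else 0 for row in X]
model = lambda row: 1 if row[0] > 0 else 0  # a "black box" that uses feature 0
print(permutation_importance(model, X, y, 0, rng))  # large drop
print(permutation_importance(model, X, y, 1, rng))  # ≈ 0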
  6. By: Nicholas Moehle; Mykel J. Kochenderfer; Stephen Boyd; Andrew Ang
    Abstract: We describe an optimization-based tax-aware portfolio construction method that adds tax liability to a standard Markowitz-based portfolio construction approach that models expected return, risk, and transaction costs. Our method produces a trade list that specifies the number of shares to buy of each asset and the number of shares to sell from each tax lot held. To avoid wash sales (in which some realized capital losses are disallowed), we assume that we trade monthly, and cannot simultaneously buy and sell the same asset. The tax-aware portfolio construction problem is not convex, but it becomes convex when we specify, for each asset, whether we buy or sell it. It can be solved using standard mixed-integer convex optimization methods at the cost of very long solve times for some problem instances. We present a custom convex relaxation of the problem that borrows curvature from the risk model. This relaxation can provide a good approximation of the true tax liability, while greatly enhancing computational tractability. This method requires the solution of only two convex optimization problems: the first determines whether we buy or sell each asset, and the second generates the final trade list. This method is therefore extremely fast even in the worst case. In our numerical experiments, which are based on a realistic tax-loss harvesting scenario, our method almost always solves the nonconvex problem to optimality, and when it does not, it produces a trade list very close to optimal. Backtests show that the performance of our method is indistinguishable from that obtained using a globally optimal solution, but with significantly reduced computational effort.
    Date: 2020–08
  7. By: Wen Chen (CSIRO - Commonwealth Scientific and Industrial Research Organisation [Canberra]); Nicolas Langrené (CSIRO - Commonwealth Scientific and Industrial Research Organisation [Canberra])
    Abstract: In this paper, we develop a deep neural network approach to solve a lifetime expected mortality-weighted utility-based model for optimal consumption in the decumulation phase of a defined contribution pension system. We formulate this problem as a multi-period finite-horizon stochastic control problem and train a deep neural network policy representing consumption decisions. The optimal consumption policy is determined by personal information about the retiree such as age, wealth, risk aversion and bequest motive, as well as a series of economic and financial variables including inflation rates and asset returns jointly simulated from a proposed seven-factor economic scenario generator calibrated from market data. We use the Australian pension system as an example, with consideration of the government-funded means-tested Age Pension and other practical aspects such as fund management fees. The key findings from our numerical tests are as follows. First, our deep neural network optimal consumption policy, which adapts to changes in market conditions, outperforms deterministic drawdown rules proposed in the literature. Moreover, the out-of-sample outperformance ratios increase as the number of training iterations increases, eventually reaching outperformance on all testing scenarios after less than 10 minutes of training. Second, a sensitivity analysis is performed to reveal how risk aversion and bequest motives change the consumption over a retiree's lifetime under this utility framework. Our results show that stronger risk aversion generates a flatter consumption pattern; however, there is not much difference in consumption with or without bequest until age 103. Third, we provide the optimal consumption rate with different starting wealth balances. We observe that optimal consumption rates are not proportional to initial wealth due to the Age Pension payment. 
Fourth, with the same initial wealth balance and utility parameter settings, the optimal consumption level differs between males and females due to gender differences in mortality. Specifically, the optimal consumption level is slightly lower for females until age 84.
    Keywords: decumulation,retirement income,deep learning,stochastic control,economic scenario generator,defined-contribution pension,optimal consumption
    Date: 2020–07–31
  8. By: Karim Barigou (SAF - Laboratoire de Sciences Actuarielle et Financière - UCBL - Université Claude Bernard Lyon 1 - Université de Lyon); Lukasz Delong (Warsaw School of Economics - Institut of Econometrics)
    Abstract: This paper considers the pricing of equity-linked life insurance contracts with death and survival benefits in a general model with multiple stochastic risk factors: interest rate, equity, volatility, unsystematic and systematic mortality. We price the equity-linked contracts by assuming that the insurer hedges the risks to reduce the local variance of the net asset value process and requires a compensation for the non-hedgeable part of the liability in the form of an instantaneous standard deviation risk margin. The price can then be expressed as the solution of a system of non-linear partial differential equations. We reformulate the problem as a backward stochastic differential equation with jumps and solve it numerically by the use of efficient neural networks. Sensitivity analysis is performed with respect to initial parameters and an analysis of the accuracy of the approximation of the true price with our neural networks is provided.
    Keywords: Equity-linked contracts,Neural networks,Stochastic mortality,BSDEs with jumps,Hull-White stochastic interest rates,Heston model
    Date: 2020–07–16
  9. By: Hannes Mueller (Institut d'Analisi Economica (CSIC)); Christopher Rauh (Université de Montréal)
    Abstract: There is a rising interest in conflict prevention and this interest provides a strong motivation for better conflict forecasting. A key problem of conflict forecasting for prevention is that predicting the start of conflict in previously peaceful countries is extremely hard. To make progress on this hard problem, this project exploits both supervised and unsupervised machine learning. Specifically, the latent Dirichlet allocation (LDA) model is used for feature extraction from 3.8 million newspaper articles and these features are then used in a random forest model to predict conflict. We find that several features are negatively associated with the outbreak of conflict and these gain importance when predicting hard onsets. This is because the decision tree uses the text features in lower nodes, where they are evaluated conditionally on conflict history, which allows the random forest to adapt to the hard problem and provide useful forecasts for prevention.
    Date: 2019–04
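A toy sketch of the two-stage idea above: text is reduced to topic-share features, which then matter mainly for the "hard" cases without a conflict history. Keyword-count shares stand in for LDA here, and all topic names and keywords are invented:

```python
def topic_shares(text, topics):
    """Crude stand-in for LDA: the share of words falling in each topic's
    keyword list, normalised to sum to one over matched words."""
    words = text.lower().split()
    counts = [sum(w in kw for w in words) for kw in topics.values()]
    total = sum(counts) or 1
    return {name: c / total for name, c in zip(topics, counts)}

topics = {
    "economy": {"trade", "growth", "inflation", "jobs"},
    "tension": {"protest", "troops", "border", "unrest"},
}
doc = "unrest at the border as troops deploy while jobs and growth stall"
shares = topic_shares(doc, topics)
print(shares)

def predict_onset(shares, conflict_history):
    """Toy tree logic: text features matter mainly for countries without a
    recent conflict history -- the 'hard' cases the paper targets."""
    if conflict_history:
        return True                 # history dominates the easy cases
    return shares["tension"] > 0.5  # text decides the hard cases
```

In the paper, the second stage is a random forest rather than a hand-written rule, but the conditional role of the text features is the same.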
  10. By: Friberg, Richard
    Abstract: This paper develops a framework using Monte Carlo simulation to examine risk/return properties of intra-industry product portfolio composition and diversification. We use product-level data covering all Swedish sales of alcoholic beverages to describe the risk profiles of wholesalers and how they are affected by actual and hypothetical changes to product portfolios. Using a large number of counterfactual portfolios we quantify the diversification benefits of different product portfolio compositions. In this market the most important reductions in variability come from focusing on domestic products and from focusing on product categories that have low variability. The number of products also has a large effect in the simulations: moving from a portfolio of 10 products to one of 20 products cuts the standard deviation of cash flows in relation to mean cash flows by more than half. The concentration of import origins plays a minor quantitative role in risk/return profiles in this market.
    Keywords: Diversification; Enterprise risk management; Monte Carlo; Product portfolios; Risk-return relation
    Date: 2019–11
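A minimal sketch of the kind of Monte Carlo comparison described, under the simplifying assumption of independent, identically distributed product cash flows (which understates the correlation effects the paper measures; all parameters are invented):

```python
import random

def cash_flow_cv(n_products, n_sims=4000, mean=1.0, sd=0.5, seed=7):
    """Coefficient of variation (sd/mean) of total portfolio cash flow when
    the portfolio holds `n_products` independent products."""
    rng = random.Random(seed)
    totals = [sum(rng.gauss(mean, sd) for _ in range(n_products))
              for _ in range(n_sims)]
    m = sum(totals) / n_sims
    var = sum((t - m) ** 2 for t in totals) / (n_sims - 1)
    return var ** 0.5 / m

# Doubling the portfolio from 10 to 20 products shrinks relative variability.
print(cash_flow_cv(10), cash_flow_cv(20))
```

With independent products the reduction follows the 1/sqrt(n) rule; the larger-than-half reduction the paper reports reflects the composition effects of its real portfolios.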
  11. By: Karim Barigou (SAF); Lukasz Delong
    Abstract: This paper considers the pricing of equity-linked life insurance contracts with death and survival benefits in a general model with multiple stochastic risk factors: interest rate, equity, volatility, unsystematic and systematic mortality. We price the equity-linked contracts by assuming that the insurer hedges the risks to reduce the local variance of the net asset value process and requires a compensation for the non-hedgeable part of the liability in the form of an instantaneous standard deviation risk margin. The price can then be expressed as the solution of a system of non-linear partial differential equations. We reformulate the problem as a backward stochastic differential equation with jumps and solve it numerically by the use of efficient neural networks. Sensitivity analysis is performed with respect to initial parameters and an analysis of the accuracy of the approximation of the true price with our neural networks is provided.
    Date: 2020–07
  12. By: Rui Zhou; Daniel P. Palomar
    Abstract: The first moment and second central moment of the portfolio return, a.k.a. the mean and variance, have been widely employed to assess the expected profit and risk of a portfolio. Investors pursue a higher mean and lower variance when designing portfolios. The two moments can describe the distribution of the portfolio return well when it follows a Gaussian distribution. However, the real-world distribution of asset returns is usually asymmetric and heavy-tailed, far from Gaussian. The asymmetry and heavy-tailedness are characterized by the third and fourth central moments, i.e., skewness and kurtosis, respectively. Higher skewness and lower kurtosis are preferred to reduce the probability of extreme losses. However, incorporating high-order moments in the portfolio design is very difficult due to their non-convexity and the rapid growth of their computational cost with dimension. In this paper, we propose a very efficient and convergence-provable algorithmic framework based on the successive convex approximation (SCA) algorithm to solve high-order portfolios. The efficiency of the proposed framework is demonstrated by numerical experiments.
    Date: 2020–08
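The four moments discussed above can be estimated from a weighted return series as follows. This is a generic sample-moment computation with invented example data, not the paper's SCA algorithm:

```python
def portfolio_moments(weights, returns):
    """Sample mean, variance, skewness and excess kurtosis of the portfolio
    return series implied by `weights`, where `returns` is a list of
    per-period lists with one entry per asset."""
    port = [sum(w * r for w, r in zip(weights, period)) for period in returns]
    n = len(port)
    mean = sum(port) / n
    central = lambda k: sum((x - mean) ** k for x in port) / n
    var = central(2)
    skew = central(3) / var ** 1.5        # third standardised moment
    kurt = central(4) / var ** 2 - 3.0    # excess kurtosis (0 for Gaussian)
    return mean, var, skew, kurt

# Three periods, two assets (made-up returns).
rets = [[0.02, 0.01], [-0.03, 0.00], [0.04, -0.01]]
m, v, s, k = portfolio_moments([0.6, 0.4], rets)
print(m, v, s, k)
```

The optimization problem in the paper trades these four quantities off against each other; computing them is the easy part, the non-convexity of the skewness and kurtosis terms is what requires SCA.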
  13. By: Hossain, Md Mahbub; McKyer, E. Lisako J.; Ma, Ping
    Abstract: The coronavirus disease (COVID-19) pandemic has impacted mental health globally. It is essential to deploy advanced research methodologies that can use complex data to draw meaningful inferences facilitating mental health research and policymaking during this pandemic. Artificial intelligence (AI) technologies offer a wide range of opportunities to leverage advancements in data science in analyzing health records, behavioral data, social media content, and outcomes data on mental health. Several studies have reported the use of AI technologies such as support vector machines, neural networks, latent Dirichlet allocation, decision trees, and clustering to detect and treat depression, schizophrenia, Alzheimer’s disease, and other mental health problems. The application of such technologies in the context of COVID-19 is still under development, which calls for further deployment of AI technologies in mental health research in this pandemic using clinical and psychosocial data through technological partnerships and collaborations. Lastly, policy-level commitment and deployment of resources are needed to facilitate the use of robust AI technologies for assessing and addressing mental health problems during the COVID-19 pandemic.
    Date: 2020–06–23
  14. By: Jan Pablo Burgard; Patricia Dörr; Ralf Münnich
    Abstract: Innovations in statistical methodology are often accompanied by Monte-Carlo studies. In the context of survey statistics, two types of inference have to be considered. First, there is the classical randomization inference used in developments of statistical modelling. Second, survey data are typically gathered using random sampling schemes from a finite population; in this case, sampling inference under a finite population model drives statistical conclusions. For empirical analyses, in general, mainly survey data are available, so the question arises how best to conduct the simulation study accompanying the empirical research. In addition, economists and social scientists often apply statistical models to survey data where the statistical inference is based on the classical randomization approach derived from the model assumptions. This confounds classical randomization with sampling inference, and raises the question under which circumstances – if any – the sampling design can then be ignored. In both fields of research – official statistics and (micro-)econometrics – Monte-Carlo studies generally seek to deliver additional information on an estimator’s distribution. The two named inferences obviously impact distributional assumptions and, hence, must be distinguished in the Monte-Carlo set-up. Both the conclusions to be drawn and the comparability between research results therefore depend on the inferential assumptions and the correspondingly adapted simulation study. The present paper gives an overview of the different types of inference, and combinations thereof, that are possibly applicable to survey data. Additionally, further types of Monte-Carlo methods are elaborated to provide answers for mixed types of randomization in the survey context as well as under statistical modelling using survey data. The aim is to provide a common understanding of Monte-Carlo based studies using survey data, including a thorough discussion of the advantages and disadvantages of the different types and their appropriate evaluation.
    Keywords: Monte-Carlo simulation, survey sampling, randomization inference, model inference
    Date: 2020
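The design-based set-up discussed above, in which the finite population is held fixed and only the random sampling is repeated, can be sketched as follows (all population and sample sizes are illustrative assumptions):

```python
import random

def design_based_mc(n_reps=2000, pop_size=10000, sample_size=200, seed=3):
    """Design-based Monte-Carlo: the finite population is FIXED; only the
    sampling is repeated. Each replicate draws a simple random sample and
    records the sample mean as an estimate of the fixed population mean."""
    rng = random.Random(seed)
    population = [rng.gauss(50, 10) for _ in range(pop_size)]  # drawn ONCE
    true_mean = sum(population) / pop_size
    estimates = [sum(rng.sample(population, sample_size)) / sample_size
                 for _ in range(n_reps)]
    mc_mean = sum(estimates) / n_reps
    return true_mean, mc_mean

true_mean, mc_mean = design_based_mc()
print(abs(mc_mean - true_mean))  # small: the SRS sample mean is design-unbiased
```

A model-based Monte-Carlo would instead redraw the population itself in every replicate; confounding the two set-ups is exactly the pitfall the paper discusses.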
  15. By: Daniel Arribas-Bel (University of Liverpool); Miquel-Àngel Garcia-López (Universitat Autònoma de Barcelona & IEB); Elisabet Viladecans-Marsal (Universitat de Barcelona & IEB)
    Abstract: This paper proposes a novel methodology for delineating urban areas based on a machine learning algorithm that groups buildings within portions of space of sufficient density. To do so, we use the precise geolocation of all 12 million buildings in Spain. We exploit building heights to create a new dimension for urban areas, namely, the vertical land, which provides a more accurate measure of their size. To better understand their internal structure and to illustrate an additional use for our algorithm, we also identify employment centers within the delineated urban areas. We test the robustness of our method and compare our urban areas to other delineations obtained using administrative borders and commuting-based patterns. We show that: 1) our urban areas are more similar to the commuting-based delineations than the administrative boundaries but that they are more precisely measured; 2) when analyzing the urban areas’ size distribution, Zipf’s law appears to hold for their population, surface and vertical land; and 3) the impact of transportation improvements on the size of the urban areas is not underestimated.
    Keywords: Buildings, urban areas, city size, transportation, machine learning
    JEL: R12 R14 R2 R4
    Date: 2019
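A crude stand-in for the density-based grouping described above, using grid cells and flood fill rather than the authors' machine learning algorithm (cell size and density threshold are invented):

```python
from collections import deque

def delineate(points, cell=1.0, min_buildings=3):
    """Group building coordinates into 'urban areas': bin points into grid
    cells, keep cells of sufficient density, and merge adjacent dense cells
    via flood fill. Returns a list of areas (each a set of cells)."""
    counts = {}
    for x, y in points:
        key = (int(x // cell), int(y // cell))
        counts[key] = counts.get(key, 0) + 1
    dense = {c for c, n in counts.items() if n >= min_buildings}
    areas, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        area, queue = set(), deque([start])
        seen.add(start)
        while queue:
            cx, cy = queue.popleft()
            area.add((cx, cy))
            for nb in [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]:
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        areas.append(area)
    return areas

pts = [(0.1, 0.2), (0.4, 0.6), (0.8, 0.3),   # dense cell (0, 0)
       (1.2, 0.5), (1.6, 0.1), (1.9, 0.9),   # adjacent dense cell (1, 0)
       (5.0, 5.0)]                            # isolated building, not urban
print(len(delineate(pts)))  # 1 urban area spanning two cells
```

Weighting each building by its height would give the "vertical land" dimension the paper introduces.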
  16. By: Maximilian Andres (University of Potsdam); Lisa Bruttel (University of Potsdam); Jana Friedrichsen (HU Berlin, WZB Berlin Social Science Center, DIW Berlin)
    Abstract: Numerous studies investigate which sanctioning institutions prevent cartel formation but little is known as to how these sanctions work. We contribute to understanding the inner workings of cartels by studying experimentally the effect of sanctioning institutions on firms’ communication. Using machine learning to organize the chat communication into topics, we find that firms are significantly less likely to communicate explicitly about price fixing when sanctioning institutions are present. At the same time, average prices are lower when communication is less explicit. A mediation analysis suggests that sanctions are effective in hindering cartel formation not only because they introduce a risk of being fined but also by reducing the prevalence of explicit price communication.
    Keywords: cartel, collusion, communication, machine learning, experiment
    JEL: C92 D43 L41
    Date: 2020–07
  17. By: Christopher Rauh (Université de Montréal)
    Abstract: In this paper I present a methodology to provide uncertainty measures at the regional level in real time using the full bandwidth of news. In order to do so I download vast amounts of newspaper articles, summarize these into topics using unsupervised machine learning, and then show that the resulting topics foreshadow fluctuations in economic indicators. Given large regional disparities in economic performance and trends within countries, it is particularly important to have regional measures for a policymaker to tailor policy responses. I use a vector-autoregression model for the case of Canada, a large and diverse country, to show that the generated topics are significantly related to movements in economic performance indicators, inflation, and the unemployment rate at the national and provincial level. Evidence is provided that a composite index of the generated diverse topics can serve as a measure of uncertainty. Moreover, I show that some topics are general enough to have homogenous associations across provinces, while others are specific to fluctuations in certain regions.
    Keywords: Machine learning, Latent Dirichlet allocation, Newspaper text, Economic uncertainty, Topic model, Canada
    Date: 2019–08
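The simplest univariate analogue of the vector-autoregression step above is an AR(1) fitted by least squares; the synthetic "uncertainty index" below is invented for illustration:

```python
import random

def fit_ar1(series):
    """Least-squares fit of y_t = a + b * y_{t-1} + e_t, the one-variable
    analogue of the vector-autoregression used to link topics to indicators."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

# Synthetic uncertainty index with true persistence 0.5.
rng = random.Random(42)
y = [0.0]
for _ in range(5000):
    y.append(0.5 * y[-1] + rng.gauss(0, 1))
a, b = fit_ar1(y)
print(round(b, 2))  # close to 0.5
```

In the paper, the regressors are the topic shares alongside lagged economic indicators, and the fitted system is used to test whether topics foreshadow movements in those indicators.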
  18. By: Deshpande, Advait
    Abstract: This working paper provides an overview of the potential influence of machine learning and data science on economics as a field. The findings presented are drawn from highly cited research which was identified based on Google Scholar searches. For each of the articles reviewed, this working paper covers what is likely to change and what is likely to remain unchanged in economics due to the emergence and increasing influence of machine learning and data science methods.
    Date: 2020–04–30
  19. By: Ariel Lanza (Kellogg School of Management, Northwestern University (PhD student)); Enrico Bernardini (Banca d'Italia); Ivan Faiella (Banca d'Italia)
    Abstract: This work proposes a novel approach for overcoming the current inconsistencies in ESG scores by using Machine Learning (ML) techniques to identify those indicators that better contribute to the construction of efficient portfolios. ML can achieve this result without needing a model-based methodology, typical of the modern portfolio theory approaches. The ESG indicators identified by our approach show a discriminatory power that also holds after accounting for the contribution of the style factors identified by the Fama-French five-factor model and the macroeconomic factors of the BIRR model. The novelty of the paper is threefold: a) the large array of ESG metrics analysed, b) the model-free methodology ensured by ML and c) the disentangling of the contribution of ESG-specific metrics to the portfolio performance from both the traditional style and macroeconomic factors. According to our results, more information content may be extracted from the available raw ESG data for portfolio construction purposes and half of the ESG indicators identified using our approach are environmental. Among the environmental indicators, some refer to companies' exposure and ability to manage climate change risk, namely the transition risk.
    Keywords: portfolio construction, factor models, sustainable investment, ESG, machine learning
    JEL: C63 G11 Q56
    Date: 2020–06
  20. By: Jake Anders (Centre for Education Policy and Equalising Opportunities, UCL Institute of Education, University College London); Catherine Dilnot (Oxford Brookes Business School); Lindsey Macmillan (Centre for Education Policy and Equalising Opportunities, UCL Institute of Education, University College London); Gill Wyness (Centre for Education Policy and Equalising Opportunities, UCL Institute of Education, University College London)
    Abstract: The Covid-19 pandemic has led to unprecedented disruption of England's education system, including the cancellation of all formal examinations. Instead of sitting exams, the class of 2020 will be assigned "calculated grades" based on predictions by their teachers. Teacher predictions of pupil grades are a common feature of the English education system, with such predictions forming the basis of university applications in normal years; but previous research has shown these predictions are highly inaccurate, creating concern for teachers, pupils and parents. In this paper, we ask whether it is possible to improve on teachers' predictions, using detailed measures of pupils' past performance and non-linear and machine learning approaches. Despite lacking teachers' informal knowledge, we can make modest improvements on the accuracy of teacher predictions with our models, with around 1 in 4 pupils being correctly predicted. We show that predictions are improved where we have information on 'related' GCSEs. We also find heterogeneity in the ability to predict successfully, according to student achievement, school type and subject of study. Notably, high-achieving non-selective state school pupils are more likely to be under-predicted compared to their selective state and private school counterparts. Overall, the low rates of prediction, regardless of the approach taken, raise the question as to why predicted grades form such a crucial part of our education system.
    Date: 2020–08

General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.