nep-big New Economics Papers
on Big Data
Issue of 2021‒12‒06
29 papers chosen by
Tom Coupé
University of Canterbury

  1. What does machine learning say about the drivers of inflation? By Emanuel Kohlscheen
  2. Predicting Mortality from Credit Reports By Giacomo De Giorgi; Matthew Harding; Gabriel Vasconcelos
  3. Deep Learning Market Microstructure: Dual-Stage Attention-Based Recurrent Neural Networks By Chaeshick Chung; Sukjin Park
  4. Modelling Input Energy Used in Wheat Production in India Using Artificial Neural Network By Kaur, Karman; Mehar, Mamta; Prasad, Narayan
  5. License to Spill: How Do We Discuss Spillovers in Article IV Staff Reports By Mr. Mico Mrkaic; Borislava Mircheva; Jelle Barkema; Yuanchen Yang
  6. Credit Risk Database: Credit Scoring Models for Thai SMEs By Bhumjai Tangsawasdirat; Suranan Tanpoonkiat; Burasakorn Tangsatchanan
  7. Nowcasting euro area GDP with news sentiment: a tale of two crises By Saiz, Lorena; Ashwin, Julian; Kalamara, Eleni
  8. Stock Portfolio Optimization Using a Deep Learning LSTM Model By Jaydip Sen; Abhishek Dutta; Sidra Mehtab
  9. Predicting Fiscal Crises: A Machine Learning Approach By Klaus-Peter Hellwig
  10. Explainable Deep Reinforcement Learning for Portfolio Management: An Empirical Approach By Mao Guan; Xiao-Yang Liu
  11. Do search engines increase concentration in media markets? By Joan Calzada; Nestor Duch-Brown; Ricard Gil
  12. The Impact of Gray-Listing on Capital Flows: An Analysis Using Machine Learning By Simon Paetzold; Mizuho Kida
  13. Technological Progress, Artificial Intelligence, and Inclusive Growth By Mr. Martin Schindler; Mr. Anton Korinek; Joseph Stiglitz
  14. How Have IMF Priorities Evolved? A Text Mining Approach By Leandro Medina; Mr. Andrea Gamba; Gareth Anderson; Paolo Galang; Tianxiao Zheng
  15. American Hate Crime Trends Prediction with Event Extraction By Songqiao Han; Hailiang Huang; Jiangwei Liu; Shengsheng Xiao
  16. On the Limits of Design: What Are the Conceptual Constraints on Designing Artificial Intelligence for Social Good? By Jakob Mokander
  17. Labour-saving automation and occupational exposure: a text-similarity measure By Montobbio, Fabio; Staccioli, Jacopo; Virgillito, Maria Enrica; Vivarelli, Marco
  18. Exposure of occupations to technologies of the fourth industrial revolution By Benjamin Meindl; Morgan R. Frank; Joana Mendonça
  19. Forecasting the Artificial Intelligence Index Returns: A Hybrid Approach By Yue-Jun Zhang; Han Zhang; Rangan Gupta
  21. Prescriptive selection of machine learning hyperparameters with applications in power markets: retailer's optimal trading By Corredera, Alberto; Ruiz Mora, Carlos
  22. Modelling the transition to a low-carbon energy supply By Alexander Kell
  23. A machine learning dynamic switching approach to forecasting when there are structural breaks By Jeronymo Marcondes Pinto; Jennifer L. Castle
  24. FinRL-Podracer: High Performance and Scalable Deep Reinforcement Learning for Quantitative Finance By Zechu Li; Xiao-Yang Liu; Jiahao Zheng; Zhaoran Wang; Anwar Walid; Jian Guo
  26. How digital technology affects working conditions in globally fragmented production chains: evidence from Europe By Aleksandra Parteka; Joanna Wolszczak-Derlacz; Dagmara Nikulin
  27. Quantifying Land Use Regulation and its Determinants - Ease of Residential Development across Swiss Municipalities By Simon Büchler; Maximilian v. Ehrlich
  28. Credit growth, the yield curve and financial crisis prediction: evidence from a machine learning approach By Bluwstein, Kristina; Buckmann, Marcus; Joseph, Andreas; Kapadia, Sujit; Şimşek, Özgür
  29. Impact of COVID-19: Nowcasting and Big Data to Track Economic Activity in Sub-Saharan Africa By Reda Cherif; Karl Walentin; Brandon Buell; Carissa Chen; Jiawen Tang; Nils Wendt

  1. By: Emanuel Kohlscheen
    Abstract: This paper examines the drivers of CPI inflation through the lens of a simple, but computationally intensive machine learning technique. More specifically, it predicts inflation across 20 advanced countries between 2000 and 2021, relying on 1,000 regression trees that are constructed based on six key macroeconomic variables. This agnostic, purely data-driven method delivers (relatively) good outcome prediction performance. Out-of-sample root mean square errors (RMSE) systematically beat even the in-sample benchmark econometric models, with a 28% RMSE reduction relative to a naïve AR(1) model and an 8% RMSE reduction relative to OLS. Overall, the results highlight the role of expectations for inflation outcomes in advanced economies, even though their importance appears to have declined somewhat during the last 10 years.
    Keywords: expectations, forecast, inflation, machine learning, oil price, output gap, Phillips curve
    JEL: E27 E30 E31 E37 E52 F41
    Date: 2021–11
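    As a rough illustration of how RMSE reductions such as those reported above can be read, the sketch below computes the RMSE of a model and of a benchmark and the percentage reduction between them. The data and the function names (rmse, rmse_reduction) are hypothetical and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_reduction(model_rmse, benchmark_rmse):
    """Percentage by which the model's RMSE falls below the benchmark's."""
    return 100.0 * (1.0 - model_rmse / benchmark_rmse)


# Hypothetical inflation outcomes and forecasts (percent), for illustration only.
actual = [2.0, 1.5, 3.0, 2.5]
benchmark_forecast = [1.0, 2.5, 2.0, 3.5]  # stand-in for a naive benchmark
model_forecast = [1.5, 2.0, 2.5, 3.0]      # stand-in for a tree-based forecast

benchmark_rmse = rmse(actual, benchmark_forecast)
model_rmse = rmse(actual, model_forecast)
print(rmse_reduction(model_rmse, benchmark_rmse))
```

    Under this definition, a 28% reduction means the model's RMSE is 0.72 times the benchmark's RMSE.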
  2. By: Giacomo De Giorgi; Matthew Harding; Gabriel Vasconcelos
    Abstract: Data on hundreds of variables related to individual consumer finance behavior (such as credit card and loan activity) are routinely collected in many countries and play an important role in lending decisions. We postulate that the detailed nature of these data may be used to predict outcomes in seemingly unrelated domains such as individual health. We build a series of machine learning models to demonstrate that credit report data can be used to predict individual mortality. Variable groups related to credit cards and various loans, mostly unsecured loans, are shown to carry significant predictive power. Lags of these variables are also significant, indicating that dynamics also matter. Improved mortality predictions based on consumer finance data can have important economic implications in insurance markets but may also raise privacy concerns.
    Date: 2021–11
  3. By: Chaeshick Chung (Department of Economics, Sogang University); Sukjin Park (Department of Economics, Sogang University)
    Abstract: This paper applies the Dual-Stage Attention-Based Recurrent Neural Network (DA-RNN) model to predict future price movements using microstructure variables. The biggest feature of the DA-RNN model is that it adaptively selects relevant variables according to market conditions. We analyze whether microstructure variables have predictive power for future price movements, and what factors influence this predictive power. We find that microstructure variables possess predictive power against the direction of future price movements. This predictive power depends on how many uninformed traders exist in the market. Moreover, the importance of microstructure variables is negatively related to market liquidity. Thus, while microstructure variables are more important in severe market conditions with high transaction costs, the effect of trading on price dynamics depends on market structure.
    Keywords: Attention Mechanism, Deep Learning, Machine Learning, Market Microstructure, Informed Trading
    JEL: G10 G14 G17
    Date: 2021
  4. By: Kaur, Karman; Mehar, Mamta; Prasad, Narayan
    Keywords: Crop Production/Industries
    Date: 2021–08
  5. By: Mr. Mico Mrkaic; Borislava Mircheva; Jelle Barkema; Yuanchen Yang
    Abstract: This paper dives into the Fund’s historical coverage of cross-border spillovers in its surveillance. We use a state-of-the-art deep learning model to analyze the discussion of spillovers in all IMF Article IV staff reports between 2010 and 2019. We find that overall, while the discussion of spillovers decreased over time, it was pronounced in the staff reports of some systemically important economies and during periods of global spillover events. Spillover discussions were more prominent in staff reports covering advanced and emerging market economies, possibly reflecting their role as sources of global spillovers. The coverage of spillovers was higher in the context of the real, financial, and external sectors. Also, spillovers are more likely to be discussed in the Article IV staff reports of countries with larger economies, higher trade and capital account openness, and lower inflation.
    Keywords: spillover discussion; model performance; discussion of spillover; General spillover pattern; spillover event; IMF staff calculation; Spillovers; Probit models; Machine learning; Capital account; Inflation; Global
    Date: 2021–05–07
  6. By: Bhumjai Tangsawasdirat; Suranan Tanpoonkiat; Burasakorn Tangsatchanan
    Abstract: This paper aims to provide an introduction to the Credit Risk Database (CRD), a collection of financial and non-financial data for SME credit risk analysis, for Thailand. Aligning with the Bank of Thailand (BOT)’s strategic plan to develop the data ecosystem to help reduce the asymmetric information problem in the financial sector, CRD is an initiative to effectively utilize data already collected from financial institutions as a part of the BOT’s supervisory mandate. Our first use case is intended to help improve financial access for SMEs, by building credit risk models that can work as a complementary tool to help financial institutions and the Credit Guarantee Corporation assess SMEs’ financial prospects in parallel with internal credit scores. Focusing on SMEs who are new borrowers, we use only SMEs’ financial and non-financial data as our explanatory variables while disregarding past default-related data such as loan repayment behavior. Credit risk models of various methodologies are then built from CRD data to allow financial institutions to conduct effective risk-based pricing, offering different sets of interest rates and loan terms. Statistical methods (i.e. logit regression and credit scoring) and machine learning methods (i.e. decision tree and random forest) are used to build credit risk models that can help quantify the SME’s one-year forward probability of default. Out-of-sample prediction results indicate that the statistical and machine learning models yield reasonably accurate probability of default predictions, with the maximum Area under the ROC Curve (AUC) at approximately 70-80%. The model with the best performance, as compared by the maximum AUC, is the random forest model. However, the credit scoring model that is developed from logistic regression of weight-of-evidence variables achieves the second-best AUC and is more user-friendly for credit loan providers to interpret and apply in practice.
    Keywords: Credit Risk Database; Credit Score; Credit Risk Assessment; Credit Scoring Model; Thai SMEs
    JEL: C52 C53 C55 D81 G21 G32
    Date: 2021–11
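    For readers unfamiliar with the AUC metric cited above, the sketch below computes it from first principles as the share of (defaulter, non-defaulter) pairs in which the defaulter receives the higher predicted probability of default. The labels and scores are made up for illustration; this is a generic definition of AUC, not the authors' evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen defaulter (label 1)
    gets a higher score than a randomly chosen non-defaulter (label 0),
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical default indicators and predicted default probabilities.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.5, 0.1]
print(auc(labels, scores))
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values of 0.7 to 0.8 indicate that defaulters are ranked above non-defaulters in roughly 70-80% of such pairs.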
  7. By: Saiz, Lorena; Ashwin, Julian; Kalamara, Eleni
    Abstract: This paper shows that newspaper articles contain timely economic signals that can materially improve nowcasts of real GDP growth for the euro area. Our text data is drawn from fifteen popular European newspapers that collectively represent the four largest euro area economies, and are machine translated into English. Daily sentiment metrics are created from these news articles and we assess their value for nowcasting. By comparing to competitive and rigorous benchmarks, we find that newspaper text is helpful in nowcasting GDP growth, especially in the first half of the quarter when other lower-frequency soft indicators are not available. The choice of the sentiment measure matters when tracking economic shocks such as the Great Recession and the Great Lockdown. Non-linear machine learning models can help capture extreme movements in growth, but require sufficient training data to be effective, so they become more useful later in our sample.
    Keywords: business cycles, COVID-19, forecasting, machine learning, text analysis
    JEL: C43 C45 C55 C82 E37
    Date: 2021–11
  8. By: Jaydip Sen; Abhishek Dutta; Sidra Mehtab
    Abstract: Predicting future stock prices and their movement patterns is a complex problem. Hence, building a portfolio of capital assets using the predicted prices to optimize the trade-off between its return and risk is an even more difficult task. This work analyzes the time series of the historical prices of the top five stocks from nine different sectors of the Indian stock market from January 1, 2016, to December 31, 2020. Optimum portfolios are built for each of these sectors. For predicting future stock prices, a long short-term memory (LSTM) model is also designed and fine-tuned. Five months after portfolio construction, the actual and the predicted returns and risks of each portfolio are computed. The predicted and the actual returns of each portfolio are found to be high, indicating the high precision of the LSTM model.
    Date: 2021–11
  9. By: Klaus-Peter Hellwig
    Abstract: In this paper I assess the ability of econometric and machine learning techniques to predict fiscal crises out of sample. I show that the econometric approaches used in many policy applications cannot outperform a simple heuristic rule of thumb. Machine learning techniques (elastic net, random forest, gradient boosted trees) deliver significant improvements in accuracy. Performance of machine learning techniques improves further, particularly for developing countries, when I expand the set of potential predictors and make use of algorithmic selection techniques instead of relying on a small set of variables deemed important by the literature. There is considerable agreement across learning algorithms in the set of selected predictors: Results confirm the importance of external sector stock and flow variables found in the literature but also point to demographics and the quality of governance as important predictors of fiscal crises. Fiscal variables appear to have less predictive value, and public debt matters only to the extent that it is owed to external creditors.
    Date: 2021–05–27
  10. By: Mao Guan; Xiao-Yang Liu
    Abstract: Deep reinforcement learning (DRL) has been widely studied in the portfolio management task. However, it is challenging to understand a DRL-based trading strategy because of the black-box nature of deep neural networks. In this paper, we propose an empirical approach to explain the strategies of DRL agents for the portfolio management task. First, we use a linear model in hindsight as the reference model, which finds the best portfolio weights by assuming that actual stock returns are known in advance. In particular, we use the coefficients of a linear model in hindsight as the reference feature weights. Secondly, for DRL agents, we use integrated gradients to define the feature weights, which are the coefficients between reward and features under a linear regression model. Thirdly, we study the prediction power in two cases, single-step prediction and multi-step prediction. In particular, we quantify the prediction power by calculating the linear correlations between the feature weights of a DRL agent and the reference feature weights, and similarly for machine learning methods. Finally, we evaluate a portfolio management task on Dow Jones 30 constituent stocks from 01/01/2009 to 09/01/2021. Our approach empirically reveals that a DRL agent exhibits a stronger multi-step prediction power than machine learning methods.
    Date: 2021–11
  11. By: Joan Calzada (Universitat de Barcelona); Nestor Duch-Brown (Joint Research Centre); Ricard Gil (Smith School of Business, Queen’s University)
    Abstract: Search engines are one of the main channels to access news content of traditional newspapers. In the European Union, organic search traffic from Google accounts for 35% of news outlets’ visits. Yet, the effects of Google Search on market competition and information diversity are ambiguous, as the firm indexes news outlets considering both domain authority and information accuracy. Using detailed daily traffic data for 606 news outlets from 15 European countries, we assess the effect of Google Search’s indexation on search visits. Our identification strategy exploits nine core algorithm updates rolled out by Google between 2018 and 2020 in order to achieve exogenous variation in news outlets’ indexation. Several conclusions follow from our estimations. First, Google core updates overall reduce the number of keywords that news outlets have in top positions in search results. Second, keywords ranked in top search position have a positive effect on news outlets’ visits. Third, our results are robust when we focus the analysis on different types of news outlets, but are less conclusive when we consider national markets separately. Our paper also analyzes the effects of Google core updates on media market concentration. We find that the three “big” core updates identified in this period reduced market concentration by 1%, but this effect was mostly compensated by the rest of the updates.
    Keywords: Media market, Google Search, Europe, core algorithm updates.
    JEL: L1 L5
    Date: 2021
  12. By: Simon Paetzold; Mizuho Kida
    Abstract: The Financial Action Task Force’s gray list publicly identifies countries with strategic deficiencies in their AML/CFT regimes (i.e., in their policies to prevent money laundering and the financing of terrorism). How much gray-listing affects a country’s capital flows is of interest to policy makers, investors, and the Fund. This paper estimates the magnitude of the effect using an inferential machine learning technique. It finds that gray-listing results in a large and statistically significant reduction in capital inflows.
    Keywords: capital flows, AML/CFT, gray list, machine learning, emerging market economies; inferential machine learning technique; gray-listing affect; analysis using machine learning; gray list; coefficient estimate; Capital flows; Capital inflows; Anti-money laundering and combating the financing of terrorism (AML/CFT); Machine learning; Foreign direct investment; Global
    Date: 2021–05–27
  13. By: Mr. Martin Schindler; Mr. Anton Korinek; Joseph Stiglitz
    Abstract: Advances in artificial intelligence and automation have the potential to be labor-saving and to increase inequality and poverty around the globe. They also give rise to winner-takes-all dynamics that advantage highly skilled individuals and countries that are at the forefront of technological progress. We analyze the economic forces behind these developments and delineate domestic economic policies to mitigate the adverse effects while leveraging the potential gains from technological advances. We also propose reforms to the global system of governance that make the benefits of advances in artificial intelligence more inclusive.
    Keywords: C. putting AI; competition Policy; employment trend; C. intellectual property right; export-led growth; Technological innovation; Artificial intelligence; Income; Global; East Asia
    Date: 2021–06–11
  14. By: Leandro Medina; Mr. Andrea Gamba; Gareth Anderson; Paolo Galang; Tianxiao Zheng
    Abstract: This paper assesses how priorities of the IMF’s membership have evolved over the past two decades, by using text mining techniques on a unique dataset combining IMFC communiqués and constituency statements. Our results reveal significant variation in priorities across time and constituencies. Statements can be characterized by the weight which they place on three key priorities: (i) growth; (ii) debt and development; and (iii) crisis management and quota reform. Sentiment analysis techniques also show that addressing climate change is a topic which is viewed positively by an increasing number of constituencies.
    Keywords: IMFC communiqué; constituency statement; IMF priority; constituency cluster; sentiment analysis technique; Climate change; Mining sector; Global financial crisis of 2008-2009; Crisis management; Global
    Date: 2021–06–04
  15. By: Songqiao Han; Hailiang Huang; Jiangwei Liu; Shengsheng Xiao
    Abstract: Social media platforms may provide potential space for discourses that contain hate speech, and even worse, can act as a propagation mechanism for hate crimes. The FBI's Uniform Crime Reporting (UCR) Program collects hate crime data and releases a statistical report yearly. These statistics provide information for determining national hate crime trends. The statistics can also provide valuable holistic and strategic insight for law enforcement agencies or help lawmakers justify specific legislation. However, the reports are mostly released the following year and lag behind many immediate needs. Recent research mainly focuses on hate speech detection in social media text or empirical studies on the impact of a confirmed crime. This paper proposes a framework that first utilizes text mining techniques to extract hate crime events from New York Times news, then uses the results to facilitate predicting American national-level and state-level hate crime trends. Experimental results show that our method can significantly enhance the prediction performance compared with time series or regression methods without event-related factors. Our framework broadens the methods of national-level and state-level hate crime trends prediction.
    Date: 2021–11
  16. By: Jakob Mokander
    Abstract: Artificial intelligence (AI) can bring substantial benefits to society by helping to reduce costs, increase efficiency and enable new solutions to complex problems. Using Floridi's notion of how to design the 'infosphere' as a starting point, in this chapter I consider the question: what are the limits of design, i.e. what are the conceptual constraints on designing AI for social good? The main argument of this chapter is that while design is a useful conceptual tool to shape technologies and societies, collective efforts towards designing future societies are constrained by both internal and external factors. Internal constraints on design are discussed by evoking Hardin's thought experiment regarding 'the Tragedy of the Commons'. Further, Hayek's classical distinction between 'cosmos' and 'taxis' is used to demarcate external constraints on design. Finally, five design principles are presented which are aimed at helping policymakers manage the internal and external constraints on design. A successful approach to designing future societies needs to account for the emergent properties of complex systems by allowing space for serendipity and socio-technological coevolution.
    Date: 2021–11
  17. By: Montobbio, Fabio; Staccioli, Jacopo; Virgillito, Maria Enrica; Vivarelli, Marco
    Abstract: This paper represents one of the first attempts at building a direct measure of occupational exposure to robotic labour-saving technologies. After identifying robotic and labour-saving (LS) robotic patents retrieved by Montobbio et al. (2022), the underlying 4-digit CPC definitions are employed in order to detect functions and operations performed by technological artefacts which are more directed to substitute the labour input. This measure yields fine-grained information on tasks and occupations according to their similarity ranking. Occupational exposure by wage and employment dynamics in the United States is then studied, complemented by investigating industry and geographical penetration rates.
    Keywords: Labour-Saving Technology, Natural Language Processes, Labour Markets, Technological Unemployment
    JEL: O33 J24
    Date: 2021
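    To give a sense of what a text-similarity ranking involves, the sketch below ranks task descriptions by bag-of-words cosine similarity to a technology description. This is a generic textbook measure chosen for illustration; the abstract does not specify the authors' similarity measure, and all texts and names here are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical technology definition and task descriptions.
technology = "gripping and moving workpieces with a robotic arm"
tasks = [
    "moving workpieces between machines",
    "preparing financial statements",
    "gripping and positioning workpieces",
]

# Rank tasks from most to least similar to the technology text.
ranking = sorted(tasks, key=lambda t: cosine_similarity(technology, t), reverse=True)
print(ranking)
```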
  18. By: Benjamin Meindl; Morgan R. Frank; Joana Mendonça
    Abstract: The fourth industrial revolution (4IR) is likely to have a substantial impact on the economy. Companies need to build up capabilities to implement new technologies, and automation may make some occupations obsolete. However, where, when, and how the change will happen remain to be determined. Robust empirical indicators of technological progress linked to occupations can help to illuminate this change. With this aim, we provide such an indicator based on patent data. Using natural language processing, we calculate patent exposure scores for more than 900 occupations, which represent the technological progress related to them. To provide a lens on the impact of the 4IR, we differentiate between traditional and 4IR patent exposure. Our method differs from previous approaches in that it both accounts for the diversity of task-level patent exposures within an occupation and reflects work activities more accurately. We find that exposure to 4IR patents differs from traditional patent exposure. Manual tasks, and accordingly occupations such as construction and production, are exposed mainly to traditional (non-4IR) patents but have low exposure to 4IR patents. The analysis suggests that 4IR technologies may have a negative impact on job growth; this impact appears 10 to 20 years after patent filing. Further, we compared the 4IR exposure to other automation and AI exposure scores. Whereas many measures refer to theoretical automation potential, our patent-based indicator reflects actual technology diffusion. Our work not only allows analyses of the impact of 4IR technologies as a whole, but also provides exposure scores for more than 300 technology fields, such as AI and smart office technologies. Finally, the work provides a general mapping of patents to tasks and occupations, which enables future researchers to construct individual exposure measures.
    Date: 2021–10
  19. By: Yue-Jun Zhang (Business School, Hunan University, Changsha 410082, China; Center for Resource and Environmental Management, Hunan University, Changsha 410082, China); Han Zhang (Business School, Hunan University, Changsha 410082, China; Center for Resource and Environmental Management, Hunan University, Changsha 410082, China); Rangan Gupta (Department of Economics, University of Pretoria, Private Bag X20, Hatfield 0028, South Africa)
    Abstract: Forecasting the artificial intelligence index returns is of great significance for financial market stability and the development of the artificial intelligence industry. To provide investors with a more reliable reference for artificial intelligence index investment, this paper selects the Nasdaq CTA Artificial Intelligence and Robotics (AI) Index as the research target, and proposes novel hybrid methods to forecast the AI index returns by considering their nonlinear and time-varying characteristics. Specifically, this paper uses the ensemble empirical mode decomposition (EEMD) method to decompose the AI index returns, and combines the least square support vector machine approach together with the particle swarm optimization (PSO-LSSVM) method and the generalized autoregressive conditional heteroskedasticity (GARCH) model to construct novel hybrid forecasting methods. The empirical results indicate that: first, the decomposition and integration models usually produce higher forecasting accuracy than the single forecasting models, due to the complicated features of the non-decomposed data. Second, the newly proposed hybrid forecasting method (i.e., the EEMD-PSO-LSSVM-GARCH model), which combines the advantages of traditional econometric models and machine learning techniques, can yield the optimal forecasting performance for the AI index returns.
    Keywords: AI index return forecasting, PSO-LSSVM model, GARCH model, Decomposition and integration model, Combination model
    JEL: Q43 G15 E37
    Date: 2021–11
  20. By: Anton Angelgardt (National Research University Higher School of Economics); Elena S. Gorbunova (National Research University Higher School of Economics); Maria Chumakova (National Research University Higher School of Economics)
    Abstract: The current state of technology means people find themselves interacting with and having to trust artificial intelligent agents. However, despite the considerable history of trust studies, there is no agreement on what trust is and what this construct consists of. The study elaborates on the construct of trust in artificial intelligent agents and develops a questionnaire to assess this construct. Reliability, validity, internal consistency, and other essential statistical parameters of the scale are examined. In addition, difficulty and discrimination coefficients are analysed to measure their properties. Confirmatory factor analysis is used to verify the theoretical structure of the developed construct on empirical data. As a result, a conclusion about the applicability of the developed scale is drawn.
    Keywords: trust, artificial intelligence, questionnaire development.
    JEL: Z
    Date: 2021
  21. By: Corredera, Alberto; Ruiz Mora, Carlos
    Abstract: We present a data-driven framework for optimal scenario selection in stochastic optimization with applications in power markets. The proposed methodology relies on the existence of auxiliary information and the use of machine learning techniques to narrow the set of possible realizations (scenarios) of the variables of interest. In particular, we implement a novel validation algorithm that allows optimizing each machine learning hyperparameter to further improve the prescriptive power of the resulting set of scenarios. Supervised machine learning techniques are examined, including kNN and decision trees, and the validation process is adapted to work with time-dependent datasets. Moreover, we extend the proposed methodology to work with unsupervised techniques with promising results. We test the proposed methodology in a realistic power market application: optimal trading strategy in forward and spot markets for an electricity retailer under uncertain spot prices. Results indicate that the retailer can greatly benefit from the proposed data-driven methodology and improve its market performance. Moreover, we perform an extensive set of numerical simulations to analyze under which conditions the best machine learning hyperparameters, in terms of prescriptive performance, differ from those that provide the best predictive accuracy.
    Keywords: OR in energy; Data-Driven; Electricity Retailer; Hyperparameter Selection; Machine Learning
    Date: 2021–11–25
  22. By: Alexander Kell
    Abstract: A transition to a low-carbon electricity supply is crucial to limit the impacts of climate change. Reducing carbon emissions could help prevent the world from reaching a tipping point, where runaway emissions are likely. Runaway emissions could lead to extremes in weather conditions around the world, especially in problematic regions unable to cope with these conditions. However, the movement to a low-carbon energy supply cannot happen instantaneously due to the existing fossil-fuel infrastructure and the requirement to maintain a reliable energy supply. A low-carbon transition is therefore required; however, the decisions various stakeholders should make over the coming decades to reduce these carbon emissions are not obvious. This is due to many long-term uncertainties, such as electricity, fuel and generation costs, human behaviour and the size of electricity demand. A well-choreographed low-carbon transition is, therefore, required between all of the heterogeneous actors in the system, as opposed to changing the behaviour of a single, centralised actor. The objective of this thesis is to create a novel, open-source agent-based model to better understand the manner in which the whole electricity market reacts to different factors using state-of-the-art machine learning and artificial intelligence methods. In contrast to other works, this thesis looks at both the long-term and short-term impact that different behaviours have on the electricity market by using these state-of-the-art methods.
    Date: 2021–09
  23. By: Jeronymo Marcondes Pinto; Jennifer L. Castle
    Abstract: Forecasting economic indicators is an important task for analysts. However, many indicators suffer from structural breaks leading to forecast failure. Methods that are robust following a structural break have been proposed in the literature but they come at a cost: an increase in forecast error variance. We propose a method to select between a set of robust and non-robust forecasting models. Our method uses time-series clustering to identify possible structural breaks in a time series, and then switches between forecasting models depending on the series dynamics. We perform a rigorous empirical evaluation with 400 simulated series with an artificial structural break and with real economic data series: Industrial Production and Consumer Prices for all Western European countries available from the OECD database. Our results show that the proposed method statistically outperforms benchmarks in forecast accuracy in most scenarios, particularly at short horizons.
    Keywords: Machine Learning, Forecasting, Structural Breaks, Model Selection, Cluster Analysis
    Date: 2021–10–13
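    The switching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: a simple mean-shift check stands in for their time-series clustering, and the function names, the window of 8 observations and the threshold of 3 standard deviations are all assumptions made for illustration.

```python
from statistics import mean, stdev


def break_detected(series, window=8, threshold=3.0):
    """Flag a possible level shift (stand-in for the paper's clustering step).

    Compares the mean of the last `window` observations with the mean of the
    earlier history, measured in units of the history's standard deviation.
    """
    history, recent = series[:-window], series[-window:]
    if len(history) < 2:
        return False
    spread = stdev(history) or 1e-12  # guard against a zero spread
    return abs(mean(recent) - mean(history)) / spread > threshold


def switching_forecast(series, window=8, threshold=3.0):
    """One-step-ahead forecast that switches between two simple models.

    Robust choice after a suspected break: the last observation (random walk).
    Non-robust choice otherwise: the full-sample mean.
    """
    if break_detected(series, window, threshold):
        return series[-1]
    return mean(series)
```

    The random walk adapts immediately after a break but has higher forecast error variance in stable periods, which is the trade-off the abstract describes.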
  24. By: Zechu Li; Xiao-Yang Liu; Jiahao Zheng; Zhaoran Wang; Anwar Walid; Jian Guo
    Abstract: Machine learning techniques are playing more and more important roles in finance market investment. However, finance quantitative modeling with conventional supervised learning approaches has a number of limitations. The development of deep reinforcement learning techniques is partially addressing these issues. Unfortunately, the steep learning curve and the difficulty in quick modeling and agile development are impeding finance researchers from using deep reinforcement learning in quantitative trading. In this paper, we propose an RLOps in finance paradigm and present a FinRL-Podracer framework to accelerate the development pipeline of deep reinforcement learning (DRL)-driven trading strategy and to improve both trading performance and training efficiency. FinRL-Podracer is a cloud solution that features high performance and high scalability and promises continuous training, continuous integration, and continuous delivery of DRL-driven trading strategies, facilitating a rapid transformation from algorithmic innovations into a profitable trading strategy. First, we propose a generational evolution mechanism with an ensemble strategy to improve the trading performance of a DRL agent, and schedule the training of a DRL algorithm onto a GPU cloud via multi-level mapping. Then, we carry out the training of DRL components with high-performance optimizations on GPUs. Finally, we evaluate the FinRL-Podracer framework for a stock trend prediction task on an NVIDIA DGX SuperPOD cloud. FinRL-Podracer outperforms three popular DRL libraries Ray RLlib, Stable Baseline 3 and FinRL, i.e., 12% to 35% improvements in annual return, 0.1 to 0.6 improvements in Sharpe ratio and 3 to 7 times speed-up in training time. We show the high scalability by training a trading agent in 10 minutes with 80 A100 GPUs, on NASDAQ-100 constituent stocks with minute-level data over 10 years.
    Date: 2021–11
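    The two performance metrics quoted in this abstract, annual return and Sharpe ratio, can be computed as in the following sketch. The abstract does not state how the authors annualise, so the 252-day trading year, the zero default risk-free rate and the function names are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed number of trading days per year


def annual_return(daily_returns):
    """Geometric annualised return from simple daily returns."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Annualised Sharpe ratio: mean excess return over its standard deviation.

    Requires at least two observations with a non-zero spread.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(TRADING_DAYS)
```

    Under these definitions, a reported Sharpe improvement of 0.1 to 0.6 is a difference between two annualised ratios, not a percentage change.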
  25. By: Anders Nõu; Darya Lapitskaya; Mustafa Hakan Eratalay; Rajesh Sharma
    Abstract: For stock market predictions, the essence of the problem is usually predicting the magnitude and direction of the stock price movement as accurately as possible. There are different approaches (e.g., econometrics and machine learning) for predicting stock returns. However, it is non-trivial to find an approach which works the best. In this paper, we make a thorough analysis of the predictive accuracy of different machine learning and econometric approaches for predicting the returns and volatilities on the OMX Baltic Benchmark price index, which is a relatively less researched stock market. Our results show that the machine learning methods, namely the support vector regression and k-nearest neighbours, predict the returns better than autoregressive moving average models for most of the metrics, while for the other approaches, the results were not conclusive. Our analysis also highlighted that training and testing sample size plays an important role on the outcome of machine learning approaches.
    Keywords: machine learning, neural networks, autoregressive moving average, generalized autoregressive conditional heteroskedasticity
    Date: 2021
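    As a rough illustration of one of the methods named in this abstract, k-nearest neighbours can forecast the next return from windows of lagged returns. This is a sketch under stated assumptions, not the authors' specification: the lag length, the value of k, the Euclidean distance and the function name are all illustrative choices.

```python
def knn_forecast(returns, lags=3, k=2):
    """Predict the next return with k-nearest-neighbour regression.

    Each historical window of `lags` consecutive returns is paired with the
    return that followed it. The forecast is the average outcome of the k
    windows closest (Euclidean distance) to the most recent window.
    """
    query = returns[-lags:]
    candidates = []
    for i in range(len(returns) - lags):
        window = returns[i:i + lags]
        target = returns[i + lags]
        dist = sum((a - b) ** 2 for a, b in zip(window, query)) ** 0.5
        candidates.append((dist, target))
    if not candidates:
        raise ValueError("need more than `lags` observations")
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(target for _, target in nearest) / len(nearest)
```

    Because the forecast depends entirely on which past windows exist to be matched, the result is sensitive to the length of the training sample, in line with the abstract's remark on sample size.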
  26. By: Aleksandra Parteka (Gdansk University of Technology, Gdansk, Poland); Joanna Wolszczak-Derlacz (Gdansk University of Technology, Gdansk, Poland); Dagmara Nikulin (Gdansk University of Technology, Gdansk, Poland)
    Abstract: This paper uses a sample of over 9.5 million workers from 22 European countries to study the intertwined effects of digital technology and cross-border production links on workers' wellbeing. We compare the social effects of technological change exhibited by three types of innovation: computerisation (software), automation (robots) and artificial intelligence (AI). To fully quantify work-related wellbeing, we propose a new methodology that corrects the information on remuneration by reference to such non-monetary factors as the work environment (physical and social), career development prospects, or work intensity. We show that workers' wellbeing depends on the type of technological exposure. Employees in occupations with high software or robots content face worse working conditions than those exposed to AI. The impact of digitalisation on working conditions depends on participation in global production. To demonstrate this, we estimate a set of augmented models for determination of working conditions, interacting technological factors with Global Value Chain participation. GVC intensification is accompanied by deteriorating working conditions - but only in occupations exposed to robots or software, not in AI-intensive jobs. In other words, we find that AI technologies differ from previous waves of technological progress - also in their impact on workers' wellbeing within global production structures.
    Keywords: digital technologies, working conditions, GVC, Global Value Chains, artificial intelligence, AI
    JEL: F1 F6 J8 O3
    Date: 2021–11
  27. By: Simon Büchler; Maximilian v. Ehrlich
    Abstract: We analyze land use regulation and the determinants thereof across the majority of Swiss municipalities. Based on a comprehensive survey, we construct several indices on the ease of local residential development, which capture various aspects of local regulation and land use coordination across jurisdictions. The indices provide harmonized information about what local regulation entails and the local regulatory environment across municipalities. Our analysis shows that, among others, historical building density, socio-demographic factors, local taxes, cultural aspects, and the quality of natural amenities are important determinants of local land-use regulation. We test the validity of the index with regard to information about the local refusal rates of development projects and show that the index captures a significant part of the variation in local housing supply elasticities. Based on a machine learning cross-validation model, we impute the values for nonresponding municipalities.
    Keywords: Local regulation, zoning, housing markets
    JEL: R1 R14 R31 R52
    Date: 2021–08
  28. By: Bluwstein, Kristina; Buckmann, Marcus; Joseph, Andreas; Kapadia, Sujit; Şimşek, Özgür
    Abstract: We develop early warning models for financial crisis prediction by applying machine learning techniques to macrofinancial data for 17 countries over 1870–2016. Most nonlinear machine learning models outperform logistic regression in out-of-sample predictions and forecasting. We identify economic drivers of our machine learning models using a novel framework based on Shapley values, uncovering nonlinear relationships between the predictors and crisis risk. Throughout, the most important predictors are credit growth and the slope of the yield curve, both domestically and globally. A flat or inverted yield curve is of most concern when nominal interest rates are low and credit growth is high.
    Keywords: credit growth, machine learning, Shapley values, yield curve, financial crises, financial stability
    JEL: C40 C53 E44 F30 G01
    Date: 2021–11
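    For readers unfamiliar with Shapley values, the following sketch computes exact textbook Shapley attributions for a tiny model. It is not the authors' framework: the baseline-replacement convention, the function names and the toy risk score with its two inputs are hypothetical and chosen only to show an interaction between credit growth and the yield-curve slope.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline).

    Features outside a coalition are held at their baseline value.
    Cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(features):
    """Hypothetical risk score: credit growth matters more when the slope is flat."""
    credit_growth, yield_slope = features
    return credit_growth * (2.0 - yield_slope)


# High credit growth with a flat curve, compared against a calmer baseline.
contributions = shapley_values(toy_risk, x=[3.0, 0.0], baseline=[1.0, 1.0])
```

    The attributions sum to the difference between the model output at `x` and at the baseline, which is what lets each predictor's share of the predicted risk be read off directly.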
  29. By: Reda Cherif; Karl Walentin; Brandon Buell; Carissa Chen; Jiawen Tang; Nils Wendt
    Abstract: The COVID-19 pandemic underscores the critical need for detailed, timely information on its evolving economic impacts, particularly for Sub-Saharan Africa (SSA) where data availability and lack of generalizable nowcasting methodologies limit efforts for coordinated policy responses. This paper presents a suite of high frequency and granular country-level indicator tools that can be used to nowcast GDP and track changes in economic activity for countries in SSA. We make two main contributions: (1) demonstration of the predictive power of alternative data variables such as Google search trends and mobile payments, and (2) implementation of two types of modelling methodologies, machine learning and parametric factor models, that have flexibility to incorporate mixed-frequency data variables. We present nowcast results for 2019Q4 and 2020Q1 GDP for Kenya, Nigeria, South Africa, Uganda, and Ghana, and argue that our factor model methodology can be generalized to nowcast and forecast GDP for other SSA countries with limited data availability and shorter timeframes.
    Keywords: model prediction; quantile plot; ML model; GDP YoY; data variable; YoY percent change; Factor models; Machine learning; Time series analysis; Spot exchange rates; Mobile banking; Africa; Sub-Saharan Africa
    Date: 2021–05–01

This nep-big issue is ©2021 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.