nep-big New Economics Papers
on Big Data
Issue of 2021‒10‒11
twelve papers chosen by
Tom Coupé
University of Canterbury

  1. Autonomous algorithmic collusion: Economic research and policy implications By Stephanie Assad; Emilio Calvano; Giacomo Calzolari; Robert Clark; Vincenzo Denicolo; Daniel Ershov; Justin Pappas Johnson; Sergio Pastorello; Andrew Rhodes; Lei Xu; Matthijs Wildenbeest
  2. A Quantum Generative Adversarial Network for distributions By Amine Assouel; Antoine Jacquier; Alexei Kondratyev
  3. Predicting Credit Risk for Unsecured Lending: A Machine Learning Approach By K. S. Naik
  4. RieszNet and ForestRiesz: Automatic Debiased Machine Learning with Neural Nets and Random Forests By Victor Chernozhukov; Whitney K. Newey; Victor Quintas-Martinez; Vasilis Syrgkanis
  5. Identifying Students At Risk Using Prior Performance Versus a Machine Learning Algorithm By Lindsay Cattell; Julie Bruch
  6. Universal Database for Economic Complexity By Aurelio Patelli; Andrea Zaccaria; Luciano Pietronero
  7. Credit Rating Agencies: Evolution or Extinction? By Dimitriadou, Athanasia; Agrapetidou, Anna; Gogas, Periklis; Papadimitriou, Theophilos
  8. Hierarchical Gaussian Process Models for Regression Discontinuity/Kink under Sharp and Fuzzy Designs By Ximing Wu
  9. Bridging the Divide? Bayesian Artificial Neural Networks for Frontier Efficiency Analysis By Mike Tsionas; Christopher F. Parmeter; Valentin Zelenyuk
  10. Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments By Phillip Heiler; Michael C. Knaus
  11. AI Watch: 2020 EU AI investments By Alessandro Dalla Benetta; Maciej Sobolewski; Daniel Nepelski
  12. Deep Learning for Principal-Agent Mean Field Games By Steven Campbell; Yichao Chen; Arvind Shrivats; Sebastian Jaimungal

  1. By: Stephanie Assad (Unknown); Emilio Calvano (Unknown); Giacomo Calzolari (Unknown); Robert Clark (Unknown); Vincenzo Denicolo (Unknown); Daniel Ershov (TSE - Toulouse School of Economics - UT1 - Université Toulouse 1 Capitole - Université Fédérale Toulouse Midi-Pyrénées - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement); Justin Pappas Johnson (Unknown); Sergio Pastorello (Unknown); Andrew Rhodes (TSE - Toulouse School of Economics - UT1 - Université Toulouse 1 Capitole - Université Fédérale Toulouse Midi-Pyrénées - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement); Lei Xu (Unknown); Matthijs Wildenbeest (Unknown)
    Abstract: Markets are being populated with new generations of pricing algorithms, powered with Artificial Intelligence, that have the ability to autonomously learn to operate. This ability can be both a source of efficiency and cause of concern for the risk that algorithms autonomously and tacitly learn to collude. In this paper we explore recent developments in the economic literature and discuss implications for policy.
    Keywords: Platforms, Algorithmic Pricing, Antitrust, Competition Policy, Artificial Intelligence, Collusion
    Date: 2021–09
  2. By: Amine Assouel; Antoine Jacquier; Alexei Kondratyev
    Abstract: Generative Adversarial Networks are becoming a fundamental tool in Machine Learning, in particular in the context of improving the stability of deep neural networks. At the same time, recent advances in Quantum Computing have shown that, despite the absence of a fault-tolerant quantum computer so far, quantum techniques are providing exponential advantage over their classical counterparts. We develop a fully connected Quantum Generative Adversarial network and show how it can be applied in Mathematical Finance, with a particular focus on volatility modelling.
    Date: 2021–10
  3. By: K. S. Naik
    Abstract: Since the 1990s, there have been significant advances in the technology space and the e-Commerce area, leading to an exponential increase in demand for cashless payment solutions. This has led to increased demand for credit cards, bringing along with it the possibility of higher credit defaults and hence higher delinquency rates, over a period of time. The purpose of this research paper is to build a contemporary credit scoring model to forecast credit defaults for unsecured lending (credit cards), by employing machine learning techniques. As much of the customer payments data available to lenders, for forecasting Credit defaults, is imbalanced (skewed), on account of a limited subset of default instances, this poses a challenge for predictive modelling. In this research, this challenge is addressed by deploying Synthetic Minority Oversampling Technique (SMOTE), a proven technique to iron out such imbalances, from a given dataset. On running the research dataset through seven different machine learning models, the results indicate that the Light Gradient Boosting Machine (LGBM) Classifier model outperforms the other six classification techniques. Thus, our research indicates that the LGBM classifier model is better equipped to deliver higher learning speeds, better efficiencies and manage larger data volumes. We expect that deployment of this model will enable better and timely prediction of credit defaults for decision-makers in commercial lending institutions and banks.
    Date: 2021–10
  4. By: Victor Chernozhukov; Whitney K. Newey; Victor Quintas-Martinez; Vasilis Syrgkanis
    Abstract: Many causal and policy effects of interest are defined by linear functionals of high-dimensional or non-parametric regression functions. $\sqrt{n}$-consistent and asymptotically normal estimation of the object of interest requires debiasing to reduce the effects of regularization and/or model selection on the object of interest. Debiasing is typically achieved by adding a correction term to the plug-in estimator of the functional, that is derived based on a functional-specific theoretical derivation of what is known as the influence function and which leads to properties such as double robustness and Neyman orthogonality. We instead implement an automatic debiasing procedure based on automatically learning the Riesz representation of the linear functional using Neural Nets and Random Forests. Our method solely requires value query oracle access to the linear functional. We propose a multi-tasking Neural Net debiasing method with stochastic gradient descent minimization of a combined Riesz representer and regression loss, while sharing representation layers for the two functions. We also propose a Random Forest method which learns a locally linear representation of the Riesz function. Even though our methodology applies to arbitrary functionals, we experimentally find that it beats state of the art performance of the prior neural net based estimator of Shi et al. (2019) for the case of the average treatment effect functional. We also evaluate our method on the more challenging problem of estimating average marginal effects with continuous treatments, using semi-synthetic data of gasoline price changes on gasoline demand.
    Date: 2021–10
  5. By: Lindsay Cattell; Julie Bruch
    Abstract: This report provides information for administrators in local education agencies who are considering early warning systems to identify at-risk students.
    Keywords: Schools, At-risk Students, Machine Learning, Early Warning System
  6. By: Aurelio Patelli; Andrea Zaccaria; Luciano Pietronero
    Abstract: We present an integrated database suitable for the investigations of the Economic development of countries by using the Economic Fitness and Complexity framework. Firstly, we implement machine learning techniques to reconstruct the database of Trade of Services and we integrate it with the database of the Trade of the physical Goods, generating a complete view of the International Trade, which we denote the Universal database. Using this data, we derive a statistically significant network of interaction of the Economic activities, where preferred paths of development and clusters of High-Tech industries naturally emerge. Finally, we compute the Economic Fitness, an algorithmic assessment of the competitiveness of countries, removing the unexpected misbehaviour of Economies under-represented by the sole consideration of the Trade of the physical Goods.
    Date: 2021–10
  7. By: Dimitriadou, Athanasia (University of Derby); Agrapetidou, Anna (Democritus University of Thrace, Department of Economics); Gogas, Periklis (Democritus University of Thrace, Department of Economics); Papadimitriou, Theophilos (Democritus University of Thrace, Department of Economics)
    Abstract: Credit Rating Agencies (CRAs) have been around for more than 150 years. Their role evolved from mere information collectors and providers to quasi-official arbitrators of credit risk throughout the global financial system. They compiled information that, at the time, was too difficult and costly for their clients to gather on their own. After the big market crash of 1929, they started to play a more formal role. Since then, we see a growing reliance of investors on the CRAs' ratings. After the global financial crisis of 2007, the CRAs became the focal point of criticism by economists, politicians, the media, market participants and official regulatory agencies. The reason was obvious: the CRAs failed to perform the job they were supposed to do in financial markets, i.e. efficient, effective and prompt measuring and signaling of financial (default) risk. The main criticism focused on the “issuer-pays system”, the relatively loose regulatory oversight from the relevant government agencies, the fact that often ratings change ex-post and the limited liability of CRAs. Many changes were implemented to the operational framework of the CRAs, including public disclosure of CRA information. This is designed to facilitate "unsolicited" ratings of structured securities by rating agencies that are not paid by the issuers. This combined with the abundance of data and the availability of powerful new methodologies and inexpensive computing power can bring us to the new era of independent ratings: The not-for-profit Independent Credit Rating Agencies (ICRAs). These can either compete or be used as an auxiliary risk gauging mechanism free from the problems inherent in the traditional CRAs. This role can be assumed by either public or governmental authorities, national or international specialized entities or universities, research institutions, etc.
Several factors facilitate the transition to the ICRAs today: the abundance of data, cheaper and faster computer processing, the progress in traditional forecasting techniques and the wide use of new forecasting techniques, i.e. Machine Learning methodologies and Artificial Intelligence systems.
    Keywords: Credit rating agencies; banking; forecasting; support vector machines; artificial intelligence
    JEL: C02 C15 C40 C45 C54 E02 E17 E27 E44 E58 E61 G20 G23 G28
    Date: 2021–10–04
  8. By: Ximing Wu
    Abstract: We propose nonparametric Bayesian estimators for causal inference exploiting Regression Discontinuity/Kink (RD/RK) under sharp and fuzzy designs. Our estimators are based on Gaussian Process (GP) regression and classification. The GP methods are powerful probabilistic modeling approaches that are advantageous in terms of derivative estimation and uncertainty quantification, facilitating RK estimation and inference of RD/RK models. These estimators are extended to hierarchical GP models with an intermediate Bayesian neural network layer and can be characterized as hybrid deep learning models. Monte Carlo simulations show that our estimators perform similarly to, and often better than, competing estimators in terms of precision, coverage and interval length. The hierarchical GP models improve upon one-layer GP models substantially. An empirical application of the proposed estimators is provided.
    Date: 2021–10
  9. By: Mike Tsionas (Montpellier Business School Université de Montpellier, Montpellier Research in Management and Lancaster University Management School); Christopher F. Parmeter (Miami Herbert Business School, University of Miami, Miami FL); Valentin Zelenyuk (School of Economics and Centre for Efficiency and Productivity Analysis (CEPA) at The University of Queensland, Australia)
    Abstract: The literature on firm efficiency has seen its share of research comparing and contrasting Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA), the two workhorse estimators. These studies rely on both Monte Carlo experiments and actual data sets to examine a range of performance issues which can be used to elucidate insights on the benefits or weaknesses of one method over the other. As can be imagined, neither method is universally better than the other. The present paper proposes an alternative approach that is quite flexible in terms of functional form and distributional assumptions and it amalgamates the benefits of both DEA and SFA. Specifically, we bridge these two popular approaches via Bayesian Artificial Neural Networks. We examine the performance of this new approach using Monte Carlo experiments. The performance is found to be very good, comparable or often better than the current standards in the literature. To illustrate the new techniques, we provide an application of this approach to a recent data set of large US banks.
    Keywords: Simulation; OR in Banking; Stochastic Frontier Models; Data Envelopment Analysis; Flexible Functional Forms.
    Date: 2021–06
  10. By: Phillip Heiler; Michael C. Knaus
    Abstract: Binary treatments in empirical practice are often (i) ex-post aggregates of multiple treatments or (ii) can be disaggregated into multiple treatment versions after assignment. In such cases it is unclear whether estimated heterogeneous effects are driven by effect heterogeneity or by treatment heterogeneity. This paper provides estimands to decompose canonical effect heterogeneity into the effect heterogeneity driven by different responses to underlying multiple treatments and potentially different compositions of these underlying effective treatments. This makes it possible to avoid spurious discovery of heterogeneous effects, to detect potentially masked heterogeneity, and to evaluate the underlying assignment mechanism of treatment versions. A nonparametric method for estimation and statistical inference of the decomposition parameters is proposed. The framework allows for the use of machine learning techniques to adjust for high-dimensional confounding of the effective treatments. It can be used to conduct simple joint hypothesis tests for effect heterogeneity that consider all effective treatments simultaneously and circumvent multiple testing procedures. It requires weaker overlap assumptions compared to conventional multi-valued treatment effect analysis. The method is applied to a reevaluation of heterogeneous effects of smoking on birth weight. We find that parts of the differences between ethnic and age groups can be explained by different smoking intensities. We further reassess the gender gap in the effectiveness of the Job Corps training program and find that it is largely explained by gender differences in the type of vocational training received.
    Date: 2021–10
  11. By: Alessandro Dalla Benetta (European Commission - JRC); Maciej Sobolewski (European Commission - JRC); Daniel Nepelski (European Commission - JRC)
    Abstract: This report provides estimates of AI investments in EU27 in 2018 and for 2019. It considers AI as a general-purpose technology and, besides direct investments in the development and adoption of AI technologies, also includes investments in complementary assets and capabilities such as skills, data, product design and organisational capital among AI investments. According to the estimates, in 2019, the EU invested between EUR 7.9 billion and EUR 9 billion in AI. Compared to 2018, this represents an increase of 39%. If the EU maintains a similar level of growth, by 2025 the AI investments will reach EUR 22.4 billion and surpass the EUR 20 billion target by over 10%. The EU AI investments concentrate in labour and human capital covered by the Skills investment target. Expenditures on AI-related Data and equipment account for 30%. R&D and Intangible assets account for 10% and 7% of the total EU AI investments respectively. The contribution of the European public sector is considerable and accounts for 41% of total AI investments in 2019.
    Keywords: General Purpose Technology, GPT, Artificial Intelligence, AI, digital technologies, investments, intangibles, Europe
    Date: 2021–09
  12. By: Steven Campbell; Yichao Chen; Arvind Shrivats; Sebastian Jaimungal
    Abstract: Here, we develop a deep learning algorithm for solving Principal-Agent (PA) mean field games with market-clearing conditions -- a class of problems that have thus far not been studied and one that poses difficulties for standard numerical methods. We use an actor-critic approach to optimization, where the agents form a Nash equilibrium according to the principal's penalty function, and the principal evaluates the resulting equilibrium. The inner problem's Nash equilibrium is obtained using a variant of the deep backward stochastic differential equation (BSDE) method modified for McKean-Vlasov forward-backward SDEs that includes dependence on the distribution over both the forward and backward processes. The outer problem's loss is further approximated by a neural net by sampling over the space of penalty functions. We apply our approach to a stylized PA problem arising in Renewable Energy Certificate (REC) markets, where agents may rent clean energy production capacity, trade RECs, and expand their long-term capacity to navigate the market at maximum profit. Our numerical results illustrate the efficacy of the algorithm and lead to interesting insights into the nature of optimal PA interactions in the mean-field limit of these markets.
    Date: 2021–10

This nep-big issue is ©2021 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
For comments, please write to the director of NEP, Marco Novarese. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.