on Big Data |
By: | Christopher Rauh (Université de Montréal, CIREQ) |
Abstract: | In this paper I present a methodology to provide uncertainty measures at the regional level in real time using the full bandwidth of news. To do so I download vast amounts of newspaper articles, summarize them into topics using unsupervised machine learning, and then show that the resulting topics foreshadow fluctuations in economic indicators. Given the large regional disparities in economic performance and trends within countries, regional measures are particularly important for policymakers seeking to tailor policy responses. I use a vector-autoregression model for the case of Canada, a large and diverse country, to show that the generated topics are significantly related to movements in economic performance indicators, inflation, and the unemployment rate at the national and provincial level. Evidence is provided that a composite index of the generated diverse topics can serve as a measure of uncertainty. Moreover, I show that some topics are general enough to have homogeneous associations across provinces, while others are specific to fluctuations in certain regions. |
Keywords: | machine learning, latent dirichlet allocation, newspaper text, economic uncertainty, topic model, Canada |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:mtl:montec:09-2019&r=all |
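Rauh's pipeline turns raw news text into topic intensities before relating them to economic indicators in a VAR. A crude stdlib sketch of the indexing step only, with hypothetical fixed keyword lists standing in for the topics the paper actually learns with LDA (function and topic names are mine, purely for illustration):

```python
from collections import Counter

# Hypothetical keyword lists; the paper infers topics from the corpus
# with unsupervised LDA rather than fixing word lists by hand.
TOPICS = {
    "uncertainty": {"uncertain", "uncertainty", "risk"},
    "labour": {"unemployment", "jobs", "hiring"},
}

def topic_shares(article_tokens):
    """Share of an article's tokens attributed to each topic."""
    counts = Counter()
    for tok in article_tokens:
        for name, words in TOPICS.items():
            if tok in words:
                counts[name] += 1
    total = len(article_tokens)
    return {name: counts[name] / total for name in TOPICS}

shares = topic_shares(["rising", "uncertainty", "hits", "hiring", "plans"])
```

Averaging such shares per region and month would yield the kind of regional index the abstract describes feeding into the VAR.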
By: | Dominique Guegan (UP1 - Université Panthéon-Sorbonne, CES - Centre d'économie de la Sorbonne - UP1 - Université Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, University of Ca’ Foscari [Venice, Italy]) |
Abstract: | We are interested in the analysis of the concept of interpretability associated with an ML algorithm. We distinguish between the "How", i.e., how a black box or a very complex algorithm works, and the "Why", i.e., why an algorithm produces such a result. These questions concern many actors: users, professions, and regulators, among others. Using a formal standardized framework, we indicate the solutions that exist by specifying which elements of the supply chain are impacted when we provide answers to the previous questions. This presentation, by standardizing the notations, allows us to compare the different approaches and to highlight the specificities of each of them: both their objective and their process. The study is not exhaustive and the subject is far from being closed. |
Keywords: | Interpretability, Counterfactual approach, Artificial Intelligence, Agnostic models, LIME method, Machine learning |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:halshs-02900929&r=all |
By: | Julia M. Puaschunder (The New School, Department of Economics, Schwartz Center for Economic Policy Analysis, New York USA) |
Abstract: | The ongoing introduction of Artificial Intelligence (AI), robotics and big data into contemporary society causes a market transformation that heightens the need for ethics in the wake of an unprecedented outsourcing of decision making to machines. Artificial Intelligence (AI) poses historically unique challenges for humankind. This chapter addresses legal, economic and societal trends in the contemporary introduction of Artificial Intelligence (AI), robotics and big data-derived inferences. In a world where human beings and AI increasingly blend, the emerging autonomy of AI holds unique potential for eternal life but also imposes pressing legal and ethical challenges in light of AI gaining citizenship, overpopulation concerns and international development gaps. The current legal status of AI and robotics is outlined with special attention to consumer protection and ethics in the healthcare sector. The unprecedented economic market revolution of outsourced decision making to AI is captured in macroeconomic trends outlining AI as a corruption-free market solution, which is as yet only prevalent and efficient in some parts of the world. Finally, a future-oriented perspective on the use of AI for enhancing democracy and diplomacy is granted, but ethical boundaries are also envisioned. The mentioned transition appears to hold novel freedom challenges in our contemporary world. In homage to freedom, the paper first lays open these freedom-threatened areas in order to then provide strategies to alleviate these potential freedom deficiencies but also to set new freedom potential free. |
Keywords: | AI, Artificial Intelligence, Climate change, Climate justice, Discrimination of excellence, Freedom |
Date: | 2020–04 |
URL: | http://d.repec.org/n?u=RePEc:smo:kpaper:0011jp&r=all |
By: | Julia M. Puaschunder (The New School, New York, NY USA) |
Abstract: | The ongoing COVID-19 crisis has challenged healthcare around the world. The call for global solutions in international healthcare pandemic crisis and risk management has reached unprecedented momentum. Digitalization, Artificial Intelligence and big data-derived inferences are supporting human decision making as essential healthcare enhancements as never before in the history of medicine. In today's healthcare sector and medical profession, AI, algorithms, robotics and big data are used to monitor large-scale medical trends by detecting and measuring individual risks based on big data-driven estimations. This article provides a snapshot of the current state of the art of AI, algorithms, big data-derived inferences and robotics in healthcare, as well as of medical responses to COVID-19 in the international arena. International differences in the approaches to combating global pandemics become apparent, serving as an interesting case study on how to avert global pandemics successfully with AI in the future. Empirically, the article asks which countries have favourable conditions to provide AI solutions for global healthcare and pandemic crisis monitoring and alleviation when compared across the entire world. First, an index based on internet connectivity, as a proxy for digitalization and AI advancement, and Gross Domestic Product, as an indicator of economic productivity, is calculated to outline global pandemic healthcare solution innovation hubs with economic impetus around the world. The parts of the world that feature internet connectivity and high GDP are likely to lead on AI-driven big data monitoring insights for pandemic prevention. When comparing countries worldwide, AI advancement is found to be positively correlated with anti-corruption; AI thus springs from non-corrupt territories of the world. Second, a novel anti-corruption artificial healthcare index is therefore presented that highlights those countries in the world that have vital AI growth in a non-corrupt environment. These non-corrupt AI centres hold comparative advantages to lead on global artificial healthcare solutions against COVID-19 and to serve as pandemic crisis and risk management innovators of the future. Anti-corruption is also positively related to better general healthcare. Therefore, finally, a third index that combines internet connectivity, anti-corruption and healthcare access and quality is presented. The countries that score high on AI, anti-corruption and healthcare excellence are presented as the ultimate world-leading, innovative global pandemic alleviation centres. The advantages, but also potential shortfalls and ethical cliffs, of the novel use of monitoring apps, big data inferences and telemedicine to prevent pandemics are discussed. |
Keywords: | Access to healthcare, Advancements, AI-GDP Index, Apps, Artificial Intelligence, Coronavirus, Corruption-free maximization of excellence and precision, Corruption Perception (CPI)-Global Connectivity Index, Corruption Perception-Global Connectivity-Healthcare Index, COVID-19, Decentralized grids, Economic growth, Healthcare, Human resemblance, Humanness, Innovation, Market disruption, Market entrance, Pandemic, Rational precision, Social stratification, Supremacy, Targeted aid, Telemedicine |
Date: | 2020–06 |
URL: | http://d.repec.org/n?u=RePEc:smo:spaper:003jp&r=all |
By: | Grzegorz Marcjasz; Jesus Lago; Rafa{\l} Weron |
Abstract: | Recent advancements in the fields of artificial intelligence and machine learning have led to a significant increase in the popularity of these methods in the literature, including in electricity price forecasting. The methods cover a very broad spectrum, from decision trees, through random forests, to various artificial neural network models and hybrid approaches. In electricity price forecasting, neural networks are the most popular machine learning method, as they provide a non-linear counterpart to well-tested linear regression models. Their application, however, is not straightforward, with multiple implementation factors to consider. One such factor is the network's structure. This paper provides a comprehensive comparison of the two most common structures for deep neural networks: one that focuses on each hour of the day separately, and one that reflects the daily auction structure and models vectors of prices. The results show a significant accuracy advantage of the latter, confirmed on data from five distinct power exchanges. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.08006&r=all |
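The structural difference the paper tests, 24 separate single-output networks versus one network predicting the whole daily price vector, starts with how the data are shaped. A minimal sketch of that reshaping step (the function name is mine; real pipelines would also align calendar features):

```python
def to_daily_vectors(hourly_prices):
    """Reshape a flat hourly price series into one 24-element vector per
    day, the input/output shape used when a single network models the
    whole daily auction rather than each hour separately."""
    if len(hourly_prices) % 24 != 0:
        raise ValueError("series must cover whole days")
    return [hourly_prices[d:d + 24] for d in range(0, len(hourly_prices), 24)]

days = to_daily_vectors(list(range(48)))  # two days of dummy prices
```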
By: | Eduardo Ramos-P\'erez; Pablo J. Alonso-Gonz\'alez; Jos\'e Javier N\'u\~nez-Vel\'azquez |
Abstract: | Currently, legal requirements demand that insurance companies increase their emphasis on monitoring the risks linked to underwriting and asset management activities. Regarding underwriting risks, the main uncertainties that insurers must manage relate to the sufficiency of premiums to cover future claims and the adequacy of current reserves to pay outstanding claims. Both risks are, by their nature, calibrated using stochastic models. This paper introduces a reserving model based on a set of machine learning techniques such as Gradient Boosting, Random Forest and Artificial Neural Networks. These algorithms and other widely used reserving models are stacked to predict the shape of the runoff. To compute the deviation around this prediction, a log-normal approach is combined with the suggested model. The empirical results demonstrate that the proposed methodology can improve the performance of traditional reserving techniques based on Bayesian statistics and the Chain Ladder, leading to a more accurate assessment of the reserving risk. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07564&r=all |
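Stacking here means combining several base reserving predictions of the runoff into one. A toy combination with fixed weights (real stacking fits the combiner on hold-out data, and the model names below are illustrative, not the paper's exact base learners):

```python
def stack_runoff(base_preds, weights):
    """Weighted combination of base-model runoff predictions.

    base_preds: dict model_name -> list of payments per development period
    weights:    dict model_name -> non-negative weights summing to 1
    """
    periods = len(next(iter(base_preds.values())))
    return [
        sum(weights[m] * base_preds[m][t] for m in base_preds)
        for t in range(periods)
    ]

combined = stack_runoff(
    {"chain_ladder": [100.0, 60.0], "gbm": [90.0, 70.0]},
    {"chain_ladder": 0.5, "gbm": 0.5},
)
```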
By: | Deininger,Klaus W.; Ali,Daniel Ayalew; Kussul,Nataliia; Lavreniuk,Mykola; Nivievskyi,Oleg |
Abstract: | To overcome the constraints for policy and practice posed by limited availability of data on crop rotation, this paper applies machine learning to freely available satellite imagery to identify the rotational practices of more than 7,000 villages in Ukraine. Rotation effects estimated based on combining these data with survey-based yield information point toward statistically significant and economically meaningful effects that differ from what has been reported in the literature, highlighting the value of this approach. Independently derived indices of vegetative development and soil water content produce similar results, not only supporting the robustness of the results, but also suggesting that the opportunities for spatial and temporal disaggregation inherent in such data offer tremendous unexploited opportunities for policy-relevant analysis. |
Date: | 2020–06–29 |
URL: | http://d.repec.org/n?u=RePEc:wbk:wbrwps:9306&r=all |
By: | Loris Cannelli; Giuseppe Nuti; Marzio Sala; Oleg Szehr |
Abstract: | The construction of replication strategies for contingent claims in the presence of risk and market friction is a key problem of financial engineering. In real markets, continuous replication, such as in the model of Black, Scholes and Merton, is not only unrealistic but also undesirable due to high transaction costs. Over the last decades, stochastic optimal-control methods have been developed to balance effective replication against losses. More recently, with the rise of artificial intelligence, temporal-difference Reinforcement Learning, in particular variations of $Q$-learning in conjunction with Deep Neural Networks, has attracted significant interest. From a practical point of view, however, such methods are often relatively sample-inefficient, hard to train and lack performance guarantees. This motivates the investigation of a stable benchmark algorithm for hedging. In this article, the hedging problem is viewed as an instance of a risk-averse contextual $k$-armed bandit problem, for which a large body of theoretical results and well-studied algorithms are available. We find that the $k$-armed bandit model naturally fits the $P\&L$ formulation of hedging, providing a more accurate and sample-efficient approach than $Q$-learning and reducing to the Black-Scholes model in the absence of transaction costs and risks. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.01623&r=all |
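The bandit view treats each candidate hedge (each arm) as an action with an unknown P&L distribution and keeps a running estimate per arm. A minimal epsilon-greedy learner, sketching the idea only: the paper studies risk-averse variants, which this toy version does not implement.

```python
import random

class KArmedBandit:
    """Epsilon-greedy learner over k discrete actions (e.g. hedge ratios)."""

    def __init__(self, k, epsilon=0.1, seed=0):
        self.counts = [0] * k      # pulls per arm
        self.values = [0.0] * k    # running mean reward (P&L) per arm
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise exploit best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean update after observing the arm's realized P&L.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

A risk-averse variant would replace the running mean with a mean-variance or quantile criterion when ranking arms.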
By: | Patryk Gierjatowicz; Marc Sabate-Vidales; David \v{S}i\v{s}ka; Lukasz Szpruch; \v{Z}an \v{Z}uri\v{c} |
Abstract: | Mathematical modelling is ubiquitous in the financial industry and drives key decision processes. Any given model provides only a crude approximation to reality and the risk of using an inadequate model is hard to detect and quantify. By contrast, modern data science techniques are opening the door to more robust and data-driven model selection mechanisms. However, most machine learning models are "black-boxes" as individual parameters do not have meaningful interpretation. The aim of this paper is to combine the above approaches achieving the best of both worlds. Combining neural networks with risk models based on classical stochastic differential equations (SDEs), we find robust bounds for prices of derivatives and the corresponding hedging strategies while incorporating relevant market data. The resulting model called neural SDE is an instantiation of generative models and is closely linked with the theory of causal optimal transport. Neural SDEs allow consistent calibration under both the risk-neutral and the real-world measures. Thus the model can be used to simulate market scenarios needed for assessing risk profiles and hedging strategies. We develop and analyse novel algorithms needed for efficient use of neural SDEs. We validate our approach with numerical experiments using both local and stochastic volatility models. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.04154&r=all |
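A neural SDE replaces the fixed drift and diffusion of a classical model with trained networks, but the simulation backbone is an ordinary Euler-Maruyama scheme. A sketch with plug-in Python functions standing in for the networks (the coefficients below are arbitrary placeholders, not the paper's calibrated model):

```python
import random

def euler_maruyama(s0, drift, diffusion, n_steps, dt, rng):
    """Simulate one path of dS = drift(S) dt + diffusion(S) dW.

    In a neural SDE, drift and diffusion would be neural networks
    calibrated to market data; here they are plain callables.
    """
    path = [s0]
    for _ in range(n_steps):
        s = path[-1]
        dw = rng.gauss(0.0, dt ** 0.5)  # Brownian increment
        path.append(s + drift(s) * dt + diffusion(s) * dw)
    return path

# Degenerate check: zero diffusion reduces to deterministic growth.
path = euler_maruyama(100.0, lambda s: 0.05 * s, lambda s: 0.0,
                      n_steps=1, dt=1.0, rng=random.Random(0))
```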
By: | Daisuke Miyakawa (Associate Professor, Hitotsubashi University Business School (E-mail: dmiyakawa@hub.hit-u.ac.jp)); Kohei Shintani (Director and Senior Economist, Institute for Monetary and Economic Studies, Bank of Japan (E-mail: kouhei.shintani@boj.or.jp)) |
Abstract: | We document how professional analysts' predictions of firm exits disagree with machine-based predictions. First, on average, human predictions underperform machine predictions. Second, however, the relative performance of human to machine predictions improves for firms with specific characteristics, such as less observable information, possibly due to the unstructured information used only in human predictions. Third, for firms with less information, reallocating prediction tasks from machines to analysts reduces type I errors while simultaneously increasing type II errors. Under certain conditions, human predictions can outperform machine predictions. |
Keywords: | Machine Learning, Human Prediction, Disagreement |
JEL: | C10 C55 G33 |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:ime:imedps:20-e-11&r=all |
By: | Feras A. Batarseh; Munisamy Gopinath; Anderson Monken |
Abstract: | International trade policies remain in the spotlight given the recent rethink of the benefits of globalization by major economies. Since trade critically affects employment, production, prices and wages, understanding and predicting future patterns of trade is a high priority for decision making within and across countries. While traditional economic models aim to be reliable predictors, we consider the possibility that Artificial Intelligence (AI) techniques allow for better predictions and associations to inform policy decisions. Moreover, we outline contextual AI methods to decipher trade patterns affected by outlier events such as trade wars and pandemics. Open-government data are essential to providing the fuel for the algorithms that can forecast, recommend, and classify policies. Data collected for this study describe international trade transactions and commonly associated economic factors. Models deployed include Association Rules for grouping commodity pairs, and ARIMA, GBoosting, XGBoosting, and LightGBM for predicting future trade patterns. Models and their results are introduced and evaluated for prediction and association quality, with example policy implications. |
Keywords: | AI; International trade; Boosting; Prediction; Data mining; Imports and exports; Outlier events |
JEL: | F13 F17 C55 C80 |
Date: | 2020–08–20 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedgif:1296&r=all |
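Association rules over commodity pairs reduce to support and confidence counts on trade "baskets". A stdlib sketch of the two core metrics (the transactions below are made up for illustration):

```python
def rule_metrics(transactions, a, b):
    """Support of {a, b} and confidence of the rule a -> b.

    transactions: iterable of sets of items (here, traded commodities).
    """
    n = len(transactions)
    support_ab = sum(1 for t in transactions if a in t and b in t) / n
    support_a = sum(1 for t in transactions if a in t) / n
    confidence = support_ab / support_a if support_a else 0.0
    return support_ab, confidence

support, confidence = rule_metrics(
    [{"steel", "cars"}, {"steel"}, {"cars"}, {"steel", "cars"}],
    "steel", "cars")
```

Algorithms such as Apriori simply enumerate item sets whose support clears a threshold before computing these metrics.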
By: | Dirk Roeder; Georgi Dimitroff |
Abstract: | In a recent paper, "Deep Learning Volatility", a fast two-step deep calibration algorithm for rough volatility models was proposed: in the first step the time-consuming mapping from model parameters to implied volatilities is learned by a neural network, and in the second step standard solver techniques are used to find the best model parameters. In our paper we compare these results with an alternative direct approach in which the mapping from market implied volatilities to model parameters is approximated by the neural network, without the need for an extra solver step. Using a whitening procedure and a projection of the target parameters to [0,1], so that a sigmoid-type output function can be used, we find that the direct approach outperforms the two-step one for the data sets and methods published in "Deep Learning Volatility". For our implementation we use the open-source TensorFlow 2 library. The paper should be understood as a technical comparison of neural network techniques and not as a methodologically new approach. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.03494&r=all |
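The projection of each target parameter to [0,1], which lets the inverse network use a sigmoid output, is just an affine rescaling and its inverse. A sketch (the bounds below are hypothetical; the actual bounds depend on the model's parameter space):

```python
import math

# Hypothetical admissible range for one model parameter.
LO, HI = 0.0, 2.0

def to_unit(x, lo=LO, hi=HI):
    """Affine projection of a parameter onto [0, 1]."""
    return (x - lo) / (hi - lo)

def from_unit(u, lo=LO, hi=HI):
    """Inverse map from the unit interval back to parameter space."""
    return lo + u * (hi - lo)

def sigmoid(z):
    """Output activation whose range matches the [0, 1] targets."""
    return 1.0 / (1.0 + math.exp(-z))
```

The network is trained on `to_unit`-scaled targets; predictions are mapped back with `from_unit`.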
By: | John Gibson; Susan Olivia; Geua Boe-Gibson |
Abstract: | Night lights, as detected by satellites, are increasingly used by economists, typically as a proxy for economic activity. The growing popularity of these data reflects either the absence, or the presumed inaccuracy, of more conventional economic statistics, like national or regional GDP. Further growth in use of night lights is likely, as they have been included in the AidData geo-query tool for providing sub-national data, and in geographic data that the Demographic and Health Survey links to anonymised survey enumeration areas. Yet this ease of obtaining night lights data may lead to inappropriate use, if users fail to recognize that most of the satellites providing these data were not designed to assist economists, and have features that may threaten validity of analyses based on these data, especially for temporal comparisons, and for small and rural areas. In this paper we review sources of satellite data on night lights, discuss issues with these data, and survey some of their uses in economics. |
Keywords: | Density, development, DMSP, luminosity, night lights, VIIRS |
JEL: | O15 R12 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:lic:licosd:41920&r=all |
By: | Rangan Gupta (Department of Economics, University of Pretoria, Pretoria, 0002, South Africa); Hardik A. Marfatia (Department of Economics, Northeastern Illinois University, 5500 N St Louis Ave, BBH 344G, Chicago, IL 60625, USA); Christian Pierdzioch (Department of Economics, Helmut Schmidt University, Holstenhofweg 85, P.O.B. 700822, 22008 Hamburg, Germany); Afees A. Salisu (Centre for Econometric & Allied Research, University of Ibadan, Ibadan, Nigeria) |
Abstract: | We analyze the role of macroeconomic uncertainty in predicting synchronization in housing price movements across all the United States (US) states plus the District of Columbia (DC). We first use a Bayesian dynamic factor model to decompose the house price movements into a national factor, four regional factors (Northeast, South, Midwest, and West), and state-specific factors. We then study the ability of macroeconomic uncertainty to forecast the comovements in housing prices, controlling for a wide array of predictors, such as factors derived from a large macroeconomic dataset, oil shocks, and financial market-related uncertainties. To accommodate multiple predictors and nonlinearities, we take the machine learning approach of random forests. Our results provide strong evidence of forecastability of the national house price factor based on the information content of macroeconomic uncertainties over and above the other predictors. This result also carries over, albeit to a varying degree, to the factors associated with the four census regions, and to the overall house price growth of the US economy. Moreover, macroeconomic uncertainty is found to have predictive content for the (stochastic) volatility of the national factor and of aggregate US house prices. Our results have important implications for policymakers and investors. |
Keywords: | Machine learning, Random forests, Bayesian dynamic factor model, Forecasting, Housing markets synchronization, United States |
JEL: | C22 C32 E32 Q02 R30 |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:pre:wpaper:202077&r=all |
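Random forests average many randomized trees fit on bootstrap resamples; the essence survives in a toy version that bags depth-one regression trees. A stdlib sketch (real forests also subsample features and grow much deeper trees, so this is a caricature of the technique, not the paper's implementation):

```python
import random

def fit_stump(xs, ys):
    """Best single-split regression tree (depth 1), by squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # split must leave both sides non-empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Average of stumps fit on bootstrap resamples (bagging)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in range(len(xs))]
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        if len(set(bx)) < 2:   # degenerate resample: no valid split
            continue
        stumps.append(fit_stump(bx, by))
    return lambda x: sum(s(x) for s in stumps) / len(stumps)
```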
By: | Ana Rodrigues Bidarra |
Abstract: | The digital economy has revolutionized the traditional structure and functioning of markets. Personal data are considered the "new oil" of economic activity, a fundamental resource whose large-scale collection and analysis are enabled by the development of ICT. The symbiosis between Big Data and Big Analytics can foster a competitive environment that benefits firms and consumers, but as the frontiers of innovation and science expand, concerns arise that call this new market dynamic into question. We argue that, where firms compete in data-driven digital markets and consumers, as data subjects, are negatively affected, notably by a decline in the quality of personal data processing, competition law and data protection law intersect in a way that justifies coordinated intervention aimed at a holistic analysis of the issues raised. Multi-sided digital platforms with business models based on monetizing Big Data through advertising challenge traditional, price-based competition tools, which are ill-suited to a full assessment of these markets. Through an analysis of the European Commission's decision in the Facebook/WhatsApp merger, we demonstrate the need for a better understanding of how multi-sided platforms operate and the urgency of adapting competition-analysis tools to the assessment of data-driven mergers in a digital context. |
Keywords: | Big Data; competition law; personal data protection; merger control; multi-sided platforms |
JEL: | K21 L12 L41 |
Date: | 2020–04 |
URL: | http://d.repec.org/n?u=RePEc:mde:wpaper:0148&r=all |
By: | Jacobs, B.J.D.; Fok, D.; Donkers, A.C.D. |
Abstract: | In modern retail contexts, retailers sell products from vast product assortments to a large and heterogeneous customer base. Understanding purchase behavior in such a context is very important. Standard models cannot be used due to the high dimensionality of the data. We propose a new model that creates an efficient dimension reduction through the idea of purchase motivations. We only require customer-level purchase history data, which is ubiquitous in modern retailing. The model handles large-scale data and even works in settings with shopping trips consisting of few purchases. As scalability of the model is essential for practical applicability, we develop a fast, custom-made inference algorithm based on variational inference. Essential features of our model are that it accounts for the product, customer and time dimensions present in purchase history data; relates the relevance of motivations to customer- and shopping-trip characteristics; captures interdependencies between motivations; and achieves superior predictive performance. Estimation results from this comprehensive model provide deep insights into purchase behavior. Such insights can be used by managers to create more intuitive, better informed, and more effective marketing actions. We illustrate the model using purchase history data from a Fortune 500 retailer involving more than 4,000 unique products. |
Keywords: | dynamic purchase behavior, large-scale assortment, purchase history data, topic model, machine learning, variational inference |
Date: | 2020–08–01 |
URL: | http://d.repec.org/n?u=RePEc:ems:eureri:129674&r=all |
By: | Selod,Harris; Soumahoro,Souleymane |
Abstract: | This paper reviews the emerging big data literature applied to urban transportation issues from the perspective of economic research. It provides a typology of big data sources relevant to transportation analyses and describes how these data can be used to measure mobility, associated externalities, and welfare impacts. As an application, it showcases the use of daily traffic conditions data in developed and developing country cities to estimate the causal impact of stay-at-home orders during the COVID-19 pandemic on traffic congestion in Bogotá, New Delhi, New York, and Paris. In light of the advances in big data analytics, the paper concludes with a discussion of policy opportunities and challenges. |
Date: | 2020–06–30 |
URL: | http://d.repec.org/n?u=RePEc:wbk:wbrwps:9308&r=all |
By: | Kaukin, Andrei (Каукин, Андрей) (The Russian Presidential Academy of National Economy and Public Administration); Kosarev, Vladimir (Косарев, Владимир) (The Russian Presidential Academy of National Economy and Public Administration) |
Abstract: | The paper presents a method for conditional forecasting of the economic cycle that takes industry dynamics into account. The predictive model includes a neural network auto-encoder and an adapted deep convolutional network of the «WaveNet» architecture. The first functional block reduces the dimension of the data. The second block predicts the phase of the economic cycle of the industry under study. The neural network uses the principal components of the explanatory factors as input. The proposed model can be used both as an independent method and as a complement to dynamic factor models for estimating the growth rate of the industrial production index. |
Date: | 2020–05 |
URL: | http://d.repec.org/n?u=RePEc:rnp:wpaper:052019&r=all |
By: | Hannes Mueller (Institut d’Analisi Economica (CSIC), Barcelona GSE); Christopher Rauh (Université de Montréal, CIREQ) |
Abstract: | There is a rising interest in conflict prevention and this interest provides a strong motivation for better conflict forecasting. A key problem of conflict forecasting for prevention is that predicting the start of conflict in previously peaceful countries is extremely hard. To make progress on this hard problem, this project exploits both supervised and unsupervised machine learning. Specifically, the latent Dirichlet allocation (LDA) model is used for feature extraction from 3.8 million newspaper articles and these features are then used in a random forest model to predict conflict. We find that several features are negatively associated with the outbreak of conflict and these gain importance when predicting hard onsets. This is because the decision tree uses the text features in lower nodes where they are evaluated conditionally on conflict history, which allows the random forest to adapt to the hard problem and provides useful forecasts for prevention. |
Date: | 2019–04 |
URL: | http://d.repec.org/n?u=RePEc:mtl:montec:02-2019&r=all |
By: | Paola Tubaro (LRI - Laboratoire de Recherche en Informatique - CNRS - Centre National de la Recherche Scientifique - UP11 - Université Paris-Sud - Paris 11 - CentraleSupélec, TAU - TAckling the Underspecified - LRI - Laboratoire de Recherche en Informatique - CNRS - Centre National de la Recherche Scientifique - UP11 - Université Paris-Sud - Paris 11 - CentraleSupélec - Inria Saclay - Ile de France - Inria - Institut National de Recherche en Informatique et en Automatique, CNRS - Centre National de la Recherche Scientifique); Clément Le Ludec (I3, une unité mixte de recherche CNRS (UMR 9217) - Institut interdisciplinaire de l’innovation - CNRS - Centre National de la Recherche Scientifique - X - École polytechnique - Télécom ParisTech - MINES ParisTech - École nationale supérieure des mines de Paris, SES - Département Sciences Economiques et Sociales - Télécom ParisTech, IP Paris - Institut Polytechnique de Paris, SID - Sociologie Information-Communication Design - I3, une unité mixte de recherche CNRS (UMR 9217) - Institut interdisciplinaire de l’innovation - CNRS - Centre National de la Recherche Scientifique - X - École polytechnique - Télécom ParisTech - MINES ParisTech - École nationale supérieure des mines de Paris); Antonio Casilli (I3, une unité mixte de recherche CNRS (UMR 9217) - Institut interdisciplinaire de l’innovation - CNRS - Centre National de la Recherche Scientifique - X - École polytechnique - Télécom ParisTech - MINES ParisTech - École nationale supérieure des mines de Paris, SES - Département Sciences Economiques et Sociales - Télécom ParisTech, IP Paris - Institut Polytechnique de Paris, SID - Sociologie Information-Communication Design - I3, une unité mixte de recherche CNRS (UMR 9217) - Institut interdisciplinaire de l’innovation - CNRS - Centre National de la Recherche Scientifique - X - École polytechnique - Télécom ParisTech - MINES ParisTech - École nationale supérieure des mines de Paris) |
Abstract: | 'Micro-work' consists of fragmented data tasks that myriad providers execute on online platforms. While crucial to the development of data-based technologies, this little visible and geographically spread activity is particularly difficult to measure. To fill this gap, we combined qualitative and quantitative methods (online surveys, in-depth interviews, capture-recapture techniques, and web traffic analytics) to count micro-workers in a single country, France. On the basis of this analysis, we estimate that approximately 260,000 people are registered with micro-work platforms. Of these, some 50,000 are 'regular' workers who do micro-tasks at least monthly, and we estimate that a more restrictive measure of 'very active' workers decreases this figure to 15,000. This analysis contributes to research on platform labour and on the labour in the digital economy that lies behind artificial intelligence. |
Keywords: | Micro-work, digital platforms, labour statistics |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:hal-02898905&r=all |
By: | Edmond Lezmi; Jules Roche; Thierry Roncalli; Jiali Xu |
Abstract: | This article explores the use of machine learning models to build a market generator. The underlying idea is to simulate artificial multi-dimensional financial time series whose statistical properties are the same as those observed in the financial markets. In particular, these synthetic data must preserve the probability distribution of asset returns, the stochastic dependence between the different assets and the autocorrelation across time. The article then proposes a new approach for estimating the probability distribution of backtest statistics. The final objective is to develop a framework for improving the risk management of quantitative investment strategies, in particular in the space of smart beta, factor investing and alternative risk premia. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.04838&r=all |
By: | Burn, Ian (University of Liverpool); Button, Patrick (Tulane University); Munguia Corella, Luis (University of California, Irvine); Neumark, David (University of California, Irvine) |
Abstract: | We study the relationships between ageist stereotypes – as reflected in the language used in job ads – and age discrimination in hiring, exploiting the text of job ads and differences in callbacks to older and younger job applicants from a resume (correspondence study) field experiment (Neumark, Burn, and Button, 2019). Our analysis uses methods from computational linguistics and machine learning to directly identify, in a field-experiment setting, ageist stereotypes that underlie age discrimination in hiring. The methods we develop provide a framework for applied researchers analyzing textual data, highlighting the usefulness of various computer science techniques for empirical economics research. We find evidence that language related to stereotypes of older workers sometimes predicts discrimination against older workers. For men, our evidence points to age stereotypes about all three categories we consider – health, personality, and skill – predicting age discrimination, and for women, age stereotypes about personality. In general, the evidence is much stronger for men, and our results for men are quite consistent with the industrial psychology literature on age stereotypes. |
Keywords: | ageist stereotypes, age discrimination, job ads, machine learning |
JEL: | J14 J7 |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp13506&r=all |
By: | Daniel Goller |
Abstract: | We analyse a sequential contest with two players in darts where one of the contestants enjoys a technical advantage. Using methods from the causal machine learning literature, we analyse the built-in advantage, namely that the first-mover has potentially more, but never fewer, moves. Our empirical findings suggest that the technical advantage gives the first-mover an 8.6 percentage point higher probability of winning the match. Contestants with low performance measures and little experience have the highest built-in advantage. With regard to the fairness principle that contestants with equal abilities should have equal winning probabilities, this contest is ex-ante fair in the case of equal built-in advantages for both competitors and a randomized starting right. Nevertheless, the contest design produces unequal probabilities of winning for equally skilled contestants because of asymmetries in the built-in advantage associated with social pressure for contestants competing at home and away. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07165&r=all |
By: | Moustafa, Khaled |
Abstract: | The ongoing COVID-19 pandemic should teach us lessons at the health, environmental and human levels, pointing toward more fairness, human cohesion and environmental sustainability. At the health level, the pandemic underscores the importance of housing for everyone, particularly vulnerable and homeless people, to protect them from this disease and other similar airborne pandemics. Here, I propose to make good use of big data along with 3D construction printers to address major and pressing housing needs worldwide. Big data can be used to calculate how many people need accommodation, and 3D construction printers can then build houses accordingly and swiftly. The combination of these two tools (big data and 3D printers) can help solve global housing crises more efficiently than traditional and unguided construction plans. This is particularly urgent under environmental and major health crises, where health and housing are tightly interrelated. |
Date: | 2020–08–26 |
URL: | http://d.repec.org/n?u=RePEc:osf:arabix:gnhvr&r=all |
By: | Stephanie Assad; Robert Clark (Queen's University); Daniel Ershov; Lei Xu |
Abstract: | Economic theory provides ambiguous and conflicting predictions about the association between algorithmic pricing and competition. In this paper we provide the first empirical analysis of this relationship. We study Germany's retail gasoline market where algorithmic-pricing software became widely available by mid-2017, and for which we have access to comprehensive, high-frequency price data. Because adoption dates are unknown, we identify gas stations that adopt algorithmic-pricing software by testing for structural breaks in markers associated with algorithmic pricing. We find a large number of station-level structural breaks around the suspected time of large-scale adoption. Using this information we investigate the impact of adoption on outcomes linked to competition. Because station-level adoption is endogenous, we use brand headquarter-level adoption decisions as instruments. Our IV results show that adoption increases margins by 9%, but only in non-monopoly markets. Restricting attention to duopoly markets, we find that market-level margins do not change when only one of the two stations adopts, but increase by 28% in markets where both do. These results suggest that AI adoption has a significant effect on competition. |
Keywords: | Artificial Intelligence, Pricing-Algorithms, Collusion, Retail Gasoline |
JEL: | L41 L13 D43 D83 L71 |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:qed:wpaper:1438&r=all |
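Adoption dates in the study above are inferred from structural breaks in station-level price-setting behaviour. A minimal sketch of the idea, using a two-segment constant-mean model rather than the formal break tests the authors employ, with invented numbers:

```python
def best_break(series):
    """Locate the single break date that minimises the total squared
    error of a two-segment constant-mean model (a minimal stand-in
    for formal structural-break tests such as Bai-Perron)."""
    best_t, best_sse = None, float("inf")
    for t in range(2, len(series) - 1):        # >= 2 observations per segment
        sse = 0.0
        for seg in (series[:t], series[t:]):
            mean = sum(seg) / len(seg)
            sse += sum((x - mean) ** 2 for x in seg)
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t

# Stylised daily margin series: a persistent jump after day 10,
# the kind of shift adoption of pricing software might produce.
margins = [1.0] * 10 + [1.3] * 10
print(best_break(margins))   # -> 10
```

Real applications would test the break's statistical significance and allow for trends and multiple breaks; the point here is only that an adoption date can be recovered from the data itself.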
By: | Jie Fang; Jianwu Lin; Shutao Xia; Yong Jiang; Zhikang Xia; Xiang Liu |
Abstract: | Instead of conducting manual factor construction based on traditional and behavioural finance analysis, academic researchers and quantitative investment managers have in recent years leveraged Genetic Programming (GP) as an automatic feature construction tool, which builds reverse polish mathematical expressions from trading data into new factors. However, with the development of deep learning, more powerful feature extraction tools are available. This paper proposes Neural Network-based Automatic Factor Construction (NNAFC), a tailored neural network framework that can automatically construct diversified financial factors based on financial domain knowledge and a variety of neural network structures. The experimental results show that NNAFC can construct more informative and diversified factors than GP, effectively enriching the current factor pool. For the current market, both fully connected and recurrent neural network structures are better at extracting information from financial time series than convolutional neural network structures. Moreover, new factors constructed by NNAFC consistently improve the return, Sharpe ratio, and maximum drawdown of a multi-factor quantitative investment strategy because they introduce more information and diversification into the existing factor pool. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.06225&r=all |
By: | Tien Mai; Patrick Jaillet |
Abstract: | We study the relation between different Markov Decision Process (MDP) frameworks in the machine learning and econometrics literatures, including the standard MDP, the entropy and general regularized MDP, and the stochastic MDP, where the latter is based on the assumption that the reward function is stochastic and follows a given distribution. We show that the entropy-regularized MDP is equivalent to a stochastic MDP model, and is strictly subsumed by the general regularized MDP. Moreover, we propose a distributional stochastic MDP framework by assuming that the distribution of the reward function is ambiguous. We further show that the distributional stochastic MDP is equivalent to the regularized MDP, in the sense that they always yield the same optimal policies. We also provide a connection between stochastic/regularized MDP and constrained MDP. Our work gives a unified view of several important MDP frameworks, which leads to new ways to interpret the (entropy/general) regularized MDP frameworks through the lens of stochastic rewards, and vice versa. Given the recent popularity of regularized MDPs in (deep) reinforcement learning, our work brings new understanding of how such algorithmic schemes work and suggests ideas for developing new ones. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07820&r=all |
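The entropy-regularized ('soft') MDP discussed above replaces the hard max in the Bellman backup with a temperature-scaled log-sum-exp. A small numerical sketch of this one step (generic, not from the paper):

```python
import math

def soft_value(q_values, tau):
    """Entropy-regularized state value: tau * log-sum-exp(Q / tau).
    As tau -> 0 this recovers the standard hard max over actions."""
    m = max(q_values)  # shift for numerical stability of the log-sum-exp
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))

q = [1.0, 2.0, 0.5]
print(soft_value(q, 1.0))    # exceeds max(q) by an entropy bonus
print(soft_value(q, 1e-3))   # approximately max(q) = 2.0
```

The gap between the soft and hard values is exactly the entropy bonus the regularized objective pays for stochastic policies, which is what makes the equivalence with stochastic-reward MDPs plausible.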
By: | Benjamin Virrion (CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - CNRS - Centre National de la Recherche Scientifique - Université Paris Dauphine-PSL) |
Abstract: | We present a generic path-dependent importance sampling algorithm where the Girsanov induced change of probability on the path space is represented by a sequence of neural networks taking the past of the trajectory as an input. At each learning step, the neural networks' parameters are trained so as to reduce the variance of the Monte Carlo estimator induced by this change of measure. This allows for a generic path dependent change of measure which can be used to reduce the variance of any path-dependent financial payoff. We show in our numerical experiments that for payoffs consisting of either a call, an asymmetric combination of calls and puts, a symmetric combination of calls and puts, a multi coupon autocall or a single coupon autocall, we are able to reduce the variance of the Monte Carlo estimators by factors between 2 and 9. The numerical experiments also show that the method is very robust to changes in the parameter values, which means that in practice, the training can be done offline and only updated on a weekly basis. |
Keywords: | Path-Dependence,Importance Sampling,Neural Networks |
Date: | 2020–07–06 |
URL: | http://d.repec.org/n?u=RePEc:hal:wpaper:hal-02887331&r=all |
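The paper learns a path-dependent change of measure with neural networks; the variance-reduction mechanism itself can be illustrated in one dimension with a fixed Girsanov-style drift shift. Everything below (the payoff, strike, and shift) is a hypothetical toy example, not the paper's setup:

```python
import math, random

def is_call_price(strike, shift, n=50_000, seed=0):
    """Monte Carlo price of a call on exp(Z), Z ~ N(0, 1), sampling
    from the shifted law N(shift, 1) and reweighting each draw by the
    Girsanov-style likelihood ratio exp(-shift*z + shift**2 / 2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(shift, 1.0)
        weight = math.exp(-shift * z + 0.5 * shift * shift)
        total += max(math.exp(z) - strike, 0.0) * weight
    return total / n

print(is_call_price(strike=8.0, shift=0.0))   # plain Monte Carlo
print(is_call_price(strike=8.0, shift=2.5))   # shifted importance sampler
```

Both estimators target the same expectation; the shifted sampler lands in the payoff region far more often, so its estimate has much lower variance for deep out-of-the-money payoffs. The paper generalizes the constant `shift` to a neural network of the path's past.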
By: | Jesus Lago; Grzegorz Marcjasz; Bart De Schutter; Rafa{\l} Weron |
Abstract: | While the field of electricity price forecasting has benefited from plenty of contributions in the last two decades, it arguably lacks a rigorous approach to evaluating new predictive algorithms. New methods are often compared on unique, non-public datasets and on test samples that are too short and limited to a single market. The proposed methods are rarely benchmarked against well-established and well-performing simpler models, the accuracy metrics are sometimes inadequate, and the significance of differences in predictive performance is seldom tested. Consequently, it is not clear which methods perform well or what the best practices are when forecasting electricity prices. In this paper, we tackle these issues by performing a literature survey of state-of-the-art models, comparing state-of-the-art statistical and deep learning methods across multiple years and markets, and putting forward a set of best practices. In addition, we make available the considered datasets, the forecasts of the state-of-the-art models, and a specifically designed Python toolbox, so that new algorithms can be rigorously evaluated in future studies. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.08004&r=all |
By: | Keeley, Alexander Ryota; Matsumoto, Ken'ichi; Tanaka, Kenta; Sugiawan, Yogi; Managi, Shunsuke |
Abstract: | This study combines regression analysis with machine learning analysis to study the merit order effect of renewable energy, focusing on the German market, the largest market in Europe with high renewable energy penetration. The results show that electricity from wind and solar sources reduced the spot market price by 9.64 €/MWh on average during the period from 2010 to 2017. Wind had a relatively stable impact across the day, ranging from 5.88 €/MWh to 8.04 €/MWh, while the solar energy impact varied greatly across different hours, ranging from 0.24 €/MWh to 11.78 €/MWh and having a stronger impact than wind during peak hours. The results also show characteristics of the interactions between renewable energy and spot market prices, including the slightly diminishing merit order effect of renewable energy at high generation volumes. Finally, a scenario-based analysis illustrates how different proportions of wind and solar energies affect the spot market price. |
Keywords: | Renewable energy sources, Electricity spot price, Intermittency, Merit order effect, Boosting. |
JEL: | Q41 Q42 Q47 Q56 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:102314&r=all |
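The merit-order effect estimated above is, at its core, the negative slope of the spot price in renewable in-feed. A bare-bones OLS sketch with invented numbers (the study's actual design adds controls, hourly heterogeneity, and machine-learning components):

```python
def ols_slope(x, y):
    """Simple OLS slope of y on x (with intercept): here, the price
    change per unit of renewable in-feed, i.e. the merit-order effect."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

# Stylised hourly data (illustrative, not the paper's): renewable
# generation (GWh) and spot price (EUR/MWh); price falls with in-feed.
renew = [5, 10, 15, 20, 25, 30]
price = [52, 48, 45, 40, 37, 33]
print(ols_slope(renew, price))   # -> -0.76
```

A negative slope is the merit-order effect: low-marginal-cost renewables displace expensive marginal plants, pushing the clearing price down.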
By: | Benjamin Virrion (CEREMADE) |
Abstract: | We present a generic path-dependent importance sampling algorithm where the Girsanov induced change of probability on the path space is represented by a sequence of neural networks taking the past of the trajectory as an input. At each learning step, the neural networks' parameters are trained so as to reduce the variance of the Monte Carlo estimator induced by this change of measure. This allows for a generic path dependent change of measure which can be used to reduce the variance of any path-dependent financial payoff. We show in our numerical experiments that for payoffs consisting of either a call, an asymmetric combination of calls and puts, a symmetric combination of calls and puts, a multi coupon autocall or a single coupon autocall, we are able to reduce the variance of the Monte Carlo estimators by factors between 2 and 9. The numerical experiments also show that the method is very robust to changes in the parameter values, which means that in practice, the training can be done offline and only updated on a weekly basis. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.02692&r=all |
By: | Jetter, Michael (University of Western Australia); Mahmood, Rafat (University of Western Australia); Parmeter, Christopher F. (University of Miami); Ramirez Hassan, Andres (Universidad EAFIT) |
Abstract: | Model uncertainty remains a persistent concern when exploring the drivers of civil conflict and civil war. Considering a comprehensive set of 34 potential determinants in 175 post-Cold-War countries (covering 98.2% of the world population), we employ stochastic search variable selection (SSVS) to sort through all 2^34 possible models. Looking across both cross-sectional and panel data, three robust results emerge. First, past conflict constitutes the most powerful predictor of current conflict: path dependency matters. Second, larger shares of Jewish, Muslim, or Christian citizens are associated with increased chances of conflict incidence and onset - a result that is independent of religious fractionalization, polarization, and dominance. Third, economic and political factors remain less relevant than colonial origin and religion. These results lend credence to several existing schools of thought on civil conflict and provide new avenues for future research. |
Keywords: | civil conflict, civil war, stochastic search variable selection (SSVS), greed versus grievances, religion and conflict |
JEL: | D74 Q34 Z12 F54 |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp13511&r=all |
By: | Miquel-Àngel Garcia-López (Department of Applied Economics, Universidad Autónoma de Barcelona, 08193, Bellaterra, Spain); Jordi Jofre-Monseny (Department of Economics , Universidad de Barcelona, 08034 Barcelona, Spain and Institut d´Economia de Barcelona (IEB)); Rodrigo Martínez-Mazza (Department of Economics , Universidad de Barcelona, 08034 Barcelona, Spain and Institut d´Economia de Barcelona (IEB)); Mariona Segú (RITM, Université Paris Sud, Paris Saclay) |
Abstract: | In this paper, we assess the impact of Airbnb on housing rents and prices in the city of Barcelona. Examining very detailed data on rents and both transaction and posted prices, we use several econometric approaches that exploit the exact timing and geography of Airbnb activity in the city. These include i) panel fixed-effects models, where we run multiple specifications that allow for different forms of heterogeneous time trends across neighborhoods, ii) an instrumental variables shift-share approach in which tourist amenities predict where Airbnb listings will locate and Google searches predict when listings appear, iii) event-study designs, and iv) evidence from Sagrada Familia, a major tourist amenity located away from the city centre. Our main results imply that for the average neighborhood, Airbnb activity has increased rents by 1.9%, transaction prices by 4.6% and posted prices by 3.7%. The estimated impact in neighborhoods with high Airbnb activity is substantial. For neighborhoods in the top decile of the Airbnb activity distribution, rents are estimated to have increased by 7%, while increases in transaction (posted) prices are estimated at 17% (14%). |
Keywords: | Housing markets, short-term rentals, Airbnb. |
Date: | 2020–09 |
URL: | http://d.repec.org/n?u=RePEc:uab:wprdea:wpdea2006&r=all |
By: | Sean F. Ennis (Centre for Competition Policy and Norwich Business School, University of East Anglia); Amelia Fletcher (Centre for Competition Policy and Norwich Business School, University of East Anglia) |
Abstract: | The year 2019 was a turning point in the debate around how to address competition issues in digital platform markets. At the start of the year, the focus was on reform of competition law. By July, there had been calls - on both sides of the Atlantic - for pro-competitive ex ante regulation. This paper considers these developments through the lens of three influential expert reports, from the EC, UK and US. While the reports offer similar diagnoses of the underlying economic drivers of competition concerns in digital platform markets, they reach somewhat different policy conclusions. The EC report, which was commissioned first, highlights recommendations for antitrust. While it recognises that a regulatory regime may be needed in the longer run, this option is not considered in any detail. By contrast, the UK and US expert reports argue strongly for ex ante regulation. There are other differences too. While the US and EC experts were inclined to relax or reverse burdens of proof for both mergers and abuse of dominance, albeit in specified circumstances only, the UK experts did not recommend this. This paper compares these reports under the categories of mergers, dominance, data, regulation, and international issues. |
Keywords: | Antitrust, Competition Policy, Digital Markets, Platforms, Merger Policy, Regulation, Big Data |
JEL: | K21 L13 L40 L50 L86 |
Date: | 2020–08–24 |
URL: | http://d.repec.org/n?u=RePEc:uea:ueaccp:2020_05&r=all |
By: | Filomena Garcia; Muxin Li |
Abstract: | In this paper we study the effects of introducing a new two-sided platform endowed with artificial intelligence into a market where a firm provides a brick-and-mortar platform to buyers and sellers. In our theoretical model we show that the decision of whether to introduce the new platform depends on the reduction of the search cost for consumers. We also show that the introduction of the platform enlarges the market, with more consumers using both platforms. Finally, we study the welfare effect of introducing the platform, opening the discussion on whether certain artificial intelligence devices for shopping should be regulated. |
Keywords: | e-Commerce; Intermediary; Two-sided markets |
JEL: | L1 L2 L8 |
Date: | 2020–04 |
URL: | http://d.repec.org/n?u=RePEc:mde:wpaper:0146&r=all |
By: | Sonoo Thadaney Israni; Michael E. Matheny; Ryan Matlow; Danielle Whicher |
Abstract: | In this supplement, Giovanelli et al. see the hope and promise of carefully leveraging AI to support adolescent health, underscoring the need for multidisciplinary developmental science teams with expertise in cognitive development, medicine, psychology, computer science, and medical informatics. |
Keywords: | artificial intelligence, Adolescent health |
URL: | http://d.repec.org/n?u=RePEc:mpr:mprres:cbbb2f7034fc466f891adcfa4e912f08&r=all |
By: | Youssef M. Aboutaleb; Moshe Ben-Akiva; Patrick Jaillet |
Abstract: | This paper introduces a new data-driven methodology for nested logit structure discovery. Nested logit models allow the modeling of positive correlations between the error terms of the utility specifications of the different alternatives in a discrete choice scenario through the specification of a nesting structure. Current nested logit model estimation practices require an a priori specification of a nesting structure by the modeler. In this work, we optimize over all possible specifications of the nested logit model that are consistent with rational utility maximization. We formulate the problem of learning an optimal nesting structure from the data as a mixed integer nonlinear programming (MINLP) optimization problem and solve it using a variant of the linear outer approximation algorithm. We exploit the tree structure of the problem and utilize the latest advances in integer optimization to bring practical tractability to the optimization problem we introduce. We demonstrate the ability of our algorithm to correctly recover the true nesting structure from synthetic data in a Monte Carlo experiment. In an empirical illustration using a stated preference survey on modes of transportation in the U.S. state of Massachusetts, we use our algorithm to obtain an optimal nesting tree representing the correlations between the unobserved effects of the different travel mode choices. We provide our implementation as a customizable and open-source code base written in the Julia programming language. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.08048&r=all |
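Given a nesting structure, nested logit choice probabilities factor into a nest choice driven by inclusive values and a within-nest choice; the paper's contribution is searching over such structures with MINLP. Below is a sketch of the probability computation only, with a hypothetical mode-choice example (the names, utilities, and scales are invented):

```python
import math

def nested_logit_probs(utilities, lam):
    """Choice probabilities for a nested logit model.
    utilities: {nest: {alternative: V}}; lam: {nest: scale in (0, 1]}.
    P(alt) = P(nest) * P(alt | nest), where the inclusive value is
    IV_m = log sum_j exp(V_j / lam_m) and P(nest m) ~ exp(lam_m * IV_m)."""
    iv = {m: math.log(sum(math.exp(v / lam[m]) for v in alts.values()))
          for m, alts in utilities.items()}
    denom = sum(math.exp(lam[m] * iv[m]) for m in utilities)
    probs = {}
    for m, alts in utilities.items():
        p_nest = math.exp(lam[m] * iv[m]) / denom
        within = sum(math.exp(v / lam[m]) for v in alts.values())
        for j, v in alts.items():
            probs[j] = p_nest * math.exp(v / lam[m]) / within
    return probs

# Hypothetical example: car alone vs a 'transit' nest of bus and rail,
# with correlated transit errors captured by lam = 0.5.
u = {"car": {"car": 1.0}, "transit": {"bus": 0.4, "rail": 0.6}}
p = nested_logit_probs(u, {"car": 1.0, "transit": 0.5})
print(p)   # probabilities sum to 1
```

Setting every `lam` to 1 collapses the model to multinomial logit; the structure-discovery problem is choosing which alternatives share a nest in the first place.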
By: | Fafchamps,Marcel; Shilpi,Forhad J. |
Abstract: | This paper uses high-resolution satellite data on the proportion of buildings in a 250x250 meter cell to study the evolution of human settlement in Ghana over a 40-year period. The analysis finds a strong increase in built-up area over time, mostly concentrated in the vicinity of roads, and also directly on the coast. There is strong evidence of agglomeration effects in the static sense -- buildup in one cell predicts buildup in a nearby cell -- and in a dynamic sense -- buildup in a cell predicts buildup in that cell later on, and an increase in buildup in nearby cells. These effects are strongest over a radius of 3 to 15 kilometers. No evidence is found that human settlements are spaced more or less equally over the landscape or along roads. By fitting a transition matrix to the data, this paper predicts a sharp increase in the proportion of the country that is densely built-up by the middle and end of the century, but there is no increase in the proportion of partially built-up locations. |
Date: | 2020–07–01 |
URL: | http://d.repec.org/n?u=RePEc:wbk:wbrwps:9314&r=all |
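The long-run projections above come from fitting a transition matrix over land-cover states and iterating it forward. A minimal sketch with hypothetical per-decade probabilities (not the paper's estimates):

```python
def project(dist, T, steps):
    """Propagate a distribution over land-cover states through a
    Markov transition matrix T for a given number of periods."""
    for _ in range(steps):
        dist = [sum(dist[i] * T[i][j] for i in range(len(T)))
                for j in range(len(T[0]))]
    return dist

# Hypothetical per-decade transition probabilities between empty,
# partially built, and densely built cells (rows sum to 1; dense
# cells are treated as absorbing).
T = [[0.90, 0.08, 0.02],
     [0.00, 0.85, 0.15],
     [0.00, 0.00, 1.00]]
start = [0.80, 0.15, 0.05]       # today's shares of each state
print(project(start, T, 4))      # shares four decades ahead
```

With an absorbing dense state, the densely built share rises monotonically while the partially built share can plateau, matching the qualitative pattern the paper predicts.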
By: | António Rua; Nuno Lourenço |
Abstract: | The SARS-CoV-2 outbreak has spread worldwide, causing unprecedented disruptions in the economies. These unparalleled changes in economic conditions made clear the urgent need to depart from traditional statistics to inform policy responses. Hence, the interest in tracking economic activity in a timely manner has led economic agents to rely on high-frequency data, as traditional statistics are released with a lag and available at a lower frequency. Naturally, taking such novel data on board involves addressing some of the complexities of high-frequency data (e.g. marked seasonal patterns or calendar effects). Herein, we propose a daily economic indicator (DEI), which can be used to assess the behavior of economic activity during the lockdown period in Portugal. The indicator points to a sudden and sharp drop in economic activity around mid-March 2020, when the highest level of alert due to the COVID-19 pandemic was declared on March 12. It declined further after the declaration of the State of Emergency in the entire Portuguese territory on March 18, reflecting the lockdown of several economic activities. The DEI also points to an unprecedented decline of economic activity in the first half of April, with some very mild signs of recovery at the end of the month. |
JEL: | C22 C38 E32 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:ptu:wpaper:w202013&r=all |
By: | Mendez-Guerra, Carlos; Santos-Marquez, Felipe |
Abstract: | Satellite nighttime light data are increasingly used for evaluating the performance of economies in which official statistics are non-existent, limited, or non-comparable. In this paper, we use a novel luminosity-based measure of GDP per capita to study regional convergence and spatial dependence across 274 subnational regions of the Association of South East Asian Nations (ASEAN) over the 1998-2012 period. Specifically, we first evaluate the usefulness of this new luminosity indicator in the context of ASEAN regions. Results show that almost 60 percent of the differences in (official) GDP per capita can be predicted by this luminosity-based measure of GDP. Next, given its potential usefulness for predicting regional GDP, we evaluate the spatio-temporal dynamics of regional inequality across ASEAN. Results indicate that although there is an overall (average) process of regional convergence, regional inequality within most countries has not significantly decreased. When evaluating the patterns of spatial dependence, we find increasing spatial dependence over time and stable spatial clusters (hotspots and coldspots) that are located across multiple national boundaries. Taken together, these results provide a new and more disaggregated perspective on the integration process of the ASEAN community. |
Keywords: | convergence, spatial dependence, satellite nighttime light data, luminosity, subnational regions, ASEAN |
JEL: | O57 R10 R11 |
Date: | 2020–08–17 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:102510&r=all |
By: | Paulina Concha Larrauri; Upmanu Lall |
Abstract: | Frozen concentrated orange juice (FCOJ) is a commodity traded on the Intercontinental Exchange (ICE). FCOJ futures price volatility is high because the world's orange production is concentrated in a few places, which results in extreme sensitivity to weather and disease. Most of the oranges produced in the United States are from Florida. The United States Department of Agriculture (USDA) issues orange production forecasts in the second week of each month from October to July. The October forecast in particular seems to affect FCOJ price volatility. We assess how a prediction of the directionality and magnitude of the error of the USDA October forecast could affect the decision making process of multiple FCOJ market participants, and whether the "production uncertainty" of the forecast could be reduced by incorporating other climate variables. The models developed open up the opportunity to assess the application of the resulting probabilistic forecasts of the USDA production forecast error to the trading decisions of the different FCOJ stakeholders, and to perhaps consider the inclusion of climate predictors in the USDA forecast. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.03015&r=all |