nep-big New Economics Papers
on Big Data
Issue of 2021‒05‒24
25 papers chosen by
Tom Coupé
University of Canterbury

  1. Binary Choice with Asymmetric Loss in a Data-Rich Environment: Theory and an Application to Racial Justice By Babii, Andrii; Chen, Xi; Ghysels, Eric; Kumar, Rohit
  2. Data vs collateral By Chen, Shu; Gambacorta, Leonardo; Huang, Yiping; Li, Zhenhua; Qiu, Han
  3. Urban economics in a historical perspective: Recovering data with machine learning By Combes, Pierre-Philippe; Gobillon, Laurent; Zylberberg, Yanos
  4. Quand l’intelligence artificielle théorisera les organisations By Philippe Baumard
  5. Exchange Rate Prediction with Machine Learning and a Smart Carry Trade Portfolio By Filippou, Ilias; Rapach, David; Taylor, Mark P; Zhou, Guofu
  6. Classifying variety of customer's online engagement for churn prediction with mixed-penalty logistic regression By Petra Posedel \v{S}imovi\'c; Davor Horvatic; Edward W. Sun
  7. Platform Design When Sellers Use Pricing Algorithms By Johnson, Justin; Rhodes, Andrew; Wildenbeest, Matthijs
  8. BBE: Simulating the Microstructural Dynamics of an In-Play Betting Exchange via Agent-Based Modelling By Dave Cliff
  9. Using four different online media sources to forecast the crude oil price By M. Elshendy; A. Fronzetti Colladon; E. Battistoni; P. A. Gloor
  10. Firm-level Risk Exposures and Stock Returns in the Wake of COVID-19 By Davis, Steven J; Hansen, Stephen; Seminario-Amez, Cristhian
  11. Application of Three Different Machine Learning Methods on Strategy Creation for Profitable Trades on Cryptocurrency Markets By Mohsen Asgari; Hossein Khasteh
  12. ALIENs and Continuous Time Economies By Goutham Gopalakrishna
  13. Robo-Advising: Enhancing Investment with Inverse Optimization and Deep Reinforcement Learning By Haoran Wang; Shi Yu
  14. Autonomous algorithmic collusion: Economic research and policy implications By Assad, Stephanie; Calvano, Emilio; Calzolari, Giacomo; Clark, Robert; Denicolò, Vincenzo; Ershov, Daniel; Johnson, Justin; Pastorello, Sergio; Rhodes, Andrew; XU, Lei; Wildenbeest, Matthijs
  15. Applications of artificial intelligence in supply chain management: Identification of main research fields and greatest industry interests By Lechtenberg, Sandra; Hellingrath, Bernd
  16. Deep Learning Classification: Modeling Discrete Labor Choice By Maliar, Lilia; Maliar, Serguei
  17. From Man vs. Machine to Man + Machine: The Art and AI of Stock Analyses By Sean Cao; Wei Jiang; Junbo L. Wang; Baozhong Yang
  18. Reassessing the Resource Curse using Causal Machine Learning By Hodler, Roland; Lechner, Michael; Raschky, Paul A.
  19. Expanding the Measurement of Culture with a Sample of Two Billion Humans By Awad, Edmond; Cebrián, Manuel; Cuevas Rumin, Angel; Cuevas Rumin, Ruben; Desmet, Klaus; Martín, Ignacio; Obradovich, Nick; Ortuño-Ortín, Ignacio; Ozak, Omer; Rahwan, Iyad
  20. Using social network and semantic analysis to analyze online travel forums and forecast tourism demand By A Fronzetti Colladon; B Guardabascio; R Innarella
  21. The Race of Man and Machine: Implications of Technology When Abilities and Demand Constraints Matter By Gries, Thomas; Naudé, Wim
  22. Bad machines corrupt good morals By Köbis, Nils; Bonnefon, Jean-François; Rahwan, Iyad
  23. Flexible Work Arrangements in Low Wage Jobs: Evidence from Job Vacancy Data By Adams-Prassl, Abigail; Balgova, Maria; Qian, Matthias
  24. Collateral Damage: The Legacy of the Secret War in Laos By Riaño, Juan Felipe; Valencia Caicedo, Felipe
  25. Identifying residential consumption patterns using data-mining techniques: A large-scale study of smart meter data in Chengdu, China By Kang, J.; Reiner, D.

  1. By: Babii, Andrii; Chen, Xi; Ghysels, Eric; Kumar, Rohit
    Abstract: The importance of asymmetries in prediction problems arising in economics has been recognized for a long time. In this paper, we focus on binary choice problems in a data-rich environment with general loss functions. In contrast to asymmetric regression problems, binary choice with general loss functions and high-dimensional datasets is challenging and not well understood. Econometricians have studied binary choice problems for a long time, but the literature does not offer computationally attractive solutions in data-rich environments. In contrast, the machine learning literature has many computationally attractive algorithms that form the basis for many of the automated procedures implemented in practice, but it is focused on symmetric loss functions that are independent of individual characteristics. One of the main contributions of our paper is to show that theoretically valid predictions of binary outcomes with arbitrary loss functions can be achieved via a very simple reweighting of logistic regression, or of other state-of-the-art machine learning techniques, such as boosting or (deep) neural networks. We apply our analysis to racial justice in pretrial detention.
    Date: 2020–10
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15418&r=
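    Illustrative sketch (not the authors' code): the reweighting result above can be mimicked in a few lines of Python by absorbing an asymmetric cost ratio into observation weights passed to an off-the-shelf logistic regression. The simulated data, cost ratio and penalty below are hypothetical.
      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.linear_model import LogisticRegression

      # Simulated stand-in for a high-dimensional dataset (the paper's application is
      # pretrial detention; nothing here uses the actual data).
      X, y = make_classification(n_samples=2000, n_features=50, n_informative=10, random_state=0)

      cost_fn, cost_fp = 5.0, 1.0                    # hypothetical: false negatives 5x costlier
      weights = np.where(y == 1, cost_fn, cost_fp)   # asymmetric loss absorbed into sample weights

      clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
      clf.fit(X, y, sample_weight=weights)
      print(clf.predict(X[:5]))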
  2. By: Chen, Shu; Gambacorta, Leonardo; Huang, Yiping; Li, Zhenhua; Qiu, Han
    Abstract: The use of massive amounts of data by large technology firms (big techs) to assess firms' creditworthiness could reduce the need for collateral in solving asymmetric information problems in credit markets. Using a unique dataset of more than 2 million Chinese firms that received credit from both an important big tech firm (Ant Group) and traditional commercial banks, this paper investigates how different forms of credit correlate with local economic activity, house prices and firm characteristics. We find that big tech credit does not correlate with local business conditions and house prices when controlling for demand factors, but reacts strongly to changes in firm characteristics, such as transaction volumes and network scores used to calculate firm credit ratings. By contrast, both secured and unsecured bank credit react significantly to local house prices, which incorporate useful information on the environment in which clients operate and on their creditworthiness. This evidence implies that a greater use of big tech credit, granted on the basis of machine learning and big data, could reduce the importance of collateral in credit markets and potentially weaken the financial accelerator mechanism.
    Keywords: asymmetric information; banks; Big Data; big tech; Collateral; credit markets
    JEL: D22 G31 R30
    Date: 2020–09
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15262&r=
  3. By: Combes, Pierre-Philippe; Gobillon, Laurent; Zylberberg, Yanos
    Abstract: A recent literature has used a historical perspective to better understand fundamental questions of urban economics. However, a wide range of historical documents of exceptional quality remain underutilised: their use has been hampered by their original format or by the massive amount of information to be recovered. In this paper, we describe how and when the flexibility and predictive power of machine learning can help researchers exploit the potential of these historical documents. We first discuss how important questions of urban economics rely on the analysis of historical data sources and the challenges associated with transcription and harmonisation of such data. We then explain how machine learning approaches may address some of these challenges and we discuss possible applications.
    Keywords: History; Machine Learning; Urban Economics
    JEL: C45 C81 N90 R11 R12 R14
    Date: 2020–09
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15308&r=
  4. By: Philippe Baumard (ESD R3C - Équipe Sécurité & Défense - Renseignement, Criminologie, Crises, Cybermenaces - CNAM - Conservatoire National des Arts et Métiers [CNAM])
    Abstract: This article explores the feasibility of machines inventing and theorizing organizations. Most machine learning models are automated statistical processes that barely achieve formal induction. In that sense, most current learning models do not generate new theories but, instead, recognize a pre-existing order of symbols, signs or data. Most human theories are embodied and incarnated: they spring from an organic connection to the world, which theorists can hardly escape. This article is organized in three parts. First, we study the history of artificial intelligence, from its foundation in the 19th century to its recent evolution, to understand what an artificial intelligence would be able to do in terms of theorization. This leads us, in a second step, to question the act of scientific production in order to identify what can be considered a human act and what can be the subject of modelling and autonomous learning by a machine; the objective here is to assess the feasibility of substituting machines for humans in producing research. In a third and final part, we propose four modes of theoretical exploration that are already the work of machines, or that could in the future see a complete substitution of humans by machines. We conclude by raising several questions about the future of research in organizational theory and its utility, human or machine, for organizations and society.
    Abstract (translated from the French): This article explores the possibility that a machine intelligence could theorize organizations, and that it could do so better than human intelligence in the near future. Most machine learning models are automated statistical processes that are barely capable of formal induction and do not generate new theories, but rather recognize a pre-existing order. Human theories are embodied; they arise from an organic connection to the world from which theorists cannot escape. This article considers how to overcome this obstacle in order to welcome a theoretical revolution brought about by AI.
    Keywords: AI, artificial intelligence, organization theory, sociology of knowledge, cognitive theory
    Date: 2019–11
    URL: http://d.repec.org/n?u=RePEc:hal:journl:hal-03218196&r=
  5. By: Filippou, Ilias; Rapach, David; Taylor, Mark P; Zhou, Guofu
    Abstract: We establish the out-of-sample predictability of monthly exchange rate changes via machine learning techniques based on 70 predictors capturing country characteristics, global variables, and their interactions. To guard against overfitting, we use the elastic net to estimate a high-dimensional panel predictive regression and find that the resulting forecast consistently outperforms the naive no-change benchmark, which has proven difficult to beat in the literature. The forecast also markedly improves the performance of a carry trade portfolio, especially during and after the global financial crisis. When we allow for more complex deep learning models, nonlinearities do not appear substantial in the data.
    Keywords: carry trade; deep neural network; Elastic Net; exchange rate predictability
    JEL: C45 F31 F37 G11 G12 G15
    Date: 2020–09
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15305&r=
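    Illustrative sketch (not the authors' code): a pseudo out-of-sample elastic-net forecast of exchange rate changes, evaluated against the naive no-change benchmark. The 70 predictors and the tuning grid below are simulated placeholders, not the paper's data.
      import numpy as np
      from sklearn.linear_model import ElasticNetCV

      rng = np.random.default_rng(0)
      X = rng.normal(size=(1200, 70))          # country-month rows, 70 standardized predictors
      y = X[:, :5] @ rng.normal(size=5) * 0.01 + rng.normal(scale=0.02, size=1200)  # fx changes

      split = 1000                             # pseudo out-of-sample split
      model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5)
      model.fit(X[:split], y[:split])

      forecast = model.predict(X[split:])
      benchmark = np.zeros_like(forecast)      # naive no-change benchmark
      r2_oos = 1 - np.sum((y[split:] - forecast) ** 2) / np.sum((y[split:] - benchmark) ** 2)
      print("out-of-sample R^2 vs no-change benchmark:", round(r2_oos, 3))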
  6. By: Petra Posedel \v{S}imovi\'c; Davor Horvatic; Edward W. Sun
    Abstract: Using big data to analyze consumer behavior can provide effective decision-making tools for preventing customer attrition (churn) in customer relationship management (CRM). Focusing on a CRM dataset with several different categories of factors that impact customer heterogeneity (i.e., usage of self-care service channels, duration of service, and responsiveness to marketing actions), we provide new predictive analytics of customer churn rate based on a machine learning method that enhances the classification of logistic regression by adding a mixed penalty term. The proposed penalized logistic regression can prevent overfitting when dealing with big data and minimize the loss function when balancing the cost from the median (absolute value) and mean (squared value) regularization. We show the analytical properties of the proposed method and its computational advantage in this research. In addition, we investigate the performance of the proposed method with a CRM data set (that has a large number of features) under different settings by efficiently eliminating the disturbance of (1) least important features and (2) sensitivity from the minority (churn) class. Our empirical results confirm the expected performance of the proposed method in full compliance with the common classification criteria (i.e., accuracy, precision, and recall) for evaluating machine learning methods.
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2105.07671&r=
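    Illustrative sketch (not the authors' code): the mixed absolute-value/squared penalty can be approximated with an elastic-net-penalized logistic regression in scikit-learn, with class weights to dampen the minority (churn) class imbalance. All data and hyperparameters below are hypothetical.
      from sklearn.datasets import make_classification
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import classification_report

      # Simulated imbalanced churn data: roughly 10% of customers churn.
      X, y = make_classification(n_samples=5000, n_features=40, weights=[0.9, 0.1], random_state=1)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

      clf = LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                               C=1.0, class_weight="balanced", max_iter=5000)
      clf.fit(X_tr, y_tr)
      print(classification_report(y_te, clf.predict(X_te)))  # accuracy, precision, recall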
  7. By: Johnson, Justin; Rhodes, Andrew; Wildenbeest, Matthijs
    Abstract: Using both economic theory and Artificial Intelligence (AI) pricing algorithms, we investigate the ability of a platform to design its marketplace to promote competition, improve consumer surplus, and even raise its own profits. We allow sellers to use Q-learning algorithms (a common reinforcement-learning technique from the computer-science literature) to devise pricing strategies in a setting with repeated interactions, and consider the effect of platform rules that reward firms that cut prices with additional exposure to consumers. Overall, the evidence from our experiments suggests that platform design decisions can meaningfully benefit consumers even when algorithmic collusion might otherwise emerge but that achieving these gains may require more than the simplest steering policies when algorithms value the future highly. We also find that policies that raise consumer surplus can raise the profits of the platform, depending on the platform's revenue model. Finally, we document several learning challenges faced by the algorithms.
    Keywords: Algorithms; artificial intelligence; Collusion; platform design
    JEL: K21 L00
    Date: 2020–11
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15504&r=
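    Illustrative sketch (not the authors' environment): a bare-bones tabular Q-learning duopoly in which each seller conditions on last period's prices, assuming a five-point price grid and a toy linear demand system.
      import numpy as np

      rng = np.random.default_rng(0)
      prices = np.linspace(1.0, 2.0, 5)                       # discrete price grid
      n = len(prices)
      Q = [np.zeros((n, n, n)) for _ in range(2)]             # Q[i][last_p0, last_p1, action]
      alpha, gamma, eps = 0.1, 0.95, 0.05
      state = (0, 0)

      def profits(p0, p1):
          # toy linear demand: the relatively cheaper seller attracts more buyers
          d0 = max(0.0, 1.0 - p0 + 0.5 * p1)
          d1 = max(0.0, 1.0 - p1 + 0.5 * p0)
          return p0 * d0, p1 * d1

      for t in range(100_000):
          acts = []
          for i in range(2):
              if rng.random() < eps:
                  acts.append(int(rng.integers(n)))           # explore
              else:
                  acts.append(int(np.argmax(Q[i][state])))    # exploit
          r = profits(prices[acts[0]], prices[acts[1]])
          new_state = (acts[0], acts[1])
          for i in range(2):
              best_next = np.max(Q[i][new_state])
              Q[i][state + (acts[i],)] += alpha * (r[i] + gamma * best_next - Q[i][state + (acts[i],)])
          state = new_state

      print("long-run prices:", prices[np.argmax(Q[0][state])], prices[np.argmax(Q[1][state])])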
  8. By: Dave Cliff
    Abstract: I describe the rationale for, and design of, an agent-based simulation model of a contemporary online sports-betting exchange: such exchanges, closely related to the exchange mechanisms at the heart of major financial markets, have revolutionized the gambling industry in the past 20 years, but gathering sufficiently large quantities of rich and temporally high-resolution data from real exchanges - i.e., the sort of data that is needed in large quantities for Deep Learning - is often very expensive, and sometimes simply impossible; this creates a need for a plausibly realistic synthetic data generator, which is what this simulation now provides. The simulator, named the "Bristol Betting Exchange" (BBE), is intended as a common platform, a data-source and experimental test-bed, for researchers studying the application of AI and machine learning (ML) techniques to issues arising in betting exchanges; and, as far as I have been able to determine, BBE is the first of its kind: a free open-source agent-based simulation model consisting not only of a sports-betting exchange, but also a minimal simulation model of racetrack sporting events (e.g., horse-races or car-races) about which bets may be made, and a population of simulated bettors who each form their own private evaluation of odds and place bets on the exchange before and - crucially - during the race itself (i.e., so-called "in-play" betting) and whose betting opinions change second-by-second as each race event unfolds. BBE is offered as a proof-of-concept system that enables the generation of large high-resolution data-sets for automated discovery or improvement of profitable strategies for betting on sporting events via the application of AI/ML and advanced data analytics techniques. This paper offers an extensive survey of relevant literature and explains the motivation and design of BBE, and presents brief illustrative results.
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2105.08310&r=
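    Illustrative sketch (a drastic simplification, not BBE itself): bettors hold noisy private estimates of a two-horse race outcome that they update tick by tick as the race unfolds, and quote decimal odds accordingly.
      import math
      import random

      random.seed(0)
      n_bettors, n_ticks = 50, 100
      lead = 0.0                                    # horse A's lead over horse B

      for t in range(n_ticks):
          lead += random.gauss(0.0, 1.0)            # the race unfolds as a random walk
          true_p = 1.0 / (1.0 + math.exp(-0.3 * lead))          # P(horse A wins)
          quotes = []
          for _ in range(n_bettors):
              belief = min(0.99, max(0.01, true_p + random.gauss(0.0, 0.05)))  # noisy private view
              quotes.append(1.0 / belief)           # decimal odds the bettor would back A at
          best_odds = min(quotes)                   # shortest odds come from the most optimistic bettor
          if t % 20 == 0:
              print(f"tick {t:3d}  market-implied P(A wins) = {1.0 / best_odds:.2f}")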
  9. By: M. Elshendy; A. Fronzetti Colladon; E. Battistoni; P. A. Gloor
    Abstract: This study looks for signals of economic awareness on online social media and tests their significance in economic predictions. The study analyses, over a period of two years, the relationship between the West Texas Intermediate daily crude oil price and multiple predictors extracted from Twitter, Google Trends, Wikipedia, and the Global Data on Events, Language, and Tone database (GDELT). Semantic analysis is applied to study the sentiment, emotionality and complexity of the language used. Autoregressive Integrated Moving Average with Explanatory Variable (ARIMAX) models are used to make predictions and to confirm the value of the study variables. Results show that the combined analysis of the four media platforms carries valuable information for financial forecasting. Twitter language complexity, GDELT number of articles and Wikipedia page reads have the highest predictive power. This study also allows a comparison of the different fore-sighting abilities of each platform, in terms of how many days ahead a platform can predict a price movement before it happens. In comparison with previous work, more media sources and more dimensions of the interaction and of the language used are combined in a joint analysis.
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2105.09154&r=
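    Illustrative sketch (not the authors' code): an ARIMAX specification via statsmodels' SARIMAX, with random placeholder series standing in for the Twitter, GDELT, Wikipedia and Google Trends indicators.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      T = 500
      exog = rng.normal(size=(T, 3))                      # placeholder media indicators
      price = np.cumsum(rng.normal(size=T)) + 60 + exog @ np.array([0.3, -0.2, 0.1])

      model = sm.tsa.SARIMAX(price, exog=exog, order=(1, 1, 1))
      fit = model.fit(disp=False)
      print(fit.summary().tables[1])                      # coefficients on the exogenous drivers

      # One-step-ahead forecast requires the next day's exogenous values.
      next_exog = rng.normal(size=(1, 3))
      print(fit.forecast(steps=1, exog=next_exog))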
  10. By: Davis, Steven J; Hansen, Stephen; Seminario-Amez, Cristhian
    Abstract: Firm-level stock returns differ enormously in reaction to COVID-19 news. We characterize these reactions using the Risk Factors discussions in pre-pandemic 10-K filings and two text-analytic approaches: expert-curated dictionaries and supervised machine learning (ML). Bad COVID-19 news lowers returns for firms with high exposures to travel, traditional retail, aircraft production and energy supply -- directly and via downstream demand linkages -- and raises them for firms with high exposures to healthcare policy, e-commerce, web services, drug trials and materials that feed into supply chains for semiconductors, cloud computing and telecommunications. Monetary and fiscal policy responses to the pandemic strongly impact firm-level returns as well, but differently than pandemic news. Despite methodological differences, dictionary and ML approaches yield remarkably congruent return predictions. Importantly though, ML operates on a vastly larger feature space, yielding richer characterizations of risk exposures and outperforming the dictionary approach in goodness-of-fit. By integrating elements of both approaches, we uncover new risk factors and sharpen our explanations for firm-level returns. To illustrate the broader utility of our methods, we also apply them to explain firm-level returns in reaction to the March 2020 Super Tuesday election results.
    Date: 2020–09
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15314&r=
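    Illustrative sketch of the dictionary approach only (the terms, text and scoring rule below are hypothetical): each firm's Risk Factors text is scored by the share of tokens that hit an expert-curated exposure dictionary.
      import re

      dictionaries = {
          "travel": {"airline", "hotel", "passenger", "booking"},
          "e_commerce": {"online", "e-commerce", "delivery", "marketplace"},
      }

      def exposure_scores(risk_factors_text):
          tokens = re.findall(r"[a-z\-]+", risk_factors_text.lower())
          n = max(1, len(tokens))
          return {k: sum(t in terms for t in tokens) / n for k, terms in dictionaries.items()}

      sample = "Demand for airline seats and hotel rooms fell, while online delivery volumes grew."
      print(exposure_scores(sample))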
  11. By: Mohsen Asgari; Hossein Khasteh
    Abstract: AI and data-driven solutions have been applied to different fields with strong and promising results. In this research work we apply k-Nearest Neighbours, eXtreme Gradient Boosting and Random Forest classifiers to the direction-detection problem in three cryptocurrency markets. Our input data include price data and technical indicators. We use these classifiers to design a strategy to trade in those markets. Our test results on unseen data show great potential for this approach in helping investors with an expert system to exploit the market and gain profit. Our highest gain for an unseen 66-day span is 860 dollars per 1,800 dollars of investment. We also discuss the limitations of these approaches and their potential implications for the Efficient Market Hypothesis.
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2105.06827&r=
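    Illustrative sketch (not the authors' code): direction detection with the three classifier families named above; scikit-learn's GradientBoostingClassifier stands in for XGBoost here, and the "technical indicators" are simple rolling statistics on a simulated price series.
      import numpy as np
      import pandas as pd
      from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
      from sklearn.neighbors import KNeighborsClassifier

      rng = np.random.default_rng(0)
      price = pd.Series(np.cumsum(rng.normal(size=1500)) + 100)

      feats = pd.DataFrame({
          "ret_1": price.pct_change(),
          "ma_gap": price / price.rolling(20).mean() - 1,   # price vs 20-period moving average
          "vol_10": price.pct_change().rolling(10).std(),
      })
      target = (price.shift(-1) > price).astype(int)        # next-period direction
      data = pd.concat([feats, target.rename("up")], axis=1).dropna()

      split = 1200
      X_tr, y_tr = data.iloc[:split, :-1], data.iloc[:split, -1]
      X_te, y_te = data.iloc[split:, :-1], data.iloc[split:, -1]

      for clf in (KNeighborsClassifier(15), GradientBoostingClassifier(), RandomForestClassifier(200)):
          print(type(clf).__name__, round(clf.fit(X_tr, y_tr).score(X_te, y_te), 3))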
  12. By: Goutham Gopalakrishna (Swiss Finance Institute (EPFL); Ecole Polytechnique Fédérale de Lausanne)
    Abstract: I develop a new computational framework called Actively Learned and Informed Equilibrium Nets (ALIENs) to solve continuous time economic models with endogenous state variables and highly non-linear policy functions. I employ neural networks that are trained to solve supervised learning problems that respect the laws governing the economic system, expressed as general parabolic partial differential equations. The economic information is encoded as regularizers that discipline the deep neural network in the learning process. The sub-domain of the high-dimensional state space that carries the most economic information is learned actively in an iterative loop, forcing the random training points to be sampled from the areas that matter most to ensure convergence. I utilize a state-of-the-art distributed framework to train the network that speeds up computation time significantly. The method is applied to successfully solve a model of macro-finance that is notoriously difficult to handle using traditional finite difference schemes.
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp2134&r=
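    Illustrative sketch of the general idea only (a toy ODE, nothing like the paper's macro-finance PDE; PyTorch is assumed available): the residual of a differential equation enters the loss as a regularizer that disciplines the network.
      import torch
      import torch.nn as nn

      torch.manual_seed(0)
      net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
      opt = torch.optim.Adam(net.parameters(), lr=1e-3)

      for step in range(3000):
          x = torch.rand(256, 1, requires_grad=True)           # sampled training points
          f = net(x)
          df = torch.autograd.grad(f, x, grad_outputs=torch.ones_like(f), create_graph=True)[0]
          residual = df + f                                    # residual of f'(x) = -f(x)
          boundary = net(torch.zeros(1, 1)) - 1.0              # boundary condition f(0) = 1
          loss = (residual ** 2).mean() + (boundary ** 2).mean()
          opt.zero_grad(); loss.backward(); opt.step()

      print(net(torch.tensor([[1.0]])).item(), "vs exp(-1) =", torch.exp(torch.tensor(-1.0)).item())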
  13. By: Haoran Wang; Shi Yu
    Abstract: Machine Learning (ML) has been embraced as a powerful tool by the financial industry, with notable applications spreading in various domains including investment management. In this work, we propose a full-cycle data-driven investment robo-advising framework, consisting of two ML agents. The first agent, an inverse portfolio optimization agent, infers an investor's risk preference and expected return directly from historical allocation data using online inverse optimization. The second agent, a deep reinforcement learning (RL) agent, aggregates the inferred sequence of expected returns to formulate a new multi-period mean-variance portfolio optimization problem that can be solved using deep RL approaches. The proposed investment pipeline is applied to real market data from April 1, 2016 to February 1, 2021 and is shown to consistently outperform the S&P 500 benchmark portfolio that represents the aggregate market optimal allocation. The outperformance may be attributed to the multi-period planning (versus single-period planning) and the data-driven RL approach (versus classical estimation approach).
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2105.09264&r=
  14. By: Assad, Stephanie; Calvano, Emilio; Calzolari, Giacomo; Clark, Robert; Denicolò, Vincenzo; Ershov, Daniel; Johnson, Justin; Pastorello, Sergio; Rhodes, Andrew; XU, Lei; Wildenbeest, Matthijs
    Abstract: Markets are being populated with new generations of pricing algorithms, powered by Artificial Intelligence, that are able to learn autonomously how to operate. This ability can be both a source of efficiency and a cause of concern, given the risk that algorithms autonomously and tacitly learn to collude. In this paper we explore recent developments in the economic literature and discuss implications for policy.
    Keywords: Algorithmic Pricing; Antitrust; Competition Policy; Artificial Intelligence; Collusion; Platforms.
    JEL: D42 D82 L42
    Date: 2021–03
    URL: http://d.repec.org/n?u=RePEc:tse:wpaper:125584&r=
  15. By: Lechtenberg, Sandra; Hellingrath, Bernd
    Abstract: Advances in the area of computing power, data storage capabilities, etc., are changing the way business is done, particularly regarding how businesses use and apply artificial intelligence. To better understand how artificial intelligence is used in supply chain management, this paper identifies and compares the main research fields investigating this topic as well as the primary industry interests in it. For this, we performed a structured literature review that shows which methods of artificial intelligence are applied to which problems of supply chain management in the scientific literature. Then, we present industry-driven applications to provide an overview of fields that are most relevant to industry. Based on these results, indications for future research are derived.
    Keywords: artificial intelligence, supply chain management, logistics, applications, industry-driven
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:zbw:ercisw:37&r=
  16. By: Maliar, Lilia; Maliar, Serguei
    Abstract: We introduce a deep learning classification (DLC) method for analyzing equilibrium in discrete-continuous choice dynamic models. As an illustration, we apply the DLC method to solve a version of Krusell and Smith's (1998) heterogeneous-agent model with incomplete markets, borrowing constraint and indivisible labor choice. The novel feature of our analysis is that we construct discontinuous decision functions that tell us when the agent switches from one employment state to another, conditional on the economy's state. We use deep learning not only to characterize the discrete indivisible choice but also to perform model reduction and to deal with multicollinearity. Our TensorFlow-based implementation of DLC is tractable in models with thousands of state variables.
    Keywords: classification; deep learning; discrete choice; Indivisible labor; intensive and extensive margins; logistic regression; neural network
    Date: 2020–10
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15346&r=
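    Illustrative sketch (not the authors' code): a small TensorFlow/Keras classifier mapping state variables to a work/not-work probability; the training data are generated by an ad hoc rule, not by the Krusell-Smith model.
      import numpy as np
      import tensorflow as tf

      rng = np.random.default_rng(0)
      assets = rng.uniform(0, 10, size=(20_000, 1))
      productivity = rng.lognormal(size=(20_000, 1))
      X = np.hstack([assets, productivity])
      work = (2.0 * productivity.ravel() - 0.3 * assets.ravel()
              + rng.normal(scale=0.5, size=20_000) > 1.5).astype("float32")

      model = tf.keras.Sequential([
          tf.keras.layers.Dense(64, activation="relu", input_shape=(2,)),
          tf.keras.layers.Dense(64, activation="relu"),
          tf.keras.layers.Dense(1, activation="sigmoid"),    # P(work | state)
      ])
      model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
      model.fit(X, work, epochs=5, batch_size=256, verbose=0)

      # The fitted classifier traces out the discontinuous decision rule over the state space.
      print(model.predict(np.array([[2.0, 1.0], [8.0, 1.0]]), verbose=0))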
  17. By: Sean Cao; Wei Jiang; Junbo L. Wang; Baozhong Yang
    Abstract: An AI analyst we build to digest corporate financial information, qualitative disclosure and macroeconomic indicators is able to beat the majority of human analysts in stock price forecasts and generate excess returns compared to following human analysts. In the contest of “man vs machine,” the relative advantage of the AI Analyst is stronger when the firm is complex, and when information is high-dimensional, transparent and voluminous. Human analysts remain competitive when critical information requires institutional knowledge (such as the nature of intangible assets). The edge of the AI over human analysts declines over time when analysts gain access to alternative data and to in-house AI resources. Combining AI’s computational power and the human art of understanding soft information produces the highest potential in generating accurate forecasts. Our paper portrays a future of “machine plus human” (instead of human displacement) in high-skill professions.
    JEL: G11 G12 G14 G31 M41
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:28800&r=
  18. By: Hodler, Roland; Lechner, Michael; Raschky, Paul A.
    Abstract: We reassess the effects of natural resources on economic development and conflict, applying a causal forest estimator and data from 3,800 Sub-Saharan African districts. We find that, on average, mining activities and higher world market prices of locally mined minerals both increase economic development and conflict. Consistent with the previous literature, mining activities have more positive effects on economic development and weaker effects on conflict in places with low ethnic diversity and high institutional quality. In contrast, the effects of changes in mineral prices vary little with ethnic diversity and institutional quality, but are non-linear and largest at relatively high prices.
    Keywords: Africa; Causal machine learning; conflict; economic development; mining; resource curse
    JEL: C21 O13 O55 Q34 R12
    Date: 2020–09
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15272&r=
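    Illustrative sketch (not necessarily the estimator or data used in the paper): a causal forest for heterogeneous treatment effects via the econml package, with simulated district-level data.
      import numpy as np
      from econml.dml import CausalForestDML
      from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

      rng = np.random.default_rng(0)
      n = 3800
      X = rng.normal(size=(n, 2))                       # e.g. ethnic diversity, institutional quality
      W = rng.normal(size=(n, 3))                       # additional district controls
      T = rng.binomial(1, 0.3, size=n)                  # mining activity (treatment)
      Y = 0.5 * T * (1 + X[:, 1]) + W @ np.array([0.2, -0.1, 0.05]) + rng.normal(size=n)

      est = CausalForestDML(model_y=RandomForestRegressor(), model_t=RandomForestClassifier(),
                            discrete_treatment=True, random_state=0)
      est.fit(Y, T, X=X, W=W)
      print(est.effect(X[:5]))                          # district-level treatment-effect estimates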
  19. By: Awad, Edmond; Cebrián, Manuel; Cuevas Rumin, Angel; Cuevas Rumin, Ruben; Desmet, Klaus; Martín, Ignacio; Obradovich, Nick; Ortuño-Ortín, Ignacio; Ozak, Omer; Rahwan, Iyad
    Abstract: Culture has played a pivotal role in human evolution. Yet, the ability of social scientists to study culture is limited by the currently available measurement instruments. Scholars of culture must regularly choose between scalable but sparse survey-based methods or restricted but rich ethnographic methods. Here, we demonstrate that massive online social networks can advance the study of human culture by providing quantitative, scalable, and high-resolution measurement of behaviorally revealed cultural values and preferences. We employ publicly available data across nearly 60,000 topic dimensions drawn from two billion Facebook users across 225 countries and territories. We first validate that cultural distances calculated from this measurement instrument correspond to traditional survey-based and objective measures of cross-national cultural differences. We then demonstrate that this expanded measure enables rich insight into the cultural landscape globally at previously impossible resolution. We analyze the importance of national borders in shaping culture, explore unique cultural markers that identify subnational population groups, and compare subnational divisiveness to gender divisiveness across countries. The global collection of massive data on human behavior provides a high-dimensional complement to traditional cultural metrics. Further, the granularity of the measure presents enormous promise to advance scholars' understanding of additional fundamental questions in the social sciences. The measure enables detailed investigation into the geopolitical stability of countries, social cleavages within both small and large-scale human groups, the integration of migrant populations, and the disaffection of certain population groups from the political process, among myriad other potential future applications.
    Keywords: Cultural distance; Culture; gender differences; identity; Regional Culture; Subnational Differences
    JEL: C80 F1 J1 O10 R10 Z10
    Date: 2020–09
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15315&r=
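    Illustrative sketch (toy numbers, not the Facebook data): one simple way to turn interest-share vectors defined over many topic dimensions into pairwise cultural distances is the cosine distance.
      import numpy as np

      def cosine_distance(u, v):
          return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

      rng = np.random.default_rng(0)
      topics = 1000                                     # the paper uses ~60,000 topic dimensions
      country_shares = {c: rng.dirichlet(np.ones(topics)) for c in ["A", "B", "C"]}

      for a, b in [("A", "B"), ("A", "C"), ("B", "C")]:
          print(a, b, round(cosine_distance(country_shares[a], country_shares[b]), 4))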
  20. By: A Fronzetti Colladon; B Guardabascio; R Innarella
    Abstract: Forecasting tourism demand has important implications for both policy makers and companies operating in the tourism industry. In this research, we applied methods and tools of social network and semantic analysis to study user-generated content retrieved from online communities which interacted on the TripAdvisor travel forum. We analyzed the forums of 7 major European capital cities, over a period of 10 years, collecting more than 2,660,000 posts, written by about 147,000 users. We present a new methodology of analysis of tourism-related big data and a set of variables which could be integrated into traditional forecasting models. We implemented Factor Augmented Autoregressive and Bridge models with social network and semantic variables which often led to better forecasting performance than univariate models and models based on Google Trends data. Forum language complexity and the centralization of the communication network, i.e. the presence of eminent contributors, were the variables that contributed most to the forecasting of international airport arrivals.
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2105.07727&r=
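    Illustrative sketch (not the authors' code): a factor-augmented autoregression in which principal-component factors extracted from a block of forum-based indicators enter alongside a lag of arrivals; all series below are simulated.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(0)
      T = 120                                              # monthly observations
      indicators = rng.normal(size=(T, 12))                # placeholder forum variables
      arrivals = 100 + np.cumsum(rng.normal(size=T)) + 2 * indicators[:, 0]

      factors = PCA(n_components=2).fit_transform(indicators)
      X = np.column_stack([arrivals[:-1], factors[1:]])    # lagged arrivals + current factors
      y = arrivals[1:]

      model = LinearRegression().fit(X[:-12], y[:-12])     # hold out the last 12 months
      print("pseudo out-of-sample R^2:", round(model.score(X[-12:], y[-12:]), 3))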
  21. By: Gries, Thomas (University of Paderborn); Naudé, Wim (University College Cork)
    Abstract: In "The Race between Man and Machine: Implications of Technology for Growth, Factor Shares, and Employment," Acemoglu and Restrepo (2018b) combine the task-based model of the labor market with an endogenous growth model to model the economic consequences of artificial intelligence (AI). This paper provides an alternative endogenous growth model that addresses two shortcomings of their model. First, we replace the assumption of a representative household with the premise of two groups of households with different preferences. This allows our model to be demand constrained and able to model the consequences of higher income inequality due to AI. Second, we model AI as providing abilities, arguing that "abilities" better characterises the nature of the services that AI provide, rather than tasks or skills. The dynamics of the model regarding the impact of AI on jobs, inequality, wages, labor productivity and long-run GDP growth are explored.
    Keywords: technology, artificial intelligence, productivity, labor demand, income distribution, growth theory
    JEL: O47 O33 J24 E21 E25
    Date: 2021–04
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp14341&r=
  22. By: Köbis, Nils; Bonnefon, Jean-François; Rahwan, Iyad
    Abstract: Machines powered by Artificial Intelligence (AI) are now influencing the behavior of humans in ways that are both like and unlike the ways humans influence each other. In light of recent research showing that other humans can exert a strong corrupting influence on people’s ethical behavior, worry emerges about the corrupting power of AI agents. To estimate the empirical validity of these fears, we review the available evidence from behavioral science, human-computer interaction, and AI research. We propose that the main social roles through which both humans and machines can influence ethical behavior are (a) role model, (b) advisor, (c) partner, and (d) delegate. When AI agents become influencers (role models or advisors), their corrupting power may not exceed (yet) the corrupting power of humans. However, AI agents acting as enablers of unethical behavior (partners or delegates) have many characteristics that may let people reap unethical benefits while feeling good about themselves, indicating good reasons for worry. Based on these insights, we outline a research agenda that aims at providing more behavioral insights for better AI oversight.
    Keywords: machine behavior; behavioral ethics; corruption; artificial intelligence
    Date: 2021–05
    URL: http://d.repec.org/n?u=RePEc:tse:wpaper:125602&r=
  23. By: Adams-Prassl, Abigail; Balgova, Maria; Qian, Matthias
    Abstract: In this paper, we analyze firm demand for flexible jobs by exploiting the language used to describe work arrangements in job vacancies. We take a supervised machine learning approach to classify the work arrangements described in more than 46 million UK job vacancies. We highlight the existence of very different types of flexibility amongst low and high wage vacancies. Job flexibility at low wages is more likely to be offered alongside a wage contract that exposes workers to earnings risk, while flexibility at higher wages and in more skilled occupations is more likely to be offered alongside a fixed salary that shields workers from earnings variation. We show that firm demand for flexible work arrangements is partly driven by a desire to reduce labor costs; we find that a large and unexpected change to the minimum wage led to a 7 percentage point increase in the proportion of flexible and non-salaried vacancies at low wages.
    Keywords: job vacancies; Labour Demand; labour market flexibility; minimum wage
    JEL: C45 C81 J21 J23 J32 J33
    Date: 2020–09
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15263&r=
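    Illustrative sketch (the labelled examples are invented; the paper trains on a much richer labelled set): a generic supervised text-classification pipeline for work-arrangement labels.
      from sklearn.pipeline import make_pipeline
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression

      texts = [
          "zero hours contract, shifts vary week to week",
          "flexible working hours, fixed annual salary",
          "full time permanent role, 9 to 5, salaried",
          "casual work as and when required, paid hourly",
      ]
      labels = ["flexible_non_salaried", "flexible_salaried", "not_flexible", "flexible_non_salaried"]

      clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
      clf.fit(texts, labels)
      print(clf.predict(["hours to suit the business, hourly pay"]))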
  24. By: Riaño, Juan Felipe; Valencia Caicedo, Felipe
    Abstract: As part of its Cold War counterinsurgency operations in Southeast Asia, the U.S. government conducted a "Secret War" in Laos from 1964 to 1973. This war constituted one of the most intensive bombing campaigns in human history. As a result, Laos is now severely contaminated with UXO (Unexploded Ordnance) and remains one of the poorest countries in the world. In this paper we document the negative long-term impact of conflict on economic development, using highly disaggregated and newly available data on bombing campaigns, satellite imagery and development outcomes. We find a negative, significant and economically meaningful impact of bombings on nighttime lights, expenditures and poverty rates. Almost 50 years after the conflict officially ended, bombed regions are poorer today and are growing at slower rates than unbombed areas. A one standard deviation increase in the total pounds of bombs dropped is associated with a 9.3% fall in GDP per capita. To deal with the potential endogeneity of bombing, we use as instruments the distance to the Vietnamese Ho Chi Minh Trail as well as to US military airbases outside Laos. Using census data at the village and individual levels, we show the deleterious impact of UXOs in terms of health, as well as education, structural transformation and rural-urban migration.
    Keywords: Cold War; conflict; Development; growth; health; Human Capital; Laos; migration; structural transformation; UXO
    JEL: D74 N10 N15 O10 O53
    Date: 2020–10
    URL: http://d.repec.org/n?u=RePEc:cpr:ceprdp:15349&r=
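    Illustrative sketch (simulated data, not the paper's dataset or specification; the linearmodels package is assumed available): the two-stage least squares step with distance to the Ho Chi Minh Trail as the instrument for bombing intensity.
      import numpy as np
      import pandas as pd
      from linearmodels.iv import IV2SLS

      rng = np.random.default_rng(0)
      n = 1000
      dist_trail = rng.uniform(0, 300, n)                 # km to the Ho Chi Minh Trail
      bombs = np.maximum(0, 50 - 0.15 * dist_trail + rng.normal(scale=5, size=n))
      nightlights = 10 - 0.093 * bombs + rng.normal(size=n)

      df = pd.DataFrame({"nightlights": nightlights, "bombs": bombs,
                         "dist_trail": dist_trail, "const": 1.0})
      res = IV2SLS(df["nightlights"], df[["const"]], df[["bombs"]], df[["dist_trail"]]).fit()
      print(res.summary)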
  25. By: Kang, J.; Reiner, D.
    Abstract: The fine-grained electricity consumption data created by advanced metering technologies offers an opportunity to understand residential demand from new angles. Although there exists a large body of research on demand response in short- and long-term forecasting, a comprehensive analysis to identify household consumption behaviour in different scenarios has not been conducted. The study’s novelty lies in its use of unsupervised machine learning tools to explore residential customers’ demand patterns and response without the assistance of traditional survey tools. We investigate behavioural response in three different contexts: 1) seasonal (using weekly consumption profiles); 2) holidays/festivals; and 3) extreme weather situations. The analysis is based on the smart metering data of 2,000 households in Chengdu, China over three years from 2014 to 2016. Workday/weekend profiles indicate that there are two distinct groups of households that appear to be white-collar or relatively affluent families. Demand patterns at the major festivals in China, especially the Spring Festival, reveal various types of lifestyle and households. In terms of extreme weather response, the most striking finding was that in summer, at night-time, over 72% of households doubled (or more) their electricity usage, while consumption changes in winter do not seem to be significant. Our research offers more detailed insight into Chinese residential consumption and provides a practical framework to understand households’ behaviour patterns in different settings.
    Keywords: Residential electricity, household consumption behaviour, China, machine learning
    JEL: C55 D12 R22 Q41
    Date: 2021–05–12
    URL: http://d.repec.org/n?u=RePEc:cam:camdae:2143&r=
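    Illustrative sketch (simulated profiles, not the Chengdu smart-meter data): k-means clustering of normalized weekly load profiles, an unsupervised step of the kind described above.
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.preprocessing import normalize

      rng = np.random.default_rng(0)
      hours = np.arange(168)                                       # one week of hourly readings
      evening_peak = 1 + 0.8 * np.exp(-((hours % 24 - 20) ** 2) / 8)
      night_heavy = 1 + 0.8 * np.exp(-((hours % 24 - 2) ** 2) / 8)
      profiles = np.vstack([
          evening_peak * rng.uniform(0.8, 1.2, size=(1000, 1)) + rng.normal(scale=0.05, size=(1000, 168)),
          night_heavy * rng.uniform(0.8, 1.2, size=(1000, 1)) + rng.normal(scale=0.05, size=(1000, 168)),
      ])

      X = normalize(profiles)                                      # compare shapes, not consumption levels
      km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
      print(np.bincount(km.labels_))                               # households per behaviour cluster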

This nep-big issue is ©2021 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.