nep-big New Economics Papers
on Big Data
Issue of 2022‒11‒07
forty-two papers chosen by
Tom Coupé
University of Canterbury

  1. Sentiment Analysis on Inflation after Covid-19 By Xinyu Li; Zihan Tang
  2. What makes a satisfying life? Prediction and interpretation with machine-learning algorithms By Andrew E. Clark; Conchita D'Ambrosio; Niccolo Gentile; Alexandre Tkatchenko
  3. A Survey: Credit Sentiment Score Prediction By A. N. M. Sajedul Alam; Junaid Bin Kibria; Arnob Kumar Dey; Zawad Alam; Shifat Zaman; Motahar Mahtab; Mohammed Julfikar Ali Mahbub; Annajiat Alim Rasel
  4. Where Are All the Jobs ? A Machine Learning Approach for High Resolution Urban Employment Prediction in Developing Countries By Barzin,Samira; Avner,Paolo; Maruyama Rentschler,Jun Erik; O’Clery,Neave
  5. Impact of Increasing Firms' Consumer Demand Perceptions on Market Outcomes By TANAKA Kenta; HIGASHIDA Keisaku; MANAGI Shunsuke
  6. Preparation, Practice, and Beliefs : A Machine Learning Approach to Understanding Teacher Effectiveness By Filmer,Deon P.; Nahata,Vatsal; Sabarwal,Shwetlena
  7. Human wellbeing and machine learning By Ekaterina Oparina; Caspar Kaiser; Niccolo Gentile; Alexandre Tkatchenko; Andrew E. Clark; Jan-Emmanuel De Neve; Conchita D'Ambrosio
  8. Minimax Optimal Kernel Operator Learning via Multilevel Training By Jikai Jin; Yiping Lu; Jose Blanchet; Lexing Ying
  9. DNN-ForwardTesting: A New Trading Strategy Validation using Statistical Timeseries Analysis and Deep Neural Networks By Ivan Letteri; Giuseppe Della Penna; Giovanni De Gasperis; Abeer Dyoub
  10. Glossary of human-centric artificial intelligence By ESTEVEZ ALMENZAR Marina; FERNANDEZ LLORCA David; GOMEZ Emilia; MARTINEZ PLUMED Fernando
  11. Algorithmic Trading Using Continuous Action Space Deep Reinforcement Learning By Naseh Majidi; Mahdi Shamsi; Farokh Marvasti
  12. Using Machine Learning to Promote Proactive Human Resources Management: A Case Study By Zakarya Laghzal; Lamya Temnati
  13. Impact of the Rapid Expansion of Renewable Energy on Electricity Market Price: Using machine learning and shapley additive explanation By LI Chao; MANAGI Shunsuke
  14. Embedding-based neural network for investment return prediction By Jianlong Zhu; Dan Xian; Fengxiao; Yichen Nie
  15. Forecasting Cryptocurrencies Log-Returns: a LASSO-VAR and Sentiment Approach By Federico D'Amario; Milos Ciganovic
  16. Impact of the Rapid Expansion of Renewable Energy on Electricity Market Price: Using machine learning and shapley additive explanation By SHIMOMURA Mizue; KEELEY Alexander Ryota; MATSUMOTO Ken'ichi; TANAKA Kenta; MANAGI Shunsuke
  17. Lowering Prices of Pharmaceuticals, Medical Supplies, and Equipment : Insights from Big Data for Better Procurement Strategies in Latin America By Fazekas,Mihály; Oliveira,Alexandre Borges De; Regös,Nóra
  18. Tweeting for money: Social media and mutual fund flows By Javier Gil-Bazo; Juan F. Imbet
  19. Survey Measurement Errors and the Assessment of the Relationship between Yields and Inputs in Smallholder Farming Systems : Evidence from Mali By Yacoubou Djima,Ismael; Kilic,Talip
  20. Debt Vulnerability Analysis : A Multi-Angle Approach By Doemeland,Doerte; Estevão,Marcello; Jooste,Charl; Sampi Bravo,James Robert Ezequiel; Tsiropoulos,Vasileios
  21. Estimating Food Price Inflation from Partial Surveys By Andree,Bo Pieter Johannes
  22. Non-Crossing Dual Neural Network: Joint Value at Risk and Conditional Tail Expectation estimations with non-crossing conditions By Xenxo Vidal-Llana; Carlos Salort Sánchez; Vincenzo Coia; Montserrat Guillen
  23. AI Watch: AI for enhancing Robotics. The intersection of Robotics with the AI landscape. By Riccardo Righi; Michail Papazoglou; Sofia Samoili; Miguel Vazquez-Prada Baillet; Melisande Cardona; Montserrat Lopez-Cobo; Giuditta De-Prato; Nestor Duch-Brown
  24. Drivers of Utilization, Quality of Care, and RMNCH-N Services in Bangladesh : A Comparative Analysis of Demand and Supply-Side Determinants Using Machine Learning for Investment Decision-Making By Gopalan,Saji Saraswathy; Mohammed-Roberts,Rianna L.; Zanetti Matarazzo,Hellen Chrystine
  25. Stock Volatility Prediction using Time Series and Deep Learning Approach By Ananda Chatterjee; Hrisav Bhowmick; Jaydip Sen
  26. The Role of Justice in Development : The Data Revolution By Ramos Maqueda,Manuel; Chen,Daniel Li
  27. MetaTrader: An Reinforcement Learning Approach Integrating Diverse Policies for Portfolio Optimization By Hui Niu; Siyuan Li; Jian Li
  28. Multiclass Sentiment Prediction for Stock Trading By Marshall R. McCraw
  29. The demand for language skills in the European labour market: Evidence from online job ads By Gabriele Marconi; Loris Vergolini
  30. Quantifying the role of interest rates, the Dollar and Covid in oil prices By Emanuel Kohlscheen
  31. The Impact of Visibility on School Athletic Finances: An Empirical Analysis using Google Trends By Behera, Sarthak; Sadana, Divya
  32. Nowcasting Global Poverty By Mahler,Daniel Gerszon; Castaneda Aguilar,Raul Andres; Newhouse,David Locke
  33. Measuring Quarterly Economic Growth from Outer Space By Beyer,Robert Carl Michael; Hu,Yingyao; Yao,Jiaxiong
  34. Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations By Nelson Vadori; Leo Ardon; Sumitra Ganesh; Thomas Spooner; Selim Amrouni; Jared Vann; Mengda Xu; Zeyu Zheng; Tucker Balch; Manuela Veloso
  35. The rise of China's technological power: the perspective from frontier technologies By Antonin Bergeaud; Cyril Verluise
  36. Personalization of Web Search During the 2020 US Elections By Ulrich Matter; Roland Hodler; Johannes Ladwig
  37. The Impact of Ethiopia’s Road Investment Program on Economic Development and Land Use : Evidence from Satellite Data By Alder,Simon; Croke,Kevin; Duhaut,Alice; Marty,Robert Andrew; Vaisey,Ariana Brynn
  38. Displacement and Return in the Internet Era : How Social Media Captures Migration Decisions in Northern Syria By Walk,Erin Elizabeth; Garimella,Kiran; Christia,Fotini
  39. Rohingya Refugee Camps and Forest Loss in Cox’s Bazar, Bangladesh : An Inquiry Using Remote Sensing and Econometric Approaches By Dampha,Nfamara K; Salemi,Colette; Polasky,Stephen
  40. Cultural homophily and collaboration in superstar teams By Gabor Bekes; Gianmarco I. P. Ottaviano
  41. Credit Information in Earnings Calls By Harry Mamaysky; Yiwen Shen; Hongyu Wu
  42. Urban CO2 Emissions : A Global Analysis with New Satellite Data By Dasgupta,Susmita; Lall,Somik V.; Wheeler,David R.

  1. By: Xinyu Li; Zihan Tang
    Abstract: Based on global tweets from 2017 to 2022, we implement traditional machine learning and deep learning methods to build high-frequency measures of the public's sentiment towards inflation and analyze their correlation with other online data sources, such as Google Trends and a market-oriented inflation index. First, we test several machine learning approaches using manually labelled tri-grams and choose a BERT model for our research. Second, we calculate an inflation sentiment index from the sentiment scores that the BERT model assigns to the tweets and analyse regional and pre/post-Covid patterns. Lastly, we take other online data sources on inflation into consideration and prove that the Twitter-based inflation sentiment analysis method has an outstanding capability to predict inflation. The results suggest that Twitter combined with deep learning methods can be a novel and timely way to utilise existing abundant data sources on inflation expectations and to provide daily and weekly indicators of consumers' perception of inflation.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2209.14737&r=
  2. By: Andrew E. Clark; Conchita D'Ambrosio; Niccolo Gentile; Alexandre Tkatchenko
    Abstract: Machine Learning (ML) methods are increasingly being used across a variety of fields and have led to the discovery of intricate relationships between variables. We here apply ML methods to predict and interpret life satisfaction using data from the UK British Cohort Study. We discuss the application of first Penalized Linear Models and then one non-linear method, Random Forests. We present two key model-agnostic interpretative tools for the latter method: Permutation Importance and Shapley Values. With a parsimonious set of explanatory variables, neither Penalized Linear Models nor Random Forests produce major improvements over the standard Non-penalized Linear Model. However, once we consider a richer set of controls these methods do produce a non-negligible improvement in predictive accuracy. Although marital status and emotional health continue to be the most important predictors of life satisfaction, as in the existing literature, gender becomes insignificant in the non-linear analysis.
    Keywords: life satisfaction, well-being, machine learning, British cohort study
    Date: 2022–06–07
    URL: http://d.repec.org/n?u=RePEc:cep:cepdps:dp1853&r=
  3. By: A. N. M. Sajedul Alam; Junaid Bin Kibria; Arnob Kumar Dey; Zawad Alam; Shifat Zaman; Motahar Mahtab; Mohammed Julfikar Ali Mahbub; Annajiat Alim Rasel
    Abstract: Manual approvals are still used by banks and other NGOs to approve loans. It takes time and is prone to mistakes because it is controlled by a bank employee. Several fields of machine learning mining technologies have been utilized to enhance various areas of credit rating forecast. A major goal of this research is to look at current sentiment analysis techniques that are being used to generate creditworthiness.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2209.15293&r=
  4. By: Barzin,Samira; Avner,Paolo; Maruyama Rentschler,Jun Erik; O’Clery,Neave
    Abstract: Globally, both people and economic activity are increasingly concentrated in urban areas. Yet, for the vast majority of developing country cities, little is known about the granular spatial organization of such activity despite its key importance to policy and urban planning. This paper adapts a machine learning based algorithm to predict the spatial distribution of employment using input data from open access sources such as Open Street Map and Google Earth Engine. The algorithm is trained on 14 test cities, ranging from Buenos Aires in Argentina to Dakar in Senegal. A spatial adaptation of the random forest algorithm is used to predict within-city cells in the 14 test cities with extremely high accuracy (R-squared greater than 95 percent), and cells in out-of-sample "unseen" cities with high accuracy (mean R-squared of 63 percent). This approach uses open data to produce high resolution estimates of the distribution of urban employment for cities where such information does not exist, making evidence-based planning more accessible than ever before.
    Date: 2022–03–22
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9979&r=
  5. By: TANAKA Kenta; HIGASHIDA Keisaku; MANAGI Shunsuke
    Abstract: The rapid evolution and spread of artificial intelligence (AI) and algorithms significantly improve companies’ recognition of consumer demands. AI and algorithmic big data analyses have been introduced into firms’ practical decision-making and marketing activities. However, there are insufficient empirical analyses available to determine the impact of improving a firm’s cognitive ability (via algorithmic data analyses) on actual market outcomes (price formation, each firm’s surplus, and social surplus). Using a laboratory experimental approach, this study examines the market outcomes, such as the degree of product differentiation and prices, when firms utilize an algorithmic demand-forecasting system in a duopoly. The results indicate that the forecasting system increases the cognitive abilities of the participants regarding their consumers’ preferences. Additionally, the introduction of the algorithmic demand-forecasting system increases the consumer surplus in the market.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:eti:dpaper:22095&r=
  6. By: Filmer,Deon P.; Nahata,Vatsal; Sabarwal,Shwetlena
    Abstract: This paper uses machine learning methods to identify key predictors of teacher effectiveness, proxied by student learning gains linked to a teacher over an academic year. Conditional inference forests and the least absolute shrinkage and selection operator are applied to matched student-teacher data for math and Kiswahili from grades 2 and 3 in 392 schools across Tanzania. These two machine learning methods produce consistent results and outperform standard ordinary least squares in out-of-sample prediction by 14–24 percent. As in previous research, commonly used teacher covariates like teacher gender, education, experience, and so forth are not good predictors of teacher effectiveness. Instead, teacher practice (what teachers do, measured through classroom observations and student surveys) and teacher beliefs (measured through teacher surveys) emerge as much more important. Overall, teacher covariates are stronger predictors of teacher effectiveness in math than in Kiswahili. Teacher beliefs that they can help disadvantaged and struggling students learn (for math) and they have good relationships within schools (for Kiswahili), teacher practice of providing written feedback and reviewing key concepts at the end of class (for math), and spending extra time with struggling students (for Kiswahili) are highly predictive of teacher effectiveness. As is teacher preparation on how to teach foundational topics (for both Math and Kiswahili). These results demonstrate the need to pay more systematic attention to teacher preparation, practice, and beliefs in teacher research and policy.
    Date: 2021–11–15
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9847&r=
  7. By: Ekaterina Oparina; Caspar Kaiser; Niccolo Gentile; Alexandre Tkatchenko; Andrew E. Clark; Jan-Emmanuel De Neve; Conchita D'Ambrosio
    Abstract: There is a vast literature on the determinants of subjective wellbeing. International organisations and statistical offices are now collecting such survey data at scale. However, standard regression models explain surprisingly little of the variation in wellbeing, limiting our ability to predict it. In response, we here assess the potential of Machine Learning (ML) to help us better understand wellbeing. We analyse wellbeing data on over a million respondents from Germany, the UK, and the United States. In terms of predictive power, our ML approaches perform better than traditional models. Although the size of the improvement is small in absolute terms, it is substantial when compared to that of key variables like health. We moreover find that drastically expanding the set of explanatory variables doubles the predictive power of both OLS and the ML approaches on unseen data. The variables identified as important by our ML algorithms - i.e. material conditions, health, and meaningful social relations - are similar to those that have already been identified in the literature. In that sense, our data-driven ML results validate the findings from conventional approaches.
    Keywords: subjective wellbeing, prediction methods, machine learning
    Date: 2022–07–20
    URL: http://d.repec.org/n?u=RePEc:cep:cepdps:dp1863&r=
  8. By: Jikai Jin; Yiping Lu; Jose Blanchet; Lexing Ying
    Abstract: Learning mappings between infinite-dimensional function spaces has achieved empirical success in many disciplines of machine learning, including generative modeling, functional data analysis, causal inference, and multi-agent reinforcement learning. In this paper, we study the statistical limit of learning a Hilbert-Schmidt operator between two infinite-dimensional Sobolev reproducing kernel Hilbert spaces. We establish the information-theoretic lower bound in terms of the Sobolev Hilbert-Schmidt norm and show that a regularization that learns the spectral components below the bias contour and ignores the ones that are above the variance contour can achieve the optimal learning rate. At the same time, the spectral components between the bias and variance contours give us flexibility in designing computationally feasible machine learning algorithms. Based on this observation, we develop a multilevel kernel operator learning algorithm that is optimal when learning linear operators between infinite-dimensional function spaces.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2209.14430&r=
  9. By: Ivan Letteri; Giuseppe Della Penna; Giovanni De Gasperis; Abeer Dyoub
    Abstract: In general, traders test their trading strategies by applying them on the historical market data (backtesting), and then apply to the future trades the strategy that achieved the maximum profit on such past data. In this paper, we propose a new trading strategy, called DNN-forwardtesting, that determines the strategy to apply by testing it on the possible future predicted by a deep neural network that has been designed to perform stock price forecasts and trained with the market historical data. In order to generate such an historical dataset, we first perform an exploratory data analysis on a set of ten securities and, in particular, analyze their volatility through a novel k-means-based procedure. Then, we restrict the dataset to a small number of assets with the same volatility coefficient and use such data to train a deep feed-forward neural network that forecasts the prices for the next 30 days of open stock market. Finally, our trading system calculates the most effective technical indicator by applying it to the DNNs predictions and uses such indicator to guide its trades. The results confirm that neural networks outperform classical statistical techniques when performing such forecasts, and their predictions make it possible to select a trading strategy that, when applied to the real future, increases Expectancy, Sharpe, Sortino, and Calmar ratios with respect to the strategy selected through traditional backtesting.
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.11532&r=
  10. By: ESTEVEZ ALMENZAR Marina (European Commission - JRC); FERNANDEZ LLORCA David (European Commission - JRC); GOMEZ Emilia (European Commission - JRC); MARTINEZ PLUMED Fernando (European Commission - JRC)
    Abstract: Over the last few years, Artificial Intelligence (AI) has become a very active research topic, moving from a purely technical field to an interdisciplinary research domain and a very active topic in terms of policy developments. The European approach for AI focuses on two main areas: excellence and trust, enabling the development and uptake of AI while ensuring people’s safety and fundamental rights. However, research and policy documentations do not always use the same vocabulary, often generating misunderstandings among researchers, policy makers, and the general public. Based on existing literature in the intersection between research, industry and policy, and given the expertise and know-how developed at the European Commission’s Joint Research Centre, we present here a glossary of terms on AI, with a focus on a human-centric approach, covering concepts related to trustworthy artificial intelligence such as transparency, accountability or fairness. We have collected 230 different terms from more than 10 different general sources including standards, policy documents and legal texts, as well as multiple scientific references. Each term is accompanied by one or several definitions linked to references and complemented with our own definitions when no relevant source was found. We humbly hope that the work presented here can contribute to establishing the necessary common ground for the interdisciplinary and policy-centred debate on artificial intelligence.
    Keywords: artificial intelligence, human-centric AI, digital transformation
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:ipt:iptwpa:jrc129614&r=
  11. By: Naseh Majidi; Mahdi Shamsi; Farokh Marvasti
    Abstract: Price movement prediction has always been one of the traders' concerns in financial market trading. In order to increase their profit, they can analyze the historical data and predict the price movement. The large size of the data and complex relations between them lead us to use algorithmic trading and artificial intelligence. This paper aims to offer an approach using Twin-Delayed DDPG (TD3) and the daily close price in order to achieve a trading strategy in the stock and cryptocurrency markets. Unlike previous studies using a discrete action space reinforcement learning algorithm, the TD3 is continuous, offering both position and the number of trading shares. Both the stock (Amazon) and cryptocurrency (Bitcoin) markets are addressed in this research to evaluate the performance of the proposed algorithm. The achieved strategy using the TD3 is compared with some algorithms using technical analysis, reinforcement learning, stochastic, and deterministic strategies through two standard metrics, Return and Sharpe ratio. The results indicate that employing both position and the number of trading shares can improve the performance of a trading system based on the mentioned metrics.
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.03469&r=
  12. By: Zakarya Laghzal (ENCG El Jadida, UCD - Université Chouaib Doukkali); Lamya Temnati (UCD - Université Chouaib Doukkali)
    Abstract: Since the Industrial Revolution, the human resources (HR) function has undergone several changes. Today it is asked to adopt a long-term, proactive approach that anticipates the needs of the company and the problems likely to affect its productivity and performance, in order to implement long-term adaptation actions and be effective in its strategic approach. Meanwhile, machine learning has for some time been a trending technology that has seen massive use in many areas. According to a study conducted by IT decision makers from over 15 different business sectors in the UK, France, Germany and Spain, 87% of the sample have implemented this technology or plan to do so. This technology has allowed companies to improve their processes, increase their competitiveness and support decision-making in several management areas such as finance and marketing. This work seeks to highlight the potential of this technology to promote proactive management of human resources by using machine learning algorithms to analyze turnover and predict which employees are likely to leave their jobs, using an IBM corporate database published as part of a competition for the development of an internal model to identify employees intending to leave their jobs. The results of this study show that this technology can play a crucial role in the proactive management of human resources by providing information that makes it possible to act proactively and anticipate actions related to human resources management.
    Keywords: machine learning, proactive management of human resources, turnover
    Date: 2022–05–31
    URL: http://d.repec.org/n?u=RePEc:hal:journl:hal-03787323&r=
  13. By: LI Chao; MANAGI Shunsuke
    Abstract: The positive effects of greenness in living environments on human well-being are known. As a widely used proxy, the nighttime light (NTL) indicates the regional socio-economic status and development level. Higher development levels and economic status are related to more opportunity and higher income, ultimately leading to greater human well-being. However, whether simple increases in greenness and NTL always produce positive results remains inconclusive. Here, we demonstrate the complex relationships between human well-being and greenness and NTL by employing the random forest method. The accuracy of this model is 81.83%, exceeding most previous studies. According to the analysis results, the recommended ranges of greenness and NTL in living environments are 10.91% - 32.99% and 0 - 17.92 nW/cm²·sr, respectively. Moreover, the current average monetary values of greenness and NTL are 3351.96 USD/% and 658.11 USD/(nW/cm²·sr), respectively. The residential areas are far away from the abundant natural resources, which makes the main population desire more greenness in their living environments. Furthermore, high urban development density, represented by NTL, has caused adverse effects on human well-being in metropolitan areas. Therefore, retaining a moderate development intensity is an effective way to achieve a sustainable society and improve human well-being.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:eti:dpaper:22093&r=
  14. By: Jianlong Zhu; Dan Xian; Fengxiao; Yichen Nie
    Abstract: In addition to familiarity with policies, high investment returns require extensive knowledge of the relevant industry and news. It is also necessary to leverage relevant investment theories when making decisions, thereby amplifying investment returns. An effective investment return estimate can indicate the future rate of return of an investment. In recent years, deep learning has developed rapidly, and investment return prediction based on deep learning has become an emerging research topic. This paper proposes an embedding-based dual branch approach to predict an investment's return. This approach leverages embedding to encode the investment id into a low-dimensional dense vector, thereby mapping high-dimensional data to a low-dimensional manifold, so that high-dimensional features can be represented competitively. In addition, the dual branch model decouples features by separately encoding different information in the two branches, and the swish activation function further improves model performance. Our approach is validated on the Ubiquant Market Prediction dataset. The results demonstrate the superiority of our approach compared to XGBoost, LightGBM and CatBoost.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.00876&r=
  15. By: Federico D'Amario; Milos Ciganovic
    Abstract: Cryptocurrencies have become a trendy topic recently, primarily due to their disruptive potential and reports of unprecedented returns. In addition, academics increasingly acknowledge the predictive power of Social Media in many fields and, more specifically, for financial markets and economics. In this paper, we leverage the predictive power of Twitter and Reddit sentiment together with Google Trends indexes and volume to forecast the log returns of ten cryptocurrencies. Specifically, we consider Bitcoin, Ethereum, Tether, Binance Coin, Litecoin, Enjin Coin, Horizen, Namecoin, Peercoin, and Feathercoin. We evaluate the performance of LASSO-VAR using daily data from January 2018 to January 2022. In a 30-day recursive forecast, we can retrieve the correct direction of the actual series more than 50% of the time. We compare this result with the main benchmarks, and we see a 10% improvement in Mean Directional Accuracy (MDA). The use of sentiment and attention variables as predictors significantly increases the forecast accuracy in terms of MDA but not in terms of Root Mean Squared Errors. We perform a Granger causality test using a post-double LASSO selection for high-dimensional VARs. Results show no "causality" from Social Media sentiment to cryptocurrencies returns.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.00883&r=
  16. By: SHIMOMURA Mizue; KEELEY Alexander Ryota; MATSUMOTO Ken'ichi; TANAKA Kenta; MANAGI Shunsuke
    Abstract: The increase in variable renewable energy (VRE) has brought significant changes in the power system, including a decrease in the average electricity market price owing to the merit order effect (MOE). In this study, we use machine learning and Shapley additive explanation (SHAP) to comprehensively examine the drivers of market price volatility, including the interaction between VRE and demand, fuel prices, and operation capacity, in the Japanese electricity market, in which solar power installation is expanding rapidly. The SHAP results reveal that solar power strongly lowers the market price during daytime; however, the effect varies depending on the time of day, season, and demand. In addition, the results suggest that the market price increases when demand is high and solar generation is low, such as during summer evenings, which may be because of natural gas generation with higher marginal costs. The study reveals that expanded VRE will not only produce the MOE, which decreases average market prices, but may also prompt structural changes in electricity supply, causing market instability and price spikes in the transition process.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:eti:dpaper:22090&r=
  17. By: Fazekas,Mihály; Oliveira,Alexandre Borges De; Regös,Nóra
    Abstract: Containing rapidly growing health care costs in the Latin American and the Caribbean region, especially amid the COVID-19 pandemic, requires an in-depth analysis of prices from a novel perspective. This paper documents hitherto understudied variations in prices paid for pharmaceuticals, equipment, and medical supplies within countries and markets. It also identifies effective procurement strategies for lowering prices within existing regulatory frameworks. The analysis uses public procurement data gathered by governments’ electronic procurement systems in nine countries and territories across the region. The data are uniquely detailed and complete, encompassing the minute detail of purchasing decisions and processes made across all regulated public entities in the study countries and territories. Traditional regression analysis and machine learning (random forests) methods are used to explain prices as a function of procurement decisions and outputs, such as the number of bidders. Based on in-depth discussions with policy makers, the paper also devises realistic policy interventions, which in turn can be used to estimate savings scenarios. First, the findings show that the prices paid vary greatly across and within countries. The latter is surprising given that the regulatory and institutional framework is largely fixed within each country. Second, a high proportion of within-country and -market variation can be explained by standard features of procurement policy implementation, such as the length of advertising tenders. Third, the explanatory models point to the potential for lowering prices across the region by about 14 percent by implementing low-level, yet impactful changes to how purchasing is done.
    Keywords: Health Care Services Industry,Pharmaceuticals&Pharmacoeconomics,Pharmaceuticals Industry,Public Finance Decentralization and Poverty Reduction,Legal Reform,Legislation,Regulatory Regimes,Judicial System Reform,Social Policy,Public Sector Economics,Legal Products,Public Health Promotion
    Date: 2021–06–04
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9689&r=
  18. By: Javier Gil-Bazo; Juan F. Imbet
    Abstract: We investigate whether asset management firms use social media to persuade investors. Combining a database of almost 1.6 million Twitter posts by U.S. mutual fund families with textual analysis, we find that flows of money to mutual funds respond positively to tweets with a positive tone. Consistent with the persuasion hypothesis, positive tweets work best when they convey advice or views on the market and when investor sentiment is higher. Using a high-frequency approach, we are able to identify a short-lived impact of families' tweets on ETF share prices. Finally, we reject the alternative hypothesis that asset management companies use social media to alleviate information asymmetries by either lowering search costs or disclosing privately observed information.
    Keywords: Social media, Twitter, persuasion, mutual funds, mutual fund, flows, machine learning, textual analysis
    JEL: G11 G23 D83
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:upf:upfgen:1846&r=
  19. By: Yacoubou Djima,Ismael; Kilic,Talip
    Abstract: An accurate understanding of how input use affects agricultural productivity in smallholder farming systems is key to designing policies that can improve productivity, food security, and living standards in rural areas. Studies examining the relationships between agricultural productivity and inputs typically rely on land productivity measures, such as crop yields, that are informed by self-reported survey data on crop production. This paper leverages unique survey data from Mali to demonstrate that self-reported crop yields, vis-à-vis (objective) crop cut yields, are subject to non-classical measurement error that in turn biases the estimates of returns to inputs, including land, labor, fertilizer, and seeds. The analysis validates an alternative approach to estimate the relationship between crop yields and agricultural inputs using large-scale surveys, namely a within-survey imputation exercise that derives predicted, otherwise unobserved, objective crop yields that stem from a machine learning model that is estimated with a random subsample of plots for which crop cutting and self-reported yields are both available. Using data from a methodological survey experiment and a nationally representative survey conducted in Mali, the analysis demonstrates that it is possible to obtain predicted objective sorghum yields with attenuated non-classical measurement error, resulting in a less biased assessment of the relationship between yields and agricultural inputs. The discussion expands on the implications of the findings for (i) future research on agricultural intensification, and (ii) the design of future surveys in which objective data collection could be limited to a subsample to save costs, with the intention to apply the suggested machine learning approach.
    Keywords: Crops and Crop Management Systems,Climate Change and Agriculture,Food Security,Gender and Development,Labor & Employment Law,Agricultural Economics
    Date: 2021–11–05
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9841&r=
  20. By: Doemeland,Doerte; Estevão,Marcello; Jooste,Charl; Sampi Bravo,James Robert Ezequiel; Tsiropoulos,Vasileios
    Abstract: Countries with high debt exposure are vulnerable to economic and financial shocks that could lead to sovereign defaults. This paper develops a methodology to identify countries that are at risk of debt default based on four elements of debt vulnerability. These elements capture the different ways in which risks associated with high debt are assessed, namely: (i) the fundamental, (ii) the subjective, (iii) the judgmental, and (iv) the theoretical. The fundamental element considers the liquidity, solvency, and institutional risk elements of debt vulnerability. The subjective element captures the investors’ perceptions of debt default, while the judgmental element is based on the debt thresholds as defined by Debt Sustainability Frameworks. Finally, the theoretical element is normative and captures what ought to be. The methodology constructs an index for each of these four elements and uses them as predictors in a model of public debt default. The methodology flags countries that are at risk of default by means of machine learning techniques and delivers outputs that point to underlying causes of vulnerability. The methodology complements existing monitoring tools for assessing debt sustainability.
    Keywords: Financial Sector Policy,Economic Adjustment and Lending,Macro-Fiscal Policy,Public Sector Economics,Public Finance Decentralization and Poverty Reduction,Inequality,Industrial Economics,Economic Theory & Research,Economic Growth,International Trade and Trade Rules
    Date: 2022–02–07
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9929&r=
  21. By: Andree,Bo Pieter Johannes
    Abstract: The traditional consumer price index is often produced at an aggregate level, using data from few, highly urbanized, areas. As such, it poorly describes price trends in rural or poverty-stricken areas, where large populations may reside in fragile situations. Traditional price data collection also follows a deliberate sampling and measurement process that is not well suited for monitoring during crisis situations, when price stability may deteriorate rapidly. To gain real-time insights beyond what can be formally measured by traditional methods, this paper develops a machine-learning approach for imputation of ongoing subnational price surveys. The aim is to monitor inflation at the market level, relying only on incomplete and intermittent survey data. The capabilities are highlighted using World Food Programme surveys in 25 fragile and conflict-affected countries where real-time monthly food price data are not publicly available from official sources. The results are made available as a data set that covers more than 1200 markets and 43 food types. The local statistics provide a new granular view on important inflation events, including the World Food Price Crisis of 2007–08 and the surge in global inflation following the 2020 pandemic. The paper finds that imputations often achieve accuracy similar to direct measurement of prices. The estimates may provide new opportunities to investigate local price dynamics in markets where prices are sensitive to localized shocks and traditional data are not available.
    Keywords: Inflation,Nutrition,Food Security,Inequality,International Trade and Trade Rules
    Date: 2021–12–16
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9886&r=
  22. By: Xenxo Vidal-Llana (Universitat de Barcelona. Gran Via de les Corts Catalanes 585. 08007 Barcelona, Spain.); Carlos Salort Sánchez (Universitat de Barcelona. Gran Via de les Corts Catalanes 585. 08007 Barcelona, Spain.); Vincenzo Coia (University of British Columbia. West Mall 2329. Vancouver, BC Canada.); Montserrat Guillen (Gran Via de les Corts Catalanes 585. 08007 Barcelona, Spain.)
    Abstract: When datasets present long conditional tails on their response variables, algorithms based on Quantile Regression have been widely used to assess extreme quantile behaviors. Value at Risk (VaR) and Conditional Tail Expectation (CTE) make the evaluation of extreme events easily interpretable. The state-of-the-art methodologies to estimate VaR and CTE controlled by covariates are mainly based on linear quantile regression, and usually do not take into account non-crossing conditions across VaRs and their associated CTEs. We implement a non-crossing neural network that estimates both statistics simultaneously for several quantile levels while enforcing a list of non-crossing conditions. We illustrate our method with a household energy consumption dataset from 2015 for quantile levels 0.9, 0.925, 0.95, 0.975 and 0.99, and show its improvements against a Monotone Composite Quantile Regression Neural Network approximation.
    Keywords: Risk evaluation, Deep learning, Extreme quantiles. JEL classification: C31, C45, C52.
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:ira:wpaper:202215&r=
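For readers unfamiliar with the two statistics in the abstract above, the following is a minimal sketch of the plain empirical (unconditional) VaR and CTE. It is not the authors' neural network and uses no covariates. The function name `var_cte` and the sample data are hypothetical, and the CTE is taken here as the mean of all observations at or above the VaR, which is one common convention.

```python
import math

def var_cte(losses, level):
    """Empirical VaR (the `level` quantile) and CTE (mean of the tail at or above the VaR)."""
    xs = sorted(losses)
    n = len(xs)
    # Index of the empirical quantile; round() guards against float noise in level * n.
    k = max(0, min(n - 1, math.ceil(round(level * n, 9)) - 1))
    var = xs[k]
    tail = xs[k:]
    return var, sum(tail) / len(tail)

# Hypothetical sample: losses 1, 2, ..., 100.
sample = list(range(1, 101))
levels = [0.9, 0.925, 0.95, 0.975, 0.99]  # quantile levels quoted in the abstract
estimates = [var_cte(sample, q) for q in levels]
```

With empirical estimators the non-crossing conditions hold automatically (VaR is non-decreasing in the level, and each CTE is at least its VaR). The abstract's point is that models which condition on covariates have to enforce these conditions explicitly.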
  23. By: Riccardo Righi (European Commission - JRC); Michail Papazoglou (European Commission - JRC); Sofia Samoili (European Commission - JRC); Miguel Vazquez-Prada Baillet (European Commission - JRC); Melisande Cardona (European Commission - JRC); Montserrat Lopez-Cobo (European Commission - JRC); Giuditta De-Prato (European Commission - JRC); Nestor Duch-Brown (European Commission - JRC)
    Abstract: This report provides insights on the composition and status of the worldwide AI-enhanced Robotics landscape, with a specific focus on the EU landscape. Based on the JRC Techno-Economic Segment analytical approach, we detect economic players active in the AI-enhanced Robotics global landscape. We identify not only AI players directly involved in core Autonomous Robotics, which is the most technologically advanced thematic area of the AI Techno-Economic Segment thematic areas, but also those players that produce or commercialize products and services related to AI-enhanced robotics. As a result of a bottom-up exercise with a supply perspective, we identify three main subdomains: ‘AI enhanced Robotics Industry’, ‘AI technological support for Robotics’, and ‘AI enhanced Robotics Research & Innovation: Publications and Projects’. We also observe how the presence of ancillary B2B services integrating AI-enhanced Robots in existing economic activities is emerging. The report findings confirm that the US, China and EU27 are leading in the AI-Enhanced Robotics landscape. The EU27 shows an important specialization in most of the thematic areas. Considering EU Member States individually, the highest numbers of players are concentrated in Germany, France, Spain and Italy. Other countries such as Belgium, Netherlands, Austria, Denmark, Finland and Sweden also have strong positions in the ranking of specific thematic areas.
    Keywords: artificial intelligence, Robotics, techno-economic landscape
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:ipt:iptwpa:jrc128846&r=
  24. By: Gopalan,Saji Saraswathy; Mohammed-Roberts,Rianna L.; Zanetti Matarazzo,Hellen Chrystine
    Abstract: Amid noticeable improvements and achievements in the reproductive, maternal, neonatal, child health, and nutrition landscape in Bangladesh, existing evidence suggests that further accelerated progress hinges on strategic investment decision making. Addressing the top service utilization determinants that are both context- and time-specific is one cost-effective way of improving the unmet reproductive, maternal, neonatal, child health, and nutrition outcomes in a short timeframe. Against this backdrop, using machine learning analysis, the overall aim of this study was to help Bangladesh identify priority investment areas that could accelerate reproductive, maternal, neonatal, child health, and nutrition utilization, quality, and outcomes over the short run, by comparing the relative importance of demand- and supply-side determinants of key reproductive, maternal, neonatal, child health, and nutrition indicators over the past decade (across two time points). Two rounds of the Bangladesh Health Facility Survey and the Demographic and Health Survey (2014 and 2017) were analyzed. The findings indicate that the relative importance of the demand-side determinants (except wealth and education status) has recently declined. Conversely, investments in key supply-side determinants (for example, availability of skilled staff, readiness for care, and quality of care) could provide a thrust toward further increases in utilization. Immediate attention is needed to address the regressive role of wealth status on utilization through, for example, demand-side financing that goes beyond user fee exemptions. Further, developing strategies to improve the engagement of community health workers in reproductive, maternal, neonatal, child health, and nutrition utilization and tapping into the potential of mobile health technology to support community health workers’ performance and women’s awareness could help to boost utilization patterns.
    Keywords: Health Care Services Industry,Nutrition,Educational Sciences,Pharmaceuticals Industry,Pharmaceuticals&Pharmacoeconomics
    Date: 2021–09–24
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9783&r=
  25. By: Ananda Chatterjee; Hrisav Bhowmick; Jaydip Sen
    Abstract: Volatility clustering is a crucial property that has a substantial impact on stock market patterns. Nonetheless, developing robust models for accurately predicting future stock price volatility is a difficult research topic. For predicting the volatility of three equities listed on India's national stock market (NSE), we propose multiple volatility models based on the generalized autoregressive conditional heteroskedasticity (GARCH), Glosten-Jagannathan-Runkle GARCH (GJR-GARCH), exponential GARCH (EGARCH), and LSTM frameworks. Sector-wise stocks have been chosen in our study. The sectors considered are banking, information technology (IT), and pharma. Stock price data from Jan 2017 to Dec 2021 were obtained from Yahoo Finance. Data from Jan 2017 to Dec 2020 were used for training, and data from 2021 were used for testing our models. The volatility predictions of the three GARCH-type models and the LSTM model are compared across the three sectors. The LSTM performed better in predicting volatility in pharma than in the banking and IT sectors. EGARCH performed better for the banking sector, while GJR-GARCH performed better for IT and pharma.
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.02126&r=
  26. By: Ramos Maqueda,Manuel; Chen,Daniel Li
    Abstract: This paper summarizes the empirical evidence on the role of justice in economic development, conflict, and trust in institutions. It finds that justice institutions play a significant role in economic development, particularly through their impact on credit markets and firm growth, the protection of vulnerable populations, their capacity to deter violence, and their influence over people’s trust in formal institutions. The paper then considers the promise of administrative data, machine learning, and randomized controlled trials to enhance the efficiency, access, and quality of justice. The paper concludes by discussing new avenues for research and the potential for data to improve the functioning of justice systems in the age of COVID-19.
    Keywords: Judicial System Reform,Law and Justice Institutions,Justice for the Poor,Crime and Society,Social Policy,Regulatory Regimes,Legal Reform,Legal Products,Legislation,Common Property Resource Development
    Date: 2021–06–29
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9720&r=
  27. By: Hui Niu; Siyuan Li; Jian Li
    Abstract: Portfolio management is a fundamental problem in finance. It involves periodic reallocations of assets to maximize the expected returns within an appropriate level of risk exposure. Deep reinforcement learning (RL) has been considered a promising approach to solving this problem owing to its strong capability in sequential decision making. However, due to the non-stationary nature of financial markets, applying RL techniques to portfolio optimization remains a challenging problem. Extracting trading knowledge from various expert strategies could be helpful for agents to accommodate the changing markets. In this paper, we propose MetaTrader, a novel two-stage RL-based approach for portfolio management, which learns to integrate diverse trading policies to adapt to various market conditions. In the first stage, MetaTrader incorporates an imitation learning objective into the reinforcement learning framework. Through imitating different expert demonstrations, MetaTrader acquires a set of trading policies with great diversity. In the second stage, MetaTrader learns a meta-policy to recognize the market conditions and decide on the most proper learned policy to follow. We evaluate the proposed approach on three real-world index datasets and compare it to state-of-the-art baselines. The empirical results demonstrate that MetaTrader significantly outperforms those baselines in balancing profits and risks. Furthermore, thorough ablation studies validate the effectiveness of the components in the proposed approach.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.01774&r=
  28. By: Marshall R. McCraw
    Abstract: Python was used to download and format NewsAPI article data relating to 400 publicly traded, low-cap biotech companies. Crowd-sourcing was used to label a subset of this data, which was then used to train and evaluate a variety of models to classify the public sentiment of each company. The best performing models were then used to show that trading entirely off public sentiment could provide market-beating returns.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.00870&r=
  29. By: Gabriele Marconi; Loris Vergolini
    Abstract: We investigate foreign language skill demand and its determinants with a novel dataset, the Web Intelligence Hub's Online Job Advertisement (OJA) database, with information on about 53 million ads posted in 2021 for jobs in Europe. This unique dataset has been built by crawling hundreds of job search engines and websites of public employment services, allowing us to identify foreign language requirements in OJAs at the NUTS-3 regional level. Moreover, we analyse how the demand for foreign languages varies at the occupational level in the European countries, as well as the possible macro factors (GDP; population density; participation rate in education and training; percentage of people employed in the high-tech sector and in the touristic sector) that could influence the request for foreign languages.
    Keywords: Language skills, Labour market, Occupational groups, NUTS, Online job ads, Eurostat, English, German, Chinese, French, Spanish, big data, web scraping
    JEL: J20 J24 R10
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:fbk:wpaper:2022-08&r=
  30. By: Emanuel Kohlscheen
    Abstract: This study analyses oil price movements through the lens of an agnostic random forest model, which is based on 1,000 regression trees. It shows that this highly disciplined, yet flexible computational model reduces in-sample root mean square errors (RMSEs) by 65% relative to a standard linear least square model that uses the same set of 11 explanatory factors. In forecasting exercises the RMSE reduction ranges between 51% and 68%, highlighting the relevance of non-linearities in oil markets. The results underscore the importance of incorporating financial factors into oil models: US interest rates, the dollar and the VIX together account for 39% of the models' RMSE reduction in the post-2010 sample, rising to 48% in the post-2020 sample. If Covid-19 is also considered as a risk factor, these shares become even larger.
    Keywords: dollar, forecasting, machine learning, oil, risk.
    JEL: C40 F30 Q40 Q41 Q47
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:bis:biswps:1040&r=
  31. By: Behera, Sarthak; Sadana, Divya
    Abstract: Many papers in the past literature provide evidence on the impact of athletic performance on various school outcomes. This paper uses the weekly college football poll by the organization Associated Press (AP), to investigate the effect of a college team ranked in top 25 on various school outcomes such as revenues and expenses of school, coaches’ salary, and enrollment. The college football poll also known as AP poll conducts weekly voting to assign the teams certain points based on which these teams are ranked. The results are twofold: First, I verify the visibility of a school using google trends by exploiting the discontinuity arising due to the points of 25th ranked versus 26th ranked team. And second, the results provide evidence of the impact of this visibility of being in top 25 on positive school outcomes.
    Keywords: College Football; Google Trends; School Finances; Enrollment
    JEL: I21 I22 Z00
    Date: 2022–06
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:114818&r=
  32. By: Mahler,Daniel Gerszon; Castaneda Aguilar,Raul Andres; Newhouse,David Locke
    Abstract: This paper evaluates different methods for nowcasting country-level poverty rates, including methods that apply statistical learning to large-scale country-level data obtained from the World Development Indicators and Google Earth Engine. The methods are evaluated by withholding measured poverty rates and determining how accurately the methods predict the held-out data. A simple approach that scales the last observed welfare distribution by a fraction of real GDP per capita growth, a method that departs slightly from current World Bank practice, performs nearly as well as models using statistical learning on 1,000+ variables. This GDP-based approach outperforms all models that predict poverty rates directly, even when the last survey is up to five years old. The results indicate that in this context, the additional complexity introduced by applying statistical learning techniques to a large set of variables yields only marginal improvements in accuracy.
    Keywords: Inequality,Labor & Employment Law,Food Security,Employment and Unemployment
    Date: 2021–11–01
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9860&r=
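The simple GDP-based approach described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the function name, the pass-through value of 0.85, the poverty line of 2.15, and the welfare figures are all hypothetical, and growth is treated as cumulative since the last survey.

```python
def nowcast_poverty_rate(welfare, gdp_pc_growth, passthrough, poverty_line):
    """Scale every household's welfare by 1 + passthrough * growth, then recount the poor.

    welfare        -- per-capita welfare values from the last observed survey
    gdp_pc_growth  -- cumulative real GDP per capita growth since that survey (0.10 = 10%)
    passthrough    -- fraction of GDP growth assumed to reach household welfare
    poverty_line   -- threshold in the same units as `welfare`
    """
    scale = 1.0 + passthrough * gdp_pc_growth
    scaled = [w * scale for w in welfare]
    return sum(w < poverty_line for w in scaled) / len(scaled)

# Hypothetical five-household survey.
last_survey = [1.0, 1.8, 2.0, 3.0, 5.0]
baseline = nowcast_poverty_rate(last_survey, 0.0, 0.85, 2.15)
nowcast = nowcast_poverty_rate(last_survey, 0.10, 0.85, 2.15)
```

Because every household is scaled by the same factor, the shape of the distribution is unchanged. Only its position relative to the poverty line moves.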
  33. By: Beyer,Robert Carl Michael; Hu,Yingyao; Yao,Jiaxiong
    Abstract: This paper presents a novel framework to estimate the elasticity between nighttime lights and quarterly economic activity. The relationship is identified by accounting for varying degrees of measurement errors in nighttime light data across countries. The elasticity is 1.55 for emerging markets and developing economies, with only small deviations across country groups and different model specifications. The paper uses a light-adjusted measure of quarterly economic activity to show that higher levels of development, statistical capacity, and voice and accountability are associated with more precise national accounts data. The elasticity allows quantification of subnational economic impacts. During the COVID-19 pandemic, regions with higher levels of development and population density experienced larger declines in economic activity.
    Keywords: Food Security,Industrial Economics,Economic Theory & Research,Economic Growth,International Trade and Trade Rules
    Date: 2022–01–06
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9893&r=
  34. By: Nelson Vadori; Leo Ardon; Sumitra Ganesh; Thomas Spooner; Selim Amrouni; Jared Vann; Mengda Xu; Zeyu Zheng; Tucker Balch; Manuela Veloso
    Abstract: We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market, for which the typical example is foreign exchange. We show how a suitable design of parameterized families of reward functions coupled with associated shared policy learning constitutes an efficient solution to this problem. Precisely, we show that our deep-reinforcement-learning-driven agents learn emergent behaviors relative to a wide spectrum of incentives encompassing profit-and-loss, optimal execution and market share, by playing against each other. In particular, we find that liquidity providers naturally learn to balance hedging and skewing as a function of their incentives, where the latter refers to setting their buy and sell prices asymmetrically as a function of their inventory. We further introduce a novel RL-based calibration algorithm which we found performed well at imposing constraints on the game equilibrium, both on toy and real market data.
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2210.07184&r=
  35. By: Antonin Bergeaud; Cyril Verluise
    Abstract: We use patent data to study the contribution of the US, Europe, China and Japan to frontier technology using automated patent landscaping. We find that China's contribution to frontier technology has become quantitatively similar to the US in the late 2010s while overcoming the European and Japanese contributions respectively. Although China still exhibits the stigmas of a catching up economy, these stigmas are on the downside. The quality of frontier technology patents published at the Chinese Patent Office has leveled up to the quality of patents published at the European and Japanese patent offices. At the same time, frontier technology patenting at the Chinese Patent Office seems to have been increasingly supported by domestic patentees, suggesting the build up of domestic capabilities.
    Keywords: frontier technologies, China, patent landscaping, machine learning, patents
    Date: 2022–10–14
    URL: http://d.repec.org/n?u=RePEc:cep:cepdps:dp1876&r=
  36. By: Ulrich Matter; Roland Hodler; Johannes Ladwig
    Abstract: Search engines play a central role in routing political information to citizens. The algorithmic personalization of search results by large search engines like Google implies that different users may be offered systematically different information. However, measuring the causal effect of user characteristics and behavior on search results in a politically relevant context is challenging. We set up a population of 150 synthetic internet users ("bots") who are randomly located across 25 US cities and are active for several months during the 2020 US Elections and their aftermath. These users differ in their browsing preferences and political ideology, and they build up realistic browsing and search histories. We run daily experiments in which all users enter the same election-related queries. Search results to these queries differ substantially across users. Google prioritizes previously visited websites and local news sites. Yet, it does not generally prioritize websites featuring the user's ideology.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2209.14000&r=
  37. By: Alder,Simon; Croke,Kevin; Duhaut,Alice; Marty,Robert Andrew; Vaisey,Ariana Brynn
    Abstract: This paper studies the impacts of the large-scale Road Sector Development Program in Ethiopia between 1997 and 2016 on local economic activity and land cover (urbanization and cropland). It exploits spatial and temporal variation in road upgrades across Ethiopia, together with high-resolution panel data derived from satellite imagery. The findings show that road upgrades contributed to increases in local economic activity, as proxied by nighttime lights and urban land area. However, there is significant heterogeneity in the results across baseline levels of economic activity. Specifically, gains from road upgrades are concentrated in areas with moderate-to-high initial levels of economic activity. By contrast, there was little, or even negative, growth in areas with low levels of initial economic activity. Finally, the findings show that road upgrades contributed to a reduction in cropland in areas with medium-to-high baseline nighttime lights. The results suggest that Ethiopia's ambitious road infrastructure development program overall increased local economic activity and urbanization, but that it also had important distributional implications that need to be taken into account when planning such infrastructure programs.
    Date: 2022–04–06
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:10000&r=
  38. By: Walk,Erin Elizabeth; Garimella,Kiran; Christia,Fotini
    Abstract: Starting in 2011, the Syrian civil war has resulted in the displacement of over 80% of the Syrian population. This paper analyzes how the widespread use of social media has recorded migration considerations for Syrian refugees using social media text and image data from three popular platforms (Twitter, Telegram, and Facebook). Leveraging survey data as a source of ground truth on the presence of IDPs and returnees, it uses topic modeling and image analysis to find that areas without return have a higher prevalence of violence-related discourse and images while areas with return feature content related to services and the economy. Building on these findings, the paper first uses mixed effects models to show that these results hold pre- and post-return as well as when migration is quantified as monthly population flows. Second, it leverages mediation analysis to find that discussion on social media mediates the relationship between violence and return in months where there are fewer violent events. Monitoring refugee return in war-prone areas is a complex task and social media may provide researchers, aid groups, and policymakers with tools for assessing return in areas where survey or other data is unavailable or difficult to obtain.
    Date: 2022–04–26
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:10024&r=
  39. By: Dampha,Nfamara K; Salemi,Colette; Polasky,Stephen
    Abstract: How do refugee camps impact the natural environment? This paper examines the case study of Cox’s Bazar, Bangladesh, a district that hosts nearly 1 million Rohingya refugees in refugee camps. Using spatially explicit data on land-use / land cover and proximity to a camp boundary, the paper quantifies land-use changes across the district over time. To evaluate the extent to which the camps triggered additional forest loss, the analysis calculates total forest loss in the district and uses a difference-in-difference model that compares areas 0–5 kilometers from a camp boundary (treatment) to areas 10–15 kilometers away (control). The findings show that the rate of forest loss intensified near camps relative to the control area. The analysis reveals that areas experiencing camp-stimulated reductions in forest cover are also experiencing faster settlement expansion relative to the control area. Settlement expansion is largely concentrated in areas outside protected areas. This enhanced settlement expansion still occurs when pixels 0–1 kilometer from the camps are omitted, which is evidence that the results are not due to camp settlements expanding beyond the official camp borders. The results suggest that camps stimulate in-migration as Bangladeshis seek new economic opportunities and improved access to resources.
    Keywords: Post Conflict Reconstruction,Social Cohesion,Hydrology
    Date: 2022–02–28
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9948&r=
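As a rough illustration of the difference-in-difference comparison mentioned in the abstract above, the sketch below computes the textbook two-group, two-period estimate from group means. It is not the paper's regression model (which may include controls and fixed effects), and the forest-cover numbers are invented purely for illustration.

```python
from statistics import fmean

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Change in the treatment group minus change in the control group."""
    treat_change = fmean(treat_post) - fmean(treat_pre)
    ctrl_change = fmean(ctrl_post) - fmean(ctrl_pre)
    return treat_change - ctrl_change

# Hypothetical forest-cover shares (percent) per pixel group, before and after camp expansion.
near_pre, near_post = [60.0, 62.0], [48.0, 50.0]   # 0-5 km from a camp boundary (treatment)
far_pre, far_post = [61.0, 63.0], [57.0, 59.0]     # 10-15 km away (control)
effect = did_estimate(near_pre, near_post, far_pre, far_post)
```

A negative value means forest cover fell by more near the camps than in the control band, which is the direction of the effect the abstract reports.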
  40. By: Gabor Bekes; Gianmarco I. P. Ottaviano
    Abstract: One may reasonably think that cultural preferences affect collaboration in multinational teams in general, but not in superstar teams of professionals at the top of their industry. We reject this hypothesis by creating and analyzing an exhaustive dataset recording all 10.7 million passes by 7 thousand professional European football players from 138 countries fielded by all 154 teams competing in the top 5 men's leagues over 8 sporting seasons, together with full information on players' and teams' characteristics. We use a discrete choice model of players' passing behavior as a baseline to separately identify collaboration due to cultural preferences (`choice homophily') from collaboration due to opportunities (`induced homophily'). The outcome we focus on is the `pass rate', defined as the count of passes from a passer to a receiver relative to the passer's total passes when both players are fielded together in a half-season. We find strong evidence of choice homophily. Relative to the baseline, player pairs of the same culture have a 2.42 percent higher pass rate due to choice, compared with a 6.16 percent higher pass rate due to both choice and opportunity. This shows that choice homophily based on culture is pervasive and persistent even in teams of very high-skill individuals with clear common objectives and aligned incentives, who are involved in interactive tasks that are well defined, readily monitored and not particularly language intensive.
    Keywords: organizations, teams, culture, homophily, diversity, language, globalization, big data, panel data, sport
    Date: 2022–10–07
    URL: http://d.repec.org/n?u=RePEc:cep:cepdps:dp1873&r=
  41. By: Harry Mamaysky; Yiwen Shen; Hongyu Wu
    Abstract: We develop a novel technique to extract credit-relevant information from the text of quarterly earnings calls. This information is not spanned by fundamental or market variables and forecasts future credit spread changes. One reason for such forecastability is that our text-based measure predicts future credit spread risk and firm profitability. Greater firm- and call-level complexity increases the forecasting power of our measure for spread changes. Out-of-sample portfolio tests show the information in our measure is valuable for investors. Both results suggest that investors do not fully internalize the credit-relevant information contained in earnings calls.
    Date: 2022–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2209.11914&r=
  42. By: Dasgupta,Susmita; Lall,Somik V.; Wheeler,David R.
    Abstract: This paper estimates an urban carbon dioxide emissions model using satellite-measured carbon dioxide concentrations from 2014 to 2020, for 1,236 cities in 138 countries. The model incorporates the global trend in carbon dioxide concentration, seasonal fluctuations by hemisphere, and a large set of georeferenced variables that incorporate carbon dioxide–intensive industry structure, emissions from agricultural and forest fires in neighboring areas, demography, the component of income that is uncorrelated with industry structure, and relevant geographic conditions. The income results provide the first test of an Environmental Kuznets Curve relationship for carbon dioxide based on actual observations. They suggest an environmental Kuznets curve that reaches a peak near or above $40,000 per capita, which is at the 90th percentile internationally. The research also finds that economic development has a significant effect on the direction of the relationship between population density and carbon dioxide emissions. The relationship is positive at very low incomes but becomes negative at higher incomes. The paper also uses cities’ mean regression residuals to index their carbon dioxide emissions performance within and across regions, decomposes model carbon dioxide predictions into broad source categories for each city, and uses the regression residuals to explore the impact of subway systems. The findings show significantly lower carbon dioxide emissions for subway cities.
    Keywords: Railways Transport,Transport Services,Energy and Environment,Energy and Mining,Energy Demand,Transport in Urban Areas,Urban Transport
    Date: 2021–11–10
    URL: http://d.repec.org/n?u=RePEc:wbk:wbrwps:9845&r=
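
Editor's illustration for the refugee-camp abstract above, which compares forest loss 0–5 kilometers from a camp boundary (treatment) with forest loss 10–15 kilometers away (control): a minimal sketch of a two-group, two-period difference-in-differences calculation on group means. The function name `did_estimate` and all numbers are hypothetical and are not taken from the paper, whose actual model specification is not given in the abstract.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Difference-in-differences on group means.

    Returns (change in treatment group) minus (change in control group).
    """
    treat_change = mean(treat_post) - mean(treat_pre)
    ctrl_change = mean(ctrl_post) - mean(ctrl_pre)
    return treat_change - ctrl_change


# Hypothetical forest loss (hectares per grid cell), NOT data from the paper.
treat_pre = [10, 12, 8]     # cells 0-5 km from a camp, before the influx
treat_post = [30, 34, 26]   # same cells, after the influx
ctrl_pre = [9, 11, 10]      # cells 10-15 km from a camp, before
ctrl_post = [14, 16, 15]    # same cells, after

effect = did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(effect)
```

With these made-up numbers the treatment cells lose 20 more hectares on average and the control cells 5 more, so the estimate attributes the remaining 15 to proximity to a camp, under the usual assumption that both groups would otherwise have followed parallel trends.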

This nep-big issue is ©2022 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.