nep-big New Economics Papers
on Big Data
Issue of 2022‒12‒12
thirty papers chosen by
Tom Coupé
University of Canterbury

  1. Predicting football outcomes from Spanish league using machine learning models By Michał Lewandowski; Marcin Chlebus
  2. Application of machine learning in quantitative investment strategies on global stock markets By Jan Grudniewicz; Robert Ślepaczuk
  3. The Impact of Patent Applications on Technological Innovation in European Countries By Leogrande, Angelo; Costantiello, Alberto; Laureti, Lucio
  4. AI, Skill, and Productivity: The Case of Taxi Drivers By Kanazawa, Kyogo; Kawaguchi, Daiji; Shigeoka, Hitoshi; Watanabe, Yasutora
  5. Forecasting the Stability and Growth Pact compliance using Machine Learning By Kea Baret; Amelie Barbier-Gauchard; Theophilos Papadimitriou
  6. Efficient Integration of Multi-Order Dynamics and Internal Dynamics in Stock Movement Prediction By Thanh Trung Huynh; Minh Hieu Nguyen; Thanh Tam Nguyen; Phi Le Nguyen; Matthias Weidlich; Quoc Viet Hung Nguyen; Karl Aberer
  7. Using the web to predict regional trade flows: data extraction, modelling, and validation By Tranos, Emmanouil; Incera, Andre Carrascal; Willis, George
  8. HGV4Risk: Hierarchical Global View-guided Sequence Representation Learning for Risk Prediction By Youru Li; Zhenfeng Zhu; Xiaobo Guo; Shaoshuai Li; Yuchen Yang; Yao Zhao
  9. DSLOB: A Synthetic Limit Order Book Dataset for Benchmarking Forecasting Algorithms under Distributional Shift By Defu Cao; Yousef El-Laham; Loc Trinh; Svitlana Vyetrenko; Yan Liu
  10. PROPOSAL OF A SOCIO-ENVIRONMENTAL VULNERABILITY SCALE OF MUNICIPALITIES THAT RECEIVE FINANCIAL COMPENSATION – BRAZIL By Soares, César Pedrosa; da Penha Vasconcellos, Maria
  11. Regulation relevant to (long-form) audio recordings gathered in Brazil By Korwin-Zmijowski, Marion; Cristia, Alejandrina
  12. Regulation relevant to (long-form) audio recordings gathered in Denmark By Korwin-Zmijowski, Marion; Cristia, Alejandrina
  13. Regulation relevant to (long-form) audio recordings gathered in Finland By Korwin-Zmijowski, Marion; Cristia, Alejandrina
  14. Remotely (and wrongly) too equal: Popular night-time lights data understate spatial inequality By Xiaoxuan Zhang; John Gibson; Xiangzheng Deng
  15. Deep learning and American options via free boundary framework By Chinonso Nwankwo; Nneka Umeorah; Tony Ware; Weizhong Dai
  16. Education Expansion and High-Skill Job Opportunities for Workers: Does a Rising Tide Lift All Boats? By Schultheiss, Tobias; Pfister, Curdin; Gnehm, Ann-Sophie; Backes-Gellner, Uschi
  17. Regulation relevant to (long-form) audio recordings gathered in Papua New Guinea By Korwin-Zmijowski, Marion; Cristia, Alejandrina
  18. Using Artificial Intelligence to Benefit Society in Asia: Opportunities and Challenges By Singhal, Bhoomika
  19. Different Degrees of Skill Obsolescence across Hard and Soft Skills and the Role of Lifelong Learning for Labor Market Outcomes By Schultheiss, Tobias; Backes-Gellner, Uschi
  20. Predicting Household Resilience Before and During Pandemic with Classifier Algorithms By Surjaningsih, Ndari; Werdaningtyas, Hesti; Rahman, Faizal; Falaqh, Romadhon
  21. Using Recurrent Neural Networks for the Performance Analysis and Optimization of Stochastic Milkrun-Supplied Flow Lines By Südbeck, Insa; Mindlina, Julia; Schnabel, André; Helber, Stefan
  22. Identification and Auto-debiased Machine Learning for Outcome Conditioned Average Structural Derivatives By Zequn Jin; Lihua Lin; Zhengyu Zhang
  23. The effect of decentralization of government power on the character of public goods provision By Olga Marut; Jacek Lewkowicz
  24. The Heterogeneous Response of Real Estate Asset Prices to a Global Shock By Sandro Heiniger; Winfried Koeniger; Michael Lechner
  25. The Heterogeneous Response of Real Estate Asset Prices to a Global Shock By Heiniger, Sandro; Koeniger, Winfried; Lechner, Michael
  26. FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning By Xiao-Yang Liu; Ziyi Xia; Jingyang Rui; Jiechao Gao; Hongyang Yang; Ming Zhu; Christina Dan Wang; Zhaoran Wang; Jian Guo
  27. The impact of moving expenses on social segregation: a simulation with RL and ABM By Xinyu Li
  28. Deep Signature Algorithm for Path-Dependent American option pricing By Erhan Bayraktar; Qi Feng; Zhaoyu Zhang
  29. EU Cohesion Policy on the Ground: Analyzing Small-Scale Effects Using Satellite Data By Julia Bachtrögler-Unger; Mathias Dolls; Carla Krolage; Paul Schüle; Hannes Taubenböck; Matthias Weigand
  30. Sentiment in Bank Examination Reports and Bank Outcomes By Maureen Cowhey; Seung Jung Lee; Thomas Popeck Spiller; Cindy M. Vojtech

  1. By: Michał Lewandowski (Faculty of Economic Sciences, University of Warsaw); Marcin Chlebus (Faculty of Economic Sciences, University of Warsaw)
    Abstract: High-quality football predictive models can be very useful and profitable. Therefore, in this research, we undertook to construct machine learning models to predict football outcomes in games from Spanish LaLiga and then we compared them with historical forecasts extracted from bookmakers, whose knowledge is commonly considered to be deep and high-quality. The aim of the paper was to design models with the highest possible predictive performance and to obtain results close to the bookmakers', or even to build better estimators. The work included detailed feature engineering based on previous achievements in this domain and our own proposals. A built and selected set of variables was used with four machine learning methods, namely Random Forest, AdaBoost, XGBoost and CatBoost. The algorithms were compared based on Area Under the Curve (AUC) and Ranked Probability Score (RPS). RPS was used as a benchmark in the comparison of estimated probabilities from trained models and forecasts from bookmakers' odds. For a deeper understanding and explanation of the demonstrated methods, which are considered black-box approaches, Permutation Feature Importance (PFI) was used to evaluate the impacts of individual variables. Features extracted from bookmakers' odds proved the most important in terms of PFI. Furthermore, XGBoost achieved the best results on the validation set (RPS equal to 0.1989), with predictive power similar to that of bookmakers' odds (their RPS between 0.1977 and 0.1984). Results of the trained estimators were promising and this article showed that competition with bookmakers is possible using the demonstrated techniques.
    Keywords: predicting football outcomes, machine learning, betting, adaboost, random forest, xgboost, catboost, ranked probability score, auc, permutation feature importance
    JEL: C13 C51 C52 C53 C61 L83 Z29
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:war:wpaper:2021-22&r=big
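    Note: the abstract above compares model forecasts with bookmakers' odds using the Ranked Probability Score. The sketch below is a minimal illustration, not the authors' code. It uses the standard RPS definition for ordered outcomes (home win, draw, away win). The function names, the example odds, and the proportional normalisation of inverse odds into probabilities are assumptions made here for illustration; the paper may treat the bookmaker margin differently.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities by normalising inverse odds.

    Assumption: the bookmaker margin is removed proportionally. This is one
    common convention, not necessarily the one used in the paper.
    """
    inverse = [1.0 / o for o in decimal_odds]
    total = sum(inverse)
    return [x / total for x in inverse]


def ranked_probability_score(probs, outcome_index):
    """RPS for one match with ordered outcomes.

    RPS = 1/(r-1) * sum_{i=1}^{r-1} (sum_{j<=i} (p_j - o_j))^2,
    where o is the one-hot vector of the observed outcome. Lower is better.
    """
    r = len(probs)
    cum_forecast = 0.0
    cum_observed = 0.0
    total = 0.0
    for i in range(r - 1):
        cum_forecast += probs[i]
        cum_observed += 1.0 if i == outcome_index else 0.0
        total += (cum_forecast - cum_observed) ** 2
    return total / (r - 1)


# Hypothetical odds for (home win, draw, away win); the home team wins (index 0).
bookmaker_probs = implied_probabilities([2.0, 3.5, 4.0])
model_probs = [0.5, 0.3, 0.2]
print(ranked_probability_score(bookmaker_probs, 0))
print(ranked_probability_score(model_probs, 0))
```

    Averaging this per-match score over a validation set gives a single figure of the kind the abstract reports for both the models and the bookmakers.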
  2. By: Jan Grudniewicz (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group); Robert Ślepaczuk (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group, Department of Quantitative Finance)
    Abstract: The thesis addresses machine learning based quantitative investment strategies. Several technical analysis indicators were employed as inputs to machine learning models such as Neural Networks, K Nearest Neighbor, Regression Trees, Random Forests, Naïve Bayes classifiers, Bayesian Generalized Linear Models and Support Vector Machines. Models were used to generate trading signals on WIG20, DAX, S&P500 and selected CEE indices in the period from 2002-01-01 to 2020-10-30. Strategies were compared with each other and with the benchmark buy-and-hold strategy in terms of achieved levels of risk and return. Quality of estimation was evaluated on independent subsets and with the use of sensitivity analysis. The research results indicated that quantitative strategies generate better risk-adjusted returns than passive strategies and that, for the analysed indices, the Bayesian Generalized Linear Model and Naïve Bayes were predominantly the best performing models. A more comprehensive ranking approach based on the results for all analysed models and indices identified the Bayesian Generalized Linear Model as the model which on average generated the best results.
    Keywords: quantitative investment strategies, machine learning, neural networks, regression trees, random forests, support vector machine, technical analysis, equity stock indices, developed and emerging markets, information ratio
    JEL: C4 C14 C45 C53 C58 G13
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:war:wpaper:2021-23&r=big
  3. By: Leogrande, Angelo; Costantiello, Alberto; Laureti, Lucio
    Abstract: We investigate the innovational determinants of “Patent Applications” in Europe. We use data from the European Innovation Scoreboard-EIS of the European Commission for 36 countries in the period 2010-2019. We use Panel Data with Fixed Effects, Panel Data with Random Effects, Pooled OLS, WLS and Dynamic Panel. We found that the variables that have a deeper positive association with “Patent Applications” are “Human Resources” and “Intellectual Assets”, while the variables that show a more intense negative relation with “Patent Applications” are “Employment Share in Manufacturing” and “Total Entrepreneurial Activity”. A cluster analysis with the k-Means algorithm optimized with the Silhouette Coefficient was performed. The results show the presence of two clusters. A network analysis with the Manhattan distance was performed and we find three different complex network structures. Finally, a comparison is made among eight machine learning algorithms for the prediction of the future value of “Patent Applications”. We found that PNN-Probabilistic Neural Network is the best performing algorithm. Using PNN the results show that the mean future value of “Patent Applications” in the estimated countries is expected to decrease by 0.1%.
    Keywords: Innovation, and Invention: Processes and Incentives; Management of Technological Innovation and R&D; Diffusion Processes; Open Innovation.
    JEL: O30 O31 O32 O33 O34
    Date: 2022–11–12
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:115346&r=big
  4. By: Kanazawa, Kyogo (University of Tokyo); Kawaguchi, Daiji (University of Tokyo); Shigeoka, Hitoshi (Simon Fraser University); Watanabe, Yasutora (University of Tokyo)
    Abstract: We examine the impact of Artificial Intelligence (AI) on productivity in the context of taxi drivers. The AI we study assists drivers with finding customers by suggesting routes along which the demand is predicted to be high. We find that AI improves drivers' productivity by shortening the cruising time, and such gain is accrued only to low-skilled drivers, narrowing the productivity gap between high- and low-skilled drivers by 14%. The result indicates that AI's impact on human labor is more nuanced and complex than a job displacement story, which was the primary focus of existing studies.
    Keywords: artificial intelligence, skill, productivity, taxi-drivers, prediction, demand forecasting, machine learning
    JEL: J22 J24 L92 R41
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp15677&r=big
  5. By: Kea Baret (University of Strasbourg); Amelie Barbier-Gauchard (University of Strasbourg); Theophilos Papadimitriou (Democritus University of Thrace)
    Abstract: Since the reinforcement of the Stability and Growth Pact (1996), the European Commission closely monitors public finance in the EU members. A failure to comply with the 3% limit rule on the public deficit by a country triggers an audit. In this paper, we present a Machine Learning based forecasting model for the compliance with the 3% limit rule. To do so, we use data spanning the period from 2006 to 2018 (a turbulent period including the Global Financial Crisis and the Sovereign Debt Crisis) for the 28 EU member states. A set of eight features are identified as predictors from 138 variables through a feature selection procedure. The forecasting is performed using the Support Vector Machines (SVM). The proposed model reached 91.7% forecasting accuracy and outperformed the Logit model that was used as benchmark.
    Keywords: Fiscal Rules, Fiscal Compliance, Stability and Growth Pact, Machine learning.
    JEL: F
    Date: 2022
    URL: http://d.repec.org/n?u=RePEc:inf:wpaper:2022.11&r=big
  6. By: Thanh Trung Huynh; Minh Hieu Nguyen; Thanh Tam Nguyen; Phi Le Nguyen; Matthias Weidlich; Quoc Viet Hung Nguyen; Karl Aberer
    Abstract: Advances in deep neural network (DNN) architectures have enabled new prediction techniques for stock market data. Unlike other multivariate time-series data, stock markets show two unique characteristics: (i) multi-order dynamics, as stock prices are affected by strong non-pairwise correlations (e.g., within the same industry); and (ii) internal dynamics, as each individual stock shows some particular behaviour. Recent DNN-based methods capture multi-order dynamics using hypergraphs, but rely on the Fourier basis in the convolution, which is both inefficient and ineffective. In addition, they largely ignore internal dynamics by adopting the same model for each stock, which implies a severe information loss. In this paper, we propose a framework for stock movement prediction to overcome the above issues. Specifically, the framework includes temporal generative filters that implement a memory-based mechanism onto an LSTM network in an attempt to learn individual patterns per stock. Moreover, we employ hypergraph attentions to capture the non-pairwise correlations. Here, using the wavelet basis instead of the Fourier basis enables us to simplify the message passing and focus on the localized convolution. Experiments with US market data over six years show that our framework outperforms state-of-the-art methods in terms of profit and stability. Our source code and data are available at https://github.com/thanhtrunghuynh93/estimate.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.07400&r=big
  7. By: Tranos, Emmanouil; Incera, Andre Carrascal; Willis, George
    Abstract: Despite the importance of interregional trade for building effective regional economic policies, there is very little hard data to illustrate such interdependencies. We propose here a novel research framework to predict interregional trade flows by utilising freely available web data and machine learning algorithms. Specifically, we extract hyperlinks between archived websites in the UK and we aggregate these data to create an interregional network of hyperlinks between geolocated and commercial webpages over time. We also use some existing interregional trade data to train our models using random forests and then make out-of-sample predictions of interregional trade flows using a rolling-forecasting framework. Our models illustrate great predictive capability with $R^2$ greater than 0.9. We are also able to disaggregate our predictions in terms of industrial sectors, but also at a sub-regional level, for which trade data are not available. In total, our models provide a proof of concept that the digital traces left behind by physical trade can help us capture such economic activities at a more granular level and, consequently, inform regional policies.
    Date: 2022–07–06
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:9bu5z&r=big
  8. By: Youru Li; Zhenfeng Zhu; Xiaobo Guo; Shaoshuai Li; Yuchen Yang; Yao Zhao
    Abstract: Risk prediction, as a typical time series modeling problem, is usually achieved by learning trends in markers or historical behavior from sequence data, and has been widely applied in healthcare and finance. In recent years, deep learning models, especially Long Short-Term Memory neural networks (LSTMs), have led to superior performances in such sequence representation learning tasks. Although some attention or self-attention based models with time-aware or feature-aware enhanced strategies have achieved better performance compared with other temporal modeling methods, such improvement is limited due to a lack of guidance from a global view. To address this issue, we propose a novel end-to-end Hierarchical Global View-guided (HGV) sequence representation learning framework. Specifically, the Global Graph Embedding (GGE) module is proposed to learn sequential clip-aware representations from a temporal correlation graph at instance level. Furthermore, following the way of key-query attention, the harmonic $\beta$-attention ($\beta$-Attn) is also developed for making a global trade-off between time-aware decay and observation significance at channel level adaptively. Moreover, the hierarchical representations at both instance level and channel level can be coordinated by the heterogeneous information aggregation under the guidance of the global view. Experimental results on a benchmark dataset for healthcare risk prediction, and a real-world industrial scenario for Small and Mid-size Enterprises (SMEs) credit overdue risk prediction in MYBank, Ant Group, have illustrated that the proposed model can achieve competitive prediction performance compared with other known baselines.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.07956&r=big
  9. By: Defu Cao; Yousef El-Laham; Loc Trinh; Svitlana Vyetrenko; Yan Liu
    Abstract: In electronic trading markets, limit order books (LOBs) provide information about pending buy/sell orders at various price levels for a given security. Recently, there has been a growing interest in using LOB data for resolving downstream machine learning tasks (e.g., forecasting). However, dealing with out-of-distribution (OOD) LOB data is challenging since distributional shifts are unlabeled in current publicly available LOB datasets. Therefore, it is critical to build a synthetic LOB dataset with labeled OOD samples serving as a testbed for developing models that generalize well to unseen scenarios. In this work, we utilize a multi-agent market simulator to build a synthetic LOB dataset, named DSLOB, with and without market stress scenarios, which allows for the design of controlled distributional shift benchmarking. Using the proposed synthetic dataset, we provide a holistic analysis on the forecasting performance of three different state-of-the-art forecasting methods. Our results reflect the need for increased researcher efforts to develop algorithms with robustness to distributional shifts in high-frequency time series data.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.11513&r=big
  10. By: Soares, César Pedrosa; da Penha Vasconcellos, Maria
    Abstract: The objective of this paper was to propose a scale of socio-environmental vulnerability capable of presenting the municipalities with exploration activities of natural resources in a hierarchical way, considering the level of criticality of this factor. Theoretically developed, the socio-environmental vulnerability was operationalized by observable socioeconomic and environmental indicators from machine learning techniques and the Rasch model. The value presented by each municipality for these aspects served as the basis for obtaining the measure of the socio-environmental vulnerability of the local population and elaboration of vulnerability levels. From the scale, the importance of Financial Compensation (FC) was observed to deal with the socio-environmental vulnerability. The results pointed to a scenario marked by municipalities with medium and high vulnerability, with weak or almost non-existent correlations between FC and these factors. The scale offers subsidies that can encourage more equitable FC arrangements, considering the respective levels of vulnerability of the municipalities.
    Date: 2022–11–15
    URL: http://d.repec.org/n?u=RePEc:osf:socarx:jg4a8&r=big
  11. By: Korwin-Zmijowski, Marion; Cristia, Alejandrina (Centre National de la Recherche Scientifique)
    Abstract: In the context of research using machine-learning tools on audio-recordings gathered in several countries, the LAAC Team sought to systematize regulation relevant to such data. The most important legal issue is data protection. Data protection is an important part of using and operating technology to protect human rights, and both at the international level and at other levels in many countries, a great deal of regulation has been created to address it. In addition, we also considered regulation referencing issues on which there are fewer regulations as of yet: informed consent, machine-learning bias and the possibility of discrimination, duty to report illegal activities, and intellectual property (potentially) emerging from aboriginal resources. In this document, we first provide an overview of international law applicable to the protection of data from Brazilian citizens. We then summarize relevant areal legislation, before turning to national legislation in Brazil specifically. Finally, after briefly describing the organization of legislation within Brazil with respect to local and regional legislation, we summarize these. Whenever possible, we explain in what way a given piece of regulation is relevant to long-form audio-recordings.
    Date: 2022–09–15
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:dxkcf&r=big
  12. By: Korwin-Zmijowski, Marion; Cristia, Alejandrina (Centre National de la Recherche Scientifique)
    Abstract: In the context of research using machine-learning tools on audio-recordings gathered in several countries, the LAAC Team sought to systematize regulation relevant to such data. The most important legal issue is data protection. Data protection is an important part of using and operating technology to protect human rights, and both at the international level and at other levels in many countries, a great deal of regulation has been created to address it. In addition, we also considered regulation referencing issues on which there are fewer regulations as of yet: informed consent, machine-learning bias and the possibility of discrimination, duty to report illegal activities, and intellectual property (potentially) emerging from aboriginal resources. In this document, we first provide an overview of international law applicable to the protection of data from Danish citizens. We then summarize relevant areal legislation, before turning to national legislation in Denmark specifically. Finally, after briefly describing the organization of legislation within Denmark with respect to local and regional legislation, we summarize these. Whenever possible, we explain in what way a given piece of regulation is relevant to long-form audio-recordings.
    Date: 2022–09–15
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:az2qj&r=big
  13. By: Korwin-Zmijowski, Marion; Cristia, Alejandrina (Centre National de la Recherche Scientifique)
    Abstract: In the context of research using machine-learning tools on audio-recordings gathered in several countries, the LAAC Team sought to systematize regulation relevant to such data. The most important legal issue is data protection. Data protection is an important part of using and operating technology to protect human rights, and both at the international level and at other levels in many countries, a great deal of regulation has been created to address it. In addition, we also considered regulation referencing issues on which there are fewer regulations as of yet: informed consent, machine-learning bias and the possibility of discrimination, duty to report illegal activities, and intellectual property (potentially) emerging from aboriginal resources. In this document, we first provide an overview of international law applicable to the protection of data from Finnish citizens. We then summarize relevant areal legislation, before turning to national legislation in Finland specifically. Finally, after briefly describing the organization of legislation within Finland with respect to local and regional legislation, we summarize these. Whenever possible, we explain in what way a given piece of regulation is relevant to long-form audio-recordings.
    Date: 2022–09–15
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:e54t9&r=big
  14. By: Xiaoxuan Zhang (University of Waikato); John Gibson (University of Waikato); Xiangzheng Deng (IGSNRR, Chinese Academy of Sciences)
    Abstract: Several studies in economics and regional science use Defense Meteorological Satellite Program (DMSP) night-time lights data to measure spatial inequality. These DMSP data are a poor proxy in this context because they have spatially mean-reverting errors, yielding significantly lower inequality estimates than what sub-national GDP data show. Inequality estimates from DMSP are also lower than what newer, research-focused and more accurate, satellites show from their observations of the earth at night. In this paper, county-level data from the United States and China are used to demonstrate the understatement of spatial inequality when DMSP data are used. In both settings, benchmark data on sub-national GDP are available for establishing the level and trend in spatial inequality, which is then used to assess the accuracy of the estimates coming from remote sensing sources. In the rush to use big data it is important to not lose sight of basic measurement error features of some of these data sources.
    Keywords: DMSP;mean-reverting error;night lights;spatial inequality;VIIRS
    JEL: E01 R12
    Date: 2022–11–26
    URL: http://d.repec.org/n?u=RePEc:wai:econwp:22/13&r=big
  15. By: Chinonso Nwankwo; Nneka Umeorah; Tony Ware; Weizhong Dai
    Abstract: We propose a deep learning method for solving the American options model with a free boundary feature. To extract the free boundary known as the early exercise boundary from our proposed method, we introduce the Landau transformation. For efficient implementation of our proposed method, we further construct a dual solution framework consisting of a novel auxiliary function and free boundary equations. The auxiliary function is formulated to include the feed forward deep neural network (DNN) output and further mimic the far boundary behaviour, smooth pasting condition, and remaining boundary conditions due to the second-order space derivative and first-order time derivative. Because the early exercise boundary and its derivative are not a priori known, the boundary values mimicked by the auxiliary function are in approximate form. Concurrently, we then establish equations that approximate the early exercise boundary and its derivative directly from the DNN output based on some linear relationships at the left boundary. Furthermore, the option Greeks are obtained from the derivatives of this auxiliary function. We test our implementation with several examples and compare them to the highly accurate sixth-order compact scheme with left boundary improvement. All indicators show that our proposed deep learning method presents an efficient and alternative way of pricing options with early exercise features.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.11803&r=big
  16. By: Schultheiss, Tobias (University of Zurich); Pfister, Curdin (University of Zurich); Gnehm, Ann-Sophie (University of Zurich); Backes-Gellner, Uschi (University of Zurich)
    Abstract: We examine how education expansions affect the job opportunities for workers with and without the new education. To identify causal effects, we exploit a quasi-random establishment of Universities of Applied Sciences (UASs), bachelor-granting three-year colleges that teach and conduct applied research. By applying machine-learning methods to job advertisement data, we analyze job content before and after the education expansion. We find that, in regions with the newly established UASs, not only job descriptions of the new UAS graduates but also job descriptions of workers without this degree (i.e., middle-skilled workers with vocational training) contain more high-skill job content. This upskilling in job content is driven by an increase in high-skill R&D-related tasks and linked to employment and wage gains. The task spillovers likely occur because UAS graduates with applied research skills build a bridge between middle-skilled workers and traditional university graduates, facilitating the integration of the former into R&D-related tasks.
    Keywords: educational expansion, worker demand, upskilling, spillover effects, vocational training
    JEL: I23 J23 J24
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp15687&r=big
  17. By: Korwin-Zmijowski, Marion; Cristia, Alejandrina (Centre National de la Recherche Scientifique)
    Abstract: In the context of research using machine-learning tools on audio-recordings gathered in several countries, the LAAC Team sought to systematize regulation relevant to such data. The most important legal issue is data protection. Data protection is an important part of using and operating technology to protect human rights, and both at the international level and at other levels in many countries, a great deal of regulation has been created to address it. In addition, we also considered regulation referencing issues on which there are fewer regulations as of yet: informed consent, machine-learning bias and the possibility of discrimination, duty to report illegal activities, and intellectual property (potentially) emerging from aboriginal resources. In this document, we first provide an overview of international law applicable to the protection of data from Papua New Guinean citizens. We then summarize relevant areal legislation, before turning to national legislation in Papua New Guinea specifically. Finally, after briefly describing the organization of legislation within Papua New Guinea with respect to local and regional legislation, we summarize these. Whenever possible, we explain in what way a given piece of regulation is relevant to long-form audio-recordings.
    Date: 2022–09–15
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:ywbtc&r=big
  18. By: Singhal, Bhoomika
    Abstract: There is a bright future ahead for Asia, which has the highest population density of any continent in the world. There are strong indications that the Artificial Intelligence revolution will play a major role in shaping Asia's growth and development, as these new technologies sweep through societies and become an integral part of daily life. New technologies have the potential to speed up progress across the region by creating mechanisms that overcome traditional obstacles, such as bureaucracy and a lack of infrastructure. Investing in new technology also carries risks that can have serious long-run consequences for society, so it is vital that these risks are evaluated at this early stage of development in order to minimize their future impact. This paper focuses on the opportunities and challenges associated with the development of new technologies in Asia. Some of the technological opportunities are cross-cutting, such as bridging Asia's cultural divides and mining public data, while others are specific to one sector, namely education. The challenges arise from prevailing social circumstances, such as diversity and socioeconomic disparities. To ensure robust and inclusive growth in the region, we distill the measures and safeguards that we believe are necessary as Asia enters the era of new technologies.
    Keywords: Artificial intelligence impact on Asia, artificial intelligence and ethics, artificial intelligence opportunities, artificial intelligence challenges, AI benefits to society
    JEL: K0 O1 O4
    Date: 2022–10–16
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:115421&r=big
  19. By: Schultheiss, Tobias (University of Zurich); Backes-Gellner, Uschi (University of Zurich)
    Abstract: This paper examines the role of lifelong learning in counteracting skill depreciation and obsolescence. We build on findings showing that different skill types have structurally different depreciation rates. We differentiate between occupations with more hard skills versus more soft skills. To do so, we draw on representative job advertisement data that contain machine-learning categorized skill requirements and cover the Swiss job market in great detail across occupations (from 1950–2019). We examine lifelong learning effects for "harder" versus "softer" occupations, thereby analyzing the role of training in counteracting skill depreciation in occupations that are differently affected by skill depreciation. Our results reveal novel patterns regarding the benefits from lifelong learning across occupations: In harder occupations, with large shares of fast-depreciating hard skills, the role of lifelong learning is primarily as a hedge against unemployment risks rather than a boost to wages. In contrast, in softer occupations, in which workers build on more value-stable soft skill foundations, the role of lifelong learning instead lies mostly in acting as a boost for upward career mobility and leads to larger wage gains.
    Keywords: skill depreciation, lifelong learning, soft vs. hard occupations, hedging against unemployment, boosting wages
    JEL: M53 J24 I2
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp15688&r=big
  20. By: Surjaningsih, Ndari; Werdaningtyas, Hesti; Rahman, Faizal; Falaqh, Romadhon
    Abstract: One of the lessons learned from the global financial crisis in 2008 was the need to pay closer attention to monitoring household vulnerability, particularly household credit risk, using the default rate as the indicator. The indicator tends to worsen during economic recessions, as happened recently because of the pandemic. The default event has a complex nonlinear relationship with its determinants. To tackle this complexity, this study suggests using a machine learning approach, specifically individual and ensemble classifiers, to model the probability of default. The study aims to investigate changes in Indonesian household financial resilience before and during the pandemic, supported by individual-level data from the Financial Information Service System. It finds that the ensemble classifiers, notably extreme gradient boosting, outperform the individual classifiers. Feature importance analysis is then applied to the best model to identify the pattern of variables explaining the default event over time, which reveals how that pattern changed before and during the pandemic. The cost of debt/repayment capability and the policy mix are significant in explaining the default event. At the same time, the project location feature becomes weaker at discriminating the target class.
    Date: 2022–07–23
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:w5q9g&r=big
  21. By: Südbeck, Insa; Mindlina, Julia; Schnabel, André; Helber, Stefan
    Abstract: Long-term throughput, as a key performance indicator of a stochastic flow line, is affected by numerous parameters describing the features of the flow line, such as processing time and buffer size. Fast and accurate evaluation methods for a given set of values for those parameters are a prerequisite to systematically optimize such a flow line. In this paper, we consider the case of a flow line with random processing times, limited buffer capacities and so-called milkruns that supply the machines with material parts that are required to perform, e.g., assembly operations on workpieces. In such a system, shortages in the supply of material parts can limit the performance of the flow line. Up to now, there are no accurate analytical approaches to quantify the complex interactions in such milkrun-supplied flow lines for realistic problem sizes. We propose to use recurrent neural networks to determine the long-term throughput of such flow lines enabling us to evaluate production systems of flexible size. Our results show that the throughput can be determined accurately and quickly via recurrent neural networks. Furthermore, we use this new evaluation procedure as a building block to optimize this type of flow line using gradient and local search techniques.
    Keywords: Recurrent neural networks; Milkrun material supply; Stochastic flow lines; Gradient search; Simulated annealing
    JEL: C44 C45 M11
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:han:dpaper:dp-703&r=big
  22. By: Zequn Jin; Lihua Lin; Zhengyu Zhang
    Abstract: This paper proposes a new class of heterogeneous causal quantities, named outcome conditioned average structural derivatives (OASD), in a general nonseparable model. OASD is the average partial effect of a marginal change in a continuous treatment on the individuals located at different parts of the outcome distribution, irrespective of individuals' characteristics. OASD combines both features of ATE and QTE: it is interpreted as straightforwardly as ATE while at the same time more granular than ATE by breaking the entire population up according to the rank of the outcome distribution. One contribution of this paper is that we establish some close relationships between the outcome conditioned average partial effects and a class of parameters measuring the effect of counterfactually changing the distribution of a single covariate on the unconditional outcome quantiles. By exploiting such relationships, we can obtain a root-n consistent estimator and calculate the semi-parametric efficiency bound for these counterfactual effect parameters. We illustrate this point by two examples: equivalence between OASD and the unconditional partial quantile effect (Firpo et al. (2009)), and equivalence between the marginal partial distribution policy effect (Rothe (2012)) and a corresponding outcome conditioned parameter. Because identification of OASD is attained under a conditional exogeneity assumption, by controlling for rich information about covariates, a researcher may ideally use high-dimensional controls in data. We propose for OASD a novel automatic debiased machine learning estimator, and present asymptotic statistical guarantees for it. We prove our estimator is root-n consistent, asymptotically normal, and semiparametrically efficient. We also prove the validity of the bootstrap procedure for uniform inference on the OASD process.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.07903&r=big
  23. By: Olga Marut (University of Warsaw, Faculty of Economic Sciences); Jacek Lewkowicz (University of Warsaw, Faculty of Economic Sciences)
    Abstract: What are the institutional drivers of public goods provision? What do we know about the impact of concentration of power on their distribution? The current literature proves the relevance of the allocation of public goods, mostly in the context of economic and social progress. A growing number of empirical studies is focused primarily on public policies that may matter in this context. However, we still know relatively little about institutional factors that may affect public goods provision. In this article we apply econometric and machine learning tools to verify the importance of governmental power decentralization for distribution of public goods. The obtained output implies that indeed concentration of power impacts public goods provision and the results are robust across various quantitative methods. Our conclusions may be of practical relevance also for policymakers.
    Keywords: public goods, power decentralization, politics, institutional economics, political economy
    JEL: B52 H41 H72 P48
    Date: 2022
    URL: http://d.repec.org/n?u=RePEc:war:wpaper:2022-11&r=big
  24. By: Sandro Heiniger (University of St. Gallen); Winfried Koeniger (University of St. Gallen; CESifo (Center for Economic Studies and Ifo Institute); Center for Financial Studies (CFS); IZA Institute of Labor Economics; Swiss Finance Institute); Michael Lechner (University of St. Gallen - Swiss Institute for Empirical Economic Research)
    Abstract: We estimate the transmission of the pandemic shock in 2020 to prices in the residential and commercial real estate market by causal machine learning, using new granular data at the municipal level for Germany. We exploit differences in the incidence of Covid infections or short-time work at the municipal level for identification. In contrast to evidence for other countries, we find that the pandemic had only temporary negative effects on rents for some real estate types and increased asset prices of real estate particularly in the top price segment of commercial real estate.
    Keywords: Real estate, Asset prices, Rents, Covid pandemic, Short-time work, Affordability crisis.
    JEL: E21 E22 G12 G51 R21 R31
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp2286&r=big
  25. By: Heiniger, Sandro (University of St. Gallen); Koeniger, Winfried (University of St. Gallen); Lechner, Michael (University of St. Gallen)
    Abstract: We estimate the transmission of the pandemic shock in 2020 to prices in the residential and commercial real estate market by causal machine learning, using new granular data at the municipal level for Germany. We exploit differences in the incidence of Covid infections or short-time work at the municipal level for identification. In contrast to evidence for other countries, we find that the pandemic had only temporary negative effects on rents for some real estate types and increased asset prices of real estate particularly in the top price segment of commercial real estate.
    Keywords: real estate, asset prices, rents, short-time work, affordability crisis, COVID-19
    JEL: E21 E22 G12 G51 R21 R31
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp15699&r=big
  26. By: Xiao-Yang Liu; Ziyi Xia; Jingyang Rui; Jiechao Gao; Hongyang Yang; Ming Zhu; Christina Dan Wang; Zhaoran Wang; Jian Guo
    Abstract: Finance is a particularly difficult playground for deep reinforcement learning. However, establishing high-quality market environments and benchmarks for financial reinforcement learning is challenging due to three major factors, namely, low signal-to-noise ratio of financial data, survivorship bias of historical data, and model overfitting in the backtesting stage. In this paper, we present an openly accessible FinRL-Meta library that has been actively maintained by the AI4Finance community. First, following a DataOps paradigm, we will provide hundreds of market environments through an automatic pipeline that collects dynamic datasets from real-world markets and processes them into gym-style market environments. Second, we reproduce popular papers as stepping stones for users to design new trading strategies. We also deploy the library on cloud platforms so that users can visualize their own results and assess the relative performance via community-wise competitions. Third, FinRL-Meta provides tens of Jupyter/Python demos organized into a curriculum and a documentation website to serve the rapidly growing community. FinRL-Meta is available at: https://github.com/AI4Finance-Foundation/FinRL-Meta
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.03107&r=big
  27. By: Xinyu Li
    Abstract: Over the past decades, breakthroughs such as Reinforcement Learning (RL) and Agent-based modeling (ABM) have made simulations of economic models feasible. Recently, there has been increasing interest in applying ABM to study the impact of residential preferences on neighborhood segregation in the Schelling Segregation Model. In this paper, RL is combined with ABM to simulate a modified Schelling Segregation model, which incorporates moving expenses as an input parameter. In particular, deep Q network (DQN) is adopted as RL agents' learning algorithm to simulate the behaviors of households and their preferences. This paper studies the impact of moving expenses on the overall segregation pattern and its role in social integration. A more comprehensive simulation of the segregation model is built for policymakers to forecast the potential consequences of their policies.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.12475&r=big
  28. By: Erhan Bayraktar; Qi Feng; Zhaoyu Zhang
    Abstract: In this work, we study the deep signature algorithms for path-dependent FBSDEs with reflections. We follow the backward scheme in [Huré-Pham-Warin. Mathematics of Computation 89, no. 324 (2020)] for state-dependent FBSDEs with reflections, and combine it with the signature layer to solve American-type option pricing problems in which the payoff function depends on the whole path of the underlying forward stock process. We prove the convergence of our numerical algorithm and provide a numerical example for an Amerasian option under the Black-Scholes model.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.11691&r=big
  29. By: Julia Bachtrögler-Unger; Mathias Dolls; Carla Krolage; Paul Schüle; Hannes Taubenböck; Matthias Weigand
    Abstract: We present a novel approach for analyzing the effects of EU cohesion policy on local economic activity. For all municipalities in the border area of the Czech Republic, Germany and Poland, we collect project-level data on EU funding in the period between 2007 and 2013. Using night light emission data as a proxy for economic development, we show that the receipt of a higher amount of EU funding is associated with increased economic activity at the municipal level. Our paper demonstrates that remote sensing data can provide an effective way to model local economic development also in Europe, where no comprehensive cross-border data is available at such a spatially granular level.
    Keywords: Regional Development, EU Cohesion Policy, Remote Sensing
    Date: 2022–11–29
    URL: http://d.repec.org/n?u=RePEc:wfo:wpaper:y:2022:i:653&r=big
  30. By: Maureen Cowhey; Seung Jung Lee; Thomas Popeck Spiller; Cindy M. Vojtech
    Abstract: We investigate whether the bank examination process provides useful insight into bank future outcomes. We do this by conducting textual analysis on about 5,500 small to medium-sized commercial bank examination reports from 2004 to 2016. These confidential examination reports provide textual context to the components of supervisory ratings: capital adequacy, asset quality, management, earnings, and liquidity. Each component is given a categorical rating, and each bank is assigned an overall composite rating; together these are used to determine the safety and soundness of banks. We find that, controlling for a variety of factors, including the ratings themselves, the sentiment supervisors express in describing most of the components predicts relevant future bank outcomes. The sentiment conveyed in the asset quality, management, and earnings sections provides significant information in predicting future outcomes for problem loans, supervisory actions, and profitability, respectively, for all banks. Sentiment conveyed in the capital adequacy section appears to be predictive of future capital ratios for weak banks. These relationships suggest that bank supervisors play a meaningful role in the surveillance of the banking system.
    Keywords: CAMELS; Bank examination reports; Natural language processing; Private supervisory information
    JEL: G21 G28
    Date: 2022–11–17
    URL: http://d.repec.org/n?u=RePEc:fip:fedgfe:2022-77&r=big

This nep-big issue is ©2022 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.