nep-big New Economics Papers
on Big Data
Issue of 2023‒10‒02
24 papers chosen by
Tom Coupé, University of Canterbury


  1. How Big Is the Media Multiplier? Evidence from Dyadic News Data By Timothy Besley; Thiemo Fetzer; Hannes Mueller
  2. Learning from the Origins By Alexander Yarkin
  3. Fairness Implications of Heterogeneous Treatment Effect Estimation with Machine Learning Methods in Policy-making By Patrick Rehill; Nicholas Biddle
  4. Deep Semi-Supervised Anomaly Detection for Finding Fraud in the Futures Market By Timothy DeLise
  5. Recurrent Neural Networks with more flexible memory: better predictions than rough volatility By Damien Challet; Vincent Ragel
  6. The roots of inequality: estimating inequality of opportunity from regression trees and forests By Brunori, Paolo
  7. Diffusion Variational Autoencoder for Tackling Stochasticity in Multi-Step Regression Stock Price Prediction By Kelvin J. L. Koa; Yunshan Ma; Ritchie Ng; Tat-Seng Chua
  8. Harnessing the Power of Artificial Intelligence to Forecast Startup Success: An Empirical Evaluation of the SECURE AI Model By Morande, Swapnil; Arshi, Tahseen; Gul, Kanwal; Amini, Mitra
  9. Asymmetric AdaBoost for High-dimensional Maximum Score Regression By Jianghao Chu; Tae-Hwy Lee; Aman Ullah
  10. Agree to Disagree: Measuring Hidden Dissents in FOMC Meetings By Kwok Ping Tsang; Zichao Yang
  11. Predicting Financial Market Trends using Time Series Analysis and Natural Language Processing By Ali Asgarov
  12. Long-term Effects of Temperature Variations on Economic Growth: A Machine Learning Approach By Eugene Kharitonov; Oksana Zakharchuk; Lin Mei
  13. Breakthroughs in Historical Record Linking Using Genealogy Data: The Census Tree Project By Kasey Buckles; Adrian Haws; Joseph Price; Haley E.B. Wilbert
  14. The Socio-Economic Determinants of the Number of Physicians in Italian Regions By Leogrande, Angelo; Costantiello, Alberto; Leogrande, Domenico
  15. Assessing the Impact of Artificial Intelligence on Germany's Labor Market: Insights from a ChatGPT Analysis By Oschinski, Matthias
  16. Decoding Financial Crises: Analyzing Predictors and Evolution By JEONG, Young Sik; BAEK, Yaein
  17. Econometrics of Machine Learning Methods in Economic Forecasting By Andrii Babii; Eric Ghysels; Jonas Striaukas
  18. Analysis of Optimal Portfolio Management Using Hierarchical Clustering By Kapil Panda
  19. Predicting Re-Employment: Machine Learning versus Assessments by Unemployed Workers and by Their Caseworkers By van den Berg, Gerard J.; Kunaschk, Max; Lang, Julia; Stephan, Gesine; Uhlendorff, Arne
  20. Russia-Ukraine war and G7 debt markets: Evidence from public sentiment towards economic sanctions during the conflict By Zunaidah Sulong; Mohammad Abdullah; Emmanuel J. A. Abakah; David Adeabah; Simplice Asongu
  21. Deep multi-step mixed algorithm for high dimensional non-linear PDEs and associated BSDEs By Daniel Bussell; Camilo Andrés García-Trillos
  22. The Impact of Artificial Intelligence on Economic Patterns By Lohani, Fazle; Rahman, Mostafizur; Shaturaev, Jakhongir
  23. What is the role of data in jobs in the United Kingdom, Canada, and the United States?: A natural language processing approach By Julia Schmidt; Graham Pilgrim; Annabelle Mourougane
  24. Firms' Price-setting Behaviour: Insights from Earnings Calls By Callan Windsor; Max Zang

  1. By: Timothy Besley; Thiemo Fetzer; Hannes Mueller
    Abstract: This paper estimates the size of the media multiplier, a model-based measure of how far media coverage magnifies the economic response to shocks. We combine monthly aggregated and anonymized credit card activity data from 114 card issuing countries in 5 destination countries with a large corpus of news coverage in issuing countries reporting on violent events in the destinations. To define and quantify the media multiplier we estimate a model in which latent beliefs, shaped by either events or news coverage, drive card activity. According to the model, media coverage can more than triple the economic impact of an event. We show that within our sample, media reporting more than doubled the effect of events in Tunisia and speculate about the role of the media in driving international travel patterns. This concept can easily be generalized to other contexts and settings pending suitable data.
    Keywords: media, economic behaviour, news shocks
    JEL: O10 F50 D80 F10 L80
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:ces:ceswps:_10619&r=big
  2. By: Alexander Yarkin
    Abstract: How do political preferences and voting behaviors respond to information coming from abroad? Focusing on the international migration network, I document that opinion changes at the origins spill over to 1st- and 2nd-generation immigrants abroad. Local diasporas, social media, and family ties to the origins facilitate the transmission, while social integration at destination weakens it. Using the variation in the magnitude, timing, and type of origin-country exposure to the European Refugee Crisis of 2015, I show that salient events trigger learning from the origins. Welcoming asylum policies at the origins decrease opposition to non-Europeans and far-right voting abroad. Transitory refugee flows through the origins transmit the backlash abroad. Data from Google Trends and Facebook suggest elevated attention to events at the origins and communication with like-minded groups as mechanisms. Similar spillovers following the passage of same-sex marriage laws show that the phenomenon generalizes beyond refugee attitudes.
    Keywords: immigration, social networks, spillovers, political attitudes, integration
    JEL: O15 Z13 D72 D83 P00 J61 F22
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:ces:ceswps:_10626&r=big
  3. By: Patrick Rehill; Nicholas Biddle
    Abstract: Causal machine learning methods which flexibly generate heterogeneous treatment effect estimates could be very useful tools for governments trying to make and implement policy. However, as the critical artificial intelligence literature has shown, governments must be very careful of unintended consequences when using machine learning models. One way to try to protect against unintended bad outcomes is with AI Fairness methods, which seek to create machine learning models where sensitive variables like race or gender do not influence outcomes. In this paper we argue that standard AI Fairness approaches developed for predictive machine learning are not suitable for all causal machine learning applications because causal machine learning generally (at least so far) uses modelling to inform a human who is the ultimate decision-maker, while AI Fairness approaches assume a model that is making decisions directly. We define these scenarios as indirect and direct decision-making respectively and suggest that policy-making is best seen as a joint decision where the causal machine learning model usually only has indirect power. We lay out a definition of fairness for this scenario - a model that provides the information a decision-maker needs to accurately make a value judgement about just policy outcomes - and argue that the complexity of causal machine learning models can make this difficult to achieve. The solution here is not traditional AI Fairness adjustments, but careful modelling and awareness of some of the decision-making biases that these methods might encourage, which we describe.
    Date: 2023–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2309.00805&r=big
  4. By: Timothy DeLise
    Abstract: Modern financial electronic exchanges are an exciting and fast-paced marketplace where billions of dollars change hands every day. They are also rife with manipulation and fraud. Detecting such activity is a major undertaking, which has historically been a job reserved exclusively for humans. Recently, more research and resources have been focused on automating these processes via machine learning and artificial intelligence. Fraud detection is overwhelmingly associated with the greater field of anomaly detection, which is usually performed via unsupervised learning techniques because of the lack of labeled data needed for supervised learning. However, a small quantity of labeled data does often exist. This research article aims to evaluate the efficacy of a deep semi-supervised anomaly detection technique, called Deep SAD, for detecting fraud in high-frequency financial data. We use exclusive proprietary limit order book data from the TMX exchange in Montréal, with a small set of true labeled instances of fraud, to evaluate Deep SAD against its unsupervised predecessor. We show that incorporating a small amount of labeled data into an unsupervised anomaly detection framework can greatly improve its accuracy.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2309.00088&r=big
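The Deep SAD objective evaluated above has a compact structure that can be sketched without its neural feature map. Below is a minimal, illustrative rendering in plain Python: the identity feature map, the fixed center, and all data points are assumptions for exposition, not details from the paper.

```python
# Minimal sketch of the Deep SAD objective: unlabeled points are pulled
# toward a center; labeled normals (y = +1) are pulled in too, while
# labeled anomalies (y = -1) are pushed away via an inverse squared
# distance term. The real method learns a deep feature map; here the
# feature map is the identity on 2-D points.

def sq_dist(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def deep_sad_loss(unlabeled, labeled, center, eta=1.0, eps=1e-6):
    """Semi-supervised loss: mean distance for unlabeled points, plus a
    label-signed distance term for the few labeled points."""
    loss = sum(sq_dist(x, center) for x in unlabeled) / max(len(unlabeled), 1)
    if labeled:
        loss += eta * sum((sq_dist(x, center) + eps) ** y
                          for x, y in labeled) / len(labeled)
    return loss

def anomaly_score(x, center):
    # Larger distance from the center = more anomalous.
    return sq_dist(x, center)

center = (0.0, 0.0)
assert anomaly_score((4.0, 5.0), center) > anomaly_score((0.1, -0.2), center)
```

Because a labeled anomaly close to the center receives a large inverse-distance penalty, training pressure moves known fraud cases away from the normal region — the mechanism by which a few labels improve the unsupervised baseline.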
  5. By: Damien Challet; Vincent Ragel
    Abstract: We extend recurrent neural networks to include several flexible timescales for each dimension of their output, which mechanically improves their ability to account for processes with long memory or with highly disparate time scales. We compare the ability of vanilla and extended long short-term memory networks (LSTMs) to predict asset price volatility, known to have a long memory. Generally, the number of epochs needed to train extended LSTMs is divided by two, while the variation of validation and test losses among models with the same hyperparameters is much smaller. We also show that the model with the smallest validation loss systematically outperforms rough volatility predictions by about 20% when trained and tested on a dataset with multiple time series.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.08550&r=big
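The key idea above — several flexible timescales per output dimension — can be illustrated with the simplest multi-timescale memory, a bank of exponential moving averages. The decay rates and the toy shock series below are invented for illustration; they only show why a slow memory helps with long-memory processes such as volatility.

```python
# A bank of exponential moving averages (EMAs) with different decay
# rates: the slow EMA retains shocks that a single fast memory forgets.

def multi_scale_ema(series, alphas):
    """One EMA trajectory per smoothing factor in `alphas`.
    Small alpha = long timescale (slow forgetting)."""
    states = [series[0]] * len(alphas)
    out = []
    for x in series:
        states = [a * x + (1 - a) * s for a, s in zip(alphas, states)]
        out.append(list(states))
    return out

# One volatility shock followed by calm: the fast memory (alpha = 0.5)
# decays almost to zero, while the slow memory (alpha = 0.02) still
# retains the shock twenty steps later -- the long-memory effect.
sq_returns = [0.0] * 5 + [1.0] + [0.0] * 20
fast_final, slow_final = multi_scale_ema(sq_returns, alphas=[0.5, 0.02])[-1]
assert slow_final > fast_final
```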
  6. By: Brunori, Paolo
    Abstract: We propose the use of machine learning methods to estimate inequality of opportunity and to illustrate that regression trees and forests represent a substantial improvement over existing approaches: they reduce the risk of ad hoc model selection and trade off upward and downward bias in inequality of opportunity estimates. The advantages of regression trees and forests are illustrated by an empirical application for a cross-section of 31 European countries. We show that arbitrary model selection might lead to significant biases in inequality of opportunity estimates relative to our preferred method. These biases are reflected in both point estimates and country rankings.
    Keywords: equality of opportunity; machine learning; random forests
    JEL: J1
    Date: 2023–02–20
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:118220&r=big
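The ex-ante, tree-based logic behind such estimates can be sketched in a few lines: partition the sample into "types" by a circumstance variable, smooth incomes to type means, and compute an inequality index over the smoothed distribution. The single split, the mean-log-deviation index, and the toy data below are illustrative choices, not the paper's full regression-forest procedure.

```python
# Ex-ante inequality of opportunity with one circumstance split and the
# mean log deviation (MLD) as the inequality index.

import math

def mld(incomes):
    """Mean log deviation: log of the mean minus mean of the logs."""
    mu = sum(incomes) / len(incomes)
    return math.log(mu) - sum(math.log(y) for y in incomes) / len(incomes)

def ex_ante_iop(sample, circumstance):
    """Replace each income with its type mean (type = value of the
    circumstance variable), then compute MLD over the smoothed data."""
    groups = {}
    for row in sample:
        groups.setdefault(row[circumstance], []).append(row["income"])
    smoothed = []
    for incomes in groups.values():
        mean = sum(incomes) / len(incomes)
        smoothed.extend([mean] * len(incomes))
    return mld(smoothed)

sample = [
    {"parent_edu": "low", "income": 10.0},
    {"parent_edu": "low", "income": 14.0},
    {"parent_edu": "high", "income": 30.0},
    {"parent_edu": "high", "income": 26.0},
]
# Inequality between circumstance types is a lower bound on total inequality.
assert ex_ante_iop(sample, "parent_edu") < mld([r["income"] for r in sample])
```

A regression tree generalises this by choosing the splits (and their depth) data-adaptively, which is what reduces the ad hoc model selection the abstract criticises.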
  7. By: Kelvin J. L. Koa; Yunshan Ma; Ritchie Ng; Tat-Seng Chua
    Abstract: Multi-step stock price prediction over a long-term horizon is crucial for forecasting its volatility, allowing financial institutions to price and hedge derivatives, and banks to quantify the risk in their trading books. Additionally, most financial regulators also require a liquidity horizon of several days for institutional investors to exit their risky assets, in order to not materially affect market prices. However, the task of multi-step stock price prediction is challenging, given the highly stochastic nature of stock data. Current solutions to tackle this problem are mostly designed for single-step, classification-based predictions, and are limited to low representation expressiveness. The problem also gets progressively harder with the introduction of the target price sequence, which also contains stochastic noise and reduces generalizability at test-time. To tackle these issues, we combine a deep hierarchical variational-autoencoder (VAE) and diffusion probabilistic techniques to do seq2seq stock prediction through a stochastic generative process. The hierarchical VAE allows us to learn the complex and low-level latent variables for stock prediction, while the diffusion probabilistic model trains the predictor to handle stock price stochasticity by progressively adding random noise to the stock data. Our Diffusion-VAE (D-Va) model is shown to outperform state-of-the-art solutions in terms of its prediction accuracy and variance. More importantly, the multi-step outputs can also allow us to form a stock portfolio over the prediction length. We demonstrate the effectiveness of our model outputs in the portfolio investment task through the Sharpe ratio metric and highlight the importance of dealing with different types of prediction uncertainties.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2309.00073&r=big
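The D-Va model pairs a hierarchical VAE with a diffusion probabilistic model that progressively adds noise to the target sequence. For orientation, the standard forward (noising) process from the diffusion literature — notation assumed here, not quoted from the paper — is:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\mathbf{I}\right),
\quad \bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s),
```

where the noise schedule $\{\beta_t\}$ controls how quickly the target price sequence is corrupted; training the predictor against these noised targets is what handles the stochasticity in the test-time target sequence.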
  8. By: Morande, Swapnil; Arshi, Tahseen; Gul, Kanwal; Amini, Mitra
    Abstract: This pioneering study employs machine learning to predict startup success, addressing the long-standing challenge of deciphering entrepreneurial outcomes amidst uncertainty. Integrating the multidimensional SECURE framework for holistic opportunity evaluation with AI's pattern recognition prowess, the research puts forth a novel analytics-enabled approach to illuminate success determinants. Rigorously constructed predictive models demonstrate remarkable accuracy in forecasting success likelihood, validated through comprehensive statistical analysis. The findings reveal AI’s immense potential in bringing evidence-based objectivity to the complex process of opportunity assessment. On the theoretical front, the research enriches entrepreneurship literature by bridging the knowledge gap at the intersection of structured evaluation tools and data science. On the practical front, it empowers entrepreneurs with an analytical compass for decision-making and helps investors make prudent funding choices. The study also informs policymakers to optimize conditions for entrepreneurship. Overall, it lays the foundation for a new frontier of AI-enabled, data-driven entrepreneurship research and practice. However, acknowledging AI’s limitations, the synthesis underscores the persistent relevance of human creativity alongside data-backed insights. With high predictive performance and multifaceted implications, the SECURE-AI model represents a significant stride toward an analytics-empowered paradigm in entrepreneurship management.
    Date: 2023–08–29
    URL: http://d.repec.org/n?u=RePEc:osf:socarx:p3gyb&r=big
  9. By: Jianghao Chu (JPMorgan Chase & Co); Tae-Hwy Lee (Department of Economics, University of California Riverside); Aman Ullah (Department of Economics, University of California Riverside)
    Abstract: Carter Hill’s numerous contributions (books and articles) in econometrics stand out especially in pedagogy. An important aspect of his pedagogy is to integrate “theory and practice” of econometrics, as coined into the titles of his popular books. The new methodology we propose in this paper is consistent with these contributions of Carter Hill. In particular, we bring the maximum score regression of Manski (1975, 1985) to high dimension in theory and show that the “Asymmetric AdaBoost” provides the algorithmic implementation of the high-dimensional maximum score regression in practice. Recent advances in machine learning research have not only expanded the horizon of econometrics by providing new methods but also provided the algorithmic aspects of many traditional econometric methods. For example, Adaptive Boosting (AdaBoost), introduced by Freund and Schapire (1996), has gained enormous success in binary/discrete classification/prediction. In this paper, we introduce the “Asymmetric AdaBoost” and relate it to maximum score regression from the algorithmic perspective. The Asymmetric AdaBoost solves high-dimensional binary classification/prediction problems with state-dependent loss functions. Asymmetric AdaBoost produces a nonparametric classifier via minimizing the “asymmetric exponential risk”, which is a convex surrogate of the non-convex 0-1 risk. The convex risk function gives a huge computational advantage over the non-convex risk functions of Manski (1975, 1985), especially when the data is high-dimensional. The resulting nonparametric classifier is more robust than parametric classifiers whose performance depends on the correct specification of the model. We show that the risk of the classifier that Asymmetric AdaBoost produces approaches the Bayes risk, which is the infimum of risk achievable by any classifier. Monte Carlo experiments show that the Asymmetric AdaBoost performs better than the commonly used LASSO-regularized logistic regression when the parametric assumption is violated and the sample size is large. We apply the Asymmetric AdaBoost to predict business cycle turning points as in Ng (2014).
    Keywords: Maximum Score Regression; High Dimension; Asymmetric AdaBoost; Convex Relaxation; Exponential Risk.
    JEL: C25 C44 C53 C55
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:202306&r=big
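To make the boosting machinery concrete, here is a cost-sensitive AdaBoost sketch with decision stumps on one-dimensional data. The asymmetry is introduced through class-dependent initial weights — a simplified stand-in for the paper's state-dependent exponential risk, not the authors' exact algorithm; the training data are illustrative.

```python
# Cost-sensitive AdaBoost with decision stumps: misclassifying the
# positive class is penalised more heavily than the negative class.

import math

def stump_predict(x, threshold, sign):
    return sign if x > threshold else -sign

def best_stump(xs, ys, w):
    """Exhaustive search for the stump with the lowest weighted error."""
    best = None
    for t in sorted(set(xs)):
        for sign in (+1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if stump_predict(x, t, sign) != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best

def asymmetric_adaboost(xs, ys, rounds=5, cost_pos=2.0, cost_neg=1.0):
    # Heavier initial weight on the positive class => its errors cost more.
    w = [cost_pos if y == 1 else cost_neg for y in ys]
    w = [wi / sum(w) for wi in w]
    ensemble = []
    for _ in range(rounds):
        err, t, sign = best_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)  # exponential-loss step size
        ensemble.append((alpha, t, sign))
        w = [wi * math.exp(-alpha * y * stump_predict(x, t, sign))
             for x, y, wi in zip(xs, ys, w)]
        w = [wi / sum(w) for wi in w]
    return ensemble

def classify(x, ensemble):
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score > 0 else -1

xs = [0.5, 1.0, 1.5, 4.0, 4.5, 5.0]
ys = [-1, -1, -1, 1, 1, 1]
model = asymmetric_adaboost(xs, ys, rounds=3)
assert all(classify(x, model) == y for x, y in zip(xs, ys))
```

The sign of the weighted vote is the analogue of the maximum score classifier; minimising the exponential surrogate rather than the 0-1 loss is what makes the problem tractable in high dimension.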
  10. By: Kwok Ping Tsang; Zichao Yang
    Abstract: Based on a record of dissents on FOMC votes and transcripts of the meetings from 1976 to 2017, we develop a deep learning model based on self-attention modules to create a measure of the level of disagreement for each member in each meeting. While dissents are rare, we find that members often have reservations with the policy decision. The level of disagreement is mostly driven by current or predicted macroeconomic data, and personal characteristics of the members play almost no role. We also use our model to evaluate speeches made by members between meetings, and we find a weak correlation between the level of disagreement revealed in them and that of the following meeting. Finally, we find that the level of disagreement increases whenever monetary policy action is more aggressive.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.10131&r=big
  11. By: Ali Asgarov
    Abstract: Forecasting financial market trends through time series analysis and natural language processing poses a complex and demanding undertaking, owing to the numerous variables that can influence stock prices. These variables encompass a spectrum of economic and political occurrences, as well as prevailing public attitudes. Recent research has indicated that the expression of public sentiments on social media platforms such as Twitter may have a noteworthy impact on the determination of stock prices. The objective of this study was to assess the viability of Twitter sentiments as a tool for predicting stock prices of major corporations such as Tesla and Apple. Our study has revealed a robust association between the emotions conveyed in tweets and fluctuations in stock prices. Our findings indicate that positivity, negativity, and subjectivity are the primary determinants of fluctuations in stock prices. The data was analyzed utilizing the Long-Short Term Memory neural network (LSTM) model, which is currently recognized as a leading methodology for predicting stock prices by incorporating Twitter sentiments and historical stock price data. The models utilized in our study demonstrated a high degree of reliability and yielded precise outcomes for the designated corporations. In summary, this research emphasizes the significance of incorporating public opinions into the prediction of stock prices. The application of Time Series Analysis and Natural Language Processing methodologies can yield significant scientific findings regarding financial market patterns, thereby facilitating informed decision-making among investors. The results of our study indicate that the utilization of Twitter sentiments can serve as a potent instrument for forecasting stock prices, and ought to be factored in when formulating investment strategies.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2309.00136&r=big
  12. By: Eugene Kharitonov; Oksana Zakharchuk; Lin Mei
    Abstract: This study investigates the long-term effects of temperature variations on economic growth using a data-driven approach. Leveraging machine learning techniques, we analyze global land surface temperature data from Berkeley Earth and economic indicators, including GDP and population data, from the World Bank. Our analysis reveals a significant relationship between average temperature and GDP growth, suggesting that climate variations can substantially impact economic performance. This research underscores the importance of incorporating climate factors into economic planning and policymaking, and it demonstrates the utility of machine learning in uncovering complex relationships in climate-economy studies.
    Date: 2023–06
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.06265&r=big
  13. By: Kasey Buckles; Adrian Haws; Joseph Price; Haley E.B. Wilbert
    Abstract: The Census Tree is the largest-ever database of record links among the historical U.S. censuses, with over 700 million links for people living in the United States between 1850 and 1940. These high-quality links allow researchers in the social sciences and other disciplines to construct a longitudinal dataset that is highly representative of the population. In this paper, we describe our process for creating the Census Tree, beginning with a collection of over 317 million links contributed by the users of a free online genealogy platform. We then use these links as training data for a machine learning algorithm to make new matches, and incorporate other recent efforts to link the historical U.S. censuses. Finally, we introduce a procedure for filtering the links and adjudicating disagreements. Our complete Census Tree achieves match rates between adjacent censuses that are between 69 and 86% for men, and between 58 and 79% for women. The Census Tree includes women and Black Americans at unprecedented rates, containing 314 million links for the former and more than 41 million for the latter.
    JEL: C81 J10 N01
    Date: 2023–09
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:31671&r=big
  14. By: Leogrande, Angelo; Costantiello, Alberto; Leogrande, Domenico
    Abstract: In the following article, we analyse the determinants of the number of physicians in the context of ISTAT BES-Benessere Equo Sostenibile data among twenty Italian regions in the period 2004-2022. We apply Panel Data with Random Effects, Panel Data with Fixed Effects, and Pooled OLS-Ordinary Least Squares. We found that the number of physicians among Italian regions is positively associated, among others, with “Trust in the Police and Firefighters” and “Net Income Inequality”, and negatively associated, among others, with “Research and Development Intensity” and “Soil waterproofing by artificial cover”. Furthermore, we apply the k-Means algorithm optimized with the Silhouette Coefficient and find the presence of two clusters. Finally, we compare eight different machine-learning algorithms for predicting the future number of physicians and find that the PNN-Probabilistic Neural Network is the best predictive algorithm.
    Keywords: Analysis of Health Care Markets; Health Behaviors; Health Insurance, Public and Private; Health and Inequality; Health and Economic Development; Government Policy; Regulation; Public Health
    JEL: I10 I11 I12 I14 I15 I18
    Date: 2023–09–02
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:118460&r=big
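The clustering step reported above — k-means with the number of clusters chosen by the silhouette coefficient — can be sketched in one dimension. The data and the candidate values of k below are illustrative, not the paper's regional indicators.

```python
# Choose k for k-means by maximising the mean silhouette coefficient.

def kmeans_1d(xs, k, iters=50):
    """Plain Lloyd iterations on 1-D data; returns the final clusters."""
    centers = sorted(xs)[:: max(len(xs) // k, 1)][:k]  # spread-out seeds
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for x in xs:
            i = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            clusters[i].append(x)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

def mean_silhouette(clusters):
    """Mean of (b - a) / max(a, b): a = mean intra-cluster distance,
    b = mean distance to the nearest other cluster."""
    scores = []
    for i, cluster in enumerate(clusters):
        for x in cluster:
            a = (sum(abs(x - y) for y in cluster) / (len(cluster) - 1)
                 if len(cluster) > 1 else 0.0)
            b = min(sum(abs(x - y) for y in other) / len(other)
                    for j, other in enumerate(clusters) if j != i and other)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

xs = [1.0, 1.2, 0.9, 10.0, 10.3, 9.8]   # two well-separated groups
best_k = max((2, 3), key=lambda k: mean_silhouette(kmeans_1d(xs, k)))
assert best_k == 2
```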
  15. By: Oschinski, Matthias
    Abstract: We assess the impact of artificial intelligence (AI) on Germany’s labour market applying the suitability for machine learning (SML) scores methodology established by Brynjolfsson et al. (2018). However, this study introduces two innovative approaches to the conventional methodology. Instead of relying on traditional crowdsourcing platforms for obtaining ratings on automatability, this research exploits the chatbot capabilities of OpenAI's ChatGPT. Additionally, in alignment with the focus on the German labor market, the study extends the application of SML scores to the European Classification of Skills, Competences, Qualifications and Occupations (ESCO). As such, a distinctive contribution of this study lies in the assessment of ChatGPT's effectiveness in gauging the automatability of skills and competencies within the evolving landscape of AI. Furthermore, the study enhances the applicability of its findings by directly mapping SML scores to the European ESCO classification, rendering the results more pertinent for labor market analyses within the European Union. Initial findings indicate a measured impact of AI on a majority of the 13,312 distinct ESCO skills and competencies examined. A more detailed analysis reveals that AI exhibits a more pronounced influence on tasks related to computer utilization and information processing. Activities involving decision-making, communication, research, collaboration, and specific technical proficiencies related to medical care, food preparation, construction, and precision equipment operation receive relatively lower scores. Notably, the study highlights the comparative advantage of human employees in transversal skills like creative thinking, collaboration, leadership, the application of general knowledge, attitudes, values, and specific manual and physical skills. 
Applying our rankings to German labour force data at the 2-digit ISCO level suggests that, in contrast to previous waves of automation, AI may also impact non-routine cognitive occupations. In fact, our results show that business and administration professionals as well as science and engineering associate professionals receive relatively higher rankings compared to teaching professionals, health associate professionals and personal service workers. Ultimately, the research underscores that the overall ramifications of AI on the labor force will be contingent upon the underlying motivations for its deployment. If the primary impetus is cost reduction, AI implementation might follow historical patterns of employment losses with limited gains in productivity. As such, public policy has an important role to play in recalibrating incentives to prioritize machine usefulness over machine intelligence.
    Keywords: Generative AI, Labour, Skills Suitability for Machine Learning, German labour market, ESCO
    JEL: A1 J0
    Date: 2023–08–14
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:118300&r=big
  16. By: JEONG, Young Sik (KOREA INSTITUTE FOR INTERNATIONAL ECONOMIC POLICY (KIEP)); BAEK, Yaein (KOREA INSTITUTE FOR INTERNATIONAL ECONOMIC POLICY (KIEP))
    Abstract: We examine factors that predict financial crises and the evolution of financial crises using non-traditional methodologies, such as machine learning and system dynamics. Firstly, in our random forest model, the top six most important predictors among 12 indicators for the entire period (1870-2017) are the slope of the yield curve, the CPI, consumption, the debt service ratio, equity return, and public debt. Secondly, even though the manifestations of financial crises differ in each case, five common characteristics have been identified by examining various past financial crisis cases using a system dynamics approach (causal loop diagram). The first characteristic is a feedback loop that reinforces credit expansion. Next, the feedback loop leads to the buildup of financial crisis risk. Third, there is the shock that triggers the financial crisis. Fourth, there are risk-spreading factors. Lastly, individual financial crises do not end in themselves but have the common characteristic of becoming the seeds of new crises. In conclusion, two key findings emerge. First, the financial crisis is a systemic problem rather than an individual risk factor. Second, in diagnosing the recent situation, the results point to the risk of the financial crisis spreading.
    Keywords: Financial Crisis; Economic Crisis; Machine Learning; System Dynamics
    Date: 2023–08–04
    URL: http://d.repec.org/n?u=RePEc:ris:kiepwe:2023_028&r=big
  17. By: Andrii Babii; Eric Ghysels; Jonas Striaukas
    Abstract: This paper surveys the recent advances in machine learning methods for economic forecasting. The survey covers the following topics: nowcasting, textual data, panel and tensor data, high-dimensional Granger causality tests, time series cross-validation, and classification with economic losses.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.10993&r=big
  18. By: Kapil Panda
    Abstract: Portfolio optimization is a task that investors use to determine the best allocations for their investments, and fund managers implement computational models to help guide their decisions. While one of the most common portfolio optimization models in the industry is the Markowitz Model, practitioners recognize limitations in its framework that lead to suboptimal out-of-sample performance and unrealistic allocations. In this study, I refine the Markowitz Model by incorporating machine learning to improve portfolio performance. By using a hierarchical clustering-based approach, I am able to enhance portfolio performance on a risk-adjusted basis compared to the Markowitz Model, across various market factors.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.11202&r=big
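The hierarchical-clustering refinement of the Markowitz approach can be sketched as an HRP-style heuristic: group assets whose returns are highly correlated, split capital equally across the resulting clusters, then inverse-variance weight within each cluster. The return series, the greedy single-linkage grouping, and the 0.8 correlation threshold below are illustrative assumptions, not the paper's exact procedure.

```python
# Cluster-then-allocate sketch: equal budget per cluster,
# inverse-variance weights inside each cluster.

def variance(series):
    mu = sum(series) / len(series)
    return sum((x - mu) ** 2 for x in series) / len(series)

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (variance(a) * variance(b)) ** 0.5

def cluster_assets(returns, threshold=0.8):
    """Greedy single-linkage grouping: an asset joins the first cluster
    holding an asset it is correlated with above `threshold`."""
    clusters = []
    for name in returns:
        for cluster in clusters:
            if any(corr(returns[name], returns[m]) > threshold for m in cluster):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

def allocate(returns, clusters):
    weights = {}
    for cluster in clusters:
        inv = {m: 1.0 / variance(returns[m]) for m in cluster}
        total = sum(inv.values())
        for m in cluster:
            weights[m] = (1.0 / len(clusters)) * inv[m] / total
    return weights

returns = {
    "A": [0.01, -0.02, 0.03, 0.01, -0.01],   # A and B move together
    "B": [0.02, -0.04, 0.06, 0.02, -0.02],
    "C": [-0.01, 0.02, -0.03, -0.01, 0.01],  # C moves against A and B
}
clusters = cluster_assets(returns)           # -> [["A", "B"], ["C"]]
weights = allocate(returns, clusters)
assert abs(sum(weights.values()) - 1.0) < 1e-9
```

Splitting the budget at the cluster level first is what avoids the Markowitz tendency to concentrate allocations in a few highly correlated assets.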
  19. By: van den Berg, Gerard J. (University of Groningen); Kunaschk, Max (Institute for Employment Research (IAB), Nuremberg); Lang, Julia (Institute for Employment Research (IAB), Nuremberg); Stephan, Gesine (Institute for Employment Research (IAB), Nuremberg); Uhlendorff, Arne (CREST)
    Abstract: Predictions of whether newly unemployed individuals will become long-term unemployed are important for the planning and policy mix of unemployment insurance agencies. We analyze unique data on three sources of information on the probability of re-employment within 6 months (RE6), for the same individuals sampled from the inflow into unemployment. First, they were asked for their perceived probability of RE6. Second, their caseworkers revealed whether they expected RE6. Third, random-forest machine learning methods are trained on administrative data on the full inflow, to predict individual RE6. We compare the predictive performance of these measures and consider whether combinations improve this performance. We show that self-reported and caseworker assessments sometimes contain information not captured by the machine learning algorithm.
    Keywords: unemployment, expectations, prediction, random forest, unemployment insurance, information
    JEL: J64 J65 C55 C53 C41 C21
    Date: 2023–09
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp16426&r=big
  20. By: Zunaidah Sulong (Universiti Sultan Zainal Abidin, Malaysia); Mohammad Abdullah (Universiti Sultan Zainal Abidin, Malaysia); Emmanuel J. A. Abakah (University of Ghana Business School, Accra Ghana); David Adeabah (University of Ghana Business School, Accra Ghana); Simplice Asongu (Yaoundé, Cameroon)
    Abstract: War-related expectations cause changes to investors’ risk and return preferences. In this study, we examine the implications of war and sanctions sentiment for the G7 countries’ debt markets during the Russia-Ukraine war. We use behavioral indicators across social media, news media, and internet attention to reflect public sentiment from 1st January 2022 to 20th April 2023. We apply the quantile-on-quantile regression (QQR) and rolling window wavelet correlation (RWWC) methods. The quantile-on-quantile regression results show a heterogeneous impact across fixed income securities. Specifically, extreme public sentiment has a negative impact on G7 fixed income securities returns. The wavelet correlation results show a dynamic correlation pattern between public sentiment and fixed income securities. There is a negative relationship between public sentiment and G7 fixed income securities. The correlation is time-varying and highly event-dependent. Our additional analysis using corporate bond data indicates the robustness of our findings. Furthermore, the contagion analysis shows that public sentiment significantly influences G7 fixed income securities spillovers. Our findings can be of great significance when framing strategies for asset allocation, portfolio performance, and risk hedging.
    Keywords: Russia-Ukraine war, economic sanctions, G7 debt, fixed income securities, quantile approaches
    Date: 2023–01
    URL: http://d.repec.org/n?u=RePEc:exs:wpaper:23/057&r=big
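The time-varying link the abstract reports can be illustrated with a plain rolling-window correlation; this is a simplified stand-in for the paper's wavelet-based RWWC method, run on synthetic series with an assumed negative sentiment-return loading.

```python
import numpy as np

rng = np.random.default_rng(2)
T, w = 300, 60
sent = rng.normal(size=T)                # toy public war-sentiment proxy
ret = -0.4 * sent + rng.normal(size=T)   # toy bond returns with a negative link

def rolling_corr(a, b, window):
    """Pearson correlation over a sliding window of fixed length."""
    out = []
    for t in range(window, len(a) + 1):
        out.append(float(np.corrcoef(a[t - window:t], b[t - window:t])[0, 1]))
    return np.array(out)

rc = rolling_corr(sent, ret, w)
print(rc.mean())   # negative on average, mirroring the reported sign
```

The window-by-window estimates fluctuate around the true loading, which is the "time-varying and event dependent" pattern the wavelet analysis refines across frequencies.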
  21. By: Daniel Bussell; Camilo Andrés García-Trillos
    Abstract: We propose a new multistep deep learning-based algorithm for the resolution of moderate to high dimensional nonlinear backward stochastic differential equations (BSDEs) and their corresponding parabolic partial differential equations (PDEs). Our algorithm relies on the iterated time discretisation of the BSDE and approximates its solution and gradient using deep neural networks and automatic differentiation at each time step. The approximations are obtained by sequential minimisation of local quadratic loss functions at each time step through stochastic gradient descent. We provide an analysis of the approximation error in the case of a network architecture with weight constraints, requiring only low regularity conditions on the generator of the BSDE. The algorithm improves accuracy over its single-step parent model and has reduced complexity compared to similar models in the literature.
    Date: 2023–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2308.14487&r=big
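The backward time-stepping idea can be sketched on a one-dimensional BSDE with zero generator, where the exact solution is known. For illustration the deep networks are replaced by polynomial features and the stochastic gradient descent by a closed-form least-squares fit of each local quadratic loss; this is a drastically simplified stand-in for the paper's algorithm, not its implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, T = 20_000, 10, 1.0
dt = T / N

# Forward paths of dX = dW starting at X_0 = 0.
dW = rng.normal(0.0, np.sqrt(dt), size=(M, N))
X = np.concatenate([np.zeros((M, 1)), np.cumsum(dW, axis=1)], axis=1)

# Terminal condition g(x) = x^2 and generator f = 0, so the exact
# solution is Y_t = X_t^2 + (T - t) and Z_t = 2 X_t.
Y = X[:, N] ** 2

for n in range(N - 1, -1, -1):
    x, dw = X[:, n], dW[:, n]
    # One-step relation: Y_{n+1} ~ y_n(x) + z_n(x) * dW_n.
    # Fit y_n and z_n jointly by least squares on simple features.
    A = np.column_stack([np.ones(M), x, x * x, dw, x * dw])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    y_n = coef[0] + coef[1] * x + coef[2] * x * x   # value approximation
    z_n = coef[3] + coef[4] * x                     # gradient approximation
    Y = y_n                                          # propagate backwards

y0 = float(Y[0])   # all paths share X_0 = 0; exact value is Y_0 = T = 1.0
print(round(y0, 3))
```

Replacing the polynomial regression at each step with a neural network trained by SGD on the same local quadratic loss recovers the structure of the proposed multistep scheme.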
  22. By: Lohani, Fazle; Rahman, Mostafizur; Shaturaev, Jakhongir
    Abstract: This article discusses five specific economic patterns influenced by AI: the emergence of the machina economica, the acceleration of the division of labor, the introduction of AI leading to triangular agency relationships, the recognition of data and AI-based machine labor as new factors of production, and the potential for market dominance and unintended external effects. This analysis is grounded in institutional economics and aims to integrate findings from relevant disciplines in economics and computer science. It is based on the research finding that institutional matters remain highly relevant in a world with AI, but AI introduces a new dimension to these matters. The discussion reveals a reinforcing interdependence among the patterns discussed and highlights the need for further research.
    Keywords: AI; labor classifications; methodological procedure; agent-principal conflict; economies of scale
    JEL: D0 F1 F16 G0
    Date: 2023–01–10
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:118316&r=big
  23. By: Julia Schmidt; Graham Pilgrim; Annabelle Mourougane
    Abstract: This paper estimates the data intensity of occupations/sectors (i.e. the share of job postings per occupation/sector related to the production of data) using natural language processing (NLP) on job advertisements in the United Kingdom, Canada and the United States. Online job advertisement data collected by Lightcast provide timely and disaggregated insights into labour demand and skill requirements of different professions. The paper makes three major contributions. First, indicators created from the Lightcast data add to the understanding of digital skills in the labour market. Second, the results may advance the measurement of data assets in national account statistics. Third, the NLP methodology can handle up to 66 languages and can be adapted to measure concepts beyond digital skills. Results provide a ranking of data intensity across occupations, with data analytics activities contributing most to aggregate data intensity shares in all three countries. At the sectoral level, the emerging picture is more heterogeneous across countries. Differences in labour demand primarily explain those variations, with low data-intensive professions contributing most to aggregate data intensity in the United Kingdom. Estimates of investment in data, using a sum of costs approach and sectoral intensity shares, point to lower levels in the United Kingdom and Canada than in the United States.
    Keywords: data assets, data economy, data intensity, job advertisements, natural language processing
    JEL: C80 C88 E01 J21
    Date: 2023–09–18
    URL: http://d.repec.org/n?u=RePEc:oec:stdaaa:2023/05-en&r=big
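A minimal version of the data-intensity share the abstract defines, the share of postings per occupation related to producing data, can be computed by keyword matching. The postings and keyword list below are invented; the paper derives its classification with NLP on Lightcast job advertisements rather than a fixed dictionary.

```python
from collections import defaultdict

# Toy postings: (occupation, advertisement text) -- hypothetical examples.
postings = [
    ("data scientist", "build data pipelines and maintain databases"),
    ("data scientist", "statistical modelling and dashboard reporting"),
    ("accountant", "prepare financial statements and audits"),
    ("accountant", "maintain the client database and records"),
]
# Assumed markers of data-producing tasks.
DATA_KEYWORDS = {"data", "database", "databases", "pipelines", "dashboard"}

counts = defaultdict(lambda: [0, 0])   # occupation -> [data-related, total]
for occ, text in postings:
    tokens = set(text.lower().split())
    counts[occ][0] += bool(tokens & DATA_KEYWORDS)
    counts[occ][1] += 1

# Data intensity: share of an occupation's postings flagged as data-related.
intensity = {occ: hit / tot for occ, (hit, tot) in counts.items()}
print(intensity)   # {'data scientist': 1.0, 'accountant': 0.5}
```

Aggregating such occupation shares with labour demand weights, and costing the flagged labour input, is the route from these shares to the investment-in-data estimates the paper reports.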
  24. By: Callan Windsor (Reserve Bank of Australia); Max Zang (Reserve Bank of Australia)
    Abstract: We introduce new firm-level indices covering input costs, demand and final prices based on listed Australian firms' earnings calls going back to 2007. These indices are constructed using a powerful transformer-based large language model. We show the new indices track current economic conditions, consistent with a simple conceptual framework we use to explain why there is real-time information in firms' earnings calls. Focusing on firms' price-setting behaviour, the reduced-form associations we estimate appear to show that discussions around final prices have become more sensitive to import costs but less sensitive to labour costs in the period since 2021. This is after controlling for changes in the operating environment that are common to all firms, including global supply shocks. Firms' price-setting sentiment also appears more sensitive to rising input costs compared to falling costs, suggesting that prices could remain front-of-mind for company executives even as supply pressures ease.
    Keywords: price setting; inflation; machine learning; natural language processing; earnings calls
    JEL: C45 E31
    Date: 2023–09
    URL: http://d.repec.org/n?u=RePEc:rba:rbardp:rdp2023-06&r=big
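The firm-level indices the abstract describes amount to scoring each earnings-call sentence by topic and aggregating within the call. The paper uses a transformer-based large language model for that step; the keyword matcher and example call below are invented stand-ins used only to show the aggregation.

```python
# Assumed topic markers (the paper's model classifies sentences instead).
TOPICS = {
    "input_costs": {"costs", "wages", "freight", "input"},
    "demand": {"demand", "orders", "volumes"},
    "final_prices": {"prices", "pricing", "price"},
}

def call_indices(sentences):
    """Share of a call's sentences touching each topic."""
    counts = {topic: 0 for topic in TOPICS}
    for s in sentences:
        tokens = set(s.lower().split())
        for topic, keywords in TOPICS.items():
            counts[topic] += bool(tokens & keywords)
    n = max(len(sentences), 1)
    return {topic: c / n for topic, c in counts.items()}

# Hypothetical earnings-call snippets.
call = ["freight costs rose sharply this quarter",
        "customer demand remained resilient",
        "we passed higher costs into final prices",
        "guidance is unchanged"]
idx = call_indices(call)
print(idx)   # {'input_costs': 0.5, 'demand': 0.25, 'final_prices': 0.25}
```

Averaging such per-call scores across listed firms each quarter yields the aggregate cost, demand and price indices whose co-movements with input costs the paper then estimates.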

This nep-big issue is ©2023 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.