nep-big New Economics Papers
on Big Data
Issue of 2024‒07‒15
34 papers chosen by
Tom Coupé, University of Canterbury


  1. A Hands-on Machine Learning Primer for Social Scientists: Math, Algorithms and Code By Askitas, Nikos
  2. pystacked and ddml: Machine learning for prediction and causal inference in Stata By Mark E. Schaffer
  3. Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs By Alexander Bakumenko; Kateřina Hlaváčková-Schindler; Claudia Plant; Nina C. Hubig
  4. Nowcasting GDP: what are the gains from machine learning algorithms? By Milen Arro-Cannarsa; Dr. Rolf Scheufele
  5. Gated recurrent neural network with TPE Bayesian optimization for enhancing stock index prediction accuracy By Bivas Dinda
  6. Nowcasting subjective well-being with Google Trends: A meta-learning approach By Fabrice Murtin
  7. Machine Learning and Multiple Abortions By Kumar, Pradeep; Nicodemo, Catia; Oreffice, Sonia; Quintana-Domeque, Climent
  8. Modelling and Forecasting Energy Market Volatility Using GARCH and Machine Learning Approach By Seulki Chung
  9. BERT vs GPT for financial engineering By Edward Sharkey; Philip Treleaven
  10. Predictive modeling of foreign exchange trading signals using machine learning techniques By Sugarbayar Enkhbayar; Robert Ślepaczuk
  11. Newswire: A Large-Scale Structured Database of a Century of Historical News By Emily Silcock; Abhishek Arora; Luca D'Amico-Wong; Melissa Dell
  12. HARd to Beat: The Overlooked Impact of Rolling Windows in the Era of Machine Learning By Francesco Audrino; Jonathan Chassot
  13. From rules to forests: rule-based versus statistical models for jobseeker profiling By Junquera, Álvaro F.; Kern, Christoph
  14. Analysing the VAT cut pass-through in Spain using web-scraped supermarket data and machine learning By Nicolás Forteza; Elvira Prades; Marc Roca
  15. Statistical arbitrage in multi-pair trading strategy based on graph clustering algorithms in US equities market By Adam Korniejczuk; Robert Ślepaczuk
  16. Paired completion: quantifying issue-framing at scale with LLMs By Simon D Angus; Lachlan O'Neill
  17. HLOB -- Information Persistence and Structure in Limit Order Books By Antonio Briola; Silvia Bartolucci; Tomaso Aste
  18. Data-Driven Real-time Coupon Allocation in the Online Platform By Jinglong Dai; Hanwei Li; Weiming Zhu; Jianfeng Lin; Binqiang Huang
  19. Graduates, Training and Employment Across the Italian Regions By Arnone, Massimo; Angelillis, Barbara; Costantiello, Alberto; Leogrande, Angelo
  20. Estimating Nonlinear Heterogeneous Agent Models with Neural Networks By Kase, Hanno; Melosi, Leonardo; Rottner, Matthias
  21. Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models By Raeid Saqur; Anastasis Kratsios; Florian Krach; Yannick Limmer; Jacob-Junqi Tian; John Willes; Blanka Horvath; Frank Rudzicz
  22. Reinforcement Learning from Experience Feedback: Application to Economic Policy By Tohid Atashbar
  23. Watch Me Improve — Algorithm Aversion and Demonstrating the Ability to Learn By Berger, Benedikt; Adam, Martin; Rühr, Alexander; Benlian, Alexander
  24. Deep reinforcement learning with positional context for intraday trading By Sven Goluža; Tomislav Kovačević; Tessa Bauman; Zvonko Kostanjčar
  25. Quantifying the Reliance of Black-Box Decision-Makers on Variables of Interest By Daniel Vebman
  26. Utilizing Large Language Models for Automating Technical Customer Support By Jochen Wulf; Jürg Meierhofer
  27. Deep LPPLS: Forecasting of temporal critical points in natural, engineering and financial systems By Joshua Nielsen; Didier Sornette; Maziar Raissi
  28. Random Subspace Local Projections By Viet Hoang Dinh; Didier Nibbering; Benjamin Wong
  29. Artificial Intelligence and Entrepreneurship By Fossen, Frank M.; McLemore, Trevor; Sorgner, Alina
  30. Using Large Language Models for Text Classification in Experimental Economics By Can Celebi; Stefan Penczynski
  31. Can Earnings Calls Be Used to Gauge Labor Market Tightness? By Mick Dueholm; Aakash Kalyani; Serdar Ozkan
  32. DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts By Joel Ong; Dorien Herremans
  33. Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism By Chang Zong; Jian Shao; Weiming Lu; Yueting Zhuang
  34. Reading between the lines: Uncovering asymmetry in the central bank loss function By Haavio, Markus; Heikkinen, Joni; Jalasjoki, Pirkka; Kilponen, Juha; Paloviita, Maritta; Vänni, Ilona

  1. By: Askitas, Nikos (IZA)
    Abstract: This paper addresses the steep learning curve in Machine Learning faced by non-computer scientists, particularly social scientists, which stems from the absence of a primer on its fundamental principles. I adopt a pedagogical strategy inspired by the adage "once you understand OLS, you can work your way up to any other estimator," and apply it to Machine Learning. Focusing on a single-hidden-layer artificial neural network, the paper discusses its mathematical underpinnings, including the pivotal Universal Approximation Theorem, an essential "existence theorem". The exposition extends to the algorithmic exploration of solutions, specifically through "feed forward" and "back-propagation", and rounds off with a practical implementation in Python. The objective of this primer is to equip readers with a solid elementary comprehension of first principles and to propel some trailblazers to the forefront of AI and causal machine learning.
    Keywords: machine learning, deep learning, supervised learning, artificial neural network, perceptron, Python, keras, tensorflow, universal approximation theorem
    JEL: C01 C87 C00 C60
    Date: 2024–05
    URL: https://d.repec.org/n?u=RePEc:iza:izadps:dp17014&r=
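A minimal sketch in Python of the kind of single-hidden-layer network the primer builds toward (the synthetic data, layer width, and training settings are illustrative assumptions, not the paper's code):

```python
# Single-hidden-layer network in Keras: one Dense hidden layer, linear output.
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2           # nonlinear target to approximate

model = keras.Sequential([
    keras.layers.Input(shape=(2,)),
    keras.layers.Dense(32, activation="relu"),   # the single hidden layer
    keras.layers.Dense(1),                       # linear output
])
model.compile(optimizer="adam", loss="mse")      # back-propagation via gradient descent
model.fit(X, y, epochs=50, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))           # in-sample MSE
```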
  2. By: Mark E. Schaffer (Heriot-Watt University)
    Date: 2023–11–09
    URL: https://d.repec.org/n?u=RePEc:boc:econ23:04&r=
    By: Alexander Bakumenko (Clemson University, USA); Kateřina Hlaváčková-Schindler (University of Vienna, Austria); Claudia Plant (University of Vienna, Austria); Nina C. Hubig (Clemson University, USA)
    Abstract: Detecting anomalies in general ledger data is of utmost importance to ensure the trustworthiness of financial records. Financial audits increasingly rely on machine learning (ML) algorithms to identify irregular or potentially fraudulent journal entries, each characterized by a varying number of transactions. In machine learning, heterogeneity in feature dimensions adds significant complexity to data analysis. In this paper, we introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings. To encode non-semantic categorical data from real-world financial records, we tested three pre-trained general-purpose sentence-transformer models. For the downstream classification task, we implemented and evaluated five optimized ML models: Logistic Regression, Random Forest, Gradient Boosting Machines, Support Vector Machines, and Neural Networks. Our experiments demonstrate that LLMs contribute valuable information to anomaly detection, as our models outperform the baselines, in selected settings even by a large margin. The findings further underscore the effectiveness of LLMs in enhancing anomaly detection in financial journal entries, particularly by tackling feature sparsity. We discuss a promising perspective on using LLM embeddings for non-semantic data in the financial context and beyond.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.03614&r=
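The two-stage pipeline described above can be sketched compactly: encode entry fields with a pre-trained sentence-transformer, then train a downstream classifier. The encoder name, toy journal-entry strings, and labels are assumptions for illustration:

```python
# Encode non-semantic text with a sentence-transformer, classify downstream.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

records = ["account=4010 dept=SALES type=JE",     # toy journal-entry fields
           "account=9999 dept=??? type=JE"]
labels = [0, 1]                                    # 1 = anomalous

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # general-purpose encoder
X = encoder.encode(records)                        # text -> dense vectors
clf = LogisticRegression().fit(X, labels)
print(clf.predict_proba(X)[:, 1])                  # anomaly scores
```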
  4. By: Milen Arro-Cannarsa; Dr. Rolf Scheufele
    Abstract: We compare several machine learning methods for nowcasting GDP. A large mixed-frequency data set is used to investigate different algorithms such as regression-based methods (LASSO, ridge, elastic net), regression trees (bagging, random forest, gradient boosting), and SVR. As benchmarks, we use univariate models, a simple forward selection algorithm, and a principal components regression. The analysis accounts for publication lags and treats monthly indicators as quarterly variables combined via blocking. Our data set consists of more than 1,100 time series. For the period after the Great Recession, which is particularly challenging in terms of nowcasting, we find that all considered machine learning techniques beat the univariate benchmark by up to 28% in terms of out-of-sample RMSE. Ridge, elastic net, and SVR are the most promising algorithms in our analysis, significantly outperforming principal components regression.
    Keywords: Nowcasting, Forecasting, Machine learning, Ridge, LASSO, Elastic net, Random forest, Bagging, Boosting, SVM, SVR, Large data sets
    JEL: C53 C55 C32
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:snb:snbwpa:2024-06&r=
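As an illustration of the regression-based branch, a sketch comparing an elastic net nowcast against a naive benchmark by out-of-sample RMSE; the simulated data stand in for the paper's mixed-frequency dataset:

```python
# Elastic net nowcast vs. naive lag benchmark, compared on out-of-sample RMSE.
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
T, K = 120, 50                                    # quarters, blocked monthly indicators
X = rng.normal(size=(T, K))
y = X[:, :5] @ np.ones(5) + rng.normal(size=T)    # growth driven by a few factors

split = 80
enet = ElasticNetCV(cv=5).fit(X[:split], y[:split])
rmse_enet = mean_squared_error(y[split:], enet.predict(X[split:])) ** 0.5
rmse_naive = mean_squared_error(y[split + 1:], y[split:-1]) ** 0.5  # lag-1 benchmark
print(rmse_enet, rmse_naive)
```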
  5. By: Bivas Dinda
    Abstract: Recent advances in deep learning architectures and neural networks, combined with abundant financial data and powerful computers, are transforming finance and motivating advanced methods for predicting future stock prices. At the same time, the accessibility of investment and trading at everyone's fingertips has made stock markets increasingly intricate and prone to volatility. This increased complexity and volatility have driven demand for models that effectively capture the high volatility and non-linear behavior of different stock prices. This study explored gated recurrent neural network (GRNN) algorithms such as LSTM (long short-term memory), GRU (gated recurrent unit), and hybrid models like GRU-LSTM and LSTM-GRU, with Tree-structured Parzen Estimator (TPE) Bayesian optimization for hyperparameter optimization (TPE-GRNN). The aim is to improve the prediction accuracy of the next day's closing price of the NIFTY 50 index, a prominent Indian stock market index, using TPE-GRNN. A combination of eight influential factors is carefully chosen from fundamental stock data, technical indicators, the crude oil price, and macroeconomic data to train the models to capture the changes in the price of the index alongside the factors of the broader economy. Single-layer and multi-layer TPE-GRNN models have been developed. The models' performance is evaluated using standard metrics like R2, MAPE, and RMSE. The analysis of the models' performance reveals the impact of feature selection and hyperparameter optimization (HPO) in enhancing stock index price prediction accuracy. The results show that the MAPE of our proposed TPE-LSTM method is the lowest (best) with respect to all the previous models for stock index price prediction.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.02604&r=
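A sketch of TPE-based hyperparameter search for a GRU forecaster in the spirit of TPE-GRNN, using Optuna's TPESampler as a stand-in implementation of the Tree-structured Parzen Estimator; the toy series and search space are assumptions:

```python
# TPE search over GRU hyperparameters with Optuna; validation MSE as objective.
import numpy as np, optuna
from tensorflow import keras

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=300))           # toy index level
X = np.stack([series[i:i + 10] for i in range(280)])[..., None]
y = series[10:290]                                 # next-step targets

def objective(trial):
    units = trial.suggest_int("units", 8, 64)
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    model = keras.Sequential([keras.layers.Input((10, 1)),
                              keras.layers.GRU(units),
                              keras.layers.Dense(1)])
    model.compile(keras.optimizers.Adam(lr), loss="mse")
    model.fit(X[:250], y[:250], epochs=5, verbose=0)
    return model.evaluate(X[250:], y[250:], verbose=0)   # validation MSE

study = optuna.create_study(sampler=optuna.samplers.TPESampler(),
                            direction="minimize")
study.optimize(objective, n_trials=10)
print(study.best_params)
```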
  6. By: Fabrice Murtin
    Abstract: This paper applies machine learning techniques to Google Trends data to provide real-time estimates of national average subjective well-being among 38 OECD countries since 2010. We make extensive use of large custom micro databases to enhance the training of models on carefully pre-processed Google Trends data. We find that the best one-year-ahead prediction is obtained from a meta-learner that combines the predictions drawn from an Elastic Net with and without interactions, from a Gradient-Boosted Tree and from a Multi-layer Perceptron. As a result, across 38 countries over the 2010-2020 period, the out-of-sample prediction of average subjective well-being reaches an R2 of 0.830.
    Keywords: poverty, spatial inequality, well-being
    JEL: C1 C45 C53 D60 I31
    Date: 2024–06–28
    URL: https://d.repec.org/n?u=RePEc:oec:wiseaa:27-en&r=
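A minimal stacking meta-learner over the base models named above, sketched with scikit-learn on simulated stand-in data:

```python
# Stacking: base learners' out-of-fold predictions feed a final combiner.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                    # e.g. Google Trends features
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=500)

meta = StackingRegressor(
    estimators=[("enet", ElasticNet(alpha=0.1)),
                ("gbt", GradientBoostingRegressor()),
                ("mlp", MLPRegressor(max_iter=1000))],
    final_estimator=LinearRegression(),           # the meta-learner
    cv=5,
)
print(meta.fit(X[:400], y[:400]).score(X[400:], y[400:]))  # out-of-sample R^2
```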
  7. By: Kumar, Pradeep (University of Exeter); Nicodemo, Catia (University of Oxford); Oreffice, Sonia (University of Exeter); Quintana-Domeque, Climent (University of Exeter)
    Abstract: This study employs six Machine Learning methods - Logit, Lasso-Logit, Ridge-Logit, Random Forest, Extreme Gradient Boosting, and an Ensemble - alongside registry data on abortions in Spain from 2011 to 2019 to predict multiple abortions and assess monetary savings through targeted interventions. We find that Random Forest and the Ensemble method are most effective in the highest risk decile, capturing about 55% of cases, whereas linear models and Extreme Gradient Boosting excel in mid to lower deciles. We also show that targeting the top 20% most at-risk could yield cost savings of 5.44 to 8.2 million EUR, which could be reallocated to prevent unintended pregnancies arising from contraceptive failure, abusive relationships, and sexual assault, among other factors.
    Keywords: Extreme Gradient Boosting, Ridge, random forest, multiple abortions, Logit, Lasso, Ensemble, reproductive healthcare
    JEL: I12 I18 C53 J13 C55
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:iza:izadps:dp17046&r=
  8. By: Seulki Chung
    Abstract: This paper presents a comparative analysis of univariate and multivariate GARCH-family models and machine learning algorithms in modeling and forecasting the volatility of major energy commodities: crude oil, gasoline, heating oil, and natural gas. It uses a comprehensive dataset incorporating financial, macroeconomic, and environmental variables to assess predictive performance. Aspects of volatility persistence and transmission across these commodities, traditionally examined with GARCH-class models, are jointly explored using the SHAP (SHapley Additive exPlanations) method. The findings reveal that machine learning models demonstrate superior out-of-sample forecasting performance compared to traditional GARCH models. Machine learning models tend to underpredict, while GARCH models tend to overpredict energy market volatility, suggesting a hybrid use of both types of models. There is volatility transmission from crude oil to the gasoline and heating oil markets, while volatility transmission in the natural gas market is less prevalent.
    Date: 2024–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2405.19849&r=
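For reference, a GARCH(1,1) baseline of the kind benchmarked above can be fit in a few lines with the `arch` package; the simulated returns are an assumption:

```python
# GARCH(1,1) with Student-t errors on toy percent returns; 5-step variance forecast.
import numpy as np
from arch import arch_model

rng = np.random.default_rng(0)
returns = rng.standard_t(df=5, size=1000)         # toy daily returns, in %

res = arch_model(returns, vol="GARCH", p=1, q=1, dist="t").fit(disp="off")
forecast = res.forecast(horizon=5)
print(forecast.variance.iloc[-1])                 # 5-day-ahead variance path
```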
  9. By: Edward Sharkey; Philip Treleaven
    Abstract: The paper benchmarks several Transformer models to show how they can judge sentiment from a news event. This signal can then be used for downstream modelling and signal identification for commodity trading. We find that fine-tuned BERT models outperform fine-tuned or vanilla GPT models on this task. Transformer models have revolutionized the field of natural language processing (NLP) in recent years, achieving state-of-the-art results on various tasks such as machine translation, text summarization, question answering, and natural language generation. Among the most prominent transformer models are Bidirectional Encoder Representations from Transformers (BERT) and the Generative Pre-trained Transformer (GPT), which differ in their architectures and objectives. An overview of the CopBERT model's training data and process is provided. The CopBERT model outperforms similar domain-specific fine-tuned BERT models such as FinBERT. Confusion matrices in the paper show the performance of CopBERT and CopGPT respectively: we see a roughly 10 percent increase in F1 score when comparing CopBERT with GPT-4, and a 16 percent increase versus CopGPT. Whilst GPT-4 is dominant in raw predictive power, this highlights the importance of considering alternatives to GPT models for financial engineering tasks, given the risks of hallucinations and the challenges of interpretability. Unsurprisingly, the larger LLMs outperform the BERT models on predictive power. In summary, BERT is partially the new XGBoost: what it lacks in predictive power it provides in higher levels of interpretability. BERT models might not be the next XGBoost, but they represent an interesting alternative for financial engineering tasks that require a blend of interpretability and accuracy.
    Date: 2024–04
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.12990&r=
  10. By: Sugarbayar Enkhbayar (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group); Robert Ślepaczuk (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group, Department of Quantitative Finance and Machine Learning)
    Abstract: This study aimed to apply algorithmic trading strategies to major foreign exchange pairs and to compare the performance of machine learning-based strategies and traditional trend-following strategies against benchmark strategies. It differs from other studies in that it considered a wide variety of cases, including different foreign exchange pairs, return methods, data frequencies, and individual and integrated trading strategies. Ridge regression, KNN, RF, XGBoost, GBDT, ANN, LSTM, and GRU models were used for the machine learning-based strategies, while the MA cross strategy was employed for the trend-following strategy. Backtests were performed on six major pairs over the period from January 1, 2000 to June 30, 2023, using daily and intraday data. The Sharpe ratio was used as the metric of economic significance, and the independent t-test was used to determine statistical significance. The general findings of the study suggest that the currency market has become more efficient. The rise in efficiency is probably caused by the fact that more algorithms are being used in this market and information spreads much faster. Rather than finding one trading strategy that works well on all major foreign exchange pairs, our study showed that it is possible to find an effective algorithmic trading strategy that generates a more effective trading signal in each specific case.
    Keywords: machine learning, algorithmic trading, foreign exchange market, rolling walk-forward optimization, technical indicators
    JEL: C4 C14 C45 C53 C58 G13
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:war:wpaper:2024-10&r=
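The trend-following benchmark is easy to make concrete. A minimal moving-average cross backtest with an annualized Sharpe ratio, on simulated prices and with assumed window lengths:

```python
# MA cross strategy: long when the fast average is above the slow, short otherwise.
import numpy as np, pandas as pd

rng = np.random.default_rng(0)
price = pd.Series(np.cumprod(1 + rng.normal(0, 0.005, 2000)))  # toy FX series

fast, slow = price.rolling(10).mean(), price.rolling(50).mean()
position = np.sign(fast - slow).shift(1)          # trade on yesterday's signal
ret = position * price.pct_change()
sharpe = np.sqrt(252) * ret.mean() / ret.std()    # annualized Sharpe ratio
print(round(sharpe, 2))
```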
  11. By: Emily Silcock; Abhishek Arora; Luca D'Amico-Wong; Melissa Dell
    Abstract: In the U.S. historically, local newspapers drew their content largely from newswires like the Associated Press. Historians argue that newswires played a pivotal role in creating a national identity and shared understanding of the world, but there is no comprehensive archive of the content sent over newswires. We reconstruct such an archive by applying a customized deep learning pipeline to hundreds of terabytes of raw image scans from thousands of local newspapers. The resulting dataset contains 2.7 million unique public domain U.S. newswire articles, written between 1878 and 1977. Locations in these articles are georeferenced, topics are tagged using customized neural topic classification, named entities are recognized, and individuals are disambiguated to Wikipedia using a novel entity disambiguation model. To construct the Newswire dataset, we first recognize newspaper layouts and transcribe around 138 millions structured article texts from raw image scans. We then use a customized neural bi-encoder model to de-duplicate reproduced articles, in the presence of considerable abridgement and noise, quantifying how widely each article was reproduced. A text classifier is used to ensure that we only include newswire articles, which historically are in the public domain. The structured data that accompany the texts provide rich information about the who (disambiguated individuals), what (topics), and where (georeferencing) of the news that millions of Americans read over the course of a century. We also include Library of Congress metadata information about the newspapers that ran the articles on their front pages. The Newswire dataset is useful both for large language modeling - expanding training data beyond what is available from modern web texts - and for studying a diversity of questions in computational linguistics, social science, and the digital humanities.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.09490&r=
  12. By: Francesco Audrino; Jonathan Chassot
    Abstract: We investigate the predictive abilities of the heterogeneous autoregressive (HAR) model compared to machine learning (ML) techniques across an unprecedented dataset of 1,455 stocks. Our analysis focuses on the role of fitting schemes, particularly the training window and re-estimation frequency, in determining the HAR model's performance. Despite extensive hyperparameter tuning, ML models fail to surpass the linear benchmark set by HAR when utilizing a refined fitting approach for the latter. Moreover, the simplicity of HAR allows for an interpretable model with drastically lower computational costs. We assess performance using QLIKE, MSE, and realized utility metrics, finding that HAR consistently outperforms its ML counterparts when both rely solely on realized volatility and VIX as predictors. Our results underscore the importance of a correctly specified fitting scheme. They suggest that properly fitted HAR models provide superior forecasting accuracy, establishing robust guidelines for their practical application and use as a benchmark. This study not only reaffirms the efficacy of the HAR model but also provides a critical perspective on the practical limitations of ML approaches in realized volatility forecasting.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.08041&r=
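The HAR benchmark itself is a short regression. A sketch with a rolling estimation window; the window length and simulated realized-volatility series are assumptions:

```python
# HAR: regress next-day RV on daily, weekly (5d), and monthly (22d) RV averages.
import numpy as np, pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
rv = pd.Series(np.abs(rng.normal(size=1500)))     # toy realized volatility

X = pd.concat({"d": rv, "w": rv.rolling(5).mean(), "m": rv.rolling(22).mean()},
              axis=1)
data = pd.concat([rv.shift(-1).rename("rv_next"), X], axis=1).dropna()

window = 1000                                     # rolling estimation window
train = data.iloc[-window - 1:-1]
fit = sm.OLS(train["rv_next"], sm.add_constant(train[["d", "w", "m"]])).fit()
print(fit.params)                                 # HAR coefficients
```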
  13. By: Junquera, Álvaro F. (Universitat Autònoma de Barcelona); Kern, Christoph
    Abstract: Public employment services (PES) commonly apply profiling systems to target support programs to jobseekers at risk of becoming long-term unemployed. Such systems often codify institutional experience in a set of decision rules whose predictive ability, however, is seldom tested. We systematically evaluate the predictive performance of a rule-based system currently implemented by the PES of Catalonia, Spain, in comparison to the performance of statistical models in predicting future long-term unemployment episodes. Using comprehensive administrative data, we develop linear and machine learning models and evaluate their performance with respect to both discrimination and calibration. Compared to the current rule-based system of Catalonia, our machine learning models achieve greater discrimination ability and remarkable improvements in calibration. In particular, our random forest model accurately forecasts episodes and outperforms the rule-based model by offering robust quantitative predictions that perform well under stress tests. This paper presents the first performance comparison between a complex, currently implemented, rule-based approach and complex statistical profiling models. Our work illustrates the importance of assessing the calibration of profiling models and the potential of statistical tools to assist public employment offices in Spain.
    Date: 2024–06–14
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:c7ps3&r=
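A sketch of the paper's two evaluation dimensions, discrimination (AUC) and calibration, on a simulated stand-in for the profiling task:

```python
# Discrimination via ROC AUC; calibration via predicted vs. observed frequencies.
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + rng.normal(size=2000) > 0).astype(int)  # long-term unemployment

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
p = RandomForestClassifier(random_state=0).fit(Xtr, ytr).predict_proba(Xte)[:, 1]
print("AUC:", roc_auc_score(yte, p))                   # discrimination
frac_pos, mean_pred = calibration_curve(yte, p, n_bins=10)
print(np.c_[mean_pred, frac_pos])                      # calibration table
```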
  14. By: Nicolás Forteza (Banco de España); Elvira Prades (Banco de España); Marc Roca (Banco de España)
    Abstract: On 28 December 2022, the Spanish government announced a temporary Value Added Tax (VAT) rate reduction for selected products. VAT rates were cut on 1 January 2023 and are expected to go back to their previous level by mid-2024. Using a web-scraped dataset, we leverage machine learning techniques to classify each product. We then study the price effects of the temporary VAT rate reduction, covering the daily prices of roughly 10,000 food products sold online by a Spanish supermarket. To identify the causal price effects, we compare the evolution of prices for treated items (that is, those subject to the tax policy) against a control group (food items outside the policy’s scope). Our findings indicate that, at the supermarket level, the pass-through was almost complete. We observe differences in the speed of pass-through across different product types.
    Keywords: price rigidity, inflation, consumer prices, heterogeneity, microdata, VAT pass-through
    JEL: E31 H22 H25
    Date: 2024–05
    URL: https://d.repec.org/n?u=RePEc:bde:wpaper:2417&r=
  15. By: Adam Korniejczuk (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group); Robert Ślepaczuk (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group, Department of Quantitative Finance and Machine Learning)
    Abstract: The study seeks to develop an effective trading strategy within a novel framework of statistical arbitrage based on graph clustering algorithms. An amalgamation of quantitative and machine learning methods, including the Kelly criterion and an ensemble of machine learning classifiers, is used to improve risk-adjusted returns and increase immunity to transaction costs relative to existing approaches. The study seeks to provide an integrated approach to optimal signal detection and risk management. As part of this approach, innovative ways of optimizing take-profit and stop-loss functions for daily-frequency trading strategies are proposed and tested. All of the tested approaches outperformed appropriate benchmarks, and the best combinations of techniques and parameters demonstrated significantly better performance metrics than the relevant benchmarks. The results were obtained under the assumption of realistic transaction costs but are sensitive to changes in some key parameters.
    Keywords: graph clustering algorithms, statistical arbitrage, algorithmic investment strategies, pair trading strategy, Kelly criterion, machine learning, risk adjusted returns
    JEL: C4 C45 C55 C65 G11
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:war:wpaper:2024-09&r=
  16. By: Simon D Angus (SoDa Laboratories & Dept. of Economics, Monash Business School); Lachlan O'Neill (SoDa Laboratories, Monash Business School)
    Abstract: Detecting and quantifying issue framing in textual discourse - the slant or perspective one takes on a given topic (e.g. climate science vs. denialism, misogyny vs. gender equality) - is highly valuable to a range of end-users, from social and political scientists to program evaluators and policy analysts. Being able to identify statistically significant shifts, reversals, or changes in issue framing in public discourse would enable the quantitative evaluation of interventions, actors and events that shape discourse. However, issue framing is notoriously challenging for automated natural language processing (NLP) methods, since the words and phrases used by either 'side' of an issue are often held in common, with only subtle stylistic flourishes separating their use. Here we develop and rigorously evaluate new detection methods for issue framing and narrative analysis within large text datasets. By introducing a novel application of next-token log probabilities derived from generative large language models (LLMs), we show that issue framing can be reliably and efficiently detected in large corpora with only a few examples of either perspective on a given issue, a method we call 'paired completion'. Through 192 independent experiments over three novel, synthetic datasets, we evaluate paired completion against prompt-based LLM methods and labelled methods using traditional NLP and recent LLM contextual embeddings. We additionally conduct a cost-based analysis to mark out the feasible set of performant methods at production-level scales, as well as a model-bias analysis. Together, our work demonstrates a feasible path to scalable, accurate and low-bias issue-framing analysis in large corpora.
    Keywords: slant detection, text-as-data, synthetic data, computational linguistics
    JEL: C19 C55
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:ajr:sodwps:2024-02&r=
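The core scoring step of 'paired completion' can be sketched with next-token log probabilities from any causal language model; the model choice (GPT-2), primer texts, and scoring details below are illustrative assumptions, not the paper's implementation:

```python
# Score a document's log-likelihood as a continuation of each framing primer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def completion_logprob(primer: str, completion: str) -> float:
    # Note: tokenizing primer+completion jointly approximates the boundary.
    ids = tok(primer + completion, return_tensors="pt").input_ids
    n_primer = tok(primer, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(ids).logits
    logp = torch.log_softmax(logits[0, :-1], dim=-1)
    target = ids[0, 1:]
    token_lp = logp[torch.arange(len(target)), target]  # log P(token | prefix)
    return token_lp[n_primer - 1:].sum().item()         # completion tokens only

doc = " The evidence on warming trends is overwhelming."
a = completion_logprob("Climate science is settled.", doc)
b = completion_logprob("Claims of warming are exaggerated.", doc)
print("framing A" if a > b else "framing B")
```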
  17. By: Antonio Briola; Silvia Bartolucci; Tomaso Aste
    Abstract: We introduce a novel large-scale deep learning model for forecasting mid-price changes in Limit Order Books, and we name it 'HLOB'. This architecture (i) exploits the information encoded by an Information Filtering Network, namely the Triangulated Maximally Filtered Graph, to unveil deeper and non-trivial dependency structures among volume levels; and (ii) guarantees deterministic design choices to handle the complexity of the underlying system by drawing inspiration from the groundbreaking class of Homological Convolutional Neural Networks. We test our model against 9 state-of-the-art deep learning alternatives on 3 real-world Limit Order Book datasets, each including 15 stocks traded on the NASDAQ exchange, and we systematically characterize the scenarios where HLOB outperforms state-of-the-art architectures. Our approach sheds new light on the spatial distribution of information in Limit Order Books and on its degradation over increasing prediction horizons, narrowing the gap between microstructural modeling and deep learning-based forecasting in high-frequency financial markets.
    Date: 2024–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2405.18938&r=
  18. By: Jinglong Dai; Hanwei Li; Weiming Zhu; Jianfeng Lin; Binqiang Huang
    Abstract: Traditionally, firms have offered coupons to customer groups at predetermined discount rates. However, advancements in machine learning and the availability of abundant customer data now enable platforms to provide real-time customized coupons to individuals. In this study, we partner with Meituan, a leading shopping platform, to develop a real-time, end-to-end coupon allocation system that is fast and effective in stimulating demand while adhering to marketing budgets when faced with uncertain traffic from a diverse customer base. Leveraging comprehensive customer and product features, we estimate Conversion Rates (CVR) under various coupon values and employ isotonic regression to ensure the monotonicity of predicted CVRs with respect to coupon value. Using calibrated CVR predictions as input, we propose a Lagrangian Dual-based algorithm that efficiently determines optimal coupon values for each arriving customer within 50 milliseconds. We theoretically and numerically investigate the model performance under parameter misspecifications and apply a control loop to adapt to real-time updated information, thereby better adhering to the marketing budget. Finally, we demonstrate through large-scale field experiments and observational data that our proposed coupon allocation algorithm outperforms traditional approaches in terms of both higher conversion rates and increased revenue. As of May 2024, Meituan has implemented our framework to distribute coupons to over 100 million users across more than 110 major cities in China, resulting in an additional CNY 8 million in annual profit. We demonstrate how to integrate a machine learning prediction model for estimating customer CVR, a Lagrangian Dual-based coupon value optimizer, and a control system to achieve real-time coupon delivery while dynamically adapting to random customer arrival patterns.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.05987&r=
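The monotonicity step described above maps directly onto scikit-learn's isotonic regression; the raw CVR estimates below are toy values:

```python
# Enforce non-decreasing predicted conversion rates in coupon value.
import numpy as np
from sklearn.isotonic import IsotonicRegression

coupon_values = np.array([0, 2, 4, 6, 8, 10])              # CNY
raw_cvr = np.array([0.05, 0.08, 0.07, 0.11, 0.10, 0.14])   # non-monotone estimates

iso = IsotonicRegression(increasing=True)
calibrated = iso.fit_transform(coupon_values, raw_cvr)
print(calibrated)   # non-decreasing CVR curve fed to the coupon optimizer
```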
  19. By: Arnone, Massimo; Angelillis, Barbara; Costantiello, Alberto; Leogrande, Angelo
    Abstract: In this article, we analyze the relationships connecting high school graduates, the training system, and employment rates and conditions in the Italian regions between 2004 and 2022. The data used refer to the Istat BES database. The results show that growth in the number of high school graduates is positively associated with higher university education and employment, with the exception of job satisfaction. Subsequently, we present a clustering with the k-Means algorithm, comparing the Silhouette Coefficient with the Elbow Method. Finally, we compare seven different machine learning algorithms for predicting the number of high school graduates. We also present economic policy suggestions to increase schooling in the Italian regions. The results are critically discussed.
    Keywords: Labor Force and Employment, Human Capital, Occupational Choice, Job Satisfaction, Wage Differentials, Public Policy.
    JEL: J21 J24 J28 J31 J38
    Date: 2024–06–01
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:121117&r=
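A sketch of the model-selection step, comparing the silhouette coefficient with the elbow (inertia) curve for k-Means; the data are simulated:

```python
# Compare silhouette scores with inertia (elbow) across candidate k values.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(20, 3)) for c in (0, 4, 8)])  # 3 groups

for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1), round(silhouette_score(X, km.labels_), 3))
```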
  20. By: Kase, Hanno (European Central Bank); Melosi, Leonardo (University of Warwick, FRB Chicago, DNB, & CEPR); Rottner, Matthias (Deutsche Bundesbank)
    Abstract: We leverage recent advancements in machine learning to develop an integrated method to solve globally and estimate models featuring agent heterogeneity, nonlinear constraints, and aggregate uncertainty. Using simulated data, we show that the proposed method accurately estimates the parameters of a nonlinear Heterogeneous Agent New Keynesian (HANK) model with a zero lower bound (ZLB) constraint. We further apply our method to estimate this HANK model using U.S. data. In the estimated model, the interaction between the ZLB constraint and idiosyncratic income risks emerges as a key source of aggregate output volatility.
    Keywords: Neural networks, likelihood, global solution, heterogeneous agents, nonlinearity, aggregate uncertainty, HANK, zero lower bound
    JEL: C11 C45 D31 E32 E52
    Date: 2024
    URL: http://d.repec.org/n?u=RePEc:wrk:warwec:1499&r=
  21. By: Raeid Saqur; Anastasis Kratsios; Florian Krach; Yannick Limmer; Jacob-Junqi Tian; John Willes; Blanka Horvath; Frank Rudzicz
    Abstract: We propose MoE-F -- a formalised mechanism for combining $N$ pre-trained expert Large Language Models (LLMs) in online time-series prediction tasks by adaptively forecasting the best weighting of LLM predictions at every time step. Our mechanism leverages the conditional information in each expert's running performance to forecast the best combination of LLMs for predicting the time series at its next step. Diverging from static (learned) Mixture of Experts (MoE) methods, MoE-F employs time-adaptive stochastic filtering techniques to combine experts. By framing the expert selection problem as a finite state-space, continuous-time Hidden Markov model (HMM), we can leverage the Wonham-Shiryaev filter. Our approach first constructs $N$ parallel filters corresponding to each of the $N$ individual LLMs. Each filter proposes its best combination of LLMs, given the information that it has access to. Subsequently, the $N$ filter outputs are aggregated to optimize a lower bound for the loss of the aggregated LLMs, which can be optimized in closed form, thus generating our ensemble predictor. Our contributions here are: (I) the MoE-F algorithm -- deployable as a plug-and-play filtering harness, (II) theoretical optimality guarantees of the proposed filtering-based gating algorithm, and (III) empirical evaluation and ablative results using state-of-the-art foundational and MoE LLMs on a real-world Financial Market Movement task, where MoE-F attains a remarkable 17% absolute and 48.5% relative F1-measure improvement over the next best performing individual LLM expert.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.02969&r=
  22. By: Tohid Atashbar
    Abstract: Learning from the past is critical for shaping the future, especially when it comes to economic policymaking. Building upon current methods in the application of Reinforcement Learning (RL) to large language models (LLMs), this paper introduces Reinforcement Learning from Experience Feedback (RLXF), a procedure that tunes LLMs based on lessons from past experiences. RLXF integrates historical experiences into LLM training in two key ways - by training reward models on historical data, and by using that knowledge to fine-tune the LLMs. As a case study, we applied RLXF to tune an LLM using the IMF's MONA database to generate historically-grounded policy suggestions. The results demonstrate RLXF's potential to equip generative AI with a nuanced perspective informed by previous experiences. Overall, RLXF could enable more informed applications of LLMs for economic policy, but the approach is not without the potential risks and limitations of relying heavily on historical data, as it may perpetuate biases and outdated assumptions.
    Keywords: LLMs; GAI; RLHF; RLAIF; RLXF
    Date: 2024–06–07
    URL: https://d.repec.org/n?u=RePEc:imf:imfwpa:2024/114&r=
  23. By: Berger, Benedikt; Adam, Martin; Rühr, Alexander; Benlian, Alexander
    Abstract: Owing to advancements in artificial intelligence (AI) and specifically in machine learning, information technology (IT) systems can support humans in an increasing number of tasks. Yet, previous research indicates that people often prefer human support to support by an IT system, even if the latter provides superior performance – a phenomenon called algorithm aversion. A possible cause of algorithm aversion put forward in literature is that users lose trust in IT systems they become familiar with and perceive to err, for example, making forecasts that turn out to deviate from the actual value. Therefore, this paper evaluates the effectiveness of demonstrating an AI-based system’s ability to learn as a potential countermeasure against algorithm aversion in an incentive-compatible online experiment. The experiment reveals how the nature of an erring advisor (i.e., human vs. algorithmic), its familiarity to the user (i.e., unfamiliar vs. familiar), and its ability to learn (i.e., non-learning vs. learning) influence a decision maker’s reliance on the advisor’s judgement for an objective and non-personal decision task. The results reveal no difference in the reliance on unfamiliar human and algorithmic advisors, but differences in the reliance on familiar human and algorithmic advisors that err. Demonstrating an advisor’s ability to learn, however, offsets the effect of familiarity. Therefore, this study contributes to an enhanced understanding of algorithm aversion and is one of the first to examine how users perceive whether an IT system is able to learn. The findings provide theoretical and practical implications for the employment and design of AI-based systems.
    Date: 2024–06–18
    URL: https://d.repec.org/n?u=RePEc:dar:wpaper:146095&r=
  24. By: Sven Golu\v{z}a; Tomislav Kova\v{c}evi\'c; Tessa Bauman; Zvonko Kostanj\v{c}ar
    Abstract: Deep reinforcement learning (DRL) is a well-suited approach to financial decision-making, where an agent makes decisions based on its trading strategy developed from market observations. Existing DRL intraday trading strategies mainly use price-based features to construct the state space. They neglect the contextual information related to the position of the strategy, which is an important aspect given the sequential nature of intraday trading. In this study, we propose a novel DRL model for intraday trading that introduces positional features encapsulating the contextual information into its sparse state space. The model is evaluated over an extended period of almost a decade and across various assets including commodities and foreign exchange securities, taking transaction costs into account. The results show a notable performance in terms of profitability and risk-adjusted metrics. The feature importance results show that each feature incorporating contextual information contributes to the overall performance of the model. Additionally, through an exploration of the agent's intraday trading activity, we unveil patterns that substantiate the effectiveness of our proposed model.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.08013&r=
  25. By: Daniel Vebman
    Abstract: This paper introduces a framework for measuring how much black-box decision-makers rely on variables of interest. The framework adapts a permutation-based measure of variable importance from the explainable machine learning literature. With an emphasis on applicability, I present some of the framework's theoretical and computational properties, explain how reliance computations have policy implications, and work through an illustrative example. In the empirical application to interruptions by Supreme Court Justices during oral argument, I find that the effect of gender is more muted compared to the existing literature's estimate; I then use this paper's framework to compare Justices' reliance on gender and alignment to their reliance on experience, which are incomparable using regression coefficients.
    Date: 2024–05
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2405.17225&r=
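A toy version of the permutation-based reliance measure, with a deterministic stand-in for the black-box decision-maker:

```python
# Reliance on an input = share of decisions flipped when that input is permuted.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
gender = rng.integers(0, 2, n)
experience = rng.normal(size=n)
decide = lambda g, e: (0.2 * g + e > 0)           # black-box decision rule

base = decide(gender, experience)
reliance_g = np.mean(base != decide(rng.permutation(gender), experience))
reliance_e = np.mean(base != decide(gender, rng.permutation(experience)))
print(reliance_g, reliance_e)  # comparable reliance measures across inputs
```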
  26. By: Jochen Wulf; J\"urg Meierhofer
    Abstract: The use of large language models (LLMs) such as OpenAI's GPT-4 in technical customer support (TCS) has the potential to revolutionize this area. This study examines automated text correction, summarization of customer inquiries and question answering using LLMs. Through prototypes and data analyses, the potential and challenges of integrating LLMs into the TCS will be demonstrated. Our results show promising approaches for improving the efficiency and quality of customer service through LLMs, but also emphasize the need for quality-assured implementation and organizational adjustments in the data ecosystem.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.01407&r=
  27. By: Joshua Nielsen (University of Colorado, Boulder); Didier Sornette (Risks-X, Southern University of Science and Technology (SUSTech); Swiss Finance Institute; ETH Zürich - Department of Management, Technology, and Economics (D-MTEC); Tokyo Institute of Technology); Maziar Raissi (University of California, Riverside)
    Abstract: The Log-Periodic Power Law Singularity (LPPLS) model offers a general framework for capturing dynamics and predicting transition points in diverse natural and social systems. In this work, we present two calibration techniques for the LPPLS model using deep learning. First, we introduce the Mono-LPPLS-NN (M-LNN) model; for any given empirical time series, a unique M-LNN model is trained and shown to outperform state-of-the-art techniques in estimating the nonlinear parameters (tc, m, ω) of the LPPLS model, as evidenced by the comprehensive distribution of parameter errors. Second, we extend the M-LNN model to a more general architecture, the Poly-LPPLS-NN (P-LNN), which is able to quickly estimate the nonlinear parameters of the LPPLS model for any given time series of a fixed length, including time series unseen during training. The Poly class of models trains on many synthetic LPPLS time series augmented with various noise structures in a supervised manner. Given enough training examples, the P-LNN models also outperform state-of-the-art techniques for estimating the parameters of the LPPLS model, as evidenced by the comprehensive distribution of parameter errors. Additionally, this class of models is shown to substantially reduce the time needed to obtain parameter estimates. Finally, we present applications to the diagnosis and prediction of two financial bubble peaks (followed by their crashes) and of a famous rockslide. These contributions provide a bridge between deep learning and the study of the prediction of transition times in complex time series.
    Keywords: log-periodicity, finite-time singularity, prediction, change of regime, financial bubbles, landslides, deep learning
    JEL: C00 C13 C69 G01
    Date: 2024–05
    URL: https://d.repec.org/n?u=RePEc:chf:rpseri:rp2433&r=
  28. By: Viet Hoang Dinh; Didier Nibbering; Benjamin Wong
    Abstract: We show how random subspace methods can be adapted to estimating local projections with many controls. Random subspace methods have their roots in the machine learning literature and are implemented by averaging over regressions estimated over different combinations of subsets of these controls. We document three key results: (i) Our approach can successfully recover the impulse response functions across Monte Carlo experiments representative of different macroeconomic settings and identification schemes. (ii) Our results suggest that random subspace methods are more accurate than other dimension reduction methods if the underlying large dataset has a factor structure similar to typical macroeconomic datasets such as FRED-MD. (iii) Our approach leads to differences in the estimated impulse response functions relative to benchmark methods when applied to two widely studied empirical applications.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.01002&r=
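A sketch of the random subspace idea for local projections: average the coefficient on the shock across regressions that each control for a random subset of the controls. Subset size, number of draws, and the simulated data are assumptions:

```python
# Random subspace averaging: many small regressions, one averaged estimate.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
T, K = 200, 40
shocks = rng.normal(size=T)
controls = rng.normal(size=(T, K))
y = 0.5 * shocks + 0.2 * controls[:, :3].sum(1) + rng.normal(size=T)

betas = []
for _ in range(200):                               # random subspaces
    idx = rng.choice(K, size=8, replace=False)     # random subset of controls
    X = np.column_stack([shocks, controls[:, idx]])
    betas.append(LinearRegression().fit(X, y).coef_[0])
print(np.mean(betas))                              # averaged shock coefficient
```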
  29. By: Fossen, Frank M. (University of Nevada, Reno); McLemore, Trevor (University of Nevada, Reno); Sorgner, Alina (John Cabot University)
    Abstract: This survey reviews the emerging but fast-growing literature on the impacts of artificial intelligence (AI) on entrepreneurship, providing a resource for researchers in entrepreneurship and neighboring disciplines. We begin with a review of definitions of AI and show that the ambiguity and breadth of the definitions adopted in empirical studies may result in obscured evidence on the impacts of AI on entrepreneurship. Against this background, we present and discuss existing theory and evidence on how AI technologies affect entrepreneurial opportunities and decision-making under uncertainty, the adoption of AI technologies by startups, entry barriers, and the performance of entrepreneurial businesses. We add an original empirical analysis of survey data from the German Socio-Economic Panel, revealing that entrepreneurs, particularly those with employees, are aware of and use AI technologies significantly more frequently than paid employees. Next, we discuss how AI may affect entrepreneurship indirectly through its impact on local and sectoral labor markets. The reviewed evidence suggests that AI technologies designed to automate jobs are likely to result in a higher level of necessity entrepreneurship in a region, whereas AI technologies that transform jobs without necessarily displacing human workers increase the level of opportunity entrepreneurship. More generally, AI impacts regional entrepreneurship ecosystems (EE) in multiple ways by altering the importance of existing EE elements and processes, creating new ones, and potentially reducing the role of geography for entrepreneurship. Lastly, we address the question of how regulation of AI may affect the entrepreneurship landscape, focusing on the case of the European Union, which has pioneered data protection and AI legislation. We conclude our survey by discussing implications for entrepreneurship research and policy.
    Keywords: artificial intelligence, machine learning, entrepreneurship, AI startups, digital entrepreneurship, opportunity, innovation, entrepreneurship ecosystem, digital entrepreneurship ecosystem, AI regulation
    JEL: J24 L26 O30
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:iza:izadps:dp17055&r=
  30. By: Can Celebi (University of Mannheim); Stefan Penczynski (School of Economics and Centre for Behavioural and Experimental Social Science, University of East Anglia)
    Abstract: In our study, we compare the classification capabilities of GPT-3.5 and GPT-4 with human annotators using text data from economic experiments. We analysed four text corpora, focusing on two domains: promises and strategic reasoning. Starting with prompts close to those given to human annotators, we subsequently explored alternative prompts to investigate the effect of varying classification instructions and degrees of background information on the models’ classification performance. Additionally, we varied the number of examples in a prompt (few-shot vs zero-shot) and the use of the zero-shot “Chain of Thought” prompting technique. Our findings show that GPT-4’s performance is comparable to human annotators, achieving accuracy levels near or over 90% in three tasks, and in the most challenging task of classifying strategic thinking in asymmetric coordination games, it reaches an accuracy level above 70%.
    Keywords: Text Classification, GPT, Strategic Thinking, Promises
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:uea:wcbess:24-01&r=
  31. By: Mick Dueholm; Aakash Kalyani; Serdar Ozkan
    Abstract: An index that uses textual analysis of earnings calls to track labor issues appears to be highly correlated with one measure of labor market tightness.
    Keywords: textual analysis; natural language processing; earnings calls; labor markets; labor market tightness
    Date: 2024–06–18
    URL: https://d.repec.org/n?u=RePEc:fip:l00001:98404&r=
  32. By: Joel Ong; Dorien Herremans
    Abstract: This paper introduces DeepUnifiedMom, a deep learning framework that enhances portfolio management through a multi-task learning approach and a multi-gate mixture of experts. The essence of DeepUnifiedMom lies in its ability to create unified momentum portfolios that incorporate the dynamics of time series momentum across a spectrum of time frames, a feature often missing in traditional momentum strategies. Our comprehensive backtesting, encompassing diverse asset classes such as equity indexes, fixed income, foreign exchange, and commodities, demonstrates that DeepUnifiedMom consistently outperforms benchmark models, even after factoring in transaction costs. This superior performance underscores DeepUnifiedMom's capability to capture the full spectrum of momentum opportunities within financial markets. The findings highlight DeepUnifiedMom as an effective tool for practitioners looking to exploit the entire range of momentum opportunities. It offers a compelling solution for improving risk-adjusted returns and is a valuable strategy for navigating the complexities of portfolio management.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.08742&r=
  33. By: Chang Zong; Jian Shao; Weiming Lu; Yueting Zhuang
    Abstract: The accurate prediction of stock movements is crucial for investment strategies. Stock prices are subject to the influence of various forms of information, including financial indicators, sentiment analysis, news documents, and relational structures. Predominant analytical approaches, however, tend to address only unimodal or bimodal sources, neglecting the complexity of multimodal data. Further complicating the landscape are the issues of data sparsity and semantic conflicts between these modalities, which are frequently overlooked by current models, leading to unstable performance and limiting practical applicability. To address these shortcomings, this study introduces a novel architecture, named Multimodal Stable Fusion with Gated Cross-Attention (MSGCA), designed to robustly integrate multimodal input for stock movement prediction. The MSGCA framework consists of three integral components: (1) a trimodal encoding module, responsible for processing indicator sequences, dynamic documents, and a relational graph, and standardizing their feature representations; (2) a cross-feature fusion module, where primary and consistent features guide the multimodal fusion of the three modalities via a pair of gated cross-attention networks; and (3) a prediction module, which refines the fused features through temporal and dimensional reduction to execute precise movement forecasting. Empirical evaluations demonstrate that the MSGCA framework exceeds current leading methods, achieving performance gains of 8.1%, 6.1%, 21.7% and 31.6% on four multimodal datasets, respectively, attributed to its enhanced multimodal fusion stability.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.06594&r=
  34. By: Haavio, Markus; Heikkinen, Joni; Jalasjoki, Pirkka; Kilponen, Juha; Paloviita, Maritta; Vänni, Ilona
    Abstract: We depart from the common reaction function-based approach used to infer central bank preferences. Instead, we extract the tone from the textual information in the central bank communication using both a lexicon-based approach and a language model. We combine the tone with real-time information available to the monetary policy decision-maker and directly estimate the loss function. We find strong and robust evidence of asymmetry in the case of the European Central Bank during 1999-2021: the slope of the loss function was roughly three times steeper when inflation exceeded the target compared to when it was below the target. This represents a significant departure from the quadratic and symmetric monetary policy loss function typically applied in macro models.
    Keywords: central bank communication, textual analysis, language models, asymmetric loss function, optimal monetary policy
    JEL: E31 E52 E58
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:zbw:bofrdp:298852&r=
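The estimated asymmetry can be written as a stylized loss function; the sketch below hard-codes a slope three times steeper above target, echoing the paper's headline finding (functional form and parameter values are assumptions):

```python
# Asymmetric quadratic loss: misses above target are penalized more steeply.
import numpy as np

def loss(pi, target=2.0, slope_above=3.0, slope_below=1.0):
    gap = pi - target
    return np.where(gap > 0, slope_above, slope_below) * gap ** 2

print(loss(np.array([1.0, 2.0, 3.0])))  # equal-sized misses, unequal penalties
```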

This nep-big issue is ©2024 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.