on Computational Economics
By: | Askitas, Nikos (IZA) |
Abstract: | This paper addresses the steep learning curve in Machine Learning faced by non-computer scientists, particularly social scientists, which stems from the absence of a primer on its fundamental principles. I adopt a pedagogical strategy inspired by the adage "once you understand OLS, you can work your way up to any other estimator," and apply it to Machine Learning. Focusing on a single-hidden-layer artificial neural network, the paper discusses its mathematical underpinnings, including the pivotal Universal Approximation Theorem, an essential "existence theorem". The exposition extends to the algorithmic search for solutions, specifically through "feed forward" and "back-propagation", and rounds off with a practical implementation in Python. The objective of this primer is to equip readers with a solid elementary comprehension of first principles and to spur some trailblazers toward the forefront of AI and causal machine learning. |
Keywords: | machine learning, deep learning, supervised learning, artificial neural network, perceptron, Python, keras, tensorflow, universal approximation theorem |
JEL: | C01 C87 C00 C60 |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:iza:izadps:dp17014 |
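Since the abstract names Python and keras explicitly, a minimal sketch of the single-hidden-layer network it describes may be useful; the synthetic data, layer width, and training settings below are illustrative assumptions, not the paper's own code.

```python
# Minimal single-hidden-layer network in Keras (illustrative sketch only).
import numpy as np
from tensorflow import keras

# Synthetic regression data: y = sin(x) + noise (assumption for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=1000)

# One hidden layer: the setting covered by the Universal Approximation Theorem.
model = keras.Sequential([
    keras.layers.Dense(32, activation="tanh", input_shape=(1,)),  # hidden layer
    keras.layers.Dense(1),                                        # linear output
])
model.compile(optimizer="adam", loss="mse")

# Back-propagation happens inside fit(); predict() is the feed-forward pass.
model.fit(X, y, epochs=50, batch_size=32, verbose=0)
print("In-sample MSE:", float(model.evaluate(X, y, verbose=0)))
```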
By: | Alexander Bakumenko (Clemson University, USA); Kateřina Hlaváčková-Schindler (University of Vienna, Austria); Claudia Plant (University of Vienna, Austria); Nina C. Hubig (Clemson University, USA)
Abstract: | Detecting anomalies in general ledger data is of utmost importance to ensure trustworthiness of financial records. Financial audits increasingly rely on machine learning (ML) algorithms to identify irregular or potentially fraudulent journal entries, each characterized by a varying number of transactions. In machine learning, heterogeneity in feature dimensions adds significant complexity to data analysis. In this paper, we introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings. To encode non-semantic categorical data from real-world financial records, we tested 3 pre-trained general-purpose sentence-transformer models. For the downstream classification task, we implemented and evaluated 5 optimized ML models including Logistic Regression, Random Forest, Gradient Boosting Machines, Support Vector Machines, and Neural Networks. Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines, in selected settings even by a large margin. The findings further underscore the effectiveness of LLMs in enhancing anomaly detection in financial journal entries, particularly by tackling feature sparsity. We discuss a promising perspective on using LLM embeddings for non-semantic data in the financial context and beyond. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.03614 |
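The pipeline sketched in the abstract above (encode journal-entry attributes with a pre-trained sentence transformer, then feed the embeddings to a standard classifier) can be illustrated roughly as follows; the model name, feature serialization, and classifier choice are assumptions made for the example.

```python
# Sketch: sentence-transformer embeddings as features for anomaly classification.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Toy journal entries serialized as text (illustrative; real records are richer).
entries = ["account=4000 cost_center=A12 doc_type=KR amount_bucket=high",
           "account=1200 cost_center=B07 doc_type=SA amount_bucket=low",
           "account=9999 cost_center=Z99 doc_type=KR amount_bucket=high",
           "account=1200 cost_center=B07 doc_type=SA amount_bucket=low"]
labels = [0, 0, 1, 0]  # 1 = anomalous entry

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed pre-trained model
X = encoder.encode(entries)                         # fixed-length embeddings

# One of several possible downstream classifiers (logistic regression here).
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict_proba(X)[:, 1])                   # anomaly scores per entry
```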
By: | Bivas Dinda |
Abstract: | The recent advancement of deep learning architectures and neural networks, combined with abundant financial data and powerful computers, is transforming finance, leading us to develop an advanced method for predicting future stock prices. However, the accessibility of investment and trading at everyone's fingertips has made stock markets increasingly intricate and prone to volatility. The increased complexity and volatility of the stock market have driven demand for models that can effectively capture the high volatility and non-linear behavior of different stock prices. This study explored gated recurrent neural network (GRNN) algorithms such as LSTM (long short-term memory), GRU (gated recurrent unit), and hybrid models like GRU-LSTM and LSTM-GRU, with Tree-structured Parzen Estimator (TPE) Bayesian optimization for hyperparameter optimization (TPE-GRNN). The aim is to improve the prediction accuracy of the next day's closing price of the NIFTY 50 index, a prominent Indian stock market index, using TPE-GRNN. A combination of eight influential factors is carefully chosen from fundamental stock data, technical indicators, crude oil prices, and macroeconomic data to train the models to capture changes in the index price along with the broader economy. Single-layer and multi-layer TPE-GRNN models have been developed. The models' performance is evaluated using standard metrics such as R2, MAPE, and RMSE. The analysis of the models' performance reveals the impact of feature selection and hyperparameter optimization (HPO) in enhancing stock index price prediction accuracy. The results show that the MAPE of our proposed TPE-LSTM method is the lowest (best) with respect to all previous models for stock index price prediction. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.02604 |
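A hedged sketch of the TPE-based hyperparameter search over a recurrent network, using hyperopt and Keras, is given below; the search space, network shape, and synthetic series are illustrative assumptions rather than the paper's configuration.

```python
# Sketch: TPE (hyperopt) search over LSTM hyperparameters for one-step forecasting.
import numpy as np
from hyperopt import fmin, tpe, hp, Trials
from tensorflow import keras

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=500))               # toy price-like series
window = 20
X = np.array([series[i:i + window] for i in range(len(series) - window - 1)])
y = series[window + 1:]
X = X[..., None]                                        # (samples, timesteps, 1)

def objective(params):
    model = keras.Sequential([
        keras.layers.LSTM(int(params["units"]), input_shape=(window, 1)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adam(params["lr"]), loss="mse")
    hist = model.fit(X[:-50], y[:-50], validation_data=(X[-50:], y[-50:]),
                     epochs=5, batch_size=32, verbose=0)
    return hist.history["val_loss"][-1]                 # TPE minimizes validation loss

space = {"units": hp.choice("units", [16, 32, 64]),
         "lr": hp.loguniform("lr", np.log(1e-4), np.log(1e-2))}
best = fmin(objective, space, algo=tpe.suggest, max_evals=10, trials=Trials())
print(best)
```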
By: | Mark E. Schaffer (Heriot-Watt University) |
Date: | 2023–11–09 |
URL: | https://d.repec.org/n?u=RePEc:boc:econ23:04 |
By: | Tohid Atashbar |
Abstract: | Learning from the past is critical for shaping the future, especially when it comes to economic policymaking. Building upon current methods in the application of Reinforcement Learning (RL) to large language models (LLMs), this paper introduces Reinforcement Learning from Experience Feedback (RLXF), a procedure that tunes LLMs based on lessons from past experiences. RLXF integrates historical experiences into LLM training in two key ways: by training reward models on historical data, and by using that knowledge to fine-tune the LLMs. As a case study, we applied RLXF to tune an LLM using the IMF's MONA database to generate historically-grounded policy suggestions. The results demonstrate RLXF's potential to equip generative AI with a nuanced perspective informed by previous experiences. Overall, it seems RLXF could enable more informed applications of LLMs for economic policy, but this approach is not without the potential risks and limitations of relying heavily on historical data, as it may perpetuate biases and outdated assumptions. |
Keywords: | LLMs; GAI; RLHF; RLAIF; RLXF |
Date: | 2024–06–07 |
URL: | https://d.repec.org/n?u=RePEc:imf:imfwpa:2024/114 |
By: | Milen Arro-Cannarsa; Dr. Rolf Scheufele |
Abstract: | We compare several machine learning methods for nowcasting GDP. A large mixed-frequency data set is used to investigate different algorithms such as regression-based methods (LASSO, ridge, elastic net), regression trees (bagging, random forest, gradient boosting), and SVR. As benchmarks, we use univariate models, a simple forward selection algorithm, and a principal components regression. The analysis accounts for publication lags and treats monthly indicators as quarterly variables combined via blocking. Our data set consists of more than 1,100 time series. For the period after the Great Recession, which is particularly challenging in terms of nowcasting, we find that all considered machine learning techniques beat the univariate benchmark by up to 28% in terms of out-of-sample RMSE. Ridge, elastic net, and SVR are the most promising algorithms in our analysis, significantly outperforming principal components regression. |
Keywords: | Nowcasting, Forecasting, Machine learning, Ridge, LASSO, Elastic net, Random forest, Bagging, Boosting, SVM, SVR, Large data sets |
JEL: | C53 C55 C32 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:snb:snbwpa:2024-06 |
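A stripped-down version of the horse race described above, with penalized regressions, tree ensembles, and SVR compared on out-of-sample RMSE, might look as follows; the synthetic data and hyperparameters are illustrative, and the blocking of monthly indicators into quarterly variables is omitted.

```python
# Sketch: comparing shrinkage, tree-ensemble, and SVR nowcasts by out-of-sample RMSE.
import numpy as np
from sklearn.linear_model import Lasso, Ridge, ElasticNet
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, BaggingRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                                # toy stand-in for many indicators
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=200)    # toy "GDP growth"
X_tr, X_te, y_tr, y_te = X[:160], X[160:], y[:160], y[160:]   # pseudo out-of-sample split

models = {"lasso": Lasso(alpha=0.1), "ridge": Ridge(alpha=1.0),
          "elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),
          "bagging": BaggingRegressor(n_estimators=200, random_state=0),
          "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
          "boosting": GradientBoostingRegressor(random_state=0),
          "svr": SVR(C=1.0)}
for name, m in models.items():
    rmse = mean_squared_error(y_te, m.fit(X_tr, y_tr).predict(X_te)) ** 0.5
    print(f"{name:14s} RMSE: {rmse:.3f}")
```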
By: | Francesco Audrino; Jonathan Chassot |
Abstract: | We investigate the predictive abilities of the heterogeneous autoregressive (HAR) model compared to machine learning (ML) techniques across an unprecedented dataset of 1,455 stocks. Our analysis focuses on the role of fitting schemes, particularly the training window and re-estimation frequency, in determining the HAR model's performance. Despite extensive hyperparameter tuning, ML models fail to surpass the linear benchmark set by HAR when utilizing a refined fitting approach for the latter. Moreover, the simplicity of HAR allows for an interpretable model with drastically lower computational costs. We assess performance using QLIKE, MSE, and realized utility metrics, finding that HAR consistently outperforms its ML counterparts when both rely solely on realized volatility and VIX as predictors. Our results underscore the importance of a correctly specified fitting scheme. They suggest that properly fitted HAR models provide superior forecasting accuracy, establishing robust guidelines for their practical application and use as a benchmark. This study not only reaffirms the efficacy of the HAR model but also provides a critical perspective on the practical limitations of ML approaches in realized volatility forecasting. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.08041 |
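For reference, the HAR benchmark regresses next-day realized volatility on daily, weekly, and monthly averages of past realized volatility; a minimal OLS sketch, with simulated RV in place of the paper's data, follows.

```python
# Sketch: the HAR model fitted by OLS on daily/weekly/monthly realized-volatility averages.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
rv = pd.Series(np.abs(rng.normal(size=1000))).rolling(5).mean().dropna()  # toy RV series

df = pd.DataFrame({"rv_d": rv,
                   "rv_w": rv.rolling(5).mean(),      # weekly component
                   "rv_m": rv.rolling(22).mean()})    # monthly component
df["target"] = rv.shift(-1)                           # next-day RV
df = df.dropna()

X = sm.add_constant(df[["rv_d", "rv_w", "rv_m"]])
har = sm.OLS(df["target"], X).fit()
print(har.params)                                     # beta_0, beta_d, beta_w, beta_m
```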
By: | Buxmann, Peter; Hess, Thomas; Thatcher, Jason Bennett |
Abstract: | Artificial intelligence (AI) is about to bring fundamental changes in our society and economy, touching on how organizations make decisions, deliver services, and evaluate opportunities. Given the breadth of their potential reach across companies of different sizes and in different industries, Erik Brynjolfsson and Andrew McAfee of MIT even speak of AI as “the most important general-purpose technology of our era” (Brynjolfsson and McAfee 2017, p. 2). Today, AI applications are in most cases based upon machine learning algorithms, with supervised learning, in particular, having become established in practice. |
Date: | 2024–06–18 |
URL: | https://d.repec.org/n?u=RePEc:dar:wpaper:146094 |
By: | Sugarbayar Enkhbayar (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group); Robert Ślepaczuk (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group, Department of Quantitative Finance and Machine Learning) |
Abstract: | This study aimed to apply algorithmic trading strategies to major foreign exchange pairs and compare the performance of machine learning-based strategies and traditional trend-following strategies with benchmark strategies. It differs from other studies in that it considered a wide variety of cases including different foreign exchange pairs, return methods, data frequencies, and individual and integrated trading strategies. Ridge regression, KNN, RF, XGBoost, GBDT, ANN, LSTM, and GRU models were used for the machine learning-based strategy, while the MA cross strategy was employed for the trend-following strategy. Backtests were performed on 6 major pairs over the period from January 1, 2000, to June 30, 2023, using daily and intraday data. The Sharpe ratio was used as the measure of economic significance, and the independent t-test was used to determine statistical significance. The general findings of the study suggest that the currency market has become more efficient. The rise in efficiency is probably caused by the fact that more algorithms are being used in this market, and information spreads much faster. Rather than finding one trading strategy that works well on all major foreign exchange pairs, our study showed it is possible to find an effective algorithmic trading strategy that generates a more effective trading signal in each specific case. |
Keywords: | machine learning, algorithmic trading, foreign exchange market, rolling walk-forward optimization, technical indicators |
JEL: | C4 C14 C45 C53 C58 G13 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:war:wpaper:2024-10 |
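The trend-following benchmark in the study above, a moving-average cross, reduces to a few lines of pandas; the window lengths and toy price series below are illustrative assumptions.

```python
# Sketch: moving-average cross signal (long when the fast MA is above the slow MA, short otherwise).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=1000))))  # toy FX price path

fast = price.rolling(20).mean()
slow = price.rolling(50).mean()
signal = np.where(fast > slow, 1, -1)                         # +1 long, -1 short

returns = price.pct_change().shift(-1)                        # next-period return
strategy = pd.Series(signal, index=price.index) * returns
sharpe = strategy.mean() / strategy.std() * np.sqrt(252)      # annualized Sharpe ratio
print(f"Sharpe: {sharpe:.2f}")
```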
By: | Kumar, Pradeep (University of Exeter); Nicodemo, Catia (University of Oxford); Oreffice, Sonia (University of Exeter); Quintana-Domeque, Climent (University of Exeter) |
Abstract: | This study employs six Machine Learning methods - Logit, Lasso-Logit, Ridge-Logit, Random Forest, Extreme Gradient Boosting, and an Ensemble - alongside registry data on abortions in Spain from 2011-2019 to predict multiple abortions and assess monetary savings through targeted interventions. We find that Random Forest and an Ensemble method are most effective in the highest risk decile, capturing about 55% of cases, whereas linear models and Extreme Gradient Boosting excel in mid to lower deciles. We also show that targeting the top 20% most at-risk could yield cost savings of 5.44 to 8.2 million EUR, which could be reallocated to prevent unintended pregnancies arising from contraceptive failure, abusive relationships, and sexual assault, among other factors. |
Keywords: | Extreme Gradient Boosting, Ridge, random forest, multiple abortions, Logit, Lasso, Ensemble, reproductive healthcare |
JEL: | I12 I18 C53 J13 C55 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:iza:izadps:dp17046 |
By: | Adam Korniejczuk (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group); Robert Ślepaczuk (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group, Department of Quantitative Finance and Machine Learning) |
Abstract: | The study seeks to develop an effective strategy within a novel framework of statistical arbitrage based on graph clustering algorithms. An amalgamation of quantitative and machine learning methods, including the Kelly criterion and an ensemble of machine learning classifiers, has been used to improve risk-adjusted returns and increase immunity to transaction costs over existing approaches. The study seeks to provide an integrated approach to optimal signal detection and risk management. As part of this approach, innovative ways of optimizing take profit and stop loss functions for daily frequency trading strategies have been proposed and tested. All of the tested approaches outperformed appropriate benchmarks. The best combinations of the techniques and parameters demonstrated significantly better performance metrics than the relevant benchmarks. The results have been obtained under the assumption of realistic transaction costs, but are sensitive to changes in some key parameters. |
Keywords: | graph clustering algorithms, statistical arbitrage, algorithmic investment strategies, pair trading strategy, Kelly criterion, machine learning, risk adjusted returns |
JEL: | C4 C45 C55 C65 G11 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:war:wpaper:2024-09 |
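One ingredient named above, the Kelly criterion, sizes a position from a strategy's estimated edge; a small sketch of the discrete-outcome formula follows, with made-up win probability and payoff ratio.

```python
# Sketch: Kelly fraction for a bet with win probability p and win/loss payoff ratio b.
def kelly_fraction(p: float, b: float) -> float:
    """f* = p - (1 - p) / b, the stake share maximizing long-run log wealth."""
    return p - (1.0 - p) / b

# Illustrative numbers: 55% hit rate, winners 1.2x the size of losers.
f_star = kelly_fraction(p=0.55, b=1.2)
print(f"Kelly fraction: {f_star:.3f}")        # ~0.175 of capital per trade
# In practice a fractional Kelly (e.g. half-Kelly) is often used to dampen drawdowns.
```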
By: | Seulki Chung |
Abstract: | This paper presents a comparative analysis of univariate and multivariate GARCH-family models and machine learning algorithms in modeling and forecasting the volatility of major energy commodities: crude oil, gasoline, heating oil, and natural gas. It uses a comprehensive dataset incorporating financial, macroeconomic, and environmental variables to assess predictive performance and discusses volatility persistence and transmission across these commodities. Aspects of volatility persistence and transmission, traditionally examined by GARCH-class models, are jointly explored using the SHAP (Shapley Additive exPlanations) method. The findings reveal that machine learning models demonstrate superior out-of-sample forecasting performance compared to traditional GARCH models. Machine learning models tend to underpredict, while GARCH models tend to overpredict energy market volatility, suggesting a hybrid use of both types of models. There is volatility transmission from crude oil to the gasoline and heating oil markets. The volatility transmission in the natural gas market is less prevalent. |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.19849 |
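A compressed sketch of the two model families compared above, a GARCH(1,1) forecast from the arch package and a tree-based forecast explained with SHAP, is given below; the simulated returns and feature set are assumptions made for illustration.

```python
# Sketch: GARCH(1,1) volatility forecast vs. a tree model explained with SHAP.
import numpy as np
import pandas as pd
import shap
from arch import arch_model
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(scale=1.0, size=1000))         # toy daily returns (%)

# GARCH(1,1): one-step-ahead conditional variance forecast.
garch = arch_model(returns, vol="GARCH", p=1, q=1).fit(disp="off")
print("GARCH forecast variance:", garch.forecast(horizon=1).variance.iloc[-1, 0])

# ML alternative: predict next-day squared return from lagged features, explained with SHAP.
X = pd.DataFrame({"lag_sq": returns.shift(1) ** 2, "lag_abs": returns.shift(1).abs()}).dropna()
y = (returns ** 2).loc[X.index]
model = GradientBoostingRegressor(random_state=0).fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)
print("Mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0))
```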
By: | Ayush Singh; Anshu K. Jha; Amit N. Kumar |
Abstract: | In this paper, our focus lies on Merton's jump diffusion model, employing jump processes characterized by the compound Poisson process. Our primary objective is to forecast the drift and volatility of the model using a variety of methodologies. We adopt an approach that involves implementing different drift, volatility, and jump terms within the model through various machine learning techniques, traditional methods, and statistical methods on price-volume data. Additionally, we introduce a path-dependent Monte Carlo simulation to model cryptocurrency prices, taking into account the volatility and unexpected jumps in prices. |
Date: | 2024–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.12988 |
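A path-dependent Monte Carlo under Merton's jump diffusion (compound Poisson jumps on top of geometric Brownian motion) can be sketched as below; all parameter values are illustrative and are not estimates from the paper.

```python
# Sketch: Monte Carlo paths for Merton's jump diffusion with compound Poisson jumps.
import numpy as np

rng = np.random.default_rng(0)
S0, mu, sigma = 100.0, 0.05, 0.4          # spot, drift, diffusion vol (illustrative)
lam, m, delta = 1.0, -0.1, 0.15           # jump intensity, mean log-jump, jump vol
T, steps, n_paths = 1.0, 252, 10_000
dt = T / steps
k = np.exp(m + 0.5 * delta**2) - 1        # expected relative jump size

log_S = np.full(n_paths, np.log(S0))
for _ in range(steps):
    z = rng.normal(size=n_paths)                          # diffusion shocks
    n_jumps = rng.poisson(lam * dt, size=n_paths)         # number of jumps in dt
    # Aggregate jump size (exact for 0 or 1 jump per step, typical when lam*dt is small).
    jumps = rng.normal(m, delta, size=n_paths) * n_jumps
    log_S += (mu - lam * k - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z + jumps

S_T = np.exp(log_S)
print("Mean terminal price:", S_T.mean())
```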
By: | Kase, Hanno (European Central Bank); Melosi, Leonardo (University of Warwick, FRB Chicago, DNB, & CEPR); Rottner, Matthias (Deutsche Bundesbank) |
Abstract: | We leverage recent advancements in machine learning to develop an integrated method to solve globally and estimate models featuring agent heterogeneity, nonlinear constraints, and aggregate uncertainty. Using simulated data, we show that the proposed method accurately estimates the parameters of a nonlinear Heterogeneous Agent New Keynesian (HANK) model with a zero lower bound (ZLB) constraint. We further apply our method to estimate this HANK model using U.S. data. In the estimated model, the interaction between the ZLB constraint and idiosyncratic income risks emerges as a key source of aggregate output volatility. |
Keywords: | Neural networks, likelihood, global solution, heterogeneous agents, nonlinearity, aggregate uncertainty, HANK, zero lower bound
JEL: | C11 C45 D31 E32 E52
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:wrk:warwec:1499 |
By: | Junquera, Álvaro F. (Universitat Autònoma de Barcelona); Kern, Christoph |
Abstract: | Public employment services (PES) commonly apply profiling systems to target support programs to jobseekers at risk of becoming long-term unemployed. Such systems often codify institutional experiences in a set of decision rules, whose predictive ability, however, is seldom tested. We systematically evaluate the predictive performance of a rule-based system currently implemented by the PES of Catalonia, Spain, in comparison to the performance of statistical models in predicting future long-term unemployment episodes. Using comprehensive administrative data, we develop linear and machine learning models and evaluate their performance with respect to both discrimination and calibration. Compared to the current rule-based system of Catalonia, our machine learning models achieve greater discrimination ability and remarkable improvements in calibration. Particularly, our random forest model is able to accurately forecast episodes and outperforms the rule-based model by offering robust quantitative predictions that perform well under stress tests. This paper presents the first performance comparison between a complex, currently implemented, rule-based approach and complex statistical profiling models. Our work illustrates the importance of assessing the calibration of profiling models and the potential of statistical tools to assist public employment offices in Spain. |
Date: | 2024–06–14 |
URL: | https://d.repec.org/n?u=RePEc:osf:socarx:c7ps3 |
By: | Sven Goluža; Tomislav Kovačević; Tessa Bauman; Zvonko Kostanjčar
Abstract: | Deep reinforcement learning (DRL) is a well-suited approach to financial decision-making, where an agent makes decisions based on its trading strategy developed from market observations. Existing DRL intraday trading strategies mainly use price-based features to construct the state space. They neglect the contextual information related to the position of the strategy, which is an important aspect given the sequential nature of intraday trading. In this study, we propose a novel DRL model for intraday trading that introduces positional features encapsulating the contextual information into its sparse state space. The model is evaluated over an extended period of almost a decade and across various assets including commodities and foreign exchange securities, taking transaction costs into account. The results show a notable performance in terms of profitability and risk-adjusted metrics. The feature importance results show that each feature incorporating contextual information contributes to the overall performance of the model. Additionally, through an exploration of the agent's intraday trading activity, we unveil patterns that substantiate the effectiveness of our proposed model. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.08013 |
By: | Simon D Angus (SoDa Laboratories & Dept. of Economics, Monash Business School); Lachlan O'Neill (SoDa Laboratories, Monash Business School) |
Abstract: | Detecting and quantifying issue framing in textual discourse - the slant or perspective one takes to a given topic (e.g. climate science vs. denialism, misogyny vs. gender equality) - is highly valuable to a range of end-users from social and political scientists to program evaluators and policy analysts. Being able to identify statistically significant shifts, reversals, or changes in issue framing in public discourse would enable the quantitative evaluation of interventions, actors and events that shape discourse. However, issue framing is notoriously challenging for automated natural language processing (NLP) methods since the words and phrases used by either 'side' of an issue are often held in common, with only subtle stylistic flourishes separating their use. Here we develop and rigorously evaluate new detection methods for issue framing and narrative analysis within large text datasets. By introducing a novel application of next-token log probabilities derived from generative large language models (LLMs) we show that issue framing can be reliably and efficiently detected in large corpora with only a few examples of either perspective on a given issue, a method we call 'paired completion'. Through 192 independent experiments over three novel, synthetic datasets, we evaluate paired completion against prompt-based LLM methods and labelled methods using traditional NLP and recent LLM contextual embeddings. We additionally conduct a cost-based analysis to mark out the feasible set of performant methods at production-level scales, and a model bias analysis. Together, our work demonstrates a feasible path to scalable, accurate and low-bias issue-framing in large corpora. |
Keywords: | slant detection, text-as-data, synthetic data, computational linguistics |
JEL: | C19 C55 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:ajr:sodwps:2024-02 |
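The primitive behind 'paired completion', the log probability a generative LM assigns to a continuation under two different framing primers, can be sketched with Hugging Face transformers as below; the model, primers, and scoring rule are illustrative assumptions, not the authors' exact setup.

```python
# Sketch: score a sentence's log-likelihood under two framing primers with a causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # small stand-in model
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of log probs of the completion tokens conditioned on the prompt."""
    ids = tok(prompt + completion, return_tensors="pt").input_ids
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = lm(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)      # predicts tokens 1..n-1
    targets = ids[0, 1:]
    token_lp = logprobs[torch.arange(len(targets)), targets]
    return token_lp[prompt_len - 1:].sum().item()             # completion tokens only

primer_a = "Climate change is an urgent, human-caused crisis. "
primer_b = "Claims about man-made climate change are exaggerated. "
sentence = "The new report overstates the risks of warming."
# The framing whose primer makes the sentence more likely is the inferred slant.
print(completion_logprob(primer_a, sentence), completion_logprob(primer_b, sentence))
```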
By: | Edward Sharkey; Philip Treleaven |
Abstract: | The paper benchmarks several Transformer models [4] to show how these models can judge sentiment from a news event. This signal can then be used for downstream modelling and signal identification for commodity trading. We find that fine-tuned BERT models outperform fine-tuned or vanilla GPT models on this task. Transformer models have revolutionized the field of natural language processing (NLP) in recent years, achieving state-of-the-art results on various tasks such as machine translation, text summarization, question answering, and natural language generation. Among the most prominent transformer models are Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT), which differ in their architectures and objectives. An overview of the CopBERT model's training data and process is provided. The CopBERT model outperforms similar domain-specific BERT-trained models such as FinBERT. Confusion matrices show the performance of CopBERT and CopGPT respectively. We see an increase of roughly 10 percent in f1_score when comparing CopBERT with GPT4, and a 16 percent increase versus CopGPT. Whilst GPT4 is dominant, this highlights the importance of considering alternatives to GPT models for financial engineering tasks, given the risks of hallucinations and challenges with interpretability. Unsurprisingly, we see the larger LLMs outperform the BERT models in predictive power. In summary, BERT is in part the new XGBoost: what it lacks in predictive power it makes up for with higher levels of interpretability. We conclude that BERT models might not be the next XGBoost [2], but they represent an interesting alternative for financial engineering tasks that require a blend of interpretability and accuracy. |
Date: | 2024–04 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.12990 |
By: | Bernhard Kasberger; Simon Martin; Hans-Theo Normann; Tobias Werner |
Abstract: | Algorithms play an increasingly important role in economic situations. These situations are often strategic, where the artificial intelligence may or may not be cooperative. We study the determinants and forms of algorithmic cooperation in the infinitely repeated prisoner's dilemma. We run a sequence of computational experiments, accompanied by additional repeated prisoner's dilemma games played by humans in the lab. We find that the same factors that increase human cooperation largely also determine the cooperation rates of algorithms. However, algorithms tend to play different strategies than humans. Algorithms cooperate less than humans when cooperation is very risky or not incentive-compatible. |
Keywords: | artificial intelligence, cooperation, large language models, Q-learning, repeated prisoner’s dilemma |
JEL: | C72 C73 C92 D83 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:ces:ceswps:_11124 |
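For concreteness, Q-learning in a repeated prisoner's dilemma can be sketched as a tabular agent whose state is the previous round's action pair; the payoff matrix, tit-for-tat opponent, and learning parameters below are illustrative assumptions.

```python
# Sketch: tabular Q-learning agent playing an iterated prisoner's dilemma vs. tit-for-tat.
import numpy as np

C, D = 0, 1
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}   # row player's payoff
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

Q = np.zeros((4, 2))                      # state = 2*my_last + their_last, action in {C, D}
my_last, their_last = C, C

for t in range(50_000):
    state = 2 * my_last + their_last
    action = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[state]))
    their_action = my_last                # tit-for-tat: opponent copies my previous move
    reward = PAYOFF[(action, their_action)]
    next_state = 2 * action + their_action
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    my_last, their_last = action, their_action

print(np.argmax(Q, axis=1))               # learned action per state (0 = cooperate, 1 = defect)
```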
By: | Fabrice Murtin |
Abstract: | This paper applies machine learning techniques to Google Trends data to provide real-time estimates of national average subjective well-being among 38 OECD countries since 2010. We make extensive use of large custom micro databases to enhance the training of models on carefully pre-processed Google Trends data. We find that the best one-year-ahead prediction is obtained from a meta-learner that combines the predictions drawn from an Elastic Net with and without interactions, from a Gradient-Boosted Tree and from a Multi-layer Perceptron. As a result, across 38 countries over the 2010-2020 period, the out-of-sample prediction of average subjective well-being reaches an R2 of 0.830. |
Keywords: | poverty, spatial inequality, well-being |
JEL: | C1 C45 C53 D60 I31 |
Date: | 2024–06–28 |
URL: | https://d.repec.org/n?u=RePEc:oec:wiseaa:27-en |
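The meta-learner described above, which combines an elastic net, a gradient-boosted tree, and a multi-layer perceptron, corresponds to a standard stacking setup; a sketch with scikit-learn follows, with synthetic data and hyperparameters as illustrative assumptions.

```python
# Sketch: stacking an elastic net, gradient boosting, and an MLP under a meta-learner.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import ElasticNet, Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))                                # toy stand-in for Google Trends features
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=500)    # toy "life satisfaction" target

stack = StackingRegressor(
    estimators=[("enet", ElasticNet(alpha=0.1)),
                ("gbt", GradientBoostingRegressor(random_state=0)),
                ("mlp", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0))],
    final_estimator=Ridge(alpha=1.0),                         # the meta-learner
)
print("CV R^2:", cross_val_score(stack, X, y, cv=5, scoring="r2").mean())
```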
By: | Emily Silcock; Abhishek Arora; Luca D'Amico-Wong; Melissa Dell |
Abstract: | In the U.S. historically, local newspapers drew their content largely from newswires like the Associated Press. Historians argue that newswires played a pivotal role in creating a national identity and shared understanding of the world, but there is no comprehensive archive of the content sent over newswires. We reconstruct such an archive by applying a customized deep learning pipeline to hundreds of terabytes of raw image scans from thousands of local newspapers. The resulting dataset contains 2.7 million unique public domain U.S. newswire articles, written between 1878 and 1977. Locations in these articles are georeferenced, topics are tagged using customized neural topic classification, named entities are recognized, and individuals are disambiguated to Wikipedia using a novel entity disambiguation model. To construct the Newswire dataset, we first recognize newspaper layouts and transcribe around 138 million structured article texts from raw image scans. We then use a customized neural bi-encoder model to de-duplicate reproduced articles, in the presence of considerable abridgement and noise, quantifying how widely each article was reproduced. A text classifier is used to ensure that we only include newswire articles, which historically are in the public domain. The structured data that accompany the texts provide rich information about the who (disambiguated individuals), what (topics), and where (georeferencing) of the news that millions of Americans read over the course of a century. We also include Library of Congress metadata information about the newspapers that ran the articles on their front pages. The Newswire dataset is useful both for large language modeling - expanding training data beyond what is available from modern web texts - and for studying a diversity of questions in computational linguistics, social science, and the digital humanities. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.09490 |
By: | Nicole Immorlica; Brendan Lucier; Aleksandrs Slivkins |
Abstract: | Traditionally, AI has been modeled within economics as a technology that impacts payoffs by reducing costs or refining information for human agents. Our position is that, in light of recent advances in generative AI, it is increasingly useful to model AI itself as an economic agent. In our framework, each user is augmented with an AI agent and can consult the AI prior to taking actions in a game. The AI agent and the user have potentially different information and preferences over the communication, which can result in equilibria that are qualitatively different than in settings without AI. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.00477 |
By: | Raeid Saqur; Anastasis Kratsios; Florian Krach; Yannick Limmer; Jacob-Junqi Tian; John Willes; Blanka Horvath; Frank Rudzicz |
Abstract: | We propose MoE-F -- a formalised mechanism for combining $N$ pre-trained expert Large Language Models (LLMs) in online time-series prediction tasks by adaptively forecasting the best weighting of LLM predictions at every time step. Our mechanism leverages the conditional information in each expert's running performance to forecast the best combination of LLMs for predicting the time series in its next step. Diverging from static (learned) Mixture of Experts (MoE) methods, MoE-F employs time-adaptive stochastic filtering techniques to combine experts. By framing the expert selection problem as a finite state-space, continuous-time Hidden Markov model (HMM), we can leverage the Wonham-Shiryaev filter. Our approach first constructs $N$ parallel filters corresponding to each of the $N$ individual LLMs. Each filter proposes its best combination of LLMs, given the information that they have access to. Subsequently, the $N$ filter outputs are aggregated to optimize a lower bound for the loss of the aggregated LLMs, which can be optimized in closed-form, thus generating our ensemble predictor. Our contributions here are: (I) the MoE-F algorithm -- deployable as a plug-and-play filtering harness, (II) theoretical optimality guarantees of the proposed filtering-based gating algorithm, and (III) empirical evaluation and ablative results using state-of-the-art foundational and MoE LLMs on a real-world Financial Market Movement task where MoE-F attains a remarkable 17% absolute and 48.5% relative F1 measure improvement over the next best performing individual LLM expert. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.02969 |
By: | Nicolás Forteza (Banco de España); Elvira Prades (Banco de España); Marc Roca (Banco de España) |
Abstract: | On 28 December 2022, the Spanish government announced a temporary Value Added Tax (VAT) rate reduction for selected products. VAT rates were cut on 1 January 2023 and are expected to go back to their previous level by mid-2024. Using a web-scraped dataset, we leverage machine learning techniques to classify each product. Then we study the price effects of the temporary VAT rate reduction, covering the daily prices of roughly 10,000 food products sold online by a Spanish supermarket. To identify the causal price effects, we compare the evolution of prices for treated items (that is, subject to the tax policy) against a control group (food items outside the policy's scope). Our findings indicate that, at the supermarket level, the pass-through was almost complete. We observe differences in the speed of pass-through across different product types. |
Keywords: | price rigidity, inflation, consumer prices, heterogeneity, microdata, VAT pass-through |
JEL: | E31 H22 H25 |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:bde:wpaper:2417 |
By: | Can Celebi (University of Mannheim); Stefan Penczynski (School of Economics and Centre for Behavioural and Experimental Social Science, University of East Anglia) |
Abstract: | In our study, we compare the classification capabilities of GPT-3.5 and GPT-4 with human annotators using text data from economic experiments. We analysed four text corpora, focusing on two domains: promises and strategic reasoning. Starting with prompts close to those given to human annotators, we subsequently explored alternative prompts to investigate the effect of varying classification instructions and degrees of background information on the models' classification performance. Additionally, we varied the number of examples in a prompt (few-shot vs zero-shot) and the use of the zero-shot "Chain of Thought" prompting technique. Our findings show that GPT-4's performance is comparable to human annotators, achieving accuracy levels near or over 90% in three tasks, and in the most challenging task of classifying strategic thinking in asymmetric coordination games, it reaches an accuracy level above 70%. |
Keywords: | Text Classification, GPT, Strategic Thinking, Promises |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:uea:wcbess:24-01 |
By: | Daniel Vebman |
Abstract: | This paper introduces a framework for measuring how much black-box decision-makers rely on variables of interest. The framework adapts a permutation-based measure of variable importance from the explainable machine learning literature. With an emphasis on applicability, I present some of the framework's theoretical and computational properties, explain how reliance computations have policy implications, and work through an illustrative example. In the empirical application to interruptions by Supreme Court Justices during oral argument, I find that the effect of gender is more muted compared to the existing literature's estimate; I then use this paper's framework to compare Justices' reliance on gender and alignment to their reliance on experience, which are incomparable using regression coefficients. |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.17225 |
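The reliance measure adapted in the paper builds on the standard permutation importance from explainable ML; a generic sketch of that baseline follows, where the fitted 'decision-maker' is just a toy classifier rather than the paper's application.

```python
# Sketch: permutation importance as a reliance measure for a black-box decision rule.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))          # columns as hypothetical variables: gender, alignment, experience, noise
y = (X[:, 2] + 0.3 * X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
blackbox = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)   # stand-in decision-maker

# Shuffle one variable at a time and record the drop in predictive performance.
result = permutation_importance(blackbox, X_te, y_te, n_repeats=20, random_state=0)
for name, imp in zip(["gender", "alignment", "experience", "noise"], result.importances_mean):
    print(f"{name:12s} reliance: {imp:.3f}")
```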
By: | Pablo Alvarez-Campana; Felix Villafanez; Fernando Acebes; David Poza |
Abstract: | This paper presents a simulation approach to enhance the performance of heuristics for multi-project scheduling. Unlike other heuristics available in the literature that use only one priority criterion for resource allocation, this paper proposes a structured way to sequentially apply more than one priority criterion for this purpose. By means of simulation, different feasible schedules are obtained, thereby increasing the probability of finding the schedule with the shortest duration. The performance of this simulation approach was validated with the MPSPLib library, one of the most prominent libraries for resource-constrained multi-project scheduling. These results highlight the proposed method as a useful option for addressing limited time and resources in portfolio management. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.02102 |
By: | Joel Ong; Dorien Herremans |
Abstract: | This paper introduces DeepUnifiedMom, a deep learning framework that enhances portfolio management through a multi-task learning approach and a multi-gate mixture of experts. The essence of DeepUnifiedMom lies in its ability to create unified momentum portfolios that incorporate the dynamics of time series momentum across a spectrum of time frames, a feature often missing in traditional momentum strategies. Our comprehensive backtesting, encompassing diverse asset classes such as equity indexes, fixed income, foreign exchange, and commodities, demonstrates that DeepUnifiedMom consistently outperforms benchmark models, even after factoring in transaction costs. This superior performance underscores DeepUnifiedMom's capability to capture the full spectrum of momentum opportunities within financial markets. The findings highlight DeepUnifiedMom as an effective tool for practitioners looking to exploit the entire range of momentum opportunities. It offers a compelling solution for improving risk-adjusted returns and is a valuable strategy for navigating the complexities of portfolio management. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.08742 |
By: | Guido Gazzani; Julien Guyon |
Abstract: | We consider the path-dependent volatility (PDV) model of Guyon and Lekeufack (2023), where the instantaneous volatility is a linear combination of a weighted sum of past returns and the square root of a weighted sum of past squared returns. We discuss the influence of an additional parameter that unlocks enough volatility on the upside to reproduce the implied volatility smiles of S&P 500 and VIX options. This PDV model, motivated by empirical studies, comes with computational challenges, especially in relation to VIX options pricing and calibration. We propose an accurate neural network approximation of the VIX which leverages the Markovianity of the 4-factor version of the model. The VIX is learned as a function of the Markovian factors and the model parameters. We use this approximation to tackle the joint calibration of S&P 500 and VIX options. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.02319 |
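The PDV specification above writes instantaneous volatility as an affine function of a weighted sum of past returns and the square root of a weighted sum of past squared returns; a sketch with exponentially decaying weights follows, with illustrative rather than calibrated parameter values.

```python
# Sketch: path-dependent volatility in the spirit of Guyon-Lekeufack, with exponential kernels.
import numpy as np
import pandas as pd

beta0, beta1, beta2 = 0.04, -0.15, 0.6        # illustrative coefficients (not calibrated)
lam1, lam2 = 50.0, 10.0                       # kernel decay speeds (per year)
dt = 1 / 252

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(scale=0.01, size=1000))   # toy daily returns

# R1: trend feature (weighted sum of past returns); R2: weighted sum of squared returns.
R1 = returns.ewm(alpha=lam1 * dt).mean()
R2 = (returns ** 2).ewm(alpha=lam2 * dt).mean() / dt     # annualized scale
sigma = beta0 + beta1 * R1 + beta2 * np.sqrt(R2)         # instantaneous volatility
print(sigma.tail())
```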
By: | Jingru Jia; Zehua Yuan; Junhao Pan; Paul McNamara; Deming Chen |
Abstract: | When making decisions under uncertainty, individuals often deviate from rational behavior, which can be evaluated across three dimensions: risk preference, probability weighting, and loss aversion. Given the widespread use of large language models (LLMs) in decision-making processes, it is crucial to assess whether their behavior aligns with human norms and ethical expectations or exhibits potential biases. Several empirical studies have investigated the rationality and social behavior performance of LLMs, yet their internal decision-making tendencies and capabilities remain inadequately understood. This paper proposes a framework, grounded in behavioral economics, to evaluate the decision-making behaviors of LLMs. Through a multiple-choice-list experiment, we estimate the degree of risk preference, probability weighting, and loss aversion in a context-free setting for three commercial LLMs: ChatGPT-4.0-Turbo, Claude-3-Opus, and Gemini-1.0-pro. Our results reveal that LLMs generally exhibit patterns similar to humans, such as risk aversion and loss aversion, with a tendency to overweight small probabilities. However, there are significant variations in the degree to which these behaviors are expressed across different LLMs. We also explore their behavior when embedded with socio-demographic features, uncovering significant disparities. For instance, when modeled with attributes of sexual minority groups or physical disabilities, Claude-3-Opus displays increased risk aversion, leading to more conservative choices. These findings underscore the need for careful consideration of the ethical implications and potential biases in deploying LLMs in decision-making scenarios. Therefore, this study advocates for developing standards and guidelines to ensure that LLMs operate within ethical boundaries while enhancing their utility in complex decision-making environments. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.05972 |
By: | Mengfei Chen; Mohamed Kharbeche; Mohamed Haouari; Weihong (Grace) Guo
Abstract: | How to ensure accessibility to food and nutrition while food supply chains suffer from demand and supply uncertainties caused by disruptive forces such as the COVID-19 pandemic and natural disasters is an emerging and critical issue. Unstable access to food affects nutrition levels, which weakens the health and well-being of citizens. Therefore, a food accessibility evaluation index is proposed in this work to quantify how well nutrition needs are met. The proposed index is then embedded in a stochastic multi-objective mixed-integer optimization problem to determine the optimal supply chain design to maximize food accessibility and minimize cost. Considering uncertainty in demand and supply, the multi-objective problem is solved in a two-phase simulation-optimization framework in which Green Field Analysis is applied to determine the long-term, tactical decisions such as supply chain configuration, and then Monte Carlo simulation is performed iteratively to determine the short-term supply chain operations by solving a stochastic programming problem. A case study is conducted on the beef supply chain in Qatar. Pareto efficient solutions are validated in discrete event simulation to evaluate the performance of the designed supply chain in various realistic scenarios and provide recommendations for different decision-makers. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.04439 |