nep-cmp New Economics Papers
on Computational Economics
Issue of 2024‒07‒29
37 papers chosen by
Stan Miles, Thompson Rivers University


  1. GraphCNNpred: A stock market indices prediction using a Graph based deep learning system By Yuhui Jin
  2. Machine Learning for Economic Forecasting: An Application to China's GDP Growth By Yanqing Yang; Xingcheng Xu; Jinfeng Ge; Yan Xu
  3. Artificial Intelligence and Algorithmic Price Collusion in Two-sided Markets By Cristian Chica; Yinglong Guo; Gilad Lerman
  4. Investigating Factors Influencing Dietary Quality in China: Machine Learning Approaches By Feng, Yuan; Liu, Shuang; Zhang, Man; Jin, Yanhong; Yu, Xiaohua
  5. Using Machine Learning Method to Estimate the Heterogeneous Impacts of the Updated Nutrition Facts Panel By Zhang, Yuxiang; Liu, Yizao; Sears, James M.
  6. $\text{Alpha}^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning By Feng Xu; Yan Yin; Xinyu Zhang; Tianyuan Liu; Shengyi Jiang; Zongzhang Zhang
  7. F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data By Zexing Xu; Linjun Zhang; Sitan Yang; Rasoul Etesami; Hanghang Tong; Huan Zhang; Jiawei Han
  8. What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs By Raeid Saqur
  9. Statistical arbitrage in multi-pair trading strategy based on graph clustering algorithms in US equities market By Adam Korniejczuk; Robert Ślepaczuk
  10. Developing an International Macroeconomic Forecasting Model Based on Big Data By Yoon, Sang-Ha
  11. Predicting the Validity and Reliability of Survey Questions By Felderer, Barbara; Repke, Lydia; Weber, Wiebke; Schweisthal, Jonas; Bothmann, Ludwig
  12. Evolution of Spatial Drivers for Oil Palm Expansion over Time: Insights from Spatiotemporal Data and Machine Learning Models By Zhao, Jing; Cochrane, Mark; Zhang, Xin; Elmore, Andrew; Lee, Janice; Su, Ye
  13. Improving Realized LGD Approximation: A Novel Framework with XGBoost for Handling Missing Cash-Flow Data By Zuzanna Kostecka; Robert Ślepaczuk
  14. LABOR-LLM: Language-Based Occupational Representations with Large Language Models By Tianyu Du; Ayush Kanodia; Herman Brunborg; Keyon Vafa; Susan Athey
  15. Unveiling the predictive factors influencing consumers purchase intention towards biofortified products: A PLS-SEM model with agent-based simulation By Tan, Fuli; Wang, Jingjing; De Steur, Hans; Fan, Shenggen
  16. Simple method for efficiently solving dynamic models with continuous actions using policy gradient By Takeshi Fukasawa
  17. Breastfeeding and Child Development Outcomes across Early Childhood and Adolescence: Doubly Robust Estimation with Machine Learning By Khudri, Md Mohsan; Hussey, Andrew
  18. The ESG Determinants of Mental Health Index Across Italian Regions: A Machine Learning Approach By Resta, Emanuela; Logroscino, Giancarlo; Tafuri, Silvio; Peter, Preethymol; Noviello, Chiara; Costantiello, Alberto; Leogrande, Angelo
  19. Skills, Innovation, and Growth: An Agent-Based Policy Analysis By Dawid, Herbert; Gemkow, Simon; Harting, Philipp; Kabus, Kordian; Neugart, Michael; Wersching, Klaus
  20. Stochastic Earned Value Analysis using Monte Carlo Simulation and Statistical Learning Techniques By Fernando Acebes; M Pereda; David Poza; Javier Pajares; Jose M Galan
  21. Effects of technological change and automation on industry structure and (wage-)inequality: insights from a dynamic task-based model By Dawid, Herbert; Neugart, Michael
  22. Learning control variables and instruments for causal analysis in observational data By Nicolas Apfel; Julia Hatamyar; Martin Huber; Jannis Kueck
  23. Impact of Sentiment analysis on Energy Sector Stock Prices: A FinBERT Approach By Sarra Ben Yahia; Jose Angel Garcia Sanchez; Rania Hentati Kaffel
  24. Artificial Intelligence Based Technologies and Economic Growth in a Creative Region By Batabyal, Amitrajeet; Kourtit, Karima; Nijkamp, Peter
  25. Fiscal transfers and regional economic growth By Dawid, H.; Harting, P.; Neugart, M.
  26. News Deja Vu: Connecting Past and Present with Semantic Search By Brevin Franklin; Emily Silcock; Abhishek Arora; Tom Bryan; Melissa Dell
  27. Trading Devil: Robust backdoor attack via Stochastic investment models and Bayesian approach By Orson Mengara
  28. Modelling Uncertain Volatility Using Quantum Stochastic Calculus: Unitary vs Non-Unitary Time Evolution By Will Hicks
  29. Contrastive Entity Coreference and Disambiguation for Historical Texts By Abhishek Arora; Emily Silcock; Leander Heldring; Melissa Dell
  30. Algorithmic Collusion And The Minimum Price Markov Game By Igor Sadoune; Marcelin Joanis; Andrea Lodi
  31. Tracking Trends in Topics of Agricultural and Applied Economics Discourse over the Last Century Using Natural Language Processing By Lee, Jacob W.; Elliott, Brendan; Lam, Aaron; Gupta, Neha; Wilson, Norbert L.W.; Collins, Leslie M.; Mainsah, Boyla
  32. Longitudinal market structure detection using a dynamic modularity-spectral algorithm By Philipp Wirth; Francesca Medda; Thomas Schröder
  33. Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data By Alexander Quispe; Rodrigo Grijalba
  34. Optimal policy learning using Stata By Giovanni Cerulli
  35. Reducing False Discoveries in Statistically-Significant Regional-Colocation Mining: A Summary of Results By Subhankar Ghosh; Jayant Gupta; Arun Sharma; Shuai An; Shashi Shekhar
  36. Strategy-proof Selling: a Geometric Approach By Mridu Prabal Goswami
  37. Testing for an Explosive Bubble using High-Frequency Volatility By H. Peter Boswijk; Jun Yu; Yang Zu

  1. By: Yuhui Jin
    Abstract: Deep learning techniques for predicting stock market prices are a popular topic in the field of data science. Customized feature engineering serves as a pre-processing tool for different stock market datasets. In this paper, we present a graph neural network based convolutional neural network (CNN) model that can be applied to diverse sources of data to extract features for predicting the trends of the S&P 500, NASDAQ, DJI, NYSE, and RUSSEL indices.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.03760
  2. By: Yanqing Yang; Xingcheng Xu; Jinfeng Ge; Yan Xu
    Abstract: This paper aims to explore the application of machine learning in forecasting Chinese macroeconomic variables. Specifically, it employs various machine learning models to predict the quarterly real GDP growth of China, and analyzes the factors contributing to the performance differences among these models. Our findings indicate that the average forecast errors of machine learning models are generally lower than those of traditional econometric models or expert forecasts, particularly in periods of economic stability. However, during certain inflection points, although machine learning models still outperform traditional econometric models, expert forecasts may exhibit greater accuracy in some instances due to experts' more comprehensive understanding of the macroeconomic environment and real-time economic variables. In addition to macroeconomic forecasting, this paper employs interpretable machine learning methods to identify the key attributive variables from different machine learning models, aiming to enhance the understanding and evaluation of their contributions to macroeconomic fluctuations.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.03595
  3. By: Cristian Chica; Yinglong Guo; Gilad Lerman
    Abstract: Algorithmic price collusion facilitated by artificial intelligence (AI) algorithms raises significant concerns. We examine how AI agents using Q-learning engage in tacit collusion in two-sided markets. Our experiments reveal that AI-driven platforms achieve higher collusion levels compared to Bertrand competition. Increased network externalities significantly enhance collusion, suggesting AI algorithms exploit them to maximize profits. Higher user heterogeneity or greater utility from outside options generally reduce collusion, while higher discount rates increase it. Tacit collusion remains feasible even at low discount rates. To mitigate collusive behavior and inform potential regulatory measures, we propose incorporating a penalty term in the Q-learning algorithm.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.04088
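    Sketch: A minimal Q-learning pricing experiment in the spirit of the abstract, using a one-sided Bertrand-style duopoly rather than the paper's two-sided platform model; the demand function, learning parameters, and the exact form of the penalty term are illustrative assumptions, with lambda_pen > 0 playing the role of the proposed collusion penalty.
```python
import numpy as np

# Two Q-learning sellers repeatedly pick a price on a discrete grid; the state
# each agent sees is the rival's last price. Demand tilts toward the cheaper
# seller. Setting lambda_pen > 0 subtracts a penalty for supra-competitive
# prices, the mitigation idea the abstract proposes (all values illustrative).
rng = np.random.default_rng(0)
prices = np.linspace(0.1, 1.0, 10)
n = len(prices)
alpha, gamma, eps, lambda_pen = 0.1, 0.95, 0.1, 0.0
Q = [np.zeros((n, n)), np.zeros((n, n))]          # Q[i][rival_price, own_price]

def profit(p_own, p_rival):
    share = np.exp(-4 * p_own) / (np.exp(-4 * p_own) + np.exp(-4 * p_rival))
    return p_own * share                           # logit-style demand split

state = [0, 0]
for t in range(100_000):
    acts = [rng.integers(n) if rng.random() < eps
            else int(np.argmax(Q[i][state[i]])) for i in range(2)]
    for i in range(2):
        r = profit(prices[acts[i]], prices[acts[1 - i]])
        r -= lambda_pen * max(prices[acts[i]] - prices.min(), 0.0)  # penalty term
        s, a, s2 = state[i], acts[i], acts[1 - i]
        Q[i][s, a] += alpha * (r + gamma * Q[i][s2].max() - Q[i][s, a])
        state[i] = s2

print("long-run prices:", [prices[int(np.argmax(Q[i][state[i]]))] for i in range(2)])
```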
  4. By: Feng, Yuan; Liu, Shuang; Zhang, Man; Jin, Yanhong; Yu, Xiaohua
    Keywords: Food Consumption/Nutrition/Food Safety
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:ags:aaea22:343836
  5. By: Zhang, Yuxiang; Liu, Yizao; Sears, James M.
    Keywords: Food Consumption/Nutrition/Food Safety, Health Economics And Policy, Consumer/ Household Economics
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:ags:aaea22:343727
  6. By: Feng Xu; Yan Yin; Xinyu Zhang; Tianyuan Liu; Shengyi Jiang; Zongzhang Zhang
    Abstract: Alphas are pivotal in providing signals for quantitative trading. The industry highly values the discovery of formulaic alphas for their interpretability and ease of analysis, compared with the expressive yet overfitting-prone black-box alphas. In this work, we focus on discovering formulaic alphas. Prior studies on automatically generating a collection of formulaic alphas were mostly based on genetic programming (GP), which is known to suffer from the problems of being sensitive to the initial population, converging to local optima, and slow computation speed. Recent efforts employing deep reinforcement learning (DRL) for alpha discovery have not fully addressed key practical considerations such as alpha correlations and validity, which are crucial for their effectiveness. In this work, we propose a novel framework for alpha discovery using DRL by formulating the alpha discovery process as program construction. Our agent, $\text{Alpha}^2$, assembles an alpha program optimized for an evaluation metric. A search algorithm guided by DRL navigates through the search space based on value estimates for potential alpha outcomes. The evaluation metric encourages both the performance and the diversity of alphas for a better final trading strategy. Our formulation of searching alphas also brings the advantage of pre-calculation dimensional analysis, ensuring the logical soundness of alphas, and pruning the vast search space to a large extent. Empirical experiments on real-world stock markets demonstrate $\text{Alpha}^2$'s capability to identify a diverse set of logical and effective alphas, which significantly improves the performance of the final trading strategy. The code of our method is available at https://github.com/x35f/alpha2.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.16505
  7. By: Zexing Xu; Linjun Zhang; Sitan Yang; Rasoul Etesami; Hanghang Tong; Huan Zhang; Jiawei Han
    Abstract: Demand prediction is a crucial task for e-commerce and physical retail businesses, especially during high-stakes sales events. However, the limited availability of historical data from these peak periods poses a significant challenge for traditional forecasting methods. In this paper, we propose a novel approach that leverages strategically chosen proxy data reflective of potential sales patterns from similar entities during non-peak periods, enriched by features learned from a graph neural network (GNN)-based forecasting model, to predict demand during peak events. We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm that leverages proxy data from non-peak periods and GNN-generated relational metadata to learn feature-specific layer parameters, thereby adapting to demand forecasts for peak events. Theoretically, we show that by considering domain similarities through task-specific metadata, our model achieves improved generalization, where the excess risk decreases as the number of training tasks increases. Empirical evaluations on large-scale industrial datasets demonstrate the superiority of our approach. Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.16221
  8. By: Raeid Saqur
    Abstract: Machine learning techniques applied to the problem of financial market forecasting struggle with dynamic regime switching, or underlying correlation and covariance shifts in true (hidden) market variables. Drawing inspiration from the success of reinforcement learning in robotics, particularly in agile locomotion adaptation of quadruped robots to unseen terrains, we introduce an innovative approach that leverages the world knowledge of pretrained LLMs (a.k.a. 'privileged information' in robotics) and dynamically adapts them using intrinsic, natural market rewards, via an LLM alignment technique we dub "Reinforcement Learning from Market Feedback" (**RLMF**). Strong empirical results demonstrate the efficacy of our method in adapting to regime shifts in financial markets, a challenge that has long plagued predictive models in this domain. The proposed algorithmic framework outperforms the best-performing SOTA LLM models on the existing (FLARE) benchmark stock-movement (SM) tasks by more than 15% in accuracy. On the recently proposed NIFTY SM task, our adaptive policy outperforms the best-performing SOTA trillion-parameter models like GPT-4. The paper details the dual-phase, teacher-student architecture and implementation of our model, the empirical results obtained, and an analysis of the role of language embeddings in terms of Information Gain.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.15508
  9. By: Adam Korniejczuk; Robert Ślepaczuk
    Abstract: The study seeks to develop an effective strategy built on a novel framework for statistical arbitrage based on graph clustering algorithms. An amalgamation of quantitative and machine learning methods, including the Kelly criterion and an ensemble of machine learning classifiers, has been used to improve risk-adjusted returns and increase immunity to transaction costs over existing approaches. The study seeks to provide an integrated approach to optimal signal detection and risk management. As part of this approach, innovative ways of optimizing take-profit and stop-loss functions for daily-frequency trading strategies have been proposed and tested. All of the tested approaches outperformed appropriate benchmarks. The best combinations of the techniques and parameters demonstrated significantly better performance metrics than the relevant benchmarks. The results have been obtained under the assumption of realistic transaction costs, but are sensitive to changes in some key parameters.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.10695
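    Sketch: The Kelly criterion mentioned above reduces, for binary win/loss bets, to a closed-form bet fraction; a minimal sketch under that assumption (the paper's graph clustering, classifier ensemble, and signal logic are not reproduced, and the trade returns are hypothetical).
```python
import numpy as np

# Kelly fraction f* = p - (1 - p)/b for hit ratio p and win/loss payoff ratio b,
# estimated here from a handful of made-up trade returns.
def kelly_fraction(p: float, b: float) -> float:
    return max(p - (1.0 - p) / b, 0.0)

returns = np.array([0.012, -0.008, 0.015, -0.010, 0.009])   # hypothetical trades
wins, losses = returns[returns > 0], -returns[returns < 0]
p_hat = len(wins) / len(returns)            # estimated hit ratio
b_hat = wins.mean() / losses.mean()         # estimated payoff ratio
print(f"estimated Kelly fraction: {kelly_fraction(p_hat, b_hat):.3f}")
```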
  10. By: Yoon, Sang-Ha (KOREA INSTITUTE FOR INTERNATIONAL ECONOMIC POLICY (KIEP))
    Abstract: In the era of big data, economists are exploring new data sources and methodologies to improve economic forecasting. This study examines the potential of big data and machine learning in enhancing the predictive power of international macroeconomic forecasting models. The research utilizes both structured and unstructured data to forecast Korea's GDP growth rate. For structured data, around 200 macroeconomic and financial indicators from Korea and the U.S. were used with machine learning techniques (Random Forest, XGBoost, LSTM) and ensemble models. Results show that machine learning generally outperforms traditional econometric models, particularly for one-quarter-ahead forecasts, although performance varies by country and period. For unstructured data, the study uses Naver search data as a proxy for public sentiment. Using Dynamic Model Averaging and Selection (DMA and DMS) techniques, it incorporates eight Naver search indices alongside traditional macroeconomic variables. The findings suggest that online search data improves predictive power, especially in capturing economic turning points. The study also compares these big data-driven models with a Dynamic Stochastic General Equilibrium (DSGE) model. While DSGE offers policy analysis capabilities, its in-sample forecasts make direct comparison difficult. However, DMA and DMS models using search indices seem to better capture the GDP plunge in 2020. Based on the research findings, the author offers several suggestions to maximize the potential of big data. He stresses the importance of discovering and constructing diverse data sources, while also developing new analytical techniques such as machine learning. Furthermore, he suggests that big data models can be used as auxiliary indicators to complement existing forecasting models, and proposes that combining structural models with big data methodologies could create synergistic effects. Lastly, he notes that using text mining on various online sources to build comprehensive databases can secure richer and more real-time economic data. These suggestions demonstrate the significant potential of big data in improving the accuracy of international macroeconomic forecasting, particularly emphasizing its effectiveness in situations where the economy is undergoing rapid changes.
    Keywords: International Macroeconomic Forecasting Model; Big Data
    Date: 2024–06–14
    URL: https://d.repec.org/n?u=RePEc:ris:kiepwe:2024_018
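    Sketch: A hedged illustration of the structured-data exercise described above: one-quarter-ahead GDP-growth forecasting with a random forest over an expanding window. The data are synthetic stand-ins for the roughly 200 Korean and U.S. indicators; the DMA/DMS, LSTM, and ensemble components are not reproduced.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
T, k = 120, 20                                   # 120 quarters, 20 indicators
X = rng.normal(size=(T, k))
y = X[:, :3].sum(axis=1) + 0.5 * rng.normal(size=T)   # synthetic GDP growth

errors = []
for t in range(80, T - 1):                       # expanding-window pseudo out-of-sample
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X[:t], y[1:t + 1])                 # indicators at t-1 predict growth at t
    errors.append(y[t + 1] - model.predict(X[t:t + 1])[0])
print("one-quarter-ahead RMSE:", np.sqrt(np.mean(np.square(errors))))
```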
  11. By: Felderer, Barbara; Repke, Lydia; Weber, Wiebke; Schweisthal, Jonas; Bothmann, Ludwig
    Abstract: The Survey Quality Predictor (SQP) is an open-access system to predict the quality, i.e., the reliability and validity, of survey questions based on the characteristics of the questions. The prediction is based on a meta-regression of many multitrait-multimethod (MTMM) experiments in which characteristics of the survey questions were systematically varied. The release of SQP 3.0, which is based on an expanded database compared to previous SQP versions, raised the need for a new meta-regression. To find the best method for analyzing the complex data structure of SQP (e.g., the existence of various uncorrelated predictors), we compared four suitable machine learning methods in terms of their ability to predict both survey quality indicators: LASSO, elastic net, boosting, and random forest. The article discusses the performance of the models and illustrates the importance of the individual item characteristics in the random forest model, which was chosen for SQP 3.0.
    Date: 2024–06–27
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:hkngd
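    Sketch: The four learners compared in the abstract, run on synthetic regression data as a stand-in for the (non-public) MTMM meta-data; the point is only the shape of such a comparison, not the paper's results.
```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import cross_val_score

# Synthetic data with many weakly informative predictors.
X, y = make_regression(n_samples=500, n_features=60, n_informative=10,
                       noise=5.0, random_state=0)
models = {
    "LASSO": Lasso(alpha=0.1),
    "elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "boosting": GradientBoostingRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=300, random_state=0),
}
for name, model in models.items():               # 5-fold cross-validated R^2
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name:13s} mean CV R^2 = {r2:.3f}")
```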
  12. By: Zhao, Jing; Cochrane, Mark; Zhang, Xin; Elmore, Andrew; Lee, Janice; Su, Ye
    Keywords: Land Economics/Use, Environmental Economics And Policy, Community/Rural/Urban Development
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:ags:aaea22:344016
  13. By: Zuzanna Kostecka; Robert Ślepaczuk
    Abstract: Accurate calculation of the Loss Given Default (LGD) parameter demands comprehensive financial data. In this research, we aim to explore methods for improving the approximation of realized LGD under conditions of limited access to cash-flow data. We enhance the performance of the method that relies on the differences between exposure values (the delta outstanding approach) by employing machine learning (ML) techniques. The research utilizes data from the mortgage portfolio of one of the European countries and assumes a close resemblance to similar economic contexts. It incorporates non-financial variables and macroeconomic data related to the housing market, improving the accuracy of loss severity approximation. The proposed methodology attempts to mitigate country-specific (related to local law) or portfolio-specific factors, with the aim of showing the general advantage of applying ML techniques rather than a case-specific relation. We developed an XGBoost model that does not rely on cash-flow data yet enhances the accuracy of realized LGD estimation compared to results obtained with the delta outstanding approach. A novel aspect of our work is the detailed exploration of the delta outstanding approach and the methodology for addressing conditions of limited access to cash-flow data through machine learning models.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.17308
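    Sketch: A hedged XGBoost regression approximating realized LGD from an exposure-delta feature plus non-financial and macro covariates, echoing the setup above; all feature names and data are hypothetical stand-ins for the mortgage portfolio.
```python
import numpy as np
import xgboost as xgb
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 5_000
X = np.column_stack([
    rng.uniform(0, 1, n),       # delta outstanding: relative change in exposure
    rng.uniform(0.3, 1.5, n),   # loan-to-value at default
    rng.normal(0, 1, n),        # regional house-price index (standardized)
    rng.integers(0, 2, n),      # collateral-repossessed flag
])
# Synthetic realized LGD, bounded to [0, 1].
lgd = np.clip(0.6 * X[:, 0] + 0.2 * (X[:, 1] - 0.8) - 0.1 * X[:, 2]
              + 0.1 * rng.normal(size=n), 0, 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, lgd, random_state=0)
model = xgb.XGBRegressor(n_estimators=400, max_depth=4, learning_rate=0.05)
model.fit(X_tr, y_tr)
print("test MAE:", mean_absolute_error(y_te, model.predict(X_te)))
```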
  14. By: Tianyu Du; Ayush Kanodia; Herman Brunborg; Keyon Vafa; Susan Athey
    Abstract: Many empirical studies of labor market questions rely on estimating relatively simple predictive models using small, carefully constructed longitudinal survey datasets based on hand-engineered features. Large Language Models (LLMs), trained on massive datasets, encode vast quantities of world knowledge and can be used for the next job prediction problem. However, while an off-the-shelf LLM produces plausible career trajectories when prompted, the probability with which an LLM predicts a particular job transition conditional on career history will not, in general, align with the true conditional probability in a given population. Recently, Vafa et al. (2024) introduced a transformer-based "foundation model", CAREER, trained using a large, unrepresentative resume dataset, that predicts transitions between jobs; it further demonstrated how transfer learning techniques can be used to leverage the foundation model to build better predictive models of both transitions and wages that reflect conditional transition probabilities found in nationally representative survey datasets. This paper considers an alternative where the fine-tuning of the CAREER foundation model is replaced by fine-tuning LLMs. For the task of next job prediction, we demonstrate that models trained with our approach outperform several alternatives in terms of predictive performance on the survey data, including traditional econometric models, CAREER, and LLMs with in-context learning, even though the LLM can in principle predict job titles that are not allowed in the survey data. Further, we show that our fine-tuned LLM-based models' predictions are more representative of the career trajectories of various workforce subpopulations than off-the-shelf LLM models and CAREER. We conduct experiments and analyses that highlight the sources of the gains in the performance of our models for representative predictions.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.17972
  15. By: Tan, Fuli; Wang, Jingjing; De Steur, Hans; Fan, Shenggen
    Keywords: Marketing, Consumer/ Household Economics, Food Consumption/Nutrition/Food Safety
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:ags:aaea22:344020
  16. By: Takeshi Fukasawa
    Abstract: This study proposes the Value Function-Policy Gradient Iteration-Spectral (VF-PGI-Spectral) algorithm, which efficiently solves discrete-time infinite-horizon dynamic models with continuous actions. It incorporates the spectral algorithm to accelerate convergence. The method is applicable not only to single-agent dynamic optimization problems, but also to multi-agent dynamic games, which previously proposed methods cannot handle. Moreover, the proposed algorithm is not limited to models with specific functional forms, and applies to models with multiple continuous actions. The study reports numerical experiments demonstrating the effective performance of the proposed algorithm.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.04227
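    Sketch: A stripped-down joint value-function/policy-gradient iteration for a textbook growth model (u(c) = log c, k' = A k^alpha - c); the spectral acceleration, the multi-agent case, and any convergence tuning are omitted, and all parameters are illustrative.
```python
import numpy as np

# Jointly update the policy by a gradient step on u(c) + beta*V(k') and the
# value function by a Bellman evaluation step, on a capital grid.
alpha_, beta, A = 0.36, 0.95, 1.0
grid = np.linspace(0.05, 0.5, 50)
V = np.zeros_like(grid)                       # value-function guess
c = 0.3 * A * grid ** alpha_                  # interior policy guess
lam = 0.02                                    # policy-gradient step size

for it in range(5_000):
    y = A * grid ** alpha_
    k_next = np.clip(y - c, grid[0], grid[-1])
    dV_next = np.interp(k_next, grid, np.gradient(V, grid))  # numerical V'(k')
    # d/dc [u(c) + beta V(k')] = 1/c - beta V'(k')
    c = np.clip(c + lam * (1.0 / c - beta * dV_next), 1e-3, y - 1e-3)
    V = np.log(c) + beta * np.interp(k_next, grid, V)

# Closed form for this model: c/y = 1 - alpha_*beta ~= 0.658.
print("consumption rate at median capital:", c[25] / (A * grid[25] ** alpha_))
```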
  17. By: Khudri, Md Mohsan (Austin Community College); Hussey, Andrew (University of Memphis)
    Abstract: Using data from the Panel Study of Income Dynamics, we estimate the impact of breastfeeding initiation and duration on multiple cognitive, health, and behavioral outcomes spanning early childhood through adolescence. To mitigate the potential bias from misspecification, we employ a doubly robust (DR) estimation method, addressing misspecification in either the treatment or outcome models while adjusting for selection effects. Our novel approach is to use and evaluate a battery of supervised machine learning (ML) algorithms to improve propensity score (PS) estimates. We demonstrate that the gradient boosting machine (GBM) algorithm removes bias more effectively and minimizes other prediction errors compared to logit and probit models as well as alternative ML algorithms. Across all outcomes, our DR-GBM estimation generally yields lower estimates than OLS, DR, and PS matching using standard and alternative ML algorithms and even sibling fixed effects estimates. We find that having been breastfed is significantly linked to multiple improved early cognitive outcomes, though the impact reduces somewhat with age. In contrast, we find mixed evidence regarding the impact of breastfeeding on non-cognitive (health and behavioral) outcomes, with effects being most pronounced in adolescence. Our results also suggest relatively higher cognitive benefits for children of minority mothers and children of mothers with at least some post-high school education, and minimal marginal benefits of breastfeeding duration beyond 12 months for cognitive outcomes and 6 months for non-cognitive outcomes.
    Keywords: breastfeeding, human capital, cognitive and non-cognitive outcomes, doubly robust estimation, machine learning
    JEL: I12 I18 J13 J24 C21 C63
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:iza:izadps:dp17080
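    Sketch: A minimal AIPW (doubly robust) treatment-effect estimate with a gradient-boosting propensity model, echoing the DR-GBM idea above; data are synthetic, with a binary treatment standing in for breastfeeding initiation.
```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 4_000
X = rng.normal(size=(n, 5))
p_true = 1 / (1 + np.exp(-X[:, 0] + 0.5 * X[:, 1]))        # true selection model
D = rng.binomial(1, p_true)                                # treatment indicator
y = 0.4 * D + X[:, 0] + 0.3 * X[:, 2] + rng.normal(size=n)

# GBM propensity scores, trimmed to avoid extreme inverse weights.
ps = GradientBoostingClassifier(random_state=0).fit(X, D).predict_proba(X)[:, 1]
ps = np.clip(ps, 0.01, 0.99)
# Outcome regressions by treatment arm.
mu1 = LinearRegression().fit(X[D == 1], y[D == 1]).predict(X)
mu0 = LinearRegression().fit(X[D == 0], y[D == 0]).predict(X)
# AIPW estimator: consistent if either the PS or the outcome model is right.
ate = np.mean(mu1 - mu0 + D * (y - mu1) / ps - (1 - D) * (y - mu0) / (1 - ps))
print(f"AIPW ATE estimate: {ate:.3f} (true effect 0.4)")
```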
  18. By: Resta, Emanuela; Logroscino, Giancarlo; Tafuri, Silvio; Peter, Preethymol; Noviello, Chiara; Costantiello, Alberto; Leogrande, Angelo
    Abstract: This article analyses the relationship between the mental health index and the variables of the Environment, Social, and Governance (ESG) model in the Italian regions between 2004 and 2023. First, a static analysis is proposed, aimed at identifying trends in mental health across the Italian regions and the associated regional gaps. Subsequently, a clustering with the k-Means algorithm is proposed. Eleven machine learning algorithms are then compared for predicting the performance of the mental health index. Finally, the article offers some economic policy suggestions. The results are critically discussed in light of the scientific literature.
    Keywords: Mental Health Index, Machine Learning, ESG, Regional Inequalities
    JEL: I11 I12 I13 I14 I15 I18
    Date: 2024–06–14
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:121204
  19. By: Dawid, Herbert; Gemkow, Simon; Harting, Philipp; Kabus, Kordian; Neugart, Michael; Wersching, Klaus
    Abstract: We develop an agent-based macroeconomic model featuring a distinct geographical dimension and heterogeneous workers with respect to skill types. The model, which will become part of a larger simulation platform for European policymaking (EURACE), allows us to conduct ex-ante evaluations of a wide range of public policy measures and their interaction. In particular, we study the growth and labor market effects of various policy types that promote workers’ general skill levels. Using a calibrated model, we examine how far the effects differ if spending is uniformly spread over all regions in the economy or focused in one particular region. We find that the geographic distribution of policy measures significantly shapes the effects of the policy even if total spending is kept constant. Focusing training efforts in one region yields the worst policy outcome, while spreading funds equally across regions generates larger output in the long run but not in the short run.
    Date: 2024–07–01
    URL: https://d.repec.org/n?u=RePEc:dar:wpaper:146365
  20. By: Fernando Acebes; M Pereda; David Poza; Javier Pajares; Jose M Galan
    Abstract: The aim of this paper is to describe a new, integrated methodology for project control under uncertainty. The proposal is based on Earned Value Methodology and risk analysis, and presents several refinements to previous methodologies. More specifically, the approach uses extensive Monte Carlo simulation to obtain information about the expected behavior of the project. This dataset is exploited in several ways using different statistical learning methodologies in a structured fashion. Initially, simulations are used to detect whether project deviations are a consequence of the expected variability, using Anomaly Detection algorithms. If the project follows this expected variability, the probabilities of success in cost and time, and the expected cost and total duration of the project, can be estimated using classification and regression approaches.
    Date: 2024–05
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.02589
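    Sketch: The Monte Carlo layer of the methodology above, reduced to a three-activity serial project with triangular durations; from the simulated sample one can read off probabilities of meeting time and cost targets (all figures hypothetical).
```python
import numpy as np

rng = np.random.default_rng(4)
n_sim = 100_000
# (min, mode, max) duration in days, and cost per day, for each activity.
activities = [((8, 10, 15), 1000), ((4, 5, 9), 1500), ((10, 12, 20), 800)]

duration = np.zeros(n_sim)
cost = np.zeros(n_sim)
for (lo, mode, hi), daily_cost in activities:
    d = rng.triangular(lo, mode, hi, n_sim)
    duration += d                                # serial project: durations add
    cost += d * daily_cost

print("P(duration <= 32 days):", np.mean(duration <= 32))
print("P(cost <= 32,000):", np.mean(cost <= 32_000))
```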
  21. By: Dawid, Herbert; Neugart, Michael
    Abstract: The advent of artificial intelligence is changing the task allocation of workers and machines in firms’ production processes, with potentially wide-ranging effects on workers and firms. We develop an agent-based simulation framework to investigate the consequences of different types of automation for industry output, the wage distribution, the labor share, and industry dynamics. It is shown how the competitiveness of markets, in particular barriers to entry, changes the effects that automation has on various outcome variables, and to what extent heterogeneous workers with distinct general skill endowments and heterogeneous firms featuring distinct wage offer rules affect the channels via which automation changes market outcomes.
    Date: 2024–06–25
    URL: https://d.repec.org/n?u=RePEc:dar:wpaper:146300
  22. By: Nicolas Apfel; Julia Hatamyar; Martin Huber; Jannis Kueck
    Abstract: This study introduces a data-driven, machine learning-based method to detect suitable control variables and instruments for assessing the causal effect of a treatment on an outcome in observational data, if they exist. Our approach tests the joint existence of instruments, which are associated with the treatment but not directly with the outcome (at least conditional on observables), and suitable control variables, conditional on which the treatment is exogenous, and learns the partition of instruments and control variables from the observed data. The detection of sets of instruments and control variables relies on the condition that proper instruments are conditionally independent of the outcome given the treatment and suitable control variables. We establish the consistency of our method for detecting control variables and instruments under certain regularity conditions, investigate the finite sample performance through a simulation study, and provide an empirical application to labor market data from the Job Corps study.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.04448
  23. By: Sarra Ben Yahia (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique); Jose Angel Garcia Sanchez (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique); Rania Hentati Kaffel (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique)
    Abstract: This study provides a sentiment analysis model to enhance market return forecasts by considering investor sentiment from social media platforms like Twitter (X). We leverage advanced NLP techniques and large language models to analyze sentiment from financial tweets. We use a large web-scraped dataset of selected energy stock daily returns spanning 2018 to 2023. Sentiment scores derived from FinBERT are integrated into a novel predictive model (SIMDM) to evaluate autocorrelation structures within both the sentiment scores and the stock returns data. Our findings reveal (i) significant correlations between sentiment scores and stock prices, (ii) high sensitivity of the results to data quality, and (iii) empirical support for market efficiency, with evidence of a delayed influence of emotional states on stock returns.
    Keywords: financial NLP, FinBERT, information extraction, web scraping, sentiment analysis, LLM, deep learning
    Date: 2024–06–30
    URL: https://d.repec.org/n?u=RePEc:hal:cesptp:hal-04629569
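    Sketch: Minimal FinBERT scoring with the Hugging Face pipeline API. "ProsusAI/finbert" is a publicly available FinBERT checkpoint used here as a stand-in; the paper's exact model, tweet corpus, and SIMDM predictive model are not reproduced, and the example tweets are invented.
```python
from transformers import pipeline

clf = pipeline("text-classification", model="ProsusAI/finbert")

tweets = [
    "Crude inventories fall sharply; energy majors rally on supply concerns.",
    "Refinery outage forces production cuts, shares slide in early trading.",
]
for tweet, result in zip(tweets, clf(tweets)):
    # result looks like {'label': 'positive', 'score': 0.93}
    print(f"{result['label']:>8s} ({result['score']:.2f})  {tweet}")
```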
  24. By: Batabyal, Amitrajeet; Kourtit, Karima; Nijkamp, Peter
    Abstract: We analyze economic growth in a stylized, high-tech region A with two key features. First, the residents of this region are high-tech because they possess skills. In the language of Richard Florida, these residents comprise the region’s creative class and they possess creative capital. Second, the region is high-tech because it uses an artificial intelligence (AI)-based technology and we model the use of this technology. In this setting, we first derive expressions for three growth metrics. Second, we use these metrics to show that the economy of A converges to a balanced growth path (BGP). Third, we compute the growth rate of output per effective creative capital unit on this BGP. Fourth, we study how heterogeneity in initial conditions influences outcomes on the BGP by introducing a second high-tech region B into the analysis. At time t=0, two key savings rates in A are twice as large as in B. We compute the ratio of the BGP value of income per effective creative capital unit in A to its value in B. Finally, we compute the ratio of the BGP value of skills per effective creative capital unit in A to its value in B.
    Keywords: Artificial Intelligence, Creative Capital, Regional Economic Growth, Skills
    JEL: O33 R11
    Date: 2023–12–11
    URL: https://d.repec.org/n?u=RePEc:pra:mprapa:121328
  25. By: Dawid, H.; Harting, P.; Neugart, M.
    Abstract: In the aftermath of the financial crisis, with periphery countries in the European Union falling even more behind the core countries economically, there have been quests for various kinds of fiscal policies in order to revert divergence. How these policies would unfold and perform comparatively is largely unknown. We analyze four such stylized policies in an agent-based macroeconomic model and study the economic mechanisms behind their relative success. Our main findings are that the core country sharing the debt burden of the periphery country has almost no effect on the growth dynamics of that region, fiscal transfers have a positive short- and long-run impact on per-capita consumption in the target region, and that technology-oriented firm subsidies have the strongest positive long-run impact on competitiveness of the periphery country at which they are targeted. The positive effect of the technology-oriented policy is reinforced if combined with household transfers.
    Date: 2024–06–25
    URL: https://d.repec.org/n?u=RePEc:dar:wpaper:146302
  26. By: Brevin Franklin; Emily Silcock; Abhishek Arora; Tom Bryan; Melissa Dell
    Abstract: Social scientists and the general public often analyze contemporary events by drawing parallels with the past, a process complicated by the vast, noisy, and unstructured nature of historical texts. For example, hundreds of millions of page scans from historical newspapers have been noisily transcribed. Traditional sparse methods for searching for relevant material in these vast corpora, e.g., with keywords, can be brittle given complex vocabularies and OCR noise. This study introduces News Deja Vu, a novel semantic search tool that leverages transformer large language models and a bi-encoder approach to identify historical news articles that are most similar to modern news queries. News Deja Vu first recognizes and masks entities, in order to focus on broader parallels rather than the specific named entities being discussed. Then, a contrastively trained, lightweight bi-encoder retrieves historical articles that are most similar semantically to a modern query, illustrating how phenomena that might seem unique to the present have varied historical precedents. Aimed at social scientists, the user-friendly News Deja Vu package is designed to be accessible for those who lack extensive familiarity with deep learning. It works with large text datasets, and we show how it can be deployed to a massive scale corpus of historical, open-source news articles. While human expertise remains important for drawing deeper insights, News Deja Vu provides a powerful tool for exploring parallels in how people have perceived past and present.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.15593
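    Sketch: Bi-encoder semantic retrieval in the spirit of News Deja Vu. "all-MiniLM-L6-v2" is a generic public bi-encoder standing in for the package's contrastively trained model, and the [MASK] tokens in the toy "articles" only illustrate the entity-masking idea.
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

historical = [
    "[MASK] announces tariffs on imported steel, drawing protests abroad.",
    "Bank run in [MASK] prompts emergency measures by the government.",
    "New [MASK] machine promises to replace a dozen factory workers.",
]
query = "[MASK] unveils new import duties; trading partners threaten retaliation."

# Encode corpus and query once, then rank by cosine similarity.
corpus_emb = model.encode(historical, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
for hit in util.semantic_search(query_emb, corpus_emb, top_k=2)[0]:
    print(f"{hit['score']:.3f}  {historical[hit['corpus_id']]}")
```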
  27. By: Orson Mengara
    Abstract: With the growing use of voice-activated systems and speech recognition technologies, the danger of backdoor attacks on audio data has grown significantly. This research looks at a specific type of attack, known as a stochastic investment-based backdoor attack (MarketBack), in which adversaries strategically manipulate the stylistic properties of audio to fool speech recognition systems. Backdoor attacks seriously threaten the security and integrity of machine learning models; to maintain the reliability of audio applications and systems, identifying such attacks becomes crucial in the context of audio data. Experimental results demonstrated that MarketBack can achieve an average attack success rate close to 100% across seven victim models when poisoning less than 1% of the training data.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.10719
  28. By: Will Hicks
    Abstract: In this article we look at stochastic processes with uncertain parameters, and consider different ways in which information is obtained when carrying out observations. For example, we focus on the case of the random evolution of a traded financial asset price with uncertain volatility. The quantum approach presented allows us to encode different volatility levels in a state acting on a Hilbert space. We consider different means of defining projective measurements in order to track the evolution of a traded market price, and discuss the results of different Monte Carlo simulations.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.04520
  29. By: Abhishek Arora; Emily Silcock; Leander Heldring; Melissa Dell
    Abstract: Massive-scale historical document collections are crucial for social science research. Despite increasing digitization, these documents typically lack unique cross-document identifiers for individuals mentioned within the texts, as well as individual identifiers from external knowledgebases like Wikipedia/Wikidata. Existing entity disambiguation methods often fall short in accuracy for historical documents, which are replete with individuals not remembered in contemporary knowledgebases. This study makes three key contributions to improve cross-document coreference resolution and disambiguation in historical texts: a massive-scale training dataset replete with hard negatives, sourcing over 190 million entity pairs from Wikipedia contexts and disambiguation pages; high-quality evaluation data from hand-labeled historical newswire articles; and trained models evaluated on this historical benchmark. We contrastively train bi-encoder models for coreferencing and disambiguating individuals in historical texts, achieving accurate, scalable performance that identifies out-of-knowledgebase individuals. Our approach significantly surpasses other entity disambiguation models on our historical newswire benchmark. Our models also demonstrate competitive performance on modern entity disambiguation benchmarks, particularly certain news disambiguation datasets.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.15576
  30. By: Igor Sadoune; Marcelin Joanis; Andrea Lodi
    Abstract: This paper introduces the Minimum Price Markov Game (MPMG), a dynamic variant of the Prisoner's Dilemma. The MPMG serves as a theoretical model and reasonable approximation of real-world first-price sealed-bid public auctions that follow the minimum price rule. The goal is to provide researchers and practitioners with a framework to study market fairness and regulation in both digitized and non-digitized public procurement processes, amidst growing concerns about algorithmic collusion in online markets. We demonstrate, using multi-agent reinforcement learning-driven artificial agents, that algorithmic tacit coordination is difficult to achieve in the MPMG when cooperation is not explicitly engineered. Paradoxically, our results highlight the robustness of the minimum price rule in an auction environment, but also show that it is not impervious to full-scale algorithmic collusion. These findings contribute to the ongoing debates about algorithmic pricing and its implications.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.03521
  31. By: Lee, Jacob W.; Elliott, Brendan; Lam, Aaron; Gupta, Neha; Wilson, Norbert L.W.; Collins, Leslie M.; Mainsah, Boyla
    Keywords: Research Methods/Statistical Methods, Agricultural And Food Policy, Teaching/Communication/Extension/Profession
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:ags:aaea22:343814
  32. By: Philipp Wirth; Francesca Medda; Thomas Schröder
    Abstract: In this paper, we introduce the Dynamic Modularity-Spectral Algorithm (DynMSA), a novel approach to identify clusters of stocks with high intra-cluster correlations and low inter-cluster correlations by combining Random Matrix Theory with modularity optimisation and spectral clustering. The primary objective is to uncover hidden market structures and find diversifiers based on return correlations, thereby achieving a more effective risk-reducing portfolio allocation. We applied DynMSA to constituents of the S&P 500 and compared the results to sector- and market-based benchmarks. Besides the conception of this algorithm, our contributions further include implementing a sector-based calibration for modularity optimisation and a correlation-based distance function for spectral clustering. Testing revealed that DynMSA outperforms baseline models in intra- and inter-cluster correlation differences, particularly over medium-term correlation look-backs. It also identifies stable clusters and detects regime changes due to exogenous shocks, such as the COVID-19 pandemic. Portfolios constructed using our clusters showed higher Sortino and Sharpe ratios, lower downside volatility, reduced maximum drawdown and higher annualised returns compared to an equally weighted market benchmark.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.04500
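    Sketch: One ingredient of DynMSA, correlation-based spectral clustering of stocks, on synthetic returns with three latent "sectors"; the Random Matrix Theory filtering and modularity-calibration steps are omitted.
```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(5)
common = rng.normal(size=(500, 3))               # three latent sector factors
loadings = np.repeat(np.eye(3), 10, axis=0)      # 30 stocks, 10 per sector
returns = common @ loadings.T + 0.8 * rng.normal(size=(500, 30))

corr = np.corrcoef(returns, rowvar=False)
affinity = (1 + corr) / 2                        # map correlations [-1, 1] -> [0, 1]
labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                            random_state=0).fit_predict(affinity)
print(labels.reshape(3, 10))                     # rows should be single-sector
```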
  33. By: Alexander Quispe; Rodrigo Grijalba
    Abstract: Advancements in Artificial Intelligence, particularly with ChatGPT, have significantly impacted software development. Utilizing novel data from the GitHub Innovation Graph, we hypothesize that ChatGPT enhances software production efficiency. Exploiting natural experiments in which some governments banned ChatGPT, we employ Difference-in-Differences (DID), Synthetic Control (SC), and Synthetic Difference-in-Differences (SDID) methods to estimate its effects. Our findings indicate a significant positive impact on the number of git pushes, repositories, and unique developers per 100,000 people, particularly for high-level, general-purpose, and shell scripting languages. These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low-quality code and privacy concerns.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.11046
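    Sketch: The paper uses synthetic difference-in-differences; shown here is only the plain DID interaction estimate on a synthetic panel where some "countries" ban ChatGPT at t = 10, as a hedged illustration of the identification idea rather than the SDID estimator itself.
```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
units, periods = 20, 20
df = pd.DataFrame([{"unit": i, "t": t,
                    "treated": int(i < 5),       # first five units ban ChatGPT
                    "post": int(t >= 10)}
                   for i in range(units) for t in range(periods)])
df["pushes"] = (100 + 2 * df["t"] + 5 * df["unit"]
                - 12 * df["treated"] * df["post"]          # true effect of the ban
                + rng.normal(0, 3, len(df)))

m = smf.ols("pushes ~ treated * post", data=df).fit()
print("DID estimate:", round(m.params["treated:post"], 2))  # should be near -12
```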
  34. By: Giovanni Cerulli (IRcRES, Rome)
    Abstract: This presentation introduces the Stata package opl for optimal policy learning, facilitating ex-ante policy impact evaluation within the Stata environment. Despite theoretical progress, practical implementations of policy-learning algorithms remain scarce in popular statistical software. To address this limitation, the package implements three popular policy learning algorithms in Stata (threshold-based, linear-combination, and fixed-depth decision tree), and provides practical demonstrations of them using a real database. I also present a policy scenario development approach proposing a menu strategy, which is particularly useful when selection variables are affected by welfare monotonicity. Overall, the package contributes to bridging the gap between theoretical advancements and practical applications of policy learning.
    Date: 2024–05–09
    URL: https://d.repec.org/n?u=RePEc:boc:isug24:02
  35. By: Subhankar Ghosh; Jayant Gupta; Arun Sharma; Shuai An; Shashi Shekhar
    Abstract: Given a set \emph{S} of spatial feature types, its feature instances, a study area, and a neighbor relationship, the goal is to find pairs $\langle C, r_{g}\rangle$ such that \emph{C} is a statistically significant regional-colocation pattern in $r_{g}$. This problem is important for applications in various domains including ecology, economics, and sociology. The problem is computationally challenging due to the exponential number of regional colocation patterns and candidate regions. Previously, we proposed a miner \cite{10.1145/3557989.3566158} that finds statistically significant regional colocation patterns. However, the numerous simultaneous statistical inferences raise the risk of false discoveries (also known as the multiple comparisons problem) and carry a high computational cost. We propose a novel algorithm, the multiple comparisons regional colocation miner (MultComp-RCM), which uses a Bonferroni correction. Theoretical analysis, experimental evaluation, and case study results show that the proposed method reduces both the false discovery rate and computational cost.
    Date: 2024–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2407.02536
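    Sketch: The Bonferroni correction at the heart of MultComp-RCM: with m candidate region-colocation tests run simultaneously, each is tested at level alpha/m to cap the family-wise error rate at alpha. The p-values below are made up.
```python
import numpy as np

alpha = 0.05
p_values = np.array([0.0004, 0.012, 0.0009, 0.21, 0.048, 0.0031])
m = len(p_values)                                # number of simultaneous tests

for p in p_values:                               # Bonferroni: compare p to alpha/m
    verdict = "significant" if p < alpha / m else "not significant"
    print(f"p = {p:.4f} -> {verdict} at family-wise level {alpha}")
```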
  36. By: Mridu Prabal Goswami
    Abstract: We consider one buyer and one seller. For a bundle $(t, q)\in [0, \infty[\times [0, 1]=\mathbb{Z}$, $q$ either refers to the winning probability of an object or a share of a good, and $t$ denotes the payment that the buyer makes. We define classical and restricted classical preferences of the buyer on $\mathbb{Z}$; they incorporate quasilinear, non-quasilinear, and risk-averse preferences with multidimensional pay-off relevant parameters. We define rich single-crossing subsets of the two classes, and characterize strategy-proof mechanisms by using monotonicity of the mechanisms and continuity of the indirect preference correspondences. We also provide a computationally tractable optimization program to compute the optimal mechanism. We do not use revenue equivalence and virtual valuations as tools in our proofs. Our proof techniques bring out the geometric interaction between the single-crossing property and the positions of bundles $(t, q)$. Our proofs are simple. The extension of the optimization program to the $n$-buyer environment is immediate.
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2406.12279
  37. By: H. Peter Boswijk (Amsterdam School of Economics, University of Amsterdam); Jun Yu (Department of Finance and Business Economics, Faculty of Business Administration, University of Macau); Yang Zu (Department of Economics, University of Macau)
    Abstract: Based on a continuous-time stochastic volatility model with a linear drift, we develop a test for explosive behavior in financial asset prices at a low frequency when prices are sampled at a higher frequency. The test exploits the volatility information in the high-frequency data. The method consists of devolatizing log-asset price increments with realized volatility measures and performing a supremum-type recursive Dickey-Fuller test on the devolatized sample. The proposed test has a nuisance-parameter-free asymptotic distribution and is easy to implement. We study the size and power properties of the test in Monte Carlo simulations. A real-time date-stamping strategy based on the devolatized sample is proposed for the origination and conclusion dates of the explosive regime. Conditions under which the real-time date-stamping strategy is consistent are established. The test and the date-stamping strategy are applied to study explosive behavior in cryptocurrency and stock markets.
    Keywords: Stochastic volatility model; Unit root test; Double asymptotics; Explosiveness; Asset price bubbles
    JEL: C12 C22 G01
    Date: 2024–06
    URL: https://d.repec.org/n?u=RePEc:boa:wpaper:202402
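    Sketch: The recipe in the abstract, devolatize log-price increments and take the supremum of forward-recursive Dickey-Fuller t-statistics (as in SADF-type bubble tests), on synthetic data with a known volatility path standing in for a realized volatility measure; critical values and the date-stamping step are omitted.
```python
import numpy as np

rng = np.random.default_rng(7)
T = 500
vol = 0.02 * np.exp(np.cumsum(0.05 * rng.normal(size=T)))   # stochastic volatility
logp = np.zeros(T)
for t in range(1, T):
    rho_t = 1.015 if 300 <= t < 400 else 1.0                # explosive episode
    logp[t] = rho_t * logp[t - 1] + vol[t] * rng.normal()

z = np.cumsum(np.diff(logp) / vol[1:])                      # devolatized sample

def df_tstat(x):
    """t-statistic of rho in the no-constant regression dx_t = rho * x_{t-1} + e_t."""
    dx, lag = np.diff(x), x[:-1]
    rho = (lag @ dx) / (lag @ lag)
    resid = dx - rho * lag
    se = np.sqrt((resid @ resid) / (len(dx) - 1) / (lag @ lag))
    return rho / se

r0 = int(0.2 * len(z))                                      # minimum window length
sup_df = max(df_tstat(z[: r + 1]) for r in range(r0, len(z)))
print("sup DF statistic:", round(sup_df, 3))
```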

This nep-cmp issue is ©2024 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.