nep-big New Economics Papers
on Big Data
Issue of 2024‒01‒15
twenty-six papers chosen by
Tom Coupé, University of Canterbury


  1. Online Job Posts Contain Very Little Wage Information By Honey Batra; Amanda Michaud; Simon Mongey
  2. COVID-19, School Closures and (Cyber)Bullying in Germany By Rahlff, Helen; Rinne, Ulf; Sonnabend, Hendrik
  3. AI and Jobs: Has the Inflection Point Arrived? Evidence from an Online Labor Platform By Dandan Qiao; Huaxia Rui; Qian Xiong
  4. Nowcasting Madagascar's real GDP using machine learning algorithms By Ramaharo, Franck Maminirina; Rasolofomanana, Gerzhino H
  5. chatReport: Democratizing Sustainability Disclosure Analysis through LLM-based Tools By Jingwei Ni; Julia Bingler; Chiara Colesanti Senni; Mathias Kraus; Glen Gostlow; Tobias Schimanski; Dominik Stammbach; Saeid Vaghefi; Qian Wang; Nicolas Webersinke; Tobias Wekhof; Tingyu Yu; Markus Leippold
  6. A Comprehensive Machine Learning Framework for Dynamic Portfolio Choice With Transaction Costs By Luca Gaegauf; Simon Scheidegger; Fabio Trojani
  7. Do LLM Agents Exhibit Social Behavior? By Yan Leng; Yuan Yuan
  8. Vox Populi, Vox AI? Using Language Models to Estimate German Public Opinion By von der Heyde, Leah; Haensch, Anna-Carolina; Wenz, Alexander
  9. Integrating New Technologies into Science: The case of AI By Stefano Bianchini; Moritz Müller; Pierre Pelletier
  10. The Role of Economic News in Predicting Suicides By Francesco Moscone; Elisa Tosetti; Giorgio Vittadini
  11. Churn Prediction via Multimodal Fusion Learning: Integrating Customer Financial Literacy, Voice, and Behavioral Data By David Hason Rudd; Huan Huo; Md. Rafiqul Islam; Guandong Xu
  12. When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP By Jingwei Ni; Zhijing Jin; Qian Wang; Mrinmaya Sachan; Markus Leippold
  13. Deep Reinforcement Learning for Quantitative Trading By Maochun Xu; Zixun Lan; Zheng Tao; Jiawei Du; Zongao Ye
  14. ClimateBERT-NetZero: Detecting and Assessing Net Zero and Reduction Targets By Tobias Schimanski; Julia Bingler; Camilla Hyslop; Mathias Kraus; Markus Leippold
  15. Towards Sobolev Pruning By Neil Kichler; Sher Afghan; Uwe Naumann
  16. Generative artificial intelligence enhances individual creativity but reduces the collective diversity of novel content By Anil R. Doshi; Oliver P. Hauser
  17. Forecasting exports in selected OECD countries and Iran using MLP Artificial Neural Network By Soheila Khajoui; Saeid Dehyadegari; Sayyed Abdolmajid Jalaee
  18. StockEmotions: Discover Investor Emotions for Financial Sentiment Analysis and Multivariate Time Series By Jean Lee; Hoyoul Luis Youn; Josiah Poon; Soyeon Caren Han
  19. A graph-based multimodal framework to predict gentrification By Javad Eshtiyagh; Baotong Zhang; Yujing Sun; Linhui Wu; Zhao Wang
  20. Uniswap Daily Transaction Indices by Network By Nir Chemaya; Lin William Cong; Emma Jorgensen; Dingyue Liu; Luyao Zhang
  21. FABLES: Framework for Autonomous Behaviour-rich Language-driven Emotion-enabled Synthetic populations. By HRADEC Jiri; OSTLAENDER Nicole; BERNINI Alba
  22. Does women’s political empowerment matter for income inequality? By Miriam Hortas-Rico; Vicente Rios
  23. The power to conserve: a field experiment on electricity use in Qatar By Al-Ubaydli, Omar; Cassidy, Alecia; Chatterjee, Anomitro; Khalifa, Ahmed; Price, Michael
  24. A Machine Learning Approach to Targeting Humanitarian Assistance Among Forcibly Displaced Populations By Angela C. Lyons; Alejandro Montoya Castano; Josephine Kass-Hanna; Yifang Zhang; Aiman Soliman
  25. Machine Learning and Fundraising: Applications of Artificial Neural Networks By Diana Barro; Luca Barzanti; Marco Corazza; Martina Nardon
  26. Decoding Social Sentiment in DAO: A Comparative Analysis of Blockchain Governance Communities By Yutong Quan; Xintong Wu; Wanlin Deng; Luyao Zhang

  1. By: Honey Batra; Amanda Michaud; Simon Mongey
    Abstract: We present six facts that characterize the little wage information contained in the universe of online job posts in the U.S. First, wage information is rare: only 14% of posts contain any wage information and the minority of these (6%) have a point wage. The majority (8%) feature a range of wages that are on average wide, spanning 28% of the midpoint (e.g. $21-28/hr or $32,000-$42,000/yr). Second, information varies systematically along the occupation-wage gradient. Third, posted wages are 40% higher than wages in BLS data in low-wage occupations and 20% lower than BLS data in high-wage occupations. Fourth, among the wages that are posted, high wage firms are more opaque, with more and wider ranges. Fifth, there is zero correlation between wage information and local labor market tightness. Sixth, of the top 20 posting private firms, none have any wage information in more than 2% of their posts. Our findings caution against treating wage data from job postings as a stand-in for administrative data. We provide an example of bias in econometric inference that worsens as wage information falls.
    JEL: E20 J30
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:31984&r=big
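As a rough check on the range-width figure quoted in the abstract above, the width of a posted wage range relative to its midpoint can be computed as sketched below. The function name `relative_width` is illustrative and not taken from the paper; the two ranges are the examples quoted in the abstract ($21-28/hr and $32,000-$42,000/yr).

```python
def relative_width(low: float, high: float) -> float:
    """Width of a posted wage range as a fraction of its midpoint."""
    if low <= 0 or high < low:
        raise ValueError("expected 0 < low <= high")
    midpoint = (low + high) / 2
    return (high - low) / midpoint


# The two example ranges quoted in the abstract.
for low, high in [(21, 28), (32_000, 42_000)]:
    print(f"{low}-{high}: {relative_width(low, high):.1%}")
```

Both example ranges come out close to the 28% average width reported in the abstract (about 28.6% and 27.0% respectively).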
  2. By: Rahlff, Helen (University of Hagen); Rinne, Ulf (IZA); Sonnabend, Hendrik (Fern Universität Hagen)
    Abstract: We analyze the prevalence of bullying in Germany during COVID-19, both as a real-life phenomenon (in-person bullying, or in our context: school bullying) and via social media and electronic communication tools (cyberbullying). Using Google Trends data from 2013 to 2022 and exploiting the COVID-19 pandemic as a natural experiment when schools switched to distance learning, we document stark changes in the prevalence of (cyber)bullying in Germany: Our results indicate that during school years affected by COVID-19, online searches for school bullying decreased by about 25 percent, while online searches for cyberbullying increased by about 48 percent during the same periods.
    Keywords: school bullying, cyberbullying, Google Trends
    JEL: H75 I12 I21 I28 I31
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp16650&r=big
  3. By: Dandan Qiao; Huaxia Rui; Qian Xiong
    Abstract: Artificial intelligence (AI) refers to the ability of machines or software to mimic or even surpass human intelligence in a given cognitive task. While humans learn by both induction and deduction, the success of current AI is rooted in induction, relying on its ability to detect statistical regularities in task input -- an ability learnt from a vast amount of training data using enormous computation resources. We examine the performance of such a statistical AI in a human task through the lens of four factors, including task learnability, statistical resource, computation resource, and learning techniques, and then propose a three-phase visual framework to understand the evolving relation between AI and jobs. Based on this conceptual framework, we develop a simple economic model of competition to show the existence of an inflection point for each occupation. Before AI performance crosses the inflection point, human workers always benefit from an improvement in AI performance, but after the inflection point, human workers become worse off whenever such an improvement occurs. To offer empirical evidence, we first argue that AI performance has passed the inflection point for the occupation of translation but not for the occupation of web development. We then study how the launch of ChatGPT, which led to significant improvement of AI performance on many tasks, has affected workers in these two occupations on a large online labor platform. Consistent with the inflection point conjecture, we find that translators are negatively affected by the shock both in terms of the number of accepted jobs and the earnings from those jobs, while web developers are positively affected by the very same shock. Given the potentially large disruption of AI on employment, more studies on more occupations using data from different platforms are urgently needed.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.04180&r=big
  4. By: Ramaharo, Franck Maminirina (Ministry of Economy and Finance (Ministère de l'Economie et des Finances)); Rasolofomanana, Gerzhino H (Ministry of Economy and Finances)
    Abstract: We investigate the predictive power of different machine learning algorithms to nowcast Madagascar's gross domestic product (GDP). We trained popular regression models, including linear regularized regression (Ridge, Lasso, Elastic-net), dimensionality reduction model (principal component regression), k-nearest neighbors algorithm (k-NN regression), support vector regression (linear SVR), and tree-based ensemble models (Random forest and XGBoost regressions), on 10 Malagasy quarterly macroeconomic leading indicators over the period 2007Q1-2022Q4, and we used simple econometric models as a benchmark. We measured the nowcast accuracy of each model by calculating the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Our findings reveal that the Ensemble Model, formed by aggregating individual predictions, consistently outperforms traditional econometric models. We conclude that machine learning models can deliver more accurate and timely nowcasts of Malagasy economic performance and provide policymakers with additional guidance for data-driven decision making.
    Date: 2023–12–22
    URL: http://d.repec.org/n?u=RePEc:osf:africa:vpuac&r=big
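The three accuracy measures named in the abstract above (RMSE, MAE, MAPE) have standard definitions, sketched here in plain Python. The function names and the sample numbers are made up for illustration; they are not the paper's code or data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if any actual value is 0)."""
    n = len(actual)
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical quarterly GDP values and nowcasts (illustrative only).
actual = [100.0, 200.0, 400.0]
nowcast = [110.0, 190.0, 400.0]
print(rmse(actual, nowcast), mae(actual, nowcast), mape(actual, nowcast))
```

RMSE penalizes large misses more heavily than MAE, while MAPE expresses the error relative to the level of the series, which is one reason nowcasting studies often report all three.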
  5. By: Jingwei Ni (ETH Zurich); Julia Bingler (University of Oxford); Chiara Colesanti Senni (ETH Zürich; University of Zurich); Mathias Kraus (University of Erlangen); Glen Gostlow (University of Zurich); Tobias Schimanski (University of Zurich); Dominik Stammbach (ETH Zurich); Saeid Vaghefi (University of Zurich); Qian Wang (University of Zurich); Nicolas Webersinke (Friedrich-Alexander-Universität Erlangen-Nürnberg); Tobias Wekhof (ETH Zürich); Tingyu Yu (University of Zurich); Markus Leippold (University of Zurich; Swiss Finance Institute)
    Abstract: This paper introduces a novel approach to enhance Large Language Models (LLMs) with expert knowledge to automate the analysis of corporate sustainability reports by benchmarking them against the Task Force for Climate-Related Financial Disclosures (TCFD) recommendations. Corporate sustainability reports are crucial in assessing organizations' environmental and social risks and impacts. However, analyzing these reports' vast amounts of information makes human analysis often too costly. As a result, only a few entities worldwide have the resources to analyze these reports, which could lead to a lack of transparency. While AI-powered tools can automatically analyze the data, they are prone to inaccuracies as they lack domain-specific expertise. This paper introduces a novel approach to enhance LLMs with expert knowledge to automate the analysis of corporate sustainability reports. We christen our tool chatReport, and apply it in a first use case to assess corporate climate risk disclosures following the TCFD recommendations. ChatReport results from collaborating with experts in climate science, finance, economic policy, and computer science, demonstrating how domain experts can be involved in developing AI tools. We make our prompt templates, generated data, and scores available to the public to encourage transparency.
    Keywords: Task Force for Climate-Related Financial Disclosures, Sustainability Report, Large Language Model, ChatGPT
    Date: 2023–11
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp23111&r=big
  6. By: Luca Gaegauf (University of Zurich); Simon Scheidegger (University of Lausanne); Fabio Trojani (University of Geneva; University of Turin; Swiss Finance Institute)
    Abstract: We introduce a comprehensive computational framework for solving dynamic portfolio choice problems with many risky assets, transaction costs, and borrowing and short-selling constraints. Our approach leverages the synergy between Gaussian process regression and Bayesian active learning to efficiently approximate value and policy functions with a novel, formal way of characterizing the irregularly-shaped no-trade region; we then embed this into a discrete-time dynamic programming algorithm. This combination allows us to study dynamic portfolio choice problems with more risky assets than was previously possible. Our results indicate that giving the agent access to more assets may alleviate some illiquidity resulting from the presence of transaction costs.
    Keywords: Machine learning, computational finance, computational economics, Gaussian process regression, dynamic portfolio optimization, transaction costs, liquidity premia
    JEL: C61 C63 C68 E21
    Date: 2023–11
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp23114&r=big
  7. By: Yan Leng; Yuan Yuan
    Abstract: The advances of Large Language Models (LLMs) are expanding their utility in both academic research and practical applications. Recent social science research has explored the use of these "black-box" LLM agents for simulating complex social systems and potentially substituting human subjects in experiments. Our study delves into this emerging domain, investigating the extent to which LLMs exhibit key social interaction principles, such as social learning, social preference, and cooperative behavior, in their interactions with humans and other agents. We develop a novel framework for our study, wherein classical laboratory experiments involving human subjects are adapted to use LLM agents. This approach involves step-by-step reasoning that mirrors human cognitive processes and zero-shot learning to assess the innate preferences of LLMs. Our analysis of LLM agents' behavior includes both the primary effects and an in-depth examination of the underlying mechanisms. Focusing on GPT-4, the state-of-the-art LLM, our analyses suggest that LLM agents appear to exhibit a range of human-like social behaviors such as distributional and reciprocity preferences, responsiveness to group identity cues, engagement in indirect reciprocity, and social learning capabilities. However, our analysis also reveals notable differences: LLMs demonstrate a pronounced fairness preference, weaker positive reciprocity, and a more calculating approach in social learning compared to humans. These insights indicate that while LLMs hold great promise for applications in social science research, such as in laboratory experiments and agent-based modeling, the subtle behavioral differences between LLM agents and humans warrant further investigation. Careful examination and development of protocols in evaluating the social behaviors of LLMs are necessary before directly applying these models to emulate human behavior.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.15198&r=big
  8. By: von der Heyde, Leah (LMU Munich); Haensch, Anna-Carolina; Wenz, Alexander (University of Mannheim)
    Abstract: The recent development of large language models (LLMs) has spurred discussions about whether LLM-generated “synthetic samples” could complement or replace traditional surveys, considering their training data potentially reflects attitudes and behaviors prevalent in the population. A number of mostly US-based studies have prompted LLMs to mimic survey respondents, finding that the responses closely match the survey data. However, several contextual factors related to the relationship between the respective target population and LLM training data might affect the generalizability of such findings. In this study, we investigate the extent to which LLMs can estimate public opinion in Germany, using the example of vote choice as outcome of interest. To generate a synthetic sample of eligible voters in Germany, we create personas matching the individual characteristics of the 2017 German Longitudinal Election Study respondents. Prompting GPT-3 with each persona, we ask the LLM to predict each respondent’s vote choice in the 2017 German federal elections and compare these predictions to the survey-based estimates on the aggregate and subgroup levels. We find that GPT-3 does not predict citizens’ vote choice accurately, exhibiting a bias towards the Green and Left parties, and making better predictions for more “typical” voter subgroups. While the language model is able to capture broad-brush tendencies tied to partisanship, it tends to miss out on the multifaceted factors that sway individual voter choices. Furthermore, our results suggest that GPT-3 might not be reliable for estimating nuanced, subgroup-specific political attitudes. By examining the prediction of voting behavior using LLMs in a new context, our study contributes to the growing body of research about the conditions under which LLMs can be leveraged for studying public opinion. The findings point to disparities in opinion representation in LLMs and underscore the limitation of applying them for public opinion estimation without accounting for the biases in their training data.
    Date: 2023–12–15
    URL: http://d.repec.org/n?u=RePEc:osf:socarx:8je9g&r=big
  9. By: Stefano Bianchini; Moritz Müller; Pierre Pelletier
    Abstract: New technologies have the power to revolutionize science. It has happened in the past and is happening again with the emergence of new computational tools, such as Artificial Intelligence (AI) and Machine Learning (ML). Despite the documented impact of these technologies, there remains a significant gap in understanding the process of their adoption within the scientific community. In this paper, we draw on theories of scientific and technical human capital (STHC) to study the integration of AI in scientific research, focusing on the human capital of scientists and the external resources available within their network of collaborators and institutions. We validate our hypotheses on a large sample of publications from OpenAlex, covering all sciences from 1980 to 2020. We find that the diffusion of AI is strongly driven by social mechanisms that organize the deployment and creation of human capital that complements the technology. Our results suggest that AI is pioneered by domain scientists with a 'taste for exploration' and who are embedded in a network rich in computer scientists, experienced AI scientists and early-career researchers; they also come from institutions with high citation impact and a relatively strong publication history on AI. The pattern is similar across scientific disciplines, the exception being access to high-performance computing (HPC), which is important in chemistry and the medical sciences but less so in other fields. Once AI is integrated into research, most adoption factors continue to influence its subsequent reuse. Implications for the organization and management of science in the evolving era of AI-driven discovery are discussed.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.09843&r=big
  10. By: Francesco Moscone (Brunel University London; Ca' Foscari University of Venice); Elisa Tosetti (University of Padua); Giorgio Vittadini (University of Milano-Bicocca)
    Abstract: In this paper we explore the role of media and language used to comment on economic news in explaining and anticipating suicides in England and Wales. This is an interesting question, given the large delay in the release of official statistics on suicides. We use a large data set of over 200,000 news articles published in six major UK newspapers from 2001 to 2015 and carry out sentiment analysis of the language used to comment on economic news. We extract daily indicators measuring a set of negative emotions that are often associated with poor mental health and use them to explain and forecast national daily suicide figures. We find that highly negative comments on the economic situation in newspaper articles are predictors of higher suicide numbers, especially when using words conveying stronger emotions of fear and despair. Our results suggest that media language carrying very strong, negative feelings may be an early signal of a deterioration in a population's mental health.
    Keywords: suicide, health outcomes, text analysis, emotions extraction, forecasting
    JEL: I14 I15
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:ven:wpaper:2023:32&r=big
  11. By: David Hason Rudd (UTS - University of Technology Sydney); Huan Huo (UTS - University of Technology Sydney); Md. Rafiqul Islam (UTS - University of Technology Sydney); Guandong Xu (UTS - University of Technology Sydney)
    Abstract: In today's competitive landscape, businesses grapple with customer retention. Churn prediction models, although beneficial, often lack accuracy due to the reliance on a single data source. The intricate nature of human behavior and high-dimensional customer data further complicate these efforts. To address these concerns, this paper proposes a multimodal fusion learning model for identifying customer churn risk levels in financial service providers. Our multimodal approach integrates customer sentiments, financial literacy (FL) level, and financial behavioral data, enabling more accurate and bias-free churn prediction models. The proposed FL model utilizes a SMOGN-COREG supervised model to gauge customer FL levels from their financial data. The baseline churn model applies an ensemble artificial neural network and oversampling techniques to predict churn propensity in high-dimensional financial data. We also incorporate a speech emotion recognition model employing a pretrained CNN-VGG16 to recognize customer emotions based on pitch, energy, and tone. To integrate these diverse features while retaining unique insights, we introduced late and hybrid fusion techniques that complementarily boost coordinated multimodal co-learning. Robust metrics were utilized to evaluate the proposed multimodal fusion model and hence the approach's validity, including mean average precision and macro-averaged F1 score. Our novel approach demonstrates a marked improvement in churn prediction, achieving a test accuracy of 91.2%, a Mean Average Precision (MAP) score of 66, and a Macro-Averaged F1 score of 54 through the proposed hybrid fusion learning technique compared with late fusion and baseline models. Furthermore, the analysis demonstrates a positive correlation between negative emotions, low FL scores, and high-risk customers.
    Keywords: Churn prediction, multimodal learning, feature fusion, financial literacy, speech emotion recognition, customer behavior
    Date: 2023–10–30
    URL: http://d.repec.org/n?u=RePEc:hal:journl:hal-04320145&r=big
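For readers unfamiliar with the macro-averaged F1 score mentioned in the abstract above, here is a minimal sketch of its standard definition: the F1 score is computed separately for each class and the per-class scores are averaged with equal weight. The function name, the risk-level labels and the sample predictions are hypothetical and are not drawn from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical churn-risk labels (illustrative only).
y_true = ["low", "low", "high", "high", "mid"]
y_pred = ["low", "high", "high", "high", "low"]
print(round(macro_f1(y_true, y_pred), 3))
```

Because every class counts equally, a model that ignores a rare class is penalized even when its overall accuracy is high, which may explain why an accuracy figure and a macro-F1 figure for the same model can differ widely.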
  12. By: Jingwei Ni (ETH Zurich); Zhijing Jin (ETH Zurich); Qian Wang (University of Zurich); Mrinmaya Sachan (ETH Zürich); Markus Leippold (University of Zurich; Swiss Finance Institute)
    Abstract: Multi-task learning (MTL) aims at achieving a better model by leveraging data and knowledge from multiple tasks. However, MTL does not always work – sometimes negative transfer occurs between tasks, especially when aggregating loosely related skills, leaving it an open question when MTL works. Previous studies show that MTL performance can be improved by algorithmic tricks. However, what tasks and skills should be included is less well explored. In this work, we conduct a case study in Financial NLP where multiple datasets exist for skills relevant to the domain, such as numeric reasoning and sentiment analysis. Due to the task difficulty and data scarcity in the Financial NLP domain, we explore when aggregating such diverse skills from multiple datasets with MTL can work. Our findings suggest that the key to MTL success lies in skill diversity, relatedness between tasks, and choice of aggregation size and shared capacity. Specifically, MTL works well when tasks are diverse but related, and when the size of the task aggregation and the shared capacity of the model are balanced to avoid overwhelming certain tasks.
    Keywords: Multi-Task Learning, Sentiment Analysis, Financial Datasets, FinBERT
    Date: 2023–11
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp23112&r=big
  13. By: Maochun Xu; Zixun Lan; Zheng Tao; Jiawei Du; Zongao Ye
    Abstract: Artificial Intelligence (AI) and Machine Learning (ML) are transforming the domain of Quantitative Trading (QT) through the deployment of advanced algorithms capable of sifting through extensive financial datasets to pinpoint lucrative investment openings. AI-driven models, particularly those employing ML techniques such as deep learning and reinforcement learning, have shown great prowess in predicting market trends and executing trades at a speed and accuracy that far surpass human capabilities. Its capacity to automate critical tasks, such as discerning market conditions and executing trading strategies, has been pivotal. However, persistent challenges exist in current QT methods, especially in effectively handling noisy and high-frequency financial data. Striking a balance between exploration and exploitation poses another challenge for AI-driven trading agents. To surmount these hurdles, our proposed solution, QTNet, introduces an adaptive trading model that autonomously formulates QT strategies through an intelligent trading agent. Incorporating deep reinforcement learning (DRL) with imitative learning methodologies, we bolster the proficiency of our model. To tackle the challenges posed by volatile financial datasets, we conceptualize the QT mechanism within the framework of a Partially Observable Markov Decision Process (POMDP). Moreover, by embedding imitative learning, the model can capitalize on traditional trading tactics, nurturing a balanced synergy between discovery and utilization. For a more realistic simulation, our trading agent undergoes training using minute-frequency data sourced from the live financial market. Experimental findings underscore the model's proficiency in extracting robust market features and its adaptability to diverse market conditions.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.15730&r=big
  14. By: Tobias Schimanski (University of Zurich); Julia Bingler (University of Oxford); Camilla Hyslop (University of Oxford); Mathias Kraus (University of Erlangen); Markus Leippold (University of Zurich; Swiss Finance Institute)
    Abstract: Public and private actors struggle to assess the vast amounts of information about sustainability commitments made by various institutions. To address this problem, we create a novel tool for automatically detecting corporate, national, and regional net zero and reduction targets in three steps. First, we introduce an expert-annotated data set with 3.5K text samples. Second, we train and release ClimateBERT-NetZero, a natural language classifier to detect whether a text contains a net zero or reduction target. Third, we showcase its analysis potential with two use cases: We first demonstrate how ClimateBERT-NetZero can be combined with conventional question-answering (Q&A) models to analyze the ambitions displayed in net zero and reduction targets. Furthermore, we employ the ClimateBERT-NetZero model on quarterly earning call transcripts and outline how communication patterns evolve over time. Our experiments demonstrate promising pathways for extracting and analyzing net zero and emission reduction targets at scale.
    Keywords: Net Zero Targets, ClimateBERT, Transformers, NLP
    Date: 2023–11
    URL: http://d.repec.org/n?u=RePEc:chf:rpseri:rp23110&r=big
  15. By: Neil Kichler; Sher Afghan; Uwe Naumann
    Abstract: The increasing use of stochastic models for describing complex phenomena warrants surrogate models that capture the reference model characteristics at a fraction of the computational cost, foregoing potentially expensive Monte Carlo simulation. The predominant approach of fitting a large neural network and then pruning it to a reduced size has commonly neglected shortcomings. The produced surrogate models often will not capture the sensitivities and uncertainties inherent in the original model. In particular, (higher-order) derivative information of such surrogates could differ drastically. Given a large enough network, we expect this derivative information to match. However, the pruned model will almost certainly not share this behavior. In this paper, we propose to find surrogate models by using sensitivity information throughout the learning and pruning process. We build on work using Interval Adjoint Significance Analysis for pruning and combine it with the recent advancements in Sobolev Training to accurately model the original sensitivity information in the pruned neural network based surrogate model. We experimentally underpin the method on an example of pricing a multidimensional Basket option modelled through a stochastic differential equation with Brownian motion. The proposed method is, however, not limited to the domain of quantitative finance, which was chosen as a case study for intuitive interpretations of the sensitivities. It serves as a foundation for building further surrogate modelling techniques considering sensitivity information.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.03510&r=big
  16. By: Anil R. Doshi; Oliver P. Hauser
    Abstract: Creativity is core to being human. Generative artificial intelligence (GenAI) holds promise for humans to be more creative by offering new ideas, or less creative by anchoring on GenAI ideas. We study the causal impact of GenAI ideas on the production of an unstructured creative output in an online experimental study where some writers could obtain ideas for a story from a GenAI platform. We find that access to GenAI ideas causes stories to be evaluated as more creative, better written and more enjoyable, especially among less creative writers. However, objective measures of story similarity within each condition reveal that GenAI-enabled stories are more similar to each other than stories by humans alone. These results point to an increase in individual creativity, but at the same time there is a risk of losing collective novelty: this dynamic resembles a social dilemma where individual writers are better off using GenAI to improve their own writing, but collectively a narrower scope of novel content may be produced with GenAI. Our results have implications for researchers, policy-makers and practitioners interested in bolstering creativity, but point to potential downstream consequences from over-reliance.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.00506&r=big
  17. By: Soheila Khajoui; Saeid Dehyadegari; Sayyed Abdolmajid Jalaee
    Abstract: The present study aimed to forecast the exports of a select group of Organization for Economic Co-operation and Development (OECD) countries and Iran using neural networks. Data on the exports of these countries from 1970 to 2019 were collected and used to forecast their exports for 2021 to 2025. The analysis was performed using a Multi-Layer Perceptron (MLP) neural network in Python. Of the data, 75 percent were used for training and 25 percent for testing. The findings were evaluated at 99% accuracy, which indicates the reliability of the network's output. The results show that Covid-19 has affected exports over time. However, long-term export contracts are less affected by tensions and crises; given the effect of exports on economic growth and per capita income, it is better for countries' economic policies to use long-term export contracts.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.15535&r=big
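The 75/25 train/test division described in the abstract above can be sketched as follows. The abstract does not say whether the split was random or chronological; this minimal example assumes a chronological split, which is a common choice for time series. The function name `chronological_split` and the placeholder list of years are hypothetical and stand in for the actual export data.

```python
def chronological_split(series, train_fraction=0.75):
    """Split an ordered sequence into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    # Truncate toward zero so the training part never exceeds the fraction.
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Hypothetical placeholder: one observation per year, 1970 through 2019.
years = list(range(1970, 2020))
train_years, test_years = chronological_split(years)
print(len(train_years), len(test_years))
```

Keeping the test observations strictly later than the training observations avoids evaluating a forecasting model on periods it has already seen.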
  18. By: Jean Lee; Hoyoul Luis Youn; Josiah Poon; Soyeon Caren Han
    Abstract: There has been growing interest in applying NLP techniques in the financial domain; however, resources are extremely limited. This paper introduces StockEmotions, a new dataset for detecting emotions in the stock market that consists of 10,000 English comments collected from StockTwits, a financial social media platform. Inspired by behavioral finance, it proposes 12 fine-grained emotion classes that span the roller coaster of investor emotion. Unlike existing financial sentiment datasets, StockEmotions presents granular features such as investor sentiment classes, fine-grained emotions, emojis, and time series data. To demonstrate the usability of the dataset, we perform a dataset analysis and conduct experimental downstream tasks. For financial sentiment/emotion classification tasks, DistilBERT outperforms other baselines, and for multivariate time series forecasting, a Temporal Attention LSTM model combining price index, text, and emotion features achieves better performance than using a single feature.
    Date: 2023–01
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2301.09279&r=big
  19. By: Javad Eshtiyagh; Baotong Zhang; Yujing Sun; Linhui Wu; Zhao Wang
    Abstract: Gentrification--the transformation of a low-income urban area caused by the influx of affluent residents--has many revitalizing benefits. However, it also poses extremely concerning challenges to low-income residents. To help policymakers take targeted and early action in protecting low-income residents, researchers have recently proposed several machine learning models to predict gentrification using socioeconomic and image features. Building upon previous studies, we propose a novel graph-based multimodal deep learning framework to predict gentrification based on urban networks of tracts and essential facilities (e.g., schools, hospitals, and subway stations). We train and test the proposed framework using data from Chicago, New York City, and Los Angeles. The model successfully predicts census-tract level gentrification with 0.9 precision on average. Moreover, the framework discovers a previously unexamined strong relationship between schools and gentrification, which provides a basis for further exploration of social factors affecting gentrification.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.15646&r=big
  20. By: Nir Chemaya; Lin William Cong; Emma Jorgensen; Dingyue Liu; Luyao Zhang
    Abstract: DeFi is transforming financial services by removing intermediaries and producing a wealth of open-source data. This transformation is propelled by Layer 2 (L2) solutions, aimed at boosting network efficiency and scalability beyond current Layer 1 (L1) capabilities. This study addresses the lack of detailed L2 impact analysis by examining over 50 million transactions from Uniswap. Our dataset, featuring transactions from L1 and L2 across networks like Ethereum and Polygon, provides daily indices revealing adoption, scalability, and decentralization within the DeFi space. These indices help to elucidate the complex relationship between DeFi and L2 technologies, advancing our understanding of the ecosystem. The dataset is enhanced by an open-source Python framework for computing decentralization indices, adaptable for various research needs. This positions the dataset as a vital resource for machine learning endeavors, particularly deep learning, contributing significantly to the development of Blockchain as Web3's infrastructure.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2312.02660&r=big
  21. By: HRADEC Jiri (European Commission - JRC); OSTLAENDER Nicole; BERNINI Alba
    Abstract: The research investigates how large language models (LLMs) emerge as reservoirs of a vast array of human experiences, behaviours, and emotions. Building upon prior work of the JRC on synthetic populations, it presents a complete step-by-step guide on how to use LLMs to create highly realistic modelling scenarios and complex societies of autonomous emotional AI agents. This technique is aligned with agent-based modelling (ABM) and facilitates quantitative evaluation. The report describes how the agents were instantiated using LLMs, enriched with personality traits using the ABC-EBDI model, equipped with short- and long-term memory, and given access to detailed knowledge of their environment. This setting of embodied reasoning significantly improved the agents' problem-solving capabilities and, when subjected to various scenarios, the LLM-driven agents exhibited behaviours mirroring human-like reasoning and emotions, inter-agent patterns and realistic conversations, including elements that mirrored critical thinking. These LLM-driven agents can serve as believable proxies for human behaviour in simulated environments, presenting vast implications for future research and policy applications, including studying impacts of different policy scenarios. This bears the opportunity to combine the narrative-based world of foresight scenarios with the advantages of quantitative modelling.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:ipt:iptwpa:jrc135070&r=big
  22. By: Miriam Hortas-Rico; Vicente Rios
    Abstract: This paper analyzes the relationship between women’s political empowerment (WPE) and income inequality in a sample of 142 countries between 1990 and 2019. To identify causal effects, we rely on the use of Random Forests techniques and the exogenous variation in ancestral and traditional cultural norms of gender roles within an instrumental variable panel data modeling approach. These tree-based machine learning statistical techniques help us to predict the spatio-temporal distribution of WPE with high accuracy solely using ancestral societal traits. This predicted variable is then used in the second stage of the IV estimation of a panel specification of income inequality including fixed and time-period fixed effects. Our panel-IV regressions show that (i) WPE reduces income inequality and that (ii) this effect is partly transmitted via redistributive policies. In addition, we employ partial identification methods to ensure that our results are not influenced by unobserved confounding variables. Furthermore, we find that the negative link between WPE and income inequality is robust to the presence of spatial interdependence and time persistence in inequality outcomes, the presence of outliers and influential observations, and an alternative definition of income inequality. Taken together, our results suggest that the observed negative link between WPE and income inequality is likely to be causal.
    Date: 2023–12
    URL: http://d.repec.org/n?u=RePEc:fda:fdaddt:2023-10&r=big
  23. By: Al-Ubaydli, Omar; Cassidy, Alecia; Chatterjee, Anomitro; Khalifa, Ahmed; Price, Michael
    Abstract: High resource users often have the strongest response to behavioral interventions promoting conservation. Yet, little is known about how to motivate them. We implement a field experiment in Qatar, where residential customers have some of the highest energy use per capita in the world. Our dataset consists of 207,325 monthly electricity meter readings from a panel of 6,096 customers. We employ two normative treatments priming identity - a religious message quoting the Qur’an, and a national message reminding households that Qatar prioritizes energy conservation. The treatments reduce electricity use by 3.8% and both messages are equally effective. However, this masks significant heterogeneity. Using machine learning methods on supplemental survey data, we elucidate how agency, motivation, and responsibility activate conservation responses to our identity primes.
    Keywords: electricity consumption; natural field experiments; identity; moral suasion; agency; Qatar; super-users; consumer behaviour; electricity; energy; energy saving; household energy
    JEL: C93 D90 Q41
    Date: 2023–11–20
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:121048&r=big
  24. By: Angela C. Lyons (University of Illinois at Urbana-Champaign); Alejandro Montoya Castano (University of Illinois at Urbana-Champaign); Josephine Kass-Hanna (IESEG School of Management, Univ. Lille); Yifang Zhang (University of Illinois at Urbana-Champaign); Aiman Soliman (University of Illinois at Urbana-Champaign)
    Abstract: Increasing trends in forced displacement and poverty are expected to intensify in coming years. Data science approaches can be useful for governments and humanitarian organizations in designing more robust and effective targeting mechanisms. This study applies machine learning techniques and combines geospatial data with survey data collected from Syrian refugees in Lebanon over the last four years to help develop more robust and operationalizable targeting strategies. Our findings highlight the importance of a comprehensive and flexible framework that captures other poverty dimensions along with the commonly used expenditure metric, while also allowing for regular updates to keep up with (rapidly) changing contexts over time. The analysis also points to geographical heterogeneities that are likely to impact the effectiveness of targeting strategies. The insights from this study have important implications for agencies seeking to improve targeting, especially with shrinking humanitarian funding.
    Date: 2023–11–20
    URL: http://d.repec.org/n?u=RePEc:erg:wpaper:1654&r=big
  25. By: Diana Barro (Department of Economics, Ca' Foscari University of Venice); Luca Barzanti (Department of Mathematics, University of Bologna); Marco Corazza (Department of Economics, Ca' Foscari University of Venice); Martina Nardon (Department of Economics, Ca' Foscari University of Venice)
    Abstract: In fundraising management, the assessment of the expected gift is a key point. The availability of accurate estimates of the number of donations, their amounts, and the gift probability is relevant in order to evaluate the results of a fundraising campaign. The accuracy of the expected gift estimation depends on the appropriate use of the information about Donors. In this contribution, we propose a non-parametric methodology for the prediction of Donors' behavior based on Artificial Neural Networks. In particular, Multi-Layer Perceptron is applied. In the numerical experiments, the expected gift is then estimated based on a simulated dataset of Donors' individual characteristics and information on donations history.
    Keywords: Fundraising Management, Donor's Profile, Gift Expectation, Artificial Neural Networks
    JEL: C45 D64
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:ven:wpaper:2023:33&r=big
  26. By: Yutong Quan; Xintong Wu; Wanlin Deng; Luyao Zhang
    Abstract: Blockchain technology is leading a revolutionary transformation across diverse industries, with effective governance standing as a critical determinant for the success and sustainability of blockchain projects. Community forums, pivotal in engaging decentralized autonomous organizations (DAOs), wield a substantial impact on blockchain governance decisions. Concurrently, Natural Language Processing (NLP), particularly sentiment analysis, provides powerful insights from textual data. While prior research has explored the potential of NLP tools in social media sentiment analysis, a gap persists in understanding the sentiment landscape of blockchain governance communities. The evolving discourse and sentiment dynamics on the forums of top DAOs remain largely unknown. This paper delves deep into the evolving discourse and sentiment dynamics on the public forums of leading DeFi projects -- Aave, Uniswap, Curve Dao, Aragon, Yearn.finance, Merit Circle, and Balancer -- placing a primary focus on discussions related to governance issues. Despite differing activity patterns, participants across these decentralized communities consistently express positive sentiments in their Discord discussions, indicating optimism towards governance decisions. Additionally, our research suggests a potential interplay between discussion intensity and sentiment dynamics, indicating that higher discussion volumes may contribute to more stable and positive emotions. The insights gained from this study are valuable for decision-makers in blockchain governance, underscoring the pivotal role of sentiment analysis in interpreting community emotions and its evolving impact on the landscape of blockchain governance. This research significantly contributes to the interdisciplinary exploration of the intersection of blockchain and society, with a specific emphasis on the decentralized blockchain governance ecosystem.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2311.14676&r=big

This nep-big issue is ©2024 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.