nep-cmp New Economics Papers
on Computational Economics
Issue of 2025–02–24
23 papers chosen by
Stan Miles, Thompson Rivers University


  1. Can Machines Learn Weak Signals? By Zhouyu Shen; Dacheng Xiu
  2. Decision-informed Neural Networks with Large Language Model Integration for Portfolio Optimization By Yoontae Hwang; Yaxuan Kong; Stefan Zohren; Yongjae Lee
  3. Efficient Triangular Arbitrage Detection via Graph Neural Networks By Di Zhang
  4. Putting AI agents through their paces on general tasks By Fernando Perez-Cruz; Hyun Song Shin
  5. Whole Lotta Training - Studying School-to-Training Transitions by Training Artificial Neural Networks By Kubitza, Dennis Oliver; Weßling, Katarina
  6. Can AI Solve the Peer Review Crisis? A Large-Scale Experiment on LLM's Performance and Biases in Evaluating Economics Papers By Pataranutaporn, Pat; Powdthavee, Nattavudh; Maes, Pattie
  7. Nowcasting Madagascar's real GDP using machine learning algorithms By Ramaharo, Franck Maminirina; Rasolofomanana, Gerzhino H
  8. Supervised Similarity for High-Yield Corporate Bonds with Quantum Cognition Machine Learning By Joshua Rosaler; Luca Candelori; Vahagn Kirakosyan; Kharen Musaelian; Ryan Samson; Martin T. Wells; Dhagash Mehta; Stefano Pasquali
  9. The impact of prudential regulations on the UK housing market and economy: insights from an agent-based model By Marco Bardoscia; Adrian Carro; Marc Hinterschweiger; Mauro Napoletano; Lilit Popoyan; Andrea Roventini; Arzu Uluc
  10. Detecting and Mitigating Shortcut Learning Bias in Machine Learning: A Pathway to More Generalizable ML-based (IS) Research By Matthew Caron; Oliver Müller; Johannes Kriebel
  11. Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards By Daniil Karzanov; Rubén Garzón; Mikhail Terekhov; Caglar Gulcehre; Thomas Raffinot; Marcin Detyniecki
  12. MarketSenseAI 2.0: Enhancing Stock Analysis through LLM Agents By George Fatouros; Kostas Metaxas; John Soldatos; Manos Karathanassis
  13. Towards a Deep Learning approach to regularise discourse of collaborative learner By Chowdhury, Koushik
  14. MLPESTEL: The New Era of Forecasting Change in the Operational Environment of Businesses Using LLMs By Alnajjar, Khalid; Hämäläinen, Mika
  15. When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks By Felix Drinkall; Janet B. Pierrehumbert; Stefan Zohren
  16. Exploratory Utility Maximization Problem with Tsallis Entropy By Chen Ziyi; Gu Jia-wen
  17. Strategizing with AI: Insights from a Beauty Contest Experiment By Iuliia Alekseenko; Dmitry Dagaev; Sofia Paklina; Petr Parshakov
  18. The heterogeneous impact of the EU-Canada agreement with causal machine learning By Lionel Fontagné; Francesca Micocci; Armando Rungi
  19. Predicting Socio-economic Indicator Variations with Satellite Image Time Series and Transformer By Robin Jarry; Marc Chaumont; Laure Berti-Equille; Gérard Subsol
  20. Utilizing Big Administrative Data in Evaluation Research: Integrating Causal Modeling, Program Theory, and Machine Learning By de Avila, Rogerio
  21. NEAT Algorithm-based Stock Trading Strategy with Multiple Technical Indicators Resonance By Li-Chun Huang
  22. Comment on "Sequential validation of treatment heterogeneity" and "Comment on generic machine learning inference on heterogeneous treatment effects in randomized experiments" By Victor Chernozhukov; Mert Demirer; Esther Duflo; Iván Fernández-Val
  23. GPT's Performance in Identifying Outcome Changes on ClinicalTrials.gov By Ying, Xiangji; Vorland, Colby J.; Qureshi, Riaz; Brown, Andrew William; Kilicoglu, Halil; Saldanha, Ian; DeVito, Nicholas J; Mayo-Wilson, Evan

  1. By: Zhouyu Shen; Dacheng Xiu
    Abstract: In high-dimensional regressions with low signal-to-noise ratios, we assess the predictive performance of several prevalent machine learning methods. Theoretical insights show Ridge regression's superiority in exploiting weak signals, surpassing a zero benchmark. In contrast, Lasso fails to exceed this baseline, indicating its learning limitations. Simulations reveal that Random Forest generally outperforms Gradient Boosted Regression Trees when signals are weak. Moreover, Neural Networks with l2-regularization excel in capturing nonlinear functions of weak signals. Our empirical analysis across six economic datasets suggests that the weakness of signals, not necessarily the absence of sparsity, may be Lasso's major limitation in economic predictions.
    JEL: C45 C52 C53 C55 C58
    Date: 2025–01
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:33421
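    A quick way to see the Ridge-versus-Lasso contrast described above is a small simulation with many weak, non-sparse coefficients. The sketch below is illustrative only — the data-generating process and penalty levels are assumptions, not the paper's design — and compares out-of-sample R² against a zero benchmark.
```python
# Minimal simulation of dense weak signals: Ridge vs Lasso vs a zero benchmark.
# Illustrative sketch only; parameter choices are assumptions, not the paper's design.
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n, p = 200, 500                        # fewer observations than predictors
beta = rng.normal(scale=0.05, size=p)  # many weak, non-sparse coefficients
X_tr, X_te = rng.normal(size=(n, p)), rng.normal(size=(n, p))
y_tr = X_tr @ beta + rng.normal(scale=1.0, size=n)
y_te = X_te @ beta + rng.normal(scale=1.0, size=n)

for name, model in [("Ridge", Ridge(alpha=50.0)), ("Lasso", Lasso(alpha=0.1))]:
    model.fit(X_tr, y_tr)
    print(name, "out-of-sample R2:", round(r2_score(y_te, model.predict(X_te)), 3))

# Zero benchmark: always predict the training mean.
print("Zero benchmark R2:", round(r2_score(y_te, np.full_like(y_te, y_tr.mean())), 3))
```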
  2. By: Yoontae Hwang; Yaxuan Kong; Stefan Zohren; Yongjae Lee
    Abstract: This paper addresses the critical disconnect between prediction and decision quality in portfolio optimization by integrating Large Language Models (LLMs) with decision-focused learning. We demonstrate both theoretically and empirically that minimizing the prediction error alone leads to suboptimal portfolio decisions. We aim to exploit the representational power of LLMs for investment decisions. An attention mechanism processes asset relationships, temporal dependencies, and macro variables, which are then directly integrated into a portfolio optimization layer. This enables the model to capture complex market dynamics and align predictions with the decision objectives. Extensive experiments on S&P 100 and DOW 30 datasets show that our model consistently outperforms state-of-the-art deep learning models. In addition, gradient-based analyses show that our model prioritizes the assets most crucial to decision making, thus mitigating the effects of prediction errors on portfolio performance. These findings underscore the value of integrating decision objectives into predictions for more robust and context-aware portfolio management.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.00828
  3. By: Di Zhang
    Abstract: Triangular arbitrage is a profitable trading strategy in financial markets that exploits discrepancies in currency exchange rates. Traditional methods for detecting triangular arbitrage opportunities, such as exhaustive search algorithms and linear programming solvers, often suffer from high computational complexity and may miss potential opportunities in dynamic markets. In this paper, we propose a novel approach to triangular arbitrage detection using Graph Neural Networks (GNNs). By representing the currency exchange network as a graph, we leverage the powerful representation and learning capabilities of GNNs to identify profitable arbitrage opportunities more efficiently. Specifically, we formulate the triangular arbitrage problem as a graph-based optimization task and design a GNN architecture that captures the complex relationships between currencies and exchange rates. We introduce a relaxed loss function to enable more flexible learning and integrate Deep Q-Learning principles to optimize the expected returns. Our experiments on a synthetic dataset demonstrate that the proposed GNN-based method achieves a higher average yield with significantly reduced computational time compared to traditional methods. This work highlights the potential of using GNNs for solving optimization problems in finance and provides a promising approach for real-time arbitrage detection in dynamic financial markets.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.03194
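    For reference, the exhaustive-search baseline that the GNN approach is compared against can be written directly: a triangle of exchange rates is an arbitrage opportunity when its rates multiply to more than one, i.e. when the log rates sum to a positive value. The sketch below is a generic brute-force check over currency triples, with hypothetical rates; it is not the paper's method or data.
```python
# Brute-force triangular arbitrage check over all ordered currency triples.
# Generic baseline sketch (not the paper's GNN); the rates dict is hypothetical data.
import math
from itertools import permutations

rates = {("USD", "EUR"): 0.92, ("EUR", "JPY"): 163.0, ("JPY", "USD"): 0.0067,
         ("EUR", "USD"): 1.086, ("JPY", "EUR"): 0.0061, ("USD", "JPY"): 149.0}

def triangular_opportunities(rates, fee=0.0):
    currencies = {c for pair in rates for c in pair}
    out = []
    for a, b, c in permutations(currencies, 3):
        legs = [(a, b), (b, c), (c, a)]
        if all(leg in rates for leg in legs):
            # Sum of log rates > 0 means the product of the three rates exceeds 1.
            log_gain = sum(math.log(rates[leg] * (1 - fee)) for leg in legs)
            if log_gain > 0:
                out.append((a, b, c, math.expm1(log_gain)))  # fractional profit
    return out

print(triangular_opportunities(rates))
```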
  4. By: Fernando Perez-Cruz; Hyun Song Shin
    Abstract: Multimodal large language models (LLMs), trained on vast datasets, are becoming increasingly capable in many settings. However, the capabilities of such models are typically evaluated in narrow tasks, much like standard machine learning models trained for specific objectives. We take a different tack by putting the latest LLM agents through their paces in general tasks involved in solving three popular games - Wordle, Face Quiz and Flashback. These games are easily tackled by humans but they demand a degree of self-awareness and higher-level abilities to experiment, to learn from mistakes and to plan accordingly. We find that the LLM agents display mixed performance in these general tasks. They lack the awareness to learn from mistakes and the capacity for self-correction. LLMs' performance in the most complex cognitive subtasks may not be the limiting factor for their deployment in real-world environments. Instead, it would be important to evaluate the capabilities of AGI-aspiring LLMs through general tests that encompass multiple cognitive tasks, enabling them to solve complete, real-world applications.
    Keywords: AI Agents, LLMs evaluation
    JEL: C88
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:bis:biswps:1245
  5. By: Kubitza, Dennis Oliver; Weßling, Katarina
    Abstract: Transitions from school to further education, training, or work are among the most extensively researched topics in the social sciences. Success in such transitions is influenced by predictors operating at multiple levels, such as the individual, the institutional, or the regional level. These levels are intertwined, creating complex interdependencies in their influence on transitions. To unravel them, researchers typically apply (multilevel) regression techniques and focus on mediating and moderating relations between distinct predictors. Recent research demonstrates that machine learning techniques can uncover previously overlooked patterns among variables. To detect new patterns in transitions from school to vocational training, we apply artificial neural networks (ANNs) trained on survey data from the German National Educational Panel Study (NEPS) linked with regional data. For an accessible interpretation of complex patterns, we use explainable artificial intelligence (XAI) methods. We establish multiple non-linear interactions within and across levels, concluding that they have the potential to inspire new substantive research questions. We argue that adopting ANNs in the social sciences yields new insights into established relationships and makes complex patterns more accessible.
    Keywords: school-to-work transitions, VET, machine learning, explainable artificial neuronal networks, SHAP values, rule extraction
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:zbw:esprep:310974
  6. By: Pataranutaporn, Pat (Massachusetts Institute of Technology); Powdthavee, Nattavudh (Nanyang Technological University, Singapore); Maes, Pattie (Massachusetts Institute of Technology)
    Abstract: We investigate whether artificial intelligence can address the peer review crisis in economics by analyzing 27,090 evaluations of 9,030 unique submissions using a large language model (LLM). The experiment systematically varies author characteristics (e.g., affiliation, reputation, gender) and publication quality (e.g., top-tier, mid-tier, low-tier, AI-generated papers). The results indicate that LLMs effectively distinguish paper quality but exhibit biases favoring prominent institutions, male authors, and renowned economists. Additionally, LLMs struggle to differentiate high-quality AI-generated papers from genuine top-tier submissions. While LLMs offer efficiency gains, their susceptibility to bias necessitates cautious integration and hybrid peer review models to balance equity and accuracy.
    Keywords: Artificial Intelligence, peer review, large language model (LLM), bias in academia, economics publishing, equity-efficiency trade-off
    JEL: A11 C63 O33 I23
    Date: 2025–01
    URL: https://d.repec.org/n?u=RePEc:iza:izadps:dp17659
  7. By: Ramaharo, Franck Maminirina (Ministry of Economy and Finance (Ministère de l'Economie et des Finances)); Rasolofomanana, Gerzhino H (Ministry of Economy and Finances)
    Abstract: We investigate the predictive power of different machine learning algorithms to nowcast Madagascar's gross domestic product (GDP). We trained popular regression models, including linear regularized regression (Ridge, Lasso, Elastic-net), dimensionality reduction model (principal component regression), k-nearest neighbors algorithm (k-NN regression), support vector regression (linear SVR), and tree-based ensemble models (Random forest and XGBoost regressions), on 10 Malagasy quarterly macroeconomic leading indicators over the period 2007Q1-2022Q4, and we used simple econometric models as a benchmark. We measured the nowcast accuracy of each model by calculating the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Our findings reveal that the Ensemble Model, formed by aggregating individual predictions, consistently outperforms traditional econometric models. We conclude that machine learning models can deliver more accurate and timely nowcasts of Malagasy economic performance and provide policymakers with additional guidance for data-driven decision making.
    Date: 2023–12–22
    URL: https://d.repec.org/n?u=RePEc:osf:africa:vpuac_v1
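    The accuracy measures (RMSE, MAE, MAPE) and the averaging ensemble described above follow standard definitions. The sketch below reproduces them on toy data; the models mirror those named in the abstract, but the code, data, and tuning are not the authors'.
```python
# Nowcast accuracy metrics and a simple averaging ensemble on toy data.
# Sketch only; model choices mirror the abstract but this is not the authors' code.
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X, y = rng.normal(size=(64, 10)), rng.normal(size=64)   # 10 leading indicators (toy data)
X_tr, y_tr, X_te, y_te = X[:48], y[:48], X[48:], y[48:]

models = [Ridge(), Lasso(alpha=0.01), ElasticNet(alpha=0.01),
          RandomForestRegressor(random_state=0)]
preds = np.column_stack([m.fit(X_tr, y_tr).predict(X_te) for m in models])
ensemble = preds.mean(axis=1)                            # aggregate individual predictions

rmse = np.sqrt(np.mean((y_te - ensemble) ** 2))
mae = np.mean(np.abs(y_te - ensemble))
mape = np.mean(np.abs((y_te - ensemble) / y_te)) * 100   # beware of near-zero actual values
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  MAPE={mape:.1f}%")
```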
  8. By: Joshua Rosaler; Luca Candelori; Vahagn Kirakosyan; Kharen Musaelian; Ryan Samson; Martin T. Wells; Dhagash Mehta; Stefano Pasquali
    Abstract: We investigate the application of quantum cognition machine learning (QCML), a novel paradigm for both supervised and unsupervised learning tasks rooted in the mathematical formalism of quantum theory, to distance metric learning in corporate bond markets. Compared to equities, corporate bonds are relatively illiquid and both trade and quote data in these securities are relatively sparse. Thus, a measure of distance/similarity among corporate bonds is particularly useful for a variety of practical applications in the trading of illiquid bonds, including the identification of similar tradable alternatives, pricing securities with relatively few recent quotes or trades, and explaining the predictions and performance of ML models based on their training data. Previous research has explored supervised similarity learning based on classical tree-based models in this context; here, we explore the application of the QCML paradigm for supervised distance metric learning in the same context, showing that it outperforms classical tree-based models in high-yield (HY) markets, while giving comparable or better performance (depending on the evaluation metric) in investment grade (IG) markets.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.01495
  9. By: Marco Bardoscia (BANK OF ENGLAND); Adrian Carro (BANCO DE ESPAÑA AND UNIVERSITY OF OXFORD); Marc Hinterschweiger (BANK OF ENGLAND); Mauro Napoletano (SCUOLA SUPERIORE SANT’ANNA, UNIVERSITÉ CÔTE D’AZUR AND SCIENCES PO, OFCE); Lilit Popoyan (UNIVERSITY OF LONDON AND SCUOLA SUPERIORE SANT’ANNA); Andrea Roventini (SCUOLA SUPERIORE SANT’ANNA AND SCIENCES PO, OFCE); Arzu Uluc (BANK OF ENGLAND)
    Abstract: We develop a macroeconomic agent-based model to study the joint impact of borrower and lender-based prudential policies on the housing and credit markets and the economy more widely. We perform three experiments: (i) an increase of total capital requirements; (ii) the introduction of a loan-to-income (LTI) cap on mortgages to owner-occupiers; and (iii) the introduction of both experiments at the same time. Our results suggest that tightening capital requirements leads to a sharp decrease in commercial and mortgage lending and housing transactions. When the LTI cap is in place, house prices fall sharply relative to income and the homeownership rate decreases. When both policy instruments are combined, we find that housing transactions and prices drop. Both policies have a positive impact on real GDP and unemployment, while having no material impact on inflation and the real interest rate.
    Keywords: prudential policies, housing market, macroeconomy, agent-based models
    JEL: C63 D1 D31 E58 G21 G28 R2 R21 R31
    Date: 2024–01
    URL: https://d.repec.org/n?u=RePEc:bde:wpaper:2502
  10. By: Matthew Caron (Paderborn University); Oliver Müller (Paderborn University); Johannes Kriebel (University of Hamburg)
    Abstract: Shortcut learning is a critical challenge in machine learning (ML) that arises when models rely on spurious patterns or superficial associations rather than meaningful relationships in the data. While this issue has been widely studied in computer vision and natural language processing, its impact on tabular and categorical data -- i.e., data common in ML-based research within Information Systems (IS) -- remains underexplored. To address this challenge, we propose a two-phase framework: detecting shortcut learning biases through advanced sampling strategies and mitigating these biases using methods like feature exclusion. Additionally, we emphasize the importance of transparent reporting to enhance reproducibility and provide insights into a model’s generalization capabilities. Using simulated and real-world data, we demonstrate the harmful effects of shortcut learning in tabular data. The results highlight how distribution shifts expose shortcut dependencies, a key focus of the detection phase in our framework. These shifts reveal how models relying on shortcuts fail to generalize beyond training data. While our mitigation strategy is exploratory, it demonstrates that addressing shortcut learning is feasible and underscores the need for further research into model-agnostic solutions. By encouraging comprehensive evaluations and transparent reporting, this work aims to advance the generalizability, reproducibility, and reliability of ML-based research in IS.
    Keywords: Machine Learning; ML-Based Research; Shortcut Learning; Reproducibility; Generalizability
    JEL: C8
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:pdn:dispap:129
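    The detection idea — exposing a model to a sampled distribution shift that breaks a spurious association — can be shown with a tiny tabular example: a "shortcut" column tracks the label in the training sample but is decorrelated in the shifted sample. The simulation and variable names below are illustrative, not the authors' framework.
```python
# Exposing shortcut learning with a distribution shift: the shortcut feature predicts
# the label in training data but is decorrelated in the shifted test sample.
# Illustrative toy example, not the paper's detection framework.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_data(n, shortcut_correlated):
    signal = rng.normal(size=n)
    label = (signal + rng.normal(size=n) > 0).astype(int)
    shortcut = label + rng.normal(scale=0.3, size=n) if shortcut_correlated else rng.normal(size=n)
    return np.column_stack([signal, shortcut]), label

X_train, y_train = make_data(4000, shortcut_correlated=True)
X_iid, y_iid = make_data(2000, shortcut_correlated=True)       # same distribution as training
X_shift, y_shift = make_data(2000, shortcut_correlated=False)  # shortcut decorrelated

clf = GradientBoostingClassifier().fit(X_train, y_train)
print("i.i.d. accuracy:     ", accuracy_score(y_iid, clf.predict(X_iid)))
print("accuracy under shift:", accuracy_score(y_shift, clf.predict(X_shift)))
# A large gap indicates the model leaned on the shortcut rather than the signal.
```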
  11. By: Daniil Karzanov; Rubén Garzón; Mikhail Terekhov; Caglar Gulcehre; Thomas Raffinot; Marcin Detyniecki
    Abstract: This paper introduces a novel agent-based approach for enhancing existing portfolio strategies using Proximal Policy Optimization (PPO). Rather than focusing solely on traditional portfolio construction, our approach aims to improve an already high-performing strategy through dynamic rebalancing driven by PPO and Oracle agents. Our target is to enhance the traditional 60/40 benchmark (60% stocks, 40% bonds) by employing the Regret-based Sharpe reward function. To address the impact of transaction fee frictions and prevent signal loss, we develop a transaction cost scheduler. We introduce a future-looking reward function and employ synthetic data training through a circular block bootstrap method to facilitate the learning of generalizable allocation strategies. We focus on two key evaluation measures: return and maximum drawdown. Given the high stochasticity of financial markets, we train 20 independent agents each period and evaluate their average performance against the benchmark. Our method not only enhances the performance of the existing portfolio strategy through strategic rebalancing but also demonstrates strong results compared to other baselines.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.02619
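    The circular block bootstrap used above for synthetic training data is a standard resampling scheme: contiguous blocks are drawn with wrap-around so that short-range dependence in returns is preserved. A minimal sketch follows; the block length and input series are placeholder assumptions, not the paper's settings.
```python
# Circular block bootstrap: resample contiguous blocks, wrapping around the end of the series.
# Minimal sketch for generating synthetic return paths; parameters are illustrative.
import numpy as np

def circular_block_bootstrap(returns, n_samples, block_len, rng=None):
    rng = rng or np.random.default_rng()
    T = len(returns)
    blocks = []
    while sum(len(b) for b in blocks) < n_samples:
        start = rng.integers(T)
        idx = (start + np.arange(block_len)) % T   # wrap around circularly
        blocks.append(returns[idx])
    return np.concatenate(blocks)[:n_samples]

returns = np.random.default_rng(0).normal(0.0003, 0.01, size=2500)  # toy daily returns
synthetic = circular_block_bootstrap(returns, n_samples=2500, block_len=20)
print(synthetic[:5])
```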
  12. By: George Fatouros; Kostas Metaxas; John Soldatos; Manos Karathanassis
    Abstract: MarketSenseAI is a novel framework for holistic stock analysis which leverages Large Language Models (LLMs) to process financial news, historical prices, company fundamentals and the macroeconomic environment to support decision making in stock analysis and selection. In this paper, we present the latest advancements on MarketSenseAI, driven by rapid technological expansion in LLMs. Through a novel architecture combining Retrieval-Augmented Generation and LLM agents, the framework processes SEC filings and earnings calls, while enriching macroeconomic analysis through systematic processing of diverse institutional reports. We demonstrate a significant improvement in fundamental analysis accuracy over the previous version. Empirical evaluation on S&P 100 stocks over two years (2023-2024) shows MarketSenseAI achieving cumulative returns of 125.9% compared to the index return of 73.5%, while maintaining comparable risk profiles. Further validation on S&P 500 stocks during 2024 demonstrates the framework's scalability, delivering a 33.8% higher Sortino ratio than the market. This work marks a significant advancement in applying LLM technology to financial analysis, offering insights into the robustness of LLM-driven investment strategies.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.00415
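    The headline evaluation quantities above (cumulative return and the Sortino ratio) follow common conventions; one such convention is sketched below on toy data. This is not MarketSenseAI's evaluation code, and downside-deviation definitions vary across sources.
```python
# Cumulative return and annualised Sortino ratio from a daily return series.
# One common convention; not MarketSenseAI's evaluation code.
import numpy as np

def cumulative_return(daily_returns):
    return np.prod(1 + daily_returns) - 1

def sortino_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    excess = daily_returns - risk_free_daily
    downside = excess[excess < 0]
    downside_dev = np.sqrt(np.mean(downside ** 2)) if len(downside) else np.nan
    return np.sqrt(periods_per_year) * excess.mean() / downside_dev

r = np.random.default_rng(2).normal(0.0008, 0.012, size=504)  # toy two-year daily series
print(f"cumulative return: {cumulative_return(r):.1%}, Sortino: {sortino_ratio(r):.2f}")
```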
  13. By: Chowdhury, Koushik
    Abstract: Collaborative learning is a method of education in which a group of learners solves a particular task. A collaborative setting encourages learners to take a more active role in knowledge construction. However, when they communicate on a virtual platform such as a chat platform, it is important that they can refer to each other correctly so that they can improve their learning activities with the help of each other, but learners can be sidetracked, which retards their learning progress. To address this issue, this thesis applied text classification approaches to regularize the conversation between learners so they could refer to each other correctly. The dataset was collected from a focus group experiment designed for students in the Educational Technology Department at Saarland University. The report gives a clear idea of how the collected dataset has been coded and validated with the help of intercoder reliability measurements. After data preprocessing, state-of-the-art data augmentation techniques such as spelling, insertion, substitution, and synonym augmentation are applied. The thesis examines various neural network models to identify the best model for the dataset. Among them, Bidirectional Encoder Representations from Transformers (BERT) provides the best performance with an accuracy of 0.94 and a 0.17 loss value for the augmented preprocessed dataset, where recurrent neural network models tend to overfit. In the evaluation part, a summary of performance metrics is shown, and to evaluate the model, a new dataset with similar data is generated with the help of the OpenAI API Key. The BERT model is able to classify 960 responses out of 1005, whereas both recurrent neural network models classify fewer than 200. The thesis also discussed the issue of model poisoning so that when the model is updated, it can tackle the unclassified responses. Finally, a simple demo of how this BERT model is used to regularize the discourse of two collaborative learners is presented with the help of the Jupyter interface.
    Date: 2023–05–11
    URL: https://d.repec.org/n?u=RePEc:osf:thesis:hjk4b_v1
  14. By: Alnajjar, Khalid; Hämäläinen, Mika
    Abstract: This study explored the integration of futures studies into business strategy, focusing on the development of a novel theoretical framework and computational methods for forecasting future operational environments. Recognizing the critical role of anticipating technological paradigm shifts, as evidenced by the downfall of companies such as Blockbuster, Palm and Nokia, we proposed a new framework called MLPESTEL or Multilayer PESTEL. The framework combines PESTEL analysis with Bronfenbrenner’s Ecological Systems Theory. This amalgamation aims to provide a more holistic understanding of a company's operational environment, extending from macro to micro levels. However, adapting Bronfenbrenner’s model, originally focused on children's social development, to a business context presents a unique challenge. Our methodology involved employing advanced AI tools, specifically large language models (LLMs), to analyze and predict changes in various business environments. This approach marks a significant shift from traditional AI applications, which predominantly rely on numerical data, to leveraging LLMs for textual data analysis. Our goal was not to focus on specific companies but to develop and validate generic models applicable across different organizational contexts. By analyzing forecasts for several existing companies, we aimed to validate our model's reliability.
    Date: 2024–10–29
    URL: https://d.repec.org/n?u=RePEc:osf:thesis:qz8hk_v1
  15. By: Felix Drinkall; Janet B. Pierrehumbert; Stefan Zohren
    Abstract: Large language models (LLMs) have shown remarkable success in language modelling due to scaling laws found in model size and the hidden dimension of the model's text representation. Yet, we demonstrate that compressed representations of text can yield better performance in LLM-based regression tasks. In this paper, we compare the relative performance of embedding compression in three different signal-to-noise contexts: financial return prediction, writing quality assessment and review scoring. Our results show that compressing embeddings, in a minimally supervised manner using an autoencoder's hidden representation, can mitigate overfitting and improve performance on noisy tasks, such as financial return prediction; but that compression reduces performance on tasks that have high causal dependencies between the input and target data. Our results suggest that the success of interpretable compressed representations such as sentiment may be due to a regularising effect.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.02199
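    The minimally supervised compression step can be sketched as a small autoencoder whose bottleneck codes feed a linear regressor. The dimensions, synthetic data, and hyperparameters below are placeholders rather than the paper's configuration.
```python
# Compress embeddings with an autoencoder bottleneck, then regress on the compressed codes.
# Illustrative sketch; architecture and hyperparameters are assumptions, not the paper's setup.
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768)).astype("float32")             # stand-in for LLM embeddings
y = X[:, :5].sum(axis=1) + rng.normal(scale=5.0, size=1000)    # noisy regression target

class AutoEncoder(nn.Module):
    def __init__(self, dim_in=768, dim_code=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, 256), nn.ReLU(), nn.Linear(256, dim_code))
        self.dec = nn.Sequential(nn.Linear(dim_code, 256), nn.ReLU(), nn.Linear(256, dim_in))
    def forward(self, x):
        return self.dec(self.enc(x))

ae = AutoEncoder()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
xt = torch.from_numpy(X)
for _ in range(200):                       # reconstruction-only (minimally supervised) training
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(xt), xt)
    loss.backward()
    opt.step()

codes = ae.enc(xt).detach().numpy()        # compressed representation
print("Ridge on compressed codes, in-sample R2:", Ridge().fit(codes, y).score(codes, y))
```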
  16. By: Chen Ziyi; Gu Jia-wen
    Abstract: We study the expected utility maximization problem with a constant relative risk aversion utility function in a complete market under the reinforcement learning framework. To induce exploration, we introduce the Tsallis entropy regularizer, which generalizes the commonly used Shannon entropy. Unlike the classical Merton problem, which is always well-posed and admits closed-form solutions, we find that the exploratory utility maximization problem is ill-posed in certain cases, due to over-exploration. With a carefully selected primary temperature function, we investigate two specific examples, for which we fully characterize their well-posedness and provide semi-closed-form solutions. It is interesting to find that one example has the well-known Gaussian distribution as the optimal strategy, while the other features the rare Wigner semicircle distribution, which is equivalent to a scaled Beta distribution. The means of the two optimal exploratory policies coincide with that of the classical counterpart. In addition, we examine the convergence of the value function and optimal exploratory strategy as the exploration vanishes. Finally, we design a reinforcement learning algorithm and conduct numerical experiments to demonstrate the advantages of reinforcement learning.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.01269
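    For reference, the Tsallis entropy of a density π with index q generalizes the Shannon entropy, which is recovered in the limit q → 1; the paper's exact normalization and temperature weighting may differ from this standard form.
```latex
% Standard Tsallis entropy of a density \pi with index q > 0, q \neq 1;
% the Shannon (differential) entropy is recovered in the limit q \to 1.
\[
  S_q(\pi) \;=\; \frac{1}{q-1}\left(1 - \int \pi(a)^{q}\, da\right),
  \qquad
  \lim_{q \to 1} S_q(\pi) \;=\; -\int \pi(a)\,\log \pi(a)\, da .
\]
```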
  17. By: Iuliia Alekseenko; Dmitry Dagaev; Sofia Paklina; Petr Parshakov
    Abstract: A Keynesian beauty contest is a wide class of games of guessing the most popular strategy among other players. In particular, guessing a fraction of a mean of numbers chosen by all players is a classic behavioral experiment designed to test iterative reasoning patterns among various groups of people. The previous literature reveals that the level of sophistication of the opponents is an important factor affecting the outcome of the game. Smarter decision makers choose strategies that are closer to the theoretical Nash equilibrium and demonstrate faster convergence to equilibrium in iterated contests with information revelation. We replicate a series of classic experiments by running virtual experiments with modern large language models (LLMs) that play against various groups of virtual players. We test how advanced the LLMs' behavior is compared to the behavior of human players. We show that LLMs typically take into account the opponents' level of sophistication and adapt by changing the strategy. In various settings, most LLMs (with the exception of Llama) are more sophisticated and play lower numbers compared to human players. Our results suggest that LLMs (except Llama) are rather successful in identifying the underlying strategic environment and adapting their strategies to the changing set of parameters of the game in the same way that human players do. All LLMs still fail to play dominant strategies in a two-player game. Our results contribute to the discussion on the accuracy of modeling human economic agents by artificial intelligence.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.03158
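    The underlying game is simple: every player picks a number in [0, 100] and the winner is closest to a fraction p of the average (classically p = 2/3), so iterated best responses shrink towards the Nash equilibrium of 0. The sketch below walks that level-k ladder; it illustrates the game itself, not the experimental protocol.
```python
# Level-k reasoning in a p-beauty contest (guess p times the average, p = 2/3).
# Level 0 guesses the midpoint 50; level k best-responds to a population at level k-1.
# Illustrative sketch of the game the LLMs play, not the experimental protocol.
p = 2 / 3
guess = 50.0                      # level-0 anchor
for k in range(1, 8):
    guess = p * guess             # best response if everyone else plays the previous level
    print(f"level {k}: guess {guess:.2f}")
# As k grows the guess approaches 0, the unique Nash equilibrium of the game.
```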
  18. By: Lionel Fontagné (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, PSE - Paris School of Economics - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - ENPC - École nationale des ponts et chaussées - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement); Francesca Micocci (IMT - School for Advanced Studies Lucca); Armando Rungi (IMT - School for Advanced Studies Lucca)
    Abstract: This paper introduces a causal machine learning approach to investigate the impact of the EU-Canada Comprehensive Economic Trade Agreement (CETA). We propose a matrix completion algorithm on French customs data to obtain multidimensional counterfactuals at the firm, product and destination levels. We find a small but significant positive impact on average at the product-level intensive margin. On the other hand, the extensive margin shows product churning due to the treaty beyond regular entry-exit dynamics: one product in eight that was not previously exported substitutes almost as many that are no longer exported. When we delve into the heterogeneity, we find that the effects of the treaty are higher for products at a comparative advantage. Focusing on multiproduct firms, we find that they adjust their portfolio in Canada by reallocating towards their first and most exported product due to increasing local market competition after trade liberalization. Finally, multidimensional counterfactuals allow us to evaluate the general equilibrium effect of the CETA. Specifically, we observe trade diversion, as exports to other destinations are re-directed to Canada.
    Keywords: Free Trade Agreements, International Trade, Causal Inference, Machine Learning, Matrix Completion
    Date: 2025–01
    URL: https://d.repec.org/n?u=RePEc:hal:cesptp:halshs-04913313
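    Matrix completion in this setting treats the trade-flow panel as a partially observed low-rank matrix and imputes the counterfactual entries from its low-rank structure. The sketch below implements a generic soft-impute iteration on toy data; it is not the authors' estimator.
```python
# Soft-impute matrix completion: iteratively fill missing entries with a
# soft-thresholded SVD reconstruction. Generic sketch, not the paper's estimator.
import numpy as np

def soft_impute(M, mask, lam=1.0, n_iter=100):
    """M: matrix (arbitrary values where mask is False); mask: True = observed."""
    X = np.where(mask, M, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s = np.maximum(s - lam, 0.0)                 # soft-threshold singular values
        low_rank = (U * s) @ Vt
        X = np.where(mask, M, low_rank)              # keep observed entries, impute the rest
    return np.where(mask, M, low_rank)

rng = np.random.default_rng(0)
true = rng.normal(size=(30, 4)) @ rng.normal(size=(4, 20))   # rank-4 "trade flow" matrix
mask = rng.random(true.shape) > 0.3                          # roughly 70% observed
est = soft_impute(true, mask, lam=0.5)
print("imputation RMSE:", np.sqrt(np.mean((est[~mask] - true[~mask]) ** 2)))
```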
  19. By: Robin Jarry (LIRMM | ICAR - Image & Interaction - LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier - CNRS - Centre National de la Recherche Scientifique - UM - Université de Montpellier); Marc Chaumont (UNIMES - Nîmes Université, LIRMM | ICAR - Image & Interaction - LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier - CNRS - Centre National de la Recherche Scientifique - UM - Université de Montpellier); Laure Berti-Equille (IRD - Institut de Recherche pour le Développement, UMR 228 Espace-Dev, Espace pour le développement - IRD - Institut de Recherche pour le Développement - UPVD - Université de Perpignan Via Domitia - AU - Avignon Université - UR - Université de La Réunion - UNC - Université de la Nouvelle-Calédonie - UG - Université de Guyane - UA - Université des Antilles - UM - Université de Montpellier); Gérard Subsol (LIRMM | ICAR - Image & Interaction - LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier - CNRS - Centre National de la Recherche Scientifique - UM - Université de Montpellier)
    Abstract: Monitoring local socio-economic variations is essential for tracking progress toward sustainable development goals. However, measuring these variations can be challenging, as it requires data collection at least twice, which is both expensive and time-consuming. To address this issue, researchers have proposed remote sensing and deep learning methods to predict socio-economic indicators. However, subtracting two predicted socio-economic indicators from different dates leads to inaccurate results. We propose a novel method for predicting socio-economic variations using satellite image time series to achieve more reliable predictions. Our method leverages both spatial and temporal information to enhance the final prediction. In our experiments, we observed that it outperforms state-of-the-art methods.
    Keywords: Remote Sensing, Image Time Series, Deep Learning, Transformer, Socio-economic indicator
    Date: 2024–11–25
    URL: https://d.repec.org/n?u=RePEc:hal:journl:lirmm-04895134
  20. By: de Avila, Rogerio
    Abstract: The increased availability of administrative data and big data, coupled with advances in causal modeling and data analytics, presents new opportunities to enhance program evaluation in public policy and social sciences. This thesis investigates how these modern theory-driven approaches can be integrated with traditional methodologies to address complex causal questions, enhancing evaluations' effectiveness, timeliness, and comprehensiveness. Guided by substantial theoretical frameworks such as those proposed by Funnell and Rogers (2011) and empirical studies like Pearl (2009), this research addresses gaps in data utilization, ethical standards, and the application of machine learning. Specific challenges include improving the precision and comprehensiveness of data analysis, ensuring ethical data use as advocated by frameworks like the Five Safes, and enhancing interdisciplinary collaboration and training. This thesis aims to demonstrate significant advancements in program evaluation by bridging these gaps, proposing a paradigm shift towards a more integrated and data-informed approach in public policy and social sciences.
    Date: 2024–11–07
    URL: https://d.repec.org/n?u=RePEc:osf:thesis:z7der_v1
  21. By: Li-Chun Huang
    Abstract: In this study, we applied the NEAT (NeuroEvolution of Augmenting Topologies) algorithm to stock trading using multiple technical indicators. Our approach focused on maximizing earnings, avoiding risk, and outperforming the Buy & Hold strategy. We used progressive training data and a multi-objective fitness function to guide the evolution of the population towards these objectives. The results of our study showed that the NEAT model achieved similar returns to the Buy & Hold strategy, but with lower risk exposure and greater stability. We also identified some challenges in the training process, including the presence of a large number of unused nodes and connections in the model architecture. In future work, it may be worthwhile to explore ways to improve the NEAT algorithm and apply it to shorter interval data in order to assess the potential impact on performance.
    Date: 2024–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2501.14736
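    The objectives named above (earnings, risk, performance relative to Buy & Hold) are typically folded into a single fitness value in NEAT-style evolution. The sketch below shows one possible weighting; both the weights and the functional form are assumptions, not the paper's fitness function.
```python
# A possible multi-objective fitness for an evolved trading strategy:
# reward excess return over Buy & Hold, penalise maximum drawdown and volatility.
# Weights and functional form are illustrative assumptions, not the paper's fitness.
import numpy as np

def max_drawdown(equity_curve):
    peaks = np.maximum.accumulate(equity_curve)
    return np.max((peaks - equity_curve) / peaks)

def fitness(strategy_returns, buy_hold_returns, w_dd=2.0, w_vol=1.0):
    strat_equity = np.cumprod(1 + strategy_returns)
    bh_equity = np.cumprod(1 + buy_hold_returns)
    excess = strat_equity[-1] - bh_equity[-1]
    return excess - w_dd * max_drawdown(strat_equity) - w_vol * np.std(strategy_returns)

rng = np.random.default_rng(3)
bh = rng.normal(0.0005, 0.02, size=252)                   # toy Buy & Hold daily returns
strat = 0.6 * bh + rng.normal(0.0003, 0.008, size=252)    # toy lower-risk strategy
print("fitness:", round(fitness(strat, bh), 4))
```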
  22. By: Victor Chernozhukov; Mert Demirer; Esther Duflo; Iván Fernández-Val
    Abstract: We warmly thank Kosuke Imai, Michael Lingzhi Li, and Stefan Wager for their gracious and insightful comments. We are particularly encouraged that both pieces recognize the importance of the research agenda the lecture laid out, which we see as critical for applied researchers. It is also great to see that both underscore the potential of the basic approach we propose - targeting summary features of the CATE after proxy estimation with sample splitting. We are also happy that both papers push us (and the reader) to continue thinking about the inference problem associated with sample splitting. We recognize that our current paper is only scratching the surface of this interesting agenda. Our proposal is certainly not the only option, and it is exciting that both papers provide and assess alternatives. Hopefully, this will generate even more work in this area.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.01548
  23. By: Ying, Xiangji; Vorland, Colby J.; Qureshi, Riaz; Brown, Andrew William (Indiana University School of Public Health-Bloomington); Kilicoglu, Halil; Saldanha, Ian; DeVito, Nicholas J; Mayo-Wilson, Evan
    Abstract: Background: Selective non-reporting of studies and study results undermines trust in randomized controlled trials (RCTs). Changes to clinical trial outcomes are sometimes associated with bias. Manually comparing trial documents to identify changes in trial outcomes is time consuming. Objective: This study aims to assess the capacity of the Generative Pretrained Transformer 4 (GPT-4) large language model in detecting and describing changes in trial outcomes within ClinicalTrials.gov records. Methods: We will first prompt GPT-4 to define trial outcomes using five elements (i.e., domain, specific measurement, specific metric, method of aggregation, and time point). We will then prompt GPT-4 to identify outcome changes between the prospective versions of registrations and the most recent versions of registrations. We will use a random sample of 150 RCTs (~1,500 outcomes) registered on ClinicalTrials.gov. We will include “Completed” trials categorized as “Phase 3” or “Not Applicable” and with results posted on ClinicalTrials.gov. Two independent raters will rate GPT-4’s judgements, and we will assess GPT-4’s accuracy and reliability. We will also explore the heterogeneity in GPT-4’s performance by the year of trial registration and trial type (i.e., applicable clinical trials, NIH-funded trials, and other trials). Discussion: We aim to develop methods that could assist systematic reviewers, peer reviewers, journal editors, and readers in monitoring changes in clinical trial outcomes, streamlining the review process, and improving transparency and reliability of clinical trial reporting.
    Date: 2024–02–29
    URL: https://d.repec.org/n?u=RePEc:osf:metaar:npvwr_v1

This nep-cmp issue is ©2025 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.