nep-rmg New Economics Papers
on Risk Management
Issue of 2025–03–24
twenty-two papers chosen by
Stan Miles, Thompson Rivers University


  1. Adaptive Nesterov Accelerated Distributional Deep Hedging for Efficient Volatility Risk Management By Lei Zhao; Lin Cai; Wu-Sheng Lu
  2. A deep BSDE approach for the simultaneous pricing and delta-gamma hedging of large portfolios consisting of high-dimensional multi-asset Bermudan options By Balint Negyesi; Cornelis W. Oosterlee
  3. Risk Measures for DC Pension Plan Decumulation By Peter A. Forsyth; Yuying Li
  4. Housing in the Greater Paris Area as an Inflation Hedge? By Yasmine Zouari; Aya Nasreddine
  5. Analyzing Risk Exposure Determinants in European Banking: A Regulatory Perspective By Arnone, Massimo; Costantiello, Alberto; Leogrande, Angelo
  6. Robust and Efficient Deep Hedging via Linearized Objective Neural Network By Lei Zhao; Lin Cai
  7. Handling model risk with XVAs By Cyril Bénézet; Stéphane Crépey
  8. Robust Optimization of Rank-Dependent Models with Uncertain Probabilities By Guanyu Jin; Roger J. A. Laeven; Dick den Hertog
  9. Unleashing the Potential of Large Language Models in the Finance Industry By Lee, Heungmin
  10. Modelling the term-structure of default risk under IFRS 9 within a multistate regression framework By Arno Botha; Tanja Verster; Roland Breedt
  11. Analysis of Optimal Portofolio Formation Using Markowitz Model and Portofolio Performance Evaluation By Purdanto, Andisyah
  12. Scaling Limits for Exponential Hedging in the Brownian Framework By Yan Dolinsky; Xin Zhang
  13. Enhancing Portfolio Rebalancing Efficiency Using Binomial Distribution: A Case Study of Beating the Nifty Index with good CAGR By Chaudhari, Saurav L.
  14. Bankruptcy analysis using images and convolutional neural networks (CNN) By Luiz Tavares; Jose Mazzon; Francisco Paletta; Fabio Barros
  15. From Offer to Close: A Machine Learning Approach to Forecast Real Estate Transaction Outcomes By Zhao, Yu
  16. Multi-Layer Deep xVA: Structural Credit Models, Measure Changes and Convergence Analysis By Kristoffer Andersson; Alessandro Gnoatto
  17. Utilizing Effective Dynamic Graph Learning to Shield Financial Stability from Risk Propagation By Guanyuan Yu; Qing Li; Yu Zhao; Jun Wang; YiJun Chen; Shaolei Chen
  18. A Method for Evaluating the Interpretability of Machine Learning Models in Predicting Bond Default Risk Based on LIME and SHAP By Yan Zhang; Lin Chen; Yixiang Tian
  19. Gradients can train reward models: An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model By Enoch H. Kang; Hema Yoganarasimhan; Lalit Jain
  20. Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies By Zheli Xiong
  21. Capital, Performance, and Regulation: An In-depth Analysis of Private Equity's Evolution By Bonthala, Ram; Purohit, Advaith; Haile, Dagim; Munipalle, Pravith; Krishnan, Pranav
  22. HedgeAgents: A Balanced-aware Multi-agent Financial Trading System By Xiangyu Li; Yawen Zeng; Xiaofen Xing; Jin Xu; Xiangmin Xu

  1. By: Lei Zhao; Lin Cai; Wu-Sheng Lu
    Abstract: In the field of financial derivatives trading, managing volatility risk is crucial for protecting investment portfolios from market changes. Traditional Vega hedging strategies, which often rely on basic, rule-based models, adapt poorly to rapidly changing market conditions. We introduce a new framework for dynamic Vega hedging, the Adaptive Nesterov Accelerated Distributional Deep Hedging (ANADDH), which combines distributional reinforcement learning with a tailored design based on adaptive Nesterov acceleration. This approach improves the learning process in complex financial environments by modeling the hedging efficiency distribution, providing a more accurate and responsive hedging strategy. The design of adaptive Nesterov acceleration refines gradient momentum adjustments, significantly enhancing the stability and speed of convergence of the model. Through empirical analysis and comparisons, our method demonstrates substantial performance gains over existing hedging techniques. Our results confirm that this innovative combination of distributional reinforcement learning with the proposed optimization techniques improves financial risk management and highlights the practical benefits of implementing advanced neural network architectures in the finance sector.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.17777
  2. By: Balint Negyesi; Cornelis W. Oosterlee
    Abstract: A deep BSDE approach is presented for the pricing and delta-gamma hedging of high-dimensional Bermudan options, with applications in portfolio risk management. Large portfolios of a mixture of multi-asset European and Bermudan derivatives are cast into the framework of discretely reflected BSDEs. This system is discretized by the One Step Malliavin scheme (Negyesi et al. [2024, 2025]) of discretely reflected Markovian BSDEs, which involves a $\Gamma$ process, corresponding to second-order sensitivities of the associated option prices. The discretized system is solved by a neural network regression Monte Carlo method, efficiently for a large number of underlyings. The resulting option Deltas and Gammas are used to discretely rebalance the corresponding replicating strategies. Numerical experiments are presented on both high-dimensional basket options and large portfolios consisting of multiple options with varying early exercise rights, moneyness and volatility. These examples demonstrate the robustness and accuracy of the method up to $100$ risk factors. The resulting hedging strategies significantly outperform benchmark methods both in the case of standard delta- and delta-gamma hedging.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.11706
  3. By: Peter A. Forsyth; Yuying Li
    Abstract: As the developed world replaces Defined Benefit (DB) pension plans with Defined Contribution (DC) plans, there is a need to develop decumulation strategies for DC plan holders. Optimal decumulation can be viewed as a problem in optimal stochastic control. Formulation as a control problem requires specification of an objective function, which in turn requires a definition of reward and risk. An intuitive specification of reward is the total withdrawals over the retirement period. Most retirees view risk as the possibility of running out of savings. This paper investigates several possible left tail risk measures, in conjunction with DC plan decumulation. The risk measures studied include (i) expected shortfall (ii) linear shortfall and (iii) probability of shortfall. We establish that, under certain assumptions, the set of optimal controls associated with all expected reward and expected shortfall Pareto efficient frontier curves is identical to the set of optimal controls for all expected reward and linear shortfall Pareto efficient frontier curves. Optimal efficient frontiers are determined computationally for each risk measure, based on a parametric market model. Robustness of these strategies is determined by testing the strategies out-of-sample using block bootstrapping of historical data.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.16364
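The three left-tail risk measures named in this abstract can be illustrated on simulated terminal wealth. The following is a minimal sketch, not the authors' implementation: it assumes equally likely wealth samples, takes expected shortfall as the mean of the worst `alpha` fraction of outcomes, and takes linear shortfall as the mean of `min(W - threshold, 0)`. The function names, the sample values, and the threshold are hypothetical.

```python
import math


def expected_shortfall(wealth, alpha):
    """Mean of the worst ceil(alpha * n) wealth outcomes (lower wealth is worse)."""
    ordered = sorted(wealth)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def linear_shortfall(wealth, threshold):
    """Average amount by which wealth falls below the threshold (zero or negative)."""
    return sum(min(w - threshold, 0.0) for w in wealth) / len(wealth)


def probability_of_shortfall(wealth, threshold):
    """Fraction of outcomes that end strictly below the threshold."""
    return sum(1 for w in wealth if w < threshold) / len(wealth)


# Hypothetical terminal-wealth samples, for illustration only.
samples = [-50.0, 0.0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0]

es = expected_shortfall(samples, 0.25)
ls = linear_shortfall(samples, 150.0)
ps = probability_of_shortfall(samples, 150.0)
print(es, ls, ps)
```

Under these conventions a larger (less negative) expected shortfall or linear shortfall means less tail risk, while a smaller probability of shortfall does.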
  4. By: Yasmine Zouari (Métis Lab EM Normandie - EM Normandie - École de Management de Normandie = EM Normandie Business School); Aya Nasreddine (CEROS - Centre d'Etudes et de Recherches sur les Organisations et la Stratégie - UPN - Université Paris Nanterre)
    Abstract: In this article, we use the framework of inflation beta to test the capacity of physical residential real estate to hedge against inflation and its components, and compare it to the inflation-hedging ability of various financial assets. Specifically, the housing asset is represented by the residential market in the communes of the "Grand Paris" metropolis and is tested against the different components of inflation. We start by analyzing the residential market in this area, its fundamentals, characteristics and dynamics. Then, applying hierarchical clustering, we divide the Greater Paris area into five homogeneous groups of communes and test their hedging ability using both correlation and regression analysis. Residential assets are confirmed to be a hedge against inflation, particularly against its unexpected component, thanks to their capital return rather than their rental return. In contrast, listed real estate does not provide the same hedging properties and thus cannot be considered a substitute for this purpose.
    Keywords: Direct housing, "Grand Paris" metropolis, listed real estate, inflation, hedging ability, asset management
    Date: 2023–07–12
    URL: https://d.repec.org/n?u=RePEc:hal:journl:hal-04956272
  5. By: Arnone, Massimo; Costantiello, Alberto; Leogrande, Angelo
    Abstract: The paper identifies the determinants of the total risk exposure amount (TREA) within the European banking system, with a focus on the importance of TREA within the Basel III regulatory regime. The research integrates an econometric investigation with machine learning techniques to identify the financial variables that influence TREA. The most relevant financial determinants of TREA were identified as LCR, CRWEA, LA, and OREA. These also reflect complex interdependencies; for instance, the negative relationship between TREA and LCR would suggest trade-offs between risk-taking and liquidity management. Likewise, the positive relationship with CRWEA, and even more so with derivatives over assets, underlines intrinsic risks from credit exposures and from the complexity of financial instruments. The paper further argues that there should be mechanisms for appropriate risk-weighting, adequate liquidity buffers, and proper operational controls so that the financial system can become significantly more stable and resilient. Combining insights from traditional econometric modeling with machine learning, this work puts forward actionable recommendations to policy makers, regulators, and financial institutions on mitigating systemic vulnerabilities and optimizing their compliance strategies in an increasingly volatile financial landscape.
    Date: 2025–01–06
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:2u4jb_v1
  6. By: Lei Zhao; Lin Cai
    Abstract: Deep hedging represents a cutting-edge approach to risk management for financial derivatives by leveraging the power of deep learning. However, existing methods often face challenges related to computational inefficiency, sensitivity to noisy data, and optimization complexity, limiting their practical applicability in dynamic and volatile markets. To address these limitations, we propose Deep Hedging with Linearized-objective Neural Network (DHLNN), a robust and generalizable framework that enhances the training procedure of deep learning models. By integrating a periodic fixed-gradient optimization method with linearized training dynamics, DHLNN stabilizes the training process, accelerates convergence, and improves robustness to noisy financial data. The framework incorporates trajectory-wide optimization and Black-Scholes Delta anchoring, ensuring alignment with established financial theory while maintaining flexibility to adapt to real-world market conditions. Extensive experiments on synthetic and real market data validate the effectiveness of DHLNN, demonstrating its ability to achieve faster convergence, improved stability, and superior hedging performance across diverse market scenarios.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.17757
  7. By: Cyril Bénézet (LaMME - Laboratoire de Mathématiques et Modélisation d'Evry - ENSIIE - Ecole Nationale Supérieure d'Informatique pour l'Industrie et l'Entreprise - UEVE - Université d'Évry-Val-d'Essonne - Université Paris-Saclay - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, ENSIIE - Ecole Nationale Supérieure d'Informatique pour l'Industrie et l'Entreprise); Stéphane Crépey (LPSM (UMR_8001) - Laboratoire de Probabilités, Statistique et Modélisation - SU - Sorbonne Université - CNRS - Centre National de la Recherche Scientifique - UPCité - Université Paris Cité, UPCité - Université Paris Cité)
    Abstract: In this paper we revisit Burnett (2021) & Burnett and Williams (2021)'s notion of hedging valuation adjustment (HVA), originally intended to deal with dynamic hedging frictions such as transaction costs, in the direction of model risk. The corresponding HVA reconciles a global fair valuation model with the local models used by the different desks of the bank. Model risk and dynamic hedging frictions indeed deserve a reserve, but a risk-adjusted one, so not only an HVA, but also a contribution to the KVA of the bank. The orders of magnitude of the effects involved suggest that local models should not so much be managed via reserves, as excluded altogether.
    Keywords: Pricing models, Model risk, Calibration, Market risk, Counterparty credit risk, Transaction Costs, Cross Valuation Adjustments (XVAs)
    Date: 2024
    URL: https://d.repec.org/n?u=RePEc:hal:journl:hal-03675291
  8. By: Guanyu Jin; Roger J. A. Laeven; Dick den Hertog
    Abstract: This paper studies distributionally robust optimization for a large class of risk measures with ambiguity sets defined by $\phi$-divergences. The risk measures are allowed to be non-linear in probabilities, are represented by a Choquet integral possibly induced by a probability weighting function, and include many well-known examples (for example, CVaR, Mean-Median Deviation, Gini-type). Optimization for this class of robust risk measures is challenging due to their rank-dependent nature. We show that for many types of probability weighting functions including concave, convex and inverse $S$-shaped, the robust optimization problem can be reformulated into a rank-independent problem. In the case of a concave probability weighting function, the problem can be further reformulated into a convex optimization problem with finitely many constraints that admits explicit conic representability for a collection of canonical examples. While the number of constraints in general scales exponentially with the dimension of the state space, we circumvent this dimensionality curse and provide two types of upper and lower bounds algorithms. They yield tight upper and lower bounds on the exact optimal value and are formally shown to converge asymptotically. This is illustrated numerically in two examples given by a robust newsvendor problem and a robust portfolio choice problem.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.11780
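To make the notion of a rank-dependent risk measure concrete, the sketch below evaluates a discrete Choquet-type integral: losses are sorted from worst to best, and each one receives a decision weight equal to the increment of a probability weighting function applied to the tail probability. This is an illustration under stated assumptions (equally likely scenarios, losses where larger is worse), not the paper's robust formulation, and it does not model the $\phi$-divergence ambiguity set. All names and numbers are hypothetical.

```python
def rank_dependent_value(losses, weighting):
    """Choquet-style evaluation of equally likely losses under a weighting function."""
    ordered = sorted(losses, reverse=True)  # worst (largest) loss first
    n = len(ordered)
    total = 0.0
    for rank, loss in enumerate(ordered, start=1):
        # Decision weight: increment of the distorted tail probability at this rank.
        weight = weighting(rank / n) - weighting((rank - 1) / n)
        total += weight * loss
    return total


def cvar_weighting(tail):
    """Concave weighting h(p) = min(p / tail, 1), which yields CVaR over the worst `tail` fraction."""
    return lambda p: min(p / tail, 1.0)


# Hypothetical loss scenarios, for illustration only.
losses = [8.0, 1.0, 4.0, 2.0, 10.0, 3.0, 0.0, 5.0]

cvar = rank_dependent_value(losses, cvar_weighting(0.25))
mean = rank_dependent_value(losses, lambda p: p)  # identity weighting gives the plain mean
print(cvar, mean)
```

Because the weights depend on the ordering of the outcomes, the value is not linear in the scenario probabilities, which is the source of the optimization difficulty the abstract describes.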
  9. By: Lee, Heungmin
    Abstract: The rapid advancements in large language models (LLMs) have ushered in a new era of transformative potential for the finance industry. This paper explores the latest developments in the application of LLMs across key areas of the finance domain, highlighting their significant impact and future implications. In the realm of financial analysis and modelling, LLMs have demonstrated the ability to outperform traditional models in tasks such as stock price prediction, portfolio optimization, and risk assessment. By processing vast amounts of financial data and leveraging their natural language understanding capabilities, these models can generate insightful analyses, identify patterns, and provide data-driven recommendations to support decision-making processes. The conversational capabilities of LLMs have also revolutionized the customer service landscape in finance. LLMs can engage in natural language dialogues, addressing customer inquiries, providing personalized financial advice, and even handling complex tasks like loan applications and investment planning. This integration of LLMs into financial institutions has the potential to enhance customer experiences, improve response times, and reduce the workload of human customer service representatives. Furthermore, LLMs are making significant strides in the realm of risk management and compliance. These models can analyze complex legal and regulatory documents, identify potential risks, and suggest appropriate remedial actions. By automating routine compliance tasks, such as anti-money laundering (AML) checks and fraud detection, LLMs can help financial institutions enhance their risk management practices and ensure better compliance, mitigating the risk of costly penalties or reputational damage. 
    As the finance industry continues to embrace the transformative potential of LLMs, it will be crucial to address the challenges surrounding data privacy, algorithmic bias, and the responsible development of these technologies. By navigating these considerations, the finance sector can harness the full capabilities of LLMs to drive innovation, improve efficiency, and ultimately, enhance the overall financial ecosystem.
    Date: 2025–01–03
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:ahkd3_v1
  10. By: Arno Botha; Tanja Verster; Roland Breedt
    Abstract: The lifetime behaviour of loans is notoriously difficult to model, which can compromise a bank's financial reserves against future losses, if modelled poorly. Therefore, we present a data-driven comparative study amongst three techniques in modelling a series of default risk estimates over the lifetime of each loan, i.e., its term-structure. The behaviour of loans can be described using a nonstationary and time-dependent semi-Markov model, though we model its elements using a multistate regression-based approach. As such, the transition probabilities are explicitly modelled as a function of a rich set of input variables, including macroeconomic and loan-level inputs. Our modelling techniques are deliberately chosen in ascending order of complexity: 1) a Markov chain; 2) beta regression; and 3) multinomial logistic regression. Using residential mortgage data, our results show that each successive model outperforms the previous, likely as a result of greater sophistication. This finding required devising a novel suite of simple model diagnostics, which can itself be reused in assessing sampling representativeness and the performance of other modelling techniques. These contributions surely advance the current practice within banking when conducting multistate modelling. Consequently, we believe that the estimation of loss reserves will be more timeous and accurate under IFRS 9.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.14479
  11. By: Purdanto, Andisyah
    Abstract: The year 2020 was a challenging year for investment worldwide, especially for the stock market, because the World Health Organization (WHO) declared Covid-19 a pandemic. As a result, the composite stock price index (IHSG) dropped from 6300 to 3900. High volatility persisted from March 2020 until the end of 2022, and traders took advantage of this by engaging in high-risk short selling. The purpose of this research is to analyze the formation of a Markowitz portfolio and evaluate the performance of portfolios formed from the IDX30 index and the BSE Sensex index, focusing on the period from one month before the rebound, that is, from August 2020 to January 2023. This analysis aims to provide guidance in selecting companies for investment. The research is descriptive: Markowitz modeling is used to obtain an optimal portfolio, which is then evaluated using the Treynor, Sharpe, and Jensen indexes. For the IDX30 index, an optimal portfolio comprising six stocks achieved an expected annual return of 19.97% with a risk level of 8.77%. In contrast, the optimal portfolio for the BSE Sensex index consisted of eight stocks, yielding an expected annual return of 28.4% with a risk level of 4%. Regarding performance, the portfolio formed from the BSE Sensex index outperformed the IDX30 portfolio when assessed using the Sharpe index. However, under the Jensen and Treynor indexes, the optimal portfolio formed from the IDX30 index exhibited superior performance.
    Date: 2024–08–26
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:h2yja_v1
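The three evaluation indexes named in this abstract follow standard textbook definitions, sketched below. The abstract does not report the risk-free rate, betas, or market returns used, so every input value here is hypothetical and chosen only to show how the indexes can rank portfolios differently.

```python
def sharpe_index(portfolio_return, risk_free_rate, volatility):
    """Excess return per unit of total risk (standard deviation)."""
    return (portfolio_return - risk_free_rate) / volatility


def treynor_index(portfolio_return, risk_free_rate, beta):
    """Excess return per unit of systematic risk (beta)."""
    return (portfolio_return - risk_free_rate) / beta


def jensen_alpha(portfolio_return, risk_free_rate, beta, market_return):
    """Return in excess of the CAPM-implied return for the portfolio's beta."""
    capm_return = risk_free_rate + beta * (market_return - risk_free_rate)
    return portfolio_return - capm_return


# Hypothetical annual figures, for illustration only.
sharpe = sharpe_index(0.12, 0.04, 0.16)
treynor = treynor_index(0.12, 0.04, 0.8)
alpha = jensen_alpha(0.12, 0.04, 0.8, 0.10)
print(sharpe, treynor, alpha)
```

Sharpe divides by total volatility while Treynor and Jensen depend on beta, so a low-volatility portfolio can lead on Sharpe and still trail on the other two, as the abstract reports.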
  12. By: Yan Dolinsky; Xin Zhang
    Abstract: In this paper, we consider scaling limits of exponential utility indifference prices for European contingent claims in the Bachelier model. We show that the scaling limit can be represented in terms of the specific relative entropy, and in addition we construct asymptotic optimal hedging strategies. To prove the upper bound for the limit, we formulate the dual problem as a stochastic control, and show there exists a classical solution to its HJB equation. The proof for the lower bound relies on the duality result for exponential hedging in discrete time.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.17186
  13. By: Chaudhari, Saurav L. (HTNP Industries)
    Abstract: This paper explores the application of the Binomial Distribution Theorem in optimizing portfolio rebalancing strategies to outperform the Nifty Index. A model based on the Binomial distribution is proposed for identifying entry and exit points in stocks, aiming for a 30% Compound Annual Growth Rate (CAGR). Our empirical analysis demonstrates that by systematically applying this technique, portfolio managers can significantly enhance returns while maintaining risk levels comparable to the benchmark index. This method shows potential for outperforming traditional rebalancing strategies. Extensions of this theorem, including Monte Carlo simulations and Black-Scholes adjustments, are incorporated to further refine the model and enhance its effectiveness.
    Date: 2024–10–23
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:u5q97_v1
  14. By: Luiz Tavares; Jose Mazzon; Francisco Paletta; Fabio Barros
    Abstract: The marketing departments of financial institutions strive to craft products and services that cater to the diverse needs of businesses of all sizes. However, it is evident upon analysis that larger corporations often receive a more substantial portion of available funds. This disparity arises from the relative ease of assessing the risk of default and bankruptcy in these more prominent companies. Historically, risk analysis studies have focused on data from publicly traded or stock exchange-listed companies, leaving a gap in knowledge about small and medium-sized enterprises (SMEs). Addressing this gap, this study introduces a method for evaluating SMEs by generating images for processing via a convolutional neural network (CNN). To this end, more than 10,000 images, one for each company in the sample, were created to identify scenarios in which the CNN can operate with higher assertiveness and reduced training error probability. The findings demonstrate a significant predictive capacity, achieving 97.8% accuracy, when a substantial number of images are utilized. Moreover, the image creation method paves the way for potential applications of this technique in various sectors and for different analytical purposes.
    Date: 2025–01
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.15726
  15. By: Zhao, Yu
    Abstract: Accurately forecasting whether a real estate transaction will close is crucial for agents, lenders, and investors, impacting resource allocation, risk management, and client satisfaction. This task, however, is complex due to a combination of economic, procedural, and behavioral factors that influence transaction outcomes. Traditional machine learning approaches, particularly gradient boosting models like Gradient Boost Decision Tree, have proven effective for tabular data, outperforming deep learning models on structured datasets. However, recent advances in attention-based deep learning models present new opportunities to capture temporal dependencies and complex interactions within transaction data, potentially enhancing prediction accuracy. This article explores the challenges of forecasting real estate transaction closures, compares the performance of machine learning models, and examines how attention-based models can improve predictive insights in this critical area of real estate analytics.
    Date: 2024–11–08
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:sxmq2_v1
  16. By: Kristoffer Andersson; Alessandro Gnoatto
    Abstract: We propose a structural default model for portfolio-wide valuation adjustments (xVAs) and represent it as a system of coupled backward stochastic differential equations. The framework is divided into four layers, each capturing a key component: (i) clean values, (ii) initial margin and Collateral Valuation Adjustment (ColVA), (iii) Credit/Debit Valuation Adjustments (CVA/DVA) together with Margin Valuation Adjustment (MVA), and (iv) Funding Valuation Adjustment (FVA). Because these layers depend on one another through collateral and default effects, a naive Monte Carlo approach would require deeply nested simulations, making the problem computationally intractable. To address this challenge, we use an iterative deep BSDE approach, handling each layer sequentially so that earlier outputs serve as inputs to the subsequent layers. Initial margin is computed via deep quantile regression to reflect margin requirements over the Margin Period of Risk. We also adopt a change-of-measure method that highlights rare but significant defaults of the bank or counterparty, ensuring that these events are accurately captured in the training process. We further extend Han and Long's (2020) a posteriori error analysis to BSDEs on bounded domains. Due to the random exit from the domain, we obtain an order of convergence of $\mathcal{O}(h^{1/4-\epsilon})$ rather than the usual $\mathcal{O}(h^{1/2})$. Numerical experiments illustrate that this method drastically reduces computational demands and successfully scales to high-dimensional, non-symmetric portfolios. The results confirm its effectiveness and accuracy, offering a practical alternative to nested Monte Carlo simulations in multi-counterparty xVA analyses.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.14766
  17. By: Guanyuan Yu; Qing Li; Yu Zhao; Jun Wang; YiJun Chen; Shaolei Chen
    Abstract: Financial risks can propagate across both tightly coupled temporal and spatial dimensions, posing significant threats to financial stability. Moreover, risks embedded in unlabeled data are often difficult to detect. To address these challenges, we introduce GraphShield, a novel approach with three key innovations: (i) Enhanced Cross-Domain Information Learning: We propose a dynamic graph learning module to improve information learning across temporal and spatial domains. (ii) Advanced Risk Recognition: By leveraging the clustering characteristics of risks, we construct a risk recognizing module to enhance the identification of hidden threats. (iii) Risk Propagation Visualization: We provide a visualization tool for quantifying and validating nodes that trigger widespread cascading risks. Extensive experiments on two real-world and two open-source datasets demonstrate the robust performance of our framework. Our approach represents a significant advancement in leveraging artificial intelligence to enhance financial stability, offering a powerful solution to mitigate the spread of risks within financial networks.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.13979
  18. By: Yan Zhang; Lin Chen; Yixiang Tian
    Abstract: Interpretability analysis methods for artificial intelligence models, such as LIME and SHAP, are widely used, though they primarily serve as post-model tools for analyzing model outputs. While it is commonly believed that the transparency and interpretability of AI models diminish as their complexity increases, currently there is no standardized method for assessing the inherent interpretability of the models themselves. This paper uses bond market default prediction as a case study, applying commonly used machine learning algorithms within AI models. First, the classification performance of these algorithms in default prediction is evaluated. Then, leveraging LIME and SHAP to assess the contribution of sample features to prediction outcomes, the paper proposes a novel method for evaluating the interpretability of the models themselves. The results of this analysis are consistent with the intuitive understanding and logical expectations regarding the interpretability of these models.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.19615
  19. By: Enoch H. Kang; Hema Yoganarasimhan; Lalit Jain
    Abstract: We study the problem of estimating Dynamic Discrete Choice (DDC) models, also known as offline Maximum Entropy-Regularized Inverse Reinforcement Learning (offline MaxEnt-IRL) in machine learning. The objective is to recover reward or $Q^*$ functions that govern agent behavior from offline behavior data. In this paper, we propose a globally convergent gradient-based method for solving these problems without the restrictive assumption of linearly parameterized rewards. The novelty of our approach lies in introducing the Empirical Risk Minimization (ERM) based IRL/DDC framework, which circumvents the need for explicit state transition probability estimation in the Bellman equation. Furthermore, our method is compatible with non-parametric estimation techniques such as neural networks. Therefore, the proposed method has the potential to be scaled to high-dimensional, infinite state spaces. A key theoretical insight underlying our approach is that the Bellman residual satisfies the Polyak-Lojasiewicz (PL) condition -- a property that, while weaker than strong convexity, is sufficient to ensure fast global convergence guarantees. Through a series of synthetic experiments, we demonstrate that our approach consistently outperforms benchmark methods and state-of-the-art alternatives.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.14131
  20. By: Zheli Xiong
    Abstract: This paper presents a comprehensive study on the use of ensemble Reinforcement Learning (RL) models in financial trading strategies, leveraging classifier models to enhance performance. By combining RL algorithms such as A2C, PPO, and SAC with traditional classifiers like Support Vector Machines (SVM), Decision Trees, and Logistic Regression, we investigate how different classifier groups can be integrated to improve risk-return trade-offs. The study evaluates the effectiveness of various ensemble methods, comparing them with individual RL models across key financial metrics, including Cumulative Returns, Sharpe Ratios (SR), Calmar Ratios, and Maximum Drawdown (MDD). Our results demonstrate that ensemble methods consistently outperform base models in terms of risk-adjusted returns, providing better management of drawdowns and overall stability. However, we identify the sensitivity of ensemble performance to the choice of variance threshold $\tau$, highlighting the importance of dynamic $\tau$ adjustment to achieve optimal performance. This study emphasizes the value of combining RL with classifiers for adaptive decision-making, with implications for financial trading, robotics, and other dynamic environments.
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.17518
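Two of the evaluation metrics listed in this abstract, Maximum Drawdown and the Calmar Ratio, can be sketched from an equity curve as follows. This uses the common definitions (largest peak-to-trough decline as a fraction of the running peak, and annualized return divided by that decline), which the paper may refine. The equity values and the annualized return are hypothetical.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)  # highest value seen so far
        worst = max(worst, (peak - value) / peak)
    return worst


def calmar_ratio(annual_return, equity):
    """Annualized return divided by maximum drawdown over the same period."""
    return annual_return / max_drawdown(equity)


# Hypothetical equity curve and annualized return, for illustration only.
equity_curve = [100.0, 120.0, 90.0, 110.0, 130.0, 104.0]

mdd = max_drawdown(equity_curve)
calmar = calmar_ratio(0.10, equity_curve)
print(mdd, calmar)
```

Note that `calmar_ratio` as written raises a `ZeroDivisionError` for a curve that never declines; a production version would need to handle that case explicitly.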
  21. By: Bonthala, Ram; Purohit, Advaith; Haile, Dagim; Munipalle, Pravith; Krishnan, Pranav
    Abstract: This paper explores the evolution of private equity (PE), tracing its origins to early investment models and analyzing its modern developments. The focus is on understanding the dynamics of PE performance during downturns, the role of dry powder, and the challenges of regulation and transparency. Additionally, insights from interviews with local private equity professionals shed light on decision-making, risk management, and valuation methods in the private equity industry today.
    Date: 2024–10–08
    URL: https://d.repec.org/n?u=RePEc:osf:osfxxx:8t7rx_v1
  22. By: Xiangyu Li; Yawen Zeng; Xiaofen Xing; Jin Xu; Xiangmin Xu
    Abstract: As automated trading gains traction in the financial market, algorithmic investment strategies are increasingly prominent. While Large Language Models (LLMs) and Agent-based models exhibit promising potential in real-time market analysis and trading decisions, they still experience a significant 20% loss when confronted with rapid declines or frequent fluctuations, impeding their practical application. Hence, there is an imperative to explore a more robust and resilient framework. This paper introduces an innovative multi-agent system, HedgeAgents, aimed at bolstering system robustness via "hedging" strategies. In this well-balanced system, an array of hedging agents has been tailored, where HedgeAgents consist of a central fund manager and multiple hedging experts specializing in various financial asset classes. These agents leverage LLMs' cognitive capabilities to make decisions and coordinate through three types of conferences. Benefiting from the powerful understanding of LLMs, our HedgeAgents attained a 70% annualized return and a 400% total return over a period of 3 years. Moreover, we have observed with delight that HedgeAgents can even formulate investment experience comparable to that of human experts (https://hedgeagents.github.io/).
    Date: 2025–02
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2502.13165

This nep-rmg issue is ©2025 by Stan Miles. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.