Abstract: |
This study presents a rigorous mathematical approach to the optimization of
round and betting policies in Blackjack, using Markov Decision Processes (MDPs)
and Expected Utility Theory. The analysis considers a head-to-head game
between a single player and the dealer, which simplifies the game dynamics. The
objective is to develop optimal strategies that maximize expected utility for
risk profiles defined by constant relative (CRRA) and constant absolute (CARA) risk aversion utility
functions. Dynamic programming algorithms are implemented to estimate optimal
round and betting policies with different levels of complexity. The
evaluation is performed through simulations, analyzing histograms of final
returns. The results indicate that optimized round policies offer only a
slight advantage over the "basic strategy", highlighting the efficiency of
the latter. In addition, betting strategies based on the exact composition
of the deck only slightly outperform the Hi-Lo counting system, confirming
the effectiveness of Hi-Lo. The optimized strategies include versions suitable for mental
use in physical environments and more complex ones requiring computational
processing. Although the computed strategies approximate the theoretically
optimal performance, this study is limited to a specific configuration of
rules. Future work could explore strategies under other game
configurations, such as additional players or deeper deck penetration,
which could pose new technical challenges. |