nep-gth 2024-09-09 papers

on Game Theory

Issue of 2024‒09‒09
eleven papers chosen by
Sylvain Béal, Université de Franche-Comté

Bargaining via Weber's law By V. G. Bardakhchyan; A. E. Allahverdyan
Algorithmic Pricing and Liquidity in Securities Markets By Colliard, Jean-Edouard; Foucault, Thierry; Lovo, Stefano
Capturing the Complexity of Human Strategic Decision-Making with Machine Learning By Jian-Qiao Zhu; Joshua C. Peterson; Benjamin Enke; Thomas L. Griffiths
A Geometric Nash Approach in Tuning the Learning Rate in Q-Learning Algorithm By Kwadwo Osei Bonsu
Reinforcement Learning in High-frequency Market Making By Yuheng Zheng; Zihan Ding
Homo Oeconomicus as the Homo Moralis’ Party Pooper: Heterogeneous Morality in Public Good Games By Thomas Eichner; Marco Runkel
Centralization in Attester-Proposer Separation By Mallesh Pai; Max Resnick
Spillovers from legal cooperation to tacit collusion By Jeroen Hinloopen; Stephen Martin; Leonard Treuren
Getting the Agent to Wait By Maryam Saeedi; Yikang Shen; Ali Shourideh
The emergence of enforcement By Anderlini, Luca; Felli, Leonardo; Piccione, Michele
Critical Edges in Financial Networks By Michel Alexandre; Thiago Christiano Silva; Francisco Aparecido Rodrigues

Bargaining via Weber's law

By:	V. G. Bardakhchyan; A. E. Allahverdyan
Abstract:	We solve the two-player bargaining problem using Weber's law in psychophysics, which is applied to the perception of utility changes. By applying this law, one of the players (or both of them) defines lower and upper utility thresholds, such that once the lower threshold is established, bargaining continues in the inter-threshold domain where the solutions are acceptable to both parties. This provides a sequential solution to the bargaining problem, and it can be implemented iteratively reaching well-defined outcomes. The solution is Pareto-optimal, symmetric, and is invariant to affine-transformations of utilities. For susceptible players, iterations are unnecessary, so they converge in one stage toward the (axiomatic) Nash solution of the bargaining problem. This situation also accounts for the asymmetric Nash solution, where the weights of each player are expressed via their Weber constants. Thus the Nash solution is reached without external arbiters and without requiring the independence of irrelevant alternatives. For non-susceptible players our approach leads to different results.
Date:	2024–08
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2408.02492
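
As a rough illustration of the (asymmetric) Nash solution mentioned in the abstract above, the sketch below splits a unit pie between two players with linear utilities by maximizing the weighted Nash product. The function name `nash_bargaining_split`, the weights `w1` and `w2`, and the disagreement payoffs `d1` and `d2` are hypothetical; in particular, the mapping from Weber constants to weights is not reproduced here, and this is not the iterative threshold procedure of the paper.

```python
def nash_bargaining_split(w1=1.0, w2=1.0, d1=0.0, d2=0.0, steps=10_000):
    """Grid-search the share x of a unit pie (player 1 gets x, player 2
    gets 1 - x) that maximizes the weighted Nash product
    (x - d1)**w1 * ((1 - x) - d2)**w2.

    All parameters are illustrative assumptions, not values from the paper.
    Returns None if no split gives both players at least their
    disagreement payoff.
    """
    best_x, best_val = None, float("-inf")
    for i in range(steps + 1):
        x = i / steps
        gain1 = x - d1
        gain2 = (1.0 - x) - d2
        if gain1 < 0 or gain2 < 0:
            continue  # outside the individually rational region
        val = (gain1 ** w1) * (gain2 ** w2)
        if val > best_val:
            best_x, best_val = x, val
    return best_x


if __name__ == "__main__":
    print(nash_bargaining_split())                # symmetric weights
    print(nash_bargaining_split(w1=2.0, w2=1.0))  # player 1 weighted more heavily
```

With equal weights and zero disagreement payoffs the maximizer is the even split; with unequal weights the share of player 1 approaches `w1 / (w1 + w2)`.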

Algorithmic Pricing and Liquidity in Securities Markets

By:	Colliard, Jean-Edouard (HEC Paris); Foucault, Thierry (HEC Paris); Lovo, Stefano (HEC Paris)
Abstract:	We let "Algorithmic Market Makers" (AMs), using Q-learning algorithms, determine prices for a risky asset in a standard market making game with adverse selection and compare these prices to the Nash equilibrium of the game. We observe that AMs effectively adapt to adverse selection, adjusting prices post-trade as anticipated. However, AMs charge a markup over the competitive price and this markup increases when adverse selection costs decrease, in contrast to the predictions of the Nash equilibrium. We attribute this unexpected pattern to the diminished learning capacity of AMs when faced with increased profit variance.
Keywords:	Algorithmic pricing; Market Making; Adverse Selection; Market Power; Reinforcement learning
JEL:	D43 G10 G14
Date:	2022–10–20
URL:	https://d.repec.org/n?u=RePEc:ebg:heccah:1459

Capturing the Complexity of Human Strategic Decision-Making with Machine Learning

By:	Jian-Qiao Zhu; Joshua C. Peterson; Benjamin Enke; Thomas L. Griffiths
Abstract:	Understanding how people behave in strategic settings--where they make decisions based on their expectations about the behavior of others--is a long-standing problem in the behavioral sciences. We conduct the largest study to date of strategic decision-making in the context of initial play in two-player matrix games, analyzing over 90,000 human decisions across more than 2,400 procedurally generated games that span a much wider space than previous datasets. We show that a deep neural network trained on these data predicts people's choices better than leading theories of strategic behavior, indicating that there is systematic variation that is not explained by those theories. We then modify the network to produce a new, interpretable behavioral model, revealing what the original network learned about people: their ability to optimally respond and their capacity to reason about others are dependent on the complexity of individual games. This context-dependence is critical in explaining deviations from the rational Nash equilibrium, response times, and uncertainty in strategic decisions. More broadly, our results demonstrate how machine learning can be applied beyond prediction to further help generate novel explanations of complex human behavior.
Date:	2024–08
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2408.07865

A Geometric Nash Approach in Tuning the Learning Rate in Q-Learning Algorithm

By:	Kwadwo Osei Bonsu
Abstract:	This paper proposes a geometric approach for estimating the $\alpha$ value in Q-learning. We establish a systematic framework that optimizes the $\alpha$ parameter, thereby enhancing learning efficiency and stability. Our results show that there is a relationship between the learning rate and the angle between a vector T (total time steps in each episode of learning) and R (the reward vector for each episode). The concept of the angular bisector between vectors T and R, together with the Nash equilibrium, provides insight into estimating $\alpha$ such that the algorithm minimizes losses arising from the exploration-exploitation trade-off.
Date:	2024–08
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2408.04911
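
For readers unfamiliar with where the learning rate $\alpha$ enters, the sketch below shows the standard tabular Q-learning update, in which $\alpha$ scales the step toward the temporal-difference target. It is only the textbook rule: the function name `q_update`, the dictionary-based table, and the toy states and actions are assumptions for illustration, and the geometric estimation of $\alpha$ described in the abstract is not implemented.

```python
def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place.

    Q is a dict of dicts: Q[state][action] -> estimated value.
    alpha is the learning rate and gamma the discount factor; the
    default values are arbitrary illustrative choices.
    """
    next_values = Q[next_state].values()
    best_next = max(next_values) if next_values else 0.0
    td_target = reward + gamma * best_next           # bootstrapped target
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q[state][action]


if __name__ == "__main__":
    # Hypothetical two-state, two-action table initialised at zero.
    Q = {s: {"left": 0.0, "right": 0.0} for s in ("s0", "s1")}
    print(q_update(Q, "s0", "right", reward=1.0, next_state="s1", alpha=0.5))
```

A larger $\alpha$ moves the estimate further toward each new target, which speeds up learning but makes the estimates noisier.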

Reinforcement Learning in High-frequency Market Making

By:	Yuheng Zheng; Zihan Ding
Abstract:	This paper establishes a new and comprehensive theoretical analysis for the application of reinforcement learning (RL) in high-frequency market making. We bridge modern RL theory and the continuous-time statistical models in high-frequency financial economics. Unlike most of the existing literature, which focuses on developing RL methods for the market making problem, our work is a first step toward providing the theoretical analysis. We target the effects of sampling frequency, and find an interesting tradeoff between the error and the complexity of the RL algorithm when varying the time increment $\Delta$: as $\Delta$ becomes smaller, the error becomes smaller but the complexity becomes larger. We also study the two-player case under the general-sum game framework and establish the convergence of the Nash equilibrium to the continuous-time game equilibrium as $\Delta\rightarrow0$. The Nash Q-learning algorithm, which is an online multi-agent RL method, is applied to solve for the equilibrium. Our theories are not only useful for practitioners choosing the sampling frequency, but also very general and applicable to other high-frequency financial decision making problems, e.g., optimal execution, as long as the time-discretization of a continuous-time Markov decision process is adopted. Monte Carlo simulation evidence supports all of our theories.
Date:	2024–07
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2407.21025

Homo Oeconomicus as the Homo Moralis’ Party Pooper: Heterogeneous Morality in Public Good Games

By:	Thomas Eichner; Marco Runkel
Abstract:	The main insight of this paper is that moral behavior does not necessarily alleviate coordination problems, and may even worsen them, if individuals possess different degrees of morality. We characterize heterogeneous Alger-Weibull morality preferences in a canonical model of voluntary contributions to a public good. The analysis reveals a novel polarization effect which traces back to a 'preference for leadership' and weakens (strengthens) the incentive to contribute to the public good for individuals with below (above) average morality. Equilibrium public good provision is not increased by morality, as long as there are homo oeconomicus individuals. An increase in morality of an individual may reduce total provision of the public good, if heterogeneity is large enough. Redistributive transfers are no longer neutral.
Keywords:	moral behaviour, Kantian ethics, heterogeneity, public goods
JEL:	C72 D91 H41
Date:	2024
URL:	https://d.repec.org/n?u=RePEc:ces:ceswps:_11231

Centralization in Attester-Proposer Separation

By:	Mallesh Pai; Max Resnick
Abstract:	We show that Execution Tickets and Execution Auctions dramatically increase centralization in the market for block proposals, even without multi-block MEV concerns. Previous analyses have insufficiently or incorrectly modeled the interaction between ahead-of-time auctions and just-in-time (JIT) auctions. We study a model where bidders compete in an execution auction ahead of time, and then the winner holds a JIT auction to resell the proposal rights when the slot arrives. During the execution auction, bidders only know the distribution of their valuations. Bidders then draw values from their distributions and compete in the JIT auction. We show that a bidder who wins the execution auction is substantially advantaged in the JIT auction since they can set a reserve price higher than their own realized value for the slot to increase their revenue. As a result, there is a strong centralizing force in the execution auction, which allows the ex-ante strongest bidder to win the execution auction every time, and similarly gives them the strongest incentive to buy up all the tickets. Similar results trivially apply if the resale market is imperfect, since that only reinforces the advantages of the ex-ante strong buyer. To reiterate, these results do not require the bidders to employ multi-block MEV strategies, although if they did, it would likely amplify the centralizing effects.
Date:	2024–08
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2408.03116

Spillovers from legal cooperation to tacit collusion

By:	Jeroen Hinloopen; Stephen Martin; Leonard Treuren
Abstract:	Antitrust laws prohibit collusion by private firms, yet many types of interfirm cooperation are legal. Using laboratory experiments, we study spillovers from legal cooperation in one market to tacit collusion in a different market. Subjects sequentially play two homogeneous goods Bertrand games once against the same opponent. We vary whether subjects can form binding price agreements in the first market. We find that allowing subjects to coordinate their prices in the first market significantly increases prices in the second market, elevating the incidence of non-competitive market prices by more than 60 percent. This shows that repeated interaction and communication are not necessary to achieve non-competitive prices, as long as subjects can form binding agreements in a different market. Additional treatments suggest that commitment and multimarket contact are both necessary and sufficient for spillovers from legal cooperation to tacit collusion to emerge.
Date:	2023–06
URL:	https://d.repec.org/n?u=RePEc:ete:msiper:746847
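
To make the stage game in the abstract above concrete, the sketch below computes profits in a one-shot homogeneous-goods Bertrand duopoly: the lower-priced firm serves the whole market and ties split it equally. The function name `bertrand_profits`, the linear demand `100 - p`, and the zero marginal cost are hypothetical choices for illustration; the experiment's actual parameters are not given in the abstract.

```python
def bertrand_profits(p1, p2, cost=0.0, demand=lambda p: max(0.0, 100.0 - p)):
    """Return (profit1, profit2) in a homogeneous-goods Bertrand duopoly.

    The firm with the strictly lower price sells the entire market
    demand at that price; equal prices split demand in half. The
    demand curve and cost are illustrative assumptions only.
    """
    quantity = demand(min(p1, p2))
    if p1 < p2:
        return ((p1 - cost) * quantity, 0.0)
    if p2 < p1:
        return (0.0, (p2 - cost) * quantity)
    half = quantity / 2.0
    return ((p1 - cost) * half, (p2 - cost) * half)


if __name__ == "__main__":
    print(bertrand_profits(50.0, 50.0))  # both charge the same high price
    print(bertrand_profits(49.0, 50.0))  # firm 1 undercuts slightly
```

Undercutting a shared high price by a small amount nearly doubles the undercutter's profit, which is why non-competitive prices in a one-shot game are hard to sustain without some form of commitment.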

Getting the Agent to Wait

By:	Maryam Saeedi; Yikang Shen; Ali Shourideh
Abstract:	We examine the strategic interaction between an expert (principal) maximizing engagement and an agent seeking swift information. Our analysis reveals: When priors align, relative patience determines optimal disclosure -- impatient agents induce gradual revelation, while impatient principals cause delayed, abrupt revelation. When priors disagree, catering to the bias often emerges, with the principal initially providing signals aligned with the agent's bias. With private agent beliefs, we observe two phases: one engaging both agents, followed by catering to one type. Comparing personalized and non-personalized strategies, we find faster information revelation in the non-personalized case, but higher quality information in the personalized case.
Date:	2024–07
URL:	https://d.repec.org/n?u=RePEc:arx:papers:2407.19127

The emergence of enforcement

By:	Anderlini, Luca; Felli, Leonardo; Piccione, Michele
Abstract:	How do mechanisms that enforce cooperation emerge in a society where none are available and agents are endowed with just raw power that allows a more powerful agent to expropriate a less powerful one? We study a model where expropriation is costly and agents can choose whether to engage in surplus-augmenting cooperation or engage in expropriation. While in bilateral relations, if cooperation is not overwhelmingly productive and expropriation is not too costly, the latter will prevent cooperation, when there are three or more agents, powerful ones can become enforcers of cooperation for agents ranked below them. In equilibrium they will expropriate smaller amounts from multiple weaker cooperating agents who in turn will not deviate for fear of being expropriated more heavily because of their larger expropriation proceeds. Surprisingly, the details of the power structure are irrelevant for the existence of equilibria with enforcement provided that enough agents are present and one is ranked above all others. These details are instead key to the existence of other highly noncooperative equilibria that are obtained in certain cases.
Keywords:	Jungle, power structures, enforcement, rule of law
JEL:	C79 D00 D01 D31 K19 K40 K49
Date:	2024
URL:	https://d.repec.org/n?u=RePEc:zbw:wzbmbh:301156

Critical Edges in Financial Networks

By:	Michel Alexandre; Thiago Christiano Silva; Francisco Aparecido Rodrigues
Abstract:	In this study, we propose a method for the identification of influential edges in financial networks. In our approach, the critical edges are those whose removal would cause a large impact on the systemic risk of the financial network. We apply this framework to a thorough Brazilian data set to identify critical bank-firm edges. In our data set, banks and firms are connected through two financial networks: the interbank network and the bank-firm loan network. We found at least 18% of the edges are critical, in the sense they have a significant impact on the systemic risk of the network. We then employed machine learning (ML) techniques to predict the critical status and – for a large level of the initial shock – the sign of the impact of bank-firm edges on the systemic risk. The level of accuracy obtained in these prediction exercises was very high (above 90%). Posterior analysis through Shapley values shows: i) the PageRank of the edge’s destination node (the firm) is the main driver of the critical status of the edges; and ii) the sign of the edges’ impact depends on the degree of the edge’s origin node (the bank).
Date:	2024–08
URL:	https://d.repec.org/n?u=RePEc:bcb:wpaper:594

This nep-gth issue is ©2024 by Sylvain Béal. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.