Game Theory
May 19, 2024
Basics
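Normal form of a game is $(N, A, u)$ where
- Players are $N = \{1, \dots, n\}$
- $A_i$ is the set of actions available to player $i$, with $A = A_1 \times \dots \times A_n$
- $u_i : A \to \mathbb{R}$ is the payoff of player $i$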
Players make the choices based on expected payoff
Mixed Strategies
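Mixed Strategy is the probability distribution $s_i$ over the actions $A_i$ available to player $i$.
- $s_i(a_i)$ is the probability of playing $a_i$ in mixed strategy $s_i$
- $s_i(a_i) \ge 0$ and $\sum_{a_i \in A_i} s_i(a_i) = 1$
- Strategy Profile is $s = (s_1, \dots, s_n)$
Players make the choices based on expected payoff $u_i(s) = \sum_{a \in A} u_i(a) \prod_{j} s_j(a_j)$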
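As a minimal sketch of how an expected payoff under mixed strategies can be computed: each pure action profile's payoff is weighted by the product of the probabilities with which the players choose their part of it. The function name `expected_payoff`, the dictionary layout, and the matching-pennies payoffs are assumptions made for illustration, not part of the notes.

```python
from itertools import product

def expected_payoff(payoffs, profile, player):
    """Expected payoff of `player` under a mixed-strategy profile.

    payoffs: dict mapping an action tuple (one action per player)
             to a tuple of payoffs (one payoff per player).
    profile: list of dicts; profile[j][a] is the probability that
             player j plays action a.
    """
    total = 0.0
    # Enumerate every pure action profile.
    for actions in product(*(s.keys() for s in profile)):
        prob = 1.0
        for j, a in enumerate(actions):
            prob *= profile[j][a]  # players randomize independently
        total += prob * payoffs[actions][player]
    return total

# Matching pennies (illustrative payoffs): player 0 wins on a match.
matching_pennies = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}
uniform = [{"H": 0.5, "T": 0.5}, {"H": 0.5, "T": 0.5}]
value = expected_payoff(matching_pennies, uniform, player=0)
```

When both players mix uniformly the four outcomes are equally likely, so the wins and losses of player 0 cancel.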
Bayesian Games
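Bayesian Game is $(N, A, \Theta, p, u)$ where
- Players are $N = \{1, \dots, n\}$
- $A_i$ is the set of actions available to player $i$
- $\Theta_i$ is the set of possible types of player $i$, with $\Theta = \Theta_1 \times \dots \times \Theta_n$
- $u_i : A \times \Theta \to \mathbb{R}$ is the payoff of player $i$
- $p$ is the distribution over types $\Theta$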
Players make the choices based on expected payoff
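A rough sketch of an ex-ante expected payoff in a Bayesian game, assuming a finite prior over type profiles and pure strategies that map each type to an action. The entry game, its payoff numbers, and the names `ex_ante_payoff` and `entry_payoff` are hypothetical and serve only as an illustration.

```python
def ex_ante_payoff(prior, strategies, payoff, player):
    """Average `player`'s payoff over the prior on type profiles.

    prior:      dict mapping a type profile (tuple) to its probability.
    strategies: list of dicts; strategies[j][t] is the action player j
                takes when their type is t.
    payoff:     function (actions, types) -> tuple of payoffs.
    """
    total = 0.0
    for types, prob in prior.items():
        # Each player picks the action prescribed for their own type.
        actions = tuple(strategies[j][t] for j, t in enumerate(types))
        total += prob * payoff(actions, types)[player]
    return total

def entry_payoff(actions, types):
    """Hypothetical entry game: (entrant payoff, incumbent payoff)."""
    entrant_action, incumbent_action = actions
    _, incumbent_type = types
    if entrant_action == "stay":
        return (0.0, 2.0)
    if incumbent_action == "fight":
        return (-1.0, 1.0) if incumbent_type == "tough" else (-1.0, -1.0)
    return (1.0, 1.0)

# The entrant has a single type; the incumbent is tough with probability 0.25.
prior = {("only", "tough"): 0.25, ("only", "soft"): 0.75}
strategies = [
    {"only": "enter"},
    {"tough": "fight", "soft": "accommodate"},
]
entrant_value = ex_ante_payoff(prior, strategies, entry_payoff, player=0)
```

Here the entrant is fought a quarter of the time and accommodated otherwise, so its expected payoff is the corresponding weighted average of -1 and 1.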
Stochastic Games
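If $r_i^{(1)}, r_i^{(2)}, \dots$ is an infinite sequence of payoffs for player $i$, then the average payoff is $\lim_{k \to \infty} \frac{1}{k} \sum_{j=1}^{k} r_i^{(j)}$
If $\beta$ is the discount factor, the future discounted reward is $\sum_{j=1}^{\infty} \beta^{j} r_i^{(j)}$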
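Both quantities are defined over an infinite payoff stream, so code can only approximate them on a finite prefix. The sketch below assumes the discounted sum weights the payoff of round $j$ by the discount factor raised to the power $j$, with rounds counted from 1; the function names are illustrative.

```python
def average_payoff(payoffs):
    """Average of a finite prefix of the payoff sequence."""
    return sum(payoffs) / len(payoffs)

def discounted_payoff(payoffs, beta):
    """Discounted sum of a finite prefix; round j is weighted by beta**j."""
    return sum(beta ** j * r for j, r in enumerate(payoffs, start=1))

prefix = [1.0, 1.0, 1.0, 1.0]
avg = average_payoff(prefix)           # every round pays 1
disc = discounted_payoff(prefix, 0.5)  # 0.5 + 0.25 + 0.125 + 0.0625
```

With a discount factor below 1, later rounds contribute geometrically less, which is why the discounted reward stays finite even when the stream never ends.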
Some strategies in an infinitely repeated prisoner's dilemma game include:
- Tit for Tat
- Trigger
- Grim
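A small sketch of how such strategies can be written as functions of the play history. It assumes the usual readings: Tit for Tat cooperates first and then copies the opponent's previous move, while a grim trigger cooperates until the opponent defects once and then defects forever. The helper names and the `defect_once` opponent are hypothetical.

```python
def tit_for_tat(my_history, opp_history):
    """Cooperate first, then repeat the opponent's last move."""
    return "C" if not opp_history else opp_history[-1]

def grim_trigger(my_history, opp_history):
    """Cooperate until the opponent has ever defected, then always defect."""
    return "D" if "D" in opp_history else "C"

def defect_once(my_history, opp_history):
    """Hypothetical opponent: defects in round 2 only."""
    return "D" if len(my_history) == 1 else "C"

def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return both move sequences as strings."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        # Both moves are chosen before either history is updated.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)

tft_moves, _ = play(tit_for_tat, defect_once, 4)
grim_moves, _ = play(grim_trigger, defect_once, 4)
```

Against a single defection, Tit for Tat punishes for one round and then forgives, whereas the grim trigger never returns to cooperation.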
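Stochastic Game is $(Q, N, A, P, R)$ where $Q$ is a finite set of states, $N$ is the set of players, $A = A_1 \times \dots \times A_n$ is the set of action profiles, $P : Q \times A \times Q \to [0, 1]$ is the transition probability function, and $R = (r_1, \dots, r_n)$ with $r_i : Q \times A \to \mathbb{R}$ the payoff function of player $i$.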
Behavioral Game Theory
In the real world people are not perfectly rational. We can relax the assumption that all players play the best response and instead assume a quantal response strategy: high-utility actions are played more often than low-utility actions. This is characterized by the softmax function $s_i(a_i) = \frac{\exp(\lambda\, u_i(a_i, s_{-i}))}{\sum_{a_i' \in A_i} \exp(\lambda\, u_i(a_i', s_{-i}))}$ where $\lambda$ is the sensitivity to differences in utility.
Another way to think: agent $i$ chooses $\arg\max_{a_i} \left[ u_i(a_i, s_{-i}) + \epsilon_{a_i} \right]$ where $\epsilon_{a_i}$ is a random shock for action $a_i$.
Strategy profile $s$ is a Quantal Response Equilibrium (QRE) with precision $\lambda$ if every player is simultaneously quantally responding to the profile of the other agents' strategies $s_{-i}$.
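A minimal sketch of a quantal (softmax) response, assuming the expected utility of each action against the other players' strategies has already been computed and that the sensitivity parameter is non-negative. The name `quantal_response` and the example utilities are illustrative.

```python
import math

def quantal_response(utilities, lam):
    """Softmax over expected utilities with sensitivity `lam` (assumed >= 0).

    Returns one probability per action, in the same order as `utilities`.
    """
    # Subtracting the maximum leaves the result unchanged and avoids overflow.
    top = max(utilities)
    weights = [math.exp(lam * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

utilities = [1.0, 0.0]
indifferent = quantal_response(utilities, 0.0)   # ignores utility differences
moderate = quantal_response(utilities, 1.0)      # favors the better action
near_best = quantal_response(utilities, 50.0)    # close to a best response
```

A sensitivity of zero gives uniform play, and as the sensitivity grows the distribution concentrates on the highest-utility action, so best response appears as the limiting case.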
level-k model: every player performs a finite number of steps of strategic reasoning:
- level-0: nonstrategic distribution of action (often uniform)
- level-1: best response to level-0 players - i.e., reading one step ahead
- level-2: best response to level-1, or to a mix of level-0 and level-1 players - i.e., reading two steps ahead, and so on...
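One common illustration of level-k reasoning, not taken from the notes, is a guessing game in which the winning guess is a fraction `p` of the average guess. The sketch below assumes level-0 players guess 50 on average, that every level-k player believes all others are level-(k-1), and that a single guess has a negligible effect on the average.

```python
def level_k_guess(k, level0_mean=50.0, p=2 / 3):
    """Guess of a level-k player in a 'p times the average' guessing game.

    Level 0 is nonstrategic (mean guess `level0_mean`); each higher level
    best-responds to a population made up entirely of the level below.
    """
    guess = level0_mean
    for _ in range(k):
        # Best response when everyone else guesses `guess`.
        guess = p * guess
    return guess

guesses = [level_k_guess(k) for k in range(4)]
```

Each additional step of reasoning shrinks the guess by the factor `p`, so the guesses move toward the equilibrium guess of zero as `k` grows.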
Evolutionary Game Theory
Inspiration comes from biology and how the population learns to evolve.
Example: Prisoner's Dilemma
Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be a matrix where
- $a$ is the payoff from both players cooperating
- $b$ is the payoff from cooperating and opponent defecting
- $c$ is the payoff from defecting and opponent cooperating
- $d$ is the payoff from both players defecting
Let the strategy of player $i$ be $s_i = (p_i, 1 - p_i)$ where $p_i$ is the probability to cooperate, and $1 - p_i$ is the probability to defect. Then the payoff for player $i$ against opponent $j$ is $s_i^\top A s_j$
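Strategy $s$ for a symmetric game with payoff matrix $A$ is an Evolutionarily Stable Strategy (ESS), which is the analogue of a Nash Equilibrium in Evolutionary Game Theory, if for every strategy $t \ne s$
- $s^\top A s \ge t^\top A s$ (condition 1)
- $s^\top A t > t^\top A t$ (if condition 1 holds with equality)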
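A rough numerical check of the ESS conditions, assuming the standard two-part test: the candidate must do at least as well against itself as any mutant does, and when the two tie, the candidate must do strictly better against the mutant than the mutant does against itself. The check only covers the mutants it is given, so it can refute an ESS claim but not prove one. The payoff numbers and function names are illustrative.

```python
def bilinear(x, A, y):
    """Payoff x^T A y of strategy x against strategy y."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def is_ess(x, A, mutants, tol=1e-9):
    """Test the ESS conditions for x against a finite list of mutants."""
    for y in mutants:
        if all(abs(a - b) <= tol for a, b in zip(x, y)):
            continue  # the mutant must differ from x
        xx, yx = bilinear(x, A, x), bilinear(y, A, x)
        if xx > yx + tol:
            continue  # x does strictly better against the incumbent
        if abs(xx - yx) <= tol and bilinear(x, A, y) > bilinear(y, A, y) + tol:
            continue  # tie against the incumbent, x wins against the mutant
        return False
    return True

# Prisoner's Dilemma with illustrative payoffs; rows/columns: cooperate, defect.
pd = [[3.0, 0.0], [5.0, 1.0]]
mutants = [(q / 10, 1 - q / 10) for q in range(11)]
defect_is_ess = is_ess((0.0, 1.0), pd, mutants)
cooperate_is_ess = is_ess((1.0, 0.0), pd, mutants)
```

With these payoffs, always defecting passes the check against every sampled mutant, while always cooperating fails because a defecting mutant earns more against cooperators.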
A population in an ESS is resistant to invasion by a small number of mutants playing a different strategy, because the mutants earn less against the population than the incumbents do. Let each player belong to one of the subpopulations $P_1, \dots, P_m$. If player $i$ belongs to $P_k$, then player $i$ plays the same strategy $s^{(k)}$ as every other member of $P_k$. The idea is that the subpopulations that perform best grow, and those that perform poorly shrink.
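The Replicator Dynamics is the natural selection process that determines how populations playing specific strategies evolve. Let $x_k$ be the share of the population in subpopulation $k$ and $A$ the payoff matrix. The fitness of the subpopulation (or strategy) $k$ is $f_k(x) = (Ax)_k$ and the average fitness of the population is $\bar{f}(x) = \sum_k x_k f_k(x)$. The Replicator Dynamics is $\dot{x}_k = x_k \left( f_k(x) - \bar{f}(x) \right)$
If time is discrete, given hyperparameter $\alpha$ (decay), then it is $x_k(t+1) = x_k(t) + \alpha\, x_k(t) \left( f_k(x(t)) - \bar{f}(x(t)) \right)$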
Going back to the Prisoner's Dilemma example, subpopulation $P_1$ can be the players who choose to cooperate and $P_2$ can be the players who choose to defect.
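A minimal sketch of a discrete-time replicator update for this two-subpopulation example. It assumes each subpopulation plays a pure strategy, that fitness is the expected payoff against a randomly drawn member of the population, and that the hyperparameter acts as a small step size; the payoff numbers, starting shares, and names are illustrative.

```python
def replicator_step(shares, A, alpha):
    """One discrete replicator update of the population shares.

    shares: fraction of the population in each subpopulation (sums to 1).
    A:      payoff matrix; A[k][j] is the payoff of strategy k against j.
    alpha:  step size (assumed small enough to keep shares non-negative).
    """
    n = len(shares)
    # Fitness of strategy k: expected payoff against the current population.
    fitness = [sum(A[k][j] * shares[j] for j in range(n)) for k in range(n)]
    average = sum(shares[k] * fitness[k] for k in range(n))
    # Shares grow when fitness is above average and shrink when below.
    return [shares[k] + alpha * shares[k] * (fitness[k] - average)
            for k in range(n)]

# Prisoner's Dilemma with illustrative payoffs; index 0 cooperates, 1 defects.
pd = [[3.0, 0.0], [5.0, 1.0]]
shares = [0.9, 0.1]
for _ in range(500):
    shares = replicator_step(shares, pd, alpha=0.1)
```

Even when cooperators start as 90% of the population, defectors have above-average fitness at every step, so under these assumptions the cooperating share shrinks toward zero.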