Expected Payoff Matrices

Lawrence Thatcher edited this page Nov 21, 2016 · 16 revisions

Expected Payoff

Dr. Goodrich gave a hint on how to avoid running repeated-play simulations every generation: compute the long-run expected payoff matrix once and reuse it. His hint is quoted below.

The most efficient approach is to figure out what strategy A against strategy B would earn given a particular gamma, V(A|B) for all A and B pairs. This becomes the payoff matrix, and you use replicator or imitator dynamics on the V(A|B)'s.
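As a rough illustration of the last step in the quote, here is a minimal sketch of one discrete-time replicator update applied to a matrix of V(A|B) values. The function name `replicator_step`, the starting shares, and gamma = 0.5 are assumptions made for the example, and the update shown assumes all payoffs are positive; it is one common form of replicator dynamics, not necessarily the one intended in the hint.

```python
def replicator_step(shares, payoff):
    """One discrete-time replicator update (assumes all payoffs are positive).

    shares[i]    -- population share of strategy i (non-negative, sums to 1)
    payoff[i][j] -- V(i | j), the payoff to strategy i against strategy j
    """
    n = len(shares)
    # Fitness of strategy i: its expected payoff against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    # Population-average fitness, used to renormalise the shares.
    mean_fitness = sum(shares[i] * fitness[i] for i in range(n))
    # Each share grows in proportion to its fitness relative to the average.
    return [shares[i] * fitness[i] / mean_fitness for i in range(n)]


# Illustrative 2x2 case: AC and AD only, with R=3, T=5, S=1, P=2 and gamma=0.5.
gamma = 0.5
payoff = [
    [3 / (1 - gamma), 1 / (1 - gamma)],  # V(AC|AC), V(AC|AD)
    [5 / (1 - gamma), 2 / (1 - gamma)],  # V(AD|AC), V(AD|AD)
]
shares = [0.5, 0.5]
for _ in range(20):
    shares = replicator_step(shares, payoff)
print(shares)  # AD's share grows each generation because it earns more against both types
```

Because the V(A|B) values are fixed for a given gamma, the matrix is built once and only the shares change from generation to generation.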

General Formula

The formula for the long-term discounted expected reward is:

$$V(A|B) = \sum_{t=0}^{\infty} \gamma^{t} \, U_1(A_t \mid B_t)$$
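As a worked illustration of how this sum collapses into the closed forms used for the games below, the following sketch assumes 0 ≤ γ < 1 and play that settles into a fixed pattern. The symbols u, u_0, …, u_{k-1} (per-round payoffs to the first player) and k (the cycle length) are notation introduced here for the example.

```latex
% Constant stream: the same payoff u in every round.
\sum_{t=0}^{\infty} \gamma^{t} u = \frac{u}{1-\gamma}

% One payoff u_0 in the opening round, then u in every later round.
u_0 + \sum_{t=1}^{\infty} \gamma^{t} u = u_0 + \frac{\gamma u}{1-\gamma}

% Repeating cycle of length k with payoffs u_0, ..., u_{k-1}.
\sum_{t=0}^{\infty} \gamma^{t} u_{t \bmod k}
  = \frac{u_0 + u_1 \gamma + \dots + u_{k-1} \gamma^{k-1}}{1-\gamma^{k}}
```

The third identity is the geometric series applied to whole cycles: each full cycle is worth the numerator, and successive cycles are discounted by γ^k.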

Games

We can store these pre-computed V(A|B) values here for the various games:

Prisoner's Dilemma

Payoff Matrix

|   | C | D |
|---|---|---|
| C | (R, R) | (S, T) |
| D | (T, S) | (P, P) |

Where:

  • R = 3
  • T = 5
  • S = 1
  • P = 2

Expected Payoff Matrix

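|   | AC | AD | TfT | NTfT |
|---|---|---|---|---|
| AC | $\frac{R}{1-\gamma}$ | $\frac{S}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{S}{1-\gamma}$ |
| AD | $\frac{T}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $T + \frac{\gamma P}{1-\gamma}$ | $P + \frac{\gamma T}{1-\gamma}$ |
| TfT | $\frac{R}{1-\gamma}$ | $S + \frac{\gamma P}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{S + P\gamma + T\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ |
| NTfT | $\frac{T}{1-\gamma}$ | $P + \frac{\gamma S}{1-\gamma}$ | $\frac{T + P\gamma + S\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ | $\frac{P + R\gamma}{1-\gamma^{2}}$ |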

Note: Read the matrix with the row as the first player and the column as the second. The entry in row AC and column AD should therefore be interpreted as V(AC | AD).
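The closed-form entries can be cross-checked numerically by playing each pair of strategies and summing the discounted payoffs directly. The sketch below does that under stated assumptions: TfT opens with C and then copies the opponent's previous move, NTfT opens with D and then plays the opposite of the opponent's previous move (both opening moves are inferred from the cycle entries in the matrix, not stated on this page), and the infinite sum is truncated after a fixed number of rounds. All function and variable names are hypothetical.

```python
def always_cooperate(opponent_last):
    return "C"


def always_defect(opponent_last):
    return "D"


def tit_for_tat(opponent_last):
    # Assumption: opens with C, then copies the opponent's previous move.
    return "C" if opponent_last is None else opponent_last


def not_tit_for_tat(opponent_last):
    # Assumption: opens with D, then plays the opposite of the opponent's previous move.
    if opponent_last is None:
        return "D"
    return "D" if opponent_last == "C" else "C"


def discounted_payoff(a, b, payoff, gamma, rounds=2000):
    """Approximate V(A|B) by truncating the discounted sum after `rounds` rounds."""
    total, last_a, last_b = 0.0, None, None
    for t in range(rounds):
        # Each strategy only sees the other player's previous move.
        move_a, move_b = a(last_b), b(last_a)
        total += gamma ** t * payoff[(move_a, move_b)]
        last_a, last_b = move_a, move_b
    return total


# Payoff to the first (row) player in the Prisoner's Dilemma above.
R, T, S, P = 3, 5, 1, 2
PD = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

strategies = {
    "AC": always_cooperate,
    "AD": always_defect,
    "TfT": tit_for_tat,
    "NTfT": not_tit_for_tat,
}

gamma = 0.9
V = {
    (name_a, name_b): discounted_payoff(fa, fb, PD, gamma)
    for name_a, fa in strategies.items()
    for name_b, fb in strategies.items()
}

for name_a in strategies:
    print(name_a, [round(V[(name_a, name_b)], 3) for name_b in strategies])
```

Swapping in a different payoff dictionary, for example the Stag Hunt values, reuses the same strategies without further changes. With gamma close to 1 the truncation needs more rounds to stay accurate.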

Stag Hunt

Payoff Matrix

|   | C | D |
|---|---|---|
| C | (5, 5) | (1, 3) |
| D | (3, 1) | (3, 3) |

Note: The expected payoff matrix for this game has the same form as the one for the Prisoner's Dilemma above, using the following values:

  • R = 5
  • T = 3
  • S = 1
  • P = 3

Reformed Expected Payoff Matrix

This is the same expected payoff matrix as above, but since P == T, every P has been replaced with T.

Using:

  • R = 5
  • T = 3
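  • S = 1

|   | AC | AD | TfT | NTfT |
|---|---|---|---|---|
| AC | $\frac{R}{1-\gamma}$ | $\frac{S}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{S}{1-\gamma}$ |
| AD | $\frac{T}{1-\gamma}$ | $\frac{T}{1-\gamma}$ | $T + \frac{\gamma T}{1-\gamma}$ | $T + \frac{\gamma T}{1-\gamma}$ |
| TfT | $\frac{R}{1-\gamma}$ | $S + \frac{\gamma T}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{S + T\gamma + T\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ |
| NTfT | $\frac{T}{1-\gamma}$ | $T + \frac{\gamma S}{1-\gamma}$ | $\frac{T + T\gamma + S\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ | $\frac{T + R\gamma}{1-\gamma^{2}}$ |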

Battle of the Sexes

This matrix also allows a choice of gender, so each player can be thought of as picking among 8 different strategies.

Where:

  • R = 3
  • T = 5
  • S = 1
  • P = 2

Battle of the Sexes Expected Payoff Matrix

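|   | (H) AC | (H) AD | (H) TfT | (H) NTfT | (W) AC | (W) AD | (W) TfT | (W) NTfT |
|---|---|---|---|---|---|---|---|---|
| (H) AC | $\frac{P}{1-\gamma}$ | $\frac{S}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{S}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{S}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{S}{1-\gamma}$ |
| (H) AD | $\frac{P}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $P + \frac{\gamma R}{1-\gamma}$ | $R + \frac{\gamma P}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{T}{1-\gamma}$ | $P + \frac{\gamma T}{1-\gamma}$ | $T + \frac{\gamma P}{1-\gamma}$ |
| (H) TfT | $\frac{P}{1-\gamma}$ | $S + \frac{\gamma R}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{S + R\gamma + P\gamma^{2} + P\gamma^{3}}{1-\gamma^{4}}$ | $\frac{R}{1-\gamma}$ | $S + \frac{\gamma T}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{S + T\gamma + P\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ |
| (H) NTfT | $\frac{P}{1-\gamma}$ | $R + \frac{\gamma S}{1-\gamma}$ | $\frac{P + R\gamma + S\gamma^{2} + P\gamma^{3}}{1-\gamma^{4}}$ | $\frac{R + P\gamma}{1-\gamma^{2}}$ | $\frac{P}{1-\gamma}$ | $T + \frac{\gamma S}{1-\gamma}$ | $\frac{P + T\gamma + S\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ | $\frac{T + R\gamma}{1-\gamma^{2}}$ |
| (W) AC | $\frac{T}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{T}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{P}{1-\gamma}$ |
| (W) AD | $\frac{S}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $S + \frac{\gamma R}{1-\gamma}$ | $R + \frac{\gamma S}{1-\gamma}$ | $\frac{S}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $S + \frac{\gamma P}{1-\gamma}$ | $P + \frac{\gamma S}{1-\gamma}$ |
| (W) TfT | $\frac{T}{1-\gamma}$ | $P + \frac{\gamma R}{1-\gamma}$ | $\frac{T}{1-\gamma}$ | $\frac{P + R\gamma + S\gamma^{2} + T\gamma^{3}}{1-\gamma^{4}}$ | $\frac{R}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{R}{1-\gamma}$ | $\frac{P + P\gamma + S\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ |
| (W) NTfT | $\frac{S}{1-\gamma}$ | $R + \frac{\gamma R}{1-\gamma}$ | $\frac{S + R\gamma + P\gamma^{2} + T\gamma^{3}}{1-\gamma^{4}}$ | $\frac{R + T\gamma}{1-\gamma^{2}}$ | $\frac{S}{1-\gamma}$ | $\frac{P}{1-\gamma}$ | $\frac{S + P\gamma + P\gamma^{2} + R\gamma^{3}}{1-\gamma^{4}}$ | $\frac{P + R\gamma}{1-\gamma^{2}}$ |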