Lab Specifications
The purposes of this lab are:
- To let you play with ideas from game theory.
- To teach you about how repeated play games differ from the normal game theory context.
- To let you think about developing a utility-based strategic agent for a competitive environment.
When you complete this lab, you should understand the following:
- Why the tit-for-tat strategy is such a good idea.
- Where game theoretic assumptions fail in repeated games.
- The issues involved in designing a competitive agent (such as inter-agent modeling, anticipation, forgiveness, and commitment).

You should also gain practice writing technical documents. Check out the grading guidelines at the bottom of the description.

You will conduct a tournament that tests how different strategies perform in a repeated-play prisoner's dilemma game. You will create your own strategy and play it against the following agents:
1. An agent who employs the one-shot equilibrium solution (always defect).
2. An agent who chooses randomly.
3. An agent who always cooperates with you (and never confesses).
4. An agent who employs the tit-for-tat strategy (reviewed below).
5. An agent who employs the tit-for-two-tats strategy (also reviewed below).
6. An agent who uses the Pavlov strategy (reviewed below).
7. An agent who uses the Win-Stay/Lose-Shift strategy (reviewed below).
8. An agent who uses the Never Forgive strategy (reviewed below).
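The agents listed above can be sketched as small Python functions. This is a minimal sketch, not the lab's reference implementation: it assumes the commonly used definitions of each strategy (the lab's own reviews take precedence), and the function names and the `(mine, theirs)` history interface are hypothetical. Pavlov is usually described as the same rule as Win-Stay/Lose-Shift, so one function stands in for both here.

```python
import random

C, D = "C", "D"  # cooperate (stay silent) and defect (confess)

# Each strategy receives its own past moves and the opponent's past moves
# (lists of "C"/"D", oldest first) and returns its next move.

def always_defect(mine, theirs):
    return D

def always_cooperate(mine, theirs):
    return C

def random_agent(mine, theirs, rng=random):
    return rng.choice([C, D])

def tit_for_tat(mine, theirs):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else C

def tit_for_two_tats(mine, theirs):
    # Defect only after two consecutive defections by the opponent.
    return D if theirs[-2:] == [D, D] else C

def never_forgive(mine, theirs):
    # Cooperate until the opponent defects once, then defect forever.
    return D if D in theirs else C

def win_stay_lose_shift(mine, theirs):
    # Assumed rule: a round counts as a "win" when the opponent cooperated
    # (payoff 3 or 5); repeat the last move after a win, switch after a loss.
    if not mine:
        return C
    if theirs[-1] == C:
        return mine[-1]
    return D if mine[-1] == C else C
```

Keeping every agent behind the same two-argument interface means your own strategy can be dropped into the tournament without changing the match code.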
You should try out how well each strategy works when there are 5 trials, 100 trials, and 200 trials. You should also play a variant where, after each interaction, you flip a biased coin to decide whether to continue to another round; the coin turns up heads with probability p, and whenever it turns up heads you continue. Conduct the experiment for p = 0.75, 0.9, and 0.99. Each agent will compete against all other agents, and the total scores will be recorded. You should use the prisoner's dilemma payoff matrix in ordinal form, with 5 the most preferred outcome and 1 the least preferred.
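The coin-flip variant can be sketched as follows. Because play continues with probability p after every round, the number of rounds follows a geometric distribution with mean 1/(1 - p), so p = 0.75, 0.9, and 0.99 give roughly 4, 10, and 100 rounds on average. The function names below are hypothetical, and the sketch assumes at least one round is always played before the first flip.

```python
import random

def sample_match_length(p, rng):
    """Draw the number of rounds in one match with continuation probability p."""
    rounds = 1  # assumption: the first round is always played
    while rng.random() < p:  # heads: continue to another round
        rounds += 1
    return rounds

def expected_match_length(p):
    """Mean of the geometric distribution of match lengths."""
    return 1.0 / (1.0 - p)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for p in (0.75, 0.9, 0.99):
        lengths = [sample_match_length(p, rng) for _ in range(10000)]
        print(p, expected_match_length(p), sum(lengths) / len(lengths))
```

Since match lengths vary widely from run to run, especially at p = 0.99, a single match per pairing says little; repeated runs give a more reliable picture.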
Player 1 / Player 2 | Player 2 cooperates | Player 2 defects |
---|---|---|
Player 1 cooperates | (3,3) | (1,5) |
Player 1 defects | (5,1) | (2,2) |
In the payoff matrix, cooperate means cooperating with the other prisoner by remaining silent, and defect means defecting from the team strategy by confessing to the police. If you play the game for 100 trials, your score is the sum of all the payoffs you received. For example, if you and your opponent both cooperated on every trial, your score would be 3*100 = 300. High scores win.
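The scoring rule can be illustrated with a short fixed-length match. This is a sketch under stated assumptions: `PAYOFF` transcribes the matrix above, while `play_match` and the three example strategies are hypothetical names using the commonly cited definitions.

```python
# Payoffs from the matrix above, keyed by (player 1 move, player 2 move)
# and stored as (player 1 payoff, player 2 payoff).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (1, 5),
    ("D", "C"): (5, 1),
    ("D", "D"): (2, 2),
}

def always_cooperate(mine, theirs):
    return "C"

def always_defect(mine, theirs):
    return "D"

def tit_for_tat(mine, theirs):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else "C"

def play_match(strategy1, strategy2, n_trials):
    """Play n_trials rounds and return each player's summed payoff."""
    history1, history2 = [], []
    score1 = score2 = 0
    for _ in range(n_trials):
        # Both players choose simultaneously, seeing only past rounds.
        move1 = strategy1(history1, history2)
        move2 = strategy2(history2, history1)
        payoff1, payoff2 = PAYOFF[(move1, move2)]
        score1 += payoff1
        score2 += payoff2
        history1.append(move1)
        history2.append(move2)
    return score1, score2

if __name__ == "__main__":
    print(play_match(always_cooperate, always_cooperate, 100))  # (300, 300)
    print(play_match(tit_for_tat, always_defect, 100))          # (199, 203)
```

In the second match tit-for-tat loses only the first round (1 against 5) and then matches the defector at 2 points per round, which is why the final gap is just 4 points.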
When you are done, you should turn in:
- One page describing and justifying your strategy.
- A description of the results for each number of trials listed above (5, 100, and 200); since this is a data intensive lab, take some time to present the information in an intelligible format.
- A description of the results for each of the probabilities above (0.75, 0.9, and 0.99); since this is a data intensive lab, take some time to present the information in an intelligible format.
- One page explaining why the winner won, and discussing how their strategy performed and why.
Your report will be graded on four criteria: style, grammar, analysis, content.
Style: Is there an adequate introduction? Does the conclusions section draw conclusions that are justified by the data? Is there a logical flow to the organization of the paper?
Grammar: Is the language adequate?
Analysis: Does the report present not only what is found but why? Have you made conclusions that are not justified? Do you include multiple trials to account for uncertainty? Have you reported both means and variances of data?
Content: Have you completed what was assigned? Is there some originality in the work?
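For the Analysis criterion, one way to report means and variances is to repeat the whole tournament several times and summarize each agent's total score. The sketch below is one possible setup, not a required design: all function names are hypothetical, only three of the agents are included for brevity, each pair of distinct agents meets once per tournament (no self-play), and `statistics.variance` is the sample variance.

```python
import itertools
import random
import statistics

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (1, 5),
          ("D", "C"): (5, 1), ("D", "D"): (2, 2)}

def always_defect(mine, theirs):
    return "D"

def always_cooperate(mine, theirs):
    return "C"

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"

AGENTS = {"always_defect": always_defect,
          "always_cooperate": always_cooperate,
          "tit_for_tat": tit_for_tat}

def play_match(s1, s2, p, rng):
    """Play until the biased coin stops the game; return both total scores."""
    h1, h2, t1, t2 = [], [], 0, 0
    while True:
        m1, m2 = s1(h1, h2), s2(h2, h1)
        a, b = PAYOFF[(m1, m2)]
        t1, t2 = t1 + a, t2 + b
        h1.append(m1)
        h2.append(m2)
        if rng.random() >= p:  # tails: stop after this round
            return t1, t2

def tournament(agents, p, rng):
    """Round robin: every pair of distinct agents plays one match."""
    totals = {name: 0 for name in agents}
    for (n1, s1), (n2, s2) in itertools.combinations(agents.items(), 2):
        a, b = play_match(s1, s2, p, rng)
        totals[n1] += a
        totals[n2] += b
    return totals

def summarize(agents, p, runs, seed=0):
    """Return {agent: (mean, sample variance)} of totals over repeated runs."""
    rng = random.Random(seed)
    samples = {name: [] for name in agents}
    for _ in range(runs):
        for name, total in tournament(agents, p, rng).items():
            samples[name].append(total)
    return {name: (statistics.mean(v), statistics.variance(v))
            for name, v in samples.items()}

if __name__ == "__main__":
    for name, (mean, var) in summarize(AGENTS, p=0.9, runs=200).items():
        print(f"{name}: mean={mean:.1f} variance={var:.1f}")
```

With random match lengths, raw totals mix strategy quality with how long each match happened to last, so reporting the average payoff per round alongside the totals may make the comparison easier to read.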