- Tic-tac-toe is fully built and plays perfectly; it is unbeatable. Remaining work: performance optimization and cleaner code.
- Connect-4 is built and AI development is in progress.
- Pyramix is built and a human-level heuristic-greedy player has been implemented.
- Texas Hold'em is currently being rebuilt with a focus on better performance.
- Chess, blackjack, checkers, dollar auction, and prisoner's dilemma are not yet built.
- By Sunday, Nov. 11:
- minimax + heuristics implemented for Connect-4
- test RL on simple game
- theoretical results on Pyramix (computational complexity, etc.)
- poker fully debugged and pristine
- By Sunday, Nov. 18:
- superhuman Pyramix player
- Connect-4 solved
- RL tested on Pyramix
- By Sunday, Nov. 25:
- our "above and beyond week"
- shocking new discoveries
- world is changed for the better
- By Sunday, Dec. 2:
- presentation made
- final touches
- Minimax + Alpha-Beta Pruning
- Deep Q-Learning (see the sketch after this list)
- Deep Deterministic Policy Gradients
- Actor-Critic
- Monte Carlo Tree Search
- Test Transfer Learning
- Multi-Agent Reinforcement Learning
- Try Imitation Learning
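- To make one item on this list concrete, here is a minimal sketch of a Deep Q-Learning update; the PyTorch dependency, network sizes, and hyperparameters are placeholder assumptions, not our final design.

```python
# Minimal DQN-style update (illustrative sketch; architecture and
# hyperparameters are assumptions, not our actual implementation).
import random
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small MLP mapping a flattened board state to one Q-value per move."""
    def __init__(self, n_features, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def dqn_update(q_net, target_net, optimizer, replay, batch_size=32, gamma=0.99):
    """One gradient step on the standard DQN TD target."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)  # replay holds (s, a, r, s', done) tuples
    states, actions, rewards, next_states, dones = (
        torch.tensor(xs, dtype=torch.float32) for xs in zip(*batch))
    # Q(s, a) for the actions that were actually taken.
    q_pred = q_net(states).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target from the (periodically synced) target network.
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * (1.0 - dones) * q_next
    loss = nn.functional.mse_loss(q_pred, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Assumed usage for, e.g., a 3x3 board with 9 moves:
#   q, tgt = QNet(9, 9), QNet(9, 9); tgt.load_state_dict(q.state_dict())
#   optimizer = torch.optim.Adam(q.parameters(), lr=1e-3)
#   replay = deque(maxlen=10_000)  # filled with (state, action, reward, next_state, done)
```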
- Ultimate goal: the stock market, the most competitive adversarial game in the world.
- To get there, we need to be really, really good players.
- Games like Catan are stochastic and require sharp long-term planning.
- Games like Texas Hold'em require a solid understanding of probability and bankroll management.
- We plan to build AIs for these games with classical techniques like minimax with alpha-beta pruning, and then to beat them with modern reinforcement learning techniques.
- To get minimax working well, we're first tackling a simpler, purely deterministic game: Connect-4.
- If we're going to make minimax work on Connect-4, we'd better be sure we can get it working very well on tic-tac-toe first (a sketch follows below).
- We're also tackling a more esoteric game called Pyramix, which looks like a great mix of AI planning and reinforcement learning.
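- As a concrete starting point, here is a minimal sketch of minimax with alpha-beta pruning on tic-tac-toe; the 9-character-string board encoding and helper names are illustrative assumptions, not our actual implementation.

```python
# Minimax with alpha-beta pruning on tic-tac-toe (illustrative sketch).
import math

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def alphabeta(board, player, alpha=-math.inf, beta=math.inf):
    """Return (score, move) from X's point of view: +1 X win, -1 O win, 0 draw."""
    w = winner(board)
    if w is not None:
        return (1 if w == 'X' else -1), None
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    if not moves:
        return 0, None                          # board full: draw
    best_move = None
    if player == 'X':                           # maximizing player
        best = -math.inf
        for m in moves:
            score, _ = alphabeta(board[:m] + 'X' + board[m+1:], 'O', alpha, beta)
            if score > best:
                best, best_move = score, m
            alpha = max(alpha, best)
            if alpha >= beta:                   # prune: O will never allow this line
                break
        return best, best_move
    else:                                       # minimizing player
        best = math.inf
        for m in moves:
            score, _ = alphabeta(board[:m] + 'O' + board[m+1:], 'X', alpha, beta)
            if score < best:
                best, best_move = score, m
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best, best_move

# Example: from an empty board the game is a draw under perfect play (score 0).
print(alphabeta(' ' * 9, 'X'))
```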
- CS234: Reinforcement Learning
- Efficient Reinforcement Learning in Adversarial Games
- GANGs: Generative Adversarial Network Games
- Exploration by Random Network Distillation
- Reinforcement Learning with Prediction-Based Rewards
- Human-level control through deep reinforcement learning
- Deep RL Bootcamp
- Reinforcement Learning: Learning to make good sequences of decisions.
- Optimization: find an optimal way to make decisions, yielding the best outcomes.
- Delayed consequences: when planning, decisions require reasoning not only about their immediate benefit but also about their longer-term ramifications; when learning, temporal credit assignment is hard (what caused the later high or low rewards?).
- Exploration: learning about the world by making decisions (agent as scientist); the data is censored, since we only get a reward (label) for the decision actually made (see the Q-learning sketch after these notes).
- Generalization: a policy is a mapping from past experience to actions.
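- To make the exploration and delayed-consequences points concrete, here is a small tabular Q-learning sketch with epsilon-greedy exploration; the environment interface (`reset()`/`step()`) and hyperparameters are assumptions for illustration.

```python
# Tabular Q-learning with epsilon-greedy exploration (illustrative sketch).
# The env is assumed to expose reset() -> state and step(a) -> (state, reward, done),
# with hashable states and a small discrete action set.
import random
from collections import defaultdict

def q_learning(env, actions, episodes=5000, alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = defaultdict(float)                      # Q[(state, action)] -> value estimate
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Exploration: with probability epsilon try a random action ("agent as scientist").
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a_: Q[(s, a_)])
            s2, r, done = env.step(a)
            # Bootstrapped TD target: rewards may arrive many steps after the decision
            # that caused them; gamma * max_a' Q(s', a') propagates that credit backwards.
            target = r + (0.0 if done else gamma * max(Q[(s2, a_)] for a_ in actions))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q
```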
- AI Planning (vs. RL)
- Minimax is AI planning: no exploration is needed, because the agent is already given a model of how its decisions affect the world.
- RL, by contrast, has to explore the world to learn this.
- Imitation Learning
- Reduces RL to supervised learning (see the behavioral-cloning sketch after this list)
- Avoids exploration problem
- Limitations: can be expensive to collect data
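- A concrete version of this reduction is behavioral cloning: fit a classifier from logged expert states to expert actions with ordinary supervised learning. The scikit-learn classifier and feature encoding below are assumptions for illustration.

```python
# Behavioral cloning: imitation learning as plain supervised learning.
# expert_states / expert_actions would come from logged expert play; the dataset,
# feature encoding, and choice of classifier here are all assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def behavioral_cloning(expert_states, expert_actions):
    """Fit a policy pi(a|s) by classifying the expert's action from the state."""
    X = np.asarray(expert_states, dtype=float)   # one row of state features per example
    y = np.asarray(expert_actions)               # the expert's chosen action per example
    policy = LogisticRegression(max_iter=1000).fit(X, y)
    return policy

# At play time: action = policy.predict(state_features.reshape(1, -1))[0]
# No exploration is needed, but performance is capped by the expert
# and by how much expert data we can afford to collect.
```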
- DQN
- Policy Gradients and Actor-Critic (see the REINFORCE sketch after this list)
- TRPO and PPO
- SVG, DDPG, and Stochastic Computation Graphs
- RND
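- As an illustration of the policy-gradient item above, here is a minimal REINFORCE-style sketch; the network, the assumed `reset()`/`step()` environment interface, and the hyperparameters are placeholders, not our actual setup.

```python
# REINFORCE-style policy gradient update (illustrative sketch; network size,
# environment interface, and hyperparameters are assumptions).
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Small MLP producing action logits for a discrete action space."""
    def __init__(self, n_features, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, x):
        return self.net(x)

def reinforce_episode(env, policy, optimizer, gamma=0.99):
    """Sample one episode, then ascend grad log pi(a|s) weighted by return-to-go."""
    log_probs, rewards = [], []
    s, done = env.reset(), False
    while not done:
        logits = policy(torch.tensor(s, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        a = dist.sample()
        log_probs.append(dist.log_prob(a))
        s, r, done = env.step(a.item())
        rewards.append(r)
    # Discounted returns-to-go, computed backwards over the episode.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns = torch.tensor(list(reversed(returns)))
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Assumed usage:
#   policy = PolicyNet(n_features, n_actions)
#   optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
#   for _ in range(num_episodes): reinforce_episode(env, policy, optimizer)
```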
[1] Stanford CS234