The StarCraft Multi-Agent Exploration Challenges

Benchmarks for Efficient Exploration of Multi-Stage Task Completion and Utilization of Environmental Factors


SMAC-Exp Offense	SMAC-Exp Defense

🧚 Preview

The StarCraft Multi-Agent Exploration Challenges : Learning Multi-Stage Tasks and Environmental Factors without Precise Reward Functions.
Mingyu Kim*, Jihwan Oh*, Yongsik Lee, Joonkee Kim, Seonghwan Kim, Song Chong, Se-Young Yun.
(*:equal contribution)

Paper : https://ieeexplore.ieee.org/document/10099458
Project page : https://osilab-kaist.github.io/smac_exp/
Papers with Code : https://paperswithcode.com/paper/the-starcraft-multi-agent-challenges-learning
Learning curves : https://url.kr/mak6gq
Tensorboard logs and checkpoints : https://url.kr/92bp83

✍️ Notice

PyMARL : A framework for deep multi-agent reinforcement learning, written in PyTorch.
SMAC : Environments for research in collaborative multi-agent reinforcement learning (MARL), based on Blizzard's StarCraft II RTS game. This work builds on both PyMARL and SMAC.
SMAC_Exp : Provides its own environments and runs training and testing of RL algorithms on PyMARL, with both SMAC and SMAC_Exp.

🆚 SMAC vs SMAC_Exp

SMAC is the standard benchmark for multi-agent reinforcement learning. It is mainly concerned with whether all agents can cooperatively eliminate approaching adversaries through fine-grained unit control alone, guided by explicit reward functions.
SMAC_Exp contains 8 maps in total, grouped into three categories: defense, offense, and challenging. It targets the exploration capability of MARL algorithms, that is, how efficiently they learn implicit multi-stage tasks and environmental factors in addition to micro-control.

Main Issues	SMAC	SMAC_Exp
Agents micro-control	O	O
Multi-stage tasks	▵	O
Environmental factors	▵	O

🌏 Maps

SG, Mar, and M refer to Siege Tank, Marauder, and Marine units, respectively.

Defense

Name	Ally Units	Enemy Units	Opponents approach
`defense_infantry`	1 Mar & 4 M	1 Mar & 6 M	One-sided
`defense_armored`	1 SG Tank, 1 Tank, 1 Mar & 5 M	2 Tank, 2 Mar & 9 M	Two-sided
`defense_outnumbered`	1 SG Tank, 1 Tank, 1 Mar & 5 M	2 Tank, 3 Mar & 10 M	Two-sided

Offense

Name	Ally Units	Enemy Units	Distance & formation
`offense_near`	3 SG Tank, 3 Tank, 3 Mar & 4 M	1 SG Tank, 2 Tank, 2 Mar & 4 M	Near & Spread
`offense_distant`	3 SG Tank, 3 Tank, 3 Mar & 4 M	1 SG Tank, 2 Tank, 2 Mar & 4 M	Distant & Spread
`offense_complicated`	3 SG Tank, 3 Tank, 3 Mar & 4 M	1 SG Tank, 2 Tank, 2 Mar & 4 M	Complicated & Spread

Challenging

Name	Ally Units	Enemy Units	Opponents approach
`defense_superhard`	1 SG Tank, 1 Tank, 1 Mar & 5 M	2 Tank, 3 Mar & 10 M	Two-sided

Name	Ally Units	Enemy Units	Distance & formation
`offense_superhard`	1 SG Tank, 2 Tank, 2 Mar & 4 M	1 SG Tank, 2 Tank, 2 Mar & 4 M	Complicated & Gathered
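
To make the force imbalance in each scenario easier to see, the rosters in the tables above can be written down as plain data and summed. This is only an illustrative sketch: the `MAPS` dictionary and `unit_totals` helper are hypothetical names introduced here, not part of the SMAC_Exp code, and the counts are transcribed directly from the tables.

```python
# Unit rosters transcribed from the map tables above.
# Keys use the same abbreviations as the tables: "SG Tank", "Tank", "Mar", "M".
# MAPS and unit_totals are hypothetical names used only for this illustration.
OFFENSE_ALLY = {"SG Tank": 3, "Tank": 3, "Mar": 3, "M": 4}
OFFENSE_ENEMY = {"SG Tank": 1, "Tank": 2, "Mar": 2, "M": 4}
DEFENSE_ALLY = {"SG Tank": 1, "Tank": 1, "Mar": 1, "M": 5}

MAPS = {
    "defense_infantry": {"ally": {"Mar": 1, "M": 4}, "enemy": {"Mar": 1, "M": 6}},
    "defense_armored": {"ally": DEFENSE_ALLY, "enemy": {"Tank": 2, "Mar": 2, "M": 9}},
    "defense_outnumbered": {"ally": DEFENSE_ALLY, "enemy": {"Tank": 2, "Mar": 3, "M": 10}},
    "offense_near": {"ally": OFFENSE_ALLY, "enemy": OFFENSE_ENEMY},
    "offense_distant": {"ally": OFFENSE_ALLY, "enemy": OFFENSE_ENEMY},
    "offense_complicated": {"ally": OFFENSE_ALLY, "enemy": OFFENSE_ENEMY},
    "defense_superhard": {"ally": DEFENSE_ALLY, "enemy": {"Tank": 2, "Mar": 3, "M": 10}},
    "offense_superhard": {"ally": {"SG Tank": 1, "Tank": 2, "Mar": 2, "M": 4}, "enemy": OFFENSE_ENEMY},
}


def unit_totals(map_name):
    """Return (number of ally units, number of enemy units) for a map."""
    entry = MAPS[map_name]
    return sum(entry["ally"].values()), sum(entry["enemy"].values())


for name in MAPS:
    allies, enemies = unit_totals(name)
    print(f"{name:22s} allies={allies:2d} enemies={enemies:2d}")
```

Read this way, the defense maps put the allies at a numerical disadvantage (for example 5 against 7 in `defense_infantry`), while the regular offense maps give the allies more units than the enemy.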

🎮 Implemented Algorithms

Algorithm	Category	Paper Links
`IQL`	Value based	paper
`VDN`	Value based	paper
`QTRAN`	Value based	paper
`QMIX`	Value based	paper
`DIQL`	Distributional Value based	paper
`DDN`	Distributional Value based	paper
`DMIX`	Distributional Value based	paper
`DRIMA`	Distributional Value based	paper
`COMA`	Policy Gradient based	paper
`MASAC`	Policy Gradient based	paper
`MADDPG`	Policy Gradient based	paper
`MAPPO`	Policy Gradient based	paper

⚙️ Installation instructions

- Please pay attention to the version of SC2 you are using for your experiments. 
- Performance is *not* always comparable between versions. 
- The results in SMAC (https://arxiv.org/abs/1902.04043) use SC2.4.6.2.69232 not SC2.4.10.
- wget http://blzdistsc2-a.akamaihd.net/Linux/SC2.4.6.2.69232.zip

1️⃣ Cloning SMAC_Exp

git clone https://github.com/osilab-kaist/smac_plus.git

2️⃣ Download and set up StarCraft II

bash install_sc2.sh

- This downloads SC2 into the pymarl/3rdparty folder. Alternatively, create a symbolic link there that points to an existing SC2 installation.

3️⃣ Install required packages

- The requirements.txt file can be used to install the necessary packages into a virtual environment (not recommended).
- After installing the requirements, install a PyTorch build suitable for your environment.

4️⃣ Move map directories to the StarCraft II map directory

- Move SMAC_Maps / SMAC_Plus_Maps directories to StarCraftII/Maps/.

mv SMAC_Plus_Maps ./pymarl/3rdparty/StarCraftII/Maps/
mv SMAC_Maps ./pymarl/3rdparty/StarCraftII/Maps/

- You should have a structure like this:

smac_exp
├── pymarl
│   ├── docker
│   ├── 3rdparty
│   │   └── StarCraftII 
│   │       ├── Maps
│   │       │   ├── SMAC_Maps
│   │       │   └── SMAC_Plus_Maps
│   │       └── ...
│   ├── src
│   └── results
├── smac_plus
├── requirements.txt
└── install_sc2.sh
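
A quick way to confirm that step 4 worked is to check that both map directories sit under `StarCraftII/Maps`. The snippet below is a minimal sketch that assumes the layout shown above; `missing_map_dirs` is a hypothetical helper written for this README, not a script shipped with the repository.

```python
from pathlib import Path

# Map directories that the layout above expects under StarCraftII/Maps.
EXPECTED_MAP_DIRS = ("SMAC_Maps", "SMAC_Plus_Maps")


def missing_map_dirs(repo_root):
    """Return the expected map directories that are absent under repo_root.

    Assumes SC2 lives at <repo_root>/pymarl/3rdparty/StarCraftII, as in the
    directory tree above. An empty list means both directories were found.
    """
    maps_root = Path(repo_root) / "pymarl" / "3rdparty" / "StarCraftII" / "Maps"
    return [name for name in EXPECTED_MAP_DIRS if not (maps_root / name).is_dir()]


# Run from the repository root; prints whichever directories still need moving.
missing = missing_map_dirs(".")
print("missing map directories:", missing if missing else "none")
```

If SC2 is installed elsewhere and reached through a symbolic link, the check still works, because `is_dir()` follows links.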

🏃Run an experiment

Episode experience buffer

cd ./pymarl
python src/main.py --alg=qmix --env-config=smac_plus with env_args.map_name=offense_hard
python src/main.py --alg=qmix --env-config=smac with env_args.map_name=2s3z

Parallel experience buffer

cd ./pymarl
python src/main.py --alg=qmix --env-config=smac_plus with env_args.map_name=offense_hard runner=parallel batch_size_run=20
python src/main.py --alg=qmix --env-config=smac with env_args.map_name=2s3z runner=parallel batch_size_run=20

- The config files act as defaults for an algorithm or environment.
- They are all located in src/config.
- --config refers to the config files in src/config/algs.
- --env-config refers to the config files in src/config/envs.
- All results will be stored in the results folder.
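
When running several maps or switching between the episode and parallel runners, it can help to generate the command lines instead of editing them by hand. The sketch below only reproduces the argument pattern of the commands shown above; `build_train_command` is a hypothetical helper, and it does not launch anything itself.

```python
import shlex


def build_train_command(alg, env_config, map_name, parallel=False, batch_size_run=20):
    """Build the argument list for one training run, mirroring the commands above.

    The flags and key=value overrides are copied from this README; the function
    itself is an illustrative helper, not part of PyMARL or SMAC_Exp.
    """
    args = [
        "python", "src/main.py",
        f"--alg={alg}",
        f"--env-config={env_config}",
        "with",
        f"env_args.map_name={map_name}",
    ]
    if parallel:
        # Parallel experience buffer: same overrides as in the example above.
        args += ["runner=parallel", f"batch_size_run={batch_size_run}"]
    return args


# Print one parallel-runner command per map; run them from ./pymarl.
for map_name in ("offense_near", "offense_distant", "offense_complicated"):
    print(shlex.join(build_train_command("qmix", "smac_plus", map_name, parallel=True)))
```

The returned list can be passed directly to `subprocess.run(..., cwd="pymarl")` if you prefer to launch the runs from Python.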

🏃Run a test

cd ./pymarl
python src/main.py --alg=qmix --env-config=smac_plus with env_args.map_name=offense_hard save_replay=True checkpoint_path={checkpoint_dir_path} load_step={n_steps} test_nepisode={n_test}
python src/main.py --alg=qmix --env-config=smac with env_args.map_name=2s3z save_replay=True checkpoint_path={checkpoint_dir_path} load_step={n_steps} test_nepisode={n_test}

Logs and Checkpoints

While developing this benchmark, starting in February 2022, we ran into an unexpected difficulty: some stored result files were deleted owing to a malfunction of our computing resources.
Therefore, despite our efforts to restore these files, we can provide only part of the pretrained checkpoints and tensorboard logs. Please refer to this URL (https://url.kr/92bp83).
In their place, we provide the complete training curves and test scores of all algorithms. Please see the provided csv files (https://url.kr/mak6gq).

Par : Parallel experience buffer
Seq : Sequential experience buffer

O : Existence of both tensorboard logs and checkpoints.
▵ : Existence of either tensorboard logs or checkpoints.
X : Absence of tensorboard logs and checkpoints.

Defense scenarios

	(Par)Def_infantry			(Par)Def_armored			(Par)Def_outnumbered
	1	2	3	1	2	3	1	2	3
IQL	O	O	O	O	O	O	O	O	O
VDN	-	-	-	-	-	-	-	-	-
QMIX	-	-	-	-	-	-	-	-	-
QTRAN	-	-	-	-	-	-	-	-	-
COMA	O	O	O	O	O	O	O	O	O
MASAC	O	O	O	O	O	O	O	O	O
MAPPO	-	-	-	-	-	-	-	-	-
DIQL	O	O	O	O	O	O	O	O	O
DDN	-	-	-	-	-	-	-	-	-
DMIX	-	-	-	-	-	-	-	-	-
DRIMA	-	-	-	-	-	-	-	-	-

	(Seq)Def_infantry			(Seq)Def_armored			(Seq)Def_outnumbered
	1	2	3	1	2	3	1	2	3
IQL	O	O	O	O	O	O	O	O	O
VDN	O	O	O	O	O	O	-	-	-
QMIX	O	O	O	O	O	-	O	O	-
QTRAN	O	O	O	O	O	O	-	-	-
COMA	O	O	-	O	O	O	O	O	O
MADDPG	O	O	-	O	O	-	O	O	O
MASAC	O	O	O	O	O	O	O	O	O
MAPPO	-	-	-	-	-	-	-	-	-
DIQL	O	O	O	O	O	O	O	O	O
DDN	O	O	O	O	O	O	O	O	O
DMIX	O	O	O	O	O	O	O	O	O
DRIMA	O	O	-	O	▵(Model)	-	O	▵(Model)	-

Offensive scenarios

	(Par)Off_near			(Par)Off_distant			(Par)Off_complicated
	1	2	3	1	2	3	1	2	3
IQL	O	O	O	O	O	O	O	O	O
VDN	-	-	-	-	-	-	-	-	-
QMIX	-	-	-	-	-	-	-	-	-
QTRAN	-	-	-	-	-	-	-	-	-
COMA	O	O	O	O	O	O	O	O	O
MASAC	O	O	O	O	O	O	O	O	O
MAPPO	-	-	-	-	-	-	-	-	-
DIQL	O	O	O	O	O	O	O	O	O
DDN	-	-	-	-	-	-	-	-	-
DMIX	-	-	-	-	-	-	-	-	-
DRIMA	-	-	-	-	-	-	-	-	-

	(Seq)Off_near			(Seq)Off_distant			(Seq)Off_complicated
	1	2	3	1	2	3	1	2	3
IQL	-	-	-	-	-	-	-	-	-
VDN	-	-	-	-	-	-	-	-	-
QMIX	▵(Model)	▵(Model)	-	O	-	O	O	O	-
QTRAN	-	-	-	-	-	-	-	-	-
COMA	O	O	O	O	O	O	-	-	-
MADDPG	O	O	O	-	-	-	O	O	-
MASAC	-	-	-	-	-	-	-	-	-
MAPPO	-	-	-	-	-	-	-	-	-
DIQL	-	-	-	-	-	-	-	-	-
DDN	-	-	-	-	-	-	-	-	-
DMIX	-	-	-	-	-	-	-	-	-
DRIMA	-	▵(Model)	O	O	O	-	O	O	O

Challenging scenarios

	(Par)Off_hard			(Par)Off_superhard
	1	2	3	1	2	3
IQL	O	O	O	O	O	O
VDN	O	O	O	▵(Model)	O	O
QMIX	O	O	O	O	O	O
QTRAN	O	O	O	-	-	-
COMA	O	O	O	O	O	O
MASAC	O	O	O	O	O	O
MAPPO	-	-	-	-	-	-
DIQL	O	O	O	O	O	O
DDN	O	O	O	-	-	-
DMIX	O	O	O	O	O	O
DRIMA	-	-	-	-	-	-

	(Seq)Off_hard			(Seq)Off_superhard
	1	2	3	1	2	3
IQL	-	-	-	-	-	-
VDN	-	-	-	-	-	-
QMIX	O	O	O	O	O	O
QTRAN	-	-	-	-	-	-
COMA	O	O	O	O	O	O
MADDPG	-	-	-	O	O	O
MASAC	-	-	-	-	-	-
MAPPO	-	-	-	-	-	-
DIQL	-	-	-	O	O	O
DDN	-	-	-	O	O	O
DMIX	-	-	-	O	O	O
DRIMA	▵(Model)	▵(Model)	▵(Model)	O	O	O
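
To summarize how many runs of an algorithm are fully available, a row from the tables above can be tallied by mark. This is a rough sketch under one assumption that the legend does not state explicitly: a `-` cell is treated as a missing run. `count_availability` is a hypothetical helper, and the sample row is the DRIMA row of the sequential defense table.

```python
def count_availability(row):
    """Tally the marks in one tab-separated table row.

    Returns (algorithm name, counts), where counts has the keys:
      "full"    - cells marked O (logs and checkpoints both exist)
      "partial" - cells starting with the triangle mark (only one of the two exists)
      "none"    - any other cell; treating "-" as missing is an assumption
    """
    cells = row.split("\t")
    name, marks = cells[0], cells[1:]
    counts = {"full": 0, "partial": 0, "none": 0}
    for mark in marks:
        if mark == "O":
            counts["full"] += 1
        elif mark.startswith("▵"):
            counts["partial"] += 1
        else:
            counts["none"] += 1
    return name, counts


# DRIMA row from the (Seq) defense table above.
row = "DRIMA\tO\tO\t-\tO\t▵(Model)\t-\tO\t▵(Model)\t-"
name, counts = count_availability(row)
print(name, counts)
```

For this row the tally is 4 fully available runs, 2 partial runs, and 3 missing runs out of 9.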

🤝 License

The original SMAC environment and PyMARL code are released under the MIT license and the Apache 2.0 license, respectively. The proposed SMAC-Exp environment and the modified PyMARL code are likewise released under the MIT license and the Apache 2.0 license, respectively.

📌 Citation

@ARTICLE{10099458,
  author={Kim, Mingyu and Oh, Jihwan and Lee, Yongsik and Kim, Joonkee and Kim, Seonghwan and Chong, Song and Yun, Seyoung},
  journal={IEEE Access}, 
  title={The StarCraft Multi-Agent Exploration Challenges: Learning Multi-Stage Tasks and Environmental Factors Without Precise Reward Functions}, 
  year={2023},
  volume={11},
  number={},
  pages={37854-37868},
  doi={10.1109/ACCESS.2023.3266652}}
