The StarCraft Multi-Agent Exploration Challenges

Benchmarks for Efficient Exploration of Multi-Stage Task Completion and Utilization of Environmental Factors

SMAC-Exp Offense SMAC-Exp Defense

🧚 Preview

The StarCraft Multi-Agent Exploration Challenges: Learning Multi-Stage Tasks and Environmental Factors without Precise Reward Functions.
Mingyu Kim*, Jihwan Oh*, Yongsik Lee, Joonkee Kim, Seonghwan Kim, Song Chong, Se-Young Yun.
(*: equal contribution)


✍️ Notice

  • PyMARL: A framework for deep multi-agent reinforcement learning, built on PyTorch.
  • SMAC: A set of environments for research on collaborative multi-agent reinforcement learning (MARL), based on Blizzard's StarCraft II RTS game. Our work builds on both of these.
  • SMAC_Exp: Provides its own environments and runs training and testing of RL algorithms with PyMARL on both SMAC and SMAC_Exp.

🆚 SMAC vs SMAC_Exp

  • SMAC is the standard benchmark for multi-agent reinforcement learning. It mainly tests whether all agents can cooperatively eliminate approaching adversaries through fine-grained unit control under an explicit reward function.

  • SMAC_Exp contains 8 maps in three categories: defense, offense, and challenging. It targets the exploration capability of MARL algorithms: agents must efficiently learn implicit multi-stage tasks and environmental factors in addition to micro-control.

| Main Issues | SMAC | SMAC_Exp |
|---|---|---|
| Agents micro-control | O | O |
| Multi-stage tasks | | O |
| Environmental factors | | O |

🌏 Maps

  • SG, Mar, and M stand for Siege Tank, Marauder, and Marine units, respectively.

Defense

| Name | Ally Units | Enemy Units | Opponents approach |
|---|---|---|---|
| defense_infantry | 1 Mar & 4 M | 1 Mar & 6 M | One-sided |
| defense_armored | 1 SG Tank, 1 Tank, 1 Mar & 5 M | 2 Tank, 2 Mar & 9 M | Two-sided |
| defense_outnumbered | 1 SG Tank, 1 Tank, 1 Mar & 5 M | 2 Tank, 3 Mar & 10 M | Two-sided |

Offense

| Name | Ally Units | Enemy Units | Distance & formation |
|---|---|---|---|
| offense_near | 3 SG Tank, 3 Tank, 3 Mar & 4 M | 1 SG Tank, 2 Tank, 2 Mar & 4 M | Near & Spread |
| offense_distant | 3 SG Tank, 3 Tank, 3 Mar & 4 M | 1 SG Tank, 2 Tank, 2 Mar & 4 M | Distant & Spread |
| offense_complicated | 3 SG Tank, 3 Tank, 3 Mar & 4 M | 1 SG Tank, 2 Tank, 2 Mar & 4 M | Complicated & Spread |

Challenging

| Name | Ally Units | Enemy Units | Opponents approach |
|---|---|---|---|
| defense_superhard | 1 SG Tank, 1 Tank, 1 Mar & 5 M | 2 Tank, 3 Mar & 10 M | Two-sided |

| Name | Ally Units | Enemy Units | Distance & formation |
|---|---|---|---|
| offense_superhard | 1 SG Tank, 2 Tank, 2 Mar & 4 M | 1 SG Tank, 2 Tank, 2 Mar & 4 M | Complicated & Gathered |
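
To make the force balance of each map concrete, the unit strings in the tables above can be tallied per side. The following is a minimal sketch: the helper names `count_units` and `total_units` are illustrative and not part of SMAC_Exp, and the compositions are copied from the tables for a few of the maps.

```python
import re

# Unit compositions copied from the map tables above as (ally, enemy).
MAPS = {
    "defense_infantry": ("1 Mar & 4 M", "1 Mar & 6 M"),
    "defense_armored": ("1 SG Tank, 1 Tank, 1 Mar & 5 M", "2 Tank, 2 Mar & 9 M"),
    "defense_outnumbered": ("1 SG Tank, 1 Tank, 1 Mar & 5 M", "2 Tank, 3 Mar & 10 M"),
    "offense_near": ("3 SG Tank, 3 Tank, 3 Mar & 4 M", "1 SG Tank, 2 Tank, 2 Mar & 4 M"),
    "offense_superhard": ("1 SG Tank, 2 Tank, 2 Mar & 4 M", "1 SG Tank, 2 Tank, 2 Mar & 4 M"),
}


def count_units(composition):
    """Parse a string such as '1 SG Tank, 1 Tank, 1 Mar & 5 M' into {unit: count}."""
    counts = {}
    for part in re.split(r",|&", composition):
        number, unit = part.strip().split(" ", 1)
        counts[unit] = counts.get(unit, 0) + int(number)
    return counts


def total_units(composition):
    """Total number of units described by a composition string."""
    return sum(count_units(composition).values())


for name, (ally, enemy) in MAPS.items():
    print(f"{name}: {total_units(ally)} allies vs {total_units(enemy)} enemies")
```

For example, `defense_outnumbered` comes out as 8 allied units against 15 enemy units, while `offense_superhard` is an even 9 against 9.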

🎮 Implemented Algorithms

| Algorithm | Category | Paper Links |
|---|---|---|
| IQL | Value based | paper |
| VDN | Value based | paper |
| QTRAN | Value based | paper |
| QMIX | Value based | paper |
| DIQL | Distributional Value based | paper |
| DDN | Distributional Value based | paper |
| DMIX | Distributional Value based | paper |
| DRIMA | Distributional Value based | paper |
| COMA | Policy Gradient based | paper |
| MASAC | Policy Gradient based | paper |
| MADDPG | Policy Gradient based | paper |
| MAPPO | Policy Gradient based | paper |

⚙️ Installation instructions

- Pay attention to the version of SC2 you use for your experiments.
- Performance is *not* always comparable between versions.
- The results in SMAC (https://arxiv.org/abs/1902.04043) use SC2.4.6.2.69232, not SC2.4.10.
- wget http://blzdistsc2-a.akamaihd.net/Linux/SC2.4.6.2.69232.zip

1️⃣ Cloning SMAC_Exp

git clone https://github.com/osilab-kaist/smac_plus.git

2️⃣ Download and set up StarCraft II

bash install_sc2.sh

       - This downloads SC2 into the pymarl/3rdparty folder. Alternatively, place a symbolic link there that points to an existing SC2 installation.


3️⃣ Install required packages

      - The requirements.txt file can be used to install the necessary packages into a virtual environment (not recommended).
      - After installing the requirements, install a version of torch suitable for your environment.

4️⃣ Move map directories to StarCraftII map directory

      - Move SMAC_Maps / SMAC_Plus_Maps directories to StarCraftII/Maps/.

mv SMAC_Plus_Maps ./pymarl/3rdparty/StarCraftII/Maps/
mv SMAC_Maps ./pymarl/3rdparty/StarCraftII/Maps/

       - You should have a structure like this:

smac_exp
├── pymarl
│   ├── docker
│   ├── 3rdparty
│   │   └── StarCraftII 
│   │       ├── Maps
│   │       │   ├── SMAC_Maps
│   │       │   └── SMAC_Plus_Maps
│   │       └── ...
│   ├── src
│   └── results
├── smac_plus
├── requirements.txt
└── install_sc2.sh
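
Before launching a run, it can help to confirm that the maps ended up where the tree above expects them. The following is a minimal sketch, not part of the repository: `missing_paths` is a hypothetical helper, and the path list is taken from the tree above.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the repository root.
EXPECTED_PATHS = [
    "pymarl/src",
    "pymarl/3rdparty/StarCraftII/Maps/SMAC_Maps",
    "pymarl/3rdparty/StarCraftII/Maps/SMAC_Plus_Maps",
    "requirements.txt",
    "install_sc2.sh",
]


def missing_paths(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

Run it from the repository root; any path it reports as missing points to an installation step that still needs to be completed.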


🏃Run an experiment

  • Episode experience buffer
cd ./pymarl
python src/main.py --alg=qmix --env-config=smac_plus with env_args.map_name=offense_hard
python src/main.py --alg=qmix --env-config=smac with env_args.map_name=2s3z
  • Parallel experience buffer
cd ./pymarl
python src/main.py --alg=qmix --env-config=smac_plus with env_args.map_name=offense_hard runner=parallel batch_size_run=20
python src/main.py --alg=qmix --env-config=smac with env_args.map_name=2s3z runner=parallel batch_size_run=20

      - The config files act as defaults for an algorithm or environment.
      - They are all located in src/config.
      - --config refers to the config files in src/config/algs.
      - --env-config refers to the config files in src/config/envs.
      - All results will be stored in the results folder.
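
When sweeping over several algorithms or maps, the commands above can be generated programmatically. The following is a minimal sketch: `build_command` is a hypothetical helper, and it only reuses the flags and `with key=value` overrides shown in the commands above.

```python
import shlex


def build_command(alg, env_config, map_name, **overrides):
    """Assemble a training command in the form shown above.

    Extra keyword arguments become `key=value` overrides after `with`.
    """
    args = [
        "python", "src/main.py",
        f"--alg={alg}",
        f"--env-config={env_config}",
        "with",
        f"env_args.map_name={map_name}",
    ]
    args += [f"{key}={value}" for key, value in overrides.items()]
    return args


# Parallel-runner variant of the QMIX example above.
cmd = build_command("qmix", "smac_plus", "offense_hard",
                    runner="parallel", batch_size_run=20)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, cwd="pymarl")` so that the relative `src/main.py` path resolves as in the commands above.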


🏃Run a test

cd ./pymarl
python src/main.py --alg=qmix --env-config=smac_plus with env_args.map_name=offense_hard save_replay=True checkpoint_path={checkpoint_dir_path} load_step={n_steps} test_nepisode={n_test}
python src/main.py --alg=qmix --env-config=smac with env_args.map_name=2s3z save_replay=True checkpoint_path={checkpoint_dir_path} load_step={n_steps} test_nepisode={n_test}
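
The `load_step` value selects which saved checkpoint is evaluated. The sketch below illustrates one plausible selection rule under stated assumptions, modeled on upstream PyMARL: checkpoints are stored in subdirectories of `checkpoint_path` named by their timestep, `load_step=0` picks the latest one, and any other value picks the nearest saved step. `pick_checkpoint` is a hypothetical helper; verify the actual rule against the runner code under `pymarl/src` before relying on it.

```python
import os


def pick_checkpoint(checkpoint_path, load_step=0):
    """Return the timestep to load (assumed layout: <checkpoint_path>/<timestep>/)."""
    timesteps = [
        int(name)
        for name in os.listdir(checkpoint_path)
        if name.isdigit() and os.path.isdir(os.path.join(checkpoint_path, name))
    ]
    if not timesteps:
        raise FileNotFoundError(f"no checkpoints found in {checkpoint_path}")
    if load_step == 0:
        return max(timesteps)  # assumption: 0 means "latest checkpoint"
    # Otherwise choose the saved step closest to the requested one.
    return min(timesteps, key=lambda t: abs(t - load_step))
```

Under this assumption, a directory holding checkpoints `1000`, `2000`, and `5000` would resolve `load_step=2400` to `2000`.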

Logs and Checkpoints

  • While developing the benchmark, starting in February 2022, we unexpectedly lost some stored result files because of a malfunction of our computing resources.
  • We restored some of these files, but we can provide only part of the pretrained checkpoints and TensorBoard logs. Please refer to this URL (https://url.kr/92bp83).
  • In their place, we provide the complete training curves and test scores of all algorithms. Please see the provided CSV files (https://url.kr/mak6gq).
Par : Parallel experience buffer
Seq : Sequential experience buffer
O : Existence of tensorboard logs and checkpoints.
▵ : Existence of either tensorboard logs or checkpoints.
X : Absence of tensorboard logs and checkpoints.
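
To get a quick count of what is available for an algorithm, a row of the tables below can be summarized with the legend above. This is a minimal sketch with a hypothetical helper, `summarize_row`. The tables mark empty cells with `-`, which the sketch assumes is equivalent to `X`.

```python
# Legend symbols from the list above. The tables mark empty cells with "-",
# which this sketch assumes is equivalent to "X".
LEGEND = {
    "O": "tensorboard logs and checkpoints",
    "▵": "either tensorboard logs or checkpoints",
    "X": "neither",
}


def summarize_row(row):
    """Count availability symbols in a row such as 'QMIX O O O O O - O O -'."""
    algorithm, *cells = row.split()
    counts = {symbol: 0 for symbol in LEGEND}
    for cell in cells:
        symbol = "X" if cell == "-" else cell[0]  # '▵(Model)' counts as '▵'
        counts[symbol] += 1
    return algorithm, counts


print(summarize_row("DRIMA O O - O ▵(Model) - O ▵(Model) -"))
```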

Defense scenarios

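| Algorithm | (Par)Def_infantry 1 | 2 | 3 | (Par)Def_armored 1 | 2 | 3 | (Par)Def_outnumbered 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|---|---|
| IQL | O | O | O | O | O | O | O | O | O |
| VDN | - | - | - | - | - | - | - | - | - |
| QMIX | - | - | - | - | - | - | - | - | - |
| QTRAN | - | - | - | - | - | - | - | - | - |
| COMA | O | O | O | O | O | O | O | O | O |
| MASAC | O | O | O | O | O | O | O | O | O |
| MAPPO | - | - | - | - | - | - | - | - | - |
| DIQL | O | O | O | O | O | O | O | O | O |
| DDN | - | - | - | - | - | - | - | - | - |
| DMIX | - | - | - | - | - | - | - | - | - |
| DRIMA | - | - | - | - | - | - | - | - | - |

| Algorithm | (Seq)Def_infantry 1 | 2 | 3 | (Seq)Def_armored 1 | 2 | 3 | (Seq)Def_outnumbered 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|---|---|
| IQL | O | O | O | O | O | O | O | O | O |
| VDN | O | O | O | O | O | O | - | - | - |
| QMIX | O | O | O | O | O | - | O | O | - |
| QTRAN | O | O | O | O | O | O | - | - | - |
| COMA | O | O | - | O | O | O | O | O | O |
| MADDPG | O | O | - | O | O | - | O | O | O |
| MASAC | O | O | O | O | O | O | O | O | O |
| MAPPO | - | - | - | - | - | - | - | - | - |
| DIQL | O | O | O | O | O | O | O | O | O |
| DDN | O | O | O | O | O | O | O | O | O |
| DMIX | O | O | O | O | O | O | O | O | O |
| DRIMA | O | O | - | O | ▵(Model) | - | O | ▵(Model) | - |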

Offensive scenarios

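| Algorithm | (Par)Off_near 1 | 2 | 3 | (Par)Off_distant 1 | 2 | 3 | (Par)Off_complicated 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|---|---|
| IQL | O | O | O | O | O | O | O | O | O |
| VDN | - | - | - | - | - | - | - | - | - |
| QMIX | - | - | - | - | - | - | - | - | - |
| QTRAN | - | - | - | - | - | - | - | - | - |
| COMA | O | O | O | O | O | O | O | O | O |
| MASAC | O | O | O | O | O | O | O | O | O |
| MAPPO | - | - | - | - | - | - | - | - | - |
| DIQL | O | O | O | O | O | O | O | O | O |
| DDN | - | - | - | - | - | - | - | - | - |
| DMIX | - | - | - | - | - | - | - | - | - |
| DRIMA | - | - | - | - | - | - | - | - | - |

| Algorithm | (Seq)Off_near 1 | 2 | 3 | (Seq)Off_distant 1 | 2 | 3 | (Seq)Off_complicated 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|---|---|
| IQL | - | - | - | - | - | - | - | - | - |
| VDN | - | - | - | - | - | - | - | - | - |
| QMIX | ▵(Model) | ▵(Model) | - | O | - | O | O | O | - |
| QTRAN | - | - | - | - | - | - | - | - | - |
| COMA | O | O | O | O | O | O | - | - | - |
| MADDPG | O | O | O | - | - | - | O | O | - |
| MASAC | - | - | - | - | - | - | - | - | - |
| MAPPO | - | - | - | - | - | - | - | - | - |
| DIQL | - | - | - | - | - | - | - | - | - |
| DDN | - | - | - | - | - | - | - | - | - |
| DMIX | - | - | - | - | - | - | - | - | - |
| DRIMA | - | ▵(Model) | O | O | O | - | O | O | O |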

Challenging scenarios

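| Algorithm | (Par)Off_hard 1 | 2 | 3 | (Par)Off_superhard 1 | 2 | 3 |
|---|---|---|---|---|---|---|
| IQL | O | O | O | O | O | O |
| VDN | O | O | O | ▵(Model) | O | O |
| QMIX | O | O | O | O | O | O |
| QTRAN | O | O | O | - | - | - |
| COMA | O | O | O | O | O | O |
| MASAC | O | O | O | O | O | O |
| MAPPO | - | - | - | - | - | - |
| DIQL | O | O | O | O | O | O |
| DDN | O | O | O | - | - | - |
| DMIX | O | O | O | O | O | O |
| DRIMA | - | - | - | - | - | - |

| Algorithm | (Seq)Off_hard 1 | 2 | 3 | (Seq)Off_superhard 1 | 2 | 3 |
|---|---|---|---|---|---|---|
| IQL | - | - | - | - | - | - |
| VDN | - | - | - | - | - | - |
| QMIX | O | O | O | O | O | O |
| QTRAN | - | - | - | - | - | - |
| COMA | O | O | O | O | O | O |
| MADDPG | - | - | - | O | O | O |
| MASAC | - | - | - | - | - | - |
| MAPPO | - | - | - | - | - | - |
| DIQL | - | - | - | O | O | O |
| DDN | - | - | - | O | O | O |
| DMIX | - | - | - | O | O | O |
| DRIMA | ▵(Model) | ▵(Model) | ▵(Model) | O | O | O |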

🤝 License

  • The original SMAC environment and PyMARL code are released under the MIT license and the Apache 2.0 license, respectively. The proposed SMAC-Exp environment and the modified PyMARL code are likewise released under the MIT license and the Apache 2.0 license, respectively.

📌 Citation

@ARTICLE{10099458,
  author={Kim, Mingyu and Oh, Jihwan and Lee, Yongsik and Kim, Joonkee and Kim, Seonghwan and Chong, Song and Yun, Seyoung},
  journal={IEEE Access}, 
  title={The StarCraft Multi-Agent Exploration Challenges: Learning Multi-Stage Tasks and Environmental Factors Without Precise Reward Functions}, 
  year={2023},
  volume={11},
  number={},
  pages={37854-37868},
  doi={10.1109/ACCESS.2023.3266652}}