Asynchronous Advantage Actor-Critic (A3C)

🚧👷🛑 Under Construction!!!

Table of Contents

  1. Overview
  2. Setup
  3. Results
  4. Analysis
  5. Acknowledgements

Overview

This repository contains an implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm using PyTorch. A3C is a reinforcement learning method that leverages parallel actor-learners to stabilize and speed up training, providing faster convergence and improved performance in complex environments. The algorithm is evaluated on various Atari environments using Gymnasium.
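The heart of A3C is that each parallel actor-learner rolls out a few steps, computes discounted n-step returns bootstrapped from the critic's value estimate, and uses the resulting advantages to update a shared global network. As an illustrative sketch (not the code in this repository), the return computation might look like:

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted n-step returns, bootstrapping from the critic's
    value estimate of the state following the last step."""
    R = bootstrap_value
    returns = []
    for r in reversed(rewards):
        R = r + gamma * R       # work backwards through the rollout
        returns.append(R)
    returns.reverse()
    return returns

# Example rollout of 3 steps; each worker would form advantages
# A(s, a) = R - V(s) from these returns, then push policy and
# value gradients to the shared global network.
rewards = [1.0, 0.0, 1.0]
returns = n_step_returns(rewards, bootstrap_value=0.5, gamma=0.99)
```

The function names and values here are hypothetical; the actual implementation computes the same quantities as PyTorch tensors inside each worker process.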

Setup

Required Dependencies

It's recommended to use a Conda environment to manage dependencies and avoid conflicts. You can create and activate a new Conda environment with the following commands:

conda create -n rl python=3.11
conda activate rl

After activating the environment, install the required dependencies using:

pip install -r requirements.txt

Running the Algorithm

You can run the A3C algorithm on any supported Gymnasium Atari environment with a discrete action space using the following command:

python main.py --env 'MsPacmanNoFrameskip-v4'

Command-Line Arguments

  • Environment Selection: Use -e or --env to specify the Gymnasium environment. The default is None, so you must specify an environment.

    Example:

    python main.py --env 'PongNoFrameskip-v4'
  • Number of Training Episodes: Use --n_games to specify the number of games the agent should play during training.

    Example:

    python main.py --n_games 5000
  • Parallel Environments: Use --n_envs to specify the number of parallel environments to run during training. The default is 4.

    Example:

    python main.py --env 'AsterixNoFrameskip-v4' --n_envs 16

These command-line options let you tailor the training run to your hardware and environment of interest, while the Conda environment keeps the dependencies isolated and reproducible.
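For reference, a parser covering the flags documented above could be defined with `argparse` as sketched below. This is a hypothetical reconstruction; the actual `main.py` may use different defaults or define additional options.

```python
import argparse

def build_parser():
    # Mirrors the documented flags: -e/--env, --n_games, --n_envs.
    parser = argparse.ArgumentParser(description="A3C training")
    parser.add_argument("-e", "--env", type=str, default=None,
                        help="Gymnasium Atari environment id (required)")
    parser.add_argument("--n_games", type=int, default=1000,
                        help="number of games to play during training "
                             "(default here is illustrative)")
    parser.add_argument("--n_envs", type=int, default=4,
                        help="number of parallel environments")
    return parser

args = build_parser().parse_args(
    ["--env", "PongNoFrameskip-v4", "--n_envs", "8"]
)
```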

Results

AirRaid

Alien

Amidar

Assault

Asterix

Asteroids

Atlantis

BankHeist

BattleZone

BeamRider

Berzerk

Bowling

Boxing

Breakout

Carnival

Centipede

ChopperCommand

CrazyClimber

Analysis

Acknowledgements

Special thanks to Phil Tabor, an excellent teacher! I highly recommend his YouTube channel.
