Releases · hill-a/stable-baselines
Flexible Custom MLP Policies + bug fixes
- added support for storing model in a file-like object (thanks to @erniejunior)
- fixed wrong image detection when using tensorboard logging with DQN
- fixed bug in PPO2 when passing a non-callable lr after loading
- fixed tensorboard logging in ppo2 when nminibatches=1
- added early stopping via callback return value (@erniejunior)
- added more flexible custom MLP policies (@erniejunior) (see the sketch after this list)
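A minimal sketch of the two headline features, assuming the `FeedForwardPolicy`/`net_arch` interface and the `callback(locals_, globals_)` convention where returning `False` stops training; the layer sizes and stopping condition are illustrative only:

```python
from stable_baselines import PPO2
from stable_baselines.common.policies import FeedForwardPolicy

# custom MLP: one shared 128-unit layer, then separate policy/value heads
# (net_arch values are illustrative)
class CustomMlpPolicy(FeedForwardPolicy):
    def __init__(self, *args, **kwargs):
        super(CustomMlpPolicy, self).__init__(*args, **kwargs,
                                              net_arch=[128, dict(pi=[64], vf=[64])],
                                              feature_extraction="mlp")

# early stopping: returning False from the callback aborts training
n_calls = {"count": 0}
def stop_callback(_locals, _globals):
    n_calls["count"] += 1
    return n_calls["count"] < 100  # arbitrary cut-off, for illustration

model = PPO2(CustomMlpPolicy, "CartPole-v1", verbose=0)
model.learn(total_timesteps=100000, callback=stop_callback)
```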
Video Recorder
- added VecVideoRecorder to record mp4 videos from environment.
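A short usage sketch, assuming the `VecVideoRecorder(venv, video_folder, record_video_trigger, video_length)` interface; the output folder, trigger and lengths are placeholders:

```python
import gym
from stable_baselines.common.vec_env import DummyVecEnv, VecVideoRecorder

env = DummyVecEnv([lambda: gym.make("CartPole-v1")])
# record a 500-step mp4 starting at step 0
env = VecVideoRecorder(env, "videos/",
                       record_video_trigger=lambda step: step == 0,
                       video_length=500)

obs = env.reset()
for _ in range(501):
    obs, _, _, _ = env.step([env.action_space.sample()])
env.close()
```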
Hotfix PPO2
- Hotfix for PPO2: the wrong placeholder was used for the value function
Note: this bug was present since v1.0, so we recommend updating to the latest version of stable-baselines
New VecEnv Features
- added `async_eigen_decomp` parameter for ACKTR and set it to `False` by default (removes deprecation warnings)
- added methods for calling env methods/setting attributes inside a VecEnv (thanks to @bjmuld) (see the sketch after this list)
- updated gym minimum version
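A sketch of the new VecEnv helpers, assuming they are exposed as `env_method`, `get_attr` and `set_attr`; the method and attribute names used below are illustrative:

```python
import gym
from stable_baselines.common.vec_env import DummyVecEnv

env = DummyVecEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])

# call a method on every wrapped environment
env.env_method("seed", 0)
# read and set attributes inside the wrapped environments
specs = env.get_attr("spec")
env.set_attr("my_flag", True)  # hypothetical attribute, for illustration
```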
Contributors (since v2.0.0):
Thanks to @bjmuld @iambenzo @iandanforth @r7vme @brendenpetersen @huvar
Clean up dependencies + bug fix
- fixed MpiAdam synchronization issue in PPO1 (thanks to @brendenpetersen) issue #50
- fixed dependency issues (new mujoco-py requires a mujoco licence + gym broke MultiDiscrete space shape)
Bug fixes
WARNING: This version contains breaking changes, please read the full details
- added patch fix for equal function using gym.spaces.MultiDiscrete and gym.spaces.MultiBinary
- fixes for DQN action_probability (see the sketch after this list)
- re-added double DQN + refactored DQN policies (breaking changes)
- replaced async with async_eigen_decomp in ACKTR/KFAC for python 3.7 compatibility
- removed action clipping for prediction of continuous actions (see issue #36)
- fixed NaN issue due to clipping the continuous action in the wrong place (issue #36)
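A brief sketch of the re-added double DQN and the fixed action_probability; the `double_q` flag name is an assumption, and the timestep count is a placeholder:

```python
from stable_baselines import DQN

# `double_q` is the assumed flag controlling double DQN (expected on by default)
model = DQN("MlpPolicy", "CartPole-v1", double_q=True, verbose=0)
model.learn(total_timesteps=1000)

obs = model.get_env().reset()
# per-action probabilities for the discrete action space
probs = model.action_probability(obs)
```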
Tensorboard, refactoring and bug fixes
WARNING: This version contains breaking changes, please read the full details
- Renamed DeepQ to DQN (breaking changes)
- Renamed DeepQPolicy to DQNPolicy (breaking changes)
- fixed DDPG behavior (breaking changes)
- changed default policies for DDPG, so that DDPG now works correctly (breaking changes)
- added more documentation (some modules from common).
- added doc about using custom env
- added Tensorboard support for A2C, ACER, ACKTR, DDPG, DeepQ, PPO1, PPO2 and TRPO (see the sketch after this list)
- added episode reward to Tensorboard
- added documentation for Tensorboard usage
- added identity test environment for Box action space
- fixed render function ignoring parameters when using wrapped environments
- fixed PPO1 and TRPO done values for recurrent policies
- fixed image normalization not being applied when using image observations
- updated VecEnv objects for the new Gym version
- added test for DDPG
- refactored DQN policies
- added registry for policies, can be passed as string to the agent
- added documentation for custom policies + policy registration
- fixed numpy warning when using DDPG Memory
- fixed DummyVecEnv not copying the observation array when stepping and resetting
- added pre-built docker images + installation instructions
- added `deterministic` argument in the predict function
- added assert in PPO2 for recurrent policies
- fixed predict function to handle both vectorized and unwrapped environments
- added input check to the predict function
- refactored ActorCritic models to reduce code duplication
- refactored Off Policy models (to begin HER and replay_buffer refactoring)
- added tests for auto vectorization detection
- fixed render function to handle positional arguments
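A usage sketch for the Tensorboard support and the new `deterministic` prediction argument; the log directory and timestep count are placeholders:

```python
from stable_baselines import PPO2

# log training metrics (and episode reward) to Tensorboard
model = PPO2("MlpPolicy", "CartPole-v1",
             tensorboard_log="./ppo2_tensorboard/", verbose=0)
model.learn(total_timesteps=5000)

# deterministic prediction with the new argument
obs = model.get_env().reset()
action, _states = model.predict(obs, deterministic=True)
```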
Bug fixes and documentation
- added html documentation using sphinx + integration with read the docs
- cleaned up README + typos
- fixed normalization for DQN with images
- fixed DQN identity test
Refactored Stable Baselines
- refactored A2C, ACER, ACKTR, DDPG, DeepQ, GAIL, TRPO, PPO1 and PPO2 under a single constant class
- added callback to refactored algorithm training
- added saving and loading to refactored algorithms
- refactored ACER, DDPG, GAIL, PPO1 and TRPO to fit with A2C, PPO2 and ACKTR policies
- added new policies for most algorithms (Mlp, MlpLstm, MlpLnLstm, Cnn, CnnLstm and CnnLnLstm)
- added dynamic environment switching (so continual RL learning is now feasible)
- added prediction from observation and action probability from observation for all the algorithms
- fixed graph issues, so model names won't collide
- fixed behavior_clone weight loading for GAIL
- fixed Tensorflow using all the GPU VRAM
- fixed models so that they are all compatible with vectorized environments
- fixed `set_global_seed` to update the `gym.spaces` random seed
- fixed PPO1 and TRPO performance issues when learning identity function
- added new tests for loading, saving, continuous actions and learning the identity function
- fixed DQN wrapping for atari
- added saving and loading for VecNormalize wrapper
- added automatic detection of action space (for the policy network)
- fixed ACER buffer with constant values assuming n_stack=4
- fixed some RL algorithms not clipping the action to be in the action_space, when using `gym.spaces.Box`
- refactored algorithms can take either a `gym.Environment` or a `str` (if the environment name is registered) (see the sketch after this list)
- Hotfix in ACER (compared to v1.0.0)
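A minimal sketch of the refactored interface described above (registered env id strings, saving/loading, prediction and action probabilities); file names and timestep counts are placeholders:

```python
import gym
from stable_baselines import A2C
from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.vec_env import DummyVecEnv

# refactored agents accept either a gym environment or a registered env id string
model = A2C(MlpPolicy, "CartPole-v1", verbose=0)
model.learn(total_timesteps=1000)

# prediction and action probabilities from an observation
obs = model.get_env().reset()
action, _states = model.predict(obs)
probs = model.action_probability(obs)

# saving and loading; re-attach an environment after loading
model.save("a2c_cartpole")
env = DummyVecEnv([lambda: gym.make("CartPole-v1")])
model = A2C.load("a2c_cartpole", env=env)
```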
Future Work:
- Finish refactoring HER
- Refactor ACKTR and ACER for continuous implementation