v0.4.7
API Change
- remove the requirement for learn/collect/eval sub fields in the policy config (users can define their own config formats; see the sketch after this list)
- use wandb as the default logger in the task pipeline
- remove the value_network config field and its implementations in SAC and related algorithms
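Below is a minimal sketch of the relaxed policy config. The field names are illustrative rather than the exact DI-engine schema; the point is that the learn/collect/eval sub fields are no longer mandatory, so a flat, user-defined layout is accepted.

```python
# Illustrative policy config without the formerly required learn/collect/eval
# sub fields; the concrete keys below are examples, not the official schema.
from easydict import EasyDict

my_policy_cfg = EasyDict(dict(
    cuda=True,
    discount_factor=0.99,
    batch_size=64,
    learning_rate=1e-3,
    n_sample=96,      # would previously have lived under policy.collect
    eval_freq=1000,   # would previously have lived under policy.eval
))
```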
Env
- add dmc2gym env support and baseline (#451) (see the usage sketch after this list)
- update pettingzoo to the latest version (#597)
- fix icm/rnd+onppo config bugs and add app_door_to_key env (#564)
- add lunarlander continuous TD3/SAC config
- polish lunarlander discrete C51 config
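For the new dmc2gym support, the snippet below shows the upstream dmc2gym package interface that the env wrapper builds on; the DI-engine-side config keys are not shown here and may differ.

```python
# Minimal dmc2gym usage (upstream package API, old gym reset/step convention).
import dmc2gym

env = dmc2gym.make(domain_name='cartpole', task_name='swingup', seed=0)
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```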
Algorithm
- add Procedure Cloning (PC) imitation learning algorithm (#514)
- add Munchausen Reinforcement Learning (MDQN) algorithm (#590)
- add reward/value norm methods: popart & value rescale & symlog (#605) (reference formulas in the sketch after this list)
- polish reward model config and training pipeline (#624)
- add PPOF reward space demo support (#608)
- add PPOF Atari demo support (#589)
- polish DQN default config and env examples (#611)
- polish comments and clean up code for SAC
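For reference, the value rescale and symlog transforms added in #605 follow the standard definitions sketched below (the in-repo function names may differ, and the stateful popart normalizer is omitted here).

```python
import torch

def symlog(x: torch.Tensor) -> torch.Tensor:
    # symlog(x) = sign(x) * ln(|x| + 1), compresses large-magnitude targets
    return torch.sign(x) * torch.log(torch.abs(x) + 1)

def symexp(x: torch.Tensor) -> torch.Tensor:
    # inverse of symlog
    return torch.sign(x) * (torch.exp(torch.abs(x)) - 1)

def value_rescale(x: torch.Tensor, eps: float = 1e-2) -> torch.Tensor:
    # h(x) = sign(x) * (sqrt(|x| + 1) - 1) + eps * x (Pohlen et al., 2018)
    return torch.sign(x) * (torch.sqrt(torch.abs(x) + 1) - 1) + eps * x

def inverse_value_rescale(x: torch.Tensor, eps: float = 1e-2) -> torch.Tensor:
    # closed-form inverse of value_rescale
    return torch.sign(x) * (
        ((torch.sqrt(1 + 4 * eps * (torch.abs(x) + 1 + eps)) - 1) / (2 * eps)) ** 2 - 1
    )
```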
Enhancement
- add language model (e.g. GPT) training utils (#625)
- remove policy cfg sub-field requirements (#620)
- add full wandb support (#579) (see the logging sketch after this list)
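The wandb integration ultimately reduces to the standard wandb calls shown below; the DI-engine hook points and config switches are not reproduced here, and the project/metric names are placeholders.

```python
# Plain wandb logging loop (sketch only).
import wandb

wandb.init(project='di-engine-demo', config={'algo': 'dqn', 'env': 'CartPole-v1'})
for step in range(10):
    wandb.log({'train/return': float(step)}, step=step)
wandb.finish()
```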
Fix
- fix confusing shallow copy operation on next_obs (#641) (see the note after this list)
- fix unsqueeze action_args in PDQN when shape is 1 (#599)
- fix evaluator return_info tensor type bug (#592)
- fix deque buffer wrapper PER bug (#586)
- fix reward model save method compatibility bug
- fix logger assertion and unittest bug
- fix bfs test py3.9 compatibility bug
- fix zergling collector unittest bug
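The next_obs fix (#641) addresses a classic aliasing pitfall; the generic example below (not the DI-engine code) shows why a shallow copy of a nested observation can be silently corrupted by later in-place updates.

```python
import copy
import numpy as np

obs = {'state': np.zeros(3)}
transition = {'next_obs': copy.copy(obs)}      # shallow copy shares the array
obs['state'][0] = 1.0
print(transition['next_obs']['state'][0])      # 1.0 -> stored next_obs corrupted

transition = {'next_obs': copy.deepcopy(obs)}  # deep copy avoids the aliasing
obs['state'][0] = 2.0
print(transition['next_obs']['state'][0])      # still 1.0
```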
Style
- add DI-engine torch-rpc p2p communication docker (#628)
- add D4RL docker (#591)
- correct typo in task (#617)
- correct typo in time_helper (#602)
- polish README and add treetensor example
- update contributing doc
New Plan
- call for contributors to DI-engine (#621)
Full Changelog: v0.4.6...v0.4.7
Contributors: @PaParaZz1 @karroyan @zjowowen @ruoyuGao @kxzxvbk @nighood @song2181 @SolenoidWGT @PSHarold @jimmydengpeng @eltociear