Results on Box2D environments #22

balasurajp · 2021-02-09T13:17:05Z

I tried to benchmark the following environments ['BipedalWalker-v2', 'BipedalWalkerHardcore-v2', 'CarRacing-v0', 'LunarLander-v2', 'LunarLanderContinuous-v2'] using the ['A3C', 'DDPG', 'TD3', 'SAC', 'PG', 'TRPO', 'PPO', 'DPPO'] algorithms. Most of the combinations failed to learn the task and didn't converge. Only (SAC, LunarLanderContinuous-v2) and (TD3, LunarLanderContinuous-v2) learnt the task, and even those only sub-optimally. Can someone address this issue?
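For anyone who wants to reproduce the sweep, here is a minimal sketch of how the 8 x 5 grid above could be enumerated and its outcomes collected. It is only an illustration: `run_grid` and `train_fn` are hypothetical names, not RLzoo functions, and the actual training call has to be supplied by whoever runs it.

```python
from itertools import product

# Environment and algorithm names exactly as listed in the report above.
ENVS = ['BipedalWalker-v2', 'BipedalWalkerHardcore-v2', 'CarRacing-v0',
        'LunarLander-v2', 'LunarLanderContinuous-v2']
ALGS = ['A3C', 'DDPG', 'TD3', 'SAC', 'PG', 'TRPO', 'PPO', 'DPPO']


def run_grid(train_fn, algs=ALGS, envs=ENVS):
    """Call train_fn(alg, env) for every pair and collect the outcomes.

    train_fn is a hypothetical callable supplied by the caller; it is not
    part of RLzoo. Exceptions are recorded instead of aborting the sweep.
    """
    results = {}
    for alg, env in product(algs, envs):
        try:
            results[(alg, env)] = train_fn(alg, env)
        except Exception as exc:  # keep going so one failure does not hide the rest
            results[(alg, env)] = f"error: {exc}"
    return results


if __name__ == "__main__":
    # Stand-in trainer that only labels each pair; replace with real training.
    outcomes = run_grid(lambda alg, env: "not run")
    print(len(outcomes), "combinations")
```

Recording a per-pair outcome (final return, or the error message) this way makes it easier to report which combinations crashed and which ran but did not converge.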

quantumiracle · 2021-06-26T02:43:17Z

Hi,
Did you use the default hyper-parameters provided in RLzoo? If so, we will look into this problem.
