I'm currently working on speeding up a project that uses the stable-baselines DDPG algorithm. My environment can easily be converted into a VecEnv, but I found that your DDPG does not support VecEnv.
I'm wondering why this wasn't implemented. Lack of time, or is it hard to do?
Thank you in advance.
In short, SAC/TD3/DDPG were designed for use on real robots, where multiprocessing is not possible. I would recommend PPO/A2C for fast training with massive multiprocessing (where sample efficiency is not the problem but wall clock time is).
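For reference, a minimal sketch of that recommendation with stable-baselines, using PPO2 and SubprocVecEnv (the environment id, seed handling, and hyperparameters here are placeholders, not from this issue):

```python
import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import SubprocVecEnv

def make_env(env_id, seed):
    # Return a thunk so each worker process builds its own env instance
    def _init():
        env = gym.make(env_id)
        env.seed(seed)
        return env
    return _init

if __name__ == "__main__":
    n_envs = 8  # number of parallel worker processes
    env = SubprocVecEnv([make_env("CartPole-v1", seed=i) for i in range(n_envs)])
    model = PPO2("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=100000)
```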
In theory, SAC/TD3/DDPG could be extended to multiprocessing (this will maybe be easier to do with v3 #576); however, in practice this would complicate the current code (maybe a wrapper around the replay buffer would be a solution).
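To illustrate the wrapper idea (this is not part of the library, just a hypothetical sketch): each VecEnv step returns a batch of transitions, one per environment, which could be unbatched and stored in an ordinary replay buffer so the rest of the off-policy code stays unchanged:

```python
import numpy as np

class VecReplayBufferWrapper:
    """Hypothetical wrapper: flattens batched transitions from a VecEnv
    into an existing single-env replay buffer."""

    def __init__(self, replay_buffer, n_envs):
        self.replay_buffer = replay_buffer  # assumed to expose add() and sample()
        self.n_envs = n_envs

    def add(self, obs, actions, rewards, next_obs, dones):
        # Each argument is batched along the first axis with shape
        # (n_envs, ...); store one transition per environment.
        for i in range(self.n_envs):
            self.replay_buffer.add(obs[i], actions[i], rewards[i],
                                   next_obs[i], float(dones[i]))

    def sample(self, batch_size):
        # Sampling is unchanged: the underlying buffer already holds
        # flat per-environment transitions.
        return self.replay_buffer.sample(batch_size)
```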
Only partially solved: we would need to change all the logging around (this would make the code a bit more complex), and for some algorithms (like DDPG) we are using an UnvecWrapper (for legacy reasons, not planned for the next version) to have only one environment, so switching to a VecEnv may break other things, unfortunately.