Fixed step used to log SAC summary #1008
base: master
Conversation
stable_baselines/sac/sac.py (Outdated)

@@ -314,6 +314,7 @@ def setup_model(self):
             self.summary = tf.summary.merge_all()

     def _train_step(self, step, writer, learning_rate):
+        del step
If step is not used anymore, why not remove it from the arguments?
Good point! I will do that instead. I wasn't sure if it followed a _train_step template shared with other algorithms, but it appears that is not the case.
Removed step, and fixed the issue of wrapping replay_buffer on each call to learn (it now only happens on the first one). Let me know if you think it would be better to check whether self.model.replay_buffer is already wrapped, using isinstance(self.model.replay_buffer, HindsightExperienceReplayWrapper).
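The wrap-once logic described above can be sketched without stable-baselines itself. This is a minimal illustration, not the library's code: ReplayBuffer, HindsightExperienceReplayWrapper, and the wrapped_buffer flag are stand-ins mirroring the names in the PR, and the commented-out isinstance check shows the alternative discussed in the review.

```python
class ReplayBuffer:
    """Stand-in for the real replay buffer."""
    pass


class HindsightExperienceReplayWrapper:
    """Stand-in for the HER wrapper; stores the underlying buffer."""
    def __init__(self, replay_buffer):
        self.replay_buffer = replay_buffer


class HER:
    def __init__(self):
        self.replay_buffer = ReplayBuffer()
        self.wrapped_buffer = False  # flag-based approach used in the PR

    def learn(self):
        # Alternative from the review: test the type instead of keeping a flag:
        # if not isinstance(self.replay_buffer, HindsightExperienceReplayWrapper):
        if not self.wrapped_buffer:
            self.replay_buffer = HindsightExperienceReplayWrapper(self.replay_buffer)
            self.wrapped_buffer = True


her = HER()
her.learn()
her.learn()  # second call must not wrap the buffer again
# Without the guard, replay_buffer would be a wrapper of a wrapper after two calls.
print(type(her.replay_buffer.replay_buffer).__name__)  # ReplayBuffer
```

Both the flag and the isinstance check prevent double wrapping; the isinstance variant avoids extra state but couples learn() to the wrapper type.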
Force-pushed …ping of HER replay_buffer from aad197b to 8123b35 (Compare)
@@ -108,9 +109,11 @@ def setup_model(self):

     def learn(self, total_timesteps, callback=None, log_interval=100, tb_log_name="HER",
               reset_num_timesteps=True):
+        replay_wrapper = self.replay_wrapper if not self.wrapped_buffer else None
This looks like another issue... Please open a separate PR (you can include the minimal code and a traceback there to reproduce the error).
Although it is small, it is better to solve one problem at a time ;)
fixes #1007
Description
Changed the call to writer.add_summary to use self.num_timesteps, so the step counter stays correct when num_timesteps is not reset across multiple calls to model.learn(..., reset_num_timesteps=False).
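The bug can be sketched without TensorFlow. In this hypothetical Model (the class and its attributes are stand-ins, not the stable-baselines code), logging with the per-call loop index restarts from 0 on every learn() call, while logging with the cumulative self.num_timesteps counter stays monotonic across calls with reset_num_timesteps=False:

```python
class Model:
    def __init__(self):
        self.num_timesteps = 0  # cumulative counter, survives across learn() calls

    def learn(self, total_timesteps, reset_num_timesteps=True):
        if reset_num_timesteps:
            self.num_timesteps = 0
        logged_steps = []
        for step in range(total_timesteps):
            self.num_timesteps += 1
            # Before the fix: writer.add_summary(summary, step)       -> restarts at 0
            # After the fix:  writer.add_summary(summary, self.num_timesteps)
            logged_steps.append(self.num_timesteps)
        return logged_steps


model = Model()
first = model.learn(3)
second = model.learn(3, reset_num_timesteps=False)
print(first + second)  # [1, 2, 3, 4, 5, 6] -- monotonic, no discontinuity
```

Using the per-call step would instead log [0, 1, 2, 0, 1, 2], which is what produces the discontinuous TensorBoard line reported in #1007.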
Motivation and Context
I have raised an issue to propose this change (required for new features and bug fixes)
Issue #1007 shows an example where the TensorBoard plotted line is discontinuous due to an incorrect step counter
Types of changes
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)
Checklist:
I've read the CONTRIBUTION guide (required)
I have updated the changelog accordingly (required).
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.
I have ensured pytest and pytype both pass (by running make pytest and make type).