
Weird tensorboard plots in GAIL #577

Closed
rohitrango opened this issue Nov 23, 2019 · 9 comments · Fixed by #629
Labels
bug Something isn't working documentation Documentation should be updated duplicate This issue or pull request already exists

Comments

rohitrango commented Nov 23, 2019

Describe the bug
While training GAIL, I get tensorboard plots that move backwards along the x-axis. Please see the example below.

Code example

from stable_baselines import GAIL
from stable_baselines.gail import ExpertDataset

dataset = ExpertDataset(expert_path='hopper.npz')
model = GAIL('MlpPolicy', 'Hopper-v2', dataset, tensorboard_log='logs/hopper-v2')
model.learn(total_timesteps=int(1e6))

Here is the graph that I get
(Screenshot from 2019-11-23 14-18-54: the reward curve doubles back along the x-axis)

System Info
Describe the characteristic of your environment:

  • Installed from source
  • Python version - 3.5.2
  • Tensorflow version - 1.14.0

This problem doesn't occur with TRPO where I get reasonable plots.

@araffin araffin added bug Something isn't working duplicate This issue or pull request already exists labels Nov 23, 2019
araffin (Collaborator) commented Nov 23, 2019

Hello,
This is a known bug (#143) that comes from total_episode_reward_logger; it will be partially solved in #556.
The best way to get accurate plots is to use a Monitor wrapper and the legacy tensorboard integration (cf. doc).
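For example, a minimal sketch of that setup (not tested here; the Monitor log filename and the expert dataset path are illustrative, adapt them to your run):

```python
import gym
from stable_baselines import GAIL
from stable_baselines.bench import Monitor
from stable_baselines.gail import ExpertDataset

# Monitor writes episode rewards/lengths to a CSV file, giving reliable
# curves regardless of the tensorboard episode-reward logger bug.
env = Monitor(gym.make('Hopper-v2'), filename='logs/hopper-v2/monitor',
              allow_early_resets=True)
dataset = ExpertDataset(expert_path='hopper.npz')
model = GAIL('MlpPolicy', env, dataset, tensorboard_log='logs/hopper-v2')
model.learn(total_timesteps=int(1e6))
```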

rohitrango (Author) commented:

But then I can't use SubprocVecEnv (it gives me an error since the Monitor's CSV writer can't be pickled)

Miffyli (Collaborator) commented Nov 24, 2019

You have to wrap each individual environment with the Monitor wrapper rather than the VecEnv, i.e. you will end up with one CSV file per environment.

rohitrango (Author) commented Nov 24, 2019

@Miffyli Could you elaborate a bit more on what to do? I need to have multiple processes, each with a separate normalized env, but the following code fails:

    actenv = gym.make(args.env_id)
    os.makedirs(osp.join(args.tb_log, 'ppo2'), exist_ok=True)
    allenvs = []
    for i in range(args.num_procs):
        tmpenv = Monitor(actenv, osp.join(args.tb_log, 'ppo2', '{}'.format(i), ), allow_early_resets=True)
        allenvs.append(lambda : tmpenv)
    env = SubprocVecEnv(allenvs, )

Here is the error trace:

Traceback (most recent call last):
  File "generate_trajs_ppo.py", line 54, in <module>
    env = SubprocVecEnv(allenvs, )
  File "/home/rohit/stable-baselines/stable_baselines/common/vec_env/subproc_vec_env.py", line 90, in __init__
    process.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 281, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_forkserver.py", line 36, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_forkserver.py", line 48, in _launch
    reduction.dump(process_obj, buf)
  File "/usr/lib/python3.5/multiprocessing/reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "/home/rohit/stable-baselines/stable_baselines/common/vec_env/base_vec_env.py", line 297, in __getstate__
    return cloudpickle.dumps(self.var)
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 1125, in dumps
    cp.dump(obj)
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 482, in dump
    return Pickler.dump(self, obj)
  File "/usr/lib/python3.5/pickle.py", line 408, in dump
    self.save(obj)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 556, in save_function
    return self.save_function_tuple(obj)
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 758, in save_function_tuple
    save(state)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.5/pickle.py", line 841, in _batch_setitems
    save(v)
  File "/usr/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 1014, in save_file
    raise pickle.PicklingError("Cannot pickle files that are not opened for reading: %s" % obj.mode)
_pickle.PicklingError: Cannot pickle files that are not opened for reading: wt

@araffin araffin added the RTFM Answer is the documentation label Nov 24, 2019
araffin (Collaborator) commented Nov 24, 2019

> Could you elaborate a bit more on what to do?

Please read the documentation, we have different examples showing the use of a Monitor wrapper.

> I need to have multiple processes, each with a separate normalized env, but I get the following error:

Why don't you use a VecNormalize wrapper? It seems the problem comes from your environment or from the preprocessing you are doing.

rohitrango (Author) commented Nov 24, 2019

Just to clarify, using VecNormalize is not my problem. The stack trace clearly shows that the issue is the Monitor not being picklable. I have looked at #535 and the doc, but the doc only uses Monitor with DummyVecEnv, which is not my problem either.

Miffyli (Collaborator) commented Nov 25, 2019

Took me a while to realize: if you look carefully at the SubprocVecEnv examples, the lambda functions are functions that create the environment; they do not return a premade environment. I.e. your code should be something along the lines of (not checked for validity):

def create_env(rank):
    def _init():
        actenv = gym.make(args.env_id)
        # each environment gets its own Monitor CSV file
        return Monitor(actenv, osp.join(args.tb_log, 'ppo2', str(rank)),
                       allow_early_resets=True)
    return _init

os.makedirs(osp.join(args.tb_log, 'ppo2'), exist_ok=True)
allenvs = [create_env(i) for i in range(args.num_procs)]
env = SubprocVecEnv(allenvs)

Note to self: This should probably be clarified in docs as it can be easily lost among all the lambdaing and whatnot.
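The closure pitfall that makes the factory-with-argument pattern necessary can be shown with plain Python (no stable-baselines needed); this is a generic illustration, not code from the thread:

```python
# Lambdas capture variables by reference, so every lambda created in the
# loop sees the loop variable's final value, not the value at creation time.
late_bound = [lambda: i for i in range(3)]
print([f() for f in late_bound])  # -> [2, 2, 2]

# Binding the loop variable as a default argument (or via a factory function
# that takes it as a parameter) freezes the value at definition time.
early_bound = [lambda i=i: i for i in range(3)]
print([f() for f in early_bound])  # -> [0, 1, 2]
```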

araffin (Collaborator) commented Nov 25, 2019

> This should probably be clarified in docs as it can be easily lost among all the lambdaing and whatnot.

I agree and disagree at the same time. In the doc, we only use a lambda for a single env and a regular Python function for multiple envs. Also, now that we have make_vec_env(), this should remove that problem.
But yes, a warning or a note could be useful.

EDIT: the type in the doc is wrong btw
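A sketch of the make_vec_env() route (not verified against this exact version; the monitor_dir and vec_env_cls parameter names are assumed from the helper's signature):

```python
from stable_baselines.common import make_vec_env
from stable_baselines.common.vec_env import SubprocVecEnv

# make_vec_env builds one env factory per process internally and can wrap
# each env in a Monitor via monitor_dir, sidestepping the pickling pitfall.
env = make_vec_env('Hopper-v2', n_envs=4,
                   monitor_dir='logs/hopper-v2',
                   vec_env_cls=SubprocVecEnv)
```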

@araffin araffin added documentation Documentation should be updated and removed RTFM Answer is the documentation labels Nov 25, 2019
rohitrango (Author) commented:

@Miffyli Thanks, this seems to work. Now I realize my problem was very subtle: cloudpickle can pickle the function, but not the premade environment that I passed into it.
@araffin SubprocVecEnv works with lambda functions if I do not use the Monitor wrapper, since there are no unpicklable items. I can add a note to the appropriate doc if that's okay.
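A stdlib-only sketch of that distinction, using an open write-mode file in place of the Monitor's CSV writer (the file path is illustrative):

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.gettempdir(), 'monitor_demo.csv')

# A write-mode file handle, analogous to the CSV writer Monitor keeps open:
handle = open(path, 'wt')

def make_handle():
    # Factory: the file would be opened inside the worker process instead,
    # so nothing unpicklable ever crosses the process boundary.
    return open(path, 'wt')

# A module-level function pickles fine (it is serialized by reference) ...
pickle.dumps(make_handle)

# ... but the premade write handle does not:
try:
    pickle.dumps(handle)
except TypeError as exc:
    print('cannot pickle the open file:', exc)
finally:
    handle.close()
```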

araffin added a commit that referenced this issue Dec 19, 2019
@araffin araffin mentioned this issue Dec 19, 2019
11 tasks
araffin added a commit that referenced this issue Dec 19, 2019
* Bump version

* Add a message to PPO2 assert (closes #625)

* Update replay buffer doctring (closes #610)

* Don't specify a version for pytype

* Fix `VecEnv` docstrings (closes #577)

* Typo

* Re-add python version for pytype