
Weird tensorboard plots in GAIL #577

Closed
rohitrango opened this issue Nov 23, 2019 · 9 comments · Fixed by #629
Labels
bug Something isn't working documentation Documentation should be updated duplicate This issue or pull request already exists

Comments

rohitrango commented Nov 23, 2019

Describe the bug
While training GAIL, I get tensorboard plots that move backwards along the x-axis. Please see the example below.

Code example

from stable_baselines import GAIL
from stable_baselines.gail import ExpertDataset

dataset = ExpertDataset(expert_path='hopper.npz')
model = GAIL('MlpPolicy', 'Hopper-v2', dataset, tensorboard_log='logs/hopper-v2')
model.learn(total_timesteps=int(1e6))

Here is the graph that I get
(Screenshot from 2019-11-23 14-18-54: the reward curve doubles back along the x-axis)

System Info
Describe the characteristic of your environment:

  • Installed from source
  • Python version - 3.5.2
  • Tensorflow version - 1.14.0

This problem doesn't occur with TRPO where I get reasonable plots.

@araffin araffin added bug Something isn't working duplicate This issue or pull request already exists labels Nov 23, 2019
araffin (Collaborator) commented Nov 23, 2019

Hello,
This is a known bug (#143) that comes from total_episode_reward_logger; it will be partially solved in #556.
The best way to get accurate plots is to use a Monitor wrapper and the legacy tensorboard integration (cf. doc).
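For example, a minimal sketch of that setup (not tested here; the Monitor log filename and the expert dataset path are illustrative, adapt them to your run):

```python
import gym
from stable_baselines import GAIL
from stable_baselines.bench import Monitor
from stable_baselines.gail import ExpertDataset

# Monitor writes episode rewards/lengths to a CSV file, giving reliable
# curves regardless of the tensorboard episode-reward logger bug.
env = Monitor(gym.make('Hopper-v2'), filename='logs/hopper-v2/monitor',
              allow_early_resets=True)
dataset = ExpertDataset(expert_path='hopper.npz')
model = GAIL('MlpPolicy', env, dataset, tensorboard_log='logs/hopper-v2')
model.learn(total_timesteps=int(1e6))
```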

rohitrango (Author) commented:

But then I can't use SubprocVecEnv (it gives me an error since the Monitor's CSV writer can't be pickled)

Miffyli (Collaborator) commented Nov 24, 2019

You have to wrap each individual environment with the Monitor wrapper rather than the VecEnv, i.e. you will end up with one CSV file per environment.

rohitrango (Author) commented Nov 24, 2019

@Miffyli Could you elaborate a bit more on what to do? I need to have multiple processes, each with a separate normalized env, but the following code fails:

    actenv = gym.make(args.env_id)
    os.makedirs(osp.join(args.tb_log, 'ppo2'), exist_ok=True)
    allenvs = []
    for i in range(args.num_procs):
        tmpenv = Monitor(actenv, osp.join(args.tb_log, 'ppo2', '{}'.format(i), ), allow_early_resets=True)
        allenvs.append(lambda : tmpenv)
    env = SubprocVecEnv(allenvs, )

Here is the error trace:

Traceback (most recent call last):
  File "generate_trajs_ppo.py", line 54, in <module>
    env = SubprocVecEnv(allenvs, )
  File "/home/rohit/stable-baselines/stable_baselines/common/vec_env/subproc_vec_env.py", line 90, in __init__
    process.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 281, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_forkserver.py", line 36, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_forkserver.py", line 48, in _launch
    reduction.dump(process_obj, buf)
  File "/usr/lib/python3.5/multiprocessing/reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "/home/rohit/stable-baselines/stable_baselines/common/vec_env/base_vec_env.py", line 297, in __getstate__
    return cloudpickle.dumps(self.var)
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 1125, in dumps
    cp.dump(obj)
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 482, in dump
    return Pickler.dump(self, obj)
  File "/usr/lib/python3.5/pickle.py", line 408, in dump
    self.save(obj)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 556, in save_function
    return self.save_function_tuple(obj)
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 758, in save_function_tuple
    save(state)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.5/pickle.py", line 841, in _batch_setitems
    save(v)
  File "/usr/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/rohit/venv/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 1014, in save_file
    raise pickle.PicklingError("Cannot pickle files that are not opened for reading: %s" % obj.mode)
_pickle.PicklingError: Cannot pickle files that are not opened for reading: wt

@araffin araffin added the RTFM Answer is the documentation label Nov 24, 2019
araffin (Collaborator) commented Nov 24, 2019

> Could you elaborate a bit more on what to do?

Please read the documentation, we have different examples showing the use of a Monitor wrapper.

> I need to have multiple processes, each with a separate normalized env, but I get the following error:

Why don't you use a VecNormalize wrapper? It seems the problem comes from your environment or from the preprocessing you are doing.

rohitrango (Author) commented Nov 24, 2019

Just to clarify, using VecNormalize is not my problem. The stack trace clearly shows that the issue is the Monitor not being picklable. I have looked at #535 and the doc, but the doc only uses Monitor with DummyVecEnv, which is not my problem either.

Miffyli (Collaborator) commented Nov 25, 2019

Took me a while to realize: if you look carefully at the SubprocVecEnv examples, the lambda functions are functions that create the environment; they do not return a premade environment. I.e. your code should be something along the lines of (not checked for validity):

def create_env(rank):
    def _init():
        actenv = gym.make(args.env_id)
        # each environment gets its own Monitor CSV file
        return Monitor(actenv, osp.join(args.tb_log, 'ppo2', str(rank)),
                       allow_early_resets=True)
    return _init

os.makedirs(osp.join(args.tb_log, 'ppo2'), exist_ok=True)
allenvs = [create_env(i) for i in range(args.num_procs)]
env = SubprocVecEnv(allenvs)

Note to self: This should probably be clarified in docs as it can be easily lost among all the lambdaing and whatnot.
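The closure pitfall that makes the factory-with-argument pattern necessary can be shown with plain Python (no stable-baselines needed); this is a generic illustration, not code from the thread:

```python
# Lambdas capture variables by reference, so every lambda created in the
# loop sees the loop variable's final value, not the value at creation time.
late_bound = [lambda: i for i in range(3)]
print([f() for f in late_bound])  # -> [2, 2, 2]

# Binding the loop variable as a default argument (or via a factory function
# that takes it as a parameter) freezes the value at definition time.
early_bound = [lambda i=i: i for i in range(3)]
print([f() for f in early_bound])  # -> [0, 1, 2]
```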

araffin (Collaborator) commented Nov 25, 2019

> This should probably be clarified in docs as it can be easily lost among all the lambdaing and whatnot.

I agree and disagree at the same time. In the doc, we only use a lambda for a single env and a regular Python function for multiple envs. Also, now that we have make_vec_env(), this should remove that problem.
But yes, a warning or a note could be useful.

EDIT: the type in the doc is wrong btw
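A sketch of the make_vec_env() route (not verified against this exact version; the monitor_dir and vec_env_cls parameter names are assumed from the helper's signature):

```python
from stable_baselines.common import make_vec_env
from stable_baselines.common.vec_env import SubprocVecEnv

# make_vec_env builds one env factory per process internally and can wrap
# each env in a Monitor via monitor_dir, sidestepping the pickling pitfall.
env = make_vec_env('Hopper-v2', n_envs=4,
                   monitor_dir='logs/hopper-v2',
                   vec_env_cls=SubprocVecEnv)
```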

@araffin araffin added documentation Documentation should be updated and removed RTFM Answer is the documentation labels Nov 25, 2019
rohitrango (Author) commented:

@Miffyli Thanks, this seems to work. Now I realize my problem was very subtle: cloudpickle can pickle the function, but not the premade environment that I passed into it.
@araffin SubprocVecEnv works with lambda functions if I do not use the Monitor wrapper, since there are no unpicklable items. I can add a note to the appropriate doc if that's okay.
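A stdlib-only sketch of that distinction, using an open write-mode file in place of the Monitor's CSV writer (the file path is illustrative):

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.gettempdir(), 'monitor_demo.csv')

# A write-mode file handle, analogous to the CSV writer Monitor keeps open:
handle = open(path, 'wt')

def make_handle():
    # Factory: the file would be opened inside the worker process instead,
    # so nothing unpicklable ever crosses the process boundary.
    return open(path, 'wt')

# A module-level function pickles fine (it is serialized by reference) ...
pickle.dumps(make_handle)

# ... but the premade write handle does not:
try:
    pickle.dumps(handle)
except TypeError as exc:
    print('cannot pickle the open file:', exc)
finally:
    handle.close()
```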

araffin added a commit that referenced this issue Dec 19, 2019
@araffin araffin mentioned this issue Dec 19, 2019
11 tasks
araffin added a commit that referenced this issue Dec 19, 2019
* Bump version

* Add a message to PPO2 assert (closes #625)

* Update replay buffer doctring (closes #610)

* Don't specify a version for pytype

* Fix `VecEnv` docstrings (closes #577)

* Typo

* Re-add python version for pytype