SAC: ValueError: setting an array element with a sequence (stable-baselines 2.10) #852

marianophielipp · 2020-05-12T17:54:06Z

Code:
[Custom made environment]
import gym
import numpy as np
from stable_baselines.sac.policies import MlpPolicy
from stable_baselines import SAC

model = SAC(MlpPolicy, env, verbose=1)

if train:
    model.learn(total_timesteps=total_timesteps, log_interval=10)

Output:

| current_lr | 0.0003 |
| episodes | 10 |
| fps | 0 |
| mean 100 episode reward | -4 |
| n_updates | 0 |
| time_elapsed | 151 |
| total timesteps | 92 |

TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train_real_arm_perception_1.py", line 43, in <module>
    model.learn(total_timesteps=total_timesteps, log_interval=10)
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/stable_baselines/sac/sac.py", line 464, in learn
    mb_infos_vals.append(self._train_step(step, writer, current_lr))
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/stable_baselines/sac/sac.py", line 343, in _train_step
    out = self.sess.run(self.step_ops, feed_dict)
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1142, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
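
The chained TypeError/ValueError pair above is what NumPy raises when it is asked to convert a ragged batch to a fixed float dtype. The following is a minimal sketch of that mechanism only, assuming (nothing in the thread confirms this) that some per-step value such as the reward is a scalar on some steps and a multi-element array on others. The `rewards` batch and the `try_convert` helper are hypothetical, not part of stable-baselines.

```python
import numpy as np

# Hypothetical batch: two scalar rewards and one reward that is a 2-element array.
rewards = np.empty(3, dtype=object)
rewards[0] = 0.0
rewards[1] = np.array([0.1, 0.2])
rewards[2] = -1.0


def try_convert(batch):
    """Convert a batch to float32, the way a fed value is converted by np.asarray.

    Returns the ValueError if the conversion fails, otherwise None.
    """
    try:
        np.asarray(batch, dtype=np.float32)
    except ValueError as err:
        return err
    return None


error = try_convert(rewards)
print(type(error).__name__)  # ValueError
```

If this is what happens in the custom environment, the fix belongs in step() and reset(): each of them should return values with the same shape on every call.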

Miffyli · 2020-05-12T17:57:47Z

Use the env checker (check_env) to see whether your environment works correctly.

araffin · 2020-05-12T18:08:09Z

PS: Please use and read the issue template (the env checker is mentioned there too)

araffin · 2020-05-12T18:16:46Z

I think we need to output a better error message in SB3 (see #707 and #712)
Currently, we cannot do that properly because of the Unvecwrapper...

EDIT: the mentioned issue is not the same, but it is related in terms of the unclear error message

marianophielipp · 2020-05-12T19:31:50Z

check_env(env)
<class 'numpy.ndarray'>
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/stable_baselines/common/env_checker.py", line 214, in check_env
    _check_returned_values(env, observation_space, action_space)
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/stable_baselines/common/env_checker.py", line 99, in _check_returned_values
    _check_obs(obs, observation_space, 'reset')
  File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/stable_baselines/common/env_checker.py", line 89, in _check_obs
    "method does not match the given observation space".format(method_name))
AssertionError: The observation returned by the reset() method does not match the given observation space

a=env.observation_space.sample()
<class 'numpy.ndarray'>
b=env.reset()
<class 'numpy.ndarray'>
a.shape
(6,)
b.shape
(6,)
a
array([0.30428773, 0.42360216, 0.8966984 , 0.4622259 , 0.6768906 ,
0.5416117 ], dtype=float32)
b
array([ 0.74211503, 0.34441176, 0.33516484, 0.2 , -0.25 ,
0.1 ], dtype=float32)
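
Matching shape and dtype are not enough for a Box observation: gym's Box.contains also requires every element to lie within the bounds of the space. The sketch below checks the reset() observation printed above against bounds of 0 and 1. Those bounds are an assumption, suggested only by the sampled values all falling between 0 and 1; the thread never shows the real low and high. The `out_of_bounds` helper is hypothetical.

```python
def out_of_bounds(obs, low, high):
    """List (index, value, low, high) for every element of obs outside [low, high]."""
    return [
        (i, float(v), float(lo), float(hi))
        for i, (v, lo, hi) in enumerate(zip(obs, low, high))
        if not lo <= v <= hi
    ]


# Observation returned by reset() in the session above.
obs = [0.74211503, 0.34441176, 0.33516484, 0.2, -0.25, 0.1]

# ASSUMPTION: bounds of 0 and 1; the real bounds are not shown in the thread.
low = [0.0] * 6
high = [1.0] * 6

violations = out_of_bounds(obs, low, high)
print(violations)  # [(4, -0.25, 0.0, 1.0)]
```

With a real environment and a one-dimensional Box, the same helper can be called as out_of_bounds(env.reset(), env.observation_space.low, env.observation_space.high).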

Miffyli · 2020-05-12T19:34:20Z

As the error message says, the observation returned by reset() does not match the space defined by self.observation_space.

This is not a place for technical support, though. Please close this issue if there are no further enhancements/issues related to stable-baselines.

marianophielipp · 2020-05-12T19:40:27Z

OK, but they seem to be the same. I debugged everything and even compared with CartPole.

Miffyli · 2020-05-12T19:45:06Z

That is because SAC does not support discrete actions, only continuous ones (see docs).

marianophielipp · 2020-05-12T19:45:29Z

I'm using continuous actions

Miffyli · 2020-05-12T19:45:48Z

That is because SAC does not support discrete actions, only continuous ones (see docs). Indeed this error should be clarified in future updates.

Edit: GitHub derped my messages.

marianophielipp · 2020-05-12T19:46:32Z

the action space is self.action_space = spaces.Box(low=low_action, high=high_action, dtype=np.float32)

marianophielipp · 2020-05-12T19:46:56Z

But the complaint is about the state space, which seems completely fine.

Miffyli · 2020-05-12T19:48:15Z

CartPole uses discrete actions, which is why it does not work with SAC. Your example does not work because the reset() function is wrong.

I am closing this issue as this is not a stable-baselines bug or enhancement suggestion, and the check for action spaces has already been noted.

marianophielipp · 2020-05-12T19:50:04Z

I know it uses discrete actions. My reset function seems fine (right data types, etc.), as shown in the results I posted above.

araffin · 2020-05-12T19:51:55Z

Please fill in the issue template completely next time, notably by formatting your code with Markdown code blocks and giving a minimal working example, e.g.:

import gym
import numpy as np

class CustomEnv(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(6,))
        self.action_space = gym.spaces.Box(low=-1, high=1, shape=(6,))

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        return self.observation_space.sample(), 0.0, False, {}

from stable_baselines.common.env_checker import check_env

check_env(CustomEnv())

araffin · 2020-05-12T19:58:56Z

Related issues: #595 and #283

marianophielipp · 2020-05-12T20:01:52Z

OK, I will do that. BTW, I looked at the check_env code and printed its arguments, and I still don't understand what's wrong, but I suppose it is my problem now:
CartPole:
print("Checking env", check_env(envg))
Box(4,) : obs [ 0.01070996 -0.04723248 -0.02073532 -0.027894 ]
<class 'gym.spaces.box.Box'> : t obs <class 'numpy.ndarray'>

My Custom Env:
print("Checking env", check_env(env))
<class 'numpy.ndarray'>
Box(6,) : obs [ 0.14489795 0.75911766 0.21703297 0.2 -0.25 0.1 ]
<class 'gym.spaces.box.Box'> : t obs <class 'numpy.ndarray'>

marianophielipp · 2020-05-12T21:06:50Z

It does work in stable-baselines 2.8.0.

Miffyli added the "custom gym env" (Issue related to Custom Gym Env) and "question" (Further information is requested) labels May 12, 2020

araffin added the "PR template not filled" (Please fill the pull request template) label May 12, 2020

araffin added the "more information needed" (Please fill the issue template completely) label and removed the "PR template not filled" label May 12, 2020

Miffyli added the "No tech support" (We do not do tech support) label May 12, 2020

Miffyli closed this as completed May 12, 2020
