SAC: ValueError: setting an array element with a sequence (stable-baselines 2.10) #852
Comments
Use env checker to see if your environment works correctly. |
PS: Please use and read the issue template (the env checker is mentioned there too) |
check_env(env)
|
As the error message says, observation from This is not a place for technical support, though. Please close this issue if there are no further enhancements/issues related to stable-baselines. |
OK, but the result seems to be the same. I debugged everything and even compared with CartPole. |
That is because SAC does not support discrete actions, only continuous ones (see docs). |
I'm using continuous actions |
That is because SAC does not support discrete actions, only continuous ones (see docs). Indeed this error should be clarified in future updates. Edit: Github derped my messages. |
the action space is self.action_space = spaces.Box(low=low_action, high=high_action, dtype=np.float32) |
but the complaint is about the state space, which seems completely fine |
CartPole uses discrete actions, that's why it is not working. Your example does not work because I am closing this issue as this is not a stable-baselines bug or enhancement suggestion, and the check for action spaces has already been noted. |
I know it uses discrete actions. My reset function seems fine, with the right data types, etc., as shown in the results I posted above. |
Please fill the issue template completely next time, notably by formatting your code using a markdown code block and giving a minimal working example, e.g.:
import gym
import numpy as np

class CustomEnv(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(6,))
        self.action_space = gym.spaces.Box(low=-1, high=1, shape=(6,))

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        return self.observation_space.sample(), 0.0, False, {}

from stable_baselines.common.env_checker import check_env
check_env(CustomEnv()) |
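As a complement to `check_env`, a small NumPy-only helper can compare what `reset()` or `step()` returns against the declared observation shape. This is only a sketch: `obs_problems` is a hypothetical helper, not part of stable-baselines, and the shape `(6,)` is borrowed from the minimal example above.

```python
import numpy as np

def obs_problems(obs, expected_shape, expected_dtype=np.float32):
    """Return a list of human-readable problems found in one observation.

    Hypothetical helper for debugging; an empty list means no problem was found.
    """
    problems = []
    if not isinstance(obs, np.ndarray):
        problems.append("not a numpy array: {}".format(type(obs).__name__))
    try:
        # Ragged or nested data cannot be converted to a regular numeric array.
        arr = np.asarray(obs, dtype=expected_dtype)
    except (ValueError, TypeError) as exc:
        problems.append("cannot convert to a regular array: {}".format(exc))
        return problems
    if arr.shape != tuple(expected_shape):
        problems.append(
            "shape {} != expected {}".format(arr.shape, tuple(expected_shape))
        )
    return problems

# A well-formed observation yields no problems.
ok_report = obs_problems(np.zeros(6, dtype=np.float32), (6,))
# An extra leading dimension is reported as a shape mismatch.
batched_report = obs_problems(np.zeros((1, 6), dtype=np.float32), (6,))
```

Calling the helper on the value returned by `env.reset()` and on the first element returned by `env.step(action)` may help narrow down which of the two produces the unexpected data.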
OK, I will do that. BTW, I looked at the check_env code and printed its arguments, and I still don't understand what's wrong, but I suppose it is my problem now: My Custom Env: |
It does work in stable-baselines 2.8.0.
Code:
[Custom made environment]
import gym
import numpy as np
from stable_baselines.sac.policies import MlpPolicy
from stable_baselines import SAC
model = SAC(MlpPolicy, env, verbose=1)
if train:
    model.learn(total_timesteps=total_timesteps, log_interval=10)
Output:
| current_lr | 0.0003 |
| episodes | 10 |
| fps | 0 |
| mean 100 episode reward | -4 |
| n_updates | 0 |
| time_elapsed | 151 |
| total timesteps | 92 |
TypeError: only size-1 arrays can be converted to Python scalars
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "train_real_arm_perception_1.py", line 43, in
model.learn(total_timesteps=total_timesteps, log_interval=10)
File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/stable_baselines/sac/sac.py", line 464, in learn
mb_infos_vals.append(self._train_step(step, writer, current_lr))
File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/stable_baselines/sac/sac.py", line 343, in _train_step
out = self.sess.run(self.step_ops, feed_dict)
File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1142, in _run
np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
File "/home/ipc/open_baselines/open_base/lib/python3.6/site-packages/numpy/core/_asarray.py", line 85, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
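The traceback shows the final `ValueError` coming from `np.asarray` while TensorFlow converts a fed value to an array. One way this can happen (an assumption here, since the custom environment is not shown) is that the items collected into a batch do not all share the same shape. The sketch below triggers a `ValueError` of that kind with plain NumPy and uses a hypothetical helper, `first_ragged_index`, to locate the first item that deviates.

```python
import numpy as np

def shape_of(item):
    """Return the shape of an item, or None if it is itself ragged."""
    try:
        return np.shape(item)
    except ValueError:
        return None

def first_ragged_index(items):
    """Return the index of the first item whose shape differs from item 0.

    Hypothetical debugging helper; returns None when all shapes agree.
    """
    reference = shape_of(items[0])
    for i, item in enumerate(items):
        shape = shape_of(item)
        if shape is None or shape != reference:
            return i
    return None

# Four well-formed observations of shape (6,).
good_batch = [np.zeros(6, dtype=np.float32) for _ in range(4)]
# The same batch with one observation of the wrong length inserted at index 2.
bad_batch = good_batch[:2] + [np.zeros(5, dtype=np.float32)] + good_batch[2:]

try:
    np.asarray(bad_batch, dtype=np.float32)
    conversion_failed = False
except ValueError:
    # NumPy cannot build a regular float array from items of unequal length.
    conversion_failed = True

bad_index = first_ragged_index(bad_batch)
```

Running a check like this on the observations stored during training, for example by logging them in `step()`, may show whether one of them has a different shape from the rest.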