Hi! Thanks for your careful description of the library. I ran into a question while reading it.
While looking at how CategoricalPgAgent and MujocoFfModel fit together, I noticed that MujocoFfModel's forward function returns mu, log_std, v, but according to the source code CategoricalPgAgent expects to receive pi, value. Does pi here have the same meaning as mu? If so, when I run the code I get the error 'probability tensor contains either inf, nan or element < 0', and I find that some elements of the vector pi are < 0. That seems wrong, because pi should hold the probability of each action, shouldn't it? I hope you can help me understand this better.
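To make the mismatch concrete, here is a minimal sketch in plain Python. It is not rlpyt code: the helper names `is_valid_probability_vector` and `softmax` and the sample values in `mu` are made up for illustration. It assumes that mu is an unconstrained mean vector (so it may contain negative entries), while pi would have to be a normalized probability vector such as a softmax output.

```python
import math


def is_valid_probability_vector(p, tol=1e-6):
    """Return True if all entries are finite and non-negative and sum to 1."""
    return (
        all(math.isfinite(x) and x >= 0.0 for x in p)
        and abs(sum(p) - 1.0) <= tol
    )


def softmax(logits):
    """Map unconstrained scores to a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output of a Gaussian-policy head: an unconstrained mean vector.
mu = [0.7, -1.2, 0.1]

# Treating mu directly as pi fails the check because of the negative entry.
print(is_valid_probability_vector(mu))

# A softmax over the same numbers does give a valid probability vector.
print(is_valid_probability_vector(softmax(mu)))
```

If this assumption is right, the negative entries I see in pi would simply be the raw mu values being interpreted as probabilities, which would explain the sampling error.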