A question about CategoricalPgAgent #198

Open
LTEnjoy opened this issue Feb 2, 2021 · 0 comments
LTEnjoy commented Feb 2, 2021

Hi! Thanks for the careful documentation of the library. I ran into a question while reading it.
I am trying to understand how CategoricalPgAgent relates to MujocoFfModel. The forward function of MujocoFfModel returns mu, log_std, v, but according to the source code CategoricalPgAgent expects to receive pi, value. Does pi here mean the same thing as mu? If so, when I run the code I get the error 'probability tensor contains either inf, nan or element < 0', and I find that some elements of the vector pi are negative. That seems wrong, since pi should hold the probability of each action, shouldn't it? I hope you can help me understand this better.
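
To make the mismatch concrete, here is a minimal standard-library sketch. It does not use the library's code. It assumes that mu is an unconstrained mean vector of a Gaussian policy, while pi is meant to be a categorical probability vector. The values in mu and the helper names is_valid_probability_vector and softmax are made up for illustration.

```python
import math
import random


def is_valid_probability_vector(p, tol=1e-6):
    """Return True if all entries are finite, non-negative, and sum to 1."""
    entries_ok = all(math.isfinite(x) and x >= 0.0 for x in p)
    return entries_ok and abs(sum(p) - 1.0) <= tol


def softmax(scores):
    """Map unconstrained real scores to a probability vector."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical Gaussian-head output: a mean vector with no sign constraint.
mu = [0.3, -1.2, 0.5]

# Treated as categorical probabilities, mu fails the basic validity check.
print(is_valid_probability_vector(mu))  # False

# A categorical head would normalize its scores, for example with softmax.
pi = softmax(mu)
print(is_valid_probability_vector(pi))  # True

# Sampling a discrete action index is only meaningful with valid weights.
action = random.choices(range(len(pi)), weights=pi, k=1)[0]
```

If this assumption holds, the negative entries are expected for mu and simply indicate that a Gaussian-output model is being fed to an agent that samples from a categorical distribution. Whether softmax is the right remedy depends on the task: for a continuous action space, pairing the model with a Gaussian agent is likely the better match than reinterpreting mu as probabilities.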
