
cross entropy is wrong #32

Closed
yjxaigithub opened this issue Aug 15, 2021 · 3 comments

Comments

@yjxaigithub

In https://github.com/tensorlayer/RLzoo/blob/master/rlzoo/common/distributions.py#L96, I think it should be modified to "return -tf.reduce_sum(x*self._logits, axis=1)" to return the cross entropy, because self._logits is already the logarithm of the probability.
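For illustration, a minimal sketch of the distinction the suggestion hinges on (the tensors and names here are hypothetical, not the repository's variables):

```python
import tensorflow as tf

# Hypothetical one-hot target and a score vector that may be either
# log-probabilities or raw (pre-softmax) logits.
x = tf.constant([[0.0, 1.0, 0.0]])
scores = tf.constant([[1.0, 2.0, 0.5]])

# If `scores` already holds log probabilities, the suggested one-liner suffices:
ce_from_logprobs = -tf.reduce_sum(x * scores, axis=1)

# If `scores` holds raw logits, a log-softmax is needed first:
ce_from_logits = -tf.reduce_sum(x * tf.nn.log_softmax(scores, axis=1), axis=1)
# which matches TensorFlow's built-in helper:
ce_builtin = tf.nn.softmax_cross_entropy_with_logits(labels=x, logits=scores)
```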

@quantumiracle
Member

I'm not sure why you said self._logits is already the logarithm of the probability. In the standard literature it refers to the logits before the softmax operation; check the usage here.

@yjxaigithub
Author

yjxaigithub commented Aug 20, 2021

The kl function and the entropy function imply that self._logits is already the logarithm of the probability. If not, the calculations of the KL divergence and the entropy are wrong.
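For reference, the standard categorical formulas this comment appeals to, as a minimal sketch assuming `logp` and `logq` are genuine log-probability vectors (hypothetical names, not the repository's variables):

```python
import tensorflow as tf

# Hypothetical log-probability vectors of two categorical distributions.
logp = tf.math.log(tf.constant([[0.2, 0.5, 0.3]]))
logq = tf.math.log(tf.constant([[0.3, 0.4, 0.3]]))
p = tf.exp(logp)

kl = tf.reduce_sum(p * (logp - logq), axis=1)   # KL(p || q)
entropy = -tf.reduce_sum(p * logp, axis=1)      # H(p)
```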

@quantumiracle
Member

OK, I see what confused you. This issue can be resolved together with #31. In our implementation, self._logits holds the raw logits, not the log probabilities, so the KL and entropy functions are not wrong in this sense, and they should not follow what you suggested in #31. We intentionally compute the KL divergence and entropy with logits as inputs rather than standard log probabilities, so the code does not follow the standard math formulas you cited in #31. Both calculations contain a softmax explicitly, which is why you see the exp and sum terms in our (admittedly complex) implementation.
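To illustrate the point, here is a sketch (not the repository's exact code) of computing the entropy directly from raw logits, where the softmax shows up as explicit exp and sum terms:

```python
import tensorflow as tf

logits = tf.constant([[1.0, 2.0, 0.5]])

# Softmax written out with explicit exp/sum, shifted for numerical stability.
a = logits - tf.reduce_max(logits, axis=1, keepdims=True)
ea = tf.exp(a)
z = tf.reduce_sum(ea, axis=1, keepdims=True)
p = ea / z                                   # softmax(logits)

# H(p) = -sum p * log p, with log p = a - log z.
entropy = tf.reduce_sum(p * (tf.math.log(z) - a), axis=1)

# Same value computed from probabilities directly, as a check.
probs = tf.nn.softmax(logits, axis=1)
entropy_check = -tf.reduce_sum(probs * tf.math.log(probs), axis=1)
```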

I will close both issues. If you find other problems, feel free to report them!
