Hey folks, I was looking into the loss function here and see that a z-loss term is added to the usual log loss definition.
The only paper I could surface on this was https://arxiv.org/abs/1604.08859, but I don't see the connection. In the code, the term is the logsumexp of the logits, squared.
In the paper, however, the z-loss is computed from the mean and variance; you can see the author's implementation here. I am asking for help either connecting the two or finding resources on the current implementation, if they are in fact unrelated.
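To make my reading of the code concrete, here is a minimal sketch of what I understand the term to be, assuming it is the square of the logsumexp of the logits scaled by a small coefficient and added to the per-example log loss. The function names and the `z_loss_coeff` parameter (and its default value) are my own placeholders, not taken from the repo.

```python
import math


def logsumexp(logits):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits))


def log_loss_with_z_loss(logits, target, z_loss_coeff=1e-4):
    """Log loss for one example plus an auxiliary z-loss term.

    z_loss_coeff is a hypothetical name and value, used for illustration only.
    Returns (total, log_loss, z_loss).
    """
    log_z = logsumexp(logits)            # log of the softmax normalizer Z
    log_loss = log_z - logits[target]    # equals -log(softmax(logits)[target])
    z_loss = z_loss_coeff * log_z ** 2   # smallest when log Z is 0
    return log_loss + z_loss, log_loss, z_loss


total, ce, z = log_loss_with_z_loss([2.0, 1.0, 0.1], target=0)
print(total, ce, z)
```

If that reading is right, the extra term only depends on the normalizer log Z, not on any mean or variance of the scores, which is why I can't map it onto the paper.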
Thank you.