
About calculation of loss function #28

Open
Littleor opened this issue Nov 27, 2021 · 1 comment

Comments

Littleor commented Nov 27, 2021

Hi, I have a question about the loss function, because there seems to be a difference between the paper and the code.
First of all, the paper:
[Screenshot: the loss update from Algorithm 1 of the paper]
As I read it, the loss in the paper is the sum of two terms, averaged over the queries: the distance between the query point and its corresponding prototype, plus a log-sum-exp term over the negated distances to all the prototypes.
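For reference, here is my transcription of the episodic loss update from Algorithm 1 of the paper (Snell et al., 2017), where f_φ is the embedding, c_k is the prototype of the query's true class k, N_C and N_Q are the number of classes and queries per class, and d is the squared Euclidean distance:

    J \leftarrow J + \frac{1}{N_C N_Q} \Big[ d\big(f_\phi(\mathbf{x}), \mathbf{c}_k\big) + \log \sum_{k'} \exp\big(-d(f_\phi(\mathbf{x}), \mathbf{c}_{k'})\big) \Big]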

But in the code:

       # dists: [n_class * n_query, n_class], distance from each query to each prototype
       log_p_y = F.log_softmax(-dists, dim=1).view(n_class, n_query, -1)
       # keep only the log-probability of the true class for each query, then average
       loss_val = -log_p_y.gather(2, target_inds).squeeze().view(-1).mean()

But the loss in the code looks like only the negative log-softmax gathered at the corresponding prototype; I don't see the separate distance term from the paper.

I'm confused. Is the problem with my understanding of the code, or with my understanding of the paper?

@fabian57fabian

I had the same confusion when I replicated the paper's results.
You could basically run cross-entropy and the results would be the same.
In this code, the author applies log_softmax to the negated distances, then takes only the target-class entries with gather and computes the mean over those. Expanding the log-softmax shows why the two forms match: -log softmax(-d)_k = d(f_φ(x), c_k) + log Σ_{k'} exp(-d(f_φ(x), c_{k'})), which is exactly the sum of the two terms in the paper.
So it is mathematically the same loss.
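Here is a minimal self-contained check (my own sketch, not code from this repo; the random distances and the names n_class, n_query, dists, target_inds are just stand-ins mirroring the snippet above) that the three formulations agree:

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    n_class, n_query = 5, 3

    # fake distances from each query to each prototype: [n_class * n_query, n_class],
    # rows ordered class-major (row = c * n_query + q)
    dists = torch.rand(n_class * n_query, n_class)

    # target_inds[c, q, 0] = c: the q-th query of class c should match prototype c
    target_inds = torch.arange(n_class).view(n_class, 1, 1).expand(n_class, n_query, 1)

    # 1) the repo's computation: negative log-softmax of negated distances at the targets
    log_p_y = F.log_softmax(-dists, dim=1).view(n_class, n_query, -1)
    loss_repo = -log_p_y.gather(2, target_inds).squeeze().view(-1).mean()

    # 2) the paper's two terms: d(x, c_k) + log sum_{k'} exp(-d(x, c_{k'}))
    flat_targets = target_inds.reshape(-1)
    d_target = dists.gather(1, flat_targets.unsqueeze(1)).squeeze(1)
    loss_paper = (d_target + torch.logsumexp(-dists, dim=1)).mean()

    # 3) plain cross-entropy with the negated distances used as logits
    loss_ce = F.cross_entropy(-dists, flat_targets)

    print(loss_repo.item(), loss_paper.item(), loss_ce.item())

All three print the same value up to floating-point error.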

I did something similar in my own implementation:
https://github.com/fabian57fabian/prototypical-networks-few-shot-learning
