
[Feature] Add fissa model #1310

Open · wants to merge 7 commits into base: master
Conversation


@Tokkiu Tokkiu commented Jun 11, 2022

Add the FISSA model to RecBole.
https://doi.org/10.1145/3383313.3412247
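For readers unfamiliar with the model: FISSA fuses a local representation (self-attention over the sequence, as in SASRec) with a global item-similarity representation through a candidate-aware gate. A rough numpy sketch of that gating, where all shapes, weights, and names are hypothetical illustrations rather than the PR's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                  # embedding size (illustrative)
local = rng.normal(size=d)             # local rep.: self-attention over the sequence
global_ = rng.normal(size=d)           # global rep.: item-similarity component
cand = rng.normal(size=d)              # candidate item embedding
W = rng.normal(size=3 * d)             # gate weights (hypothetical shape)

# Candidate-aware sigmoid gate over [local; global; candidate].
g = 1.0 / (1.0 + np.exp(-W @ np.concatenate([local, global_, cand])))
fused = g * local + (1.0 - g) * global_   # gated fusion of the two views
score = fused @ cand                      # preference score for this candidate
```

Because the gate depends on the candidate embedding, each candidate item gets its own mixture of the local and global views, which is the part of the paper the later discussion about candidate sets touches on.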

@Wicknight Wicknight self-requested a review June 14, 2022 01:51
Review comments on recbole/model/sequential_recommender/fissa.py (resolved)
Tokkiu (Author) commented Jun 15, 2022

@Wicknight Thanks for your kind advice. I have taken it into account and committed a new version. Please check it.

Tokkiu (Author) commented Jun 15, 2022

@Wicknight Thanks for your advice again.
I have now tested the FISSA and SASRec models with all default settings on the ml-100k dataset, which means they use nearly identical hyperparameters for a fair comparison.
The final performance is:

  • FISSA: ('recall@10', 0.1315), ('mrr@10', 0.0436), ('ndcg@10', 0.0637), ('hit@10', 0.1315), ('precision@10', 0.0131)
  • SASRec: ('recall@10', 0.1283), ('mrr@10', 0.0423), ('ndcg@10', 0.062), ('hit@10', 0.1283), ('precision@10', 0.0128)

Wicknight (Collaborator)

@Tokkiu Thank you for your support and contribution to RecBole! I'll check it soon.

Review comments on recbole/model/sequential_recommender/fissa.py (resolved)
Tokkiu (Author) commented Jun 16, 2022

@Wicknight Thanks for your kind comments. I have fixed the issues according to your advice; your suggestions make sense!

Tokkiu (Author) commented Jun 18, 2022

@Wicknight Thank you for your comments. Is there anything else that needs to be improved?

Wicknight (Collaborator)

Thank you for your support. I am testing on other datasets and waiting for the results. I will inform you as soon as I finish the test.

Wicknight (Collaborator)

@Tokkiu During the test, I had a small doubt about the candidate items in lines 116 and 117. All items are used as candidate items in the code here, which increases the burden on the GPU and seems inconsistent with the original intention of the paper. Is there an alternative, in your opinion?

Tokkiu (Author) commented Jun 20, 2022

@Wicknight Thanks for your concern.
According to the paper, the authors adopt binary cross-entropy (BCE), whereas in RecBole we usually use CE or BPR for most sequential models. I also think it is improper to score all items every time the loss is computed. So, in my view, a viable fix is to use BCE with negative sampling instead of full CE. What do you think?
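The BCE-with-negative-sampling idea above can be sketched as follows; this is a minimal numpy illustration under assumed shapes (one positive and k sampled negatives per sequence), not the PR's actual implementation:

```python
import numpy as np

def bce_sampled_loss(pos_scores, neg_scores):
    """Binary cross-entropy over one positive item and k sampled negatives
    per sequence, instead of a softmax (CE) over the full item catalogue."""
    eps = 1e-8
    pos_prob = 1.0 / (1.0 + np.exp(-pos_scores))           # sigmoid, label 1
    neg_prob = 1.0 / (1.0 + np.exp(-neg_scores))           # sigmoid, label 0
    pos_term = -np.log(pos_prob + eps)                     # per-sequence positive loss
    neg_term = -np.log(1.0 - neg_prob + eps).sum(axis=1)   # sum over k negatives
    return (pos_term + neg_term).mean()

# Toy batch: 2 sequences, 3 uniformly sampled negatives each.
pos = np.array([2.0, 1.5])                 # scores of the true next items
neg = np.array([[-1.0, 0.2, -0.5],
                [0.1, -2.0, 0.3]])         # scores of sampled negatives
loss = bce_sampled_loss(pos, neg)
```

The cost per step then scales with the number of sampled negatives rather than the catalogue size, which addresses the GPU-burden concern raised above.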

Wicknight (Collaborator)

@Tokkiu I think this is a good suggestion. Thank you for your help again! I have run some tests, but it seems that I cannot reproduce the results reported in the paper. May I ask whether you have successfully reproduced them?

Tokkiu (Author) commented Jun 21, 2022

Yes, I reached the same conclusion. From my test results, the performance of FISSA is comparable with SASRec, but it cannot consistently outperform SASRec as claimed in the paper.

Wicknight (Collaborator)

Could you tell me how many of the datasets in the paper you have used for testing? Does this inconsistency occur on all of them?

Tokkiu (Author) commented Jun 21, 2022

I mainly tested on the MovieLens datasets. FISSA shows performance similar to SASRec on ml-1m and ml-10m, and comparable performance on ml-100k.

Wicknight (Collaborator)


So you didn't test on the five datasets used in the paper?

Tokkiu (Author) commented Jun 21, 2022

Yes. Did you test on them? How large is the performance gap?

Wicknight (Collaborator) commented Jun 22, 2022

First of all, thank you for your PR and your efforts for RecBole! I didn't test on all the datasets, but the inconsistency occurred on several of them. So if you want to launch this model, you must be able to reproduce the results of the paper. Thank you again for your contribution! I also hope you can continue to discuss the reasons for this inconsistency with me.

Tokkiu (Author) commented Jun 22, 2022


Nice suggestion! I am running experiments on the datasets mainly used in the paper. I think the inconsistency may come from hyperparameter settings, e.g., maximum sequence length, dropout ratio, and negative sampling.
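For concreteness, a hypothetical set of RecBole-style config overrides covering the hyperparameters mentioned above; the exact parameter names and values vary across RecBole versions and the paper's setup, so treat these purely as placeholders to check against both sources:

```python
# Hypothetical RecBole-style overrides; names and values are illustrative only.
config_dict = {
    "MAX_ITEM_LIST_LENGTH": 50,      # maximum sequence length (assumed value)
    "hidden_dropout_prob": 0.5,      # dropout ratio to align with the paper
    "attn_dropout_prob": 0.5,        # attention dropout (assumed)
    "neg_sampling": {"uniform": 1},  # one uniform negative per positive (assumed key)
}
```

Aligning these three knobs between the RecBole run and the paper's reported setup is the natural first step before concluding the model itself underperforms.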
