Including method SVD.get_utility_matrix #5

Open
wants to merge 1 commit into
base: master

Conversation

guedes-joaofelipe
Contributor

A function to get the utility matrix with users as rows and items as columns was added to the SVD class. This matrix may be useful for other matrix factorization methods. A test for the function was developed in the run_experiment.py script.
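For readers unfamiliar with the term, here is a minimal sketch of what such a utility matrix looks like. It is not the code from this PR: the helper name `build_utility_matrix` is hypothetical, and the `u_id`, `i_id`, and `rating` column names are assumed for illustration.

```python
import pandas as pd


def build_utility_matrix(ratings: pd.DataFrame) -> pd.DataFrame:
    """Pivot long-format ratings into a users x items matrix.

    Hypothetical helper; column names "u_id", "i_id", "rating" are assumptions.
    Unrated (user, item) pairs come out as NaN.
    """
    return ratings.pivot_table(
        index="u_id", columns="i_id", values="rating", aggfunc="mean"
    )


# Toy data: three users, three items, four observed ratings.
ratings = pd.DataFrame(
    {
        "u_id": [1, 1, 2, 3],
        "i_id": [10, 20, 10, 30],
        "rating": [4.0, 5.0, 3.0, 2.0],
    }
)

utility = build_utility_matrix(ratings)
print(utility)
```

Note that the result is dense: every unrated pair still occupies a cell, which is what makes this form costly on large catalogs.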

@gbolmier
Owner

Hi @guedes-joaofelipe! Glad to see your interest with this PR :)

Usually the utility matrix is so large and sparse that we don't want to store it in its plain form to avoid memory issues. What are your use cases that would benefit from it?

Also, I'm not sure this logic belongs in the SVD class; it's more about preprocessing (which the SVD does not require).

@guedes-joaofelipe
Contributor Author

Hey @gbolmier!

Usually, I use the matrix form when I want to use the predictions from SVD together with another model (making a hybrid, for example). I'm currently using this form to train a MAB recommender.

But I guess the matrix form could be inserted outside the class. I just made this PR in case anyone would be interested in this function. Do you think it would be better if the function was contained in the utils.py script? (if it should be contained at all in any script)

@gbolmier
Owner

If I understand correctly, you need this format for compatibility with your other models. That format is not really efficient in most recommendation settings. Are you using a library that requires it?

@sicotfre
Contributor

sicotfre commented Jul 23, 2020

> Usually the utility matrix is so large and sparse that we don't want to store it in its plain form to avoid memory issues. What are your use cases that would benefit from it?

FYI, pandas handles sparse data very well, with a low memory footprint: https://pandas.pydata.org/docs/user_guide/sparse.html#sparse-data-structures
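As a rough sketch of that idea, the utility matrix can be built as a sparse pandas DataFrame without ever materializing the dense form. The helper name `build_sparse_utility_matrix` and the `u_id`, `i_id`, `rating` column names are assumptions for illustration, and SciPy is assumed to be installed. One caveat: in this layout unrated pairs read back as 0 rather than NaN, so it only suits rating scales where 0 is not a valid rating.

```python
import pandas as pd
from scipy import sparse


def build_sparse_utility_matrix(ratings: pd.DataFrame) -> pd.DataFrame:
    """Build a users x items matrix backed by sparse columns.

    Hypothetical helper; column names "u_id", "i_id", "rating" are assumptions.
    Only observed ratings are stored; unrated pairs use the fill value 0.
    """
    # Map raw ids to contiguous row/column positions.
    user_codes, users = pd.factorize(ratings["u_id"])
    item_codes, items = pd.factorize(ratings["i_id"])

    matrix = sparse.csr_matrix(
        (ratings["rating"].to_numpy(), (user_codes, item_codes)),
        shape=(len(users), len(items)),
    )
    return pd.DataFrame.sparse.from_spmatrix(matrix, index=users, columns=items)


ratings = pd.DataFrame(
    {
        "u_id": [1, 1, 2, 3],
        "i_id": [10, 20, 10, 30],
        "rating": [4.0, 5.0, 3.0, 2.0],
    }
)

sparse_utility = build_sparse_utility_matrix(ratings)
# Fraction of cells that actually hold a rating.
print(sparse_utility.sparse.density)
```

On real recommendation data the density is typically far lower than in this toy example, which is where the memory savings would come from.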
