
Large Memory Usage even with a moderate dataset #24

Open
HarshdeepGupta opened this issue Sep 11, 2018 · 2 comments

Comments

@HarshdeepGupta

Hi there, I have been trying to use Case Recommender for my dataset. I have nearly 380,000 users and 15,000 items, and when I run the MostPopular example, the memory usage blows up. I allowed it to run for a couple of hours, but no results came up. What might be the possible causes of this, and can it be made to work?
[Screenshot attached: screen shot 2018-09-11 at 10:51:02 PM]

@arthurfortes
Member

Hi, @HarshdeepGupta.

Based on the size of your dataset, I believe the structures that the framework creates during the pre-processing step (dictionaries of values, lists of users and items, etc.) are taking up most, if not all, of the available memory. In the ranking scenario (item recommendation), a recommendation is made for every user in the dataset over all the items they have not seen, which tends to take a long time even with a simple algorithm. In addition, another structure is created to store all the predictions, which consumes even more memory.
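As a rough back-of-envelope sketch of the scale involved (my own illustration, not Case Recommender's actual code, and assuming an average of 30 interactions per user since the real number is not given above), the candidate set alone runs into billions of user-item pairs:

```python
# Back-of-envelope sketch of the item-ranking workload for this dataset.
# Assumption: each user has seen ~30 items on average (hypothetical value).

n_users = 380_000
n_items = 15_000
avg_items_seen = 30

candidate_pairs = n_users * (n_items - avg_items_seen)
print(f"{candidate_pairs:,} user-item pairs to score")          # ~5.7 billion

# Holding just one 8-byte float score per pair would already need roughly:
print(f"~{candidate_pairs * 8 / 1e9:.0f} GB for the scores alone")  # ~45 GB
```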

Unfortunately, handling large datasets is a weakness of Case Recommender, and this is certainly something that needs to be improved.

For now, I recommend looking at the read structures in this file and trying to modify them, or using Surprise, another Python framework that is optimized with C.
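If you do try Surprise, a minimal sketch might look like the following (it assumes a tab-separated user/item/rating file named ratings.tsv; adjust the Reader format, separator, and path to match your data):

```python
# Minimal Surprise sketch (assumed file name and format; adjust to your data).
from surprise import Dataset, Reader, SVD

reader = Reader(line_format='user item rating', sep='\t')
data = Dataset.load_from_file('ratings.tsv', reader=reader)

# Train on the full dataset.
trainset = data.build_full_trainset()
algo = SVD()               # or a simpler baseline such as BaselineOnly()
algo.fit(trainset)

# Score a single (user, item) pair; raw IDs are passed as strings.
print(algo.predict('42', '1001').est)
```

Note that Surprise is oriented toward rating prediction rather than item ranking, so producing a top-N list still means scoring each user's unseen items yourself.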

@HarshdeepGupta
Author

@arthurfortes Thanks for letting us know. It took about 16 hours on my Mac with an i7 and 16 GB of RAM, but the results finally came through. Thanks a lot for developing this library.
