
Large Memory Usage even with a moderate dataset #24

Open
HarshdeepGupta opened this issue Sep 11, 2018 · 2 comments

Comments

@HarshdeepGupta

Hi there, I have been trying to use Case Recommender for my dataset. I have nearly 380,000 users and 15,000 items, and when I run the MostPopular example, the memory usage blows up. I allowed it to run for a couple of hours, but no results came up. What might be the possible causes of this, and can it be made to work?
[Screenshot attached: screen shot 2018-09-11 at 10:51:02 PM]

@arthurfortes
Member

Hi, @HarshdeepGupta.

Based on the size of your dataset, I believe the structures that the framework creates during the pre-processing step (dictionaries of values, lists of users and items, etc.) are taking up most, if not all, of the available memory. In the ranking scenario (item recommendation), a recommendation is made for every user in the dataset over all the items they have not seen, which tends to take a long time even with a simple algorithm. In addition, another structure is created to store all the predictions, which consumes even more memory.
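As a rough back-of-envelope sketch of the scale involved (my own illustration, not Case Recommender's actual code, and assuming an average of 30 interactions per user since the real number is not given above), the candidate set alone runs into billions of user-item pairs:

```python
# Back-of-envelope sketch of the item-ranking workload for this dataset.
# Assumption: each user has seen ~30 items on average (hypothetical value).

n_users = 380_000
n_items = 15_000
avg_items_seen = 30

candidate_pairs = n_users * (n_items - avg_items_seen)
print(f"{candidate_pairs:,} user-item pairs to score")          # ~5.7 billion

# Holding just one 8-byte float score per pair would already need roughly:
print(f"~{candidate_pairs * 8 / 1e9:.0f} GB for the scores alone")  # ~45 GB
```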

Unfortunately, handling large datasets is a weakness of Case Recommender, and this is certainly something that needs to be improved.

For now, I recommend looking at the read structures in this file and trying to modify them, or using Surprise, another Python framework that is optimized with C.
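If you do try Surprise, a minimal sketch might look like the following (it assumes a tab-separated user/item/rating file named ratings.tsv; adjust the Reader format, separator, and path to match your data):

```python
# Minimal Surprise sketch (assumed file name and format; adjust to your data).
from surprise import Dataset, Reader, SVD

reader = Reader(line_format='user item rating', sep='\t')
data = Dataset.load_from_file('ratings.tsv', reader=reader)

# Train on the full dataset.
trainset = data.build_full_trainset()
algo = SVD()               # or a simpler baseline such as BaselineOnly()
algo.fit(trainset)

# Score a single (user, item) pair; raw IDs are passed as strings.
print(algo.predict('42', '1001').est)
```

Note that Surprise is oriented toward rating prediction rather than item ranking, so producing a top-N list still means scoring each user's unseen items yourself.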

@HarshdeepGupta
Author

@arthurfortes Thanks for letting us know. It took about 16 hours on my Mac with an i7 and 16 GB of RAM, but the results finally came through. Thanks a lot for developing this library.
