The following was one of the Scalable Data Mining assignments:
- Load the MovieLens data and convert it into a matrix (for now, let it be dense)
- (Optional: if the data is too big to handle, choose a subset of the ratings)
- Leave out some nonzero entries as a test set
- Perform normalization
- Perform SVD
- Compute the low rank ratings matrix according to the basic latent factor model
- Test performance against the test data that is left out
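
The steps from the train/test split onward could be sketched roughly as below. This is a minimal illustration, not the assignment's reference solution. The assignment does not say which normalization to use, so the sketch assumes per-user mean centering, takes the "basic latent factor model" to mean a rank-k truncated SVD, and measures performance with RMSE. The function names, the 20% test fraction, `k=5`, and the synthetic ratings matrix are all hypothetical choices made for the example.

```python
import numpy as np

def split_train_test(R, test_fraction=0.2, seed=0):
    """Hold out a random fraction of the nonzero entries of R as a test set."""
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(R)
    n_test = int(test_fraction * len(rows))
    picked = rng.choice(len(rows), size=n_test, replace=False)
    test_idx = (rows[picked], cols[picked])
    train = R.copy()
    train[test_idx] = 0.0  # 0 marks "no rating" in the training matrix
    return train, test_idx

def low_rank_predict(train, k):
    """Center by user mean (an assumed normalization), then keep the top-k singular triplets."""
    mask = train > 0
    counts = mask.sum(axis=1)
    global_mean = train[mask].mean()
    # Mean over each user's observed ratings; users with none fall back to the global mean.
    user_mean = np.where(counts > 0, train.sum(axis=1) / np.maximum(counts, 1), global_mean)
    # Unobserved entries are set to 0 after centering, i.e. imputed with the user's mean.
    centered = np.where(mask, train - user_mean[:, None], 0.0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return approx + user_mean[:, None]  # undo the normalization

def rmse(pred, R, test_idx):
    """Root mean squared error on the held-out entries only."""
    diff = pred[test_idx] - R[test_idx]
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-in for the ratings matrix: ratings 1..5, roughly half missing.
rng = np.random.default_rng(42)
R = rng.integers(1, 6, size=(30, 20)).astype(float)
R[rng.random(R.shape) < 0.5] = 0.0

train, test_idx = split_train_test(R)
pred = low_rank_predict(train, k=5)
print("test RMSE:", rmse(pred, R, test_idx))
```

The rank `k` is the main knob here: trying several values and comparing the test RMSE is one simple way to pick it.
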
First, download the data from: https://grouplens.org/datasets/movielens/
I used the 1M ratings dataset (MovieLens 1M).
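
As a rough sketch of the loading step, the function below reads the `UserID::MovieID::Rating::Timestamp` lines of the 1M dataset's `ratings.dat` into a dense user-by-movie matrix, with 0 meaning "no rating". The function name `load_ratings_matrix`, the `max_ratings` parameter (one simple way to take a subset if the data is too big), and the path in the usage comment are assumptions made for illustration.

```python
import numpy as np

def load_ratings_matrix(path, max_ratings=None):
    """Parse 'UserID::MovieID::Rating::Timestamp' lines into a dense matrix."""
    users, movies, ratings = [], [], []
    with open(path, encoding="latin-1") as f:
        for line in f:
            if max_ratings is not None and len(ratings) >= max_ratings:
                break  # optional subset: keep only the first max_ratings ratings
            line = line.strip()
            if not line:
                continue  # skip blank lines
            user, movie, rating, _timestamp = line.split("::")
            users.append(int(user))
            movies.append(int(movie))
            ratings.append(float(rating))

    # Raw IDs are not contiguous, so map them to row/column indices.
    user_ids, row = np.unique(users, return_inverse=True)
    movie_ids, col = np.unique(movies, return_inverse=True)

    R = np.zeros((len(user_ids), len(movie_ids)))
    R[row, col] = ratings
    return R, user_ids, movie_ids

# Usage (the path is an assumption about where the archive was unpacked):
# R, user_ids, movie_ids = load_ratings_matrix("ml-1m/ratings.dat")
```

The returned `user_ids` and `movie_ids` arrays map each row and column back to the original IDs in the file.
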