This repository contains the Mining Massive Data project.
Authors: Yuanze Chen, Alex
Time: 25/10/2016
Data: a subset of the Million Song Dataset with 10,000 songs (1.8 GB compressed).
using Locality-Sensitive Hashing (LSH) and cosine distance.
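As an illustration of how LSH pairs with cosine distance, here is a minimal sketch using random hyperplanes. It is not taken from the project code: the function names, the number of bits, and the use of exact signature matches as buckets are all assumptions made for the example.

```python
import math
import random


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplane normals (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(h_i * x_i for h_i, x_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def candidate_buckets(vectors, hyperplanes):
    """Group item ids whose signatures match exactly."""
    buckets = {}
    for item_id, vec in vectors.items():
        buckets.setdefault(signature(vec, hyperplanes), []).append(item_id)
    return buckets
```

Vectors separated by a small angle agree on most bits, so items sharing a bucket are candidates for an exact cosine_distance check. In practice several hash tables are usually combined to trade recall against the number of candidates.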
using Latent Factor Models.
using alternating optimization to find the latent factors of the user-song-count matrix.
using gradient descent, SGD, and mini-batch SGD to solve the latent factor problem.
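The SGD variant can be sketched as follows. This is an illustrative example only, not the project's implementation: the toy (user, song, count) triples, the squared-error objective with L2 regularization, and every hyperparameter value are assumptions. Batch gradient descent and mini-batch SGD differ only in how many triples contribute to each update.

```python
import math
import random

# Hypothetical toy data: (user index, song index, play count).
triples = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
           (2, 1, 1), (2, 2, 5), (3, 0, 1), (3, 2, 4)]


def rmse(data, P, Q):
    """Root-mean-square error of the predictions P[u] . Q[s] on the observed entries."""
    squared = sum((c - sum(p * q for p, q in zip(P[u], Q[s]))) ** 2 for u, s, c in data)
    return math.sqrt(squared / len(data))


def sgd_factorize(data, num_users, num_songs, rank=2, lr=0.02, reg=0.01,
                  epochs=200, seed=0):
    """Fit user factors P and song factors Q, one observed entry per update."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(num_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(num_songs)]
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the observed entries in random order
        for u, s, c in samples:
            err = c - sum(p * q for p, q in zip(P[u], Q[s]))
            for k in range(rank):
                p_uk, q_sk = P[u][k], Q[s][k]
                # Gradient step on (c - p.q)^2 + reg * (|p|^2 + |q|^2),
                # with the constant factor 2 folded into lr.
                P[u][k] += lr * (err * q_sk - reg * p_uk)
                Q[s][k] += lr * (err * p_uk - reg * q_sk)
    return P, Q
```

Calling sgd_factorize(triples, num_users=4, num_songs=3) returns the two factor matrices, and rmse(triples, P, Q) measures how well they fit the observed counts.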
computing song similarity and building a similarity network of songs.
using Topic-Specific PageRank to rank songs.
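For reference, Topic-Specific PageRank differs from ordinary PageRank only in where the random surfer teleports: to a chosen topic set rather than to all nodes uniformly. The sketch below assumes an unweighted, directed adjacency-list graph and a damping factor of 0.85. The function name and parameters are hypothetical, and the project's own network may be weighted.

```python
def topic_specific_pagerank(adjacency, topic_nodes, beta=0.85, num_iters=100):
    """Power iteration with teleportation restricted to topic_nodes.

    adjacency maps each node to a list of its out-neighbours; every
    neighbour must also appear as a key.
    """
    nodes = list(adjacency)
    topic = set(topic_nodes)
    teleport = {n: (1.0 / len(topic) if n in topic else 0.0) for n in nodes}
    rank = dict(teleport)  # start from the teleport distribution
    for _ in range(num_iters):
        new_rank = {n: 0.0 for n in nodes}
        for n in nodes:
            out = adjacency[n]
            if out:
                share = beta * rank[n] / len(out)
                for m in out:
                    new_rank[m] += share
        # Mass not passed along edges (teleport share plus dead ends)
        # is redistributed over the topic set only.
        leaked = 1.0 - sum(new_rank.values())
        for n in nodes:
            new_rank[n] += leaked * teleport[n]
        rank = new_rank
    return rank
```

Sorting the songs by the returned score gives a ranking biased toward songs that are close to the topic set in the network.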
using the network from project 4 to construct a weighted adjacency matrix.
performing spectral clustering on it, with support for both normalized and unnormalized graph Laplacians.
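The two Laplacian variants can be illustrated with a small, dependency-free sketch. It makes several simplifying assumptions that are not from the project: the graph is split into only two clusters by the sign of the Fiedler vector (rather than running k-means on several eigenvectors), the eigenvector is found by shifted power iteration, the graph is connected, and every node has positive degree. All function names are hypothetical.

```python
import math


def laplacian(W, normalized=False):
    """Return L = D - W, or L_sym = I - D^{-1/2} W D^{-1/2} if normalized."""
    n = len(W)
    deg = [sum(row) for row in W]
    L = [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(n)] for i in range(n)]
    if normalized:
        L = [[L[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    return L


def fiedler_vector(L, trivial, shift, num_iters=500):
    """Power iteration on (shift * I - L), deflating the known trivial eigenvector."""
    n = len(L)
    norm_t = math.sqrt(sum(t * t for t in trivial))
    t = [x / norm_t for x in trivial]
    x = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(num_iters):
        proj = sum(a * b for a, b in zip(x, t))
        x = [a - proj * b for a, b in zip(x, t)]  # remove the trivial component
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm_y = math.sqrt(sum(a * a for a in y))
        x = [a / norm_y for a in y]
    return x


def spectral_bipartition(W, normalized=False):
    """Split the nodes into two clusters by the sign of the Fiedler vector."""
    deg = [sum(row) for row in W]
    if normalized:
        # L_sym has eigenvalues in [0, 2]; its trivial eigenvector is sqrt(deg).
        trivial, shift = [math.sqrt(d) for d in deg], 2.0
    else:
        # L has eigenvalues in [0, 2 * max degree]; its trivial eigenvector is all ones.
        trivial, shift = [1.0] * len(W), 2.0 * max(deg)
    v = fiedler_vector(laplacian(W, normalized), trivial, shift)
    return [1 if x > 0 else 0 for x in v]
```

For larger networks a sparse eigensolver would be the more usual choice. The power iteration here only keeps the example self-contained.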