GitHub - chenyz0601/mmd-project: Mining Million Song Dataset

This repository contains the projects for Mining Massive Data.
Authors: Yuanze Chen, Alex
Time: 25/10/2016

Data

A subset of the Million Song Dataset: 10,000 songs (1.8 GB compressed).

Project 1: duplicate detection

Uses Locality-Sensitive Hashing (LSH) with cosine distance.
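As an illustration of how LSH and cosine distance can fit together, here is a minimal sketch that uses random-hyperplane signatures split into bands. It is not the code in project1: the function names, the banding scheme, and the default parameters (`n_bits`, `band_size`, `threshold`) are all assumptions, and feature vectors are assumed to be non-zero.

```python
import math
import random


def cosine_distance(u, v):
    """1 - cos(angle between u and v); assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def candidate_pairs(vectors, planes, band_size):
    """Pairs of indices that share an identical band in at least one position."""
    buckets = {}
    for idx, vec in enumerate(vectors):
        sig = signature(vec, planes)
        for start in range(0, len(sig), band_size):
            key = (start, sig[start:start + band_size])
            buckets.setdefault(key, []).append(idx)
    pairs = set()
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add((members[i], members[j]))
    return pairs


def find_duplicates(vectors, n_bits=32, band_size=8, threshold=0.05, seed=0):
    """Filter LSH candidates by exact cosine distance (hypothetical defaults)."""
    planes = make_hyperplanes(n_bits, len(vectors[0]), seed)
    return sorted(
        (i, j)
        for i, j in candidate_pairs(vectors, planes, band_size)
        if cosine_distance(vectors[i], vectors[j]) <= threshold
    )
```

Only pairs that collide in some band are compared exactly, which is what keeps the number of distance computations well below all-pairs on a large collection.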

Project 2: song recommendation (part 1)

Uses latent factor models.
Alternating optimization finds the latent factors of the user-song-count matrix.
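The alternating scheme can be sketched as alternating least squares: fix the song factors and solve a small ridge-regression problem per user, then swap roles. This is only a sketch under assumptions, not the project2 code: the `(user, song, count)` triple format, the L2 penalty `lam`, the rank `k`, and every function name here are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - rest) / M[i][i]
    return x


def update_factors(fixed, observations, k, lam):
    """For each row, solve (sum f f^T + lam I) x = sum count * f."""
    updated = []
    for obs in observations:  # obs: list of (index into fixed, count)
        A = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
        b = [0.0] * k
        for idx, count in obs:
            f = fixed[idx]
            for i in range(k):
                b[i] += count * f[i]
                for j in range(k):
                    A[i][j] += f[i] * f[j]
        updated.append(solve(A, b))
    return updated


def loss(triples, P, Q, lam):
    """Squared error on observed entries plus the L2 penalty."""
    err = sum((r - dot(P[u], Q[s])) ** 2 for u, s, r in triples)
    reg = sum(x * x for row in P for x in row) + sum(x * x for row in Q for x in row)
    return err + lam * reg


def als(triples, n_users, n_songs, k=2, lam=0.1, n_iters=10, seed=0):
    rng = random.Random(seed)
    P = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_songs)]
    by_user = [[] for _ in range(n_users)]
    by_song = [[] for _ in range(n_songs)]
    for u, s, r in triples:
        by_user[u].append((s, r))
        by_song[s].append((u, r))
    history = []
    for _ in range(n_iters):
        P = update_factors(Q, by_user, k, lam)  # song factors held fixed
        Q = update_factors(P, by_song, k, lam)  # user factors held fixed
        history.append(loss(triples, P, Q, lam))
    return P, Q, history
```

Because each half-step minimizes the objective exactly over one block of variables, the recorded loss should not increase from one iteration to the next.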

Project 3: song recommendation (part 2)

Uses gradient descent, SGD, and mini-batch SGD to solve the latent factor problem.
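The three optimizers differ only in how many observations feed each update, so one hedged sketch can cover them through a `batch_size` parameter: the full data set gives batch gradient descent, 1 gives SGD, and anything in between gives mini-batch SGD. The loss (squared error with a per-observation L2 penalty), the learning rate, the positive initialization, and all names are assumptions, not taken from project3.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rmse(triples, P, Q):
    """Root-mean-square error on the observed (user, song, count) triples."""
    total = sum((r - dot(P[u], Q[s])) ** 2 for u, s, r in triples)
    return (total / len(triples)) ** 0.5


def factorize(triples, n_users, n_songs, k=2, lr=0.02, lam=0.01,
              n_epochs=500, batch_size=1, seed=0):
    """Gradient-based latent factor fit.

    batch_size == len(triples) -> full-batch gradient descent
    batch_size == 1            -> stochastic gradient descent
    otherwise                  -> mini-batch SGD
    """
    rng = random.Random(seed)
    # Small positive start values (an assumption; counts are non-negative).
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_songs)]
    data = list(triples)
    for _ in range(n_epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad_P, grad_Q = {}, {}
            # Accumulate gradients over the batch before touching P or Q.
            for u, s, r in batch:
                err = r - dot(P[u], Q[s])
                gp = grad_P.setdefault(u, [0.0] * k)
                gq = grad_Q.setdefault(s, [0.0] * k)
                for f in range(k):
                    gp[f] += -2.0 * err * Q[s][f] + 2.0 * lam * P[u][f]
                    gq[f] += -2.0 * err * P[u][f] + 2.0 * lam * Q[s][f]
            step = lr / len(batch)  # average the gradient over the batch
            for u, g in grad_P.items():
                for f in range(k):
                    P[u][f] -= step * g[f]
            for s, g in grad_Q.items():
                for f in range(k):
                    Q[s][f] -= step * g[f]
    return P, Q
```

For example, `factorize(triples, n_users, n_songs, batch_size=len(triples))` runs one averaged update per epoch, while `batch_size=1` updates after every single observation.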

Project 4: song ranking

Computes song similarity and builds a similarity network of songs.
Uses Topic-Specific PageRank to rank songs.
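A minimal sketch of Topic-Specific PageRank on a weighted similarity network follows. It differs from plain PageRank only in the teleport step, which jumps back to a chosen set of topic songs instead of to all songs uniformly. The dict-of-dicts graph format, the damping factor `beta`, and the function name are assumptions; every neighbour is assumed to appear as a key of `graph`.

```python
def topic_pagerank(graph, topic_nodes, beta=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank with teleports restricted to topic_nodes.

    graph: {song: {neighbour: similarity_weight}}; every neighbour must
    also be a key of graph (assumed input format).
    """
    nodes = sorted(graph)
    topic = set(topic_nodes)
    teleport = {n: (1.0 / len(topic) if n in topic else 0.0) for n in nodes}
    rank = dict(teleport)  # start with all mass on the topic set
    for _ in range(max_iter):
        new_rank = {n: 0.0 for n in nodes}
        for src in nodes:
            total = sum(graph[src].values())
            if total == 0:
                continue  # dangling node: its mass is redistributed below
            for dst, weight in graph[src].items():
                new_rank[dst] += beta * rank[src] * weight / total
        # Mass lost to damping and dangling nodes goes back to the topic set.
        leaked = 1.0 - sum(new_rank.values())
        for n in nodes:
            new_rank[n] += leaked * teleport[n]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank
```

Sorting the returned dictionary by value gives a ranking biased toward songs that are close, in the similarity network, to the topic set.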

Project 5: song clustering

Uses the network from Project 4 to construct a weighted adjacency matrix.
Performs spectral clustering on it, supporting both the normalized and unnormalized graph Laplacian.
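To make the two Laplacian variants concrete, here is a hedged two-cluster sketch. It builds either the unnormalized Laplacian L = D - W or the symmetric normalized one D^(-1/2) L D^(-1/2), then splits the songs by the sign of the second-smallest eigenvector. Two simplifications are swapped in and are not from project5: shifted power iteration replaces a full eigen-solver, and sign thresholding replaces k-means on several eigenvectors. All names are hypothetical, and the normalized variant assumes no isolated nodes.

```python
import math


def laplacian(W, normalized=False):
    """Return (L, degrees) for a symmetric weighted adjacency matrix W."""
    n = len(W)
    deg = [sum(row) for row in W]
    L = [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(n)]
         for i in range(n)]
    if normalized:
        # Symmetric normalized Laplacian; assumes every degree is > 0.
        scale = [1.0 / math.sqrt(d) for d in deg]
        L = [[scale[i] * L[i][j] * scale[j] for j in range(n)]
             for i in range(n)]
    return L, deg


def fiedler_vector(W, normalized=False, n_iters=500):
    """Eigenvector of the second-smallest eigenvalue via power iteration."""
    L, deg = laplacian(W, normalized)
    n = len(L)
    # Eigenvector of eigenvalue 0, which must be projected out.
    trivial = [math.sqrt(d) for d in deg] if normalized else [1.0] * n
    norm = math.sqrt(sum(t * t for t in trivial))
    trivial = [t / norm for t in trivial]
    # Upper bound on the eigenvalues, so shift*I - L is positive semidefinite.
    shift = 2.0 if normalized else 2.0 * max(deg)
    v = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
    for _ in range(n_iters):
        proj = sum(a * t for a, t in zip(v, trivial))
        v = [a - proj * t for a, t in zip(v, trivial)]
        v = [shift * v[i] - sum(L[i][j] * v[j] for j in range(n))
             for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return v


def spectral_bipartition(W, normalized=False):
    """Label each node 0 or 1 by the sign of its Fiedler vector entry."""
    return [1 if x >= 0 else 0 for x in fiedler_vector(W, normalized)]
```

Which side receives label 0 or 1 is arbitrary, since an eigenvector is only defined up to sign. For more than two clusters, the usual approach is to take several eigenvectors and cluster their rows.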

Folders and files (8 commits): project1, project2, project3, project4, project5, README.md
