Implementations of several well-known clustering algorithms, together with an analysis of their results.
- football.csv contains data on about 18K football players: their in-game features, abilities, and skills, plus other attributes such as club, nationality, and height.
- 1_data_visualization.ipynb contains visualizations of the CSV data.
- 2_KMeans.ipynb contains a from-scratch implementation of the K-means clustering algorithm, written without using any built-in libraries.
- 3_Agglomerative.ipynb contains an implementation of agglomerative hierarchical clustering.
- 3_Divisive.ipynb contains an implementation of divisive hierarchical clustering.
- 4_DBSCAN.ipynb contains an implementation of the DBSCAN clustering algorithm.
- Report.pdf contains our detailed analysis of all the tasks and a comparison between them.
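
For readers who want a feel for what a from-scratch K-means looks like, here is a minimal sketch of Lloyd's algorithm using only the Python standard library. It is not the code from 2_KMeans.ipynb: the function names, the random initialisation, the stopping rule, and the toy points are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    """Lloyd's algorithm on a list of points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumed initialisation: k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy data standing in for two numeric player features (hypothetical values).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_centroids, toy_labels = kmeans(toy_points, k=2)
print(toy_labels)
```

On real data the features would normally be scaled before clustering, since K-means relies on Euclidean distance; whether the notebook does so is not stated here.
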
All the notebooks begin with the same data-cleaning code. Print statements after each cell show the progress of the run.
To visualize the clusters in 2D, PCA was used for dimensionality reduction.
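
As a rough illustration of the 2D projection step, the sketch below computes the first two principal components with NumPy's SVD. Which PCA implementation the notebooks use is not stated, so the `pca_2d` helper, the use of NumPy, and the random stand-in data are assumptions.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    X = np.asarray(X, dtype=float)
    # Centre each feature so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T


# Hypothetical stand-in for the cleaned player features (100 players, 5 features).
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 5))
coords = pca_2d(features)
print(coords.shape)
```

Each row of `coords` can then be drawn as a point on a scatter plot and coloured by its cluster label.
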