maryamteimouri/DataAnalysis-and-KnowledgeDiscovery

Data Analysis and Knowledge Discovery

This project practices the steps of the Cross-Industry Standard Process for Data Mining (CRISP-DM). The repository covers three phases: data understanding, supervised learning, and unsupervised learning.

  • In P1 (data understanding), I inspect the data and check its quality by plotting the numeric and categorical features. I also apply several preprocessing methods: min-max scaling to [0, 1], standardization to zero mean and unit variance, and one-hot encoding.

  • In P2 (supervised learning), three methods are implemented: k-nearest neighbors (KNN) classification, ridge regression, and KNN regression. For hyperparameter optimization, I use leave-one-out cross-validation.

  • In P3 (unsupervised learning), several preprocessing and data visualization methods are implemented: z-score standardization, principal component analysis (PCA), and dendrograms. In addition, two clustering methods are applied: agglomerative hierarchical clustering and k-means clustering.
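
As a rough illustration of the P1 preprocessing steps and the P2 leave-one-out procedure, here is a minimal pure-Python sketch. It is not the repository's code: the function names (`min_max_scale`, `standardize`, `one_hot`, `knn_predict`, `loo_accuracy`), the toy data, and the choice of Euclidean distance with majority voting are all assumptions made for this example.

```python
import math
from collections import Counter


def min_max_scale(values):
    """Rescale a numeric feature linearly so it spans [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale a numeric feature to zero mean and unit variance."""
    mean = sum(values) / len(values)
    # Population variance (divide by n), an assumption of this sketch.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def one_hot(labels):
    """Encode a categorical feature as 0/1 columns, one per sorted category."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def knn_predict(train_X, train_y, x, k):
    """Predict the majority label among the k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k):
    """Leave-one-out CV: hold out each sample once, train on the rest."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated one-dimensional classes.
    X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
    y = ["a", "a", "a", "b", "b", "b"]
    for k in (1, 3, 5):
        print(k, loo_accuracy(X, y, k))
```

On this toy data, k = 1 and k = 3 classify every held-out point correctly, while k = 5 misclassifies every one, because holding a point out leaves its own class in the minority. Selecting the k with the best leave-one-out score is one simple way to tune this hyperparameter.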
