Data Analysis and Knowledge Discovery

This project practices the steps of the Cross-Industry Standard Process for Data Mining (CRISP-DM). The repository covers three phases: data understanding, supervised learning, and unsupervised learning.

In P1, data understanding, I practice inspecting the data and checking data quality by plotting numeric and categorical features. I also apply preprocessing methods such as min-max scaling to [0, 1], standardizing features to zero mean and unit variance, and one-hot encoding.
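
The three preprocessing steps named for P1 can be sketched as follows. This is a minimal illustration that uses only the Python standard library; the notebook itself may rely on other tools. The function names and the toy `ages` and `colors` data are hypothetical and are not taken from the repository.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to rescale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: avoid dividing by zero
    return [(v - mu) / sigma for v in values]


def one_hot(labels):
    """Return (categories, rows) with one 0/1 column per distinct category."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical toy data, not taken from the notebooks.
ages = [20.0, 30.0, 40.0]
colors = ["red", "blue", "red"]

scaled = min_max_scale(ages)           # values now lie in [0, 1]
zscores = standardize(ages)            # mean 0, population variance 1
categories, encoded = one_hot(colors)  # one column per category
```

Min-max scaling preserves the shape of the distribution but is sensitive to outliers, while standardization is the usual choice before distance-based methods.
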
In P2, supervised learning, three methods are implemented: k-nearest neighbors (KNN) classification, ridge regression, and KNN regression. For hyperparameter optimization, I used leave-one-out cross-validation.
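
As an illustration of hyperparameter selection in P2, the sketch below chooses the number of neighbors for a KNN classifier with leave-one-out cross-validation. It is a from-scratch example using only the standard library, not the notebook's implementation; the helper names, the candidate values of `k`, and the toy data are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict the majority label among the k training points nearest to query."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(X, y, k):
    """Leave-one-out CV: hold out each sample once and train on all the others."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        correct += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return correct / len(X)


# Hypothetical toy data: two well-separated groups of three points each.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
y = ["a", "a", "a", "b", "b", "b"]

# Evaluate each candidate k and keep the one with the best held-out accuracy.
scores = {k: loocv_accuracy(X, y, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
```

On this toy data, `k = 5` performs poorly because each held-out point is outvoted by the three points of the other class. The same loop could in principle be used to tune the regularization strength of ridge regression by swapping the prediction function and the score.
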
In P3, unsupervised learning, several preprocessing and data visualization methods are implemented: z-score standardization, principal component analysis (PCA), and dendrograms. In addition, two clustering methods are applied: agglomerative hierarchical clustering and k-means clustering.
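
Of the P3 methods, k-means is the simplest to sketch from scratch. The example below is a minimal version of Lloyd's algorithm using only the standard library; it is not the notebook's code. Initializing the centroids from the first `k` points is an assumption made here for determinism, and the toy `points` are hypothetical. In practice the features would typically be z-score standardized first.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; initial centroids are the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Hypothetical toy data: two groups of two points each.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
assignments, centroids = kmeans(points, k=2)
```

The result depends on the initial centroids, which is why library implementations usually restart from several random initializations and keep the best run.
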


maryamteimouri/DataAnalysis-and-KnowledgeDiscovery
