Digit recognition using the MNIST dataset and scikit-learn
- Applying basic scikit-learn models (Naive Bayes, including Gaussian Naive Bayes; Linear Regression; Logistic Regression; K-Nearest Neighbors) to classify handwritten digits.
- Functions to print grids of images.
- Writing a custom image-blurring function from scratch.
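
As an illustration of the blurring item above, here is a minimal sketch of one way such a function could be written. It assumes a 3x3 box blur (each pixel replaced by the mean of itself and its 8 neighbours) with edge pixels replicated at the border. The name `box_blur`, the `passes` parameter, and the test image are hypothetical and not taken from the project.

```python
import numpy as np


def box_blur(image, passes=1):
    """Blur a 2D image by averaging each pixel with its 8 neighbours.

    Assumption: a 3x3 box filter; the project's own blur may differ.
    """
    blurred = np.asarray(image, dtype=float)
    for _ in range(passes):
        # Replicate edge pixels so border pixels also have 8 neighbours.
        padded = np.pad(blurred, 1, mode="edge")
        height, width = blurred.shape
        total = np.zeros_like(blurred)
        # Sum the 9 shifted views of the padded image.
        for dy in range(3):
            for dx in range(3):
                total += padded[dy:dy + height, dx:dx + width]
        blurred = total / 9.0
    return blurred


# Hypothetical 28x28 image (the MNIST digit size) with one bright pixel.
digit = np.zeros((28, 28))
digit[14, 14] = 9.0
smoothed = box_blur(digit)
```

After one pass the single bright pixel is spread evenly over its 3x3 neighbourhood. A flattened MNIST row of 784 values would need `reshape(28, 28)` before being passed in.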
Basic natural language processing using altnet forum messages and scikit-learn
- Applying basic scikit-learn models (Naive Bayes, Linear Regression, Logistic Regression) to text data to predict message categories.
- Basic text pre-processing using NLTK and regular expressions.
- Creating vocabularies for training machine learning models to predict message categories.
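
The pre-processing and vocabulary items above can be sketched roughly as follows. This is a simplified illustration using only the standard-library `re` module (no NLTK). The function names, the token pattern, and the two sample messages are hypothetical and not taken from the project.

```python
import re
from collections import Counter

# Assumption: a token is any run of ASCII letters after lowercasing.
TOKEN_PATTERN = re.compile(r"[a-z]+")


def tokenize(message):
    """Lowercase a message and keep only runs of letters."""
    return TOKEN_PATTERN.findall(message.lower())


def build_vocabulary(messages, min_count=1):
    """Map each token seen at least min_count times to a column index."""
    counts = Counter(token for m in messages for token in tokenize(m))
    kept = sorted(t for t, c in counts.items() if c >= min_count)
    return {token: index for index, token in enumerate(kept)}


def vectorize(message, vocabulary):
    """Turn one message into a bag-of-words count vector."""
    vector = [0] * len(vocabulary)
    for token in tokenize(message):
        if token in vocabulary:  # out-of-vocabulary tokens are ignored
            vector[vocabulary[token]] += 1
    return vector


# Hypothetical sample messages, for illustration only.
messages = ["The shuttle launch was delayed.", "Launch the rendering engine!"]
vocabulary = build_vocabulary(messages)
features = vectorize("launch launch now", vocabulary)
```

Raising `min_count` shrinks the vocabulary by dropping rare tokens, which is one simple lever for trading model size against coverage. The resulting count vectors can be stacked into a matrix and passed to a classifier.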
Unsupervised learning using scikit-learn and the Kaggle Poisonous Mushrooms dataset
- Applying dimensionality reduction (principal component analysis) and basic unsupervised learning (Gaussian mixture models) to a high-dimensional mushroom dataset to classify mushrooms as poisonous or non-poisonous.
- Optimizing Gaussian mixture models.
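
One possible way to combine the two techniques above is sketched here: project the data onto a few principal components, fit one Gaussian mixture model per class, and label a point by whichever model gives it the higher log-likelihood. This is an assumption about the general approach, not a description of the project's code. The data are synthetic stand-ins for the mushroom features, and the component counts and covariance type are illustrative choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the mushroom data: two separated groups, 20 features.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 20)),  # class 0 (e.g. non-poisonous)
    rng.normal(4.0, 1.0, size=(200, 20)),  # class 1 (e.g. poisonous)
])
y = np.array([0] * 200 + [1] * 200)

# Reduce to 2 principal components before fitting the mixtures.
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)

# Fit one mixture per class on that class's projected points.
gmm_by_class = {
    label: GaussianMixture(
        n_components=1, covariance_type="full", random_state=0
    ).fit(X_2d[y == label])
    for label in (0, 1)
}


def predict(points):
    """Assign each point to the class whose GMM scores it highest."""
    scores = np.column_stack(
        [gmm_by_class[label].score_samples(points) for label in (0, 1)]
    )
    return scores.argmax(axis=1)


accuracy = (predict(X_2d) == y).mean()
```

Optimizing such a model could then mean searching over the number of PCA components, the number of mixture components, and `covariance_type` (`"full"`, `"tied"`, `"diag"`, `"spherical"`), and comparing accuracy on held-out data.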