Here are all the challenges I've worked on so far. I started several weeks ago, and I'm now cleaning up these kernels and making them public. Side note: some of my solutions require write permission on the hard drive, so very few kernels could be released on Kaggle (where write access isn't allowed)...
- Use case: classify images of clothing with a neural network
- Data: 28x28 grayscale images associated with a label from 10 classes
- Concepts: multilayer perceptron (MLP) with TensorFlow
- Use case: Distinguish images of dogs from cats
- Data: color images for binary classification: 25,000 labeled, plus 12,500 unlabeled for the submission
- Concepts: deep learning / computer vision using a CNN with TensorFlow
- Use case: generation of new digits
- Data: the famous MNIST data intended to learn computer vision fundamentals
- Concepts: unsupervised deep learning with a GAN: training a Generator and a Discriminator
- Use case: Predict whether a question asked on Quora is sincere or not
- Data: 1.3M labeled question texts
- Concepts: supervised ML, 1st part: NLTK, tokenization, stemming, TF-IDF and CountVectorizer
- Concepts: word embeddings, Word2Vec, RNN
- Use case: playlist generators for video and music services like Netflix, YouTube and Spotify...
- Data: 100,000 ratings from 1000 users on 1700 movies (MovieLens 100K Dataset)
- Concepts: supervised ML, hybrid recommender system (mixing collaborative and content-based filtering) with LightFM
- Use case: forecast rentals of a city bikeshare system
- Data: datetime, weather information, number of rentals
- Concepts: supervised ML regression (gradient boosting regressor, Ridge/Lasso), metric: RMSLE
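
For readers unfamiliar with the RMSLE metric named above, here is a minimal plain-Python sketch of how it can be computed. The function name `rmsle` is an illustrative choice, not taken from the kernel, and the sketch assumes non-negative targets and predictions (as with rental counts).

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # log1p(x) = log(1 + x), which keeps zero counts well defined.
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


if __name__ == "__main__":
    # Hypothetical rental counts, for illustration only.
    print(rmsle([10, 120, 300], [12, 100, 330]))
```

Because the error is taken on a log scale, the metric penalizes relative rather than absolute differences: predicting 99 for a true value of 9 costs the same as predicting 999 for a true value of 99.
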
- Use case: predict whether income exceeds $50K/yr based on census, define people profiles
- Data: age, workclass, education, marital-status, occupation, race, sex, capital-gain/loss, hours-per-week, country.
- Concepts: supervised ML binary classification, model explanation / feature analysis, GridSearchCV.
- Use case: customer segmentation, to identify the customers a marketing strategy should target
- Data: customerID, gender, age, annual income (k$) & spending score of a mall's customers
- Concepts: unsupervised ML (KMeans Clustering)
- Use case: anomaly / fraud detection
- Data: anonymized credit card transactions labeled as fraudulent or genuine, recorded over 2 days
- Concepts: supervised ML binary classification, highly imbalanced classes, metric: Area Under the Precision-Recall Curve (AUPRC), PCA, Synthetic Minority Over-sampling Technique (SMOTE) & LOF model (LocalOutlierFactor)
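
As a rough illustration of the AUPRC metric named above, the sketch below computes average precision, a common step-wise estimate of the area under the precision-recall curve. It is plain Python rather than the library call a kernel would normally use; the name `average_precision` is hypothetical, and the sketch assumes 0/1 labels and no tied scores.

```python
def average_precision(y_true, scores):
    """Average precision: mean of the precision at each true positive's rank."""
    n_positive = sum(y_true)
    if n_positive == 0:
        raise ValueError("at least one positive label is required")
    # Rank transactions from most to least suspicious (ties are not handled).
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / n_positive


if __name__ == "__main__":
    # Toy labels and scores, for illustration only.
    print(average_precision([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With highly imbalanced classes this metric is more informative than accuracy, since a model that flags nothing as fraud can still be almost always "correct".
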
- Use case: predict a real estate price
- Data: Median house prices for California districts derived from the 1990 census.
- Concepts: supervised ML regression, analysis of geospatial data
- Use case: predict whether or not an applicant will be able to repay a loan
- Data: previous credits, POS (point of sale) and cash loans, previous applications and repayment history
- Concepts: supervised ML binary classification (LightGBM, XGBoost), imbalanced classes, metric: area under the ROC curve
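
The area under the ROC curve named above can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. A minimal plain-Python sketch of that pairwise definition follows; the name `roc_auc` is illustrative, labels are assumed to be 0/1, and the quadratic loop is only suitable for small samples.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, y_true) if y == 1]
    negatives = [s for s, y in zip(scores, y_true) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy labels and scores, for illustration only.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, regardless of how imbalanced the classes are.
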
- Use case: predict behavior to retain customers
- Data: customers who left, services, account information, contract, payment method, charges, demographic information
- Concepts: supervised ML binary classification, metric: F1 score, hyperparameter tuning, model pipelines
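
To make the F1 metric named above concrete, here is a small plain-Python sketch for the binary case. The name `f1_binary` is hypothetical, and the sketch assumes 0/1 labels where 1 marks the positive class (here, a customer who leaves).

```python
def f1_binary(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        # No positive was found, so precision or recall is zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels and predictions, for illustration only.
    print(f1_binary([1, 1, 1, 0, 0], [1, 0, 1, 1, 0]))
```

Unlike accuracy, F1 drops to zero for a model that predicts that nobody leaves, which is why it suits problems where the positive class is the minority.
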
- Use case: predict sales prices
- Data: area, shape, condition, construction year...
- Concepts: supervised ML regression, feature engineering practice, random forests and gradient boosting