We are a group of passionate data scientists and software developers on a mission to make intermediate and advanced topics in machine learning, data science and AI software engineering accessible to the wider data science community.
We create intermediate and advanced online courses on machine learning, data science and AI software development. We also maintain an open-source library for feature engineering: Feature-engine.
In addition, we talk, blog and participate in podcasts about machine learning, software development and open-source.
Check out the courses that we teach.
Courses | What you will learn |
---|---|
Feature engineering for machine learning | Learn to create new features, impute missing data, encode categorical variables, transform and discretize features and much more. |
Feature selection for machine learning | Learn to select features using wrapper, filter, embedded and hybrid methods, and build simpler and more reliable models. |
Hyperparameter optimization for machine learning | Learn about grid and random search, Bayesian Optimization, Multi-fidelity models, Optuna, Hyperopt, Scikit-Optimize and more. |
Machine learning with imbalanced data | Learn about under- and over-sampling, ensemble and cost-sensitive methods, and improve the performance of models trained on imbalanced data. |
Feature engineering for time series forecasting | Learn to create lag and window features, impute data in time series, encode categorical variables and much more, specifically for forecasting. |
Forecasting with Machine Learning | Learn to perform time series forecasting with machine learning models like linear regression, random forests and XGBoost. |
Machine Learning Interpretability | Learn to interpret the predictions of your white box and black box machine learning models. |
Find out more about machine learning through our books, and have the code at your fingertips.
Books | Summary |
---|---|
Python Feature Engineering Cookbook, third edition | Over 70 Python recipes to implement feature engineering in tabular, transactional, time series and text data. |
Feature selection in machine learning with Python | Over 20 methods to select the most predictive features and build simpler, faster, and more reliable machine learning models. |
Check out the open-source library that we maintain.
Library | About | Sponsor us |
---|---|---|
Feature-engine | Multiple transformers for missing data imputation, categorical encoding, variable transformation and discretization, feature creation and more. | Sponsor us |
Get to know the creators and instructors of our courses.
Instructor | Role |
---|---|
Soledad Galli | Data scientist |
Kishan Manani | Data scientist |
Chris Samiullah | Software developer |
Follow us on social media or through our website to stay up to date with our latest news.
Media | Summary |
---|---|
Train in Data | Enroll in our courses and get our books |
Newsletter | I talk about data science, machine learning and how to become a data scientist. |
YouTube | I post about data science, machine learning and how to become a data scientist. |
I talk about data science, machine learning and how to become a data scientist. | |
I tweet about data science, machine learning and how to become a data scientist. | |
I talk about data science, machine learning and how to become a data scientist. | |
Blog | I write about data science, machine learning, feature engineering and selection and more. |
We hope to see you around.