Serverless MLOps System on AWS

[Still in development] TODOs:

❗ transformations and storage of new user input

❗ drift detection algorithm

Overview

This project consists of two directories: database and ml-demo-api. The aim is to create and deploy an MLOps architecture that serves a model for mental health disorder prediction. The entire workflow consists of three AWS Lambda functions and an AWS RDS MySQL database.

The architecture supports:

Training, logging and versioning the models.
Loading the best model and making predictions with it.
Storing each new input feature and model prediction.
Detecting data distribution drift and triggering model re-training.
Notification alerts about the training process.

Architecture Components

1. AWS RDS MySQL Database

The database directory is a local environment in which a developer can execute queries against the database. The SQL queries cover table creation, table population, and trigger and stored procedure creation. They also count the insertions of new labeled features into the feature table. When 100 new labeled features have been inserted, a stored procedure triggers the Lambda function that checks for data distribution shift, and the counter is reset to 0.
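The counting behaviour described above can be illustrated with a small Python simulation. This is only a sketch of the logic, not the stored procedure itself: the class name `LabeledInsertCounter` and the `on_threshold` callback are hypothetical stand-ins for the SQL counter and the Lambda invocation.

```python
class LabeledInsertCounter:
    """Simulates the insert counter kept by the database (hypothetical helper)."""

    def __init__(self, on_threshold, threshold=100):
        self.on_threshold = on_threshold  # stand-in for invoking the drift-check Lambda
        self.threshold = threshold        # 100 new labeled features, as in the README
        self.count = 0

    def record_insert(self):
        """Register one new labeled feature; return True if the check was triggered."""
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0                # reset the counter back to 0
            self.on_threshold()
            return True
        return False


if __name__ == "__main__":
    triggered = []
    counter = LabeledInsertCounter(on_threshold=lambda: triggered.append("drift-check"))
    for _ in range(250):
        counter.record_insert()
    print(len(triggered), counter.count)
```

In this simulation, 250 inserts fire the callback twice and leave 50 inserts counted towards the next check.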

2. AWS Lambda Functions

I have deployed three Dockerized Python Lambda functions. I chose Docker containers so that I can easily build the environment and install the necessary dependencies.

Data distribution shift function - runs an algorithm that detects whether the distribution of data points across categories has shifted. A shift means that the real-world data distribution differs from the distribution the model approximated, so the model needs to be re-trained with the newly stored data.
Train function - loads the features from the feature table, trains the model and logs the training metrics to the Weights & Biases platform.
Predict function - takes user input from the body of an HTTP POST request, loads the champion model, returns a prediction to the user, transforms the input data and label, and stores them as a new labeled feature in the feature table.
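The drift detection algorithm is still listed as a TODO, so the following is only one possible way to compare per-category distributions, not the project's actual method. It is a minimal sketch that assumes the total variation distance between category proportions as the shift measure; the function names and the 0.1 threshold are hypothetical.

```python
from collections import Counter


def category_proportions(values):
    """Return the share of each category in a non-empty list of labels."""
    if not values:
        raise ValueError("cannot compute proportions of an empty sample")
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.items()}


def total_variation_distance(reference, current):
    """Half the sum of absolute differences between category proportions."""
    p = category_proportions(reference)
    q = category_proportions(current)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def has_drifted(reference, current, threshold=0.1):
    """Flag drift when the distance exceeds an assumed, tunable threshold."""
    return total_variation_distance(reference, current) > threshold


if __name__ == "__main__":
    training_labels = ["a"] * 50 + ["b"] * 50   # distribution seen at training time
    recent_labels = ["a"] * 80 + ["b"] * 20     # distribution of newly stored data
    print(has_drifted(training_labels, recent_labels))
```

A distance of 0 means the two samples have identical category shares, and 1 means they share no categories. A statistical test such as chi-square would be a reasonable alternative when sample sizes are small.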

3. AWS S3 Bucket

The bucket serves as model storage: the train Lambda function saves the model to it, and the predict Lambda function loads the model from it.

4. Weights & Biases Model Tracking

Tracks model metadata, model storage paths, the training process and evaluation metrics. It is used to select the model that showed the best performance metrics during evaluation.
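Champion selection can be sketched in plain Python, independently of the tracking platform. The record layout below (a `model_path` plus a `metrics` dictionary), the metric name `f1` and the example paths are assumptions made for illustration; the project may store and compare runs differently.

```python
def select_champion(runs, metric="f1", higher_is_better=True):
    """Pick the run with the best value of `metric` (hypothetical record layout)."""
    scored = [run for run in runs if metric in run["metrics"]]
    if not scored:
        raise ValueError(f"no run reports the metric {metric!r}")

    def score(run):
        return run["metrics"][metric]

    return max(scored, key=score) if higher_is_better else min(scored, key=score)


if __name__ == "__main__":
    # Example run records; the paths are placeholders, not real bucket locations.
    runs = [
        {"model_path": "models/v1.pkl", "metrics": {"f1": 0.81, "loss": 0.42}},
        {"model_path": "models/v2.pkl", "metrics": {"f1": 0.86, "loss": 0.39}},
        {"model_path": "models/v3.pkl", "metrics": {"loss": 0.35}},
    ]
    print(select_champion(runs)["model_path"])
    print(select_champion(runs, metric="loss", higher_is_better=False)["model_path"])
```

Runs that do not report the chosen metric are skipped rather than treated as zero, so an incomplete run cannot win or lose by accident.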

Deployment

Deploy dockerized 🐳 Lambda functions for prediction and training:

npm install -g aws-cdk

cdk bootstrap --region [REGION]

cdk deploy

License

This project is licensed under the MIT License - see the LICENSE file for details.
