This demo showcases how to build, manage, and deploy machine learning models using AWS SageMaker and MLRun. It emphasizes the automation of ML workflows from development to production.
This demo is based on the SageMaker Payment Classification use case from the Amazon SageMaker examples repository (https://github.com/aws/amazon-sagemaker-examples/blob/main/use-cases/financial_payment_classification/financial_payment_classification.ipynb).
-
AWS SageMaker: A comprehensive service that enables developers and data scientists to build, train, and deploy machine learning (ML) models efficiently.
-
MLRun: An open-source MLOps framework designed to manage and automate your machine learning and data science lifecycle. In this demo, it is used to automate ML deployment and workflows.
-
Prerequisites: Ensure you have an AWS account with SageMaker enabled and MLRun installed in your environment.
-
Clone the repository: Clone this repository to your SageMaker notebook environment.
-
Set the environment variables in mlrun.env: Copy the mlrun.env file to your workspace and fill in the necessary environment variables, such as AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION, SAGEMAKER_ROLE, MLRUN_DBPATH, V3IO_USERNAME, and V3IO_ACCESS_KEY.
-
Run the Jupyter notebook: Open and run the financial-payment-pipeline.ipynb notebook. This notebook contains the code for the financial payment classification pipeline.
-
Monitor your runs: Track your runs in the MLRun dashboard. The dashboard provides a graphical interface for tracking your MLRun projects, functions, runs, and artifacts.
You can also open financial-payment-classification.ipynb to review the SageMaker code and the MLRun code segments cell-by-cell. This notebook does not include the automated workflow, but rather the individual steps.
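Before running either notebook, it can help to confirm that mlrun.env defines every variable named in the setup steps. The following is a minimal sketch using only the Python standard library. It assumes the file uses plain KEY=VALUE lines, which is an assumption about the file format rather than something this README specifies, and the helper names (parse_env_text, missing_vars, load_env_file) are hypothetical and not part of MLRun or SageMaker.

```python
import os
from pathlib import Path

# Variable names listed in the setup steps of this README.
REQUIRED_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "SAGEMAKER_ROLE",
    "MLRUN_DBPATH",
    "V3IO_USERNAME",
    "V3IO_ACCESS_KEY",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values, required=REQUIRED_VARS):
    """Return the required names that are absent or set to an empty value."""
    return [name for name in required if not values.get(name)]


def load_env_file(path="mlrun.env"):
    """Read the env file, export its values to os.environ, and report gaps."""
    values = parse_env_text(Path(path).read_text())
    os.environ.update(values)
    return missing_vars(values)
```

Calling load_env_file("mlrun.env") from the directory that holds the file returns a list of the names that still need a value; an empty list means every variable above is set.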
This demo also includes a GitHub Actions workflow that automates the execution of the machine learning pipeline. To set this up:
-
Fork this repository: Create a copy of this repository in your own GitHub account by forking it.
-
Add Secrets to Your Repository: Navigate to the "Settings" tab in your GitHub repository, then click on "Secrets". Here, you need to add the following secrets, which will be used as environment variables in your workflow:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
SAGEMAKER_ROLE
MLRUN_DBPATH
V3IO_ACCESS_KEY
Additionally, set the V3IO_USERNAME environment variable to your username.
-
Commit and Push Your Changes: Make any necessary changes to the code, then commit and push these changes to your repository.
-
Create a Pull Request: Create a pull request to either the staging or main branch. Once the pull request is merged, it triggers the GitHub Actions workflow. You can review the pipeline execution in the MLRun UI; the workflow steps include a link to it.
You can also run the workflow manually by navigating to the "Actions" tab in your repository and clicking on the workflow.
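As an alternative to the Actions tab, a manual run can usually be requested through GitHub's REST API "create a workflow dispatch event" endpoint. The sketch below only builds the request and makes several assumptions: that the workflow declares a workflow_dispatch trigger (suggested by the manual-run option, but not confirmed here), that you have a GitHub token allowed to run Actions in your fork, and that you know the workflow file name. The function name build_dispatch_request and all argument values are hypothetical.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, ref, token):
    """Build (but do not send) a workflow_dispatch request for GitHub's REST API.

    workflow_file is the file name of the workflow under .github/workflows/;
    the name to use depends on your fork and is not specified in this README.
    """
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # The API expects the branch or tag to run against in the "ref" field.
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request; GitHub answers with an empty 204 response when the dispatch is accepted. Keep the token out of source control, for example by reading it from an environment variable.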
This demo is licensed under the Apache 2.0 License. For more details, please take a look at the LICENSE file.