Mention Feature Store, Model Registry, AutoGluon #30

Merged 3 commits into main on Nov 23, 2022
Conversation

athewsey (Contributor)

Issue #, if available: N/A - but hope to tackle #26

Description of changes:

As the MLOps maturity of the average business grows, more users are keen to be introduced to advanced features like SageMaker Feature Store and SageMaker Model Registry even in their first steps with SageMaker.

Built-in algorithms have also moved on, with AutoGluon-Tabular ensembling showing compelling accuracy gains vs XGBoost+HPO on benchmark datasets.

This PR attempts to incorporate these features into our initial built-in algo exercise.

Known limitations:

  • This increases the complexity of the first hands-on, gearing it more towards demonstrating the "art of the possible" and less towards tightly focused enablement on a specific topic. It's not yet clear whether this would significantly affect the timing of the workshop.
  • AutoGluon model deployment requires re-packing the trained model.tar.gz to add the inference.py script, which unfortunately introduces an extra ~5 minute wait on both the .deploy() and .register() calls (see the first sketch after this list).
  • We need to wait for data to propagate from the online Feature Store to the offline store, which can in theory take up to 15 minutes but in practice is often ~5 minutes. Currently this is handled by exploring the Feature Store and building the Autopilot model (from raw CSV) during that period (see the second sketch after this list).
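The two waits called out above roughly map to the following SageMaker Python SDK patterns. These are minimal sketches rather than the notebook's actual code: the image version, S3 paths, source directory, model package group, and feature group name are all placeholders.

```python
# Sketch 1: deploying/registering the trained AutoGluon model with a custom
# inference.py. Passing entry_point/source_dir to the Model makes the SDK
# re-pack model.tar.gz to bundle the script, which is the ~5 minute wait
# incurred by both .deploy() and .register().
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()

image_uri = sagemaker.image_uris.retrieve(
    framework="autogluon",
    region=session.boto_region_name,
    version="0.5",                    # placeholder: use the version the notebook pins
    image_scope="inference",
    instance_type="ml.m5.xlarge",
)

model = Model(
    image_uri=image_uri,
    model_data="s3://<bucket>/<training-job>/output/model.tar.gz",  # placeholder
    role=sagemaker.get_execution_role(),
    entry_point="inference.py",       # custom handler: this triggers the re-pack
    source_dir="src",                 # placeholder folder containing inference.py
    sagemaker_session=session,
)

# Both calls re-pack model.tar.gz before creating the SageMaker Model,
# hence the duplicated wait noted in the limitations:
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
model_package = model.register(
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.xlarge"],
    transform_instances=["ml.m5.xlarge"],
    model_package_group_name="<your-model-package-group>",  # placeholder
)
```

```python
# Sketch 2: polling the offline Feature Store (via Athena) until records
# ingested into the online store have propagated (often ~5 min, up to ~15 min).
import time

import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
feature_group = FeatureGroup(
    name="<your-feature-group>",      # placeholder
    sagemaker_session=session,
)
query = feature_group.athena_query()

while True:
    query.run(
        query_string=f'SELECT COUNT(*) AS n FROM "{query.table_name}"',
        output_location="s3://<bucket>/athena-query-results/",  # placeholder
    )
    query.wait()
    if int(query.as_dataframe()["n"][0]) > 0:
        break
    time.sleep(60)
```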



By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Commits

  • Overhaul built-in algo session to include brief demos of SageMaker Feature Store and SageMaker Model Registry, and focus on AutoGluon / AutoPilot over XGBoost + HPO.
  • Return to Autopilot -> XGBoost -> HPO as the primary flow for exercise 1, with an optional extra notebook covering AutoGluon. Although AG delivers high accuracy, it's too much to try to cover both algos in the time available, and we need to stick to XGB for HPO to make sense.
  • Now that the XGBoost workshop notebooks use DataSci 3, update the deployment template to pre-warm v3 and v2 instead of v2 and v1.
athewsey marked this pull request as ready for review on November 23, 2022 at 09:36.
athewsey merged commit f35d8f1 into main on Nov 23, 2022.