In previous lessons, we spent a lot of time on training a machine learning model, a multi-step process involving data preparation, feature engineering, training, evaluation, and model selection. Model training can be very compute-intensive, taking hours, days, or even weeks depending on the amount of data, the type of algorithm used, and other factors. A trained model, on the other hand, is used to make decisions on new data quickly. In other words, it infers things about new data based on its training. Making these decisions on new data on demand is called real-time inferencing.
In this lab, you learn how to deploy a trained model as a web service hosted on an Azure Kubernetes Service (AKS) cluster. This deployment is what enables you to use your model for real-time inferencing.
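To make the idea of "a trained model behind a web service" concrete, here is a minimal local sketch using only the Python standard library. It is not how Azure Machine Learning or AKS hosts a model: the toy data, the `/score` path, the `{"x": ...}` request shape, and all function names are assumptions made up for illustration. The point is only that training happens once, while each incoming request triggers a quick prediction.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def fit_line(xs, ys):
    """Training step: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Made-up toy data lying on y = 2x + 1; it exists only so there is something to fit.
SLOPE, INTERCEPT = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])


def predict(x):
    """Inference step: apply the already-trained parameters to a new input."""
    return SLOPE * x + INTERCEPT


class ScoringHandler(BaseHTTPRequestHandler):
    """Answers POST requests whose JSON body looks like {"x": 5.0}."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["x"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def start_server():
    """Start the toy service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), ScoringHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_service(port, x):
    """Send one input to the toy service and return the parsed JSON reply."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request(
            "POST",
            "/score",
            body=json.dumps({"x": x}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(connection.getresponse().read())
    finally:
        connection.close()
```

To try it, call `start_server()`, pass the port from `server.server_address[1]` to `call_service`, and call `server.shutdown()` when done. The service you deploy in this lab plays the same role, with Azure handling the hosting, scaling, and authentication.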
The Azure Machine Learning designer simplifies the process by enabling you to train and deploy your model without writing any code.
-
In the Azure portal, open the available machine learning workspace.
-
Select Launch now under the Try the new Azure Machine Learning studio message.
-
When you first launch the studio, you may need to set the directory and subscription. If so, you will be prompted for them.
For the directory, select Udacity and for the subscription, select Azure Sponsorship. For the machine learning workspace, you may see multiple options listed. Select any of these (it doesn't matter which) and then click Get started.
-
From the studio, select Designer in the left-hand menu. Next, select Sample 1: Regression - Automobile Price Prediction (Basic) under the New pipeline section. This will open a visual pipeline authoring editor.
-
In the settings panel on the right, select Select compute target.
-
In the Set up compute target editor, select the existing compute target, then select Save.
Note: If you are facing difficulties in accessing pop-up windows or buttons in the user interface, please refer to the Help section in the lab environment.
-
Select Submit to open the Set up pipeline run editor. Note that this button was renamed in the UI from Run to Submit.
-
In the Set up pipeline run editor, select Experiment, then Create new, provide the New experiment name: designer-run, and then select Submit.
-
Wait for the pipeline run to complete. It will take around 10 minutes to complete the run.
-
Select Create inference pipeline, then select Real-time inference pipeline from the list to create a new inference pipeline.
-
Select Submit to open the Set up pipeline run editor. Note that this button was renamed in the UI from Run to Submit.
-
In the Set up pipeline run editor, select Select existing, then select the experiment you created in an earlier step: designer-run. Select Submit to start the pipeline.
-
Wait for the pipeline run to complete. It will take around 7 minutes to complete the run.
-
After the inference pipeline run is finished, select Deploy to open the Set up real-time endpoint editor.
-
In the Set up real-time endpoint editor, select your existing compute target, then select Deploy.
-
Wait for the deployment to complete. The status of the deployment can be observed above the Pipeline Authoring Editor.
-
To view the deployed web service, select the Endpoints section in your Azure Portal Workspace.
-
Select the deployed web service: sample-1-regression---automobile to open the deployment details page.
Note: You have to select the text of the service name to open the deployment details page.
-
Select the Consume tab to observe the following information:
-
Basic consumption info displays the REST endpoint, Primary key, and Secondary key.
-
Consumption option shows code samples in C#, Python, and R that call the endpoint to consume the web service.
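As a rough idea of what such a client does with the REST endpoint and Primary key, the sketch below builds a scoring request with Python's standard library. Treat every specific as an assumption: the URL and key are placeholders, the `Inputs` / `WebServiceInput0` / `GlobalParameters` payload shape and the Bearer-token header reflect what generated samples commonly use but may differ for your deployment, and the `make` and `horsepower` columns are illustrative rather than the full input schema. The sample shown on the Consume tab is the authoritative version.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, api_key, rows):
    """Build (but do not send) a POST request for a key-protected REST endpoint."""
    # Assumed payload shape; copy the real one from the Consume tab's sample.
    payload = {"Inputs": {"WebServiceInput0": rows}, "GlobalParameters": {}}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is passed as a Bearer token.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def score(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholders: substitute the REST endpoint and Primary key from the Consume tab.
request = build_scoring_request(
    "http://example.invalid/score",
    "<primary-key>",
    [{"make": "toyota", "horsepower": 95}],  # hypothetical columns, not the full schema
)
```

Building the request separately from sending it makes it easy to inspect the headers and body before calling `score(request)` against the real endpoint. If the Primary key is ever regenerated, the Secondary key can be substituted in the same header.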
-
Congratulations! You have just learned how to train and deploy a model to an Azure Kubernetes Service (AKS) cluster for real-time inferencing. You can now return to the Udacity portal to continue with the lesson.