Automated machine learning picks an algorithm and hyperparameters for you and generates a model ready for deployment. There are several options that you can use to configure automated machine learning experiments.
Configuration options available in automated machine learning:
- Select your experiment type: Classification, Regression or Time Series Forecasting
- Data source, formats, and fetch data
- Choose your compute target
- Automated machine learning experiment settings
- Run an automated machine learning experiment
- Explore model metrics
- Register and deploy model
You can create and run automated machine learning experiments in code using the Azure ML Python SDK. If you prefer a no-code experience, you can also create them in Azure Machine Learning Studio.
In this lab, you learn how to create, run, and explore automated machine learning experiments in Azure Machine Learning Studio without writing a single line of code. The lab uses the Flight Delays dataset, enriched with weather data. Based on this enriched dataset, we will use automated machine learning to find the best-performing classification model to predict whether a particular flight will be delayed by 15 minutes or more.
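To make the prediction target concrete: the label is binary, with 1 meaning the flight was delayed by 15 minutes or more. The following is a minimal sketch, assuming the label is derived from an arrival delay measured in minutes. The function name and the minutes input are illustrative assumptions; the dataset already ships the resulting ArrDel15 column.

```python
def arr_del15(arrival_delay_minutes: float) -> int:
    """Return 1 if the flight was delayed by 15 minutes or more, else 0."""
    return 1 if arrival_delay_minutes >= 15 else 0


# Made-up delay values in minutes (negative means the flight arrived early).
delays = [-3.0, 0.0, 14.9, 15.0, 42.0]
labels = [arr_del15(d) for d in delays]
```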
- In the Azure portal, open the available machine learning workspace.
- Select Launch now under the Try the new Azure Machine Learning studio message.
- When you first launch the studio, you may need to set the directory and subscription. If prompted, select Udacity for the directory and Azure Sponsorship for the subscription. For the machine learning workspace, you may see multiple options listed. Select any of these (it doesn't matter which) and then click Get started.
- From the studio, select Datasets, + Create dataset, From web files. This opens the Create dataset from web files dialog on the right.
- In the Web URL field, provide the following URL for the training data file: https://introtomlsampledata.blob.core.windows.net/data/flightdelays/flightdelays.csv
- Provide flightdelays-automl as the Name, leave the remaining values at their defaults, and select Next.
- On the Settings and preview panel, set the column headers drop-down to All files have same headers.
- Review the dataset and then select Next.
- Select the columns from the dataset to include as part of your training data. Exclude the following columns: Path, Month, Year, Timezone, Year_R, Timezone_R, and then select Next.
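For readers curious what the column exclusion amounts to outside the studio, here is a minimal sketch using only the Python standard library. The excluded column names come from the step above. The sample rows and the Carrier column are made up for illustration and are not taken from the real file.

```python
import csv
import io

# Columns the lab excludes from the training data.
EXCLUDED_COLUMNS = {"Path", "Month", "Year", "Timezone", "Year_R", "Timezone_R"}


def drop_columns(csv_text: str, excluded: set[str]) -> list[dict[str, str]]:
    """Parse CSV text and return its rows without the excluded columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in reader
    ]


# Tiny made-up sample; only the excluded names and ArrDel15 appear in the lab text.
sample = (
    "Path,Month,Year,Carrier,ArrDel15\n"
    "flightdelays.csv,4,2013,DL,0\n"
    "flightdelays.csv,5,2013,AA,1\n"
)
rows = drop_columns(sample, EXCLUDED_COLUMNS)
```

Each remaining row keeps only the columns that would be passed on to training, here Carrier and ArrDel15.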
- From the studio home, select Create new, Automated ML run. This opens the Create a new automated machine learning experiment page.
- Provide an experiment name: flight-delay
- Select target column: ArrDel15
- Select compute target: select the available compute
- Select Next.
- Select task type: Classification, and then select View additional configuration settings. This opens the Additional configurations dialog.
- Provide the following information and then select Save:
  - Primary metric: AUC weighted
  - Exit criteria, Training job time (hours): 1
  - Exit criteria, Metric score threshold: 0.7
Note that we are setting a metric score threshold to limit the training time. In practice, for initial experiments, you would typically set only the training job time, which gives AutoML room to discover the best algorithm for your specific data.
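The lab itself needs no code, but the same settings can be expressed through the Azure ML Python SDK. The sketch below assumes SDK v1 and its AutoMLConfig class from the azureml-train-automl package. The two helper function names are hypothetical, and the parameter names should be checked against the SDK version you have installed.

```python
def studio_settings_as_sdk_kwargs() -> dict:
    """Map the Studio choices from this lab to AutoMLConfig keyword arguments."""
    return {
        "task": "classification",          # Task type: Classification
        "primary_metric": "AUC_weighted",  # Primary metric: AUC weighted
        "label_column_name": "ArrDel15",   # Target column
        "experiment_timeout_hours": 1,     # Exit criteria: training job time
        "experiment_exit_score": 0.7,      # Exit criteria: metric score threshold
    }


def build_automl_config(training_data, compute_target):
    """Build an AutoMLConfig (assumes the SDK v1 azureml-train-automl package)."""
    # Imported lazily so the settings above can be inspected without the SDK installed.
    from azureml.train.automl import AutoMLConfig

    return AutoMLConfig(
        training_data=training_data,    # e.g. the registered flightdelays-automl dataset
        compute_target=compute_target,  # the compute available in the workspace
        **studio_settings_as_sdk_kwargs(),
    )


settings = studio_settings_as_sdk_kwargs()
```

Keeping the settings in a plain dictionary makes it easy to compare them side by side with the values entered in the Additional configurations dialog.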
- The experiment runs for about 30 minutes. Most of that time is spent in the data preparation step. Once data preparation is done, the experiment takes an additional 1-2 minutes to complete.
- In the Details tab, observe the run status of the job.
- Select the Models tab and observe the various algorithms that AutoML is evaluating. You can also observe the corresponding AUC weighted score for each algorithm.
Note that we have set a metric score threshold to limit the training time. As a result, you might see only one algorithm in your models list.
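Why the threshold can leave a single model in the list is easier to see in code. The following is an illustrative sketch of the two exit criteria only, not a description of how AutoML schedules its runs internally. The algorithm names, scores, and training times are made up.

```python
from typing import Iterable, List, Tuple


def run_until_exit(
    candidates: Iterable[Tuple[str, float, float]],
    max_hours: float,
    score_threshold: float,
) -> List[Tuple[str, float]]:
    """Evaluate candidates in order and stop once an exit criterion is met.

    Each candidate is (algorithm_name, score, hours_spent_training).
    """
    evaluated: List[Tuple[str, float]] = []
    elapsed = 0.0
    for name, score, hours in candidates:
        if elapsed + hours > max_hours:
            break  # the training time budget would be exceeded
        elapsed += hours
        evaluated.append((name, score))
        if score >= score_threshold:
            break  # the metric score threshold has been reached
    return evaluated


# Made-up candidates: (name, AUC weighted score, training hours).
candidates = [
    ("AlgorithmA", 0.72, 0.05),
    ("AlgorithmB", 0.75, 0.05),
    ("AlgorithmC", 0.74, 0.05),
]
with_threshold = run_until_exit(candidates, max_hours=1, score_threshold=0.7)
time_only = run_until_exit(candidates, max_hours=1, score_threshold=float("inf"))
```

With the 0.7 threshold, the first candidate already satisfies the exit criterion, so only one model is evaluated. With the time limit alone, all three candidates fit in the budget and are compared.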
- Select Details and wait until the run status becomes Completed.
- While you wait for the model training to complete, you can learn to view and understand the charts and metrics for your automated machine learning run by selecting Understand automated machine learning results.
- The Details tab shows the Best model summary. Next, select Algorithm name to review the model details.
- From the Model details tab, select View all other metrics to see the various metrics for evaluating the best model's performance.
- Review the model performance metrics and then select Close.
- Next, select Metrics to review the various model performance curves, such as Precision-Recall, ROC, Calibration curve, Gain & Lift curves, and Confusion matrix.
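To connect the charts to numbers, here is a small sketch that computes a confusion matrix, precision, recall, and a plain binary ROC AUC by hand. The labels and scores are made up, the 0.5 decision threshold is an assumption, and the AUC shown is the simple binary form, which may differ from the weighted variant the studio reports.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = delayed) and predicted probabilities of delay.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
precision = tp / (tp + fp)  # share of predicted delays that were real delays
recall = tp / (tp + fn)     # share of real delays that were caught
auc = roc_auc(y_true, scores)
```

Precision and recall depend on the chosen threshold, while ROC AUC summarizes ranking quality across all thresholds, which is why the studio shows both curves and single-number metrics.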
Congratulations! You have trained and evaluated your first automated machine learning model. You can continue to experiment in the environment but are free to close the lab environment tab and return to the Udacity portal to continue with the lesson.