Lab Overview

Automated machine learning picks an algorithm and hyperparameters for you and generates a model ready for deployment. There are several options that you can use to configure automated machine learning experiments.

Configuring and running an automated machine learning experiment involves the following steps:

  • Select your experiment type: Classification, Regression or Time Series Forecasting
  • Choose the data source and format, and fetch the data
  • Choose your compute target
  • Automated machine learning experiment settings
  • Run an automated machine learning experiment
  • Explore model metrics
  • Register and deploy model

You can create and run automated machine learning experiments in code using the Azure ML Python SDK. If you prefer a no-code experience, you can instead create them in Azure Machine Learning studio.

In this lab, you learn how to create, run, and explore automated machine learning experiments in Azure Machine Learning studio without writing a single line of code. We will use the Flight Delays dataset, enriched with weather data. Based on this enriched dataset, we will use automated machine learning to find the best-performing classification model for predicting whether a particular flight will be delayed by 15 minutes or more.

Exercise 1: Register Dataset with Azure Machine Learning studio

Task 1: Upload Dataset

  1. In the Azure portal, open the available machine learning workspace.

  2. Select Launch now under the Try the new Azure Machine Learning studio message.

    Launch Azure Machine Learning studio.

  3. When you first launch the studio, you may need to set the directory and subscription. If so, you will see this screen:

    Launch Azure Machine Learning studio.

    For the directory, select Udacity and for the subscription, select Azure Sponsorship. For the machine learning workspace, you may see multiple options listed. Select any of these (it doesn't matter which) and then click Get started.

  4. From the studio, select Datasets, + Create dataset, From web files. This will open the Create dataset from web files dialog on the right.

    Image highlights the steps to open the create dataset from web files dialog.

  5. In the Web URL field provide the following URL for the training data file:

    https://introtomlsampledata.blob.core.windows.net/data/flightdelays/flightdelays.csv
    
  6. Provide flightdelays-automl as the Name, leave the remaining values at their defaults and select Next.

    Upload flightdelays.csv from a URL.

Task 2: Preview Dataset

  1. On the Settings and preview panel, set the Column headers drop-down to All files have same headers.

  2. Review the dataset and then select Next.

    Scroll right to review dataset.

Task 3: Select Columns

  1. Select the columns from the dataset to include in your training data. Exclude the following columns: Path, Month, Year, Timezone, Year_R, and Timezone_R. Then select Next.

    Select columns from the dataset to include as part of your training data.
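
The studio performs this column selection for you, so no code is needed for the lab. For readers who want to see what the step amounts to, here is a minimal sketch using only the Python standard library. The sample rows and the `SomeFeature` column are made up for illustration, and `drop_excluded` is a hypothetical helper, not part of any Azure API.

```python
import csv
import io

# Columns the lab asks you to exclude from the training data.
EXCLUDED_COLUMNS = {"Path", "Month", "Year", "Timezone", "Year_R", "Timezone_R"}


def drop_excluded(rows, excluded=EXCLUDED_COLUMNS):
    """Hypothetical helper: return copies of the rows without the excluded columns."""
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in rows
    ]


# Made-up sample data; "SomeFeature" is a placeholder, not a real column name.
sample_csv = (
    "Year,Month,Timezone,SomeFeature,ArrDel15\n"
    "2013,4,-5,12.5,0\n"
    "2013,4,-6,3.0,1\n"
)

rows = list(csv.DictReader(io.StringIO(sample_csv)))
training_rows = drop_excluded(rows)
remaining_columns = list(training_rows[0].keys())
```

Excluded names that do not occur in the data (such as `Path` in this sample) are simply ignored.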

Task 4: Create Dataset

  1. Confirm the dataset details and select Create.

    Confirm the details of the dataset you uploaded and then select Create.

Exercise 2: Set Up New Automated Machine Learning Experiment

Task 1: Create New Automated Machine Learning Experiment

  1. From the studio home, select Create new, Automated ML run.

    Create new Automated ML run from Azure Machine Learning studio.

  2. This opens the Create a new automated machine learning experiment page.

Task 2: Select Training Data

  1. Select the dataset flightdelays-automl and then select Next.

    Select the dataset flightdelays-automl and then select Next.

Task 3: Create a new Automated ML run

  1. Provide an experiment name: flight-delay

  2. Select target column: ArrDel15

  3. Select compute target: select the available compute

  4. Select Next.

    Configure a new Automated ML run.

Task 4: Set Up Task Type and Settings

  1. Select task type: Classification, and then select View additional configuration settings.

    Select task type, classification.

  2. This will open the Additional configurations dialog.

  3. Provide the following information and then select Save:

    1. Primary metric: AUC weighted
    2. Exit criteria, Training job time (hours): 1
    3. Exit criteria, Metric score threshold: 0.7

    Set up additional configurations.

    Note that we are setting a metric score threshold to limit the training time. In practice, for initial experiments, you will typically only set the training job time to allow AutoML to discover the best algorithm to use for your specific data.
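
To make the two exit criteria concrete, the sketch below writes the dialog settings as a plain dictionary and shows how a run could stop as soon as either criterion is met. It is an illustration only and is not needed for the lab. The key names are an assumption: they follow what we understand to be the `AutoMLConfig` keyword arguments in the Azure ML Python SDK (v1), so check them against the SDK version you use. `should_stop` is a hypothetical helper, and the candidate scores are made up.

```python
# Settings from the Additional configurations dialog, as a plain dict.
# ASSUMPTION: key names mirror AutoMLConfig keyword arguments (SDK v1).
automl_settings = {
    "task": "classification",
    "primary_metric": "AUC_weighted",
    "label_column_name": "ArrDel15",
    "experiment_timeout_hours": 1,   # Exit criteria: training job time (hours)
    "experiment_exit_score": 0.7,    # Exit criteria: metric score threshold
}


def should_stop(elapsed_hours, best_score, settings=automl_settings):
    """Hypothetical helper: True once either exit criterion is met."""
    timed_out = elapsed_hours >= settings["experiment_timeout_hours"]
    good_enough = (
        best_score is not None
        and best_score >= settings["experiment_exit_score"]
    )
    return timed_out or good_enough


# Made-up AUC weighted scores for three candidate models, in training order.
candidate_scores = [0.72, 0.81, 0.78]

evaluated = []
for score in candidate_scores:
    evaluated.append(score)
    if should_stop(elapsed_hours=0.1, best_score=max(evaluated)):
        break  # the first model already clears the 0.7 threshold
```

With these illustrative scores, the loop stops after the first candidate because it already exceeds the threshold, so a low threshold can end a run after very few models.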

Exercise 3: Start and Monitor Experiment

Task 1: Start Experiment

  1. Select Finish to start running the experiment.

    Select Finish to start running the experiment.

Task 2: Monitor Experiment

  1. The experiment will run for about 30 minutes. Most of that time is spent in the data preparation step; once data preparation is done, the experiment takes an additional 1-2 minutes to complete.

  2. In the Details tab, observe the run status of the job.

    Run Details tab showing run status.

  3. Select the Models tab and observe the various algorithms that AutoML is evaluating. You can also observe the corresponding AUC weighted score for each algorithm.

    Models tab showing model performance metric.

    Note that we have set a metric score threshold to limit the training time. As a result, you might see only one algorithm in your models list.

  4. Select Details and wait until the run status becomes Completed.

    Run Details tab showing run status.

  5. While you wait for the model training to complete, you can learn to view and understand the charts and metrics for your automated machine learning run by selecting Understand automated machine learning results.
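
The AUC weighted score listed on the Models tab can be read as a per-class AUC averaged with weights equal to the number of true examples in each class. The following sketch implements that common definition in plain Python; it is an assumption that the studio computes the metric in exactly this way, and the labels and probabilities are made up. `binary_auc` and `auc_weighted` are hypothetical helper names.

```python
def binary_auc(labels, scores):
    """AUC as the chance that a random positive scores above a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def auc_weighted(y_true, class_probs):
    """One-vs-rest AUC per class, averaged with weights equal to class counts."""
    total = 0.0
    for cls in sorted(set(y_true)):
        one_vs_rest = [1 if y == cls else 0 for y in y_true]
        scores = [probs[cls] for probs in class_probs]
        total += sum(one_vs_rest) * binary_auc(one_vs_rest, scores)
    return total / len(y_true)


# Made-up labels (1 = delayed 15 minutes or more) and predicted delay probabilities.
y_true = [0, 0, 1, 1, 0, 1]
p_delay = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
class_probs = [{0: 1.0 - p, 1: p} for p in p_delay]

score = auc_weighted(y_true, class_probs)
```

For a two-class problem such as this one, both per-class AUC values coincide, so the weighted average equals the ordinary binary AUC.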

Exercise 4: Review Best Model's Performance

Task 1: Review Best Model Performance

  1. The Details tab shows the Best model summary. Next, select Algorithm name to review the model details.

    Run Details tab showing recommended model.

  2. On the Model details tab, select View all other metrics to see the metrics used to evaluate the best model's performance.

    Model Details tab showing model summary.

  3. Review the model performance metrics and then select Close.

    Model performance metrics.

  4. Next, select Metrics to review the various model performance curves, such as Precision-Recall, ROC, Calibration curve, Gain & Lift curves, and Confusion matrix.

    Model Visualizations tab showing model performance curves.
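
As a rough guide to reading the confusion matrix and the Precision-Recall chart, the sketch below counts the four confusion matrix cells for a binary classifier and derives precision and recall from them. The labels and predictions are made up, and `confusion_counts` and `precision_recall` are hypothetical helpers rather than anything the studio exposes.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = delayed)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["tp"] += 1
        elif actual == 0 and predicted == 1:
            counts["fp"] += 1
        elif actual == 0 and predicted == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def precision_recall(counts):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn); 0.0 if undefined."""
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Made-up ground truth and predictions for eight flights.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(counts)
```

Here precision is the share of flights predicted as delayed that really were delayed, and recall is the share of delayed flights that the model caught.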

Next Steps

Congratulations! You have trained and evaluated your first automated machine learning model. You can continue to experiment in the environment, or close the lab environment tab and return to the Udacity portal to continue with the lesson.