ML Notebook Enhancements #68

luisquintanilla · 2022-09-15T02:17:36Z

Add reference documentation links for classes such as transforms and trainers (e.g. LightGBM).
Include parameter names in method calls (e.g. mlContext.Data.TrainTestSplit(data, testFraction: 0.2)).
Use real data for examples. Real data makes the problem being solved easier to understand than randomly generated data does.
Watch for code comments. Instead of embedding them in the code, promote them to text in a Markdown cell.
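
As a rough illustration of the parameter-name suggestion, a notebook cell might look like the sketch below. The Row class and the ten placeholder rows are hypothetical and exist only so the cell runs on its own; per the suggestion above, a real notebook would load a real dataset instead.

```csharp
#r "nuget: Microsoft.ML"

using System.Linq;
using Microsoft.ML;

// Hypothetical row type, used only for this illustration.
public class Row
{
    public float X { get; set; }
    public float Y { get; set; }
}

var mlContext = new MLContext(seed: 1);

// Placeholder rows (an assumption for this sketch, not real data).
var rows = Enumerable.Range(0, 10).Select(i => new Row { X = i, Y = 2 * i });
var data = mlContext.Data.LoadFromEnumerable(rows);

// The named argument tells the reader that 0.2 is the fraction of rows held out for testing.
var trainTestSplit = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
```

Without the name, a reader has to look up the method signature to learn what the bare 0.2 controls.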

Put related code together. Break up cells containing large chunks of code and add Markdown cells explaining what each of the cells is doing.

Example

Original

var context =new MLContext(seed: 1);
var pipeline = context.Transforms.Concatenate("Features", "X")
  .Append(context.Auto().Regression("y", useLbfgs: false, useSdca: false, useFastForest: false));

var monitor = new NotebookMonitor();
var experiment = context.Auto().CreateExperiment();
experiment.SetPipeline(pipeline)
  .SetEvaluateMetric(RegressionMetric.RootMeanSquaredError, "y")
  .SetTrainingTimeInSeconds(30)
  .SetDataset(trainTestSplit.TrainSet, trainTestSplit.TestSet)
  .SetMonitor(monitor);

// Configure Visualizer			
monitor.SetUpdate(monitor.Display());

var res = await experiment.RunAsync();

Update

Initialize MLContext

MLContext is the starting point for all ML.NET applications.

var context = new MLContext(seed: 1);

Define training pipeline

Concatenate: Takes the input column X and creates a feature vector in the Features column.
Regression: Defines the task for which AutoML needs to find the best algorithm and hyperparameters. In this case, the Lbfgs, Sdca, and FastForest algorithms won't be explored because their respective parameters are set to false.

var pipeline = context.Transforms.Concatenate("Features", "X")
      .Append(context.Auto().Regression("y", useLbfgs: false, useSdca: false, useFastForest: false));

Initialize Monitor

The notebook monitor provides visualizations of the training progress as AutoML tries to find the best model for your data.

var monitor = new NotebookMonitor();

Initialize AutoML Experiment

An AutoML experiment is a collection of trials in which algorithms are explored.

var experiment = context.Auto().CreateExperiment();

Configure AutoML Experiment

The AutoML experiment searches for the best algorithm according to an evaluation metric. In this case, the evaluation metric is Root Mean Squared Error (RMSE), for which lower values are better. The goal is to find the model with the best metric value within the provided training time, which is set to 30 seconds. The longer you train, the more algorithms and hyperparameters AutoML is able to explore. AutoML trains each candidate model on the training set and uses the test set to calculate the evaluation metric, which shows how well that model performs.

experiment.SetPipeline(pipeline)
        .SetEvaluateMetric(RegressionMetric.RootMeanSquaredError, "y")
        .SetTrainingTimeInSeconds(30)
        .SetDataset(trainTestSplit.TrainSet, trainTestSplit.TestSet)
        .SetMonitor(monitor);

Set monitor to display

monitor.SetUpdate(monitor.Display());

Run AutoML experiment

var res = await experiment.RunAsync();

NotebookMonitor: Display evaluation metric for best trial, active trial, and y-axis on graph.
When adding feeds, add a link to documentation on how to reference them in VS / dotnet CLI.
When installing NuGet packages that are not part of the BCL, list them in a Markdown cell where the packages are installed, and add a link to NuGet (e.g. Microsoft.ML).
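
The feed and package suggestions might translate into an install cell like the sketch below. The feed URL is assumed to be the ML.NET daily-build feed, and the package list is an assumption based on the AutoML example above; a real notebook would list whatever it actually needs. The comments stand in for the text that would go in the preceding Markdown cell.

```csharp
// A Markdown cell just above this one would list the packages with links, for example:
//   Microsoft.ML        https://www.nuget.org/packages/Microsoft.ML
//   Microsoft.ML.AutoML https://www.nuget.org/packages/Microsoft.ML.AutoML
// It would also link to documentation on adding the same feed in VS / dotnet CLI.

// #i adds a package source (assumed feed URL); #r installs a package from the configured sources.
#i "nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json"
#r "nuget: Microsoft.ML"
#r "nuget: Microsoft.ML.AutoML"
```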
