This repository has been archived by the owner on Dec 19, 2022. It is now read-only.

Navigating To Machine Learning Environment

Daniel I Varzari edited this page Dec 6, 2022 · 2 revisions

Background

The beautiful thing about Azure is that no installation is required for the machine learning environment. All the data is either stored directly in the Data Lake, stored as Spark tables in the Databricks table tree menu, or accessible through SQL commands over a connection to our PostgreSQL server.
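Those three access paths can be sketched from a Databricks notebook as below. This is illustrative only: `spark` is the session Databricks provides automatically, and the Data Lake path, table name, server host, and credentials are placeholders, not our project's real names.

```python
# Sketch: the three ways data is reachable from a Databricks notebook.

def postgres_jdbc_url(host: str, database: str, port: int = 5432) -> str:
    """Build the JDBC URL for a PostgreSQL connection."""
    return f"jdbc:postgresql://{host}:{port}/{database}"

def load_sources(spark):
    # 1) Files stored directly in the Data Lake (path is a placeholder)
    lake_df = spark.read.format("parquet").load(
        "abfss://container@account.dfs.core.windows.net/data/"
    )

    # 2) A Spark table registered in the Databricks table tree menu
    table_df = spark.sql("SELECT * FROM default.example_table")

    # 3) The PostgreSQL server, queried over JDBC
    pg_df = (
        spark.read.format("jdbc")
        .option("url", postgres_jdbc_url("example-host", "example_db"))
        .option("dbtable", "public.example_table")
        .option("user", "reader")
        .option("password", "***")  # use a secret scope in practice
        .load()
    )
    return lake_df, table_df, pg_df
```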

First Steps

  1. Launch Databricks from the resource group
  2. Navigate to the repo where you wish to develop
  3. Before running a notebook, attach a compute resource
  4. You can now run cells and expect output
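A minimal first cell for step 4 might look like the sketch below, just to confirm the attached compute responds. `spark` is provided by the Databricks notebook runtime; the query itself is arbitrary.

```python
# Sketch: a trivial first cell to confirm the attached compute works.

def smoke_query() -> str:
    """Any output at all from this query confirms the cluster responds."""
    return "SELECT 1 AS ok"

def run_smoke_test(spark) -> int:
    # Returns 1 when the attached compute executes the query successfully.
    return spark.sql(smoke_query()).collect()[0]["ok"]
```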

Demonstration

Machine Learning Environment

Setting up a Compute Resource

  1. Have Databricks launched
  2. Navigate to "Compute" in the tree menu
  3. Click "Create compute" or configure an existing resource

Tips

  • Multi-node clusters are highly preferred
  • Autoscaling is what allows our compute resources to handle the massive parameter counts of our models
  • Workers let the cluster fall back on other compute resources if one gets overloaded; 2-8 workers are sufficient for most applications
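The tips above correspond roughly to the cluster settings below, written in the shape used by the Databricks Clusters API. The cluster name, runtime version, and node type are example values, not our actual configuration.

```python
# Sketch: a multi-node, autoscaling cluster spec matching the tips above.

def cluster_spec(min_workers: int = 2, max_workers: int = 8) -> dict:
    return {
        "cluster_name": "ml-compute",                 # placeholder name
        "spark_version": "11.3.x-cpu-ml-scala2.12",   # example ML runtime
        "node_type_id": "Standard_DS3_v2",            # example Azure node type
        # Autoscaling: the cluster grows toward max_workers under load
        # and shrinks back toward min_workers when idle.
        "autoscale": {
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
    }
```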

Demonstration

Compute Resource

MLflow - accessing Model Histories and Reports

  1. Have Databricks launched
  2. Navigate to Machine Learning in the tree menu
  3. View models (from here you can open Details or Serving)
  4. View model histories

Tips

  • Go to Details for information about the endpoint's inputs and outputs
  • Go to Serving for the URL used to connect to the model
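Once you have the URL from the Serving tab, a scoring call from outside Databricks can be sketched as below. The URL, token, and feature names are placeholders, and the `dataframe_records` payload shape is an assumption based on the standard Databricks serving request format.

```python
import json
import urllib.request

# Sketch: calling a served model endpoint. All identifiers are placeholders.

def build_payload(rows: list[dict]) -> bytes:
    # Databricks model serving accepts a "dataframe_records" payload:
    # one dict of feature-name -> value per row to score.
    return json.dumps({"dataframe_records": rows}).encode("utf-8")

def score(url: str, token: str, rows: list[dict]) -> dict:
    req = urllib.request.Request(
        url,
        data=build_payload(rows),
        headers={
            "Authorization": f"Bearer {token}",  # personal access token
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```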

Demonstration

Models

View Model Experiments

  1. Have Databricks launched
  2. Go to the source of the model / navigate to a model notebook
  3. Click the flask icon
  4. View metrics and runtime details
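The same metrics the flask-icon view shows can also be pulled programmatically, as sketched below. The experiment name is a placeholder; `fetch_runs` assumes the MLflow client available in a Databricks notebook, while `best_run` is plain Python over (run name, metrics) pairs.

```python
# Sketch: retrieving run metrics, mirroring the experiment view in the UI.

def best_run(runs: list[tuple[str, dict]], metric: str, maximize: bool = True):
    """Return the (name, metrics) pair with the best value for `metric`."""
    scored = [r for r in runs if metric in r[1]]
    return (max if maximize else min)(scored, key=lambda r: r[1][metric])

def fetch_runs(experiment_name: str) -> list[tuple[str, dict]]:
    # Requires mlflow (preinstalled on Databricks ML runtimes); imported
    # lazily so best_run stays usable anywhere.
    import mlflow

    exp = mlflow.get_experiment_by_name(experiment_name)
    runs = mlflow.search_runs(
        experiment_ids=[exp.experiment_id], output_format="list"
    )
    return [(r.info.run_name, r.data.metrics) for r in runs]
```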

Tips

  • From these views you can reproduce model results
  • As long as we are logging, we do not have to worry about losing results when re-running the model with new parameters
  • Logging with scikit-learn is done through autologging

Demonstration

Experiments