diff --git a/docs/.gitignore b/docs/.gitignore new file mode 100644 index 000000000..9c463d282 --- /dev/null +++ b/docs/.gitignore @@ -0,0 +1 @@ +mdformatter/ \ No newline at end of file diff --git a/docs/Makefile b/docs/Makefile new file mode 100644 index 000000000..bba9a6735 --- /dev/null +++ b/docs/Makefile @@ -0,0 +1,17 @@ +.PHONY: docs +docs: setup format + +.PHONY: setup +setup: + @rm -rf mdformatter + @git clone https://github.com/caraml-dev/mdformatter.git + @pip install -r mdformatter/requirements.txt + +# The target below uses a non-existent doc overrides folder name to generate the final docs, +# as there are no overrides. +.PHONY: format +format: + @echo "Formatting maintainer docs ..." + @cd mdformatter && python -m mdformatter ../maintainer/templates ../maintainer/overrides ../maintainer/generated ../maintainer/values.json GITBOOK + @echo "Formatting user docs ..." + @cd mdformatter && python -m mdformatter ../user/templates ../user/overrides ../user/generated ../user/values.json GITBOOK diff --git a/docs/README.md b/docs/README.md index 066697c0b..57960c311 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,53 +1,19 @@ -# Merlin +# Docs -After you have built a model with high-quality training data and the perfect algorithm, it’s time to apply it to make predictions and serve the outcome for future decision making. -For many data scientists, model training can be done easily within their Jupyter notebook. However, things become trickier when it comes to productionizing the model to serve real traffic, which is engineering intensive. There are many tools available, but learning when and how to use them requires a lot of exploration, which can be a headache. +To learn about the basic concepts behind Merlin and how to use it, refer to the [User Docs](./user/generated). -## What is Merlin +To configure / deploy Merlin into a production cluster or troubleshoot an existing deployment, refer to the [Maintainer Docs](./maintainer). -Merlin is a platform designed to help users productionize their models quickly without deep knowledge on MLOps. Users only need to deploy their model into Merlin, and it will take care of the traffic routing and resources scaling in the background, saving lots of engineering hours and expertise required otherwise. +To understand the development process and the architecture, refer to the [Developer Docs](./developer). -## User Flow +## Contributing to the Docs -Productionizing a model with Merlin can be easily done in 3 steps, as detailed in the diagram below: +All docs are created for Gitbook. -![User flow](./diagrams/user_flow.drawio.svg) +Currently, the user docs and maintainer docs are templated using Jinja2. -1. **Deploy a model** +The templates can be found under `${folder}/templates` and the values for the templates reside in `${folder}/values.json`. To generate the final docs into `${folder}/generated`, run: - We want to make the deployment experience as seamless as possible, directly from Jupyter notebook. With the Merlin SDK, we can now upload the model and trigger the deployment pipeline, by simply calling a few functions in the notebook. Alternatively, Merlin UI supports the same, with just 1 click. - -2. **Setup serving endpoint** - - Once the model is deployed with an auto-generated HTTP endpoint, you can then specify the serving model version in the console. Give it a minute and your model will automagically be able to serve prediction. - -3. 
**Evaluate and iterate** - - The Merlin UI allows you to deploy and track different model versions and tag any version to run experiment easily. All model artifacts are synchronized into MLflow Tracking, which can be used to track and compare the model performance. - -## Key Concepts of Merlin - -The design of Merlin uses a few key concepts below, you should familiarize yourself with: - -**Project**: Project represents a namespace for a collection of model. For example, a project could be food Recommendations, driver allocation, ride pricing, etc. - -**Model**: Every model is associated with one (and only one) project and model endpoint. Model also can have zero or more model versions. In the entities' hierarchy of MLflow, a model corresponds to an MLflow experiment. - -**Model Version**: The model version represents an iteration within a model. A model version is associated with a run within MLflow. A Model Version can be deployed as a service, there can be multiple deployments of model version with different endpoint each. - -**Model Endpoint**: Every model has its own endpoint that contains routing rule(s) to an active model version endpoint (serving mode). This endpoint is usually used to serve traffics in production. The model version it is routed to changes in the background when a serving model version is changed. Hence there is no need to change the endpoint used to serve traffics when the serving model version is changed. - -**Model Version Endpoint**: A model version endpoint is a way to obtain model inference results in real-time, over the network (HTTP). This endpoint is unique to each model version. Model endpoint will route to the model version endpoint in the background, when the associated model version is set to serving. - -**Environment**: The environment’s name is a user-facing property that will be used to determine the target Kubernetes cluster where a model will be deployed to. The environment has two important properties, name and Kubernetes cluster. 
- -## Getting Started - -To start learning about using Merlin, check out: -{% page-ref page="../user/basics.md" %} - -To connect to an existing Merlin deployment, check out: -{% page-ref page="../user/connecting-to-merlin/README.md" %} - -To start deploying Merlin, check out: -{% page-ref page="../developer/deploying-merlin/README.md" %} +```sh +make docs +``` diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md deleted file mode 100644 index 5ecbae4f6..000000000 --- a/docs/SUMMARY.md +++ /dev/null @@ -1,44 +0,0 @@ -# Table of contents - -## Introduction - -* [Merlin Overview](README.md) - -## Connecting to Merlin - -* [Connection Methods](user/connecting-to-merlin/README.md) -* [Python SDK](user/connecting-to-merlin/python-sdk.md) -* [Merlin CLI](user/connecting-to-merlin/merlin-cli.md) - -## User Guides - -* [Basic Concepts](user/basics.md) - * [Create a Model](user/model.md) - * [Create a Model Version](user/model_version.md) - * [Model Version Endpoint](user/model_version_endpoint.md) - * [Model Endpoint](user/model_endpoint.md) - * [Model Deployment and Serving](user/model_deployment_serving.md) - * [Delete a Model](user/model_deletion.md) - * [Delete a Model Version](user/model_version_deletion.md) -* [Batch Prediction](user/batch_prediction.md) -* [Transformer](user/transformer.md) - * [Standard Transformer](user/standard_transformer.md) - * [Standard Transformer Expression](user/transformer_expressions.md) - * [Standard Transformer for UPI](user/standard_transformer_upi.md) - * [Custom Transformer](user/custom_transformer.md) -* [Examples](user/examples/README.md) - * [Deploy Standard Models](user/examples/standard_model.md) - * [Deploy PyFunc Model](user/examples/pyfunc_model.md) - * [Run Batch Prediction Job](user/examples/batch_prediction.md) - * [Using Transformers](user/examples/transformer.md) - * [Others](user/examples/others.md) - -## Developer Guides - -* [Deploying Merlin](developer/deploying-merlin/README.md) - * [Local Development](developer/deploying-merlin/local_development.md) -* [Architecture Overview](developer/architecture.md) - -## Reference - -* [Limitations](reference/limitations.md) diff --git a/docs/developer/architecture.md b/docs/developer/architecture.md index 79f8e5526..b9558811f 100644 --- a/docs/developer/architecture.md +++ b/docs/developer/architecture.md @@ -36,9 +36,7 @@ The big advantage of a golang-migrate is that it can read migration files from t ### Merlin SDK -[Merlin SDK](./../user/connecting-to-merlin/python-sdk.md) is a python library for interacting with Merlin. Data scientist can install merlin-sdk from Pypi and import it into their Python project or Jupyter notebook. It provides all the functionalites that users are allowed to perform in Merlin. Models can only be logged via the SDK. - -Upon installing the sdk, you will also have access to the [Merlin CLI](./../user/connecting-to-merlin/merlin-cli.md) +[Merlin SDK](https://pypi.org/project/merlin-sdk/) is a Python library for interacting with Merlin. Data scientists can install merlin-sdk from PyPI and import it into their Python project or Jupyter notebook. It provides all the functionalities that users are allowed to perform in Merlin. Models can only be logged via the SDK.
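For illustration, the snippet below sketches the typical SDK flow — connecting to a Merlin deployment, selecting a project and model, and logging a model version. The URL and names are placeholders; the individual calls are the same ones used throughout the user docs.

```python
import merlin
from merlin.model import ModelType

# Connect to a Merlin deployment and select the active project/model
# (the URL and names below are placeholders)
merlin.set_url("merlin.example.com")
merlin.set_project("my-project")
merlin.set_model("my-model", ModelType.TENSORFLOW)

# Log a new model version; its artifacts are synchronized to MLflow Tracking
with merlin.new_model_version() as v:
    merlin.log_model(model_dir="test/tensorflow-model")
```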
### CaraML MLP diff --git a/docs/developer/deploying-merlin/README.md b/docs/developer/deploying-merlin/README.md deleted file mode 100644 index 2a1d3aad0..000000000 --- a/docs/developer/deploying-merlin/README.md +++ /dev/null @@ -1,6 +0,0 @@ -# Deploying Merlin - -For getting started with Merlin in local development environment, click on [Local Development](./local_development.md). - - - diff --git a/docs/developer/deploying-merlin/local_development.md b/docs/developer/local-development.md similarity index 94% rename from docs/developer/deploying-merlin/local_development.md rename to docs/developer/local-development.md index 826b0b6e0..5443b149b 100644 --- a/docs/developer/deploying-merlin/local_development.md +++ b/docs/developer/local-development.md @@ -23,7 +23,7 @@ k3d cluster create $CLUSTER_NAME --image rancher/k3s:$K3S_VERSION --k3s-arg '--d ## Install Merlin -You can run [`quick_install.sh`](../../../scripts/quick_install.sh) to install Merlin and it's components: +You can run [`quick_install.sh`](../../scripts/quick_install.sh) to install Merlin and its components: ```bash # From Merlin root directory, run: diff --git a/docs/images/autoscaling_policy.png b/docs/images/autoscaling_policy.png new file mode 100644 index 000000000..e21ed5ae9 Binary files /dev/null and b/docs/images/autoscaling_policy.png differ diff --git a/docs/images/configure_alert.png b/docs/images/configure_alert.png new file mode 100644 index 000000000..11e2544fe Binary files /dev/null and b/docs/images/configure_alert.png differ diff --git a/docs/images/configure_alert_models_list.png b/docs/images/configure_alert_models_list.png new file mode 100644 index 000000000..1ebce9ae3 Binary files /dev/null and b/docs/images/configure_alert_models_list.png differ diff --git a/docs/user/configure_standard_transformer.gif b/docs/images/configure_standard_transformer.gif similarity index 100% rename from docs/user/configure_standard_transformer.gif rename to docs/images/configure_standard_transformer.gif diff --git a/docs/images/deploy_model_version.png b/docs/images/deploy_model_version.png new file mode 100644 index 000000000..3fa018b23 Binary files /dev/null and b/docs/images/deploy_model_version.png differ diff --git a/docs/images/deployment_mode.png b/docs/images/deployment_mode.png index 8105f7b42..e7468d6d2 100644 Binary files a/docs/images/deployment_mode.png and b/docs/images/deployment_mode.png differ diff --git a/docs/images/redeploy_model_version.png b/docs/images/redeploy_model_version.png new file mode 100644 index 000000000..56d034041 Binary files /dev/null and b/docs/images/redeploy_model_version.png differ diff --git a/docs/images/serve_model_version.png b/docs/images/serve_model_version.png new file mode 100644 index 000000000..68da9e49a Binary files /dev/null and b/docs/images/serve_model_version.png differ diff --git a/docs/maintainer/.gitkeep b/docs/maintainer/.gitkeep deleted file mode 100644 index e69de29bb..000000000 diff --git a/docs/maintainer/generated/00_setting_up.md b/docs/maintainer/generated/00_setting_up.md new file mode 100644 index 000000000..0983ce850 --- /dev/null +++ b/docs/maintainer/generated/00_setting_up.md @@ -0,0 +1,10 @@ + +# Installing Merlin + +Merlin can be installed using the Helm charts located at [caraml-dev/helm-charts](https://github.com/caraml-dev/helm-charts/tree/main). + +Minimally, [MLP](https://github.com/caraml-dev/mlp) and [KServe](https://github.com/kserve/kserve) must be installed for Merlin to work.
Besides these, a production deployment of Merlin would require other components such as networking, authorization policies, etc. to be set up. All of these capabilities are provided by the umbrella [CaraML chart](https://github.com/caraml-dev/helm-charts/tree/main/charts/caraml). It is recommended to install this chart using the appropriate toggles and configurations for its different sub-components. + +# Configuring Merlin + +Besides the configurations documented by the CaraML umbrella chart, detailed specs may be found under each of the sub-charts. For example, the [Merlin chart](https://github.com/caraml-dev/helm-charts/tree/main/charts/merlin)'s docs capture the list of configurable parameters. Additional configurations (`config.*`) accepted by Merlin may also be found [here](https://github.com/caraml-dev/merlin/blob/main/api/config/config.go#L46). \ No newline at end of file diff --git a/docs/maintainer/generated/01_troubleshooting.md b/docs/maintainer/generated/01_troubleshooting.md new file mode 100644 index 000000000..6c9b53a24 --- /dev/null +++ b/docs/maintainer/generated/01_troubleshooting.md @@ -0,0 +1,31 @@ + +# Troubleshooting Merlin + +Errors from the Merlin control plane APIs are typically returned to the users synchronously. However, at the moment, errors from some asynchronous operations may not be propagated back to the users (or even to the Merlin server). In such cases, the maintainers of Merlin may need to intervene to diagnose the issue further. + +Common sources of information on the failures are described below. + +## Control Plane Logs + +Control plane container logs are a starting point for understanding the issue further. It is recommended that the logs are forwarded and persisted in longer-term storage, without which the logs will be lost on container restarts. + +For example, Stackdriver logs may be filtered as follows: + +``` +resource.labels.cluster_name="caraml-cluster" +resource.labels.namespace_name="caraml-namespace" +resource.labels.container_name="merlin" +``` + +## Data Plane Logs and Kubernetes Events + +Issues pertaining to model deployment timeouts are best identified by looking at the Kubernetes events. For example, deployments from a CaraML project called `sample` will be done into the Kubernetes namespace of the same name. + +``` +$ kubectl describe pod -n sample +$ kubectl get events --sort-by='.lastTimestamp' -n sample +``` + +As pods can only be examined directly while they exist (during the model deployment timeout window) and events are only available in the cluster for up to an hour, these steps must be taken during / immediately after the deployment. + +Where the predictor / transformer pod is found to be restarting from errors, the container logs would be useful in shedding light on the problem. It is recommended to also persist the data plane logs in longer-term storage. \ No newline at end of file diff --git a/docs/maintainer/templates/00_setting_up.md b/docs/maintainer/templates/00_setting_up.md new file mode 100644 index 000000000..d7a9a341b --- /dev/null +++ b/docs/maintainer/templates/00_setting_up.md @@ -0,0 +1,10 @@ + +# Installing Merlin + +Merlin can be installed using the Helm charts located at [caraml-dev/helm-charts](https://github.com/caraml-dev/helm-charts/tree/main). + +Minimally, [MLP](https://github.com/caraml-dev/mlp) and [KServe](https://github.com/kserve/kserve) must be installed for Merlin to work.
Besides these, a production deployment of Merlin would require other components such as networking, authorization policies, etc. to be set up. All of these capabilities are provided by the umbrella [CaraML chart](https://github.com/caraml-dev/helm-charts/tree/main/charts/caraml). It is recommended to install this chart using the appropriate toggles and configurations for its different sub-components. + +# Configuring Merlin + +Besides the configurations documented by the CaraML umbrella chart, detailed specs may be found under each of the sub-charts. For example, the [Merlin chart](https://github.com/caraml-dev/helm-charts/tree/main/charts/merlin)'s docs capture the list of configurable parameters. Additional configurations (`config.*`) accepted by Merlin may also be found [here](https://github.com/caraml-dev/merlin/blob/main/api/config/config.go#L46). diff --git a/docs/maintainer/templates/01_troubleshooting.md b/docs/maintainer/templates/01_troubleshooting.md new file mode 100644 index 000000000..6a6f8bb62 --- /dev/null +++ b/docs/maintainer/templates/01_troubleshooting.md @@ -0,0 +1,31 @@ + +# Troubleshooting Merlin + +Errors from the Merlin control plane APIs are typically returned to the users synchronously. However, at the moment, errors from some asynchronous operations may not be propagated back to the users (or even to the Merlin server). In such cases, the maintainers of Merlin may need to intervene to diagnose the issue further. + +Common sources of information on the failures are described below. + +## Control Plane Logs + +Control plane container logs are a starting point for understanding the issue further. It is recommended that the logs are forwarded and persisted in longer-term storage, without which the logs will be lost on container restarts. + +For example, Stackdriver logs may be filtered as follows: + +``` +resource.labels.cluster_name="{{ merlin_cluster_name }}" +resource.labels.namespace_name="{{ merlin_namespace_name }}" +resource.labels.container_name="merlin" +``` + +## Data Plane Logs and Kubernetes Events + +Issues pertaining to model deployment timeouts are best identified by looking at the Kubernetes events. For example, deployments from a CaraML project called `sample` will be done into the Kubernetes namespace of the same name. + +``` +$ kubectl describe pod -n sample +$ kubectl get events --sort-by='.lastTimestamp' -n sample +``` + +As pods can only be examined directly while they exist (during the model deployment timeout window) and events are only available in the cluster for up to an hour, these steps must be taken during / immediately after the deployment. + +Where the predictor / transformer pod is found to be restarting from errors, the container logs would be useful in shedding light on the problem. It is recommended to also persist the data plane logs in longer-term storage. diff --git a/docs/maintainer/values.json b/docs/maintainer/values.json new file mode 100644 index 000000000..7403b9134 --- /dev/null +++ b/docs/maintainer/values.json @@ -0,0 +1,4 @@ +{ + "merlin_cluster_name": "caraml-cluster", + "merlin_namespace_name": "caraml-namespace" +} \ No newline at end of file diff --git a/docs/user/autoscaling_policy.md b/docs/user/autoscaling_policy.md deleted file mode 100644 index 64296526d..000000000 --- a/docs/user/autoscaling_policy.md +++ /dev/null @@ -1,54 +0,0 @@ -## Autoscaling Policy - -Merlin supports configuratble autoscaling policy to ensure that user has complete control over the autoscaling behavior of their models.
-There are 4 types of autoscaling metrics in Merlin: - -#### CPU utilization - -The autoscaling is based on the ration of model service's CPU usage and its CPU request. This autoscaling policy is available on all deployment mode. - -#### Memory utilization - -The autoscaling is based on the ration of model service's Memory usage and its Memory request. This autoscaling policy is available only on `SERVERLESS` deployment mode. - -#### Model Throughput (RPS) - -The autoscaling is based on RPS per replica of the model service. This autoscaling policy is available only on `SERVERLESS` deployment mode. - -#### Concurrency - -The autoscaling is based on number of concurrent request served by a replica of the model service. This autoscaling policy is available only on `SERVERLESS` deployment mode. - - -## Configuring Autoscaling Policy - -User can update autoscaling policy via Merlin SDK and Merlin UI - -### Configuring Autoscaling Policy via SDK - -Below is the example of configuring autoscaling policy of a `SERVERLESS` deployment to use `RPS` metrics. - -```python -import merlin -from merlin import DeploymentMode -from merlin.model import ModelType - -# Deploy using raw_deployment -merlin.set_url("http://localhost:5000") -merlin.set_project("my-project") -merlin.set_model("my-model", ModelType.TENSORFLOW) -model_dir = "test/tensorflow-model" - -with merlin.new_model_version() as v: - merlin.log_model(model_dir=model_dir) - -# Deploy using raw_deployment - endpoint = merlin.deploy(v1, deployment_mode=DeploymentMode.SERVERLESS, - autoscaling_policy=merlin.AutoscalingPolicy( - metrics_type=merlin.MetricsType.RPS, - target_value=20)) -``` - -### Configuring Autoscaling Policy via UI - -[![Configuring Autoscaling Policy](../images/deployment_mode.png)](https://user-images.githubusercontent.com/4023015/159232744-8aa23a87-9609-4825-9cb8-4bf0a7c0e4e1.mov) diff --git a/docs/user/basics.md b/docs/user/basics.md deleted file mode 100644 index 46594aa39..000000000 --- a/docs/user/basics.md +++ /dev/null @@ -1,9 +0,0 @@ -# Basic Concepts - -In order to get your first model deployed and serving, you will need to understand some basic concepts in Merlin: - -{% page-ref page="./model.md" %} -{% page-ref page="./model_version.md" %} -{% page-ref page="./model_version_endpoint.md" %} -{% page-ref page="./model_endpoint.md" %} -{% page-ref page="./model_deployment_serving.md" %} \ No newline at end of file diff --git a/docs/user/connecting-to-merlin/README.md b/docs/user/connecting-to-merlin/README.md deleted file mode 100644 index 854012749..000000000 --- a/docs/user/connecting-to-merlin/README.md +++ /dev/null @@ -1,29 +0,0 @@ -# Connecting to Merlin - -## [Python SDK](./python-sdk.md) - -- Log Model into Merlin -- Manage Model Version Endpoint and Model Version Endpoint -- Manage Batch Prediction Job - -## [Merlin CLI](./merlin-cli.md) - -- Deploy the Model Endpoint -- Undeploy the Model Endpoint -- Scaffold a new PyFunc project - -## Client Libraries - -Merlin provides [Go client library](../../api/client/client.go) to deploy and serve ML model in production. - -To connect to Merlin deployment, the client needs to be authenticated by Google OAuth2. You can use `google.DefaultClient()` to get the Application Default Credential. 
- -```go -googleClient, _ := google.DefaultClient(context.Background(), "https://www.googleapis.com/auth/userinfo.email") - -cfg := client.NewConfiguration() -cfg.BasePath = "http://merlin.dev/api/merlin/v1" -cfg.HTTPClient = googleClient - -apiClient := client.NewAPIClient(cfg) -``` diff --git a/docs/user/connecting-to-merlin/merlin-cli.md b/docs/user/connecting-to-merlin/merlin-cli.md deleted file mode 100644 index 9924449fe..000000000 --- a/docs/user/connecting-to-merlin/merlin-cli.md +++ /dev/null @@ -1,28 +0,0 @@ -# Merlin CLI - -The Merlin CLI can be installed directly using pip: - -```bash -pip install merlin-sdk -``` - -The CLI is a wrapper of [Merlin Python SDK](./python-sdk.md). - -```bash -$ merlin -Usage: merlin [OPTIONS] COMMAND [ARGS]... - - A simple command line tool. - - The Merlin CLI assumes that you already have a serialized model. - - To see the options for each command: merlin COMMAND --help - -Options: - --help Show this message and exit. - -Commands: - deploy Deploy the model - scaffold Generate PyFunc project - undeploy Undeploy the model -``` diff --git a/docs/user/connecting-to-merlin/python-sdk.md b/docs/user/connecting-to-merlin/python-sdk.md deleted file mode 100644 index d0be7bb7f..000000000 --- a/docs/user/connecting-to-merlin/python-sdk.md +++ /dev/null @@ -1,24 +0,0 @@ -# Python SDK - -The Merlin SDK can be installed directly using pip: - -```bash -pip install merlin-sdk -``` - -Users should then be able to connect to a Merlin deployment as follows - -```python -import merlin -from merlin.model import ModelType - -# Connect to an existing Merlin deployment -merlin.set_url("merlin.example.com") - -# Set the active model to the name given by parameter, if the model with the given name is not found, a new model will -# be created. -merlin.set_model("example-model", ModelType.PYFUNC) - -# Ensure that you're connected by printing out some Model Endpoints -merlin.list_model_endpoints() -``` diff --git a/docs/user/deployment_mode.md b/docs/user/deployment_mode.md deleted file mode 100644 index 6fe624fdd..000000000 --- a/docs/user/deployment_mode.md +++ /dev/null @@ -1,54 +0,0 @@ -## Deployment Mode - -Merlin supports 2 types of deployment mode: `SERVERLESS` and `RAW_DEPLOYMENT`. Under the hood, `SERVERLESS` deployment uses KNative as the serving stack, on the other hand `RAW_DEPLOYMENT` uses native [Kubernetes deployment resources](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). - - -## Tradeoff - -The deployment mode supported by Merlin has its own advantage and weakness listed in table below. - - -### Serverless Deployment - -| Pros | Cons | -| ------------------------------------------------------------- | --------------------------------------------------------------------------- | -| Supports more advanced autoscaling policy (RPS, Concurrency) | Slower compared to `RAW_DEPLOYMENT` due to infrastructure overhead | -| Supports scale down to zero | | - -### Raw Deployment - -| Pros | Cons | -| ------------------------------------------------------------- | --------------------------------------------------------------------------- | -| Relatively faster compared to `SERVERLESS` | Supports only autoscaling based on CPU usage | -| Less infrastructure overhead and more cost efficient | | - - -## Configuring Deployment Mode - -User is able to configure the deployment mode of their model via Merlin SDK and Merlin UI. 
- -### Configuring Deployment Mode via SDK - -Example below will configure the deployment mode to use `RAW_DEPLOYMENT` - -```python -import merlin -from merlin import DeploymentMode -from merlin.model import ModelType - -# Deploy using raw_deployment -merlin.set_url("http://localhost:5000") -merlin.set_project("my-project") -merlin.set_model("my-model", ModelType.TENSORFLOW) -model_dir = "test/tensorflow-model" - -with merlin.new_model_version() as v: - merlin.log_model(model_dir=model_dir) - -# Deploy using raw_deployment -new_endpoint = merlin.deploy(v, deployment_mode=DeploymentMode.RAW_DEPLOYMENT) -``` - -### Configuring Deployment Mode via UI - -[![Configuring Deployment Mode](../images/deployment_mode.png)](https://user-images.githubusercontent.com/4023015/159232744-8aa23a87-9609-4825-9cb8-4bf0a7c0e4e1.mov) diff --git a/docs/user/examples/README.md b/docs/user/examples/README.md deleted file mode 100644 index 8af3dbb02..000000000 --- a/docs/user/examples/README.md +++ /dev/null @@ -1,19 +0,0 @@ -# Examples - -Examples of using Merlin for different purposes are available to be tried out as Jupyter notebooks in the links below. -You may want to clone the examples to your local directory and run them using Jupyter notebook. - -{% page-ref page="./standard_model.md" %} -{% page-ref page="./pyfunc_model.md" %} -{% page-ref page="./batch_prediction.md" %} -{% page-ref page="./transformer.md" %} -{% page-ref page="./others.md" %} - - - - - - - - - diff --git a/docs/user/generated/00_introduction.md b/docs/user/generated/00_introduction.md new file mode 100644 index 000000000..1a6369106 --- /dev/null +++ b/docs/user/generated/00_introduction.md @@ -0,0 +1,43 @@ + +# Merlin + +After you have built a model with high-quality training data and the perfect algorithm, it’s time to apply it to make predictions and serve the outcome for future decision making. +For many data scientists, model training can be done easily within their Jupyter notebook. However, things become trickier when it comes to productionizing the model to serve real traffic, which is engineering intensive. There are many tools available, but learning when and how to use them requires a lot of exploration, which can be a headache. + +## What is Merlin + +Merlin is a platform designed to help users productionize their models quickly without deep knowledge of MLOps. Users only need to deploy their model into Merlin, and it will take care of the traffic routing and resources scaling in the background, saving lots of engineering hours and expertise required otherwise. + +## User Flows + +Productionizing a model with Merlin can be easily done in 3 steps, as detailed in the diagram below: + +![User Flow](../../diagrams/user_flow.drawio.svg) + +1. **Deploy a model** + + We want to make the deployment experience as seamless as possible, directly from Jupyter notebook. With the Merlin SDK, we can now upload the model and trigger the deployment pipeline, by simply calling a few functions in the notebook. Alternatively, Merlin UI supports the same, with just 1 click. + +2. **Setup serving endpoint** + + Once the model is deployed with an auto-generated HTTP endpoint, you can then specify the serving model version in the console. Give it a minute and your model will automagically be able to serve predictions. + +3. **Evaluate and iterate** + + The Merlin UI allows you to deploy and track different model versions and tag any version to run experiments easily.
All model artifacts are synchronized into MLflow Tracking, which can be used to track and compare the model performance. + +## Key Concepts of Merlin + +The design of Merlin uses a few key concepts, listed below, that you should familiarize yourself with: + +**Project**: Project represents a namespace for a collection of models. For example, a project could be food recommendations, driver allocation, ride pricing, etc. + +**Model**: Every model is associated with one (and only one) project and model endpoint. A model can also have zero or more model versions. In the entities' hierarchy of MLflow, a model corresponds to an MLflow experiment. + +**Model Version**: The model version represents an iteration within a model. A model version is associated with a run within MLflow. A model version can be deployed as a service; there can be multiple deployments of a model version, each with a different endpoint. + +**Model Endpoint**: Every model has its own endpoint that contains routing rule(s) to an active model version endpoint (serving mode). This endpoint is usually used to serve traffic in production. The model version it is routed to changes in the background when a serving model version is changed. Hence, there is no need to change the endpoint used to serve traffic when the serving model version is changed. + +**Model Version Endpoint**: A model version endpoint is a way to obtain model inference results in real-time, over the network (HTTP). This endpoint is unique to each model version. The model endpoint will route to the model version endpoint in the background when the associated model version is set to serving. + +**Environment**: The environment’s name is a user-facing property that will be used to determine the target Kubernetes cluster to which a model will be deployed. The environment has two important properties: name and Kubernetes cluster. \ No newline at end of file diff --git a/docs/user/generated/01_getting_started.md b/docs/user/generated/01_getting_started.md new file mode 100644 index 000000000..2b774bcd6 --- /dev/null +++ b/docs/user/generated/01_getting_started.md @@ -0,0 +1,47 @@ + +# Connecting to Merlin + +## Python SDK + +The Merlin SDK can be installed directly using pip: + +```bash +pip install merlin-sdk +``` + +Users should then be able to connect to a Merlin deployment as follows: + +{% code title="getting_started.py" overflow="wrap" lineNumbers="true" %} +```python +import merlin +from merlin.model import ModelType + +# Connect to an existing Merlin deployment +merlin.set_url("merlin.example.com") + +# Set the active model to the given name; if a model with that name is not found, a new model will +# be created. +merlin.set_model("example-model", ModelType.PYFUNC) + +# Ensure that you're connected by printing out some Model Endpoints +merlin.list_model_endpoints() +``` +{% endcode %} + +## Client Libraries + +Merlin provides a [Go client library](https://github.com/caraml-dev/merlin/blob/main/api/client/client.go) to deploy and serve ML models. + +To connect to the Merlin deployment, the client needs to be authenticated by Google OAuth2. You can use `google.DefaultClient()` to get the Application Default Credential.
+ +{% code title="getting_started.go" overflow="wrap" lineNumbers="true" %} +```go +googleClient, _ := google.DefaultClient(context.Background(), "https://www.googleapis.com/auth/userinfo.email") + +cfg := client.NewConfiguration() +cfg.BasePath = "http://merlin.dev/api/merlin/v1" +cfg.HTTPClient = googleClient + +apiClient := client.NewAPIClient(cfg) +``` +{% endcode %} \ No newline at end of file diff --git a/docs/user/generated/02_creating_a_model.md b/docs/user/generated/02_creating_a_model.md new file mode 100644 index 000000000..adf54b8b2 --- /dev/null +++ b/docs/user/generated/02_creating_a_model.md @@ -0,0 +1,34 @@ + +# Creating a Model + +A Model represents a machine learning model. Each Model has a type. Currently, Merlin supports both standard model types (PyTorch, SKLearn, TensorFlow, and XGBoost) and user-defined models (PyFunc model). + +Merlin also supports custom models. More info can be found here: {% page-ref page="./model_types/01_custom_model.md" %} + +Conceptually, a Model in Merlin is similar to a class in programming languages. To instantiate a Model, you’ll have to create a [Model Version](#creating-a-model-version). + +`merlin.set_model(<model_name>, <model_type>)` will set the active model to the given name. If a Model with the given name is not found, a new Model will be created. + +{% code title="model_creation.py" overflow="wrap" lineNumbers="true" %} +```python +import merlin
from merlin.model import ModelType + +merlin.set_model("tensorflow-sample", ModelType.TENSORFLOW) +``` +{% endcode %} + +# Creating a Model Version + +A Model Version represents a snapshot of a particular Model iteration. A Model Version might contain artifacts which are deployable to Merlin. You'll also be able to attach information such as metrics and tags to a given Model Version. + +{% code title="model_version_creation.py" overflow="wrap" lineNumbers="true" %} +```python +with merlin.new_model_version() as v: + merlin.log_metric("metric", 0.1) + merlin.log_param("param", "value") + merlin.set_tag("tag", "value") + + merlin.log_model(model_dir='tensorflow-sample') +``` +{% endcode %} \ No newline at end of file diff --git a/docs/user/generated/03_deploying_a_model.md b/docs/user/generated/03_deploying_a_model.md new file mode 100644 index 000000000..81a13aa67 --- /dev/null +++ b/docs/user/generated/03_deploying_a_model.md @@ -0,0 +1,12 @@ + +# Deploying a Model + +To learn about deploying a model, please visit the following docs. + +{% page-ref page="./model_deployment/01_deploying_a_model_version.md" %} + +{% page-ref page="./model_deployment/02_serving_a_model_version.md" %} + +{% page-ref page="./model_deployment/03_configuring_transformers.md" %} + +{% page-ref page="./model_deployment/04_redeploying_a_model_version.md" %} \ No newline at end of file diff --git a/docs/user/generated/04_deleting_a_model.md b/docs/user/generated/04_deleting_a_model.md new file mode 100644 index 000000000..ea552465b --- /dev/null +++ b/docs/user/generated/04_deleting_a_model.md @@ -0,0 +1,63 @@ + +# Model Version Deletion + +A Merlin model version can be deleted only if it is not serving any endpoints, does not have any deployed endpoints and, if the base model is of the `pyfunc_v2` type, does not have any active prediction jobs. Deleting a model version will result in the purging of the model version and its related entities, such as endpoints or prediction jobs, from the Merlin database. This action is **irreversible**.
+ +Model versions with related active prediction jobs or endpoints cannot be deleted. + +## Model Version Deletion via the SDK +To delete a Model Version, you can call the `delete_model_version()` function from the Merlin Python SDK. + +```python +merlin.set_project("test-project") + +merlin.set_model('test-model') + +version = merlin.active_model().get_version(id_version) + +version.delete_model_version() +``` + +## Model Version Deletion via the UI +To delete a model version from the UI, you can access the delete button directly on the model version list page. The dialog will provide information about entities that are blocking the deletion process or will be deleted along with the model version. + +- If the model version does not have any associated entities, a dialog like the one below will be displayed: +![Delete Model Version without linked entities](../../images/delete_model_version_no_entity.png) + +- If the model version has any associated active entities, a dialog like the one below (showing the entities blocking the deletion process) will be displayed: +![Delete Model Version with linked active entities](../../images/delete_model_version_active_entity.png) + +- If the model version has any associated inactive entities, a dialog like the one below (showing which entities will get deleted along with the deletion process) will be displayed: +![Delete Model Version with linked inactive entities](../../images/delete_model_version_inactive_entity.png) + +# Model Deletion + +{% hint style="info" %} +This feature is currently behind a toggle and may or may not be enabled on the Merlin controller by the maintainers. +{% endhint %} + +A Merlin model can be deleted only if it is not serving any endpoints, does not have any deployed model versions and, if the model is of the `pyfunc_v2` type, only if none of its model versions have any active prediction jobs. Deleting a model will result in the purging of all the model versions associated with it, as well as related entities such as endpoints or prediction jobs (applicable for models of the `pyfunc_v2` type) from the Merlin database. This action is **irreversible**. + +A model with model versions that have any active prediction jobs or endpoints cannot be deleted. + +## Model Deletion via the SDK +To delete a Model, you can call the `delete_model()` function from the Merlin Python SDK. + +```python +merlin.set_project("test-project") + +merlin.set_model('test-model') + +model = merlin.active_model() + +model.delete_model() +``` + +## Model Deletion via the UI +To delete a model from the UI, you can access the delete button directly on the model list page. The dialog will provide information about any entities that are blocking the deletion process. + +- If the model does not have any associated entities, a dialog like the one below will be displayed: +![Delete Model without linked entities](../../images/delete_model_no_entity.png) + +- If the model has any associated active entities, a dialog like the one below will be displayed: +![Delete Model with linked active entities](../../images/delete_model_active_entity.png) \ No newline at end of file diff --git a/docs/user/generated/05_configuring_alerts.md b/docs/user/generated/05_configuring_alerts.md new file mode 100644 index 000000000..15d49c7db --- /dev/null +++ b/docs/user/generated/05_configuring_alerts.md @@ -0,0 +1,21 @@ + +# Configuring Alerts + +{% hint style="info" %} +This feature is currently behind a toggle and may or may not be enabled on the Merlin controller by the maintainers.
+{% endhint %} + +Merlin uses a GitOps-based alerting mechanism. Alerts can be configured for a model, on the Model Endpoint (i.e., for the model version that is in the 'Serving' state), from the models list UI. + +![Configure Alerts on Model Endpoint](../../images/configure_alert_models_list.png) + +## Metrics + +Alerting based on the following metrics is supported. For all metrics below, the transformer metrics, if they exist, will also be taken into account. +* **Throughput:** This alert is triggered when the number of requests per second received by the model is lower than the threshold. +* **Latency:** This alert is triggered when the latency of model response time is higher than the threshold. +* **Error Rate:** This alert is triggered when the percentage of erroneous responses from the model is more than the threshold. +* **CPU:** This alert is triggered when the percentage of CPU utilization is more than the threshold. +* **Memory:** This alert is triggered when the percentage of memory utilization is more than the threshold. + +![Alert Configuration](../../images/configure_alert.png) \ No newline at end of file diff --git a/docs/user/batch_prediction.md b/docs/user/generated/06_batch_prediction.md similarity index 97% rename from docs/user/batch_prediction.md rename to docs/user/generated/06_batch_prediction.md index 3bae730fc..fa66b5d12 100644 --- a/docs/user/batch_prediction.md +++ b/docs/user/generated/06_batch_prediction.md @@ -1,10 +1,11 @@ + # Batch Prediction The batch prediction job will be executed as a Spark Application running in a Spark cluster on top of Kubernetes. ## Prediction Job -Prediction Job is the resource introduced in Merlin for executing batch prediction. A Prediction Job is owned by the corresponding [Model Version](./model_version.md). One Model Version can have several Prediction Jobs and it maintains the history of all jobs ever created. Prediction Job has several important properties: +Prediction Job is the resource introduced in Merlin for executing batch prediction. A Prediction Job is owned by the corresponding Model Version. One Model Version can have several Prediction Jobs and it maintains the history of all jobs ever created. Prediction Job has several important properties: 1. **Id**: Unique ID of the prediction job 1. **Model / Model version**: Reference to the model version from which the prediction job is created @@ -25,7 +26,7 @@ Prediction Job has several states during its lifetime: 1. **Terminating**: Prediction jobs enter a terminating state if a user manually cancels a pending/running prediction job. 1. **Terminated**: Once the termination process is completed, the prediction job will enter the terminated state. -![Prediction Job Lifecycle](../diagrams/prediction_job_lifecycle.drawio.svg) +![Prediction Job Lifecycle](../../diagrams/prediction_job_lifecycle.drawio.svg) ## Creating Secret/Service Account @@ -62,6 +63,7 @@ Source: https://github.com/GoogleCloudDataproc/spark-bigquery-connector To use a view as the data source instead of a table, you’ll have to set `viewsEnabled` to true and specify `viewMaterializationProject` and `viewMaterializationDataset`. Since the materialization of a view will create a table, the service account should also have `roles/bigquery.dataEditor` in the pointed dataset.
Below is an example: +{% code title="bq_source.py" overflow="wrap" lineNumbers="false" %} ```python bq_source = BigQuerySource("project.dataset.table_iris", features=["sepal_length", "sepal_width", "petal_length", "petal_width"], options={ "viewsEnabled" : "true", "viewMaterializationProject" : "project", "viewMaterializationDataset" : "dsp" }) ``` +{% endcode %} ## Configuring Sink @@ -109,7 +112,7 @@ Class `PredictionJobResourceRequest` is useful to configure the resource request 1. `executor_memory_request`: executor memory request. e.g. 1Gi, 512Mi 1. `executor_replica`: number of executor replica. e.g. 1, 2 -Without specifying `PredictionJobResourceRequest` the prediction job will run with the system default as follow: +Without specifying `PredictionJobResourceRequest` the prediction job will run with the system default as follows: ``` executor_replica: 3 @@ -149,4 +152,4 @@ https://issues.apache.org/jira/browse/SPARK-30961 #### Work Around -Add `pyarrow==0.11.1` and `pandas==0.24.1` to conda `environment.yaml` of your model. +Add `pyarrow==0.11.1` and `pandas==0.24.1` to the conda `environment.yaml` of your model. \ No newline at end of file diff --git a/docs/user/generated/07_examples.md b/docs/user/generated/07_examples.md new file mode 100644 index 000000000..0ec9f1584 --- /dev/null +++ b/docs/user/generated/07_examples.md @@ -0,0 +1,15 @@ + +# Examples + +Examples of using Merlin for different purposes are available to be tried out as Jupyter notebooks in the links below. +You may want to clone the examples to your local directory and run them using Jupyter notebook. + +{% page-ref page="./examples/01_standard_model.md" %} + +{% page-ref page="./examples/02_pyfunc_model.md" %} + +{% page-ref page="./examples/03_transformer.md" %} + +{% page-ref page="./examples/04_batch_prediction.md" %} + +{% page-ref page="./examples/05_others.md" %} \ No newline at end of file diff --git a/docs/user/limitations.md b/docs/user/generated/08_limitations.md similarity index 92% rename from docs/user/limitations.md rename to docs/user/generated/08_limitations.md index 2d08e7b83..3f906da28 100644 --- a/docs/user/limitations.md +++ b/docs/user/generated/08_limitations.md @@ -1,3 +1,4 @@ + # Limitations This article is an aggregation of the limits imposed on various components of the Merlin platform. @@ -47,4 +48,4 @@ Note that, to utilise this feature, the minimum replicas for the deployment shou ### Log History -Users can only view the logs that are still in the model’s container. Link to the associated stackdriver dashboard is provided in the log page to access past log. +Users can only view the logs that are still in the model’s container. A link to the associated Stackdriver dashboard is provided in the log page to access past logs. \ No newline at end of file diff --git a/docs/user/examples/standard_model.md b/docs/user/generated/examples/01_standard_model.md similarity index 86% rename from docs/user/examples/standard_model.md rename to docs/user/generated/examples/01_standard_model.md index 6f2872bb9..da6ec5f1d 100644 --- a/docs/user/examples/standard_model.md +++ b/docs/user/generated/examples/01_standard_model.md @@ -1,3 +1,5 @@ + + # Deploy Standard Models Try out the notebooks below to learn how to deploy different types of Standard Models to Merlin.
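Stepping back to the batch prediction section above: the pieces described there (a BigQuery source, a sink, and a `PredictionJobResourceRequest`) come together when submitting a prediction job from a model version. The sketch below is illustrative only — the `BigQuerySink`, `PredictionJobConfig`, and `create_prediction_job` names, the import paths, and the sink parameters are assumptions about the SDK and may differ across versions; only `BigQuerySource` and the `PredictionJobResourceRequest` parameters are taken from the docs above.

```python
# Assumed import paths; verify against your installed merlin-sdk version
from merlin.batch.config import PredictionJobConfig, PredictionJobResourceRequest
from merlin.batch.sink import BigQuerySink, SaveMode
from merlin.batch.source import BigQuerySource

# Source: the same BigQuery table as in the example above
bq_source = BigQuerySource(
    "project.dataset.table_iris",
    features=["sepal_length", "sepal_width", "petal_length", "petal_width"],
)

# Sink: destination table and staging bucket (parameter names are assumptions)
bq_sink = BigQuerySink(
    "project.dataset.table_iris_result",
    staging_bucket="my-staging-bucket",
    result_column="prediction",
    save_mode=SaveMode.OVERWRITE,
)

job_config = PredictionJobConfig(
    source=bq_source,
    sink=bq_sink,
    service_account_name="batch-sa",  # see "Creating Secret/Service Account"
    resource_request=PredictionJobResourceRequest(
        executor_memory_request="1Gi",
        executor_replica=3,
    ),
)

# `v` is a model version obtained via merlin.new_model_version() or get_version()
job = v.create_prediction_job(job_config=job_config)
```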
diff --git a/docs/user/examples/pyfunc_model.md b/docs/user/generated/examples/02_pyfunc_model.md similarity index 92% rename from docs/user/examples/pyfunc_model.md rename to docs/user/generated/examples/02_pyfunc_model.md index 4c99153cc..1cafb38dd 100644 --- a/docs/user/examples/pyfunc_model.md +++ b/docs/user/generated/examples/02_pyfunc_model.md @@ -1,3 +1,5 @@ + + # Deploy PyFunc Model Try out the notebooks below to learn how to deploy PyFunc Models to Merlin. diff --git a/docs/user/examples/transformer.md b/docs/user/generated/examples/03_transformer.md similarity index 75% rename from docs/user/examples/transformer.md rename to docs/user/generated/examples/03_transformer.md index 68122b346..743e2927f 100644 --- a/docs/user/examples/transformer.md +++ b/docs/user/generated/examples/03_transformer.md @@ -1,10 +1,12 @@ + + # Using Transformers Try out the notebooks below to learn how to deploy models with each type of transformer in Merlin. ## Deploy PyFunc Model with Standard Transformer -{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/standard-transformer/Standard%20Transformer.ipynb" %} +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/standard-transformer/Standard-Transformer.ipynb" %} ## Deploy PyFunc Model with Custom Transformer @@ -16,4 +18,4 @@ Try out the notebooks below to learn how to deploy models with each type of tran ## Deploy PyFunc Model with Feast Enricher Transformer -{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/feast-enricher-transformer/Feast%20Enricher.ipynb" %} \ No newline at end of file +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/feast-enricher-transformer/Feast-Enricher.ipynb" %} \ No newline at end of file diff --git a/docs/user/examples/batch_prediction.md b/docs/user/generated/examples/04_batch_prediction.md similarity index 61% rename from docs/user/examples/batch_prediction.md rename to docs/user/generated/examples/04_batch_prediction.md index 9cf6257bc..70e6caf60 100644 --- a/docs/user/examples/batch_prediction.md +++ b/docs/user/generated/examples/04_batch_prediction.md @@ -1,11 +1,13 @@ + + # Run Batch Prediction Job Try out the notebooks below to learn how to run batch prediction jobs using PyFunc V2 in Merlin. ## Run Iris Classifier Batch Prediction Job -{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/batch/Batch%20Prediction%20Tutorial%201%20-%20Iris%20Classifier.ipynb" %} +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/batch/BatchPredictionTutorial1-IrisClassifier.ipynb" %} ## Run New York Taxi Fare Batch Prediction Job -{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/batch/Batch%20Prediction%20Tutorial%202%20-%20New%20York%20Taxi%20.ipynb" %} \ No newline at end of file +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/batch/BatchPredictionTutorial2-NewYorkTaxi.ipynb" %} \ No newline at end of file diff --git a/docs/user/examples/others.md b/docs/user/generated/examples/05_others.md similarity index 70% rename from docs/user/examples/others.md rename to docs/user/generated/examples/05_others.md index 8af8dda85..8668c8720 100644 --- a/docs/user/examples/others.md +++ b/docs/user/generated/examples/05_others.md @@ -1,3 +1,5 @@ + + # Others Try out the notebooks below to learn about other features of Merlin. @@ -8,4 +10,4 @@ Try out the notebooks below to learn about other features of Merlin.
## Working with Model Endpoint -{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/model-endpoint/Model%20Endpoint.ipynb" %} \ No newline at end of file +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/model-endpoint/ModelEndpoint.ipynb" %} \ No newline at end of file diff --git a/docs/user/generated/model_deployment/01_deploying_a_model_version.md b/docs/user/generated/model_deployment/01_deploying_a_model_version.md new file mode 100644 index 000000000..d4e85c980 --- /dev/null +++ b/docs/user/generated/model_deployment/01_deploying_a_model_version.md @@ -0,0 +1,167 @@ + + +# Model Version Endpoint + +To start sending inference requests to a model version, it must first be deployed. During deployment, different configurations can be chosen such as the number of replicas, CPU/memory requests, autoscaling policy, environment variables, etc. The set of these configurations that are used to deploy a model version is called a *deployment*. + +A model may have any number of versions. But, at any given time, only a maximum of **2** model versions can be deployed. + +When a model version is deployed, a Model Version Endpoint is created. The URL is of the following format: + +``` +http://<model_name>-<version>.<project_name>.<merlin_base_url> +``` + +For example, a Model named `my-model` within a Project named `my-project` with the base domain `models.id.merlin.dev` will have a Model Version Endpoint for version `1` as follows: + +``` +http://my-model-1.my-project.models.id.merlin.dev +``` + +A Model Version Endpoint has several states: + +- **pending**: The initial state of a Model Version Endpoint. +- **running**: Once deployed, a Model Version Endpoint is in running state and is accessible. +- **serving**: A Model Version Endpoint is in serving state if a Model Endpoint is created from it. +- **terminated**: Once undeployed, a Model Version Endpoint is in terminated state. +- **failed**: If an error occurs during deployment. + +## Image Building + +Depending on the type of the model being deployed, there may be an intermediate step to build the Docker image (using Kaniko). This is applicable to PyFunc models. + +## Deploying a Model Version + +A model version can be deployed via the SDK or the UI. + +### Deploying a Model Version via SDK + +Here's an example of deploying a Model Version Endpoint using the Merlin Python SDK: + +{% code title="model_version_deployment.py" overflow="wrap" lineNumbers="true" %} +```python +with merlin.new_model_version() as v: + merlin.log_metric("metric", 0.1) + merlin.log_param("param", "value") + merlin.set_tag("tag", "value") + + merlin.log_model(model_dir='tensorflow-sample') + + merlin.deploy(v, environment_name="staging") +``` +{% endcode %} + +### Deploying a Model Version via UI + +The Deploy option can be selected from the model versions view. + +![Deploy a Model Version](../../../images/deploy_model_version.png) + +## Deployment Modes + +Merlin supports 2 types of deployment mode: `SERVERLESS` and `RAW_DEPLOYMENT`. Under the hood, `SERVERLESS` deployment uses Knative as the serving stack. On the other hand, `RAW_DEPLOYMENT` uses native [Kubernetes deployment resources](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). + +The deployment modes supported by Merlin have their own advantages and disadvantages, listed below. + +* **Serverless Deployment:** + - **Pros:** Supports more advanced autoscaling policy (RPS, Concurrency); supports scale down to zero.
- **Cons:** Slower compared to `RAW_DEPLOYMENT` due to infrastructure overhead +* **Raw Deployment:** + - **Pros:** Relatively faster compared to `SERVERLESS` deployments; less infrastructure overhead and more cost efficient. + - **Cons:** Supports only autoscaling based on CPU usage. + +### Configuring Deployment Modes + +Users are able to configure the deployment mode of their model via the SDK or the UI. + +#### Configuring Deployment Mode via SDK + +The example below will configure the deployment mode to use `RAW_DEPLOYMENT`: + +{% code title="deployment_configuration.py" overflow="wrap" lineNumbers="true" %} +```python +import merlin +from merlin import DeploymentMode +from merlin.model import ModelType + +# Connect to Merlin and set the active project/model +merlin.set_url("merlin.example.com") +merlin.set_project("my-project") +merlin.set_model("my-model", ModelType.TENSORFLOW) +model_dir = "test/tensorflow-sample" + +with merlin.new_model_version() as v: + merlin.log_model(model_dir=model_dir) + +# Deploy using raw_deployment +new_endpoint = merlin.deploy(v, deployment_mode=DeploymentMode.RAW_DEPLOYMENT) +``` +{% endcode %} + +#### Configuring Deployment Mode via UI + +![Deployment Mode](../../../images/deployment_mode.png) + +## Autoscaling Policy + +Merlin supports configurable autoscaling policy to ensure that users have complete control over the autoscaling behavior of their models. There are 4 types of autoscaling metrics in Merlin: + +* **CPU Utilization:** The autoscaling is based on the ratio of the model service's CPU usage to its CPU request. This autoscaling policy is available in all deployment modes. +* **Memory Utilization:** The autoscaling is based on the ratio of the model service's memory usage to its memory request. This autoscaling policy is available only in the `SERVERLESS` deployment mode. +* **Model Throughput (RPS):** The autoscaling is based on RPS per replica of the model service. This autoscaling policy is available only in the `SERVERLESS` deployment mode. +* **Concurrency:** The autoscaling is based on the number of concurrent requests served by a replica of the model service. This autoscaling policy is available only in the `SERVERLESS` deployment mode. + +### Configuring Autoscaling Policy + +Users can update the autoscaling policy via the SDK or the UI. + +#### Configuring Autoscaling Policy via SDK + +Below is an example of configuring the autoscaling policy of a `SERVERLESS` deployment to use the `RPS` metric: + +{% code title="autoscaling_policy.py" overflow="wrap" lineNumbers="true" %} +```python +import merlin +from merlin import DeploymentMode +from merlin.model import ModelType + +# Connect to Merlin and set the active project/model +merlin.set_url("merlin.example.com") +merlin.set_project("my-project") +merlin.set_model("my-model", ModelType.TENSORFLOW) +model_dir = "test/tensorflow-sample" + +with merlin.new_model_version() as v: + merlin.log_model(model_dir=model_dir) + +# Deploy using serverless with an RPS-based autoscaling policy +endpoint = merlin.deploy(v, deployment_mode=DeploymentMode.SERVERLESS, + autoscaling_policy=merlin.AutoscalingPolicy( + metrics_type=merlin.MetricsType.RPS, + target_value=20)) +``` +{% endcode %} + +#### Configuring Autoscaling Policy via UI + +![Autoscaling Policy](../../../images/autoscaling_policy.png) + +## Liveness Probe + +When deploying a model version, the model container will be built with a liveness probe by default. The liveness probe will periodically check that your model is still alive, and restart the pod automatically if it is deemed to be dead.
+ +However, should you wish to disable this probe, you may do so by providing an environment variable to the model service with the following value: + +``` +MERLIN_DISABLE_LIVENESS_PROBE="true" +``` + +This can be supplied via the deploy function, i.e.: + +{% code title="liveness_probe.py" overflow="wrap" lineNumbers="false" %} +```python + merlin.deploy(v, env_vars={"MERLIN_DISABLE_LIVENESS_PROBE": "true"}) +``` +{% endcode %} + +The liveness probe is also available for the transformer. More details can be found at: {% page-ref page="./transformer/standard_transformer/01_standard_transformer_expressions.md" %} \ No newline at end of file diff --git a/docs/user/generated/model_deployment/02_serving_a_model_version.md b/docs/user/generated/model_deployment/02_serving_a_model_version.md new file mode 100644 index 000000000..00399826b --- /dev/null +++ b/docs/user/generated/model_deployment/02_serving_a_model_version.md @@ -0,0 +1,47 @@ + + +# Model Endpoint + +Model serving is the next step of model deployment. After deploying a model version, we can optionally start serving it. This creates a Model Endpoint which is a stable URL associated with a model, of the following format: + +``` +http://<model_name>.<project_name>.<merlin_base_url> +``` + +For example, a Model named `my-model` within a Project named `my-project` with the base domain `models.id.merlin.dev` will have a Model Endpoint which looks as follows: + +``` +http://my-model.my-project.models.id.merlin.dev +``` + +Having a Model Endpoint makes it easy to keep updating the model (creating a new model version, running it and then serving it) without having to modify the model URL used by the calling system. + +## Serving a Model Version + +A model version can be served via the SDK or the UI. + +### Serving a Model Version via SDK + +To serve a model version, you can call the `serve_traffic()` function from the Merlin Python SDK. + +{% code title="model_version_serving.py" overflow="wrap" lineNumbers="true" %} +```python +with merlin.new_model_version() as v: + merlin.log_metric("metric", 0.1) + merlin.log_param("param", "value") + merlin.set_tag("tag", "value") + + merlin.log_model(model_dir='tensorflow-sample') + + version_endpoint = merlin.deploy(v, environment_name="staging") + +# serve 100% traffic at endpoint +model_endpoint = merlin.serve_traffic({version_endpoint: 100}) +``` +{% endcode %} + +### Serving a Model Version via UI + +Once a model version is deployed (i.e., it is in the Running state), the Serve option can be selected from the model versions view. + +![Serve Model Version](../../../images/serve_model_version.png) \ No newline at end of file diff --git a/docs/user/generated/model_deployment/03_configuring_transformers.md b/docs/user/generated/model_deployment/03_configuring_transformers.md new file mode 100644 index 000000000..322f84bdf --- /dev/null +++ b/docs/user/generated/model_deployment/03_configuring_transformers.md @@ -0,0 +1,11 @@ + + +# Transformer + +In the Merlin ecosystem, a Transformer is a service deployed in front of the model service that users can use to perform pre-processing / post-processing steps on the incoming request / outgoing response, to / from the model service. A Transformer allows the user to abstract the transformation logic outside of their model and even write it in a language more performant than Python.
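As a rough sketch of how a transformer is attached, this generally happens at deploy time via the SDK. The snippet below assumes the SDK's `StandardTransformer` class, its `config_file` / `enabled` parameters, and a YAML config file named `transformer_config.yaml` — these are assumptions for illustration; the transformer docs referenced below are authoritative.

```python
import merlin
from merlin.transformer import StandardTransformer  # assumed import path

# Assumed API: a standard transformer configured from a YAML file,
# attached to the model version `v` at deploy time
transformer = StandardTransformer(config_file="transformer_config.yaml",
                                  enabled=True)

endpoint = merlin.deploy(v, transformer=transformer)
```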
+
+Currently, Merlin supports two types of Transformer: Standard and Custom:
+
+{% page-ref page="./transformer/01_standard_transformer.md" %}
+
+{% page-ref page="./transformer/02_custom_transformer.md" %}
\ No newline at end of file
diff --git a/docs/user/generated/model_deployment/04_redeploying_a_model_version.md b/docs/user/generated/model_deployment/04_redeploying_a_model_version.md
new file mode 100644
index 000000000..6aa264317
--- /dev/null
+++ b/docs/user/generated/model_deployment/04_redeploying_a_model_version.md
@@ -0,0 +1,38 @@
+
+
+# Redeploying a Model Version
+
+When a deployment of a model version is attempted, a Model Version Endpoint is created. If the deployment is successful, the endpoint will be in the Running state. This endpoint can later also be terminated.
+
+Whenever a running (or serving) model version is redeployed, a new *deployment* is created and the Merlin API server attempts to deploy it, while keeping the existing deployment running (or serving).
+
+If the deployment of the new configuration fails, **the old deployment stays deployed** and remains as the current *deployment* of the model version. The new configuration will then show a 'Failed' status.
+
+![Unsuccessful Model Version Redeployment](../../../images/redeploy_model_unsuccessful.png)
+
+A model version can be redeployed via the SDK or the UI.
+
+### Redeploying a Model Version via SDK
+
+{% code title="model_version_redeployment.py" overflow="wrap" lineNumbers="true" %}
+```python
+import merlin
+from merlin import DeploymentMode
+
+# Get a model version that's already deployed
+merlin.set_url("merlin.example.com")
+merlin.set_project("my-project")
+merlin.set_model("my-model")
+model = merlin.active_model()
+version = model.get_version(2)
+
+# Redeploy using a new config (here, we are updating the deployment mode)
+new_endpoint = merlin.deploy(version, deployment_mode=DeploymentMode.RAW_DEPLOYMENT)
+```
+{% endcode %}
+
+### Redeploying a Model Version via UI
+
+A Running / Serving model version can be redeployed from the model versions view.
+
+![Redeploy Model Version](../../../images/redeploy_model_version.png)
\ No newline at end of file
diff --git a/docs/user/standard_transformer.md b/docs/user/generated/model_deployment/transformer/01_standard_transformer.md
similarity index 98%
rename from docs/user/standard_transformer.md
rename to docs/user/generated/model_deployment/transformer/01_standard_transformer.md
index b6d4bac9a..5886ad76c 100644
--- a/docs/user/standard_transformer.md
+++ b/docs/user/generated/model_deployment/transformer/01_standard_transformer.md
@@ -1,4 +1,7 @@
-# Standard Transformer
+
+
+# Standard Transformer
+
 Standard Transformer is a set of built-in pre- and post-processing steps supported by Merlin. With the standard transformer, it’s possible to enrich the model’s incoming request with features from Feast and transform the payload so that it’s compatible with the API interface provided by the model. The same transformation can also be applied to the model’s response payload in the post-processing step, which allows users to adapt the response payload to make it suitable for consumption. The standard transformer supports the **http_json** and **upi_v1** protocols. For the **http_json** protocol, the standard transformer runs a REST server on top of **HTTP 1.1**; for the **upi_v1** protocol, it runs a gRPC server.
 ## Concept
@@ -25,7 +28,7 @@ Within both preprocess and postprocess, there are 3 stages that users can specif
 * UPIPostprocessOutput.
UPIPostprocessOutput will return the UPI Response interface payload as a protobuf.Message type
-

+![Standard Transformer](../../../../images/standard_transformer.png) ## Jsonpath Jsonpath is a way to find value from JSON payload. Standard transformer using jsonpath to find values either from request or model response payload. Standard transformer using Jsonpath in several operations: @@ -45,7 +48,6 @@ fromJson: defaultValue: # (Optional) Default value if value for the jsonPath is nil or empty valueType: # Type of default value, mandatory to specify if default value is exist - ``` but in some part of operation like variable operation and feast entity extraction, jsonPath configuration is like below @@ -58,7 +60,6 @@ but in some part of operation like variable operation and feast entity extractio defaultValue: # (Optional) Default value if value for the jsonPath is nil or empty valueType: # Type of default value, mandatory to specify if default value is exist - ``` ### Default Value @@ -151,7 +152,7 @@ Expression can be used for updating column value expression: getS2ID(df.Col('lat'), df.Col('lon')) ``` -For full list of standard transformer built-in function, please check [Transformer Expressions](./transformer_expressions.md). +For full list of standard transformer built-in functions, please check: {% page-ref page="./standard_transformer/01_standard_transformer_expressions.md" %} ## Input Stage At the input stage, users specify all the data dependencies that are going to be used in subsequent stages. There are 4 operations available in these stages: @@ -897,8 +898,8 @@ For example, given following customerTable: Depending on the json format, it will render different result JSON -* RECORD format - ``` +* RECORD Format +``` outputStage: jsonOutput: jsonTemplate: @@ -907,9 +908,9 @@ Depending on the json format, it will render different result JSON fromTable: tableName: customerTable format: RECORD - ``` - JSON Result: - ``` +``` +JSON Result: +``` { "instances" : [ [ @@ -931,9 +932,10 @@ Depending on the json format, it will render different result JSON ] ] } - ``` +``` + * VALUES Format - ``` +``` outputStage: jsonOutput: jsonTemplate: @@ -942,9 +944,9 @@ Depending on the json format, it will render different result JSON fromTable: tableName: customerTable format: VALUES - ``` - JSON Result: - ``` +``` +JSON Result: +``` { "instances":[ [ @@ -954,9 +956,10 @@ Depending on the json format, it will render different result JSON ] ] } - ``` +``` + * SPLIT Format - ``` +``` outputStage: jsonOutput: jsonTemplate: @@ -965,9 +968,9 @@ Depending on the json format, it will render different result JSON fromTable: tableName: customerTable format: SPLIT - ``` - JSON Result: - ``` +``` +JSON Result: +``` { "instances" : { "data": [ @@ -978,7 +981,7 @@ Depending on the json format, it will render different result JSON "columns" : ["customer_id", "customer_age", "total_booking_1w"] } } - ``` +``` ### UPIPreprocessOutput UPIPreprocessOutput is output specification only for **upi_v1** protocol and preprocess step. This output specification will create operation that convert defined tables to UPI request interface. @@ -1053,7 +1056,7 @@ Once you logged your model and it’s ready to be deployed, you can go to the mo Here’s the short video demonstrating how to configure the Standard Transformer: -![Configure Standard Transformer](configure_standard_transformer.gif) +![Configure Standard Transformer](../../../../images/configure_standard_transformer.gif) 1. As the name suggests, you must choose **Standard Transformer** as Transformer Type. 2. The **Retrieval Table** panel will be displayed. 
This panel is where you configure the Feast Project, Entities, and Features to be retrieved. @@ -1082,6 +1085,7 @@ Version: 0.10.0 You need to pass `transformer` argument to the `merlin.deploy()` function to enable and deploy your standard transformer. +{% code title="standard_transformer_deployment.py" overflow="wrap" lineNumbers="true" %} ```python from merlin.resource_request import ResourceRequest from merlin.transformer import StandardTransformer @@ -1101,6 +1105,7 @@ transformer = StandardTransformer(config_file=transformer_config_path, # Deploy the model alongside the transformer endpoint = merlin.deploy(v, transformer=transformer) ``` +{% endcode %} ### Standard Transformer Environment Variables @@ -1137,6 +1142,4 @@ Below are supported environment variables to configure your Transformer. | `MODEL_HYSTRIX_SLEEP_WINDOW_MS` | Sleep window is duration of rejecting calling model predictor once the circuit is open | 10 | `MODEL_GRPC_KEEP_ALIVE_ENABLED` | Flag to enable UPI_V1 model predictor keep alive | false | `MODEL_GRPC_KEEP_ALIVE_TIME` | Duration of interval between keep alive PING | 60s -| `MODEL_GRPC_KEEP_ALIVE_TIMEOUT` | Duration of PING that considered as TIMEOUT | 5s - - +| `MODEL_GRPC_KEEP_ALIVE_TIMEOUT` | Duration of PING that considered as TIMEOUT | 5s \ No newline at end of file diff --git a/docs/user/custom_transformer.md b/docs/user/generated/model_deployment/transformer/02_custom_transformer.md similarity index 89% rename from docs/user/custom_transformer.md rename to docs/user/generated/model_deployment/transformer/02_custom_transformer.md index 76c3eba0f..2a5bb88ce 100644 --- a/docs/user/custom_transformer.md +++ b/docs/user/generated/model_deployment/transformer/02_custom_transformer.md @@ -1,3 +1,5 @@ + + # Custom Transformer In 0.8 release, Merlin adds support to the Custom Transformer deployment. This transformer type enables the users to deploy their own pre-built Transformer service. The user should develop, build, and publish their own Transformer Docker image. @@ -17,6 +19,7 @@ Similar to Standard Transformer, users can configure Custom Transformer from UI ### Deploy Custom Transformer using Merlin SDK +{% code title="custom_transformer_deployment.py" overflow="wrap" lineNumbers="true" %} ```python from merlin.resource_request import ResourceRequest from merlin.transformer import Transformer @@ -31,4 +34,5 @@ transformer = Transformer("gcr.io//", # Deploy the model alongside the transformer endpoint = merlin.deploy(v, transformer=transformer) -``` \ No newline at end of file +``` +{% endcode %} \ No newline at end of file diff --git a/docs/user/transformer_expressions.md b/docs/user/generated/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md similarity index 99% rename from docs/user/transformer_expressions.md rename to docs/user/generated/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md index 29bc8d266..f1d178307 100644 --- a/docs/user/transformer_expressions.md +++ b/docs/user/generated/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md @@ -1,3 +1,5 @@ + + # Standard Transformer Expressions Standard Transformer provides several built-in functions that are useful for common ML use-cases. These built-in functions are accessible from within expression context. 
diff --git a/docs/user/standard_transformer_upi.md b/docs/user/generated/model_deployment/transformer/standard_transformer/02_standard_transformer_upi.md similarity index 83% rename from docs/user/standard_transformer_upi.md rename to docs/user/generated/model_deployment/transformer/standard_transformer/02_standard_transformer_upi.md index 00623fcd4..b17f131ce 100644 --- a/docs/user/standard_transformer_upi.md +++ b/docs/user/generated/model_deployment/transformer/standard_transformer/02_standard_transformer_upi.md @@ -1,6 +1,10 @@ + + # Configuring Standard Transformer for UPI Model -> This guide assumes you have experience using standard transformer and familiar with UPI contract. You can refer to https://github.com/caraml-dev/universal-prediction-interface to get details on the contract. +{% hint style="info" %} +This guide assumes you have experience using standard transformer and are familiar with UPI contract. You can refer to https://github.com/caraml-dev/universal-prediction-interface to get details on the contract. +{% endhint %} There are 2 key differences in Standard Transformer when it’s deployed using UPI protocol: @@ -40,12 +44,13 @@ You can avoid it altogether by using autoload feature in UPI. To do so: For example, when using Python SDK, you can do so by following code. In below example, we are storing `user_rating` as variable and `customer_df` as `customer_table` in `transformer_input`, as well as sending the `prediction_df` as `prediction_table`. +{% code title="upi_standard_transformer_deployment.py" overflow="wrap" lineNumbers="true" %} ```python from caraml.upi.v1 import type_pb2, upi_pb2_grpc, upi_pb2, variable_pb2 -request = upi_pb2.`PredictValuesRequest`( - ... - prediction_table=df_to_table(predict_df, "prediction_table")), +request = upi_pb2.PredictValuesRequest( + # ... + prediction_table=df_to_table(predict_df, "prediction_table"), transformer_input=upi_pb2.TransformerInput( variables=[ variable_pb2.Variable(name="user_rating", @@ -53,17 +58,18 @@ request = upi_pb2.`PredictValuesRequest`( double_value=5.0), ], tables=[df_to_table(customer_df, "customer_table")] - ) - ... + ), + # ... ) ``` +{% endcode %} #### Add autoload feature in the standard transformer config. Add all variables and tables that are going to be imported in the standard transformer. In below example we are importing `prediction_table`, `customer_table` , and `user_rating` that was sent by the client. -![UPI Autoload](../images/upi_autoloading_config.png) +![UPI Autoloading](../../../../../images/upi_autoloading_config.png) Which will add following config @@ -88,7 +94,7 @@ Standard transformer’s preprocessing output in UPI mode must satisfy `PredictV Example below shows a preprocessing pipeline which join prediction_table and sample_table to produce preprocessed_table , and then use the preprocessed_table as the prediction_table of the `PredictValuesRequest` that is sent to model. -![UPI Preprocess Output](../images/upi_preprocess_output.png) +![UPI Standard Transformer Preprocessing Output](../../../../../images/upi_preprocess_output.png) ```yaml transformerConfig: @@ -112,9 +118,10 @@ transformerConfig: predictionTableName: preprocessed_table transformerInputTableNames: [] postprocess: {} - ``` Similarly, post-processing output in UPI mode must satisfy `PredictValuesResponse`. You can populate prediction_result_table of the `PredictValuesResponse` that will be sent back to client by defining its source table. 
The source table can be a table declared in either the preprocessing or the post-processing pipeline.
-> When pre-processing or post-processing pipeline is not defined, standard transformer will simply forward the request/response to its receiver.
+{% hint style="info" %}
+When the pre-processing or post-processing pipeline is not defined, the standard transformer will simply forward the request/response to its receiver.
+{% endhint %}
\ No newline at end of file
diff --git a/docs/user/custom_model.md b/docs/user/generated/model_types/01_custom_model.md
similarity index 95%
rename from docs/user/custom_model.md
rename to docs/user/generated/model_types/01_custom_model.md
index e5dcc276d..6d5fe4694 100644
--- a/docs/user/custom_model.md
+++ b/docs/user/generated/model_types/01_custom_model.md
@@ -1,8 +1,8 @@
+
+
 # Custom Model
 Custom model enables users to deploy any Docker image that satisfies Merlin's requirements. Users are responsible for developing their own web service, and for building and publishing the Docker image, which can later be deployed through Merlin.
-## Why Custom Model
-
 Users should consider using a custom model if one of the following conditions applies:
 * The model needs complex custom transformations (preprocess and postprocess) and the user wants to use a language other than Python.
 * The model is non-standard, e.g. a heuristic, or a model from an ML framework that has not yet been introduced in Merlin.
@@ -41,7 +41,7 @@ Similar with `HTTP_JSON` custom model, users can add the artifact during model u
 If users want to emit metrics from this web server, they need to expose a REST endpoint for scraping metrics. The challenge here is that Knative (the underlying Kubernetes deployment tool that Merlin uses) doesn't open multiple ports, hence the REST endpoint must run on the same port as the gRPC server (using the port number given by `CARAML_GRPC_PORT`). Not every programming language supports running multiple protocols (gRPC and HTTP in this case) on the same port; Go users can use [cmux](https://github.com/soheilhy/cmux) to solve this problem, otherwise users can push metrics to the [pushgateway](https://prometheus.io/docs/instrumenting/pushing/).
 ### Environment Variables
-As mentioned in the previous section, there are several environment variables that will be supplied by merlin control plan to custom model. Below are the list of the variables
+As mentioned in the previous section, there are several environment variables that will be supplied by the Merlin control plane to the custom model. Below is the list of these variables:
 | Name | Description |
 |------|-------------|
@@ -79,8 +79,8 @@ Most of the method that used in the above snipped is commonly used by all the mo
 | `command` | Command to run docker image | No |
 | `args` | Arguments that need to be specified when running docker | No |
-### Deployment Flow:
+### Deployment Flow
 * Create new model version
 * Log custom model, specify image and model directory that contains artifacts that need to be uploaded
-* Deploy. There is no difference with other model deployments
+* Deploy. There is no difference from other model deployments.
\ No newline at end of file
diff --git a/docs/user/model.md b/docs/user/model.md
deleted file mode 100644
index c6154d722..000000000
--- a/docs/user/model.md
+++ /dev/null
@@ -1,14 +0,0 @@
-# Model
-
-Model represents a machine learning model. Each Model has a type, currently Merlin supports both standard model (PyTorch, SKLearn, Tensorflow, and XGBoost) and user-defined model (PyFunc model).
- -Conceptually, Model in Merlin is similar to a class in programming language. To instantiate a Model you’ll have to create a [Model Version](./model_version.md). - -`merlin.set_model(, )` will set the active model to the name given by parameter. If the Model with given name is not found, a new Model will be created. - -```python -import merlin -from merlin.model import ModelType - -merlin.set_model("tensorflow-model", ModelType.TENSORFLOW) -``` diff --git a/docs/user/model_deletion.md b/docs/user/model_deletion.md deleted file mode 100644 index a19dce3d5..000000000 --- a/docs/user/model_deletion.md +++ /dev/null @@ -1,28 +0,0 @@ -# Model Deletion - -A Merlin model can be deleted only if it is not serving any endpoints and does not have any deployed model versions or, if the model is of the `pyfunc_v2` type, none of its model versions must not have any active prediction jobs. Deleting a model will result in the purging of all the model versions associated with it, as well as related entities such as endpoints or prediction jobs (applicable for models of the `pyfunc_v2` type) from the Merlin database. This action is **irreversible**. - -A model with model versions that have any active prediction jobs or endpoints cannot be deleted. - - -## Model Deletion Via the SDK -To delete a Model, you can call the `delete_model()` function from the Merlin Python SDK. - -```python -merlin.set_project("test-project") - -merlin.set_model('test-model') - -model = merlin.active_model() - -model.delete_model() -``` - -## Model Deletion via the UI -To delete a model from the UI, you can access the delete button directly on the model list page. The dialog will provide information about any entities that are blocking the deletion process. - -- If the model does not have any associated entities, a dialog like the one below will be displayed: -![Model Version Deletion Without Entity](../images/delete_model_no_entity.png) - -- If the model has any associated active entities, a dialog like the one below will be displayed: -![Model Version Deletion Without Entity](../images/delete_model_active_entity.png) diff --git a/docs/user/model_deployment_serving.md b/docs/user/model_deployment_serving.md deleted file mode 100644 index 89e2c6c4c..000000000 --- a/docs/user/model_deployment_serving.md +++ /dev/null @@ -1,25 +0,0 @@ -# Model Deployment and Serving - -Model deployment in Merlin is a process of creating a model service and it's [Model Version Endpoint](./model_version_endpoint.md). Internally, the deployment of the Model Version Endpoint is done via [kaniko](https://github.com/GoogleContainerTools/kaniko) and [KFServing](https://github.com/kubeflow/kfserving). - -There are two types of Model Version deployment, standard and python function (PyFunc) deployment. The difference is PyFunc deployment includes Docker image building step by Kaniko. - -Model serving is the next step of model deployment. After we have a running Model Version Endpoint, we can start serving the HTTP traffic by routing the Model Endpoint to it. - -![Model Deployment and Serving](../diagrams/model_deployment_serving.drawio.svg) - -# Model Versions and Deployments -Each model version can deployed with a different set of deployment configurations, such as the number of -replicas, CPU/memory requests, autoscaling policy, environment variables, etc. Each set of these configurations that are -used to deploy a model version are called a *deployment*. 
- -While each model can have up to **2** model versions deployed at any point of time, each model version can only be -deployed using **1** deployment at any point of time. - -Whenever a running model version is redeployed, a new *deployment* is created and the Merlin API server attempts to -deploy it, all while keep the existing deployment running. - -If the deployment of the new configuration fails, **the old deployment stays deployed** and remains as the current -*deployment* of the model version. The new configuration will then show a 'Failed' status. - -![Unsuccessful redeployment](../images/redeploy_model_unsuccessful.png) diff --git a/docs/user/model_endpoint.md b/docs/user/model_endpoint.md deleted file mode 100644 index b3d42bba5..000000000 --- a/docs/user/model_endpoint.md +++ /dev/null @@ -1,31 +0,0 @@ -# Model Endpoint - -Model Endpoint is a stable URL associated with a model. Model Endpoint URL has following template: - -``` -http://.. -``` - -For example a Model named `my-model` within Project named `my-project` will have Model Endpoint which look as follow: - -``` -http://my-model.my-project.models.id.merlin.dev -``` - -Model Endpoint can have a traffic rule which determine which [Model Version Endpoint](./model_version_endpoint.md) will receive traffic when request is received. - -To serve Model Endpoint, you can call `serve_traffic()` function from Merlin Python SDK. - -```python -with merlin.new_model_version() as v: - merlin.log_metric("metric", 0.1) - merlin.log_param("param", "value") - merlin.set_tag("tag", "value") - - merlin.log_model(model_dir='tensorflow-model') - - version_endpoint = merlin.deploy(v, environment_name="production") - -# serve 100% traffic at endpoint -model_endpoint = merlin.serve_traffic({version_endpoint: 100}) -``` diff --git a/docs/user/model_version.md b/docs/user/model_version.md deleted file mode 100644 index f4920ce65..000000000 --- a/docs/user/model_version.md +++ /dev/null @@ -1,12 +0,0 @@ -# Model Version - -Model Version represents a snapshot of particular Model iteration. A Model Version might contain artifacts which is deployable to Merlin. You'll also be able to attach information such as metrics and tag to a given Model Version. - -```python -with merlin.new_model_version() as v: - merlin.log_metric("metric", 0.1) - merlin.log_param("param", "value") - merlin.set_tag("tag", "value") - - merlin.log_model(model_dir='tensorflow-model') -``` diff --git a/docs/user/model_version_deletion.md b/docs/user/model_version_deletion.md deleted file mode 100644 index 11fe8f95a..000000000 --- a/docs/user/model_version_deletion.md +++ /dev/null @@ -1,34 +0,0 @@ -# Model Version Deletion - -A Merlin model version can be deleted only if it is not serving any endpoints and does not have any deployed -endpoints or, if the base model is of the `pyfunc_v2` type, the model version must not have any active prediction jobs. -Deleting a model version will result in the purging of the model version and its related entities, such as endpoints or -prediction jobs, from the Merlin database. This action is **irreversible**. - -Model versions with related active prediction jobs or endpoints can not be deleted. - -## Model Version Deletion Via the SDK -To delete a Model Version, you can call the `delete_model_version()` function from Merlin Python SDK. 
- -```python -merlin.set_project("test-project") - -merlin.set_model('test-model') - -version = merlin.active_model().get_version(id_version) - -version.delete_model_version() -``` - - -## Model Version Deletion via the UI -To delete a model version from the UI, you can access the delete button directly on the model version list page. The dialog will provide information about entities that are blocking the deletion process or will be deleted along with the model version. - -- If the model version does not have any associated entities, a dialog like the one below will be displayed: -![Model Version Deletion Without Entity](../images/delete_model_version_no_entity.png) - -- If the model version has any associated active entities, a dialog like the one below (showing the entities blocking the deletion process) will be displayed: -![Model Version Deletion Without Entity](../images/delete_model_version_active_entity.png) - -- If the model version has any associated inactive entities, a dialog like the one below (showing which entities will get deleted along with the deletion process) will be displayed: -![Model Version Deletion Without Entity](../images/delete_model_version_inactive_entity.png) diff --git a/docs/user/model_version_endpoint.md b/docs/user/model_version_endpoint.md deleted file mode 100644 index 7688d4e3e..000000000 --- a/docs/user/model_version_endpoint.md +++ /dev/null @@ -1,51 +0,0 @@ -# Model Version Endpoint - -Model Version Endpoint is an URL associated with a Model Version deployment. Model Version Endpoint URL has following template: - -``` -http://-.. -``` - -For example a Model named `my-model` within Project named `my-project` will have a Model Version Endpoint for version `1` which look as follow: - -``` -http://my-model-1.my-project.models.id.merlin.dev -``` - -Model Version Endpoint has several state: - -- **pending**: The initial state of a Model Version Endpoint. -- **ready**: Once deployed, a Model Version Endpoint is in ready state and is accessible. -- **serving**: A Model Version Endpoint is in serving state if [Model Endpoint](./model_endpoint.md) has traffic rule which uses the particular Model Version Endpoint. A Model Version Endpoint could not be undeployed if its still in serving state. -- **terminated**: Once undeployed, a Model Version Endpoint is in terminated state. -- **failed**: If error occurred during deployment. - -Here's the example to deploy a Model Version Endpoint using Merlin Python SDK: - -```python -with merlin.new_model_version() as v: - merlin.log_metric("metric", 0.1) - merlin.log_param("param", "value") - merlin.set_tag("tag", "value") - - merlin.log_model(model_dir='tensorflow-model') - - merlin.deploy(v, environment_name="production") -``` - -## Model Liveness -When deploying a model, the model container will be built with a livenes probe by default. The liveness probe will periodically check that your model is still alive, and restart the pod automatically if it is deemed to be dead. - -However, should you wish to disable this probe, you may do so by providing an environment variable to the model service with the following value: - -``` -MERLIN_DISABLE_LIVENESS_PROBE="true" -``` - -This can be supplied via the deploy function. i.e. - -```python - merlin.deploy(v, env_vars={"MERLIN_DISABLE_LIVENESS_PROBE"="true"}) -``` - -The liveness probe is also available for the transformer. Checkout [Standard Transformer Environment Variables](./standard_transformer.md) for more details. 
diff --git a/docs/user/templates/00_introduction.md b/docs/user/templates/00_introduction.md new file mode 100644 index 000000000..1a6369106 --- /dev/null +++ b/docs/user/templates/00_introduction.md @@ -0,0 +1,43 @@ + +# Merlin + +After you have built a model with high-quality training data and the perfect algorithm, it’s time to apply it to make predictions and serve the outcome for future decision making. +For many data scientists, model training can be done easily within their Jupyter notebook. However, things become trickier when it comes to productionizing the model to serve real traffic, which is engineering intensive. There are many tools available, but learning when and how to use them requires a lot of exploration, which can be a headache. + +## What is Merlin + +Merlin is a platform designed to help users productionize their models quickly without deep knowledge on MLOps. Users only need to deploy their model into Merlin, and it will take care of the traffic routing and resources scaling in the background, saving lots of engineering hours and expertise required otherwise. + +## User Flows + +Productionizing a model with Merlin can be easily done in 3 steps, as detailed in the diagram below: + +![User Flow](../../diagrams/user_flow.drawio.svg) + +1. **Deploy a model** + + We want to make the deployment experience as seamless as possible, directly from Jupyter notebook. With the Merlin SDK, we can now upload the model and trigger the deployment pipeline, by simply calling a few functions in the notebook. Alternatively, Merlin UI supports the same, with just 1 click. + +2. **Setup serving endpoint** + + Once the model is deployed with an auto-generated HTTP endpoint, you can then specify the serving model version in the console. Give it a minute and your model will automagically be able to serve prediction. + +3. **Evaluate and iterate** + + The Merlin UI allows you to deploy and track different model versions and tag any version to run experiment easily. All model artifacts are synchronized into MLflow Tracking, which can be used to track and compare the model performance. + +## Key Concepts of Merlin + +The design of Merlin uses a few key concepts below, you should familiarize yourself with: + +**Project**: Project represents a namespace for a collection of model. For example, a project could be food Recommendations, driver allocation, ride pricing, etc. + +**Model**: Every model is associated with one (and only one) project and model endpoint. Model also can have zero or more model versions. In the entities' hierarchy of MLflow, a model corresponds to an MLflow experiment. + +**Model Version**: The model version represents an iteration within a model. A model version is associated with a run within MLflow. A Model Version can be deployed as a service, there can be multiple deployments of model version with different endpoint each. + +**Model Endpoint**: Every model has its own endpoint that contains routing rule(s) to an active model version endpoint (serving mode). This endpoint is usually used to serve traffic in production. The model version it is routed to changes in the background when a serving model version is changed. Hence there is no need to change the endpoint used to serve traffics when the serving model version is changed. + +**Model Version Endpoint**: A model version endpoint is a way to obtain model inference results in real-time, over the network (HTTP). This endpoint is unique to each model version. 
Model endpoint will route to the model version endpoint in the background when the associated model version is set to serving.
+
+**Environment**: The environment’s name is a user-facing property that will be used to determine the target Kubernetes cluster to which a model will be deployed. The environment has two important properties: name and Kubernetes cluster.
\ No newline at end of file
diff --git a/docs/user/templates/01_getting_started.md b/docs/user/templates/01_getting_started.md
new file mode 100644
index 000000000..9c603062d
--- /dev/null
+++ b/docs/user/templates/01_getting_started.md
@@ -0,0 +1,47 @@
+
+# Connecting to Merlin
+
+## Python SDK
+
+The Merlin SDK can be installed directly using pip:
+
+```bash
+pip install merlin-sdk
+```
+
+Users should then be able to connect to a Merlin deployment as follows:
+
+{% code title="getting_started.py" overflow="wrap" lineNumbers="true" %}
+```python
+import merlin
+from merlin.model import ModelType
+
+# Connect to an existing Merlin deployment
+merlin.set_url("{{ merlin_url }}")
+
+# Set the active model to the name given by the parameter; if a model with the given name is not found,
+# a new model will be created.
+merlin.set_model("example-model", ModelType.PYFUNC)
+
+# Ensure that you're connected by printing out some Model Endpoints
+merlin.list_model_endpoints()
+```
+{% endcode %}
+
+## Client Libraries
+
+Merlin provides a [Go client library](https://github.com/caraml-dev/merlin/blob/main/api/client/client.go) to deploy and serve ML models.
+
+To connect to the Merlin deployment, the client needs to be authenticated by Google OAuth2. You can use `google.DefaultClient()` to get the Application Default Credentials.
+
+{% code title="getting_started.go" overflow="wrap" lineNumbers="true" %}
+```go
+googleClient, _ := google.DefaultClient(context.Background(), "https://www.googleapis.com/auth/userinfo.email")
+
+cfg := client.NewConfiguration()
+cfg.BasePath = "http://merlin.dev/api/merlin/v1"
+cfg.HTTPClient = googleClient
+
+apiClient := client.NewAPIClient(cfg)
+```
+{% endcode %}
\ No newline at end of file
diff --git a/docs/user/templates/02_creating_a_model.md b/docs/user/templates/02_creating_a_model.md
new file mode 100644
index 000000000..adf54b8b2
--- /dev/null
+++ b/docs/user/templates/02_creating_a_model.md
@@ -0,0 +1,34 @@
+
+# Creating a Model
+
+A Model represents a machine learning model. Each Model has a type. Currently, Merlin supports both standard model types (PyTorch, SKLearn, Tensorflow, and XGBoost) and user-defined models (PyFunc model).
+
+Merlin also supports custom models. More info can be found here: {% page-ref page="./model_types/01_custom_model.md" %}
+
+Conceptually, a Model in Merlin is similar to a class in programming languages. To instantiate a Model, you’ll have to create a [Model Version](#creating-a-model-version).
+
+`merlin.set_model(<model_name>, <model_type>)` will set the active model to the name given by the parameter. If a Model with the given name is not found, a new Model will be created.
+
+{% code title="model_creation.py" overflow="wrap" lineNumbers="true" %}
+```python
+import merlin
+from merlin.model import ModelType
+
+merlin.set_model("tensorflow-sample", ModelType.TENSORFLOW)
+```
+{% endcode %}
+
+# Creating a Model Version
+
+A Model Version represents a snapshot of a particular Model iteration. A Model Version might contain artifacts which are deployable to Merlin. You'll also be able to attach information such as metrics and tags to a given Model Version.
+ +{% code title="model_version_creation.py" overflow="wrap" lineNumbers="true" %} +```python +with merlin.new_model_version() as v: + merlin.log_metric("metric", 0.1) + merlin.log_param("param", "value") + merlin.set_tag("tag", "value") + + merlin.log_model(model_dir='tensorflow-sample') +``` +{% endcode %} \ No newline at end of file diff --git a/docs/user/templates/03_deploying_a_model.md b/docs/user/templates/03_deploying_a_model.md new file mode 100644 index 000000000..81a13aa67 --- /dev/null +++ b/docs/user/templates/03_deploying_a_model.md @@ -0,0 +1,12 @@ + +# Deploying a Model + +To learn about deploying a model, please visit the following docs. + +{% page-ref page="./model_deployment/01_deploying_a_model_version.md" %} + +{% page-ref page="./model_deployment/02_serving_a_model_version.md" %} + +{% page-ref page="./model_deployment/03_configuring_transformers.md" %} + +{% page-ref page="./model_deployment/04_redeploying_a_model_version.md" %} \ No newline at end of file diff --git a/docs/user/templates/04_deleting_a_model.md b/docs/user/templates/04_deleting_a_model.md new file mode 100644 index 000000000..5e861cc08 --- /dev/null +++ b/docs/user/templates/04_deleting_a_model.md @@ -0,0 +1,63 @@ + +# Model Version Deletion + +A Merlin model version can be deleted only if it is not serving any endpoints and does not have any deployed endpoints or, if the base model is of the `pyfunc_v2` type, the model version must not have any active prediction jobs. Deleting a model version will result in the purging of the model version and its related entities, such as endpoints or prediction jobs, from the Merlin database. This action is **irreversible**. + +Model versions with related active prediction jobs or endpoints can not be deleted. + +## Model Version Deletion via the SDK +To delete a Model Version, you can call the `delete_model_version()` function from Merlin Python SDK. + +```python +merlin.set_project("test-project") + +merlin.set_model('test-model') + +version = merlin.active_model().get_version(id_version) + +version.delete_model_version() +``` + +## Model Version Deletion via the UI +To delete a model version from the UI, you can access the delete button directly on the model version list page. The dialog will provide information about entities that are blocking the deletion process or will be deleted along with the model version. + +- If the model version does not have any associated entities, a dialog like the one below will be displayed: +![Delete Model Version without linked entites](../../images/delete_model_version_no_entity.png) + +- If the model version has any associated active entities, a dialog like the one below (showing the entities blocking the deletion process) will be displayed: +![Delete Model Version with linked active entites](../../images/delete_model_version_active_entity.png) + +- If the model version has any associated inactive entities, a dialog like the one below (showing which entities will get deleted along with the deletion process) will be displayed: +![Delete Model Version with linked inactive entites](../../images/delete_model_version_inactive_entity.png) + +# Model Deletion + +{% hint style="info" %} +This feature is currently behind a toggle and may or may not be enabled on the Merlin controller, by the maintainers. 
+{% endhint %}
+
+A Merlin model can be deleted only if it is not serving any endpoints and does not have any deployed model versions; additionally, if the model is of the `pyfunc_v2` type, none of its model versions may have any active prediction jobs. Deleting a model will result in the purging of all the model versions associated with it, as well as related entities such as endpoints or prediction jobs (applicable for models of the `pyfunc_v2` type), from the Merlin database. This action is **irreversible**.
+
+A model with model versions that have any active prediction jobs or endpoints cannot be deleted.
+
+## Model Deletion via the SDK
+To delete a Model, you can call the `delete_model()` function from the Merlin Python SDK.
+
+```python
+merlin.set_project("test-project")
+
+merlin.set_model('test-model')
+
+model = merlin.active_model()
+
+model.delete_model()
+```
+
+## Model Deletion via the UI
+To delete a model from the UI, you can access the delete button directly on the model list page. The dialog will provide information about any entities that are blocking the deletion process.
+
+- If the model does not have any associated entities, a dialog like the one below will be displayed:
+![Delete Model without linked entities](../../images/delete_model_no_entity.png)
+
+- If the model has any associated active entities, a dialog like the one below will be displayed:
+![Delete Model with linked active entities](../../images/delete_model_active_entity.png)
diff --git a/docs/user/templates/05_configuring_alerts.md b/docs/user/templates/05_configuring_alerts.md
new file mode 100644
index 000000000..20c890d8a
--- /dev/null
+++ b/docs/user/templates/05_configuring_alerts.md
@@ -0,0 +1,21 @@
+
+# Configuring Alerts
+
+{% hint style="info" %}
+This feature is currently behind a toggle and may or may not be enabled on the Merlin controller by the maintainers.
+{% endhint %}
+
+Merlin uses a GitOps-based alerting mechanism. Alerts can be configured for a model, on the Model Endpoint (i.e., for the model version that is in the 'Serving' state), from the models list UI.
+
+![Configure Alerts on Model Endpoint](../../images/configure_alert_models_list.png)
+
+## Metrics
+
+Alerting based on the following metrics is supported. For all the metrics below, the transformer metrics, if they exist, will also be taken into account.
+* **Throughput:** This alert is triggered when the number of requests per second received by the model is lower than the threshold.
+* **Latency:** This alert is triggered when the latency of the model's response time is higher than the threshold.
+* **Error Rate:** This alert is triggered when the percentage of erroneous responses from the model is more than the threshold.
+* **CPU:** This alert is triggered when the percentage of CPU utilization is more than the threshold.
+* **Memory:** This alert is triggered when the percentage of memory utilization is more than the threshold.
+
+![Alert Configuration](../../images/configure_alert.png)
diff --git a/docs/user/templates/06_batch_prediction.md b/docs/user/templates/06_batch_prediction.md
new file mode 100644
index 000000000..fa66b5d12
--- /dev/null
+++ b/docs/user/templates/06_batch_prediction.md
@@ -0,0 +1,155 @@
+
+# Batch Prediction
+
+The batch prediction job will be executed as a Spark Application running in a Spark cluster on top of Kubernetes.
+
+## Prediction Job
+
+A Prediction Job is the resource introduced in Merlin for executing batch predictions. A Prediction Job is owned by the corresponding Model Version.
One Model Version can have several Prediction Jobs, and it maintains the history of all jobs ever created. A Prediction Job has several important properties:
+
+1. **Id**: Unique ID of the prediction job
+1. **Model / Model version**: Reference to the model version from which the prediction job is created
+1. **Config**: Contains the source, sink, and secret configuration of the prediction job. It could also contain additional config for resource requests or Spark-specific configuration
+1. **State**: Current state of the prediction job (see the lifecycle section)
+1. **Error**: Detailed error message if the prediction job is unsuccessful
+1. **Logs**: Link to the log location
+1. **Monitoring URL**: Link to the monitoring dashboard
+
+## Lifecycle
+
+A Prediction Job goes through several states during its lifetime:
+
+1. **Pending**: The prediction job is in this state once it is created / submitted. It will enter the running state if the Spark application starts successfully, otherwise it will enter the failed state.
+1. **Running**: The prediction job moves to the running state once the underlying Spark application for executing it is created. The prediction job will stay in this state until the Spark application completes (in which case it moves to the completed state) or fails (in which case it enters the failed state). Users can manually stop the prediction job, in which case it will enter the terminating state.
+1. **Completed**: The prediction job enters the completed state once it completes successfully.
+1. **Failed**: Any kind of failure that prevents the prediction job from completing will make it enter the failed state.
+1. **Terminating**: The prediction job enters the terminating state if a user manually cancels a pending/running prediction job.
+1. **Terminated**: Once the termination process is completed, the prediction job enters the terminated state.
+
+![Prediction Job Lifecycle](../../diagrams/prediction_job_lifecycle.drawio.svg)
+
+## Creating Secret/Service Account
+
+To be able to run a Prediction Job, you’ll need a service account, and you must store its key inside the MLP Project using the secret management API. The service account must have the following authorization:
+
+1. BigQuery Job User (`roles/bigquery.jobUser`) in the project where the service account is created
+1. BigQuery Read Session User (`roles/bigquery.readSessionUser`) in the project where the service account is created
+1. BigQuery Data Viewer (`roles/bigquery.dataViewer`) in the source dataset
+1. BigQuery Data Editor (`roles/bigquery.dataEditor`) in the destination dataset
+1. Storage Writer (`roles/storage.legacyBucketWriter`)
+1. Storage Object Admin (`roles/storage.objectAdmin`)
+
+## Configuring Source
+
+You can specify the source configuration of your prediction job by creating an instance of `BigQuerySource`. This class’s constructor accepts the following parameters:
+
+1. `table`: source table ID, with the format `gcp_project.dataset_name.table_name`
+1. `features`: list of features to be used for prediction; these have to match the column names in the source table.
+1. `options`: a dictionary containing additional options that can be used to customize the source. The following options are available.
+| Property | Description |
+| ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `parentProject` | The Google Cloud Project ID of the table to bill for the export. (Optional. Defaults to the project of the Service Account being used) |
+| `maxParallelism` | The maximal number of partitions to split the data into. The actual number may be less if BigQuery deems the data small enough. If there are not enough executors to schedule a reader per partition, some partitions may be empty. **Important**: The old parameter (`parallelism`) is still supported but in deprecated mode. It will be removed in version 1.0 of the connector. (Optional. Defaults to one partition per 400MB. See [Configuring Partitioning](https://github.com/GoogleCloudDataproc/spark-bigquery-connector#configuring-partitioning).) |
+| `viewsEnabled` | Enables the connector to read from views and not only tables. Please read the [relevant section](https://github.com/GoogleCloudDataproc/spark-bigquery-connector#reading-from-views) before activating this option. (Optional. Defaults to `false`) |
+| `viewMaterializationProject` | The project id where the materialized view is going to be created. (Optional. Defaults to the view's project id) |
+| `viewMaterializationDataset` | The dataset where the materialized view is going to be created. (Optional. Defaults to the view's dataset) |
+| `readDataFormat` | Data format for reading from BigQuery. Options: `ARROW`, `AVRO`. Unsupported Arrow filters are not pushed down and results are filtered later by Spark. (Currently Arrow does not support disjunction across columns.) (Optional. Defaults to `AVRO`) |
+| `optimizedEmptyProjection` | The connector uses an optimized empty projection (select without any columns) logic, used for count() execution. This logic takes the data directly from the table metadata or performs a much more efficient `SELECT COUNT(*) WHERE...` in case there is a filter. You can cancel the use of this logic by setting this option to `false`. (Optional. Defaults to `true`) |
+
+Source: https://github.com/GoogleCloudDataproc/spark-bigquery-connector
+
+### Reading from View
+
+To use a view as the data source instead of a table, you’ll have to set `viewsEnabled` to `true` and specify `viewMaterializationProject` and `viewMaterializationDataset`. Since the materialization of a view creates a table, the service account should also have `roles/bigquery.dataEditor` on the specified dataset. Below is an example:
+
+{% code title="bq_source.py" overflow="wrap" lineNumbers="false" %}
+```python
+bq_source = BigQuerySource("project.dataset.table_iris",
+                           features=["sepal_length", "sepal_width", "petal_length", "petal_width"],
+                           options={
+                               "viewsEnabled" : "true",
+                               "viewMaterializationProject" : "project",
+                               "viewMaterializationDataset" : "dsp"
+                           })
+```
+{% endcode %}
+
+## Configuring Sink
+
+To configure the destination of a prediction job, you can create an instance of `BigQuerySink`. The class accepts the following parameters:
+
+1. `table`: destination table ID, with the format `gcp_project.dataset_name.table_name`
+1. `staging_bucket`: GCS staging bucket that will be used as temporary storage for storing the prediction results before loading them into the destination table.
+1. `result_column`: Column name in the destination table that will be used to store the prediction result. Note that it has to be a single string and not a list of strings, even if you specify ARRAY as the result type.
+1. `save_mode`: SaveMode is used to specify the expected behavior of saving the prediction result into the destination table. The following are the possible values:
+   - `ERRORIFEXISTS`: it will throw an error if the destination table already exists (default).
+   - `OVERWRITE`: it will overwrite the destination table if it exists.
+   - `APPEND`: it will append the new result to the destination table if it exists.
+   - `IGNORE`: it will not write the prediction result if the destination table exists.
+1. `options`: Dictionary of strings that can be used to specify additional configuration. The following parameters are available.
+
+| Property | Description |
+| ----------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `createDisposition` | Specifies whether the job is allowed to create new tables. The permitted values are:
  1. `CREATE_IF_NEEDED` - Configures the job to create the table if it does not exist.
  2. `CREATE_NEVER` - Configures the job to fail if the table does not exist.
This option applies only when Spark decides to write data to the table, based on the SaveMode. (Optional. Defaults to `CREATE_IF_NEEDED`). |
+| `intermediateFormat` | The format of the data before it is loaded into BigQuery; values can be either "parquet" or "orc". (Optional. Defaults to `parquet`). On write only. |
+| `partitionField` | If not set, the table is partitioned by a pseudo column, referenced via either `'_PARTITIONTIME' as TIMESTAMP` type, or `'_PARTITIONDATE' as DATE` type. If the field is specified, the table is instead partitioned by this field. The field must be a top-level TIMESTAMP or DATE field. Its mode must be **NULLABLE** or **REQUIRED**. (Optional). |
+| `partitionExpirationMs` | Number of milliseconds for which to keep the storage for partitions in the table. The storage in a partition will have an expiration time of its partition time plus this value. (Optional). |
+| `partitionType` | The only type supported is `DAY`, which will generate one partition per day. (Optional. Defaults to `DAY`). |
+| `clusteredFields` | Comma-separated list of non-repeated, top-level columns. Clustering is only supported for partitioned tables. (Optional). |
+| `allowFieldAddition` | Adds the [ALLOW_FIELD_ADDITION](https://googleapis.dev/java/google-cloud-clients/latest/com/google/cloud/bigquery/JobInfo.SchemaUpdateOption.html#ALLOW_FIELD_ADDITION) SchemaUpdateOption to the BigQuery LoadJob. Allowed values are `true` and `false`. (Optional. Defaults to `false`). |
+| `allowFieldRelaxation` | Adds the ALLOW_FIELD_RELAXATION SchemaUpdateOption to the BigQuery LoadJob. Allowed values are `true` and `false`. (Optional. Defaults to `false`). |
+
+Source: https://github.com/GoogleCloudDataproc/spark-bigquery-connector
+
+## Configuring Resource Request
+
+The `PredictionJobResourceRequest` class is used to configure the resource requests for running a prediction job. The following parameters can be configured:
+
+1. `driver_cpu_request`: Driver CPU request, e.g. 1, 1500m, 500m
+1. `driver_memory_request`: Driver memory request, e.g. 1Gi, 512Mi
+1. `executor_cpu_request`: Executor CPU request, e.g. 1, 1500m, 500m
+1. `executor_memory_request`: Executor memory request, e.g. 1Gi, 512Mi
+1. `executor_replica`: Number of executor replicas, e.g. 1, 2
+
+Without specifying `PredictionJobResourceRequest`, the prediction job will run with the following system defaults:
+
+```
+executor_replica: 3
+driver_cpu_request: "2"
+driver_memory_request: "2Gi"
+executor_cpu_request: "2"
+executor_memory_request: "2Gi"
+```
+
+This default configuration is good enough for most cases. However, it might not be sufficient for cases where the model size is large, the dataset is a wide table (a lot of columns), or the processing requires a lot of memory. In such cases, you might want to increase `executor_memory_request` to a larger value. The best value can be determined by observing the memory usage of the executor in the monitoring dashboard.
+
+You might also want to make the prediction job complete faster by increasing `executor_cpu_request` and `executor_replica`. However, **this will increase the cost significantly**. A sketch of how these pieces fit together is shown below.
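+
+The snippet below is a minimal, illustrative sketch of wiring a source, a sink, and a resource request into a single job submission from an existing model version. It assumes that `BigQuerySource`, `BigQuerySink`, `SaveMode`, `PredictionJobConfig`, and `PredictionJobResourceRequest` live under the SDK's `merlin.batch` modules and that a model version exposes `create_prediction_job()`; these import paths and the submission call are assumptions, not taken from this document, so check them against the version of the SDK you are using.
+
+{% code title="prediction_job_sketch.py" overflow="wrap" lineNumbers="true" %}
+```python
+# Assumed import paths; verify against your version of the Merlin SDK.
+from merlin.batch.config import PredictionJobConfig, PredictionJobResourceRequest
+from merlin.batch.sink import BigQuerySink, SaveMode
+from merlin.batch.source import BigQuerySource
+
+# Source: the table and feature columns to run predictions on
+bq_source = BigQuerySource(
+    "project.dataset.table_iris",
+    features=["sepal_length", "sepal_width", "petal_length", "petal_width"])
+
+# Sink: destination table, staging bucket, and write behavior
+bq_sink = BigQuerySink(
+    "project.dataset.table_iris_result",
+    staging_bucket="my-staging-bucket",          # hypothetical bucket name
+    result_column="prediction",
+    save_mode=SaveMode.OVERWRITE)
+
+# Resource request: bump executor memory for a wide table / large model
+resource_request = PredictionJobResourceRequest(
+    driver_cpu_request="2",
+    driver_memory_request="2Gi",
+    executor_cpu_request="2",
+    executor_memory_request="4Gi",
+    executor_replica=5)
+
+job_config = PredictionJobConfig(
+    source=bq_source,
+    sink=bq_sink,
+    service_account_name="bq-service-account",   # secret name registered in the MLP project
+    resource_request=resource_request)
+
+# Submit the prediction job from a model version obtained earlier,
+# e.g. version = merlin.active_model().get_version(1)
+job = version.create_prediction_job(job_config=job_config)
+```
+{% endcode %}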
+
+## Known Issues
+
+### Type Conversion Error When BQ Source Has Date Column
+
+#### Symptom
+
+The following error is thrown during batch prediction execution:
+
+```
+> raise AttributeError("Can only use .dt accessor with datetimelike " + "values")
+E AttributeError: Can only use .dt accessor with datetimelike values
+
+../../../.local/share/virtualenvs/merlin-pyspark-n4ybPFnE/lib/python3.8/site-packages/pandas/core/indexes/accessors.py:324: AttributeError
+
+Assertion failed
+```
+
+Check whether your BQ source table has a DATE type column. If so, the workaround below might help.
+
+#### Root Cause
+
+https://issues.apache.org/jira/browse/SPARK-30961
+
+#### Workaround
+
+Add `pyarrow==0.11.1` and `pandas==0.24.1` to the conda `environment.yaml` of your model.
\ No newline at end of file
diff --git a/docs/user/templates/07_examples.md b/docs/user/templates/07_examples.md
new file mode 100644
index 000000000..0ec9f1584
--- /dev/null
+++ b/docs/user/templates/07_examples.md
@@ -0,0 +1,15 @@
+
+# Examples
+
+Examples of using Merlin for different purposes are available to be tried out as Jupyter notebooks in the links below.
+You may want to clone the examples to your local directory and run them using Jupyter notebook.
+
+{% page-ref page="./examples/01_standard_model.md" %}
+
+{% page-ref page="./examples/02_pyfunc_model.md" %}
+
+{% page-ref page="./examples/03_transformer.md" %}
+
+{% page-ref page="./examples/04_batch_prediction.md" %}
+
+{% page-ref page="./examples/05_others.md" %}
\ No newline at end of file
diff --git a/docs/user/templates/08_limitations.md b/docs/user/templates/08_limitations.md
new file mode 100644
index 000000000..3f906da28
--- /dev/null
+++ b/docs/user/templates/08_limitations.md
@@ -0,0 +1,51 @@
+
+# Limitations
+
+This article is an aggregation of the limits imposed on various components of the Merlin platform.
+
+## Project
+
+### Project Name
+
+A project name can only contain letters `a-z` (lowercase), numbers `0-9` and the dash `-` symbol. The maximum length of a project name is `50` characters.
+
+An example of a valid project name would be `gojek-project-01`.
+
+## Model
+
+### Model Name
+
+A model name can only contain letters `a-z` (lowercase), numbers `0-9` and the dash `-` symbol. The maximum length of a model name is `25` characters.
+
+An example of a valid model name would be `gojek-model-01`.
+
+### Model Deployment
+
+The maximum number of model versions that can be deployed in an environment is `2` per model.
+
+### Resources
+
+The maximum number of CPU cores that can be allocated to a model is `4`.
+
+The maximum amount of memory that can be allocated to a model is `8GB`.
+
+## Autoscaling Policy
+
+### Autoscaling
+
+Autoscaling is enabled for both the staging and production environments. Users can set the minimum and maximum number of replicas during deployment.
+
+### Scale Down to Zero
+
+"Scaling down to zero" is a feature in Merlin which automatically reduces a model deployment's replicas to zero when it hasn't received any traffic for 10 minutes. To make the model available again, it must receive HTTP traffic, which triggers a scale-up.
+
+This feature is only applicable when your autoscaling policy is set to either `RPS` or `Concurrency`.
+
+Note that, to utilise this feature, the minimum replicas for the deployment should be set to `0`.
+
+## Logs
+
+### Log History
+
+Users can only view the logs that are still in the model’s container. A link to the associated Stackdriver dashboard is provided on the log page to access past logs.
\ No newline at end of file diff --git a/docs/user/templates/examples/01_standard_model.md b/docs/user/templates/examples/01_standard_model.md new file mode 100644 index 000000000..da6ec5f1d --- /dev/null +++ b/docs/user/templates/examples/01_standard_model.md @@ -0,0 +1,21 @@ + + +# Deploy Standard Models + +Try out the notebooks below to learn how to deploy different types of Standard Models to Merlin. + +## Deploy SKLearn Model + +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/sklearn/SKLearn.ipynb" %} + +## Deploy XGBoost Model + +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/xgboost/XGBoost.ipynb" %} + +## Deploy Tensorflow Model + +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/tensorflow/Tensorflow.ipynb" %} + +## Deploy Pytorch Model + +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/pytorch/Pytorch.ipynb" %} \ No newline at end of file diff --git a/docs/user/templates/examples/02_pyfunc_model.md b/docs/user/templates/examples/02_pyfunc_model.md new file mode 100644 index 000000000..1cafb38dd --- /dev/null +++ b/docs/user/templates/examples/02_pyfunc_model.md @@ -0,0 +1,26 @@ + + +# Deploy PyFunc Model + +Try out the notebooks below to learn how to deploy PyFunc Models to Merlin. + +**Note on compatibility**: The Pyfunc servers are compatible with `protobuf>=3.12.0,<5.0.0`. Users whose models have a strong dependency on Protobuf `3.x.x` are advised to pin the library version in their conda environment, when submitting the model version. If using Protobuf `3.x.x`, users can do one of the following: +* Use `protobuf>=3.20.0` - these versions support simplified class definitions and this is the recommended approach. +* If you must use `protobuf>=3.12.0,<3.20.0`, other packages used in the Pyfunc server need to be downgraded as well. Please pin the following in your model’s conda environment: +```yaml +dependencies: + - pip: + - protobuf==3.15.6 # Example older protobuf version + - caraml-upi-protos<=0.3.6 + - grpcio<1.49.0 + - grpcio-reflection<1.49.0 + - grpcio-health-checking<1.49.0 +``` + +## Deploy PyFunc Model + +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/pyfunc/Pyfunc.ipynb" %} + +## Deploy PyFunc Model with Custom Prometheus Metrics + +{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/metrics/Metrics.ipynb" %} \ No newline at end of file diff --git a/docs/user/templates/examples/03_transformer.md b/docs/user/templates/examples/03_transformer.md new file mode 100644 index 000000000..6183b724e --- /dev/null +++ b/docs/user/templates/examples/03_transformer.md @@ -0,0 +1,21 @@ + + +# Using Transformers + +Try out the notebooks below to learn how to deploy models with each type of transformers in Merlin. 
## Deploy PyFunc Model with Standard Transformer

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/standard-transformer/Standard-Transformer.ipynb" %}

## Deploy PyFunc Model with Custom Transformer

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/custom-transformer/PyFunc-Transformer.ipynb" %}

## Deploy PyTorch Model with Custom Transformer

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/custom-transformer/PyTorch-Transformer.ipynb" %}

## Deploy PyFunc Model with Feast Enricher Transformer

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/transformer/feast-enricher-transformer/Feast-Enricher.ipynb" %}
diff --git a/docs/user/templates/examples/04_batch_prediction.md b/docs/user/templates/examples/04_batch_prediction.md
new file mode 100644
index 000000000..70e6caf60
--- /dev/null
+++ b/docs/user/templates/examples/04_batch_prediction.md
@@ -0,0 +1,13 @@

# Run Batch Prediction Job

Try out the notebooks below to learn how to run batch prediction jobs using PyFunc V2 in Merlin.

## Run Iris Classifier Batch Prediction Job

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/batch/BatchPredictionTutorial1-IrisClassifier.ipynb" %}

## Run New York Taxi Fare Batch Prediction Job

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/batch/BatchPredictionTutorial2-NewYorkTaxi.ipynb" %}
\ No newline at end of file
diff --git a/docs/user/templates/examples/05_others.md b/docs/user/templates/examples/05_others.md
new file mode 100644
index 000000000..742b58da6
--- /dev/null
+++ b/docs/user/templates/examples/05_others.md
@@ -0,0 +1,13 @@

# Others

Try out the notebooks below to learn about other features of Merlin.

## Requesting CPU and Memory Resources

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/resource-request/Resource-Request.ipynb" %}

## Working with Model Endpoint

{% embed url="https://github.com/caraml-dev/merlin/blob/main/examples/model-endpoint/ModelEndpoint.ipynb" %}
diff --git a/docs/user/templates/model_deployment/01_deploying_a_model_version.md b/docs/user/templates/model_deployment/01_deploying_a_model_version.md
new file mode 100644
index 000000000..026592e86
--- /dev/null
+++ b/docs/user/templates/model_deployment/01_deploying_a_model_version.md
@@ -0,0 +1,167 @@

# Model Version Endpoint

To start sending inference requests to a model version, it must first be deployed. During deployment, different configurations can be chosen, such as the number of replicas, CPU/memory requests, autoscaling policy, environment variables, etc. The set of these configurations that is used to deploy a model version is called a *deployment*.

A model may have any number of versions. But, at any given time, only a maximum of **2** model versions can be deployed.

When a model version is deployed, a Model Version Endpoint is created. The URL is of the following format:

```
http://<model_name>-<version>.<project_name>.<base_domain>
```

For example, a Model named `my-model` within a Project named `my-project`, with the base domain `{{ models_base_domain }}`, will have a Model Version Endpoint for version `1` as follows:

```
http://my-model-1.my-project.{{ models_base_domain }}
```
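For illustration only: assuming a standard model that exposes the common `:predict` HTTP verb, sending a request to this Model Version Endpoint could look like the sketch below (the path and request body are assumptions, not guaranteed by Merlin for every model type):

```bash
# Hypothetical request to a deployed model version endpoint
curl -X POST "http://my-model-1.my-project.{{ models_base_domain }}/v1/models/my-model-1:predict" \
  -H "Content-Type: application/json" \
  -d '{"instances": [[1.1, 2.2, 3.3, 4.4]]}'
```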
A Model Version Endpoint has several states:

- **pending**: The initial state of a Model Version Endpoint.
- **running**: Once deployed, a Model Version Endpoint is in running state and is accessible.
- **serving**: A Model Version Endpoint is in serving state if a Model Endpoint is created from it.
- **terminated**: Once undeployed, a Model Version Endpoint is in terminated state.
- **failed**: If an error occurred during deployment.

## Image Building

Depending on the type of the model being deployed, there may be an intermediate step to build the Docker image (using Kaniko). This is applicable to PyFunc models.

## Deploying a Model Version

A model version can be deployed via the SDK or the UI.

### Deploying a Model Version via SDK

Here's an example of deploying a Model Version Endpoint using the Merlin Python SDK:

{% code title="model_version_deployment.py" overflow="wrap" lineNumbers="true" %}
```python
with merlin.new_model_version() as v:
    merlin.log_metric("metric", 0.1)
    merlin.log_param("param", "value")
    merlin.set_tag("tag", "value")

    merlin.log_model(model_dir='tensorflow-sample')

    merlin.deploy(v, environment_name="staging")
```
{% endcode %}

### Deploying a Model Version via UI

The Deploy option can be selected from the model versions view.

![Deploy a Model Version](../../../images/deploy_model_version.png)

## Deployment Modes

Merlin supports 2 deployment modes: `SERVERLESS` and `RAW_DEPLOYMENT`. Under the hood, `SERVERLESS` deployment uses Knative as the serving stack, while `RAW_DEPLOYMENT` uses native [Kubernetes deployment resources](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/).

The deployment modes supported by Merlin have their own advantages and disadvantages, listed below.

* **Serverless Deployment:**
  - **Pros:** Supports more advanced autoscaling policies (RPS, Concurrency); supports scale down to zero.
  - **Cons:** Slower compared to `RAW_DEPLOYMENT` due to infrastructure overhead.
* **Raw Deployment:**
  - **Pros:** Relatively faster compared to `SERVERLESS` deployments; less infrastructure overhead and more cost-efficient.
  - **Cons:** Supports only autoscaling based on CPU usage.

### Configuring Deployment Modes

Users are able to configure the deployment mode of their model via the SDK or the UI.

#### Configuring Deployment Mode via SDK

The example below configures the deployment mode to use `RAW_DEPLOYMENT`:

{% code title="deployment_configuration.py" overflow="wrap" lineNumbers="true" %}
```python
import merlin
from merlin import DeploymentMode
from merlin.model import ModelType

# Connect to Merlin and log a new model version
merlin.set_url("{{ merlin_url }}")
merlin.set_project("my-project")
merlin.set_model("my-model", ModelType.TENSORFLOW)
model_dir = "test/tensorflow-sample"

with merlin.new_model_version() as v:
    merlin.log_model(model_dir=model_dir)

# Deploy using raw_deployment
new_endpoint = merlin.deploy(v, deployment_mode=DeploymentMode.RAW_DEPLOYMENT)
```
{% endcode %}

#### Configuring Deployment Mode via UI

![Deployment Mode](../../../images/deployment_mode.png)

## Autoscaling Policy

Merlin supports a configurable autoscaling policy to ensure that users have complete control over the autoscaling behavior of their models. There are 4 types of autoscaling metrics in Merlin:

* **CPU Utilization:** The autoscaling is based on the ratio of the model service's CPU usage and its CPU request. This autoscaling policy is available in all deployment modes.
* **Memory Utilization:** The autoscaling is based on the ratio of the model service's memory usage and its memory request. This autoscaling policy is available only in the `SERVERLESS` deployment mode.
* **Model Throughput (RPS):** The autoscaling is based on the RPS per replica of the model service. This autoscaling policy is available only in the `SERVERLESS` deployment mode.
* **Concurrency:** The autoscaling is based on the number of concurrent requests served by a replica of the model service. This autoscaling policy is available only in the `SERVERLESS` deployment mode.

### Configuring Autoscaling Policy

Users can update the autoscaling policy via the SDK or the UI.

#### Configuring Autoscaling Policy via SDK

Below is an example of configuring the autoscaling policy of a `SERVERLESS` deployment to use the `RPS` metric.

{% code title="autoscaling_policy.py" overflow="wrap" lineNumbers="true" %}
```python
import merlin
from merlin import DeploymentMode
from merlin.model import ModelType

# Connect to Merlin and log a new model version
merlin.set_url("{{ merlin_url }}")
merlin.set_project("my-project")
merlin.set_model("my-model", ModelType.TENSORFLOW)
model_dir = "test/tensorflow-sample"

with merlin.new_model_version() as v:
    merlin.log_model(model_dir=model_dir)

# Deploy serverlessly with an RPS-based autoscaling policy
endpoint = merlin.deploy(v, deployment_mode=DeploymentMode.SERVERLESS,
                         autoscaling_policy=merlin.AutoscalingPolicy(
                             metrics_type=merlin.MetricsType.RPS,
                             target_value=20))
```
{% endcode %}

#### Configuring Autoscaling Policy via UI

![Autoscaling Policy](../../../images/autoscaling_policy.png)

## Liveness Probe

When deploying a model version, the model container will be built with a liveness probe by default. The liveness probe will periodically check that your model is still alive, and restart the pod automatically if it is deemed to be dead.

However, should you wish to disable this probe, you may do so by providing an environment variable to the model service with the following value:

```
MERLIN_DISABLE_LIVENESS_PROBE="true"
```

This can be supplied via the deploy function, i.e.:

{% code title="liveness_probe.py" overflow="wrap" lineNumbers="false" %}
```python
merlin.deploy(v, env_vars={"MERLIN_DISABLE_LIVENESS_PROBE": "true"})
```
{% endcode %}

The liveness probe is also available for the transformer. More details can be found at: {% page-ref page="./transformer/standard_transformer/01_standard_transformer_expressions.md" %}
\ No newline at end of file
diff --git a/docs/user/templates/model_deployment/02_serving_a_model_version.md b/docs/user/templates/model_deployment/02_serving_a_model_version.md
new file mode 100644
index 000000000..4411ee01b
--- /dev/null
+++ b/docs/user/templates/model_deployment/02_serving_a_model_version.md
@@ -0,0 +1,47 @@

# Model Endpoint

Model serving is the next step of model deployment. After deploying a model version, we can optionally start serving it. This creates a Model Endpoint, which is a stable URL associated with a model, of the following format:

```
http://<model_name>.<project_name>.<base_domain>
```

For example, a Model named `my-model` within a Project named `my-project`, with the base domain `{{ models_base_domain }}`, will have a Model Endpoint which looks as follows:

```
http://my-model.my-project.{{ models_base_domain }}
```

Having a Model Endpoint makes it easy to keep updating the model (creating a new model version, running it and then serving it) without having to modify the model URL used by the calling system.

## Serving a Model Version

A model version can be served via the SDK or the UI.
### Serving a Model Version via SDK

To serve a model version, you can call the `serve_traffic()` function from the Merlin Python SDK.

{% code title="model_version_serving.py" overflow="wrap" lineNumbers="true" %}
```python
with merlin.new_model_version() as v:
    merlin.log_metric("metric", 0.1)
    merlin.log_param("param", "value")
    merlin.set_tag("tag", "value")

    merlin.log_model(model_dir='tensorflow-sample')

    version_endpoint = merlin.deploy(v, environment_name="staging")

# serve 100% of the traffic at the endpoint
model_endpoint = merlin.serve_traffic({version_endpoint: 100})
```
{% endcode %}

### Serving a Model Version via UI

Once a model version is deployed (i.e., it is in the Running state), the Serve option can be selected from the model versions view.

![Serve Model Version](../../../images/serve_model_version.png)
\ No newline at end of file
diff --git a/docs/user/templates/model_deployment/03_configuring_transformers.md b/docs/user/templates/model_deployment/03_configuring_transformers.md
new file mode 100644
index 000000000..322f84bdf
--- /dev/null
+++ b/docs/user/templates/model_deployment/03_configuring_transformers.md
@@ -0,0 +1,11 @@

# Transformer

In the Merlin ecosystem, a Transformer is a service deployed in front of the model service which users can use to perform pre-processing / post-processing steps on the incoming request / outgoing response, to / from the model service. A Transformer allows the user to abstract the transformation logic outside of their model and even write it in a language more performant than Python.

Currently, Merlin supports two types of Transformer: Standard and Custom:

{% page-ref page="./transformer/01_standard_transformer.md" %}

{% page-ref page="./transformer/02_custom_transformer.md" %}
\ No newline at end of file
diff --git a/docs/user/templates/model_deployment/04_redeploying_a_model_version.md b/docs/user/templates/model_deployment/04_redeploying_a_model_version.md
new file mode 100644
index 000000000..85b41f22d
--- /dev/null
+++ b/docs/user/templates/model_deployment/04_redeploying_a_model_version.md
@@ -0,0 +1,38 @@

# Redeploying a Model Version

Once a deployment of a model version is attempted, a Model Version Endpoint is created. If the deployment is successful, the endpoint will be in the Running state. This endpoint can later also be terminated.

Whenever a running (or serving) model version is redeployed, a new *deployment* is created and the Merlin API server attempts to deploy it, while keeping the existing deployment running (or serving).

If the deployment of the new configuration fails, **the old deployment stays deployed** and remains as the current *deployment* of the model version. The new configuration will then show a 'Failed' status.

![Unsuccessful Model Version Redeployment](../../../images/redeploy_model_unsuccessful.png)

A model version can be redeployed via the SDK or the UI.
### Redeploying a Model Version via SDK

{% code title="model_version_redeployment.py" overflow="wrap" lineNumbers="true" %}
```python
import merlin
from merlin import DeploymentMode

# Get a model version that's already deployed
merlin.set_url("{{ merlin_url }}")
merlin.set_project("my-project")
merlin.set_model("my-model")
model = merlin.active_model()
version = model.get_version(2)

# Redeploy using a new config (here, we are updating the deployment mode)
new_endpoint = merlin.deploy(version, deployment_mode=DeploymentMode.RAW_DEPLOYMENT)
```
{% endcode %}

### Redeploying a Model Version via UI

A Running / Serving model version can be redeployed from the model versions view.

![Redeploy Model Version](../../../images/redeploy_model_version.png)
\ No newline at end of file
diff --git a/docs/user/templates/model_deployment/transformer/01_standard_transformer.md b/docs/user/templates/model_deployment/transformer/01_standard_transformer.md
new file mode 100644
index 000000000..5886ad76c
--- /dev/null
+++ b/docs/user/templates/model_deployment/transformer/01_standard_transformer.md
@@ -0,0 +1,1145 @@

# Standard Transformer

The Standard Transformer provides built-in pre- and post-processing steps supported by Merlin. With the standard transformer, it’s possible to enrich the model’s incoming request with features from Feast and transform the payload so that it’s compatible with the API interface provided by the model. The same transformation can also be applied against the model’s response payload in the post-processing step, which allows users to adapt the response payload to make it suitable for consumption. The standard transformer supports the **http_json** and **upi_v1** protocols. For the **http_json** protocol, the standard transformer runs a REST server on top of HTTP 1.1; for the **upi_v1** protocol, it runs a gRPC server.

## Concept

Within the standard transformer there are 2 processes that users can specify: preprocess and postprocess.

Preprocessing is useful for performing transformations against the model’s incoming request, such as enriching the request with features from Feast and transforming the client’s request to a format accepted by the model service.

Postprocessing is useful for performing transformations against the model response so that it is more suitable for client consumption.

Within both preprocess and postprocess, there are 3 stages that users can specify:

* Input stage
  In the input stage, users specify all the data dependencies that are going to be used in subsequent stages. There are 2 operations available in these stages: variable declaration and table creation.

* Transformation stage
  In this stage, the standard transformer performs transformations on the tables created in the input stage so that their structure is suitable for the output. In the transformation stage, users operate mainly on tables and are provided with 2 transformation types: single table transformation and table join.

* Output stage
  At this stage, both the preprocessing and postprocessing pipelines should create an output payload, which can later be used as the request payload for the model predictor or as the final response to be returned to the downstream service/client. There are 3 types of output operations:
  * JSON Output. The JSON output operation will return a JSON output; this operation is only applicable for the **http_json** protocol.
  * UPIPreprocessOutput. UPIPreprocessOutput will return a UPI Request interface payload in a protobuf.Message type.
  * UPIPostprocessOutput.
    UPIPostprocessOutput will return a UPI Response interface payload in a protobuf.Message type.

![Standard Transformer](../../../../images/standard_transformer.png)

## Jsonpath

JSONPath is a way to find values in a JSON payload. The standard transformer uses JSONPath to find values either in the request or in the model response payload. The standard transformer uses JSONPath in several operations:
* Variable declaration
* Feast entity value
* Base table
* Column value in table
* Json Output

Most of the JSONPath configurations look like this:
```
fromJson:

  jsonPath:     # JSON path in the incoming request / model response payload

  defaultValue: # (Optional) Default value if the value for the jsonPath is nil or empty

  valueType:    # Type of the default value, mandatory to specify if a default value exists
```

but in some operations, like variable declaration and Feast entity extraction, the jsonPath configuration looks like this:

```
  jsonPathConfig:

    jsonPath:     # JSON path in the incoming request / model response payload

    defaultValue: # (Optional) Default value if the value for the jsonPath is nil or empty

    valueType:    # Type of the default value, mandatory to specify if a default value exists
```

### Default Value

In the standard transformer, users can specify a JSONPath with a default value to be used if the result of the JSONPath is empty or nil. The default value is used in the following cases:
* The result of the JSONPath is nil
* The result of the JSONPath is an empty array
* The result of the JSONPath is an array where some of its values are null

Value Type

|Value Type| Syntax |
|----------| ------------|
| Integer | INT |
| Float | FLOAT |
| Boolean | BOOL |
| String | STRING |

For example, given the following incoming request:
```
{
  "signature_name" : "predict",
  "instances": [
    {"sepal_length":2.8, "sepal_width":1.0, "petal_length":6.8, "petal_width":0.4},
    {"sepal_length":0.1, "sepal_width":0.5, "petal_length":1.8, "petal_width":2.4}
  ],
  "instances_with_null": [
    {"sepal_length":2.8, "sepal_width":1.0, "petal_length":6.8, "petal_width":0.4},
    {"sepal_length":0.1, "sepal_width":0.5, "petal_length":1.8, "petal_width":null},
    {"sepal_length":0.1, "sepal_width":0.5, "petal_length":1.8, "petal_width":0.5}
  ],
  "empty_array": [],
  "null_key": null,
  "array_object": [
    {"exist_key":1},
    {"exist_key":2}
  ]
}
```

* The result of the JSONPath is nil
  There are cases when the `jsonpath` value is nil:
  * The value in the JSON is nil
  * There is no such key in the JSON

  Example:
  ```
  fromJson:
    jsonPath: $.null_key
    defaultValue: -1
    valueType: INT
  ```
  The result of the above JSONPath is `-1` because `$.null_key` returns nil.
* The result of the JSONPath is an empty array
  ```
  fromJson:
    jsonPath: $.empty_array
    defaultValue: 0.0
    valueType: FLOAT
  ```
  The result of the above JSONPath is `[0.0]` because `$.empty_array` returns an empty array, so the default value is used.
* The result of the JSONPath is an array where some of its values are null
  ```
  fromJson:
    jsonPath: $.instances_with_null[*].petal_width
    defaultValue: -1
    valueType: INT
  ```
  The result of the above JSONPath is `[0.4,-1,0.5]`: because the original JSONPath result `[0.4,null,0.5]` contains a null value, the default value is used to replace the `null` value.

## Expression

An expression is a single line of code which returns a value. The standard transformer uses expressions as a flexible way of calculating values to be used in variable initialization or any other operations.
For example:

An expression can be used for initialising a variable value:

```
variables:
  - currentTime:
      expression: now()
  - currentHour:
      expression: currentTime.Hour()
```

An expression can be used for updating a column value:
```
  updateColumns:
    column: "s2id"
    expression: getS2ID(df.Col('lat'), df.Col('lon'))
```

For the full list of standard transformer built-in functions, please check: {% page-ref page="./standard_transformer/01_standard_transformer_expressions.md" %}

## Input Stage

At the input stage, users specify all the data dependencies that are going to be used in subsequent stages. There are 4 operations available in this stage:

1. Table creation
   - Table Creation from Feast Features
   - Table Creation from Input Request
   - Table Creation from File

2. Variable declaration

3. Encoder declaration

4. Autoload

### Table Creation

Tables are the main data structure within the standard transformer. There are 3 ways of creating a table in the standard transformer:

#### Table Creation from Feast Features

This operation creates one or more tables containing features from Feast. This operation has been supported since Merlin 0.10. The key change made since then is in the result of the operation: previously, the features retrieved from Feast were directly enriched into the original request body to be sent to the model; now, the operation only outputs an internal table representation which is accessible to subsequent transformation steps in the pipeline.

Additionally, users can give the features table a name to ease referencing the table from subsequent steps.

Following is the syntax:

  ```
  feast:
    - tableName: # Specify the output table name

      project: # Name of the project in Feast where the features are located

      source: # Source for Feast (REDIS or BIGTABLE)

      entities: # List of entities

        - name: # Entity Id

          valueType: # Entity Value Type

          # The entity value will be retrieved either using the jsonPath or expression configuration below:
          jsonPathConfig:

            jsonPath: # JSON path in the incoming request containing the entity value

            defaultValue: # (Optional) Default value if the value for the jsonPath is nil or empty

            valueType: # Type of the default value, mandatory to specify if a default value exists

          jsonPath: # JSON path in the incoming request containing the entity value (Deprecated)

          expression: # Expression provided by the user which returns entity values

      features: # List of features to be retrieved

        - name: # feature name

          defaultValue: # default value if the feature is not available
  ```

Below is a sample of the Feast input:

  ```
  feast:
    - tableName: table_1
      project: sample
      source: BIGTABLE
      entities:
        - name: merchant_uuid
          valueType: STRING
          jsonPathConfig:
            jsonPath: $.merchant_uuid
            defaultValue: -1
            valueType: INT
        - name: customer_id
          valueType: STRING
          expression: customer_id
      features:
        - name: sample_driver:order_count
          valueType: DOUBLE
          defaultValue: '90909'
  ```

There are two ways to retrieve features from Feast in the Merlin standard transformer:
* Getting the feature values from the Feast gRPC URL
* By directly querying from Feast storage (Bigtable or Redis). For this, you need to add extra environment variables to the standard transformer (a deployment sketch follows this list):
  * REDIS: set the `FEAST_REDIS_DIRECT_STORAGE_ENABLED` value to true
  * BIGTABLE: set the `FEAST_BIGTABLE_DIRECT_STORAGE_ENABLED` value to true
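As a minimal sketch of how such an environment variable might be supplied when deploying via the Merlin SDK (the `env_vars` argument and its value here are assumptions for illustration, not a confirmed API contract):

```python
from merlin.transformer import StandardTransformer

# Hypothetical sketch: enable direct Redis feature retrieval for the
# standard transformer by passing the environment variable at deploy time.
transformer = StandardTransformer(config_file="transformer_config.yaml",
                                  enabled=True,
                                  env_vars={"FEAST_REDIS_DIRECT_STORAGE_ENABLED": "true"})
```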
For a detailed explanation of the environment variables in the standard transformer, you can look at [this section](#standard-transformer-environment-variables).

#### Table Creation from Input Request

This is a generic table creation step that allows users to define one or more tables based on values from either the JSON payload, the result of built-in expressions, or an existing table. Following is the syntax for the table input:

  ```
  tables:

    - name: # Table name

      baseTable: # create a base table either from a JSON array of objects or from an existing table

        fromJson: # create a table based on an array of objects within the JSON payload; the object keys will be the column names.

          jsonPath: # JSONPath to the array of objects in the JSON payload

          defaultValue: # Fallback value if the value for the jsonpath is nil or empty

          addRowNumber: # True/false, add a column called "row_number" which contains the row number

        fromTable: # Create the base table from an existing table

          tableName: # Source table name

      columns: # List of columns to be added to the table; it's possible to have 0 columns. The columns will override existing columns defined in the baseTable

      # The number of rows in the first column determines the table size, so put the longest column first

        - name: # Column name

          fromJson: # Get column values from a JSON path

            jsonPath: # JSONPath to the array of objects in the JSON payload

            defaultValue: # Fallback value if the value for the jsonpath is nil or empty

          expression: # Assign the result of an expression to the column value

  ```

Sample:

  ```
  - tables:
      - name: table_2
        baseTable:
          fromTable:
            tableName: table_1
        columns:
          - name: col_1
            fromJson:
              jsonPath: $.drivers[*].id
          - name: col_2
            expression: table.Col('rating')
      - name: table_3
        baseTable:
          fromJson:
            jsonPath: $.drivers[*]
        columns:
          - name: col_1
            fromJson:
              jsonPath: $.drivers[*].id
          - name: col_2
            expression: table.Col('rating')
  ```

#### Table Creation from File

This operation allows users to create a static table from a file. For example, a user might choose to load a table with a list of public holidays for the year. As the data will be loaded into memory, it is strongly advised to keep the total size of all files within 50 MB. Also, each file shall only contain information for 1 table.

##### Supported File Format

2 types of files are currently supported:

- csv: For this file type, only a comma (,) may be used as the delimiter. The first line shall also contain a header, which gives each column a unique name.

- parquet

##### Supported File Storage Location

Currently, files must first be uploaded to a preferred GCS bucket in a gods-* project. The file will be read once during deployment.

##### Supported Column Types

Only basic types for the columns are supported, namely: String, Integer, Float and Boolean.

The types of each column are auto-detected, but may be manually set by the user (please ensure type compatibility).

##### How to use

In order to use this feature, these files will first have to be loaded into GCS buckets in gods-* projects in order to be linked.

Then, use the syntax below to define the specifications:

  ```
  tables:
    - name: # Table name
      baseTable:
        fromFile:
          format: CSV # others: PARQUET
          uri: # GCS uri to the location of the file in the gods-* project
          schema: # this part is used to manually set column types
            - name: col_1 # name of the column
              type: STRING # others: INT, FLOAT, BOOL
            …
            - name: col_2
              type: INT
  ```
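For example, following the syntax above, a static table of public holidays loaded from a CSV file might be declared as follows (the bucket path and column names are hypothetical):

```
tables:
  - name: public_holidays
    baseTable:
      fromFile:
        format: CSV
        uri: gs://my-gods-project-bucket/static/holidays_2022.csv # hypothetical GCS uri
        schema:
          - name: holiday_date
            type: STRING
          - name: is_national_holiday
            type: BOOL
```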
### Variable

Variable declaration is used for assigning a literal value or the result of a function to a variable. The variable declarations will be executed from top to bottom, and it’s possible to refer to a declared variable in subsequent variable declarations. Following are the ways to set a value to a variable.

* Literal
  Specifying a literal value for the variable. When specifying literal values, the user needs to specify the type of the variable. The types supported for this are:
  * String
  * Int
  * Float
  * Bool

  For example:
  ```
  - variables:
      - name: var_1
        literal:
          intValue: 3
      - name: var_2
        literal:
          floatValue: 2.2
      - name: var_3
        literal:
          boolValue: true
      - name: var_4
        literal:
          stringValue: stringVal
  ```
* Jsonpath
  The value of the variable is obtained from the request/model response payload by specifying a jsonpath value, e.g.
  ```
  - variables:
      - name: var_5
        jsonPathConfig:
          jsonPath: $.rating
          defaultValue: -1
          valueType: INT
      - name: var_6
        jsonPath: $rating # deprecated
  ```
* Expression
  The value of the variable is obtained from an expression, e.g.
  ```
  - variables:
      - name: var_7
        jsonPathConfig:
          jsonPath: $.customer_id
      - name: var_8
        expression: var_7
  ```

### Encoders

In order to encode data in the transformation stage, we need to first define an encoder by giving it a name and defining the associated configurations.

The syntax of an encoder declaration is as follows:
```
- encoders:

    - name: # name of encoder 1
      <encoder 1 specification>

    - name: # name of encoder 2
      <encoder 2 specification>
```

There are 2 types of encoders currently available:

Ordinal encoder: For mapping column values from one type to another.

Cyclical encoder: For mapping column values that have a cyclical significance. For example, wind directions, time of day, days of the week.

#### Ordinal Encoder Specification

The syntax to define an ordinal encoder is as follows:

```
ordinalEncoderConfig:

  defaultValue: # default value

  targetValueType: # target value type, i.e. INT, FLOAT, BOOL or STRING

  mapping:

    <source value 1>: <target value 1>

    …

    <source value n>: <target value n>
```

There are currently 4 types of target values supported. The following table shows the syntax to use for each type:

|Value Type| Syntax |
|----------| ------------|
| Integer | INT |
| Float | FLOAT |
| Boolean | BOOL |
| String | STRING |

See below for a complete example of how to declare an ordinal encoder:

```
- encoders:
    - name: vehicle_mapping
      ordinalEncoderConfig:
        defaultValue: '0'
        targetValueType: INT
        mapping:
          suv: '1'
          sedan: '2'
```

#### Cyclical Encoder Specification

Cyclical encoders are useful for encoding columns that have cyclical significance. By encoding such columns cyclically, you can ensure that the values representing the end of a cycle and the start of the next cycle do not jump abruptly. Some examples of such data are:

- Hours of the day
- Days of the week
- Months in a year
- Wind direction
- Seasons
- Navigation Directions

The syntax to define a cyclical encoder is as follows:
```
cyclicalEncoderConfig:
  <byEpochTime or byRange configuration>
```

There are 2 ways to encode the column:
1. By epoch time: Unix Epoch time is the number of seconds that have elapsed since January 1, 1970 (midnight UTC/GMT). By using this option, we assume that the time zone to encode in will be UTC. In order to use this option, you only need to define the period of the cycle to encode.

2. By range: This defines the base range of floating point values representing a cycle. For example, one might define wind directions to be in the range of 0 to 360 degrees, although the actual value may be >360 or <0.

To encode by **epoch time**, use the following syntax:
```
cyclicalEncoderConfig:
  byEpochTime:
    periodType: HOUR #HOUR, DAY, WEEK, MONTH, QUARTER, HALF, YEAR
```

The period type defines the time period of a cycle. For example, HOUR means that a new cycle begins every hour and DAY means that a new cycle begins every day.

***NOTE: If you choose to encode by epoch time, the granularity is per second. If you need a different granularity, you can modify the values in the epoch time column accordingly or choose to encode by range.***

To encode by **range**, use the following syntax:
```
cyclicalEncoderConfig:
  byRange:
    min: FLOAT
    max: FLOAT
```
Do note that the min and max values are Float. The range is inclusive for the min and exclusive for the max, since in a cycle the min and max represent the same phase. For example, you can encode the days of a week in the range of [1, 8), where 8 and 1 both represent the starting point of a cycle. You can then represent Monday 12am as 1, Sunday 12pm as 7.5, and so on.

See below for complete examples of how to declare a cyclical encoder:

*By epoch time:*
```
- encoders:
    - name: payday_trend
      cyclicalEncoderConfig:
        byEpochTime:
          periodType: MONTH
```

*By range:*
```
- encoders:
    - name: wind_dir
      cyclicalEncoderConfig:
        byRange:
          min: 0
          max: 360
```

**Input/Output Examples**

By epoch time: Period of a day

| col        | col_x | col_y | remarks                  |
|------------|-------|-------|--------------------------|
| 1644278400 | 1     | 0     | 8 Feb 2022 00:00:00 UTC  |
| 1644300000 | 0     | 1     | 8 Feb 2022 06:00:00 UTC  |
| 1644321600 | -1    | 0     | 8 Feb 2022 12:00:00 UTC  |
| 1644343200 | 0     | -1    | 8 Feb 2022 18:00:00 UTC  |
| 1644364800 | 1     | 0     | 9 Feb 2022 00:00:00 UTC  |
| 1644451200 | 1     | 0     | 10 Feb 2022 00:00:00 UTC |

By range: 0 to 360 (for example, wind directions)

| col    | col_x | col_y |
|--------|-------|-------|
| 0      | 1     | 0     |
| 90     | 0     | 1     |
| 180    | -1    | 0     |
| 270    | 0     | -1    |
| 360    | 1     | 0     |
| 420    | 0     | 1     |
| -90    | 0     | -1    |

To learn more about cyclical encoding, you may find this page useful: [Cyclical Encoding](https://towardsdatascience.com/cyclical-features-encoding-its-about-time-ce23581845ca)

### Autoload

Autoload declares the tables and variables that need to be loaded into the standard transformer runtime from the incoming request/response. This operation is only applicable for the **upi_v1** protocol. Below is the specification of autoload:
```yaml
autoload:
  tableNames:
    - table_name_1
    - table_name_2
  variableNames:
    - var_name_1
    - var_name_2
```
`tableNames` and `variableNames` are fields that list the declared table and variable names. If `autoload` is part of the `preprocess` pipeline, it will try to load the declared tables and variables from the request payload; otherwise, it will load them from the model response payload.

## Transformation Stage

In this stage, the standard transformer performs transformations on the tables created in the input stage so that their structure is suitable for the output.
In the transformation stage, users operate mainly on tables and are provided with 2 transformation types: single table transformation and table join. Each transformation declared in this stage will be executed sequentially, and all outputs/side effects from each transformation can be used in subsequent transformations. There are two types of transformations in the standard transformer:
* Table Transformation
* Table Join

### Table Transformation

A table transformation performs a transformation on a single input table and creates a new table. The transformations performed on the table are defined within the “steps” field and executed sequentially.

```
  tableTransformation:

    inputTable: # name of the input table

    outputTable: # name of the output table

    steps: # list of transformation steps, they will be executed sequentially

      - <step 1 specification>

      - <step 2 specification>
```

Following are the operations available for table transformation:

#### Drop Column

This operation will drop one or more columns.

```
  tableTransformation:

    inputTable: myTable

    outputTable: myTransformedTable

    steps:

      - dropColumns: ["id"]
```

#### Select Column

This operation will reorder and optionally drop non-selected columns.

```
  tableTransformation:

    inputTable: myTable

    outputTable: myTransformedTable

    steps:

      - selectColumns: ["lat", "lon", "total_trip"]
```

#### Sort Operation

This operation will sort the table using the defined columns and orderings.

```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - sort:
        - column: id
          order: ASC
        - column: total_trip
          order: DESC
```

#### Rename Columns

This operation will rename one column into another.

```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - renameColumns:
        "total_trip": "totalTrip"
```

#### Update Columns

Add columns or modify columns in place using expressions.

```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - updateColumns:
        - column: "s2id"
          expression: S2ID(myTable.Col('lat'), myTable.Col('lon'), 12)
        - column: "col2"
          conditions:
            - rowSelector: myTable.Col('col1') * 2 > 10
              expression: myTable.Col('col1')
            - default:
                expression: -1
```

There are two ways to update columns:
* Update all rows in the column. You need to specify `column` and `expression`. `column` determines which column will be updated and `expression` determines the value that will be used to update the column.
The value produced by the `expression` must be a scalar or a series that has the same length as the other columns. Following is an example:
  ```
  - updateColumns:
      - column: "customer_id"
        expression: "cust_1" # the value is a scalar and will be broadcast to all the rows
      - column: "s2id"
        expression: S2ID(myTable.Col('lat'), myTable.Col('lon'), 12) # the value is an array or series whose length should be the same as the rest of the columns in the table
  ```
* Update a subset of rows in the column given some row selector conditions. For this, users can set multiple `rowSelector` conditions with an `expression`, and also a default value if none of the conditions match. For example, suppose users have the following table:

| customer_id | customer_age | total_booking_1w |
| ----------- | ------------ | ---------------- |
| 1234        | 60           | 8                |
| 4321        | 23           | 4                |
| 1235        | 17           | 4                |

Users want to create a new column `customer_segment` with the following rules:
1. For customers older than 55, the `customer_segment` will be `retired`
2. For customers aged between 30 and 55, the `customer_segment` will be `matured`
3. For customers aged between 22 and 30, the `customer_segment` will be `productive`
4. For customers aged under 22, the `customer_segment` will be `non-productive`

Based on those rules, we can translate this into the following standard transformer config:
```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - updateColumns:
        - column: "customer_segment"
          conditions:
            - rowSelector: myTable.Col('customer_age') > 55
              expression: "retired"
            - rowSelector: myTable.Col('customer_age') >= 30
              expression: "matured"
            - rowSelector: myTable.Col('customer_age') >= 22
              expression: "productive"
            - default:
                expression: "non-productive"
```
All `rowSelector` conditions work like an `if else` statement. A `rowSelector` condition must return a boolean or a series of booleans; `default` will be executed if none of the `rowSelector` conditions match.

#### Filter Row

Filter row is an operation that will filter the rows in a table based on a given condition. Suppose users have the following table:

| customer_id | customer_age | total_booking_1w |
| ----------- | ------------ | ---------------- |
| 1234        | 60           | 8                |
| 4321        | 23           | 4                |
| 1235        | 17           | 4                |

and users want to show only the records that have `total_booking_1w` less than 5. To achieve that, users can use the `filterRow` operation as in the configuration below:
```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - filterRow:
        condition: myTable.Col('total_booking_1w') < 5
```

#### Slice Row

Slice row is an operation to slice a table based on the start (lower bound) and end (upper bound) indexes given by the user. The result includes the start index but excludes the end index. Below is an example of this operation:
```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - sliceRow:
        start: 0
        end: 4
```
The values of `start` and `end` can be null or negative. The behaviour is as follows:
* A null value of `start` means that the `start` value is 0
* A null value of `end` means that the `end` value is the number of rows in the table
* A negative value of `start` or `end` means that the value will be (`number of rows` + `start`) or (`number of rows` + `end`). Suppose you set `start` to -5 and `end` to -1 and the number of rows is 10; the `start` value will then be 5 and `end` will be 9

#### Encode Column

This operation will encode the specified columns with the specified encoder defined in the input step.

```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - encodeColumns:
        - columns:
            - vehicle
            - previous_vehicle
          encoder: vehicle_mapping
```

#### Scale Column

This operation will scale a specified column using scalers. At the moment, 2 types of scalers are available:
* Standard Scaler
* Min-max Scaler

**Standard Scaler**

In order to use a standard scaler, the mean and standard deviation (std) of the respective column to be scaled should be computed beforehand and provided in the specification; the scaled value is typically computed as (x - mean) / std. The syntax for scaling a column with a standard scaler is as follows:

```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - scaleColumns:
        - column: rank
          standardScalerConfig:
            mean: 0.5
            std: 0.2
```

**Min-Max Scaler**

In order to use a min-max scaler, the minimum and maximum values for the column to scale to must be defined in the specification; the scaled value is typically computed as (x - min) / (max - min).
The syntax for scaling a column with a min-max scaler is as follows:

```
tableTransformation:
  inputTable: myTable
  outputTable: myTransformedTable
  steps:
    - scaleColumns:
        - column: rating
          minMaxScalerConfig:
            min: 1
            max: 5
```

### Join Operation

This operation joins 2 tables, as defined by the “leftTable” and “rightTable” parameters, into 1 output table, given a join column and a method of join. The join column must exist in both input tables. The available methods of join are:
* Left join
* Concat Column
* Cross join
* Inner Join
* Outer join
* Right join

```
tableJoin:
  leftTable: merchant_table
  rightTable: customer_table
  outputTable: merchant_customer_table
  how: LEFT # LEFT, INNER , RIGHT, CROSS, OUTER, CONCAT_COLUMN
  onColumn: merchant_id
```
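For illustration, here is a hypothetical input and output for the `LEFT` join configured above; following standard left-join semantics, every row of the left table is kept, and rows without a match on the right-hand side get an empty value for the right table's columns (the data below is made up):

merchant_table:

| merchant_id | merchant_rating |
| ----------- | --------------- |
| m-1         | 4.5             |
| m-2         | 3.9             |

customer_table:

| merchant_id | customer_id |
| ----------- | ----------- |
| m-1         | 1234        |

merchant_customer_table (output):

| merchant_id | merchant_rating | customer_id |
| ----------- | --------------- | ----------- |
| m-1         | 4.5             | 1234        |
| m-2         | 3.9             |             |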
## Output Stage

At this stage, both the preprocessing and postprocessing pipelines should create an output. The output of the preprocessing pipeline will be used as the request payload to be sent as the model request, whereas the output of the postprocessing pipeline will be used as the response payload to be returned to the downstream service / client.
There are 3 types of output specifications:
* JSON Output. Applicable for the **http_json** protocol and both preprocess and postprocess outputs.
* UPIPreprocessOutput. Applicable only for the **upi_v1** protocol and preprocess output.
* UPIPostprocessOutput. Applicable only for the **upi_v1** protocol and postprocess output.

### JSON Output - User-defined JSON template

Users are given the freedom to specify the transformer’s JSON output structure. The syntax is as follows:

```
output:

- jsonOutput:

    jsonTemplate:

      baseJson: # Base JSON Template, the value can be "fromJson" or "fromTable"

        fromJson: # Copy the JSON object pointed to by the source and jsonPath

          jsonPath: # Path to the JSON field to be copied from in the source JSON

        fromTable: # Create the JSON payload from a table

          tableName: # Source table name

          format: # json output format, possible formats are (based on https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html): RECORD, VALUES, SPLIT

      fields: # list of JSON fields to be included, fields defined here will override JSON fields in the base JSON

        - fieldName: # field name

          fromJson: # Copy a JSON field from RAW_REQUEST / MODEL_RESPONSE

            jsonPath: # json path of the field to be copied from the source JSON payload

          fromTable: # Create json from a table

            tableName: # Source table name

            format: # json output format, possible formats are (based on https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html): RECORD, VALUES, SPLIT

          expression: # populate the field with the value from the result of an expression

        - fieldName: <other field name>

          fields: # it's also possible to have a nested JSON field

            - <nested field specification>
```
Similar to the table creation specification, users can specify the “baseJson” as the base JSON structure and override it using the “fields” configuration.

The `field_value` above can be configured to be retrieved from 3 sources:
* From JSON
* From Table
* From Expression

#### From JSON

In the example below, the “output” field will be set to the “predictions” field from the model response.
```
jsonOutput:
  jsonTemplate:
    fields:
      - fieldName: output
        fromJson:
          jsonPath: $.model_response.predictions
```

#### From Table

Users can populate JSON fields using values from a table. The table can be rendered into 3 JSON formats: RECORD, VALUES, and SPLIT.
Note that if “fromTable” is used as “baseJson”, it will use the table name as the JSON field.

For example, given the following customerTable:

| customer_id | customer_age | total_booking_1w |
| ----------- | ------------ | ---------------- |
| 1234        | 34           | 8                |
| 4321        | 23           | 4                |
| 1235        | 17           | 4                |

Depending on the JSON format, it will render a different result JSON.

* RECORD Format
```
  outputStage:
    jsonOutput:
      jsonTemplate:
        fields:
          - fieldName: instances
            fromTable:
              tableName: customerTable
              format: RECORD
```
JSON Result:
```
  {
    "instances" : [
      [
        {
          "customer_id" : 1234,
          "customer_age" : 34,
          "total_booking_1w": 8
        },
        {
          "customer_id" : 4321,
          "customer_age" : 23,
          "total_booking_1w": 4
        },
        {
          "customer_id" : 1235,
          "customer_age" : 17,
          "total_booking_1w": 4
        }
      ]
    ]
  }
```

* VALUES Format
```
  outputStage:
    jsonOutput:
      jsonTemplate:
        fields:
          - fieldName: instances
            fromTable:
              tableName: customerTable
              format: VALUES
```
JSON Result:
```
  {
    "instances":[
      [
        [1234, 34, 8],
        [4321, 23, 4],
        [1235, 17, 4]
      ]
    ]
  }
```

* SPLIT Format
```
  outputStage:
    jsonOutput:
      jsonTemplate:
        fields:
          - fieldName: instances
            fromTable:
              tableName: customerTable
              format: SPLIT
```
JSON Result:
```
  {
    "instances" : {
      "data": [
        [1234, 34, 8],
        [4321, 23, 4],
        [1235, 17, 4]
      ],
      "columns" : ["customer_id", "customer_age", "total_booking_1w"]
    }
  }
```

### UPIPreprocessOutput

UPIPreprocessOutput is an output specification only for the **upi_v1** protocol and the preprocess step. This output specification will create an operation that converts the defined tables into the UPI request interface.
Below is the specification:

```yaml
upiPreprocessOutput:
  predictionTableName: table_1
  transformerInputTableNames:
    - input_table_1
    - input_table_2
```

This specification will convert the content of `predictionTableName` into a UPI table
```
message Table {
  string name = 1;
  repeated Column columns = 2;
  repeated Row rows = 3;
}
message Column {
  string name = 1;
  Type type = 2;
}
message Row {
  string row_id = 1;
  repeated Value values = 2;
}
message Value {
  double double_value = 1;
  int64 integer_value = 2;
  string string_value = 3;
  bool is_null = 10;
}
```
and then set the `prediction_table` field of this UPI request interface:
```
message PredictValuesRequest {
  Table prediction_table = 1;
  TransformerInput transformer_input = 4;
  string target_name = 2;
  repeated Variable prediction_context = 3;

  RequestMetadata metadata = 10;
}
```
`transformerInputTableNames` is a list of table names that will be converted into UPI tables. These values will be assigned to the `transformer_input.tables` field.
The rest of the fields will be carried over from the incoming request payload.

### UPIPostprocessOutput

UPIPostprocessOutput is an output specification only for the **upi_v1** protocol and the postprocess step. This output specification will create an operation that converts the defined tables into the UPI response interface.
Below is the specification:

```yaml
upiPostprocessOutput:
  predictionResultTableName: table_name_1
```
This specification will convert the content of `predictionResultTableName` into a UPI table and assign it to the `prediction_result_table` field in the UPI response interface below:
```
message PredictValuesResponse {
  Table prediction_result_table = 1;
  string target_name = 2;
  repeated Variable prediction_context = 3;
  ResponseMetadata metadata = 10;
}
```
The rest of the fields will be carried over from the model predictor response.

### Deploy Standard Transformer using Merlin UI

Once you have logged your model and it’s ready to be deployed, you can go to the model deployment page.

Here’s a short video demonstrating how to configure the Standard Transformer:

![Configure Standard Transformer](../../../../images/configure_standard_transformer.gif)

1. As the name suggests, you must choose **Standard Transformer** as the Transformer Type.
2. The **Retrieval Table** panel will be displayed. This panel is where you configure the Feast Project, Entities, and Features to be retrieved.
   1. The list of Feast Entities depends on the selected Feast Project
   2. Similarly, the list of Feast Features also depends on the configured entities
3. You can have multiple Retrieval Tables that retrieve different kinds of entities and features and enrich the request to your model at once. To add one, simply click `Add Retrieval Table`, and a new Retrieval Table panel will be displayed, ready to be configured.
4. You can check the Transformer Configuration YAML specification by clicking `See YAML configuration`. You can copy and paste this YAML and use it for deployment using the Merlin SDK.
   1. To read more about the Transformer Configuration specification, please continue reading.
5. You can also specify the advanced configuration. These configurations are separated from your model.
   1. Request and response payload logging
   2. Resource request (Replicas, CPU, and memory)
   3. Environment variables (see the supported environment variables below)

### Deploy Standard Transformer using Merlin SDK

Make sure you are using a supported version of the Merlin SDK.

```bash
> pip install merlin-sdk -U
> pip show merlin-sdk

Name: merlin-sdk
Version: 0.10.0
...
```

You need to pass the `transformer` argument to the `merlin.deploy()` function to enable and deploy your standard transformer.

{% code title="standard_transformer_deployment.py" overflow="wrap" lineNumbers="true" %}
```python
from merlin.resource_request import ResourceRequest
from merlin.transformer import StandardTransformer

# Specify the path to the transformer config YAML file
transformer_config_path = "transformer_config.yaml"

# Create the transformer resources requests config
resource_request = ResourceRequest(min_replica=0, max_replica=1,
                                   cpu_request="100m", memory_request="200Mi")

# Create the transformer object
transformer = StandardTransformer(config_file=transformer_config_path,
                                  enabled=True,
                                  resource_request=resource_request)

# Deploy the model alongside the transformer
endpoint = merlin.deploy(v, transformer=transformer)
```
{% endcode %}

### Standard Transformer Environment Variables

Below are the supported environment variables to configure your Transformer.
| Name | Description | Default Value |
| ------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ | ------------- |
| `LOG_LEVEL` | Set the logging level for the internal system. It doesn’t affect the request-response logging. Supported values: DEBUG, INFO, WARNING, ERROR. | INFO |
| `FEAST_FEATURE_STATUS_MONITORING_ENABLED` | Enable metrics for the status of each retrieved feature. | false |
| `FEAST_FEATURE_VALUE_MONITORING_ENABLED` | Enable metrics for the summary value of each retrieved feature. | false |
| `FEAST_BATCH_SIZE` | Maximum number of entity values that will be passed as a payload to Feast. For example, if you want to get features for 75 entity values and FEAST_BATCH_SIZE is set to 50, then there will be 2 calls to Feast: the first call requests features for 50 entity values and the next call requests features for 25 entity values. | 50 |
| `FEAST_CACHE_ENABLED` | Enable caching the responses of Feast requests | true |
| `FEAST_CACHE_TTL` | Time to live of cached features; once the TTL is reached, the cache is expired. The value has a format like [$number][$unit], e.g. 60s, 10s, 1m, 1h | 60s |
| `CACHE_SIZE_IN_MB` | Maximum capacity of the cache from allocated memory. Size is in MB | 100 |
| `FEAST_REDIS_DIRECT_STORAGE_ENABLED` | Enable feature retrieval by querying directly from Redis | false |
| `FEAST_REDIS_POOL_SIZE` | Number of Redis connections established in one replica of the standard transformer | 10 |
| `FEAST_REDIS_READ_TIMEOUT` | Timeout for read commands from Redis. If reached, commands will fail | 3s |
| `FEAST_REDIS_WRITE_TIMEOUT` | Timeout for write commands to Redis. If reached, commands will fail | 3s |
| `FEAST_BIGTABLE_DIRECT_STORAGE_ENABLED` | Enable feature retrieval by querying directly from Bigtable | false |
| `FEAST_BIGTABLE_POOL_SIZE` | Number of Bigtable gRPC connections established in one replica of the standard transformer | |
| `FEAST_TIMEOUT` | Timeout of Feast requests | 1s |
| `FEAST_HYSTRIX_MAX_CONCURRENT_REQUESTS` | Maximum concurrent requests when calling Feast | 100 |
| `FEAST_HYSTRIX_REQUEST_VOLUME_THRESHOLD` | Threshold of the number of requests to Feast before the circuit breaker is evaluated | 100 |
| `FEAST_HYSTRIX_SLEEP_WINDOW` | Sleep window is the duration of rejecting calls to Feast once the circuit is open | 1s |
| `FEAST_HYSTRIX_ERROR_PERCENT_THRESHOLD` | Threshold of error percentage, once breached the circuit will be open | 25 |
| `FEAST_SERVING_KEEP_ALIVE_ENABLED` | Flag to enable Feast keep alive | true |
| `FEAST_SERVING_KEEP_ALIVE_TIME` | Duration of the interval between keep alive PINGs | 60s |
| `FEAST_SERVING_KEEP_ALIVE_TIMEOUT` | Duration after which a PING is considered a TIMEOUT | 5s |
| `MERLIN_DISABLE_LIVENESS_PROBE` | Disable the liveness probe of the transformer if set to true | |
| `MODEL_TIMEOUT` | Timeout duration of the model prediction | 1s |
| `MODEL_HYSTRIX_MAX_CONCURRENT_REQUESTS` | Maximum concurrent requests when calling the model predictor | 100 |
| `MODEL_HYSTRIX_ERROR_PERCENTAGE_THRESHOLD` | Threshold of error percentage, once breached the circuit will be open | 25 |
| `MODEL_HYSTRIX_REQUEST_VOLUME_THRESHOLD` | Threshold of the number of requests to the model predictor before the circuit breaker is evaluated | 100 |
| `MODEL_HYSTRIX_SLEEP_WINDOW_MS` | Sleep window is the duration (in ms) of rejecting calls to the model predictor once the circuit is open | 10 |
| `MODEL_GRPC_KEEP_ALIVE_ENABLED` | Flag to enable UPI_V1 model predictor keep alive | false |
| `MODEL_GRPC_KEEP_ALIVE_TIME` | Duration of the interval between keep alive PINGs | 60s |
| `MODEL_GRPC_KEEP_ALIVE_TIMEOUT` | Duration after which a PING is considered a TIMEOUT | 5s |
\ No newline at end of file
diff --git a/docs/user/templates/model_deployment/transformer/02_custom_transformer.md b/docs/user/templates/model_deployment/transformer/02_custom_transformer.md
new file mode 100644
index 000000000..2a5bb88ce
--- /dev/null
+++ b/docs/user/templates/model_deployment/transformer/02_custom_transformer.md
@@ -0,0 +1,38 @@

# Custom Transformer

In the 0.8 release, Merlin added support for Custom Transformer deployment. This transformer type enables users to deploy their own pre-built Transformer service. Users should develop, build, and publish their own Transformer Docker image.

Similar to the Standard Transformer, users can configure a Custom Transformer from the UI and SDK. The difference is that instead of specifying the standard transformer configuration, users configure the Docker image and the command and arguments to run it.

### Deploy Custom Transformer using Merlin UI

1. As the name suggests, you must choose Custom Transformer as the Transformer Type.
2. Specify the Docker image registry and name.
   1. You need to push your Docker image to a supported registry: a public DockerHub repository or a private GCR repository.
3. If your Docker image needs a command or arguments to start, you can specify them in the related input form.
4. You can also specify the advanced configuration. These configurations are separated from your model.
   1. Request and response payload logging
   2. Resource request (Replicas, CPU, and memory)
   3. Environment variables

### Deploy Custom Transformer using Merlin SDK

{% code title="custom_transformer_deployment.py" overflow="wrap" lineNumbers="true" %}
```python
from merlin.resource_request import ResourceRequest
from merlin.transformer import Transformer

# Create the transformer resources requests config
resource_request = ResourceRequest(min_replica=0, max_replica=1,
                                   cpu_request="100m", memory_request="200Mi")

# Create the transformer object
transformer = Transformer("gcr.io//",
                          resource_request=resource_request)

# Deploy the model alongside the transformer
endpoint = merlin.deploy(v, transformer=transformer)
```
{% endcode %}
\ No newline at end of file
diff --git a/docs/user/templates/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md b/docs/user/templates/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md
new file mode 100644
index 000000000..f1d178307
--- /dev/null
+++ b/docs/user/templates/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md
@@ -0,0 +1,958 @@

# Standard Transformer Expressions

The Standard Transformer provides several built-in functions that are useful for common ML use-cases. These built-in functions are accessible from within the expression context.
diff --git a/docs/user/templates/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md b/docs/user/templates/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md
new file mode 100644
index 000000000..f1d178307
--- /dev/null
+++ b/docs/user/templates/model_deployment/transformer/standard_transformer/01_standard_transformer_expressions.md
@@ -0,0 +1,958 @@
+
+
+# Standard Transformer Expressions
+
+Standard Transformer provides several built-in functions that are useful for common ML use cases. These built-in functions are accessible from within the expression context. The table below lists them by category.
+
+| Categories | Functions |
+| ---------- | -------------------------------------------------------------|
+| Geospatial | [Geohash](#geohash) |
+| Geospatial | [S2ID](#s2id) |
+| Geospatial | [HaversineDistance](#haversinedistance) |
+| Geospatial | [HaversineDistanceWithUnit](#haversinedistancewithunit) |
+| Geospatial | [PolarAngle](#polarangle) |
+| Geospatial | [GeohashDistance](#geohashdistance) |
+| Geospatial | [GeohashAllNeighbors](#geohashallneighbors) |
+| Geospatial | [GeohashNeighborForDirection](#geohashneighborfordirection) |
+| JSON | [JsonExtract](#jsonextract) |
+| Statistics | [CumulativeValue](#cumulativevalue) |
+| Time | [Now](#now) |
+| Time | [DayOfWeek](#dayofweek) |
+| Time | [IsWeekend](#isweekend) |
+| Time | [FormatTimestamp](#formattimestamp) |
+| Time | [ParseTimestamp](#parsetimestamp) |
+| Time | [ParseDateTime](#parsedatetime) |
+| Series | [Get](#get) |
+| Series | [IsIn](#isin) |
+| Series | [StdDev](#stddev) |
+| Series | [Mean](#mean) |
+| Series | [Median](#median) |
+| Series | [Max](#max) |
+| Series | [MaxStr](#maxstr) |
+| Series | [Min](#min) |
+| Series | [MinStr](#minstr) |
+| Series | [Quantile](#quantile) |
+| Series | [Sum](#sum) |
+| Series | [Flatten](#flatten) |
+| Series | [Unique](#unique) |
+
+
+## Geospatial
+
+### Geohash
+
+Geohash calculates the geohash of `latitude` and `longitude` at the given character `precision`.
+
+#### Input
+
+| Name | Description |
+| --------- | ----------------------------------------------------------------- |
+| Latitude | Latitude of the object, in the form of JSONPath, array, or variable. |
+| Longitude | Longitude of the object, in the form of JSONPath, array, or variable. |
+| Precision | Character precision, as an integer. |
+
+#### Output
+
+`Geohash of the location at the given precision.`
+
+#### Example
+
+```
+Input:
+{
+  "latitude": 1.0,
+  "longitude": 2.0
+}
+
+Standard Transformer Config:
+variables:
+- name: geohash
+  expression: Geohash("$.latitude", "$.longitude", 12)
+
+Output: `"s01mtw037ms0"`
+```
+
+### S2ID
+
+S2ID calculates the S2ID cell of `latitude` and `longitude` at the given `level`.
+
+#### Input
+
+| Name | Description |
+| --------- | ----------------------------------------------------------------- |
+| Latitude | Latitude of the object, in the form of JSONPath, array, or variable. |
+| Longitude | Longitude of the object, in the form of JSONPath, array, or variable. |
+| Level | S2ID level, as an integer. |
+
+#### Output
+
+`S2ID cell of the location at the given level.`
+
+#### Example
+
+```
+Input:
+{
+  "latitude": 1.0,
+  "longitude": 2.0
+}
+
+Standard Transformer Config:
+variables:
+- name: s2id
+  expression: S2ID("$.latitude", "$.longitude", 12)
+
+Output: `"1154732743855177728"`
+```
+
+### HaversineDistance
+
+HaversineDistance calculates the Haversine distance between two points (given by their latitude and longitude).
+
+#### Input
+
+| Name | Description |
+| ----------- | ----------------------------------------------------------------------- |
+| Latitude 1 | Latitude of the first point, in the form of JSONPath, array, or variable. |
+| Longitude 1 | Longitude of the first point, in the form of JSONPath, array, or variable. |
+| Latitude 2 | Latitude of the second point, in the form of JSONPath, array, or variable. |
+| Longitude 2 | Longitude of the second point, in the form of JSONPath, array, or variable. |
+
+#### Output
+
+`The Haversine distance between the 2 points, in kilometers.`
+
+#### Example
+
+```
+Input:
+{
+  "pickup": {
+    "latitude": 1.0,
+    "longitude": 2.0
+  },
+  "dropoff": {
+    "latitude": 1.2,
+    "longitude": 2.2
+  }
+}
+
+Standard Transformer Config:
+variables:
+- name: haversine_distance
+  expression: HaversineDistance("$.pickup.latitude", "$.pickup.longitude", "$.dropoff.latitude", "$.dropoff.longitude")
+```
+
+### HaversineDistanceWithUnit
+
+HaversineDistanceWithUnit calculates the Haversine distance between two points (given by their latitude and longitude) in the given distance unit.
+
+#### Input
+
+| Name | Description |
+| ----------- | ----------------------------------------------------------------------- |
+| Latitude 1 | Latitude of the first point, in the form of JSONPath, array, or variable. |
+| Longitude 1 | Longitude of the first point, in the form of JSONPath, array, or variable. |
+| Latitude 2 | Latitude of the second point, in the form of JSONPath, array, or variable. |
+| Longitude 2 | Longitude of the second point, in the form of JSONPath, array, or variable. |
+| Distance Unit | Unit of distance measurement; supported units are `km` and `m`. |
+
+#### Output
+
+`The Haversine distance between the 2 points, in the given unit.`
+
+#### Example
+
+```
+Input:
+{
+  "pickup": {
+    "latitude": 1.0,
+    "longitude": 2.0
+  },
+  "dropoff": {
+    "latitude": 1.2,
+    "longitude": 2.2
+  }
+}
+
+Standard Transformer Config:
+variables:
+- name: haversine_distance
+  expression: HaversineDistanceWithUnit("$.pickup.latitude", "$.pickup.longitude", "$.dropoff.latitude", "$.dropoff.longitude", "m")
+```
+
+### PolarAngle
+
+PolarAngle calculates the polar angle between two points (given by their latitude and longitude), in radians.
+
+#### Input
+
+| Name | Description |
+| ----------- | ----------------------------------------------------------------------- |
+| Latitude 1 | Latitude of the first point, in the form of JSONPath, array, or variable. |
+| Longitude 1 | Longitude of the first point, in the form of JSONPath, array, or variable. |
+| Latitude 2 | Latitude of the second point, in the form of JSONPath, array, or variable. |
+| Longitude 2 | Longitude of the second point, in the form of JSONPath, array, or variable. |
+
+#### Output
+
+`The polar angle between the 2 points, in radians.`
+
+#### Example
+
+```
+Input:
+{
+  "pickup": {
+    "latitude": 1.0,
+    "longitude": 2.0
+  },
+  "dropoff": {
+    "latitude": 1.2,
+    "longitude": 2.2
+  }
+}
+
+Standard Transformer Config:
+variables:
+- name: polar_angle
+  expression: PolarAngle("$.pickup.latitude", "$.pickup.longitude", "$.dropoff.latitude", "$.dropoff.longitude")
+```
+
+### GeohashDistance
+GeohashDistance calculates the Haversine distance between two geohashes. Each geohash is converted into its center point (latitude, longitude), and the Haversine distance is calculated between those center points.
+
+#### Input
+
+| Name | Description |
+|---------------|----------------------------------------------------------|
+| Geohash 1 | First geohash, in the form of JSONPath or array |
+| Geohash 2 | Second geohash, in the form of JSONPath or array |
+| Distance Unit | Unit of distance measurement; supported units are `km` and `m` |
+
+#### Output
+`Haversine distance between the two geohashes, calculated from their center points.`
+
+#### Example
+```
+Input:
+{
+  "pickup_geohash": "qqgggnwxx",
+  "dropoff_geohash": "qqgggnweb"
+}
+
+Standard Transformer Config:
+variables:
+- name: geohash_distance
+  expression: GeohashDistance("$.pickup_geohash", "$.dropoff_geohash", "m")
+```
+
+### GeohashAllNeighbors
+
+GeohashAllNeighbors finds all neighbors of the given geohash, in all directions.
+
+#### Input
+
+| Name | Description |
+|---------------|----------------------------------------------------------|
+| Geohash | Geohash, in the form of JSONPath or array |
+
+#### Output
+`List of neighbors of the given geohash.`
+
+#### Example
+```
+Input:
+{
+  "pickup_geohash": "qqgggnwxx",
+  "dropoff_geohash": "qqgggnweb"
+}
+
+Standard Transformer Config:
+variables:
+- name: geohash_neighbors
+  expression: GeohashAllNeighbors("$.pickup_geohash")
+```
+
+### GeohashNeighborForDirection
+
+GeohashNeighborForDirection finds the neighbor of the given geohash in the given direction.
+
+#### Input
+
+| Name | Description |
+|---------------|----------------------------------------------------------|
+| Geohash | Geohash, in the form of JSONPath or array |
+| Direction | Direction of the neighbor relative to the geohash. Accepted directions: `north`, `northeast`, `northwest`, `south`, `southeast`, `southwest`, `west`, `east`. |
+
+#### Output
+`Neighbor of the given geohash.`
+
+#### Example
+```
+Input:
+{
+  "pickup_geohash": "qqgggnwxx",
+  "dropoff_geohash": "qqgggnweb"
+}
+
+Standard Transformer Config:
+variables:
+- name: geohash_neighbor
+  expression: GeohashNeighborForDirection("$.pickup_geohash", "north")
+```
+
+## JSON
+
+### JsonExtract
+
+Given a field whose value is a JSON string, JsonExtract extracts a JSON value from within that string.
+
+#### Input
+
+| Name | Description |
+| ----------------- | ------------------------------------------------------------------------------------ |
+| Parent's JSONPath | Path to the JSON key whose value is the JSON string to extract from. |
+| Nested's JSONPath | Path to the JSON key inside the JSON string extracted by the parent's JSONPath above. |
+
+#### Output
+
+`JSON value within the JSON string pointed to by the first JSONPath argument.`
+
+#### Example
+
+```
+Input:
+{
+  "details": "{\"merchant_id\": 9001}"
+}
+
+Standard Transformer Config:
+variables:
+- name: merchant_id
+  valueType: STRING
+  expression: JsonExtract("$.details", "$.merchant_id")
+
+Output: `"9001"`
+```
+
+## Statistics
+
+### CumulativeValue
+
+CumulativeValue accumulates values based on the index and its predecessors. E.g., `[1, 2, 3] => [1, 1+2, 1+2+3] => [1, 3, 6]`.
+
+#### Input
+
+| Name | Description |
+| ------ | ----------------- |
+| Values | Array of numbers. |
+
+#### Output
+
+`Array of cumulative values.`
+
+#### Example
+
+```
+Input:
+{
+  "fares": [10000, 20000, 50000]
+}
+
+Standard Transformer Config:
+variables:
+- name: cumulative_fares
+  expression: CumulativeValue($.fares)
+
+Output: `[10000, 30000, 80000]`
+```
+
+## Time
+
+### Now
+
+Returns the current local timestamp.
+
+#### Input
+
+`None`
+
+#### Output
+
+`Current local timestamp.`
+
+#### Example
+
+```
+Standard Transformer Config:
+variables:
+- name: currentTime
+  expression: Now()
+```
+
+### DayOfWeek
+
+Returns the number representation of the day of the week, given the timestamp and timezone.
+
+SUNDAY(0), MONDAY(1), TUESDAY(2), WEDNESDAY(3), THURSDAY(4), FRIDAY(5), SATURDAY(6).
+
+#### Input
+
+| Name | Description |
+| --------- | ------------------------------------------------------------------------------------------------ |
+| Timestamp | Unix timestamp value in integer or string format. It accepts JSONPath, array, or variable. |
+| Timezone | Timezone value in string. For example, `Asia/Jakarta`. It accepts JSONPath, array, or variable. |
+
+#### Output
+
+`Day number.`
+
+#### Example
+
+```
+Input:
+{
+  "timestamp": "1637605459"
+}
+
+Standard Transformer Config:
+variables:
+- name: day_of_week
+  expression: DayOfWeek("$.timestamp", "Asia/Jakarta")
+
+Output: `2`
+```
+
+### IsWeekend
+
+Returns 1 if the given timestamp falls on a weekend (Saturday or Sunday), otherwise 0.
+
+#### Input
+
+| Name | Description |
+| --------- | ------------------------------------------------------------------------------------------------ |
+| Timestamp | Unix timestamp value in integer or string format. It accepts JSONPath, array, or variable. |
+| Timezone | Timezone value in string. For example, `Asia/Jakarta`. It accepts JSONPath, array, or variable. |
+
+#### Output
+
+`1 if weekend, 0 if not.`
+
+#### Example
+
+```
+Input:
+{
+  "timestamp": "1637445044",
+  "timezone": "Asia/Jakarta"
+}
+
+Standard Transformer Config:
+variables:
+- name: is_weekend
+  expression: IsWeekend("$.timestamp", "$.timezone")
+
+Output: `1`
+```
+
+### FormatTimestamp
+
+FormatTimestamp converts a timestamp in the given location into a formatted date time string.
+
+#### Input
+
+| Name | Description |
+| --------- | ------------------------------------------------------------------------------------------------------- |
+| Timestamp | Unix timestamp value in integer or string format. It accepts JSONPath, array, or variable. |
+| Timezone | Timezone value in string. For example, `Asia/Jakarta`. It accepts JSONPath, array, or variable. |
+| Format | Target date time format. It follows the Golang date time format (https://pkg.go.dev/time#pkg-constants). |
+
+#### Output
+
+`Formatted date time string.`
+
+#### Examples
+
+```
+Input:
+{
+  "timestamp": "1637691859"
+}
+
+Standard Transformer Config:
+variables:
+- name: datetime
+  expression: FormatTimestamp("$.timestamp", "Asia/Jakarta", "2006-01-02")
+
+Output: `"2021-11-24"`
+```
+
+### ParseTimestamp
+
+ParseTimestamp converts a timestamp in integer or string format into time.
+
+#### Input
+
+| Name | Description |
+| --------- | ------------------------------------------------------------------------------------------- |
+| Timestamp | Unix timestamp value in integer or string format. It accepts JSONPath, array, or variable. |
+
+#### Output
+
+`Parsed timestamp.`
+
+#### Examples
+
+```
+Input:
+{
+  "timestamp": "1619541221"
+}
+
+Standard Transformer Config:
+variables:
+- name: parsed_timestamp
+  expression: ParseTimestamp("$.timestamp")
+
+Output: `"2021-04-27 16:33:41 +0000 UTC"`
+```
+
+### ParseDateTime
+
+ParseDateTime converts a date time string with the specified format layout (e.g. RFC3339) into time.
+
+#### Input
+
+| Name | Description |
+| --------- | --------------------------------------------------------------------------------------------------- |
+| Date time | Date time value in string format. It accepts JSONPath, array, or variable. |
+| Timezone | Timezone value in string. For example, `Asia/Jakarta`. It accepts JSONPath, array, or variable. |
+| Format | Date time input format. It follows the Golang date time format (https://pkg.go.dev/time#pkg-constants). |
+
+#### Output
+
+`Parsed date time.`
+
+#### Examples
+
+```
+Input:
+{
+  "datetime": "2021-11-30 15:00:00",
+  "location": "Asia/Jayapura"
+}
+
+Standard Transformer Config:
+variables:
+- name: parsed_datetime
+  expression: ParseDateTime("$.datetime", "$.location", "2006-01-02 15:04:05")
+
+Output: `"2021-11-30 15:00:00 +0900 WIT"`
+```
+
+
+## Series Expression
+A series expression is a function that can be invoked on a series (column) of a table.
+
+### Get
+`Get` retrieves a row from a series at the given index.
+
+#### Input
+| Name | Description |
+|------|-------------|
+| Index | Position of the row, starting from 0 |
+
+#### Output
+A single series row
+
+#### Examples
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+To retrieve index 2 of series `avg_order_1_day`:
+
+Standard Transformer Config:
+```
+variables:
+- name: total_order_1_day
+  expression: yourTableName.Col("avg_order_1_day").Get(2)
+```
+
+Output: 4000
+
+### IsIn
+`IsIn` checks whether each row's value is part of the given array; the result is a new series of boolean type.
+
+#### Input
+| Name | Description |
+|------|-------------|
+| Comparator | Array of values |
+
+#### Output
+A new boolean series with the same dimension as the original series
+
+#### Examples
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: bool_series
+  expression: yourTableName.Col("avg_order_1_day").IsIn([2000, 3000])
+```
+Output:
+| bool_series |
+|----------------|
+| true |
+| true |
+| false |
+
+### StdDev
+`StdDev` calculates the standard deviation of the series values. The output is a single value.
+
+#### Input
+No input
+
+#### Output
+A single value of float type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: std_dev
+  expression: yourTableName.Col("avg_cancellation_rate_30_day").StdDev()
+```
+Output: 0.0068475461947247
+
+### Mean
+`Mean` calculates the mean of the series values. The output is a single value.
+
+#### Input
+No input
+
+#### Output
+A single value of float type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: mean
+  expression: yourTableName.Col("avg_order_1_day").Mean()
+```
+Output: 3000
+
+### Median
+
+`Median` calculates the median of the series values. The output is a single value.
+
+#### Input
+No input
+
+#### Output
+A single value of float type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: median
+  expression: yourTableName.Col("avg_order_1_day").Median()
+```
+Output: 3000
+
+### Max
+
+`Max` finds the maximum of the series values. The output is a single value.
+
+#### Input
+No input
+
+#### Output
+A single value of float type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: max
+  expression: yourTableName.Col("avg_order_1_day").Max()
+```
+Output: 4000
+
+### MaxStr
+
+`MaxStr` finds the maximum of the series values. The output is a single value of string type.
+
+#### Input
+No input
+
+#### Output
+A single value of string type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: max_str
+  expression: yourTableName.Col("avg_order_1_day").MaxStr()
+```
+Output: "4000"
+
+### Min
+
+`Min` finds the minimum of the series values. The output is a single value of float type.
+
+#### Input
+No input
+
+#### Output
+A single value of float type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: min
+  expression: yourTableName.Col("avg_order_1_day").Min()
+```
+Output: 2000
+
+### MinStr
+
+`MinStr` finds the minimum of the series values. The output is a single value of string type.
+
+#### Input
+No input
+
+#### Output
+A single value of string type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: min_str
+  expression: yourTableName.Col("avg_order_1_day").MinStr()
+```
+Output: "2000"
+
+### Quantile
+
+`Quantile` returns the sample x such that the fraction p of the samples is less than or equal to x.
+
+#### Input
+Fraction p, in float type
+
+#### Output
+A single value of float type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| rank |
+|-------|
+| 1 |
+| 2 |
+| 3 |
+| 4 |
+| 5 |
+| 6 |
+| 7 |
+| 8 |
+| 9 |
+| 10 |
+
+Standard Transformer Config:
+```
+variables:
+- name: quantile_0.9
+  expression: yourTableName.Col("rank").Quantile(0.9)
+```
+Output: 9
+
+### Sum
+
+`Sum` sums all the values in the series. The output is a single value of float type.
+
+#### Input
+No input
+
+#### Output
+A single value of float type
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | avg_order_1_day | avg_cancellation_rate_30_day |
+|---------------|-----------------|------------------------------|
+| 1 | 2000 | 0.02 |
+| 2 | 3000 | 0.005 |
+| 3 | 4000 | 0.006 |
+
+Standard Transformer Config:
+```
+variables:
+- name: sum
+  expression: yourTableName.Col("avg_order_1_day").Sum()
+```
+Output: 9000
+
+### Flatten
+`Flatten` flattens all values in a series. This is suitable for series of list type; for a non-list series, the result is the same as the original series.
+
+#### Input
+No input
+
+#### Output
+A new series whose values are flattened
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | nearby_restaurant_ids |
+|---------------|-----------------|
+| 1 | [2, 3, 4] |
+| 2 | [4, 5, 6] |
+| 3 | [7, 8, 9] |
+
+Standard Transformer Config:
+```
+variables:
+- name: restaurant_ids
+  expression: yourTableName.Col("nearby_restaurant_ids").Flatten()
+```
+Output:
+| restaurant_ids |
+|----------------|
+| 2 |
+| 3 |
+| 4 |
+| 4 |
+| 5 |
+| 6 |
+| 7 |
+| 8 |
+| 9 |
+
+### Unique
+`Unique` returns all values without duplication.
+#### Input
+No input
+
+#### Output
+A new series containing only the unique rows
+
+#### Examples
+
+Suppose users have table `yourTableName`
+
+| restaurant_id | rating |
+|---------------|-----------------|
+| 1 | [2, 2, 4] |
+| 2 | [4, 5, 4] |
+| 1 | [2, 2, 4] |
+
+Standard Transformer Config:
+```
+variables:
+- name: unique_restaurant_id
+  expression: yourTableName.Col("restaurant_id").Unique()
+```
+Output:
+| unique_restaurant_id |
+|----------------|
+| 1 |
+| 2 |
+
+```
+variables:
+- name: rating
+  expression: yourTableName.Col("rating").Unique()
+```
+Output:
+| rating |
+|--------|
+| [2, 2, 4] |
+| [4, 5, 4] |
\ No newline at end of file
diff --git a/docs/user/templates/model_deployment/transformer/standard_transformer/02_standard_transformer_upi.md b/docs/user/templates/model_deployment/transformer/standard_transformer/02_standard_transformer_upi.md
new file mode 100644
index 000000000..32c7322b6
--- /dev/null
+++ b/docs/user/templates/model_deployment/transformer/standard_transformer/02_standard_transformer_upi.md
@@ -0,0 +1,127 @@
+
+
+# Configuring Standard Transformer for UPI Model
+
+{% hint style="info" %}
+This guide assumes you have experience using the standard transformer and are familiar with the UPI contract. Refer to https://github.com/caraml-dev/universal-prediction-interface for details on the contract.
+{% endhint %}
+
+There are 2 key differences in the Standard Transformer when it is deployed using the UPI protocol:
+
+1. Autoload feature
+2. Separate output operations for pre-process and post-process
+
+## Autoload Feature
+
+The autoload feature is the primary mechanism for importing values from the request payload into the standard transformer as variables or tables. In HTTP mode, this is instead done using a JSONPath query.
+
+For example, in an HTTP model, to declare a rating variable that uses the value of the `user_rating` field of the incoming request below
+
+```json
+{
+  "user_id": 12345,
+  "user_rating": 4.9,
+  "user_name": "jon_doe"
+}
+```
+
+you have to declare the following configuration in the input configuration of the standard transformer, which extracts the data from the incoming request:
+
+```yaml
+  - variables:
+    - name: rating
+      jsonPathConfig:
+        jsonPath: $.user_rating
+        defaultValue: -1
+        valueType: FLOAT
+```
+
+The drawback of this approach is that it can become complicated for more complex request payloads and for a large number of variables/tables to be imported.
+
+You can avoid this altogether by using the autoload feature in UPI. To do so:
+
+#### Store the variable/table in the `prediction_table` or `transformer_input` field of the `PredictValuesRequest`
+
+For example, when using the Python SDK, you can do so with the following code.
+In the example below, we store `user_rating` as a variable and `customer_df` as `customer_table` in `transformer_input`, as well as sending `predict_df` as `prediction_table`.
+
+{% code title="upi_standard_transformer_deployment.py" overflow="wrap" lineNumbers="true" %}
+```python
+from caraml.upi.v1 import type_pb2, upi_pb2_grpc, upi_pb2, variable_pb2
+
+# df_to_table is a helper that converts a pandas DataFrame into a UPI Table.
+request = upi_pb2.PredictValuesRequest(
+    # ...
+    prediction_table=df_to_table(predict_df, "prediction_table"),
+    transformer_input=upi_pb2.TransformerInput(
+        variables=[
+            variable_pb2.Variable(name="user_rating",
+                                  type=type_pb2.TYPE_DOUBLE,
+                                  double_value=5.0),
+        ],
+        tables=[df_to_table(customer_df, "customer_table")]
+    ),
+    # ...
+)
+```
+{% endcode %}
+
+#### Add the autoload feature in the standard transformer config
+
+Declare all variables and tables that are going to be imported into the standard transformer.
+In the example below, we import `prediction_table`, `customer_table`, and `user_rating` as sent by the client.
+
+![UPI Autoloading](../../../../../images/upi_autoloading_config.png)
+
+This adds the following config:
+
+```yaml
+transformerConfig:
+  preprocess:
+    inputs:
+      - autoload:
+          tableNames:
+            - customer_table
+          variableNames:
+            - user_rating
+  postprocess: {}
+```
+
+Note that a table created using UPI autoload has an additional `row_id` column, which stores the `row_ids` value of the associated table.
+The tables and variables imported via UPI autoload can then be used for downstream transformations in the standard transformer's pre-process and post-process pipelines.
+
+## Preprocess & Postprocess Output
+
+In UPI mode, the standard transformer's pre-processing output must satisfy the `PredictValuesRequest` structure. You can populate the `prediction_table` field and the tables in the `transformer_input` field of the `PredictValuesRequest` that is sent to the model by defining their source tables. The source tables must be tables that have been declared in the pre-processing pipeline.
+
+The example below shows a pre-processing pipeline that joins `prediction_table` and `sample_table` to produce `preprocessed_table`, and then uses `preprocessed_table` as the `prediction_table` of the `PredictValuesRequest` sent to the model.
+
+![UPI Standard Transformer Preprocessing Output](../../../../../images/upi_preprocess_output.png)
+
+```yaml
+transformerConfig:
+  preprocess:
+    inputs:
+      - autoload:
+          tableNames:
+            - prediction_table
+          variableNames:
+            - sample_table
+    transformations:
+      - tableJoin:
+          leftTable: prediction_table
+          rightTable: sample_table
+          outputTable: preprocessed_table
+          how: LEFT
+          onColumns:
+            - row_id
+    outputs:
+      - upiPreprocessOutput:
+          predictionTableName: preprocessed_table
+          transformerInputTableNames: []
+  postprocess: {}
+```
+
+Similarly, the post-processing output in UPI mode must satisfy the `PredictValuesResponse` structure. You can populate the `prediction_result_table` of the `PredictValuesResponse` that is sent back to the client by defining its source table. The source table can be a table declared in either the pre-processing or the post-processing pipeline.
+
+{% hint style="info" %}
+When the pre-processing or post-processing pipeline is not defined, the standard transformer simply forwards the request/response to its receiver.
+{% endhint %}
diff --git a/docs/user/templates/model_types/01_custom_model.md b/docs/user/templates/model_types/01_custom_model.md
new file mode 100644
index 000000000..6d5fe4694
--- /dev/null
+++ b/docs/user/templates/model_types/01_custom_model.md
@@ -0,0 +1,86 @@
+
+
+# Custom Model
+A custom model enables users to deploy any Docker image that satisfies Merlin's requirements. Users are responsible for developing their own web service, and for building and publishing its Docker image, which can then be deployed through Merlin.
+
+Users should consider using a custom model in one of the following situations:
+* The model needs complex custom transformations (pre-process and post-process) and the user wants to use a language other than Python.
+* The model is non-standard, e.g. a heuristic or a model from an ML framework that has not been introduced in Merlin.
+* The model has dependencies on OS distribution packages.
+
+## Comparison With PyFunc Model
+
+At a high level, PyFunc and custom models are similar: both enable users to specify custom logic and dependencies. The differences are mostly in the level of flexibility and in performance.
+
+| Factor | Custom Model | Pyfunc Model |
+|--------|--------------|--------------|
+| Web Service | • Users can use any tech stack for the web service<br>• Users need to implement the whole web service | Uses a Python server; users only need to modify the core logic of prediction (the `infer` function) |
+| Dependency | Users can specify any dependencies that are required, whether OS distribution packages or libraries for a specific programming language | Users can only specify Python package dependencies |
+| Performance | Users have more control over the performance of the model, since there is no limitation on the tech stack that can be used | Users only have control over the `infer` function; performance is comparatively slow due to Python |
+
+## Web Service Implementation
+
+Users need to implement their own web service using any tech stack suitable for their use case. Currently, a web service can be deployed using the `HTTP_JSON` or `UPI_V1` protocol; each has different requirements that must be satisfied by the web server.
+
+### HTTP_JSON Custom Model
+Users can add artifacts (the model or anything else) in addition to the Docker image when uploading the model. During deployment, these artifacts are made available in the directory specified by the `CARAML_ARTIFACT_LOCATION` environment variable.
+
+The web service must open and listen on the port number given by the `CARAML_HTTP_PORT` environment variable.
+
+The web service MUST implement the following endpoints (a minimal sketch of a conforming service follows the table):
+| Endpoint | HTTP Method | Description|
+|--------- |-------------|------------|
+| `/v1/models/{model_name}:predict`| POST | Every inference or prediction call hits this endpoint. Merlin supplies the `CARAML_MODEL_FULL_NAME` environment variable, whose value can be used as `{model_name}` for this endpoint. |
+| `/v1/models/{model_name}` | GET | Used to check model healthiness. The model can serve traffic once this API returns a 200 status code. |
+| `/` | GET | Used as the server liveness check. Returns 200 if the server is healthy. |
+| `/metrics` | GET | Used by Prometheus to pull the metrics produced by the predictor. The implementation of this endpoint is typically handled by a Prometheus client library; for example, [this guide](https://prometheus.io/docs/guides/go-application/) shows how to implement it in Go. |
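+
+To make the contract concrete, below is a minimal sketch of a conforming `HTTP_JSON` web service. Flask and `prometheus_client` are illustrative choices, not Merlin requirements, and the dummy prediction logic is a placeholder:
+
+```python
+import os
+
+from flask import Flask, jsonify, request
+from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest
+
+app = Flask(__name__)
+MODEL_NAME = os.environ["CARAML_MODEL_FULL_NAME"]
+PREDICTIONS = Counter("prediction_requests_total", "Number of prediction requests")
+
+@app.route(f"/v1/models/{MODEL_NAME}:predict", methods=["POST"])
+def predict():
+    PREDICTIONS.inc()
+    payload = request.get_json()
+    # Placeholder inference; real artifacts live under CARAML_ARTIFACT_LOCATION.
+    return jsonify({"predictions": [0 for _ in payload.get("instances", [])]})
+
+@app.route(f"/v1/models/{MODEL_NAME}", methods=["GET"])
+def model_health():
+    return "", 200  # model is ready to serve
+
+@app.route("/", methods=["GET"])
+def liveness():
+    return "", 200  # server liveness
+
+@app.route("/metrics", methods=["GET"])
+def metrics():
+    return generate_latest(), 200, {"Content-Type": CONTENT_TYPE_LATEST}
+
+if __name__ == "__main__":
+    app.run(host="0.0.0.0", port=int(os.environ["CARAML_HTTP_PORT"]))
+```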
+
+### UPI_V1 Custom Model
+As with the `HTTP_JSON` custom model, users can add artifacts during model upload, and the uploaded artifacts will be available in the directory specified by the `CARAML_ARTIFACT_LOCATION` environment variable. The web server must implement the service defined in the [UPI interface](https://github.com/caraml-dev/universal-prediction-interface/blob/main/proto/caraml/upi/v1/upi.proto#L11), and it must open and listen on the port number given by the `CARAML_GRPC_PORT` environment variable.
+
+If users want to emit metrics from this web server, they need to create a metrics scrape REST endpoint. The challenge here is that Knative (the underlying Kubernetes deployment tool that Merlin uses) does not open multiple ports, so the REST endpoint must run on the same port as the gRPC server (the port number given by `CARAML_GRPC_PORT`). Not every programming language supports running multiple protocols (gRPC and HTTP in this case) on the same port; Go users can use [cmux](https://github.com/soheilhy/cmux) to solve this problem, otherwise users can push metrics to the [pushgateway](https://prometheus.io/docs/instrumenting/pushing/).
+
+### Environment Variables
+As mentioned in the previous sections, several environment variables are supplied by the Merlin control plane to the custom model. They are listed below:
+
+| Name | Description |
+|------|-------------|
+| STORAGE_URI | URI where the `model` artifacts are remotely stored |
+| CARAML_HTTP_PORT | Port that must be opened when the model is deployed with the `HTTP_JSON` protocol |
+| CARAML_GRPC_PORT | Port that must be opened when the model is deployed with the `UPI_V1` protocol |
+| CARAML_MODEL_NAME | Name of the Merlin model |
+| CARAML_MODEL_VERSION | Merlin model version |
+| CARAML_MODEL_FULL_NAME | Full name of the Merlin model; in the current version it uses the `{CARAML_MODEL_NAME}-{CARAML_MODEL_VERSION}` format |
+| CARAML_ARTIFACT_LOCATION | Local path where the model artifacts will be stored |
+
+## Docker Image
+
+The Docker image must contain the web service application and any dependencies that must be installed in order to run it. Users are responsible for building the Docker image as well as for publishing it. Please make sure the Kubernetes cluster (where the model will be deployed) has access to pull the Docker image.
+
+## Deployment
+
+Using the Merlin SDK:
+
+```python
+import merlin
+from merlin.protocol import Protocol
+from merlin.resource_request import ResourceRequest
+
+# min_replica, max_replica, cpu_request, memory_request
+resource_request = ResourceRequest(1, 1, "1", "1Gi")
+model_dir = "model_dir"
+
+with merlin.new_model_version() as v:
+    # Register the custom image and upload the artifacts in model_dir
+    v.log_custom_model(image="ghcr.io/yourcustommodelimage", model_dir=model_dir)
+
+endpoint = merlin.deploy(v, resource_request=resource_request, protocol=Protocol.HTTP_JSON)
+# endpoint = merlin.deploy(v, resource_request=resource_request, protocol=Protocol.UPI_V1) if using UPI
+```
+
+Most of the methods used in the snippet above are common to all model deployments, except for `log_custom_model`, which is used exclusively to upload a custom model. Its parameters are listed below:
+| Parameter | Description | Required |
+|-----------|-------------|----------|
+| `image` | Docker image that will be used as the predictor | Yes |
+| `model_dir` | Directory that will be uploaded to MLflow | No |
+| `command` | Command to run the Docker image | No |
+| `args` | Arguments for the command | No |
+
+### Deployment Flow
+
+* Create a new model version.
+* Log the custom model, specifying the image and the model directory that contains the artifacts to upload.
+* Deploy. There is no difference from other model deployments.
\ No newline at end of file
diff --git a/docs/user/transformer.md b/docs/user/transformer.md
deleted file mode 100644
index 9a2b7919d..000000000
--- a/docs/user/transformer.md
+++ /dev/null
@@ -1,9 +0,0 @@
-# Transformer
-
-In Merlin ecosystem, Transformer is a service deployed in front of the model service which users can use to perform pre-processing and post-processing steps into the incoming requests before being sent to the model service. The benefits of using Transformer are users can abstract the transformation logic outside of their model and write it in a language more performant than python.
-
-Currently, Merlin has two types of Transformer: Standard and Custom Transformer.
- -{% page-ref page="./standard_transformer.md" %} -{% page-ref page="./custom_transformer.md" %} - diff --git a/docs/user/values.json b/docs/user/values.json new file mode 100644 index 000000000..55b4f1a99 --- /dev/null +++ b/docs/user/values.json @@ -0,0 +1,4 @@ +{ + "merlin_url": "merlin.example.com", + "models_base_domain": "models.id.merlin.dev" +} \ No newline at end of file diff --git a/examples/batch/Batch Prediction Tutorial 1 - Iris Classifier.ipynb b/examples/batch/BatchPredictionTutorial1-IrisClassifier.ipynb similarity index 100% rename from examples/batch/Batch Prediction Tutorial 1 - Iris Classifier.ipynb rename to examples/batch/BatchPredictionTutorial1-IrisClassifier.ipynb diff --git a/examples/batch/Batch Prediction Tutorial 2 - New York Taxi .ipynb b/examples/batch/BatchPredictionTutorial2-NewYorkTaxi .ipynb similarity index 100% rename from examples/batch/Batch Prediction Tutorial 2 - New York Taxi .ipynb rename to examples/batch/BatchPredictionTutorial2-NewYorkTaxi .ipynb diff --git a/examples/model-endpoint/Model Endpoint.ipynb b/examples/model-endpoint/ModelEndpoint.ipynb similarity index 100% rename from examples/model-endpoint/Model Endpoint.ipynb rename to examples/model-endpoint/ModelEndpoint.ipynb diff --git a/examples/transformer/feast-enricher-transformer/Feast Enricher.ipynb b/examples/transformer/feast-enricher-transformer/Feast-Enricher.ipynb similarity index 100% rename from examples/transformer/feast-enricher-transformer/Feast Enricher.ipynb rename to examples/transformer/feast-enricher-transformer/Feast-Enricher.ipynb diff --git a/examples/transformer/standard-transformer/Standard Transformer.ipynb b/examples/transformer/standard-transformer/Standard-Transformer.ipynb similarity index 100% rename from examples/transformer/standard-transformer/Standard Transformer.ipynb rename to examples/transformer/standard-transformer/Standard-Transformer.ipynb