This project contains resources to showcase a full-circle, continuous data flow: capturing training data, training new ML models, deploying and serving them, and exposing the service for clients to send inference requests.
> [!WARNING]
> This project is now deprecated.
> Please use instead the following repository, which is an improved version of this demo and includes deployment scripts and detailed documentation:
RHODS artifacts are not YAML-editable; they require UI interaction.
Although the procedure below is tedious and time-consuming, by the end of it you will understand how the full cycle connects all the stages together (acquisition, training, delivery, inferencing).
- RHODS 2.5.0 provided by Red Hat
- RHO Pipelines 1.10.4 provided by Red Hat
- AMQ-Streams 2.6.0-0 provided by Red Hat
- AMQ Broker 7.11.4 provided by Red Hat
- Red Hat build of Apache Camel 4
- Camel K 1.10 provided by Red Hat
The following list summarises the steps to deploy the demo:
- Provision a RHODS environment.
- Create and prepare a RHODS project.
- Create and run the AI/ML Pipeline.
- Deliver the AI/ML model and run the ML server.
- Create a trigger for the Pipeline.
- Deploy the data ingestion system.
- Test the end-to-end solution.
- Provision the following RHDP item:
- Log in using your environment credentials.
- Deploy an instance of Minio:
  - Create a new project, named `central`.
  - Under the `central` project, deploy the following YAML resource:
    - deployment/central/minio.yaml
- Create the necessary S3 buckets (a CLI alternative is sketched after this list):
  - Open the Minio UI (two routes are created: use the UI route).
  - Log in with `minio`/`minio123`.
  - Create buckets for RHODS:
    - workbench
  - Create buckets for Edge-1:
    - edge1-data
    - edge1-models
    - edge1-ready
  - [OPTIONAL] Create buckets for Edge-2 (not needed for the standard demo):
    - edge2-data
    - edge2-models
    - edge2-ready
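  If you prefer the command line, the same buckets can be created with the Minio client; a sketch, assuming an alias named `central` pointing at your Minio API route (adjust the URL to your environment):

  ```sh
  # Assumed alias name and endpoint URL; adjust to your Minio route
  mc alias set central https://<your-minio-api-route> minio minio123
  mc mb central/workbench
  mc mb central/edge1-data central/edge1-models central/edge1-ready
  ```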
- Create a new Data Science Project.
  Open Red Hat OpenShift AI (also known as RHODS).
  Log in using your environment credentials.
  Select Data Science Projects and click `Create data science project`.
  As a name, use for example `tf` (TensorFlow).
- Create a new Data Connection.
  Under the new `tf` project > Data connections, click `Add data connection`.
  Enter the following parameters:
  - Name: `dc1` (data connection 1)
  - Access key: `minio`
  - Secret key: `minio123`
  - Endpoint: `http://minio-service.central.svc:9000`
  - Region: `eu-west-2`
  - Bucket: `workbench`
- Create a Pipeline Server.
  Under the new `tf` project > Pipelines, click `Create a pipeline server`.
  Enter the following parameters:
  - Existing data connection: `dc1`
  Then click `Configure` to proceed.
- Create a `PersistentVolumeClaim` for the pipeline.
  The PVC enables shared storage for the pipeline's execution.
  Deploy the following YAML resource (a sketch of its likely shape follows):
  - deployment/pipeline/pvc.yaml
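  For orientation, a minimal sketch of what such a PVC might look like; the name, size, and access mode here are assumptions, and the repository file is authoritative:

  ```yaml
  # Sketch only: name, size, and access mode are assumptions
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: pipeline-pvc
    namespace: tf
  spec:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 5Gi
  ```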
- Create a new Workbench.
  Under the new `tf` project > Workbenches, click `Create workbench`.
  Enter the following parameters:
  - Name: `wb1` (workbench 1)
  - Image selection: `TensorFlow`
  - Container Size: `medium`
  - Create new persistent storage
    - Name: `wb1`
    - Persistent storage size: (leave default)
  - Use a data connection
    - Use existing data connection
      - Data connection: `dc1`
  Then click `Create workbench`.
- Open the workbench (Jupyter).
  When your workbench is in `Running` status, click `Open`.
  Log in using your environment credentials.
- Upload the pipeline sources to the project tree.
  > [!CAUTION]
  > Do not use the 'Git Clone' feature to upload the project; you don't need to upload the big dataset of images!

  Under the Jupyter menu, click the 'Upload Files' icon and select the sources listed below.
  To show the entire modelling process:
  - workbench/clean-01.ipynb
  To show the process segmented in pipeline steps:
  - workbench/pipeline/step-01.ipynb
  - workbench/pipeline/step-02.ipynb
  - workbench/pipeline/step-03.ipynb
  To show the Elyra pipeline definition:
  - workbench/pipeline/retrain.pipeline
- Export the pipeline as a Tekton YAML file.
  > [!TIP]
  > Reference to documented guidelines:

  - Double-click on the `retrain.pipeline` resource. The pipeline will be displayed in Elyra (the visual pipeline editor embedded in Jupyter).
  - Hover over and click on the icon labelled `Export Pipeline`.
  - Enter the following parameters:
    - s3endpoint: `http://minio-service.central.svc:9000`
    - Leave all other parameters with default values.
  - Click `OK`.

  a. The action will produce a new file, `retrain.yaml`.
  b. It will also populate your S3 bucket `workbench` with your pipeline's artifacts.
- Import the pipeline as an OpenShift Tekton pipeline.
  From your OpenShift UI Console, navigate to Pipelines > Pipelines.
  > [!TIP]
  > Reference to documented guidelines:

  Ensure you're working under the `tf` project (namespace).
  Click `Create > Pipeline`.
  Use the following snippet:

  ```yaml
  apiVersion: tekton.dev/v1beta1
  kind: Pipeline
  metadata:
    name: train-model
    namespace: tf
  spec:
    [Copy paste here contents under 'pipelineSpec']
  ```

  Complete the YAML code above with the `pipelineSpec` definition (around line 51) from your exported YAML file in Jupyter (`retrain.yaml`).
  > [!CAUTION]
  > Make sure you un-indent the `pipelineSpec` definition one level to make the resource valid.

  Click `Create`.
  You can test the pipeline by clicking `Action > Start`, accept the default values and click `Start` (a CLI alternative is sketched below).
  You should see the pipeline FAIL because there is no trainable data available just yet.
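  If you have the Tekton CLI installed, a sketch of the equivalent test from a terminal (the `tkn` client is not part of the demo and is assumed to be available):

  ```sh
  # Start the pipeline with default parameter values, then follow the logs
  tkn pipeline start train-model -n tf --use-param-defaults
  tkn pipelinerun logs -n tf --last -f
  ```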
- Upload training data to S3.
  There are two options to upload training data (a CLI sketch follows the list):
  - Manually (recommended): Use Minio's UI console to upload the images (training data):
    - From the project's folder:
      - dataset/images
    - To the root of the S3 bucket: `edge1-data`
      (wait for all images to be fully uploaded)
  - Automatically: Use the Camel server provided in the repository to push training data to S3. Follow the instructions under:
    - camel/central-feeder/Readme.txt
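  As an unofficial CLI alternative to the Minio UI, the same upload can be scripted with the Minio client, assuming the `central` alias from the earlier sketch:

  ```sh
  # Recursively copy the training images into the edge1-data bucket
  mc cp --recursive dataset/images/ central/edge1-data/
  ```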
- Train the model.
  When ALL images have been uploaded, re-run the pipeline by clicking `Action > Start`, accept the default values and click `Start`.
  You should now see the pipeline succeed. It will push the new model to the following buckets:
  - edge1-models
  - edge1-ready
- Create a new OpenShift project `edge1`.
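  If you prefer the CLI, the equivalent command is:

  ```sh
  oc new-project edge1
  ```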
- Deploy an AMQ Broker.
  AMQ is used to enable MQTT connectivity with edge devices and to manage monitoring events.
  - Install the AMQ Broker Operator:
    - AMQ Broker for RHEL 8 (Multiarch)
    Install it in the `edge1` namespace (specific), NOT cluster-wide.
  - Create a new ActiveMQ Artemis (AMQ broker instance).
    Use the YAML defined under:
    - deployment/edge/amq-broker.yaml
  - Create a route to enable external MQTT communication (demo Mobile App):

    ```sh
    oc create route edge broker-amq-mqtt --service broker-amq-mqtt-0-svc
    ```
- Deploy a Minio instance on the (near) edge.
  - In the `edge1` namespace, use the following YAML resource to create the Minio instance:
    - deployment/edge/minio.yaml
  - In the new Minio instance, create the following buckets:
    - production (live AI/ML models)
    - data (training data)
    - valid (data from valid inferences)
    - unclassified (data from invalid inferences)
- Create a local service to access the `central` S3 storage with Service Interconnect.
  Follow the instructions below:
  - Install Service Interconnect's CLI
    (you can use an embedded terminal from the OCP console):

    ```sh
    curl https://skupper.io/install.sh | sh
    export PATH="/home/user/.local/bin:$PATH"
    ```

  - Initialize SI in `central` and create a connection token:

    ```sh
    oc project central
    skupper init --enable-console --enable-flow-collector --console-auth unsecured
    skupper token create edge_to_central.token
    ```

  - Initialize SI in `edge1` and create the connection using the token we created earlier:

    ```sh
    oc project edge1
    skupper init
    skupper link create edge_to_central.token --name edge-to-central
    ```

  - Expose the S3 storage service (Minio) from `central` on SI's network using annotations:

    ```sh
    oc project central
    kubectl annotate service minio-service skupper.io/proxy=http skupper.io/address=minio-central
    ```

  - Test the SI service.
    You can test the service from `edge1` with a Route:

    ```sh
    oc project edge1
    oc create route edge --service=minio-central --port=port9090
    ```

    Try opening (central) Minio's console using the newly created route `minio-central`. Make sure the buckets you see are the ones from `central`.
    You can delete the route after validating the service is healthy.
- Deploy the Edge Manager.
  Deploy it in the new `edge1` namespace.
  Follow instructions under:
  - camel/edge-manager/Readme.txt
  The Edge Manager moves available models from the `edge1-ready` bucket (central) to `production` (edge1).
  When the pod starts, you will see the model available in `production`.
- Deploy the TensorFlow server.
  Under the `edge1` project, deploy the following YAML resource:
  - deployment/edge/tensorflow.yaml
  The server will pick up the newly trained model from the `production` S3 bucket.
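  For context, TensorFlow Serving typically reads S3-backed models through environment variables like the ones sketched below; the exact wiring lives in deployment/edge/tensorflow.yaml, and every value here is an illustrative assumption:

  ```yaml
  # Illustrative container snippet only; consult deployment/edge/tensorflow.yaml for the real values
  containers:
    - name: tensorflow-serving
      image: tensorflow/serving
      args:
        - --model_name=tea_model_b64
        - --model_base_path=s3://production/models/tea_model_b64
      env:
        - name: AWS_ACCESS_KEY_ID
          value: minio
        - name: AWS_SECRET_ACCESS_KEY
          value: minio123
        - name: S3_ENDPOINT
          value: minio-service.edge1.svc:9000
        - name: S3_USE_HTTPS
          value: "0"
  ```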
- Run an inference request.
  To test that the Model server works, follow the instructions below.
  - From a terminal window, change directory to the client folder:

    ```sh
    cd client
    ```

  - Edit the `infer.sh` script and configure the `server` URL with your TensorFlow server's route.
  - Run the script:

    ```sh
    ./infer.sh
    ```

    The output should show something similar to:

    ```
    "predictions": ["tea-green", "0.838234"]
    ```
- Create a Pipeline trigger.
  The next stage makes the pipeline triggerable. The goal is to enable the platform to train new models automatically when new training data becomes available.
  Follow the steps below to create the trigger (an EventListener sketch follows the list).
  To provision the YAML resources below, make sure you switch to the `tf` project where your pipeline was created.
  - Deploy the following YAML resource:
    - deployment/pipeline/trigger-template.yaml
  - Deploy the following YAML resource:
    - deployment/pipeline/trigger-binding.yaml
  - Deploy the following YAML resource:
    - deployment/pipeline/event-listener.yaml
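  For orientation, an EventListener tying the three resources together typically looks like the sketch below; the resource names here are assumptions and the repository YAMLs are authoritative, although the listener name must match the `el-train-model-listener` service used in the next step:

  ```yaml
  # Sketch only: names and service account are assumptions
  apiVersion: triggers.tekton.dev/v1beta1
  kind: EventListener
  metadata:
    name: train-model-listener
    namespace: tf
  spec:
    serviceAccountName: pipeline
    triggers:
      - name: train-model-trigger
        bindings:
          - ref: train-model-binding
        template:
          ref: train-model-template
  ```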
- Trigger the pipeline.
  To manually test the pipeline trigger, from OpenShift's UI console, open a terminal by clicking the `>_` icon in the upper-right corner of the screen.
  Copy/paste and execute the following `curl` command:

  ```sh
  curl -v \
    -H 'Content-Type: application/json' \
    -d '{"id-edge":"edge1"}' \
    http://el-train-model-listener.tf.svc:8080
  ```

  The output of the command above should show the status response:

  ```
  HTTP/1.1 202 Accepted
  ```

  Switch to the Pipelines view to inspect whether a new pipeline execution has started.
  a. When the pipeline succeeds, a new model version will show up in the `edge1-models` S3 bucket.
  b. The pipeline also pushes the new model to the `edge1-ready` bucket. The Edge Manager moves the model to the Edge Minio instance, into the `production` bucket. The Model server will detect the new version and hot-reload it.
- Deploy a Kafka cluster.
  The platform uses Kafka to produce/consume the events that trigger the pipeline automatically.
  - Install the AMQ Streams operator in the `central` namespace.
  - Deploy a Kafka cluster in the `central` namespace using the following YAML resource:
    - deployment/central/kafka.yaml
    Wait for the cluster to fully deploy.
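  For orientation, that file likely contains an AMQ Streams (Strimzi) `Kafka` resource along these lines; the cluster name, sizing, and storage are assumptions:

  ```yaml
  # Sketch only: name, replica counts, and storage are assumptions
  apiVersion: kafka.strimzi.io/v1beta2
  kind: Kafka
  metadata:
    name: my-cluster
    namespace: central
  spec:
    kafka:
      replicas: 3
      listeners:
        - name: plain
          port: 9092
          type: internal
          tls: false
      storage:
        type: ephemeral
    zookeeper:
      replicas: 3
      storage:
        type: ephemeral
    entityOperator: {}
  ```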
- Deploy the Camel delivery system.
  This Camel system is responsible for listening for the Kafka signals that trigger pipeline executions (a route sketch follows this step).
  Follow instructions under:
  - camel/central-delivery/Readme.txt
  When successfully deployed, Camel should connect to Kafka and create a Kafka topic `trigger`. Check in your environment that Camel started correctly and the Kafka topic exists.
  > [!CAUTION]
  > You might need to wait a bit until the `trigger` topic gets created; be patient.

  A Camel service deployed on Central will be listening for requests to ingest training data.
  Upon receiving data ingestion requests, Camel will:
  - Unpack the data and push it to central S3 storage.
  - Send a signal via Kafka to kick off the process of training a new AI/ML model.
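  Conceptually, the delivery side is a small Camel route bridging the `trigger` topic to the pipeline's EventListener; a minimal XML DSL sketch, where the broker address and endpoint URL are assumptions reflecting earlier steps:

  ```xml
  <!-- Sketch only: broker address and listener URL are assumptions -->
  <routes xmlns="http://camel.apache.org/schema/spring">
    <route id="delivery-trigger">
      <!-- Consume trigger signals from Kafka -->
      <from uri="kafka:trigger?brokers=my-cluster-kafka-bootstrap.central.svc:9092"/>
      <log message="Trigger signal received: ${body}"/>
      <!-- Forward the payload to the Tekton EventListener -->
      <to uri="http://el-train-model-listener.tf.svc:8080"/>
    </route>
  </routes>
  ```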
- Deploy the Feeder.
  To deploy the system on OpenShift, follow instructions under:
  - camel/central-feeder/Readme.txt
  Check in your environment that Camel has started and is in a healthy state.
  - Expose the Feeder service to the Service Interconnect network to allow `edge1` to have visibility:

    ```sh
    oc project central
    kubectl annotate service feeder skupper.io/proxy=http
    ```

  - (for testing purposes) Expose the `feeder` service (in `edge1`) by executing the command below:

    ```sh
    oc expose service feeder -n edge1
    ```
This final test validates that all the platform stages are healthy. We should see the following processes in motion:
- A client sends training data for a new product.
- The feeder system (Camel) ingests the data, stores it in S3, and sends a trigger signal.
- The delivery system (Camel) receives the signal and triggers the Pipeline.
- The Pipeline trains a new model and pushes it to S3 storage.
- The edge manager (Camel) detects the new model and moves it to local S3 storage.
- The edge ML Server (TensorFlow) detects the new model and hot-deploys it.
- The platform has now evolved and is capable of detecting the new product.
Procedure:
- Check the current edge model version in `production`.
  The `edge1` Minio S3 bucket should show model version `2` under:
  - production/models/tea_model_b64
- Push training data.
  From the `central-feeder` project, execute in your terminal the following `curl` command:
  > [!CAUTION]
  > If the ZIP file is big, be patient.

  ```sh
  ROUTE=$(oc get routes -n edge1 -o jsonpath={.items[?(@.metadata.name==\'feeder\')].spec.host}) && \
  curl -v -T data.zip http://$ROUTE/zip?edgeId=edge1
  ```
- When the upload completes, you should see that a new pipeline execution has started.
- When the pipeline execution completes, you should see a new version `3` deployed under:
  - production/models/tea_model_b64
- Test the new model.
  Send a new inference request against the ML Server.
  Under the project's `client` folder, execute the script:

  ```sh
  ./infer.sh
  ```
The App connects edge devices to the platform and integrates with the various systems.
It includes an interface capable of:
- Getting price tags for products (inferencing)
- Sending training data (data ingestion)
- Monitoring platform activity
Some components are Camel K based.
- Install the Camel K Operator (cluster-wide):
  - Red Hat Integration - Camel K 1.10.5 provided by Red Hat
Under the `edge1` namespace, perform the following actions:
- Deploy the Price Engine (Catalogue).
  The price engine is based on Camel K.
  From the folder:
  - camel/edge-shopper/camel-price
  First, create a ConfigMap containing the catalogue
  (make sure you're working on the `edge1` namespace):

  ```sh
  oc create cm catalogue --from-file=catalogue.json -n edge1
  ```

  Then, run the `kamel` CLI command:

  ```sh
  kamel run price-engine.xml \
    --resource configmap:catalogue@/deployments/config
  ```
- Deploy the Edge Monitor.
  Deploy it in the new `edge1` namespace.
  Follow instructions under:
  - camel/edge-monitor/Readme.txt
  The Edge Monitor bridges monitoring events from Kafka to MQTT.
- Deploy the Edge Shopper (Intelligent App).
  Deploy it in the new `edge1` namespace.
  Follow instructions under:
  - camel/edge-shopper/Readme.txt
  The Edge Shopper allows for inferencing/data-acquisition/monitoring from a web-based app the user can operate.
- Create a route to enable external connectivity:

  ```sh
  oc create route edge camel-edge --service shopper
  ```

  Use the route URL to connect from a browser.