The healthcare domain generates a large amount of real-time data that must be monitored to produce predictions and alerts for healthcare professionals. Monitoring such data manually is difficult, so in this code pattern we add AI-based predictions and automate the monitoring of healthcare data. To demonstrate IBM Cloud Pak for Data technology on AWS Cloud, we use the example of predicting cardiac events based on real-time monitoring of patient health data.
In this code pattern, you will learn to build a machine learning model with no code on IBM Cloud Pak for Data, create a streaming flow on AWS Cloud and invoke the model to get predictions in real-time.
IBM Services used in the code pattern:
- IBM Cloud Pak for Data
- Watson Studio
- Watson SPSS Modeler
- Watson Machine Learning
AWS Services used in the code pattern:
- AWS IAM Roles
- Amazon Kinesis
- AWS Lambda Functions
- Amazon CloudWatch
- Amazon S3
Anyone using AWS services will be able to seamlessly plug the IBM Cloud Pak for Data Watson Machine Learning model into their flow.
Once you complete the code pattern, you will learn to:
- Create an S3 bucket on AWS.
- Create an event notification for the S3 bucket to trigger functions when data is added to the bucket.
- Create IAM Roles for AWS services.
- Create a Lambda producer function to encode the data from the S3 bucket and send it to Amazon Kinesis.
- Create a Machine Learning model using SPSS Modeler on IBM Cloud Pak for Data.
- Deploy the Machine Learning model on IBM Cloud Pak for Data Watson Machine Learning and get the APIs to invoke the model.
- Create a Lambda consumer function to decode the streaming data from Amazon Kinesis and send it to the model to get predictions.
- View the real-time predictions from IBM Cloud Pak for Data Watson Machine Learning in Amazon CloudWatch.
- Healthcare data is dumped into an S3 bucket on AWS.
- A producer Lambda function is triggered to encode the data and stream it to Amazon Kinesis.
- A machine learning model is trained in IBM Cloud Pak for Data Watson Studio using SPSS Modeler and the model is deployed in Watson Machine Learning.
- A consumer Lambda function reads the data from the Kinesis stream.
- The consumer function invokes the model in Watson Machine Learning with the data received from the Kinesis stream.
- The data streamed from Kinesis, along with the predictions received from Watson Machine Learning, is then visualized in Amazon CloudWatch.
- Create an S3 bucket
- Create a Kinesis Stream
- Create an IAM Role for AWS services
- Create Producer Lambda Function
- Create an event notification for the S3 bucket
- Build and Deploy Watson Machine Learning model
- Create Consumer Lambda Function
- Upload data to S3 bucket
- View Logs in CloudWatch
Create an S3 bucket in AWS by referring to the AWS documentation.
- Creating a bucket
- Click on Create Bucket and enter a name for the bucket (for example, 'cpdbucket').
- Keep the `Block all public access` setting enabled and then click on the `Create bucket` button.
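If you prefer to script this step instead of using the console, the following is a minimal sketch using the AWS SDK for JavaScript (v2), the same SDK used by the Lambda functions later in this pattern. The bucket name `cpdbucket` and the region are just the example values from this step; adjust them to your setup.

```js
// Minimal sketch: create the bucket and keep "Block all public access" enabled.
// Assumes AWS credentials are already configured for the AWS SDK (v2).
const AWS = require('aws-sdk');

const region = 'us-east-1';        // example region; use your own
const bucketName = 'cpdbucket';    // example bucket name from this step

const s3 = new AWS.S3({ region });

const createBucket = async () => {
  const params = { Bucket: bucketName };
  // Outside us-east-1, S3 requires an explicit location constraint.
  if (region !== 'us-east-1') {
    params.CreateBucketConfiguration = { LocationConstraint: region };
  }
  await s3.createBucket(params).promise();

  // Keep all public access blocked, matching the console setting used above.
  await s3.putPublicAccessBlock({
    Bucket: bucketName,
    PublicAccessBlockConfiguration: {
      BlockPublicAcls: true,
      IgnorePublicAcls: true,
      BlockPublicPolicy: true,
      RestrictPublicBuckets: true
    }
  }).promise();

  console.log(`Bucket ${bucketName} created with public access blocked`);
};

createBucket().catch(console.error);
```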
Create a Kinesis stream in AWS by following the steps:
- Sign in to the AWS Management Console and open the Kinesis console at https://console.aws.amazon.com/kinesis
- Click on Data Streams in the navigation pane.
- Click on Create data stream.
- Enter a name for your stream (for example, `myDataStream`).
- Enter `2` for the number of shards.
- Click on Create data stream.
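Equivalently, the stream can be created with the AWS SDK for JavaScript (v2). A minimal sketch, assuming your credentials and region are configured, is shown below; the stream name and shard count mirror the example values above.

```js
// Minimal sketch: create the Kinesis data stream used in this pattern.
const AWS = require('aws-sdk');

const kinesis = new AWS.Kinesis({ region: 'us-east-1' }); // example region

kinesis.createStream({
  StreamName: 'myDataStream',   // example name from this step
  ShardCount: 2                 // matches the shard count entered above
}).promise()
  .then(() => console.log('Stream creation started; it becomes ACTIVE shortly'))
  .catch(console.error);
```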
- Create an IAM Role.
- Select Trusted entity type as `AWS service`.
- Select Use case as `Lambda` and click Next.
- Add the following permission policies:
  - `CloudWatchFullAccess`
  - `AmazonKinesisFullAccess`
  - `AmazonS3FullAccess`
- Click on Next.
- Enter a role name and click on Create role.

Make a note of the role name as it will be used in subsequent steps.
For more information, see Creating a role to delegate permissions to an AWS service in the AWS documentation.
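For reference, the same role can also be created programmatically. The sketch below assumes the AWS SDK for JavaScript (v2) and a hypothetical role name of your choosing; it trusts the Lambda service and attaches the three managed policies listed above.

```js
// Minimal sketch: create the Lambda execution role and attach the managed policies.
const AWS = require('aws-sdk');

const iam = new AWS.IAM();
const roleName = 'cpd-streaming-lambda-role';  // hypothetical role name; use your own

// Trust policy that lets AWS Lambda assume this role.
const trustPolicy = {
  Version: '2012-10-17',
  Statement: [{
    Effect: 'Allow',
    Principal: { Service: 'lambda.amazonaws.com' },
    Action: 'sts:AssumeRole'
  }]
};

const managedPolicies = [
  'arn:aws:iam::aws:policy/CloudWatchFullAccess',
  'arn:aws:iam::aws:policy/AmazonKinesisFullAccess',
  'arn:aws:iam::aws:policy/AmazonS3FullAccess'
];

const createRole = async () => {
  await iam.createRole({
    RoleName: roleName,
    AssumeRolePolicyDocument: JSON.stringify(trustPolicy)
  }).promise();

  for (const policyArn of managedPolicies) {
    await iam.attachRolePolicy({ RoleName: roleName, PolicyArn: policyArn }).promise();
  }

  console.log(`Role ${roleName} created with the required policies attached`);
};

createRole().catch(console.error);
```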
Create a Lambda function in AWS by following these steps:
- Open the Functions page on the Lambda console.
- Click on Create function.
- Under Basic information, do the following:
  - For Function name, enter `Producer`.
  - For Runtime, confirm that `Node.js 14.x` is selected.
  - Expand Change default execution role and select Use an existing role.
  - Select the role name created in Step 3.
- Click on Create function.
In the Code source editor, replace the existing code with the code below.
const AWS = require('aws-sdk');
const KinesisStreamsName = '<KinesisStreamsName>';
const Region = '<Region>'; // Example: us-east-1, ap-south-1, etc.
AWS.config.update({
region: Region
})
const s3 = new AWS.S3();
const kinesis = new AWS.Kinesis();
// Lambda handler: triggered by the S3 event notification, reads the uploaded object from S3
exports.handler = async (event) => {
console.log(JSON.stringify(event));
const bucketName = event.Records[0].s3.bucket.name;
const keyName = event.Records[0].s3.object.key;
const params = {
Bucket: bucketName,
Key: keyName
}
await s3.getObject(params).promise().then(async (data) => {
const dataString = data.Body.toString();
const payload = {
data: dataString
}
await sendToKinesis(payload, keyName);
}, error => {
console.log(error);
})
};
// Wrap the file contents in a payload and put it on the Kinesis stream
const sendToKinesis = async (payload, partitionKey) => {
const params = {
Data: JSON.stringify(payload),
PartitionKey: partitionKey,
StreamName: KinesisStreamsName
}
await kinesis.putRecord(params).promise().then(response => {
console.log(response);
}, error => {
console.log(error);
})
};
In line 2, replace `<KinesisStreamsName>` with the name of the Kinesis stream you created. In line 3, replace `<Region>` with the region where the Kinesis stream was created.
Click on Deploy.
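To sanity-check the producer before wiring up the S3 trigger, you can invoke it from the Lambda console's Test tab with an event shaped like a real S3 notification. The sketch below is a minimal example: only the fields the handler actually reads (`bucket.name` and `object.key`) are populated, and the names are the example values used elsewhere in this pattern.

```js
// Minimal S3-style test event for the Producer function.
// Only the fields read by the handler are populated.
const testEvent = {
  Records: [
    {
      s3: {
        bucket: { name: 'cpdbucket' },     // example bucket from Step 1
        object: { key: 'test-file.csv' }   // a CSV object that exists in the bucket
      }
    }
  ]
};

console.log(JSON.stringify(testEvent, null, 2)); // paste this JSON into the Lambda Test tab
```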
Create an event notification for the S3 bucket. This allows the producer Lambda function to be triggered when data is uploaded to the bucket.
To enable and configure event notifications for an S3 bucket, follow these steps:
- Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/
- In the Buckets list, choose the name of the bucket that you created.
- Click on Properties.
- Navigate to the Event Notifications section and click on Create event notification.
- In the General configuration section, specify the event name as Data-upload. Specify the suffix as `.csv` to limit the notifications to objects with keys ending in the specified characters.
  Note: If you don't enter a name, a globally unique identifier (GUID) is generated and used for the name.
- In the Event types section, select All object create events.
  Note: Before you can publish event notifications, you must grant the Amazon S3 principal the necessary permissions to call the relevant API to publish notifications to a Lambda function, SNS topic, or SQS queue.
- Select the destination type as Lambda Function. Specify the Lambda function name as `Producer`.
- Click on Save changes, and Amazon S3 sends a test message to the event notification destination.
For more information, see Supported event destinations in the Amazon Simple Storage Service Developer Guide.
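If you want to script the notification instead of using the console, a minimal sketch with the AWS SDK for JavaScript (v2) follows. It assumes the example bucket and function names used earlier and a placeholder account ID in the function ARN; the `addPermission` call grants Amazon S3 the right to invoke the function, which the console otherwise configures for you.

```js
// Minimal sketch: allow S3 to invoke the Producer function, then register the
// .csv object-created notification on the bucket.
const AWS = require('aws-sdk');

const region = 'us-east-1';                 // example region
const bucketName = 'cpdbucket';             // example bucket from Step 1
const functionName = 'Producer';            // Lambda function from Step 4
const functionArn =                         // replace with your function's ARN
  'arn:aws:lambda:us-east-1:<account-id>:function:Producer';

const s3 = new AWS.S3({ region });
const lambda = new AWS.Lambda({ region });

const configureNotification = async () => {
  // Resource-based permission so the S3 bucket can trigger the function.
  await lambda.addPermission({
    FunctionName: functionName,
    StatementId: 's3-invoke-producer',
    Action: 'lambda:InvokeFunction',
    Principal: 's3.amazonaws.com',
    SourceArn: `arn:aws:s3:::${bucketName}`
  }).promise();

  // Notify the function for every created object whose key ends in .csv.
  await s3.putBucketNotificationConfiguration({
    Bucket: bucketName,
    NotificationConfiguration: {
      LambdaFunctionConfigurations: [{
        Id: 'Data-upload',
        LambdaFunctionArn: functionArn,
        Events: ['s3:ObjectCreated:*'],
        Filter: { Key: { FilterRules: [{ Name: 'suffix', Value: '.csv' }] } }
      }]
    }
  }).promise();

  console.log('Event notification configured');
};

configureNotification().catch(console.error);
```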
Access the Cloud Pak for Data 4.x dashboard on AWS.
Note: If you don't have Cloud Pak for Data deployed on AWS, you can refer to the Cloud Pak for Data 4.x on AWS deployment guide.
Make sure you have enabled the following services in Cloud Pak for Data:
- Watson Studio
- Watson Machine Learning
Guided ML is a tool that helps you build machine learning models. It is part of Watson Studio in Cloud Pak for Data 4.x.
Follow these steps to build a machine learning model using Guided ML:
- Create a project in Watson Studio.
  - From the left navigation pane, click on Projects.
  - Click on All projects.
  - Click on New project.
  - Select the project type as Analytics project.
  - Click on Create an empty project.
  - Enter a name and description for the project.
  - Click on Create.
- Download and unzip the dataset.
  - We are using the Heart Attack Analysis & Prediction Dataset from Kaggle.
  - Download the dataset and unzip it.
  - We will be working with the `heart.csv` and `o2Saturation.csv` files.
- Add the datasets to the project.
- Create an SPSS Modeler Flow.
- Import the datasets in the Modeler Flow.
  - Double-click on the `heart.csv` node in the Import datasets block and select the `heart.csv` file.
  - Similarly, double-click on the `o2Saturation.csv` node in the Import datasets block and select the `o2Saturation.csv` file.
- Once the dataset is loaded, you need to perform 4 steps to build the model.
- Run the `heart_o2_merged.csv` node to merge both the `heart.csv` and `o2Saturation.csv` datasets.
- This will merge the two datasets and create a new dataset called `heart_o2_merged.csv`.
  Note: You can view the dataset by selecting the Assets tab within the project.
- Load the merged dataset into the SPSS Modeler Flow.
- Once the dataset is ingested in the SPSS Modeler Flow, 60% of the data is split into training data named `X_Train` and `Y_Train`. The remaining 40% of the data is split into testing data named `X_Test` and `Y_Test`.
- `X_Train` and `X_Test` are the input features, and `Y_Train` and `Y_Test` are the target variables for the training and testing data.
- Run the `X_Test` node to export the input features for testing and evaluating the model.
- Finally, build the model.
  - Click on the `Build Model` node in the Model Building block in the SPSS Modeler Flow. Click on the three dots icon and select Run.
    Note: This process will take some time to complete.
- Once the model is built, you can evaluate the different models, their accuracy, and other parameters.
  - Click on the `Output` node in the Output block in the SPSS Modeler Flow. Click on the three dots icon and select View Model.
  - You will see a set of one or more models generated by the Auto Classifier node, each of which can be individually viewed or selected for use in scoring.
  - For the Heart Attack Analysis & Prediction Dataset, the Auto Classifier generates 5 models. Each model can be viewed individually by selecting it:
    - C5
    - Logistic regression
    - Bayesian Network
    - Linear SVM
    - Random Trees
- Additionally, you can check the target variable prediction with confidence scores.
  - Double-click on the `X_Test.csv` node in the Data Preparation block in the SPSS Modeler Flow.
  - A Data Asset panel will appear. Click on Select data asset.
  - Choose the Data asset and click on `X_Test.csv`.
  - Click on Select.
  - Leave the File formatting settings as default and click on Save.
  - Now select the `Actual vs Prediction` node in the Output block in the SPSS Modeler Flow. Click on the three dots icon and select Preview.
  - You can see the actual output and the model output with the confidence scores.
- Now that you have successfully built the model and evaluated its accuracy, you can save the model to the project.
  - On the top bar, click on the Save Model button.
  - Select the Saving mode as `Scoring branch`. Select the Branch terminal node as `Save Predictions Model`. Enter a model name and click on Save.
  - The model will be saved to the project.
- Go to the project in your Cloud Pak for Data.
  - From the left navigation pane, click on Projects.
  - Click on All projects.
  - Select the project you created in the previous step.
- Under the Assets tab, in the Models section, you will see the model you saved in the previous step.
- Click on the model.
- You can see an overview of the model, such as the input and output columns.
- Click on Promote to deployment space.
  - Select the desired Target space or create a new deployment space.
  - Select the option to go to the model in the space after promoting it.
  - Click on Promote.
- Once the model is promoted to the deployment space, you can click on the deploy button.
- Select the Deployment type as `Online`.
- Enter the Deployment name and click on Create.
- You can check the status of the model under the Deployments tab in the Deployment Space.
- Once the model status is Deployed, click on the model name.
  - Under the API Reference tab, copy the `Endpoint URL`.
    Note: Copy the `Endpoint URL` as it is required in the next step.
- Once the model is deployed and running, you will have to copy the `wmlToken`.
  - In a terminal, run the following curl command:

    curl -k <CloudPakforData_URL>/v1/preauth/validateAuth -u USER:PASSWORD | sed -n -e 's/^.*accessToken":"//p' | cut -d'"' -f1

    Note: Replace `<CloudPakforData_URL>` with your Cloud Pak for Data URL. Also replace `USER` and `PASSWORD` with the `username` and `password` used to sign in to Cloud Pak for Data.
  - An access token will be displayed on the terminal. Copy the entire token as it is required in the next step.
Create a Lambda function in AWS by following these steps:
- Open the Functions page on the Lambda console.
- Click on Create function.
- Under Basic information, do the following:
  - For Function name, enter `Consumer`.
  - For Runtime, confirm that `Node.js 14.x` is selected.
  - Expand Change default execution role and select Use an existing role.
  - Select the role name created in Step 3.
- Click on Create function.
In the Function overview section, click on Add trigger.
- Under Trigger configuration, select Kinesis.
- Select the Kinesis stream that you created in Step 2.
- Select Consumer as `no consumer`.
- Set Batch size to `100`.
- Leave Batch window - optional blank.
- Set Starting position to `Latest`.
- Finally, click on Add.
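The same Kinesis trigger can also be created programmatically. The following is a minimal sketch with the AWS SDK for JavaScript (v2), assuming the example stream and function names used earlier and a placeholder account ID in the stream ARN.

```js
// Minimal sketch: attach the Kinesis stream to the Consumer function as an event source.
const AWS = require('aws-sdk');

const lambda = new AWS.Lambda({ region: 'us-east-1' });   // example region

lambda.createEventSourceMapping({
  EventSourceArn: 'arn:aws:kinesis:us-east-1:<account-id>:stream/myDataStream', // your stream ARN
  FunctionName: 'Consumer',        // Lambda function from this step
  BatchSize: 100,                  // matches the console setting above
  StartingPosition: 'LATEST'       // matches the console setting above
}).promise()
  .then(mapping => console.log('Event source mapping created:', mapping.UUID))
  .catch(console.error);
```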
In the Code source editor, replace the existing code with the code below.
const https = require('https');
// Initialize Cloud pak for Data Machine Learning credentials
const token = "<watson-machine-learning-token>";
const iamToken = "Bearer " + token;
const scoring_url = "<watson-machine-learning-url>";
const scoring_hostname = scoring_url.split('://')[1].split('.com')[0] + ".com";
const path = scoring_url.split('.com')[1];
let port=0;
scoring_url.split('://')[0] === "https" ? port=443 : port=80;
// API call setup
const APIcall = async (options, payload) => {
const promise = new Promise(function(resolve, reject) {
var req = https.request(options, function(res) {
res.setEncoding('utf8');
res.on('data', function (data) {
let result = JSON.parse(data);
resolve(result.predictions[0].values[0]);
});
});
req.on('error', function(e) {
reject('problem with request: ' + e.message);
});
req.write(JSON.stringify(payload));
req.end();
});
return promise;
};
// Convert the Kinesis Stream into an array
const processData = (data) => {
const regex = /[0-9]+(\.[0-9]+)?/g;
let array_of_values_to_be_scored = [];
let datarefined = data.data.split('\r\n');
for(let i = 1; i < datarefined.length-1; i++){
let temp = datarefined[i].match(regex);
// parse int the string to int
if (temp != null) {
for(let j = 0; j < temp.length; j++){
temp[j] = parseFloat(temp[j]);
}
array_of_values_to_be_scored.push(temp);
}
}
return (array_of_values_to_be_scored);
};
// AWS Lambda event handler
exports.handler = async function(event) {
let scoringpayload = [];
for (const records of event.Records){
const data = JSON.parse(Buffer.from(records.kinesis.data, 'base64'));
console.log('\n\n' +'--------------------------\n'
+'Amazon Kinesis stream data\n' +'--------------------------\n'
+ ' ',data);
scoringpayload = processData(data);
}
// Prepare the API to make a request
const array_of_input_fields = ["age","sex","cp","trtbps","chol","fbs","restecg","thalachh","exng","oldpeak","slp","caa","thall","spO2"];
const array_of_values_to_be_scored = scoringpayload;
let options = {
hostname: scoring_hostname,
port: port,
path: path,
method: "POST",
headers: {
'Authorization': iamToken,
'Content-Type': 'application/json',
'Accept': 'application/json'
},
rejectUnauthorized: false
};
// Handle the API response
let result = {
"labels": [...array_of_input_fields, "model_output","output_confidence"],
"values": []
};
let tableView = [];
let output = [];
for (let i=0; i<array_of_values_to_be_scored.length; i++){
let input = array_of_values_to_be_scored[i];
let temp = {};
const payload = {"input_data": [{"fields": array_of_input_fields, "values": [input]}]};
let modelScores = await APIcall(options, payload);
output = [...input, modelScores[0], modelScores[1]];
for (let k=0; k<result.labels.length; k++){
temp[result.labels[k]] = output[k];
}
tableView.push(temp);
result.values.push([...input, modelScores[0], modelScores[1]]);
}
// Print the results
console.log('\n\n' +'---------------------------------------------------\n'
+'IBM Cloud Pak for Data Machine Learning Predictions \n' +'---------------------------------------------------\n');
console.table(tableView);
return result;
};
In line 3, replace `<watson-machine-learning-token>` with the token copied from Step 5.3. In line 5, replace `<watson-machine-learning-url>` with the `Endpoint URL` copied from Step 5.2.
Click on Deploy.
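To test the consumer in isolation from the Lambda console's Test tab, you can build an event that mimics what the producer puts on the stream: a JSON payload `{ data: "<csv text>" }`, base64-encoded in `Records[].kinesis.data`. The sketch below generates such an event; the CSV header matches the model's input fields and the data row is illustrative only.

```js
// Minimal sketch: build a Kinesis-style test event for the Consumer function.
const csv = [
  'age,sex,cp,trtbps,chol,fbs,restecg,thalachh,exng,oldpeak,slp,caa,thall,spO2',
  '63,1,3,145,233,1,0,150,0,2.3,0,0,1,98.6'   // illustrative row only
].join('\r\n') + '\r\n';

const testEvent = {
  Records: [
    {
      kinesis: {
        // Same encoding the producer's payload receives on the stream.
        data: Buffer.from(JSON.stringify({ data: csv })).toString('base64')
      }
    }
  ]
};

console.log(JSON.stringify(testEvent, null, 2)); // paste this JSON into the Lambda Test tab
```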
This step simulates data dumps coming into your S3 bucket so that the machine learning model can give predictions. You can connect your S3 bucket to any service that dumps data into the bucket to trigger the event notification.
- Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/
- In the Buckets list, choose the name of the bucket that you created.
- Click on Upload, then click on Add files, select the `test-file.csv` file, and click on Upload.
  Note: We have taken 6 random rows from `X_Test.csv` to create `test-file.csv`.
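Instead of uploading through the console, any service or script that writes a `.csv` object to the bucket will fire the trigger. As a quick example, the sketch below uploads `test-file.csv` with the AWS SDK for JavaScript (v2), assuming the example bucket name from Step 1.

```js
// Minimal sketch: upload the test file to the bucket to trigger the pipeline.
const AWS = require('aws-sdk');
const fs = require('fs');

const s3 = new AWS.S3({ region: 'us-east-1' });   // example region

s3.upload({
  Bucket: 'cpdbucket',                            // example bucket from Step 1
  Key: 'test-file.csv',
  Body: fs.createReadStream('test-file.csv')      // local copy of the test file
}).promise()
  .then(data => console.log('Uploaded to', data.Location))
  .catch(console.error);
```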
Now that the data has been added to the bucket, the event trigger invokes the producer function, which sends the data from the file to Kinesis. Kinesis streams the data, the consumer function consumes the streaming data and invokes the machine learning model deployed on IBM Cloud Pak for Data, and the model returns the output along with a confidence score. All of this happens in real time.
Follow these steps to view the predictions in real time:
- Open the Functions page on the Lambda console.
- Select the `Consumer` function that you created earlier.
- Under the Monitor tab, click on View logs in CloudWatch. This will launch Amazon CloudWatch in a new tab.
- In the CloudWatch dashboard, under Log streams, you will see logs for your consumer function. Select the latest entry to view the logs.
In this code pattern, you learned how to build a machine learning model with no code using SPSS Modeler on IBM Cloud Pak for Data, create a streaming flow using Amazon Kinesis on AWS Cloud, and invoke the model from an AWS Lambda function to get predictions and view them in real time on Amazon CloudWatch.
This code pattern can be further extended by visualizing the results using IBM Cloud Pak for Data Embedded Dashboard.
This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.