- GitHub link: https://github.com/open-telemetry/opentelemetry-demo
- Documentation link: https://opentelemetry.io/docs/demo/
- Set up a dedicated EC2 instance (at minimum a large instance type with 16 GB of storage) to install and configure Docker.
- Use the docker-compose.yml file available in the GitHub repository to bring up the services.
- User data for the EC2 instance to clone the GitHub repo, install Docker and Docker Compose, and build the images:
#!/bin/sh
sudo yum update -y
sudo yum install git -y
sudo git clone https://github.com/open-telemetry/opentelemetry-demo.git
sudo amazon-linux-extras install docker -y
sudo service docker start
# Add the ec2-user to the docker group to allow running Docker commands without sudo
sudo usermod -a -G docker ec2-user
# Configure Docker to start automatically when the system boots
sudo chkconfig docker on
sudo curl -L https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
cd opentelemetry-demo/
docker-compose up --force-recreate --remove-orphans --detach
Validate that all services defined in the docker-compose.yml are up and running. Ensure this by:
- Running Docker commands like docker ps and docker-compose logs.
- Accessing application endpoints (if applicable) and confirming service availability (a validation sketch follows this list).
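A minimal validation sketch (assuming the demo's frontend proxy listens on port 8080, its default in the docker-compose setup):
# Confirm all containers are up and none have exited
docker ps --format "table {{.Names}}\t{{.Status}}"
docker ps --filter "status=exited"
# Tail recent logs from every service
docker-compose logs --tail=50
# Check that the web store responds on the frontend proxy port
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080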
- Once the deployment is validated, delete the EC2 instance to clean up resources.
- Screenshot of the EC2 instance details (instance type, storage).
- Screenshots of the services running (docker ps).
- Screenshots of Docker logs showing the application's startup.
- To view the last n log lines from every service: docker-compose logs --tail=n
- Screenshot of the accessible application UIs:
- Web Store
- Grafana
- Load Generator UI
- Jaeger UI
- Flagd configurator UI
- Set up an EKS cluster in AWS with:
- At least 2 worker nodes.
- Instance type: a large instance size for the worker nodes (an equivalent eksctl sketch follows this list).
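The cluster itself is created later from eks-cluster-deployment.yaml; for reference, a roughly equivalent eksctl CLI invocation would look like the sketch below (the cluster name, region, and node type are assumptions, not values from the project):
# Sketch of an equivalent one-shot cluster creation
eksctl create cluster \
  --name otel-demo-cluster \
  --region us-east-1 \
  --nodegroup-name workers \
  --nodes 2 \
  --node-type t2.large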
- Deploy the application to the EKS cluster using the provided opentelemetry-demo.yaml manifest file.
- Use a dedicated EC2 instance as the EKS client to:
- Install and configure kubectl, the AWS CLI, and eksctl for cluster management (an installation sketch follows this list).
- Run all Kubernetes commands from the EC2 instance (not from a local machine).
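These tools are presumably installed by the user-data script referenced later (eksClientStartupScript.sh); a minimal manual installation sketch, assuming a 64-bit x86 Amazon Linux instance:
# AWS CLI v2 (if not already present)
curl -s "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
unzip -q awscliv2.zip && sudo ./aws/install
# kubectl (pick a version compatible with your EKS cluster)
curl -sLO "https://dl.k8s.io/release/$(curl -sL https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -m 0755 kubectl /usr/local/bin/kubectl
# eksctl
curl -sL "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin/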
- Create an IAM policy (say, EksAllAccess) to provide complete access to EKS services:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "eks:*", "Resource": "*" }, { "Action": [ "ssm:GetParameter", "ssm:GetParameters" ], "Resource": [ "arn:aws:ssm:*:<account_id>:parameter/aws/*", "arn:aws:ssm:*::parameter/aws/*" ], "Effect": "Allow" }, { "Action": [ "kms:CreateGrant", "kms:DescribeKey" ], "Resource": "*", "Effect": "Allow" }, { "Action": [ "logs:PutRetentionPolicy" ], "Resource": "*", "Effect": "Allow" } ] }
- Create an IAM policy (say, IamLimitedAccess) to provide limited access to AWS IAM (a CLI sketch for creating both policies follows the policy document below):
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "iam:CreateInstanceProfile", "iam:DeleteInstanceProfile", "iam:GetInstanceProfile", "iam:RemoveRoleFromInstanceProfile", "iam:GetRole", "iam:CreateRole", "iam:DeleteRole", "iam:AttachRolePolicy", "iam:PutRolePolicy", "iam:UpdateAssumeRolePolicy", "iam:AddRoleToInstanceProfile", "iam:ListInstanceProfilesForRole", "iam:PassRole", "iam:DetachRolePolicy", "iam:DeleteRolePolicy", "iam:GetRolePolicy", "iam:GetOpenIDConnectProvider", "iam:CreateOpenIDConnectProvider", "iam:DeleteOpenIDConnectProvider", "iam:TagOpenIDConnectProvider", "iam:ListAttachedRolePolicies", "iam:TagRole", "iam:UntagRole", "iam:GetPolicy", "iam:CreatePolicy", "iam:DeletePolicy", "iam:ListPolicyVersions" ], "Resource": [ "arn:aws:iam::<account_id>:instance-profile/eksctl-*", "arn:aws:iam::<account_id>:role/eksctl-*", "arn:aws:iam::<account_id>:policy/eksctl-*", "arn:aws:iam::<account_id>:oidc-provider/*", "arn:aws:iam::<account_id>:role/aws-service-role/eks-nodegroup.amazonaws.com/AWSServiceRoleForAmazonEKSNodegroup", "arn:aws:iam::<account_id>:role/eksctl-managed-*" ] }, { "Effect": "Allow", "Action": [ "iam:GetRole", "iam:GetUser" ], "Resource": [ "arn:aws:iam::<account_id>:role/*", "arn:aws:iam::<account_id>:user/*" ] }, { "Effect": "Allow", "Action": [ "iam:CreateServiceLinkedRole" ], "Resource": "*", "Condition": { "StringEquals": { "iam:AWSServiceName": [ "eks.amazonaws.com", "eks-nodegroup.amazonaws.com", "eks-fargate.amazonaws.com" ] } } } ] }
- Create an IAM role (say, EKSClientRole) with the following policies and attach it to the EC2 instance serving as the EKS client (a CLI sketch follows this list):
- IamLimitedAccess
- EksAllAccess
- AWSCloudFormationFullAccess (AWS Managed Policy)
- AmazonEC2FullAccess (AWS Managed Policy)
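A CLI sketch of creating the role and its instance profile (ec2-trust-policy.json is an assumed file containing the standard EC2 trust policy; the custom policy ARNs assume the policies created above):
aws iam create-role --role-name EKSClientRole --assume-role-policy-document file://ec2-trust-policy.json
aws iam attach-role-policy --role-name EKSClientRole --policy-arn arn:aws:iam::<account_id>:policy/EksAllAccess
aws iam attach-role-policy --role-name EKSClientRole --policy-arn arn:aws:iam::<account_id>:policy/IamLimitedAccess
aws iam attach-role-policy --role-name EKSClientRole --policy-arn arn:aws:iam::aws:policy/AWSCloudFormationFullAccess
aws iam attach-role-policy --role-name EKSClientRole --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
aws iam create-instance-profile --instance-profile-name EKSClientRole
aws iam add-role-to-instance-profile --instance-profile-name EKSClientRole --role-name EKSClientRole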
- Create a security group for the EC2 instance allowing inbound SSH traffic, for example:
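A sketch, assuming the default VPC and SSH restricted to a trusted IP (the group name is illustrative):
aws ec2 create-security-group --group-name eks-client-sg --description "SSH access for the EKS client"
aws ec2 authorize-security-group-ingress --group-name eks-client-sg --protocol tcp --port 22 --cidr <your-ip>/32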
- Launch the EC2 instance (EKS client) with the following configuration:
aws ec2 run-instances \
  --image-id ami-0166fe664262f664c \
  --instance-type t2.medium \
  --key-name SSH1 \
  --security-group-ids sg-013069d9f0783aca5 \
  --iam-instance-profile Name=EKSClientRole \
  --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=EKSClient}]" \
  --count 1 \
  --region us-east-1 \
  --user-data file://./phase1/eksClientStartupScript.sh
- SSH into the EC2 instance.
- Create the EKS cluster using the following command:
eksctl create cluster -f /EKS-and-Monitoring-with-OpenTelemetry/phase1/eks-cluster-deployment.yaml
- If the following warning appears while trying to access the cluster details from the AWS console as the root user:
Your current IAM principal doesn’t have access to Kubernetes objects on this cluster. This may be due to the current user or role not having Kubernetes RBAC permissions to describe cluster resources or not having an entry in the cluster’s auth config map. Learn more
- Check the linked article; one possible resolution is sketched below.
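A sketch of one possible resolution, assuming eksctl is used to map the console IAM principal into the cluster's aws-auth ConfigMap (the ARN and username are placeholders):
eksctl create iamidentitymapping \
  --cluster <cluster-name> --region us-east-1 \
  --arn arn:aws:iam::<account_id>:role/<console-role-or-user> \
  --group system:masters \
  --username console-admin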
- Deploy the application to the EKS cluster:
kubectl apply --namespace otel-demo -f /EKS-and-Monitoring-with-OpenTelemetry/phase1/opentelemetry-demo.yaml
- Validate the deployment:
- Ensure all pods and services are running as expected in the otel-demo namespace:
kubectl get all -n otel-demo
- Access application endpoints through port-forwarding or service exposure.
- Accessing the application:
- Forward a local port on the EKS client to a port on a service running within the Kubernetes cluster. This command allows the forwarded port to be accessed from any IP address, not just localhost:
kubectl port-forward svc/opentelemetry-demo-frontendproxy 8080:8080 --namespace otel-demo --address 0.0.0.0
- Access the application at:
http://<instance-public-ip>:8080
http://<instance-public-ip>:8080/grafana
http://<instance-public-ip>:8080/jaeger/ui/search
http://<instance-public-ip>:8080/loadgen/
http://<instance-public-ip>:8080/feature
- Collect the cluster details, including node and pod information:
kubectl get nodes -o wide
kubectl get pods -n otel-demo
kubectl describe nodes
- Do not delete the EKS cluster unless explicitly instructed. If deletion is required, use:
eksctl delete cluster -f /EKS-and-Monitoring-with-OpenTelemetry/phase1/eks-cluster-deployment.yaml
- Screenshot of the EKS cluster configuration details (number of nodes, instance type, etc.).
- Screenshot of the EC2 instance used as the EKS client (instance type, storage, etc.).
- Screenshot of kubectl get all -n otel-demo showing the status of pods, services, and deployments.
- Screenshot of logs from key application pods to confirm successful startup:
kubectl logs <pod-name> -n otel-demo
- Exported Kubernetes manifest (opentelemetry-demo.yaml).
- The YAML file containing the Kubernetes configurations: opentelemetry-demo.yaml
- Screenshot of the accessible application; follow the project documentation link.
Deploy the application by creating and organizing split YAML files, applying them either individually or recursively, and validating their functionality. Splitting the YAML file is crucial because if any service or pod is down, the corresponding YAML file can be reapplied to make it functional without affecting others. This approach simplifies debugging and deployment management.
- Create folders for resource types to organize the split YAML files by resource type (e.g., ConfigMaps, Secrets, Deployments, Services). Ensure the folder structure is logical and reflects the Kubernetes resources being deployed.
- Apply resources either individually or recursively:
- Individually apply each file to ensure resources deploy successfully.
- Alternatively, apply all files recursively from the root folder containing the organized files to deploy everything (as sketched below).
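A sketch of the two approaches under an assumed folder layout (folder and file names are illustrative, mirroring the split described above):
# deployment/
#   namespace.yaml
#   open-telemetry/     (ConfigMaps, Deployments, Services for the observability stack)
#   web-application/    (core, backend, and frontend microservices)
# Apply a single file while validating it in isolation
kubectl apply -f open-telemetry/grafana-deployment.yaml --namespace otel-demo
# Or apply everything recursively from the root folder
kubectl apply -f . --recursive --namespace otel-demo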
- SSH into the instance and move to the following path: EKS-and-Monitoring-with-OpenTelemetry/phase2/deployment
- Deploy the namespace.yaml file:
kubectl apply -f namespace.yaml
- Apply all the resources recursively
kubectl apply -f ./open-telemetry --recursive --namespace otel-demo
- Validate resource deployment by checking the status of pods, services, and deployments. Debug any issues by reviewing pod logs or describing problematic resources.
kubectl get all -n otel-demo
- Compress the organized folder of split YAML files into a ZIP file, for example:
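A one-line sketch, assuming the split files live in a folder named deployment:
zip -r deployment.zip deployment/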
- Screenshots of the created folder structure containing the split YAML files.
- Folder structure containing the split YAML files: link
- Screenshots showing successful deployment of each resource (individually or recursively).
- Screenshots showing all resources running successfully, including pods and services.
- Logs or screenshots verifying the initialization and proper functioning of application pods:
- Frontend-proxy pod logs
- Grafana pod logs
- A ZIP file containing the organized and split YAML files.
- Link to zip file: zip file
- A short report explaining the purpose of each resource, the steps followed during deployment, and resolutions to any challenges encountered. Note: manage the namespaces properly while deploying the YAML files.
Reasoning Behind Splitting YAML Files by Application Level
The decision to split the YAML files at the application level reflects an organizational strategy that aligns deployment artifacts with application-specific resources. Here's a detailed explanation of the rationale and steps followed during this process:
- Namespace YAML: Defines logical segregation for resources, enabling streamlined management and isolation between applications or teams.
- OpenTelemetry:
- OpenSearch: Manages data storage and retrieval for telemetry data using StatefulSets for persistence.
- Jaeger: Provides tracing capabilities to debug distributed applications.
- OpenTelemetry Collector: Gathers metrics and traces from services for analysis.
- Prometheus: A metrics-based monitoring tool.
- Grafana: Visualization of telemetry data and metrics.
- WebApplication:
- Core: Contains Kafka and validation services.
- Backend: Includes microservices like accounting, ad, cart, etc., for functional requirements.
- Frontend: Handles user-facing components with a proxy for load balancing.
- Organizing Resources:
  - Resources are grouped by their function or application to ensure logical separation.
  - Example: All OpenTelemetry-related files (e.g., ConfigMaps, Deployments) are placed under open-telemetry for clarity.
- Applying YAML Files:
  - Individual application: Each YAML file is applied independently to validate its deployment in isolation. For instance:
    - Apply namespace.yaml first to ensure all resources deploy in the correct namespace.
    - Deploy ConfigMaps and Secrets before Deployments and Services to satisfy dependencies.
  - Recursive deployment: A batch deployment is achieved by applying all YAML files recursively from the root directory if individual validation is not required.
- Validation:
  - Checked the status of resources (kubectl get pods, kubectl get services) to ensure successful deployment.
  - Used kubectl logs and kubectl describe commands for debugging failed deployments.
- Zipping Deployment Artifacts:
  - Compressed the entire folder for storage, version control, or portability.
Benefits of this approach:
- Simplified Debugging:
- Errors in one service (e.g., Jaeger) can be resolved by reapplying its specific YAML files without affecting other applications.
- Modularity and Reusability:
- Components like ConfigMaps or Services can be reused across environments (e.g., staging, production).
- Parallel Development and Deployment:
- Teams working on different applications can operate independently without conflict.
- Enhanced Scalability:
- Facilitates scaling individual microservices or applications based on demand.
Challenges and resolutions:
- Dependency Issues:
  - Services failing due to unavailable ConfigMaps or Secrets.
    Resolution: Ensured ConfigMaps and Secrets were deployed before dependent resources.
- StatefulSet Failures (OpenSearch):
  - Pods stuck in Pending state due to insufficient storage.
    Resolution: Adjusted PersistentVolumeClaim configurations to match available storage.
- ClusterRole Binding Errors:
  - Missing permissions for service accounts.
    Resolution: Updated the ClusterRoleBinding to correctly reference the service accounts (see the debugging sketch below).
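A sketch of the commands used for this kind of debugging (pod, PVC, and RBAC issues; the service account name is illustrative):
# Find pods that are not progressing and inspect the reason
kubectl get pods -n otel-demo --field-selector=status.phase=Pending
kubectl describe pod <pod-name> -n otel-demo
# Check PersistentVolumeClaim status for the OpenSearch StatefulSet
kubectl get pvc -n otel-demo
# Verify what a service account is actually allowed to do
kubectl auth can-i list pods -n otel-demo --as=system:serviceaccount:otel-demo:<service-account>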
This structured approach ensures clear separation of concerns, facilitates smoother deployments, and enables efficient management of microservices. The folder hierarchy mirrors the application architecture, making it intuitive for developers and DevOps engineers to manage and troubleshoot resources.
Simplify the deployment and management of Kubernetes resources by leveraging Helm, a package manager for Kubernetes, to manage configurations and enable easy upgrades, rollbacks, and customizations.
- Leverage the pre-configured Helm chart available in the OpenTelemetry Helm repository and the documentation specifically designed for the project. This chart provides all the necessary configurations for deploying the application.
- Add the Helm repository and update it to ensure the latest version of the chart is accessible. This will guarantee that you are deploying the most stable and feature-complete version of the Helm chart.
- Use Helm commands to deploy the application resources in a structured manner. Helm will simplify the deployment process by managing multiple Kubernetes resources as a single unit.
- Ensure the application is successfully deployed, and all associated pods, services, and other resources are running correctly.
- Use Helm’s upgrade feature to simulate updates to the application. For example, update resource configurations or increase the number of replicas, and then reapply the deployment to test the update process.
- Practice rolling back the application to a previous stable version using Helm’s rollback feature. This ensures that you can quickly recover from issues introduced by faulty updates and maintain application stability.
- Screenshots of the Helm repository being added and updated.
- Screenshots of the application deployed successfully using Helm.
- Screenshots or logs of the upgrade process showing changes being applied.
- Screenshots or logs demonstrating a successful rollback to a previous version.
Installation:
$ curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
$ helm version
Add the Helm repository and update it:
$ helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
$ helm repo update
Creating new namespace:
$ kubectl create namespace otel-demo-helm
Install the chart (the release name otel-demo matches the commands below):
$ helm install otel-demo open-telemetry/opentelemetry-demo --namespace otel-demo-helm
Check the Deployed Release:
helm list -n <namespace>
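To simulate an update, a sketch: export the chart's default values, edit a setting such as a component's replica count, then upgrade the release with the modified values file (custom-values.yaml is an assumed file name):
$ helm show values open-telemetry/opentelemetry-demo > custom-values.yaml
$ helm upgrade otel-demo open-telemetry/opentelemetry-demo --namespace otel-demo-helm -f custom-values.yaml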
Identify the revision history for the deployed Helm release:
helm history otel-demo --namespace otel-demo-helm
Rollback to the desired revision:
helm rollback otel-demo <revision-number> --namespace otel-demo-helm
Checking the deployment status:
kubectl get deployments -n otel-demo-helm
Check the Replica Count for frontendProxy:
kubectl get deployment otel-demo-frontendproxy -n otel-demo-helm