Rook as an alternative to EBS in AWS

To evaluate storage options we’ll set up a Kubernetes cluster in AWS with a Rook cluster deployed along with tools for debugging and metrics collection. Then we’ll deploy a pod with 3 different volumes to compare Rook block storage (backed by instance store), EBS gp2, and EBS io1 (SSD) (see EBS volume types).

1. Kubernetes cluster setup

kops was used to set up the Kubernetes cluster in AWS.

For this test the Kubernetes nodes are mid-range i3.2xlarge instances, with instance storage (1900 GiB NVMe SSD) and up to 10 Gigabit networking performance. Kubernetes runs on Ubuntu 16.04 LTS with 3 nodes plus the master.

Upon finishing the kops create, we should have a fully functioning Kubernetes cluster; kops even sets up the context for the newly created cluster in the kubeconfig.

$ brew install kops
$ kops create cluster $NAME \
  --node-count 3 \
  --zones "us-west-2c" \
  --node-size "i3.2xlarge" \
  --master-size "m3.medium" \
  --master-zones "us-west-2c" \
  --admin-access x.x.x.x/32 \
  --api-loadbalancer-type public \
  --cloud aws \
  --image "ami-2606e05e" \
  --kubernetes-version 1.8.2 \
  --ssh-access x.x.x.x/32 \
  --ssh-public-key ~/.ssh/me.pub \
  --yes
...
$ kubectl get nodes
NAME                                          STATUS    AGE       VERSION
ip-172-20-42-159.us-west-2.compute.internal   Ready     1m        v1.8.2
ip-172-20-42-37.us-west-2.compute.internal    Ready     2m        v1.8.2
ip-172-20-53-26.us-west-2.compute.internal    Ready     1m        v1.8.2
ip-172-20-55-209.us-west-2.compute.internal   Ready     1m        v1.8.2

2. Rook cluster deployment

Rook is easy to get running; we’ll use the latest release, currently 0.6. It will manage a Ceph cluster configured to our spec. First, the rook-operator needs to be deployed:

$ kubectl create -f k8s/rook-operator.yaml
clusterrole "rook-operator" created
serviceaccount "rook-operator" created
clusterrolebinding "rook-operator" created
deployment "rook-operator" created

The Rook cluster is configured to deliver block storage using the local disks (instance store) attached directly to the nodes. The disk devices are selected with deviceFilter; the instance store device is /dev/nvme0n1. The OSD placement in cluster.yaml restricts storage to only two of the nodes; the third node will host the test client pod.
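
The relevant parts of k8s/rook-cluster.yaml look roughly like this (a sketch; the node-selection label used for the OSD placement is hypothetical, and the actual manifest, which also creates the rook namespace, is in the repo):

# Sketch of k8s/rook-cluster.yaml (Rook 0.6 Cluster CRD).
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook-eval
  namespace: rook
spec:
  dataDirHostPath: /var/lib/rook
  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "nvme0n1"        # match the instance store NVMe device
  placement:
    osd:                           # keep OSDs on the two storage nodes
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role            # hypothetical label on the storage nodes
              operator: In
              values: ["rook-storage"]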

Once the Rook cluster is created, you will notice that rook-operator has created several pods in the rook namespace to manage the Ceph components:

$ kubectl create -f k8s/rook-cluster.yaml
namespace "rook" created
cluster "rook-eval" created
$ kubectl get pods --namespace rook
NAME                              READY     STATUS    RESTARTS   AGE
rook-api-3588729152-s0dxw         1/1       Running   0          46s
rook-ceph-mgr0-1957545771-bsg7h   1/1       Running   0          46s
rook-ceph-mgr1-1957545771-cth8i   1/1       Running   0          47s
rook-ceph-mon0-t1m3z              1/1       Running   0          1m
rook-ceph-mon1-mkdl4              1/1       Running   0          1m
rook-ceph-mon2-bv1qk              1/1       Running   0          1m
rook-ceph-osd-0027l               1/1       Running   0          46s
rook-ceph-osd-2p90r               1/1       Running   0          46s

A Rook storage Pool and a StorageClass have to be defined next. Note that we are creating a pool with 2 replicas to provide resiliency on par with EBS.
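
Both objects are defined in k8s/rook-storageclass.yaml; a rough sketch (the actual manifest in the repo may carry extra parameters):

# Sketch of k8s/rook-storageclass.yaml: a replicated Pool (2 copies) and a
# StorageClass backed by it.
apiVersion: rook.io/v1alpha1
kind: Pool
metadata:
  name: replicapool
  namespace: rook
spec:
  replicated:
    size: 2
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-block
provisioner: rook.io/block
parameters:
  pool: replicapool

Applying the manifest: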

$ kubectl create -f k8s/rook-storageclass.yaml
pool "replicapool" created
storageclass "rook-block" created

The Rook toolbox was started to provide better visibility into the Rook cluster.
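
The toolbox is a single pod defined in k8s/rook-tools.yaml; a minimal sketch (the actual manifest also wires the cluster configuration into the pod):

# Sketch of k8s/rook-tools.yaml: the toolbox image ships rookctl and the Ceph CLI.
apiVersion: v1
kind: Pod
metadata:
  name: rook-tools
  namespace: rook
spec:
  containers:
  - name: rook-tools
    image: rook/toolbox:v0.6.2
    imagePullPolicy: IfNotPresent

Starting it and checking the overall cluster status: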

$ kubectl create -f k8s/rook-tools.yaml
pod "rook-tools" created
$ kubectl -n rook exec -it rook-tools -- rookctl status
OVERALL STATUS: OK

USAGE:
TOTAL      USED       DATA      AVAILABLE
5.18 TiB   6.00 GiB   0 B       5.18 TiB

MONITORS:
NAME             ADDRESS                 IN QUORUM   STATUS
rook-ceph-mon0   100.66.63.114:6790/0    true        OK
rook-ceph-mon1   100.66.113.38:6790/0    true        OK
rook-ceph-mon2   100.68.185.191:6790/0   true        OK

MGRs:
NAME             STATUS
rook-ceph-mgr0   Active
rook-ceph-mgr1   Standby

OSDs:
TOTAL     UP        IN        FULL      NEAR FULL
2         2         2         false     false

PLACEMENT GROUPS (100 total):
STATE          COUNT
active+clean   100

At this point we have Kubernetes with a Rook cluster up and running in AWS; we’ll be provisioning storage in the next steps.

3. Evaluation pod setup

Let’s create a Persistent Volume Claim (PVC) backed by a Rook block device and attach it to our test pod alongside different types of EBS volumes. Before we proceed, the EBS volumes have to be created; note the volume IDs output by each command, as they will be used in our manifest later:

$ aws ec2 create-volume --availability-zone=us-west-2b --size=120 --volume-type=gp2
...
$ aws ec2 create-volume --availability-zone=us-west-2b --size=120 --volume-type=io1 --iops=6000
...
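
The Rook volume, on the other hand, is requested dynamically through a PVC against the rook-block StorageClass created earlier; a sketch (the claim name is a placeholder for the one used in k8s/test-deployment.yaml):

# Sketch of the PVC for the Rook block volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-eval-pvc        # placeholder name
spec:
  storageClassName: rook-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 120Gi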

Let’s create a pod with 3 volumes to run our FIO tests against (a sketch of the volume wiring follows the list):

  1. Rook volume mounted to /eval. 120 GiB, ext4.
  2. EBS io1 (Provisioned IOPS = 6K) volume mounted to /eval-io1. 120 GiB, ext4.
  3. EBS gp2 (General purpose) volume mounted to /eval-gp2. 120 GiB, ext4.
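
To make the volume wiring concrete, here is a rough sketch of k8s/test-deployment.yaml (the image, claim name, and EBS volume IDs are placeholders, not the repo's exact manifest):

# Sketch of k8s/test-deployment.yaml; the volume IDs stand in for the IDs
# returned by `aws ec2 create-volume` above.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: rookeval
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: rookeval
    spec:
      containers:
      - name: rookeval
        image: ubuntu:16.04                  # placeholder; the test pod needs fio installed
        command: ["sleep", "infinity"]
        volumeMounts:
        - { name: rook-vol, mountPath: /eval }
        - { name: io1-vol,  mountPath: /eval-io1 }
        - { name: gp2-vol,  mountPath: /eval-gp2 }
      volumes:
      - name: rook-vol
        persistentVolumeClaim:
          claimName: rook-eval-pvc           # the PVC sketched above
      - name: io1-vol
        awsElasticBlockStore:
          volumeID: vol-xxxxxxxxxxxxxxxxx    # io1 volume created earlier
          fsType: ext4
      - name: gp2-vol
        awsElasticBlockStore:
          volumeID: vol-yyyyyyyyyyyyyyyyy    # gp2 volume created earlier
          fsType: ext4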

Note that the blog writeup focused on the performance of the io1 volume for a high-performance IOPS scenario.

$ kubectl create -f k8s/test-deployment.yaml
deployment "rookeval" created
$ kubectl get pods
NAME                             READY     STATUS    RESTARTS   AGE
rook-operator-3796250946-wwh3g   1/1       Running   0          10m
rookeval-1632283128-2c08m        1/1       Running   0          31s
$ kubectl exec -it rookeval-1632283128-2c08m -- df -Th --exclude-type=tmpfs
Filesystem     Type     Size  Used Avail Use% Mounted on
overlay        overlay  7.7G  2.8G  5.0G  36% /
/dev/xvdbe     ext4     118G   60M  112G   1% /eval-io1
/dev/rbd0      ext4     118G   60M  112G   1% /eval
/dev/xvdbi     ext4     118G   60M  112G   1% /eval-gp2
/dev/xvda1     ext4     7.7G  2.8G  5.0G  36% /etc/hosts

All looks good; we're ready to finally proceed with the FIO tests. Our test pod now has 3 different storage types to compare. It would be interesting to add Rook clusters backed by EBS volumes of different types, and to try different instance types, as they provide different controllers and drives. Next time, perhaps.


Notes:

  • Rook had higher IOPS in all scenarios except 4K sequential writes. Random writes are the important ones for transactional IO, so I’m focusing on those for now.

  • Need to analyze streaming IO with HDD based devices at some point to compare sequential read/write performance.

  • For the reported results, the testing pod is running on a node different from the storage nodes. For comparison, the test pod was also run on the storage nodes in a hyper-converged scenario. Since Ceph is consistent, an IO operation is complete only after all replicas are written, so it makes no noticeable difference where the pod lands in your cluster, at least in my testing setup where network capacity is the same across all nodes.