From b09380bfce7b2226e50588255d453ce1cac33447 Mon Sep 17 00:00:00 2001
From: rancher-max
Date: Thu, 5 Oct 2023 10:53:42 -0700
Subject: [PATCH] Add a VM Sizing Guide based on stress test data

Signed-off-by: rancher-max
---
 docs/install/requirements.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/docs/install/requirements.md b/docs/install/requirements.md
index 94de9148..40a0c734 100644
--- a/docs/install/requirements.md
+++ b/docs/install/requirements.md
@@ -74,6 +74,27 @@ Hardware requirements scale based on the size of your deployments. Minimum recom
 
 RKE2 performance depends on the performance of the database, and since RKE2 runs etcd embeddedly and it stores the data dir on disk, we recommend using an SSD when possible to ensure optimal performance.
 
+### VM Sizing Guide
+When CPU and RAM are limited on the control-plane + etcd nodes, the number of agent nodes that can be joined under standard workload conditions may be limited as well.
+
+| Server CPU | Server RAM | Number of Agents |
+| ---------- | ---------- | ---------------- |
+| 2          | 4 GB       | 0-225            |
+| 4          | 8 GB       | 226-450          |
+| 8          | 16 GB      | 451-1300         |
+| 16+        | 32 GB      | 1300+            |
+
+It is recommended to join agent nodes in batches of 50 or fewer so that CPU usage can recover between batches, as it spikes when nodes join. Remember to modify the default `cluster-cidr` if you want more than 255 nodes!
+
+This data was gathered under specific test conditions and will vary depending on environment and workloads. The steps below give an overview of the test that produced it. It was last performed on v1.27.4+rke2r1.
+1. Monitor resources in Grafana using a Prometheus data source.
+2. Deploy workloads in such a way as to simulate continuous cluster activity:
+   - A basic workload that scales up and down continuously
+   - A workload that is deleted and recreated in a loop
+   - A constant workload that contains multiple other resources, including CRDs
+3. Join agent nodes in batches of 30-50 at a time.
+
+
 ## Networking
 
 :::tip Important