Refactor SLO calculations #91

Closed
wants to merge 1 commit
Conversation

Contributor
@rail rail commented Apr 22, 2020

* Simplify the main logic by using absolute replica counts instead of
  relative.
* Rename `get_new_worker_count` to `get_target_replica_count`.
* Move `max_replicas` and `min_replicas` a level higher in the config.
  The limits are generic enough to be handled by the main logic.
* Do not call the Kubernetes autoscaling API if no change is needed.
  Fixes #38 (Noop API request should be skipped); see the sketch below.
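
A minimal sketch of how these pieces might fit together, based on the identifiers visible in the diff below. The exact wiring is assumed here, not spelled out in the PR: `adjust_scale` is passed in as a callable to keep the sketch self-contained, and the noop check compares against `running` as a stand-in for the deployment's current replica count.

import math


def get_target_replica_count(pending, running, args):
    # How many tasks one replica can finish within the SLO window.
    tasks_per_replica = math.floor(args["slo_seconds"] / args["avg_task_duration"])
    outstanding = pending + running
    return math.ceil(outstanding / tasks_per_replica)


def autoscale(api, cfg, pending, running, adjust_scale):
    # The limits now live a level higher in the config and are applied here,
    # in the main logic, rather than inside the calculation itself.
    max_replicas = cfg["autoscale"]["max_replicas"]
    min_replicas = cfg["autoscale"]["min_replicas"]

    target_replicas = get_target_replica_count(pending, running, cfg["autoscale"]["args"])
    target_replicas = max(min(target_replicas, max_replicas), min_replicas)

    # Skip the Kubernetes autoscaling API call when nothing would change (#38).
    # Comparing against `running` is an assumption of this sketch.
    if target_replicas == running:
        return target_replicas

    adjust_scale(api, target_replicas, cfg["deployment_namespace"], cfg["deployment_name"])
    return target_replicas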
@rail rail self-assigned this Apr 22, 2020
@rail rail requested a review from tomprince April 22, 2020 04:16
Contributor
@tomprince tomprince left a comment

This looks good, but should probably have a pair of eyes on it from someone who didn't have a hand in writing it.


While trying to work through the logic here, I wondered a little bit if there was existing literature on this that we could crib from, rather than deriving things from first principles (which is what I've been doing).

capacity = cfg["autoscale"]["args"]["max_replicas"] - running
log_env["capacity"] = capacity
max_replicas = cfg["autoscale"]["max_replicas"]
min_replicas = cfg["autoscale"]["min_replicas"]
Contributor
This appears to duplicate the value above (both as a local and in the log).

Contributor Author
Bah, good catch. Will fix it.

adjust_scale(api, target_replicas, cfg["deployment_namespace"], cfg["deployment_name"])
logger.info("Calculating target replica count", extra=log_env)
target_replicas = get_target_replica_count(pending, running, cfg["autoscale"]["args"])
target_replicas = max(min(target_replicas, max_replicas), min_replicas)
Contributor
I wonder if this calculation deserves a comment or a function.

# = running + ((pending - (running*(tasks_per_replica-1))) / tasks_per_replica)
# = running + ((pending + running - running*tasks_per_replica) / tasks_per_replica)
# = running + ((pending + running) / tasks_per_replica - running)
# = (pending + running) / tasks_per_replica
Contributor
I think this algebra is useful for showing that the new and old calculations are equivalent; it is not clear to me how useful it is on its own (particularly given that it refers to variable names that only exist in the old logic).
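
A quick numeric spot check of that equivalence, with made-up numbers and with the floor/ceil rounding of the real code left out, as in the algebra above:

pending, running, tasks_per_replica = 10, 20, 5  # hypothetical values

# Old shape: running replicas plus whatever extra is needed for the work that
# the running replicas cannot absorb within the SLO window.
old = running + (pending - running * (tasks_per_replica - 1)) / tasks_per_replica

# New shape: all outstanding work divided by per-replica throughput.
new = (pending + running) / tasks_per_replica

assert old == new == 6.0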

# How many tasks a replica can process within our tolerance period
new_tasks_per_replica = math.floor(args["slo_seconds"] / args["avg_task_duration"])
tasks_per_replica = math.floor(args["slo_seconds"] / args["avg_task_duration"])
# how many tasks can be covered by the running replicas, assuming they are
# busy and can only take new tasks after they are done with the current one
Contributor
I think this comment needs updating.

needed_replicas = math.ceil(outstanding / tasks_per_replica)
# Do not scale down in case we have pending, because some workers will
# receive SIGUSR1 and won't take any tasks after that.
needed_replicas = max(needed_replicas, running)
Contributor
Gah. I lost a long comment I had on this line, suggesting we get rid of the clamping here, as needed_replicas is already an overestimate when needed_replicas < running (assuming the workers complete the tasks that they are running before shutting down), so clamping it makes our estimate even worse.

Contributor Author
It makes a lot of sense to scale down when you have the capacity, but I'm worried about the situation when the polling interval is shorter than the task duration.

For example, with avg 60 and SLO 300, we have the following situation. Assume that some of the running tasks are very close to their end and those workers get shut down, some are not, so no new tasks get picked up, and the polling happens very often.

poll 1: pending: 10, running: 20 -> suggest to scale down to 6
poll 2: pending: 10, running: 6 -> suggest to scale down to 4
poll 3: pending: 10, running: 4 -> suggest to scale down to 3
poll 4: pending: 10, running: 3 -> noop

Also, with this simulation it makes sense to go to 3 in the first run, hmmm. Maybe I should rethink the approach.
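
A small script reproducing that simulation, assuming tasks_per_replica = floor(300 / 60) = 5, pending stuck at 10, the suggestion taken as the unclamped ceil((pending + running) / tasks_per_replica), and each suggested scale-down fully taking effect before the next poll:

import math

tasks_per_replica = math.floor(300 / 60)  # SLO 300, avg task duration 60

pending, running = 10, 20
for poll in range(1, 5):
    suggested = math.ceil((pending + running) / tasks_per_replica)
    action = "noop" if suggested == running else f"suggest to scale down to {suggested}"
    print(f"poll {poll}: pending: {pending}, running: {running} -> {action}")
    running = suggested  # the scale-down is assumed to complete before the next poll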

Contributor
So, the value I arrived at for hitting our SLO with the least slack in the case we are discussing was pending / (tasks_per_replica - 1), which gives the value 3 you have. That said, I don't think scaling down slightly slower than is optimal is a problem, as we end up finishing the work faster.

That last argument could also be used to argue for keeping the current clamping you have. If we did that, we should adjust the comment here to say that we don't scale down in order to rush to finish, rather than what is there currently.

However, in practice, given that tasks_per_replica (at least in prod) is always 2, any of the values will lead to finishing in about the same time (since needed_replicas >= pending for any of the calculations). That suggests we should use the simplest logic, which I think is dropping any special handling of the needed_replicas < running case (either the clamping, or using an alternative formula).
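
For illustration, the three candidates side by side with the prod-like tasks_per_replica = 2 and some hypothetical counts; all of them stay at or above pending, which is the point about finishing in roughly the same time:

import math

tasks_per_replica = 2          # the value used in prod, per the comment above
pending, running = 10, 20      # hypothetical scale-down situation

unclamped = math.ceil((pending + running) / tasks_per_replica)   # 15
clamped = max(unclamped, running)                                # 20 (current behaviour)
alternative = math.ceil(pending / (tasks_per_replica - 1))       # 10

assert min(unclamped, clamped, alternative) >= pending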

@rail rail closed this Aug 19, 2020