feat(KONFLUX-5670): Define computeResources for prefetch-dependencies* tasks #1763

jhutar · 2024-12-17T20:07:49Z

OK, I was trying to reproduce KONFLUX-5951 and was able to do so with Gomod Cachi2 configured. I do not know what caused that issue, but I noticed that, for me, all 50 concurrent prefetch-dependencies pods were running on the same node!

I picked 1 CPU and 3 GiB of memory as the request and, in an effort to slowly migrate to a model where requests == limits, used the same values for the limits. This is why I used these numbers (graphs from stone-prd-rh01):

... looks like the steps step-prefetch-dependencies and step-create-trusted-artifact can take about 2 GB (including cache and so on)

... RSS memory is of course smaller

... regarding CPU, it takes about 1 CPU across all containers

... again, only the containers step-prefetch-dependencies and step-create-trusted-artifact seem to be the bigger players here
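To make the effect of these numbers on scheduling concrete, here is a rough sketch of the arithmetic, not anything from the PR itself. It assumes a hypothetical node with 16 allocatable CPUs and 64 GiB of allocatable memory (the real node sizes in stone-prd-rh01 are not stated here) and only considers CPU and memory. The type and function names are made up for illustration.

```go
package main

import "fmt"

// resources is a simplified, hypothetical stand-in for a resource list:
// CPU in millicores and memory in bytes.
type resources struct {
	milliCPU int64
	memBytes int64
}

const gib = int64(1) << 30

// maxPodsPerNode returns how many pods with the given requests fit into the
// node's allocatable resources, looking at CPU and memory only.
// It returns -1 when a request is missing, because then these two resources
// put no upper bound on the pod count.
func maxPodsPerNode(node, req resources) int64 {
	if req.milliCPU <= 0 || req.memBytes <= 0 {
		return -1
	}
	byCPU := node.milliCPU / req.milliCPU
	byMem := node.memBytes / req.memBytes
	if byCPU < byMem {
		return byCPU
	}
	return byMem
}

// nodesNeeded returns the minimum number of nodes for the given pod count,
// rounding up.
func nodesNeeded(pods, perNode int64) int64 {
	return (pods + perNode - 1) / perNode
}

func main() {
	// Assumption: node size is hypothetical, chosen only for the example.
	node := resources{milliCPU: 16000, memBytes: 64 * gib}
	// Requests from this PR: 1 CPU and 3 GiB.
	req := resources{milliCPU: 1000, memBytes: 3 * gib}

	perNode := maxPodsPerNode(node, req)
	fmt.Println("pods per node:", perNode)
	fmt.Println("nodes for 50 pods:", nodesNeeded(50, perNode))
}
```

Under these assumed node sizes, the requests cap a node at 16 such pods, so 50 concurrent pods would need at least 4 nodes instead of landing on one.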

hugares · 2024-12-17T20:49:04Z

the failing check is saying you need to generate some files:

Error: File is out of date, run `hack/generate-ta-tasks.sh` and include the updated file with your changes

hugares · 2024-12-17T20:50:39Z

the failing check is saying you need to generate some files:
Error: File is out of date, run `hack/generate-ta-tasks.sh` and include the updated file with your changes

I think this is because the -ta version does not have the same values as the non-ta one

jhutar · 2024-12-18T08:26:34Z

Hello @hugares. In the end I had to make some changes in task-generator/trusted-artifacts/ta.go. When I just added ComputeResources to that step in task-generator/trusted-artifacts/golden/prefetch-dependencies/ta.yaml, it did not help (I was not getting the ComputeResources definition in the resulting TA task yaml).

Hello @zregvart. Is this the correct approach, please?

jhutar · 2024-12-18T10:20:34Z

For the record, this is the failure rate of prefetch-dependencies.* tasks in stone-prd-rh01 over the last week and day:

And average over time:

jhutar · 2024-12-19T06:56:02Z

Hello @zregvart. Is this the correct approach, please?

OK, ignore me, Zoran. It looks like the yaml in task-generator/trusted-artifacts/golden/prefetch-dependencies/ta.yaml is only used in tests, so changing it will not help with generating the yamls, but it will help the go test run in the CI checks pass.

hugares

/lgtm

chmeliik

Is this based on tests with one specific repo, or do you have data across varied uses of git-clone and prefetch-dependencies?

The memory and cpu usage will vary wildly depending on the size of the repo, the configured package managers, the number of dependencies etc.

zregvart · 2024-12-20T09:41:12Z

task-generator/trusted-artifacts/ta.go

+			ComputeResources: core.ResourceRequirements{
+				Requests: core.ResourceList{
+					core.ResourceCPU:    resource.MustParse("1"),
+					core.ResourceMemory: resource.MustParse("3Gi"),
+				},
+				Limits: core.ResourceList{
+					core.ResourceCPU:    resource.MustParse("1"),
+					core.ResourceMemory: resource.MustParse("3Gi"),
+				},
+			},


Ideally the values here would be part of the recipe. This is nothing that would block this change; it can be added if/when there is a need to modify the values in the future.

jhutar · 2024-12-20

I agree (as far as I understand the architecture here), but that would require a bigger code change and I did not want to go too deep into it for now. But let me know and I will happily work on it in a different PR.
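As a side note on the requests == limits model used in the diff above, the following is a minimal sketch of what "pinned" resources mean per step. It uses plain structs as a hypothetical stand-in for `core.ResourceRequirements` so that it runs with the standard library only. The third step name is invented, and whether other steps in the real task set any resources is not stated in this thread.

```go
package main

import "fmt"

// resources is a simplified, hypothetical stand-in for a resource list:
// CPU in millicores and memory in bytes.
type resources struct {
	milliCPU int64
	memBytes int64
}

// step pairs a step name with its requests and limits.
type step struct {
	name     string
	requests resources
	limits   resources
}

const gib = int64(1) << 30

// requestsEqualLimits reports whether a step sets non-zero CPU and memory
// requests that match its limits exactly.
func requestsEqualLimits(s step) bool {
	return s.requests.milliCPU > 0 && s.requests.memBytes > 0 && s.requests == s.limits
}

// allPinned reports whether every step has requests equal to limits.
func allPinned(steps []step) bool {
	for _, s := range steps {
		if !requestsEqualLimits(s) {
			return false
		}
	}
	return true
}

func main() {
	// Values from the diff: 1 CPU and 3Gi for both requests and limits.
	pinned := resources{milliCPU: 1000, memBytes: 3 * gib}
	steps := []step{
		{name: "prefetch-dependencies", requests: pinned, limits: pinned},
		{name: "create-trusted-artifact", requests: pinned, limits: pinned},
		// Assumption: a hypothetical step with no resources defined.
		{name: "some-other-step"},
	}
	for _, s := range steps {
		fmt.Printf("%s: requests == limits: %v\n", s.name, requestsEqualLimits(s))
	}
	fmt.Println("all steps pinned:", allPinned(steps))
}
```

This matters because, as far as I know, Kubernetes only assigns the Guaranteed QoS class to a pod when every container in it has CPU and memory requests equal to its limits, so pinning two steps is a step toward that model rather than the end state.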

jhutar · 2024-12-20T11:47:57Z

Hello @chmeliik ! Graphs come from real world builds on stone-prd-rh01.

If the concern is that the limits are too small, I think they should be OK until some new huge repo appears (I have pasted a graph for a whole week of data).

If the concern is that the requests (and limits) are too high for small repos without specific requirements, I assume that should be fine, because in that case the pod/task will run only briefly, so the resources will not be allocated for too long.

chmeliik · 2024-12-20T12:18:03Z

Hello @chmeliik ! Graphs come from real world builds on stone-prd-rh01.

Ack 👍

If the concern is that the limits are too small, I think they should be OK until some new huge repo appears (I have pasted a graph for a whole week of data).

Yeah this was it. Ok, let's see if the current requests and limits are enough

chmeliik

LGTM, could you just update the commit message or split into two commits? This PR updates the computeResources for every create-trusted-artifact step and for one separate step in the prefetch-dependencies task

jhutar requested review from brunoapimentel, eskultety, taylormadore and a team as code owners December 17, 2024 20:07

jhutar force-pushed the fix11-prefetch-resources branch from 890ab47 to 52db40e Compare December 18, 2024 08:20

jhutar force-pushed the fix11-prefetch-resources branch from 52db40e to bce5491 Compare December 18, 2024 09:58

jhutar force-pushed the fix11-prefetch-resources branch from bce5491 to 651c76f Compare December 18, 2024 11:08

feat(KONFLUX-5670): Define computeResources for prefetch-dependencies…

faa9fcb

… tasks

jhutar force-pushed the fix11-prefetch-resources branch from 651c76f to faa9fcb Compare December 19, 2024 06:53

hugares approved these changes Dec 19, 2024

chmeliik reviewed Dec 20, 2024

zregvart approved these changes Dec 20, 2024

chmeliik reviewed Dec 20, 2024
