ClusterRole name templating breaks project monitoring when more than one is installed #92

rgcosma · 2024-08-10T16:07:39Z

Cluster Setup
K8s 1.27, Rancher 2.8.3, RKE2

Describe the bug
The ClusterRole and its corresponding ClusterRoleBinding in https://github.com/rancher/prometheus-federator/blob/main/charts/rancher-project-monitoring/0.4.2/templates/rancher-monitoring/hardened.yaml#L47 use the same name template, {{ .Chart.Name }}-patch-sa. The same goes for 0.4.1 and 0.4.0; I didn't check other versions.
This breaks project monitoring for every project after the first one, because the Helm installer pod errors out with "project-monitoring-patch-sa has incorrect ownership annotation". The name depends only on the chart, so every release tries to create the same cluster-scoped objects.
Changing the annotations or deleting the role doesn't help, because the helm controller immediately recreates them.
The workaround we applied was to change the template to use {{ .Release.Name }} instead and rebuild the binary, because the chart is bundled in it as a base64 string.
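
To illustrate why the name template matters, here is a minimal Python sketch of the collision. It is a toy model, not Helm itself: `render` only substitutes the two template values discussed here, `install` imitates an ownership check with a plain dictionary, and the chart and release names are placeholders (the chart name is chosen to match the object name in the error message above).

```python
import re


def render(template: str, chart_name: str, release_name: str) -> str:
    """Toy stand-in for Helm rendering: substitutes only the two values below."""
    values = {".Chart.Name": chart_name, ".Release.Name": release_name}
    return re.sub(
        r"\{\{\s*(\.[A-Za-z.]+)\s*\}\}",
        lambda match: values[match.group(1)],
        template,
    )


def install(cluster: dict, template: str, chart_name: str, release_name: str) -> str:
    """Create the cluster-scoped role, or fail if another release owns that name."""
    role = render(template, chart_name, release_name)
    owner = cluster.get(role)
    if owner is not None and owner != release_name:
        raise RuntimeError(f"{role} is owned by release {owner!r}, not {release_name!r}")
    cluster[role] = release_name
    return role


CHART = "project-monitoring"  # placeholder chart name (assumption)
RELEASES = ["monitoring-p-aaaaa", "monitoring-p-bbbbb"]  # hypothetical release names


def try_install_all(template: str) -> list:
    """Install one release per project into a fresh, empty toy cluster."""
    cluster = {}
    results = []
    for release in RELEASES:
        try:
            results.append(install(cluster, template, CHART, release))
        except RuntimeError as err:
            results.append(f"ERROR: {err}")
    return results


original = try_install_all("{{ .Chart.Name }}-patch-sa")
patched = try_install_all("{{ .Release.Name }}-patch-sa")
print(original)  # second entry is an ownership error
print(patched)   # two distinct role names
```

With the chart-based template, both releases render the same role name and the second install is rejected. With the release-based template, each release gets its own name.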

To Reproduce
Deploy the first project on a cluster - monitoring works, Prometheus and Grafana running
Deploy one more project

Result
The second and all subsequent project monitors get stuck with a confusing "WaitingForGrafanaDashboards" status in the web interface

Expected Result
All project monitors have status "Deployed"


rgcosma · 2024-08-30T07:58:11Z

FYI I ended up editing the subchart and rebuilding the binary, a tedious and convoluted process - why are you bundling a chart as a base64 string?

mallardduck · 2024-09-03T19:02:44Z

@rgcosma - Can you clarify here: you're not using the prometheus-federator project or chart directly, right? How are you installing them in your cluster, via the Rancher UI or something else? And what chart version and name are you specifically installing when you encounter this issue?

edit: Ultimately, what I'm trying to understand is where and why you're getting version 0.4.x in Rancher 2.8.x. As I understand it, that version is not valid for 2.8, and the highest should be 0.3.x on Rancher 2.8.

rgcosma · 2024-09-03T19:10:18Z

> @rgcosma - Can you clarify here: you're not using the prometheus-federator project or chart directly, right? How are you installing them in your cluster, via the Rancher UI or something else? And what chart version and name are you specifically installing when you encounter this issue?

Hi! I am installing the prometheus-federator chart directly. I tried via both Helm and Argo, with the same result. The chart version is 103.0.2+up0.4.0, downloaded from https://github.com/rancher/charts/tree/release-v2.8/charts/prometheus-federator, and the chart name is prometheus-federator.

rgcosma · 2024-11-12T13:21:21Z

> why you're getting version 0.4.x in Rancher 2.8.x

Ah, I see the confusion: we install chart 103.x+0.4.x, which uses the federator binary 0.3.5, which in turn embeds the project monitoring chart 0.4.x.

The bit that needed patching is in the project-monitoring chart: in the hardened.yaml file, I replaced all occurrences of {{ .Chart.Name }} with {{ .Release.Name }}.
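
For anyone repeating this patch, the replacement can be scripted. The following is a rough sketch, not the exact procedure used here: `patch_template` is a hypothetical helper, and `sample` is an illustrative fragment rather than the real contents of hardened.yaml. It rewrites only the `.Chart.Name` reference and leaves the surrounding template syntax untouched.

```python
import re

# Matches the value reference itself, so spacing and any pipeline
# around it inside the {{ ... }} action are preserved.
CHART_NAME_REF = re.compile(r"\.Chart\.Name\b")


def patch_template(text: str) -> tuple:
    """Return (patched_text, number_of_replacements)."""
    return CHART_NAME_REF.subn(".Release.Name", text)


# Illustrative fragment only; the real hardened.yaml contains more than this.
sample = """\
kind: ClusterRole
metadata:
  name: {{ .Chart.Name }}-patch-sa
---
kind: ClusterRoleBinding
metadata:
  name: {{ .Chart.Name }}-patch-sa
"""

patched, count = patch_template(sample)
print(f"{count} occurrence(s) replaced")
print(patched)
```

To apply it to a checked-out chart, the same function could be run over the file's text (read and written back with `pathlib.Path.read_text` and `write_text`). The embedded chart and the binary still have to be rebuilt afterwards, as described above.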

github-actions bot added the team/observability&backup label Aug 10, 2024
