ClusterRole name templating breaks project monitoring when more than one is installed #92

Open
rgcosma opened this issue Aug 10, 2024 · 4 comments

Comments


rgcosma commented Aug 10, 2024

Cluster Setup
K8s 1.27, Rancher 2.8.3, RKE2

Describe the bug
The ClusterRole and its corresponding ClusterRoleBinding in https://github.com/rancher/prometheus-federator/blob/main/charts/rancher-project-monitoring/0.4.2/templates/rancher-monitoring/hardened.yaml#L47 use the same name template, {{ .Chart.Name }}-patch-sa. The same goes for 0.4.1 and 0.4.0; I didn't check other versions. Because .Chart.Name is identical for every release of the chart and both resources are cluster-scoped, every project monitor renders the same two names.
This breaks project monitoring after the first one is installed, because the helm installer pod errors out with "project-monitoring-patch-sa has incorrect ownership annotation".
Changing the annotations or deleting the role doesn't work, because the helm controller immediately recreates them.
The workaround we applied was to change the above to {{ .Release.Name }} and rebuild the binary, because the chart is bundled as a base64 string.
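
The rename described in the workaround can be sketched roughly as follows. This is an illustrative fragment, not the actual contents of hardened.yaml: only the two name templates come from this report, and the surrounding fields are standard Kubernetes RBAC boilerplate assumed for context.

```yaml
# Illustrative sketch only; not copied from hardened.yaml.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # Before: {{ .Chart.Name }}-patch-sa
  # That name is identical for every release, so a second install collides
  # with the cluster-scoped object owned by the first release.
  name: {{ .Release.Name }}-patch-sa
# rules omitted (assumed unchanged by the workaround)
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ .Release.Name }}-patch-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  # Must match the ClusterRole name above.
  name: {{ .Release.Name }}-patch-sa
# subjects omitted (assumed unchanged by the workaround)
```

Names derived from .Release.Name only avoid the collision if release names are unique across the cluster. Helm itself enforces uniqueness per namespace only, so that is an assumption about how the project releases are named.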

To Reproduce
Deploy the first project on a cluster. Monitoring works, and Prometheus and Grafana are running.
Deploy one more project.

Result
Second and all subsequent project monitors get stuck with a confusing "WaitingForGrafanaDashboards" status in the web interface

Expected Result
All project monitors have status "Deployed"

Author

rgcosma commented Aug 30, 2024

FYI I ended up editing the subchart and rebuilding the binary, a tedious and convoluted process - why are you bundling a chart as a base64 string?

Member

mallardduck commented Sep 3, 2024

@rgcosma - Can you clarify here, you're not using the prometheus-federator project or chart directly right? How are you installing them in your cluster, via the Rancher UI or something else? And what chart version and name are you specifically installing when you encounter this issue?

edit: Ultimately, what I'm getting at here is to understand where and why you're getting version 0.4.x in Rancher 2.8.x. From my understanding, that version is not valid for 2.8, and the highest should be 0.3.x on Rancher 2.8.

Author

rgcosma commented Sep 3, 2024

> @rgcosma - Can you clarify here, you're not using the prometheus-federator project or chart directly right? How are you installing them in your cluster, via the Rancher UI or something else? And what chart version and name are you specifically installing when you encounter this issue?

Hi! I am installing the prometheus-federator chart directly. I tried via both Helm and Argo, with the same result. The chart version is 103.0.2+up0.4.0, downloaded from https://github.com/rancher/charts/tree/release-v2.8/charts/prometheus-federator, and the chart name is prometheus-federator.

Author

rgcosma commented Nov 12, 2024

> why you're getting version 0.4.x in Rancher 2.8.x

Ah, I see the confusion: we install chart 103.x+0.4.x, which uses the federator binary 0.3.5, which in turn embeds the project monitoring chart 0.4.x.

The bit that needed patching is in the project-monitoring chart: in the hardened.yaml file, I replaced all occurrences of {{ .Chart.Name }} with {{ .Release.Name }}.
