Flux (and apps) in MC logs visibility to customers #948
@marians @pipo02mix if some context is missing, please add it :)
I'm quite strongly against allowing (unfiltered) log access. It's a managed solution - customers should not have access to anything in
(1) and (2) are very much separate requests. I suggest splitting them. (2) is quite customer specific, so potentially worth discussing in a non-public repo.
IMO it makes sense to give them view access to the Flux that manages the apps/clusters they install (not the one managing our components). If we put that in a dashboard, even better.
The problem is that even if we figure out the permissions problem reasonably, there's no way we can do org separation in logs. I really don't see much value here, as Flux's CR statuses are pretty verbose. Do you have a specific use case where logs were really needed to debug a problem?
To my understanding, org separation is not an issue here. This is about access for customer admins, who are bound to the
We will bring this to SIG Product to discuss whether we should give customers access to logs for apps on the MC.
What do you think of this @JosephSalisbury @weatherhog @puja108?
I reckon this can be done with #311; priority is up to @weatherhog, but IIRC it's soon.
From reading the above it is still unclear to me what the customer requested (what logs do they want to see exactly? which Flux? what resources are they interested in?) and, more importantly, what their use case for seeing those logs is: what are they trying to achieve, how is it failing, why do they need log access to debug it, and so on. Only when the use case is clear can we realistically evaluate what makes sense going forward, whether log access in any form would really solve it, and what alternative we'd rather offer.
The use case would be: "As a cluster admin, I would like to debug why some (managed) apps are not installed in my workload cluster." I am not sure if they need logs or
Regarding which logs the customer wanted to see, that info is very detailed: those of the
Discussion in SIG Product Sync:
From a product point of view: Admins on the customer side should have access to all logs on the management cluster. We want to be transparent. Admins should be able to understand what is going on. They should also be able to debug their own actions, or the management cluster misbehaving. So an admin role should have the permission to access logs of all pods.
@giantswarm/team-atlas, tangent question: does our Loki support SSO?
@PrashantR31 Loki itself does not provide any authentication mechanisms, so in the end this is a no to SSO support. If authorisation is needed for Loki, it mainly depends on the customer's preferred approach: it could be a combination of Loki and oauth2-proxy, or it could be Loki SaaS.
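To make the oauth2-proxy option concrete, a minimal config sketch is below. Everything here is an assumption for illustration: the issuer URL, client credentials, and the Loki upstream address would all depend on the installation.

```
# Hypothetical oauth2-proxy config fronting a Loki endpoint.
provider = "oidc"
oidc_issuer_url = "https://dex.example.com"        # assumed Dex issuer URL
client_id = "loki"                                 # illustrative OIDC client
client_secret = "REDACTED"
cookie_secret = "REDACTED"
email_domains = ["*"]
http_address = "0.0.0.0:4180"
upstreams = ["http://loki-gateway.loki.svc:3100"]  # assumed Loki service address
```

With this pattern, oauth2-proxy handles the login against the identity provider and only proxies authenticated requests through to Loki, which itself stays unauthenticated inside the cluster.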
Thanks @weatherhog, too bad Loki does not have SSO support. But what I was thinking:
FYI @PrashantR31, Atlas already has a story about what you described: #311
Action from refinement in Rainbow:
Yeah, that is it: a new ClusterRole. Not sure we can limit to specific pods inside a namespace, though. Not sure how we can evaluate what logs come out of every operator, but I would assume no credentials are leaked at all; maybe account IDs and other infra resource names, but I think those are also available in the CRs. In any case, I will try to find out if we can limit to specific pods, and then we add as needed.
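For illustration, a namespaced Role along these lines would cover the log-reading part (all names here are hypothetical, not the role that was eventually shipped). On the "limit to specific pods" question: RBAC can only narrow further via `resourceNames`, which must match exact pod names, so per-pod limits are impractical for Deployment-managed pods whose names change on every rollout.

```yaml
# Hypothetical sketch: read-only log access scoped to the flux-system
# namespace. Role, binding, and group names are illustrative.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flux-log-reader        # illustrative name
  namespace: flux-system
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flux-log-reader
  namespace: flux-system
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: customer-admins      # illustrative subject group
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: flux-log-reader
```

Because both objects are namespaced, the grant cannot reach logs outside flux-system even if the subject group is cluster-wide.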
If we leak no credentials, I think we're good - that was my only concern 👍
IMO your last comment @piontec should be our only concern; the "why" isn't the problem. I think @paurosello described it well, and IMO customers should have access to all logs on the management cluster. There will also be different roles in the future. But next to central logging we need to have native logging (kubectl logs) as well. At the moment I see four use cases, and we might already separate them
Going component by component is a first option. Then we need a way to build enough confidence to create a role that opens up all logs. @piontec @gianfranco-l depending on the amount of work that is necessary, can you check if you can enable Flux logs so this story becomes unblocked? @TheoBrigitte @weatherhog can you take the general story around access to MC logs into Atlas and think about it more holistically? AFAIK you already have a story about letting customers monitor their own controllers on the management clusters too.
@teemow We will provide access to the logs to customers once we have a logging solution up and running (one central Loki). The main issues will arise from how we want to do multi-tenancy, but this is in refinement.
@QuentinBisson what is the reason to wait for central logging? As I mentioned I think customers should be able to use a simple kubectl logs for all components in the MC too. |
I was answering about the log story regarding Atlas, and after rereading, I understand how it was misinterpreted 😅 . One interesting thing for us would be to ask customers how they are using MC logs, as it could be useful for us once we have Loki up and running.
My question was if Atlas can take this too. Somebody needs to think about the general concept of how we structure access to logs on the management cluster. |
Flux has a strict set of pods, so targeting them with RBAC shouldn't be that hard. @paurosello, have you already started? Do you have something you can share? @gianfranco-l, as we need to prioritize this, I'm putting it back into our Inbox. @teemow Generally speaking, we can "just" grant the (Cluster)Role access to
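One way to read "grant the (Cluster)Role access" is to define the rule once as a ClusterRole and then scope where it applies with a namespaced RoleBinding. A sketch, with all names invented for illustration:

```yaml
# Hypothetical sketch: a cluster-wide rule definition that only takes
# effect in flux-system because the binding is namespaced.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-log-reader         # illustrative name
rules:
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding              # namespaced binding => access only in flux-system
metadata:
  name: flux-pod-log-reader
  namespace: flux-system
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: customer-admins      # illustrative subject group
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pod-log-reader
```

The design choice matters: binding the same ClusterRole with a ClusterRoleBinding instead would open pod logs in every namespace, which is exactly the "all pods/log" role the discussion wants to avoid.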
Yes, we can create the cluster role with potential restrictions if we think they're needed, but the binding part from customer to role should live elsewhere (i.e. be done by another team). @giantswarm/team-honeybadger how do you currently create cluster roles at the MC level, so we could do the same?
@QuentinBisson my point was that Atlas should think about whether and which roles are needed and take full ownership of how we allow customers to access logs on management clusters 🙏 @piontec as mentioned above (#948 (comment)), we need to assess and build up confidence. A simple role to access all pods/log is not what I want. But I would leave the assessment and the structure of the roles to Atlas. I only asked if Honey Badger can unblock this by adding a role for Flux.
A new role has been introduced in the
This has been tested by Adam from THG and provides him with the log access he requires for Flux.
Added the Atlas label and put this on the Atlas board so that they can work on their part of the scope as outlined here.
@gianfranco-l @mproffitt we have another request to give access to app-operator logs, to debug the installation of an app (ArgoCD) on the management cluster. Can @mproffitt enable this similarly to how he did it for Flux? @TheoBrigitte as you can see, this is getting more and more important. IMO customers will also want to inspect logs of CAPI controllers soon. So a more general set of permissions that allows customers to read logs on the management cluster would be good. And I don't mean via Loki yet. Just via
@teemow Exposing the logs from pods within the
IMHO, requests seem to be leaning towards granting more and more privileges to things we own on the management clusters, and this is something I consider potentially dangerous. We need to consider how we can best protect our resources whilst enabling the customer to achieve their own product vision.
I hear your worries. But for me, the benefits and our general philosophy around shared responsibility, knowledge sharing, and providing a transparent platform rather than a black box outweigh them. This enables customers to solve their own problems much quicker instead of having to wait for our support to respond. This is good for the customers and good for us.
But to be able to do that, we have to rework and fix part of our deployment configs, namely secrets shared across all installations. If we don't do that, we risk one customer finding out secrets of other installations - I still find this unacceptable. On the other hand: we talked within the team, and we think keeping customers from accessing everything on the MC doesn't make sense anymore. It's too hard to check whether every possible escalation path is secured, and it's super time-consuming to check every custom operator customers want to run against RBAC policies and what it can access. I think we should separate access credentials between installations "soon" and then throw in the towel: just give customers MC admin permissions, with them accepting the associated risk. WDYT?
Why does giving access to logs give access to secrets? Can you explain this a bit better? I disagree that it is a "throw in the towel" situation. We need to think about what is important, and UX and security need to go hand in hand. This is a challenge, and yes, it needs thorough investigation. Just widening the customer's role to admin permissions is not what we should do, IMO. This would leave all the work and mistakes to the customer. Yes, we maybe want to give admin permissions in general. But we also need to make sure we have well-defined roles, so it is easy for the customer to follow a least-privilege principle for both users and applications.
I agree about well-defined roles. What I mean is that it is hard, or maybe even impossible, for us to judge whether our MCs are still secure given the following facts:
What I want to say is that it is already super hard to tell whether the sum of everything mentioned above is secure. With each custom role, we make it even more complex and increase the probability that an attack vector exists that effectively gives a customer cluster-admin anyway. As such, maybe we shouldn't protect our Secrets at all costs, but instead make sure that the secrets we put there are safe for a customer to read. Then we can tell customers: "it's OK to run this or that with cluster-admin, but please be aware of the consequences; still, if you really want to, you can". And for us, since the customer already has potential admin access, we don't have to make sure that they can't read something from the
💯. We aren't there yet. The first step is to work on our secret management, and the second is to write more documentation about the management cluster structure etc. Customers need to fully understand how the management cluster works before they do something with admin permissions. And our goal should still be to define proper roles, so that admin permissions for the customers are only necessary in very specific (and currently unsupported) use cases. This would enable customers, and they aren't blocked by us. Less pressure on our teams. But it also adds responsibility on the customer, as they can break more things. We need to make this clear and also build a safety net. But I like this as a goal to work towards. @puja108 wdyt? I'll add a note to our board so that we can create a proper epic about it in Horizon first.
Yeah, I'm also evaluating and talking to upbound and Redhat around kcp towards such use cases. It could enable better interfaces and tighter security within our MCs. Some of these topics also go towards a possible MC team within KaaS to improve architecture of MCs. Definitely a lot of sub-stories to it, but let's start with the bigger picture in Horizon for now. |
Closing, as it's done. |
From a customer request:
In the context of management cluster auth via Dex against Okta, the customer requested:
Need logs visibility for Flux (as an admin)
kubectl logs -n flux-system kustomize-controller-5bd7f98f56-frkd
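The pod-name suffix in that command changes on every rollout, so if log access is granted, a label selector is more stable. A sketch, assuming the default labels that Flux v2 puts on its controllers:

```
# Follow kustomize-controller logs by label instead of by pod name
# (assumes the standard app=kustomize-controller label from Flux v2).
kubectl logs -n flux-system -l app=kustomize-controller --follow
```

This keeps working across restarts and upgrades, and it is also the form that a documented runbook for customers could use verbatim.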