-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kuadrant CR policy support proposal #6
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,164 @@ | ||
# RFC Template | ||
|
||
- Feature Name: Control Policy management at the Kuadrant CR level | ||
- Start Date: 2022-11-15 | ||
- RFC PR: [Kuadrant/architecture#0000](https://github.com/Kuadrant/architecture/pull/0000) | ||
- Issue tracking: [Kuadrant/architecture#0000](https://github.com/Kuadrant/architecture/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Not every use cases requires, or even would want to permit usage of all of Kuadrant's capabilities to | ||
be enabled on the target cluster. This proposal aims at giving users control, through the `kuadrant` | ||
custom resource, of what gets enabled on the cluster when deploying Kuadrant. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
The rational behind this proposal is that not every deployment requires all of Kuadrant features to be | ||
supported; currently `RateLimitPolicy` and `AuthPolicy`. There is no reason to impose the workload to | ||
run on the cluster if it isn't need, but there might be more reasons why explicitly disabling one | ||
feature might be desirable: disabling the users to actually apply one or the other policy to their | ||
traffic. Authorization for instance could always be implemented at the application level as per the | ||
organization desire, while rate limiting is achieved through `RateLimitPolicy` attached to the network | ||
resources. Just as the opposite situation might be applicable to an organization that wants to do | ||
authn/z by leveraging Kuadrant, but have no use cases for rate limiting. | ||
|
||
This proposal aims not only at defining a way to disable one of these features at deploy time, but | ||
also clarify what the behavior would be when such a state transitions while some policy are already | ||
existent; what the user experience would be when trying to apply a policy that's been explicitly | ||
disabled by the cluster administrator when Kuadrant was installed. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
When applying the `Kuadrant` resource to the cluster for the first time, you have the option to | ||
disable the support for a policy by specifying the following in the `spec` of the resource: | ||
|
||
```yaml | ||
authentication: | ||
policy: disabled | ||
rateLimiting: | ||
policy: disabled | ||
``` | ||
|
||
The `policy` key of either of the service, `authentication` and `rateLimiting`, defaults to `enabled`. | ||
That value can be specified as `disabled`. The behavior on both services is consistent: it disables | ||
the support for the policy backed by the service, i.e. `AuthPolicy` and `RateLimitPolicy` | ||
respectively. | ||
|
||
When disabling `RateLimitPolicy`, Kuadrant will not create the `Limitador` object used by the | ||
`limitador-operator`. Disabling the `AuthPolicy` yields the same result for the `Authorino` object. | ||
Additionally it will avoid registering the [external | ||
authorizer](https://istio.io/latest/docs/tasks/security/authorization/authz-custom/) with Istio. The | ||
both policies API will be exposed and the reconcilers will still be registered. Creating a policy that | ||
is disabled would be denied. | ||
|
||
A disabled policy can be enabled by changing the value of the `policy` key. Flipping the value can | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how would we enforce this? |
||
only happen when no policy object exist, e.g. in order to disable the `authentication`, no | ||
`AuthPolicy` can exist or this would fail. | ||
|
||
While no default `Limitador` or `Authorino` object would be created, a user can absolutely create one | ||
maleck13 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
or the other and use their APIs directly. In the case of authentication, the user would need to | ||
register the external authorizer with Istio themselves. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
Changing the definition of the `kuadrant` custom resource is relatively straight forward. It is | ||
intended that the way to enable or disable the support for either policy is not a boolean value. Using | ||
an enumeration, i.e. `disabled` or `enabled`, enables growing the set of values later. | ||
|
||
## Installation | ||
|
||
On initial installation, honoring one or the other setting would be fairly straight forward as well: | ||
|
||
The both CRDs, as well as their reconcilers all get registered whether the `policy` support is | ||
`enabled` or `disabled`. | ||
|
||
- `enabled` The Kuadrant controller makes sure the external authorizer gets registered with Istio and | ||
makes sure there is an `Authorino` and/or `Limitador` object without any changes as it does today. | ||
- `disabled` None of the above would happen. Most importantly, any missing part required for the above | ||
shouldn't have the reconciliation fail and prevent further work it requires to do (as reconciling | ||
another policy that's being `enabled`). | ||
|
||
Additionally a `ValidatingWebhookConfiguration` gets registered for each CR's `CREATE` operation for | ||
the entire cluster. The webhook will only allow admissions of resources for which the policy is | ||
currently enabled. `UPDATE` and `DELETE` should remain unhindered, as in order to transition the | ||
policy support requires no policies to be present. | ||
|
||
If any of the following steps fails, all previously applied changes need to be unrolled and the | ||
`status` block of the CR would clearly reflected what changed and that Kuadrant is currently _not_ | ||
operational. This is to be considered an atomic operation, either all gets successfully deployed or | ||
nothing is. | ||
|
||
If a policy is disabled but a policy is being added, when disallowing the admission, the webhook would | ||
use a response code `501 Not Implemented` with the additional `message` indicating the policy is | ||
disabled and as such can't be honored. | ||
|
||
### In an OSSM cluster | ||
|
||
The deployment of `Kuadrant` should fail if `authentication` isn't explicitly disabled. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should we make this explicit in this proposal? I assume we will want it to succeed in the future? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure this is required after Kuadrant/kuadrant-operator#111 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @eguzki's right. This section can now be removed from the RFC. |
||
|
||
## State transitions of policy support | ||
|
||
If the `Kuadrant` object gets updated to change the support of a policy, the change would only be | ||
considered if no such policy exists in the cluster. No state transitions with regards to policy | ||
support is possible if policies of the given type are present. | ||
|
||
While it would be good to be able to enable the support for a policy while such policies already | ||
exist, there is much to consider as to what this would entail - here are a few things to consider: | ||
|
||
- can we apply all these policies? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel we have this issue anyway? If I remove all the policies and recreate post enabling we will need to be able to reconcile them |
||
- what ordering should be considered to apply them? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure the controller should be concerned about the ordering, unless I have missed something. Is it about conflict? To resolve a conflict do we not need to fall back to the oldest policy wins anyway from policy attachment? |
||
- what would a conflict mean? | ||
- what would disabling `authentication` for a cluster, while existing routes are currently protected by these? | ||
|
||
Requiring to first remove all policies and then only safely disable or enable the support for that | ||
kind of policy seems reasonable at this stage and can always be added later, which clearly specified | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agree so the questions above may still be valid for discussion but I think as a starting point this is simple set of states to deal with initially but also agree that we shouldn't rule this out in the future as likely the flow a user will follow here is |
||
behavior. | ||
|
||
### Disabling the support for a policy | ||
|
||
When disabling the support for a policy, merely the admission webhook's behavior changes. The | ||
`Authorino` resource or `Limitador` ones remain untouched. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
The flipside of the flexibility this provides users with, this introduces quite some complexity. Hence | ||
the desire to keep this very rigid as to what are the preconditions to change the support for a | ||
policy. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
Possibly the gain of not running unnecessary services on the cluster isn't big enough for us to | ||
justify adding this complexity. Yet, as of today, we need a way to disable `authentication` in the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No longer true. See https://github.com/Kuadrant/architecture/pull/6/files#r1038517609 |
||
case of running within an OSSM backed cluster. | ||
|
||
We could disable `authentication` on a heuristic base approach: running within OSSM? Disable | ||
`authentication`. The user experience seems less than desirable and confusing as best, as in future | ||
version this would probably change. That being said we're still pre-v1. We'd also need to find a way | ||
to know very early on what environment we are being deployed in. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm. So this for me comes to the fundamental question. Do we want to support these different system states as feature or do we want to say in v0.x.y authpolicy is not supported in a particular environment add the webhook mentioned earlier during installation and reject any AuthPolicies in that environment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would not follow the approach of adding/removing CRDs dynamically based on some Kuadrant CR. CRD's are global cluster objects and that raises potential conflicts. What if some user in the cluster is using the Limitador operator to limit their service by their own (not using kuadrant)? Or how does this idea fit when kuadrant supports multiple kuadrant data plane instances? Regarding having AuthPolicy part of the Authorino Operator (or RateLimitPolicy part of the Limitador Operator) is interesting to consider. I do not see now a hard stopper. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't see Kuadrant's RLP and KAP as definitions of Limitador Operator and Authorino Operator respectively. These CRDs of Kuadrant are committed to GW-API's policy attachment value proposition, which the aforementioned operators aren't. In fact, I'd see such as a violation of the current dependency that exists between Kuadrant and its functional components, where the lower dependencies (Limitador and Authorino) currently can exist without knowing anything about Kuadrant or its purpose. Furthermore and especially within the scope of this particular RFC, I agree with @eguzki's point of CRDs being global definitions in the cluster and therefore their installation not the way to control whether a particular workload in a particular namespace shall be deployed or not. For both of these reasons, I believe that installing the CRD is one thing, while choosing which functional component to deploy is another separate one. Therefore, RLP and KAP CRDs should continue living in the Kuadrant Operator IMO. |
||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
While I couldn't find, probably because of lacking knowledge about similar use-cases, anything that | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think in general not enabling an API in a cluster or disabling an API is done by making the CRD available or deleting it |
||
contradicts what this approach proposes, I couldn't see how what this would entail goes against any | ||
Kubernetes best practices for Operators. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
- Is this really what we want? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is the key question. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we want this. However we might want other features before this one to make users/consumers adopt Kuadrant. I know I do not know :) |
||
- Can we get authentication supported in OSSM and possibly pun on this a little longer? | ||
- Should we have a deployment that skips authentication entirely when in an OSSM cluster? | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
- relaxing the requirements for state transitions on policy support | ||
- support some kind of lazy support for policy, where limitador and/or authorino are only lazily | ||
deployed to the cluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it leaves room for 'authorization' in the meaning of the flag as well.