diff --git a/keps/sig-node/3619-supplemental-groups-policy/README.md b/keps/sig-node/3619-supplemental-groups-policy/README.md new file mode 100644 index 000000000000..a416c78bc2bd --- /dev/null +++ b/keps/sig-node/3619-supplemental-groups-policy/README.md @@ -0,0 +1,797 @@ +# KEP-3619: Fine-grained SupplementalGroups control + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories](#user-stories) + - [Story 1: Deploy a Security Policy to enforce SupplementalGroupsPolicy field](#story-1-deploy-a-security-policy-to-enforce-supplementalgroupspolicy-field) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+ +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + +`PodSecurityContext.SupplementalGroups` is a list of groups applied to the first process run in each container, in addition to the container's primary GID, the `fsGroup` (if specified), and __group memberships defined in the container image for the UID__ of the container process. 
However, most cluster administrators and users are likely unaware of this behavior, especially of the groups defined in the container image. + +For example, if we have a Pod like this: +```yaml +spec: + # the primary UID will be uid=1000 + securityContext: { runAsUser: 1000, runAsGroup: 1000, fsGroup: 50000, supplementalGroups: [60000] } + containers: + - image: some-image # the image defines uid=1000 belongs to gid=70000 +``` +Then, the groups of the first process of the container will be `1000(runAsGroup), 50000(fsGroup), 60000(supplementalGroups), 70000(defined in the image)`. + +This KEP proposes a complementary field named `SupplementalGroupsPolicy` that lets users choose whether to respect or ignore the group memberships defined in the container image for the container's primary UID when `SupplementalGroups` is applied to the container process. + +## Motivation + +As described above, `SupplementalGroups` is simply added to the groups defined in the container image for the primary UID. When a cluster enforces a security policy that protects the value of `SupplementalGroups`, the effect of the enforcement is limited, i.e., users can easily bypass it just by using a custom image. Such a bypass would be unexpected for most cluster administrators because it makes the enforcement almost useless. Moreover, the bypass can grant unexpected file access permissions. In some use cases, these unexpected permissions are a security concern. For example, with `hostPath` volumes this could be a severe problem because UIDs/GIDs determine access to the files/directories in the volumes. + +However, Kubernetes provides no API surface to prevent this bypass, although it could sometimes lead to a security concern. 
+To mitigate the bypass, cluster administrators would need to deploy a custom low-level container runtime (e.g., [pfnet-research/strict-supplementalgroups-container-runtime](https://github.com/pfnet-research/strict-supplementalgroups-container-runtime)) that modifies the OCI container runtime spec (`config.json`) produced by CRI implementations (e.g., containerd, cri-o). A custom `RuntimeClass` would have to be introduced for it. This would be an extra operational burden for cluster administrators. + +Thus, this KEP proposes a new API field named `SupplementalGroupsPolicy` that enables cluster administrators to prevent the bypass. The new field allows cluster administrators to deploy security policies that protect the `SupplementalGroupsPolicy` field in the cluster, preventing the unexpected bypass of `SupplementalGroups` described above. + +### Goals + + + +To provide a new API field to control how the groups of the first process in the container are calculated. + +### Non-Goals + + + +## Proposal + + + +### User Stories + + + +#### Story 1: Deploy a Security Policy to enforce `SupplementalGroupsPolicy` field + +Assume a multi-tenant Kubernetes cluster with `hostPath` volumes in the following situation: + +- The multi-tenant model is namespace based (namespace per tenant (user/group) model) + - access to each namespace is controlled by RBAC +- PSP (or another policy engine) is enforced in each namespace (details are described below) +- A `hostPath` volume (say `/mnt/hostpath`) is maintained on all the nodes by administrators + - with permission `drwxr-xr-x nobody nogroup /mnt/hostpath` + - the directory mounts an NFS volume that is shared by all the tenants, and UIDs/GIDs are managed by the cluster admin + - Any tenant CAN create a directory under this directory + +Then, cluster administrators can deploy security policies protecting `supplementalGroupsPolicy` in each tenant namespace. 
+Here is an example policy description for the namespace `user-alice` for user `alice (uid=1000)` (assume `alice` belongs to only `gid=60000` in NFS): + - `runAsUser` must be `1000` + - `runAsGroup` must be `1000` + - `supplementalGroups` must be `[60000]` + - `fsGroup` must be one of `1000, 60000` + - `supplementalGroupsPolicy` must be `IgnoreGroupsInImage` + +Please note that a security policy without `supplementalGroupsPolicy` would lead to unexpected groups for the first process in containers, as described in the [Motivation](#motivation) section. + +### Notes/Constraints/Caveats (Optional) + + + +The proposal affects CRI implementations (e.g., containerd, cri-o, gVisor). + +### Risks and Mitigations + + + +- How to track the status of this proposal in CRI implementations +- How to feature-gate this in CRI implementations + +## Design Details + +A new field named `SupplementalGroupsPolicy` will be introduced to `PodSecurityContext`: + +```go +type PodSecurityContext struct { + ... + // A list of groups applied to the first process run in each container. + // supplementalGroupsPolicy can control how groups will be calculated. + // Note that this field cannot be set when spec.os.name is windows. + // +optional + SupplementalGroups []int64 + // supplementalGroupsPolicy defines behavior of applying supplementalGroups + // to the first process run in each container. + // Valid values are "RespectGroupsInImage" and "IgnoreGroupsInImage". + // If not specified, "RespectGroupsInImage" is used. + // Note that this field cannot be set when spec.os.name is windows. + // +optional + SupplementalGroupsPolicy *PodSecurityGroupsPolicy +} + +type PodSecurityGroupsPolicy string +const ( + // SecurityGroupsPolicyRespectGroupsInImage indicates GIDs + // specified in supplementalGroups will be added to the container's + // primary GID, the fsGroup (if specified), and group memberships + // defined in the container image for the uid of the container process. 
+ SecurityGroupsPolicyRespectGroupsInImage PodSecurityGroupsPolicy = "RespectGroupsInImage" + + // SecurityGroupsPolicyIgnoreGroupsInImage indicates to ignore group + // memberships defined in the container image for the primary UID + // of the container process. + // Thus, the groups applied to the first process run in each container + // will be the container's primary GID, the fsGroup (if specified) + // and the supplementalGroups. + SecurityGroupsPolicyIgnoreGroupsInImage PodSecurityGroupsPolicy = "IgnoreGroupsInImage" +) +``` + +The CRI API (both `v1` and `v1alpha2`) also needs to be updated accordingly: + +```proto +enum SupplementalGroupsPolicy { + RespectGroupsInImage = 0; + IgnoreGroupsInImage = 1; +} + +message LinuxContainerSecurityContext { +... + repeated int64 supplemental_groups; + optional SupplementalGroupsPolicy supplemental_groups_policy; +} + +message LinuxSandboxSecurityContext { +... + repeated int64 supplemental_groups; + optional SupplementalGroupsPolicy supplemental_groups_policy; +} +``` + +### Test Plan + + + +[ ] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +- ``: `` - `` + +##### Integration tests + + + +- : + +##### e2e tests + + + +- : + +### Graduation Criteria + + + +### Upgrade / Downgrade Strategy + + + +### Version Skew Strategy + + + +- CRI must support this feature, especially when using `SupplementalGroupsPolicy=IgnoreGroupsInImage`. +- kubelet must be at least the version of control-plane components. + +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? 
+ + + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled). + +###### Does enabling the feature change any default behavior? + + + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + + + +###### What happens if we reenable the feature if it was previously rolled back? + +###### Are there any tests for feature enablement/disablement? + + + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? 
+ + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + + + +###### Will enabling / using this feature result in introducing new API types? + + + +###### Will enabling / using this feature result in any new calls to the cloud provider? + + + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + + + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? + +## Implementation History + + + +## Drawbacks + + + +## Alternatives + + + +As described in the [Motivation](#motivation) section, cluster administrators would need to deploy a custom low-level container runtime(e.g., [pfnet-research/strict-supplementalgroups-container-runtime](https://github.com/pfnet-research/strict-supplementalgroups-container-runtime)) that modifies OCI container runtime spec(`config.json`) produced by CRI implementations (e.g., containerd, cri-o). A custom `RuntimeClass` would be introduced for it. 
+ +## Infrastructure Needed (Optional) + + + +N/A \ No newline at end of file diff --git a/keps/sig-node/3619-supplemental-groups-policy/kep.yaml b/keps/sig-node/3619-supplemental-groups-policy/kep.yaml new file mode 100644 index 000000000000..7a45abf97e23 --- /dev/null +++ b/keps/sig-node/3619-supplemental-groups-policy/kep.yaml @@ -0,0 +1,42 @@ +title: "Fine-grained SupplementalGroups control" +kep-number: 3619 +authors: + - "@everpeace" +owning-sig: sig-node +participating-sigs: + - sig-node +status: provisional +creation-date: 2022-10-14 +reviewers: + - TBD +approvers: + - TBD + +see-also: [] +replaces: [] + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.27" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.27" + beta: "v1.xx" + stable: "v1.yy" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: SupplementalGroupsPolicy + components: + - kube-apiserver + - kubelet +disable-supported: true + +# The following PRR answers are required at beta release +metrics: []