update release notes and docs
songjiaxun committed Apr 12, 2024
1 parent 17c849c commit f116f06
Showing 5 changed files with 113 additions and 38 deletions.
4 changes: 2 additions & 2 deletions docs/installation.md
@@ -54,10 +54,10 @@ limitations under the License.
```

## Install
- Run the following command to install the latest driver with version `v0.1.13`. The driver will be installed under a new namespace `gcs-fuse-csi-driver`. The installation may take a few minutes.
- Run the following command to install the latest driver with version `v1.2.0`. The driver will be installed under a new namespace `gcs-fuse-csi-driver`. The installation may take a few minutes.
```bash
# Replace <cluster-project-id> with your cluster project ID.
make install STAGINGVERSION=v0.1.13 PROJECT=<cluster-project-id>
make install STAGINGVERSION=v1.2.0 PROJECT=<cluster-project-id>
```

- If you would like to build your own images, follow the [Cloud Storage FUSE CSI Driver Development Guide](development.md) to build and push the images. Run the following command to install the driver.
20 changes: 8 additions & 12 deletions docs/known-issues.md
@@ -31,37 +31,33 @@ After the CSI driver creates the mount point, it will inform kubelet to proceed

In the sidecar container, which is an unprivileged container, a process connects to the UDS and calls [recvmsg(2)](https://man7.org/linux/man-pages/man2/recvmsg.2.html) to receive the file descriptor. The process then starts Cloud Storage FUSE, passing it the file descriptor, so that Cloud Storage FUSE can serve the FUSE mount point. Instead of passing the actual mount point path, we pass the file descriptor to Cloud Storage FUSE, as it supports the [magic /dev/fd/N syntax](https://github.com/GoogleCloudPlatform/gcsfuse/blob/8ab11cd07016a247f64023697383c6e88bc022b0/vendor/github.com/jacobsa/fuse/mount_linux.go#L128-L134). Until Cloud Storage FUSE takes over the file descriptor, any operations against the mount point will hang.
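
The descriptor handoff described above can be illustrated with a minimal Python sketch. It is not the driver's implementation (the driver is written in Go); it only shows the general mechanism of passing an open file descriptor across a Unix domain socket and referring to it as `/dev/fd/N`. The function names `send_fd`, `recv_fd`, and `fd_path` are hypothetical, and `socket.send_fds`/`socket.recv_fds` assume Python 3.9 or later on a Unix-like system.

```python
import os
import socket


def send_fd(sock: socket.socket, fd: int) -> None:
    """Send one open file descriptor over a connected Unix domain socket."""
    # send_fds wraps sendmsg(2) and attaches the descriptor as SCM_RIGHTS
    # ancillary data. A small data payload accompanies the descriptor.
    socket.send_fds(sock, [b"fd"], [fd])


def recv_fd(sock: socket.socket) -> int:
    """Receive one file descriptor; the receiver gets its own fd number."""
    # recv_fds wraps recvmsg(2) and returns (data, fds, msg_flags, address).
    _data, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    if len(fds) != 1:
        raise RuntimeError("expected exactly one file descriptor")
    return fds[0]


def fd_path(fd: int) -> str:
    """Build the /dev/fd/N path passed in place of a mount point path."""
    return f"/dev/fd/{fd}"


if __name__ == "__main__":
    # Stand-ins for the two sides: a socket pair instead of a real UDS path,
    # and a pipe instead of the FUSE device descriptor.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    send_fd(sender, read_end)
    received = recv_fd(receiver)
    os.write(write_end, b"ping")
    print(fd_path(received), os.read(received, 4))
```

The received descriptor refers to the same open file as the one that was sent, which is why the receiving process can hand it to another program by path without knowing the real mount point.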

Since the CSI driver sets `requiresRepublish: true`, it periodically checks whether the GCSFuse volume is still needed by the containers. When the CSI driver detects all the main workload containers have terminated, it creates an exit file in a Pod emptyDir volume to notify the sidecar container to terminate.
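
The exit-file notification can be sketched as a simple polling loop on the sidecar side. This is an illustrative Python sketch under assumptions, not the driver's code: the function name `wait_for_exit_file`, the polling interval, and the timeout parameter are all hypothetical, and the actual file name and location inside the emptyDir volume are not specified here.

```python
import time
from pathlib import Path
from typing import Optional


def wait_for_exit_file(
    exit_file: Path,
    poll_seconds: float = 1.0,
    timeout_seconds: Optional[float] = None,
) -> bool:
    """Poll until exit_file exists.

    Returns True once the file appears, or False if timeout_seconds
    elapses first. With timeout_seconds=None the call waits indefinitely.
    """
    deadline = None
    if timeout_seconds is not None:
        deadline = time.monotonic() + timeout_seconds
    while True:
        if exit_file.exists():
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

A sidecar process built this way would shut down its FUSE process after the function returns True. Using a file in a shared volume means that the two containers need no network connection or shared process namespace to signal each other.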

### Implications of the sidecar container design

Until the Cloud Storage FUSE takes over the file descriptor, the mount point is not accessible. Any operations against the mount point will hang, including [stat(2)](https://man7.org/linux/man-pages/man2/lstat.2.html) that is used to check if the mount point exists.

The sidecar container, or more precisely, the Cloud Storage FUSE process that serves the mount point needs to remain running for the full duration of the Pod's lifecycle. If the Cloud Storage FUSE process is killed, the workload application will throw IO error `Transport endpoint is not connected`.
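
On Linux, the message `Transport endpoint is not connected` is the text for the `ENOTCONN` error code. A workload could detect this condition along the following lines. This is a hedged Python sketch; the helper names `is_not_connected` and `mount_is_disconnected` are hypothetical and are not part of the CSI driver.

```python
import errno
import os


def is_not_connected(err: OSError) -> bool:
    """Return True if err carries ENOTCONN, the code behind the
    'Transport endpoint is not connected' message on Linux."""
    return err.errno == errno.ENOTCONN


def mount_is_disconnected(path: str) -> bool:
    """Stat path and report whether the failure, if any, is ENOTCONN.

    Other errors, such as a missing path, are reported as False because
    they do not indicate a dead FUSE process.
    """
    try:
        os.stat(path)
    except OSError as err:
        return is_not_connected(err)
    return False
```

This check only applies after the FUSE process has died. It does not help before Cloud Storage FUSE takes over the file descriptor, because in that state the `stat` call hangs instead of failing.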

The sidecar container auto-termination depends on Kubernetes API correctly reporting the Pod status. However, due to a [Kubernetes issue](https://github.com/kubernetes/kubernetes/issues/106896), container status is not updated after termination caused by Pod deletion. As a result, the sidecar container may not automatically terminate in some scenarios.

### Issues

- [The CSI driver does not support volumes for initContainers](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/38)
- [The sidecar container is at the spec.containers[0] position which may cause issues in some workloads](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/20)
- [subPath does not work when Anthos Service Mesh is enabled](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/47)
- ["Error: context deadline exceeded" when Anthos Service Mesh is enabled](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/46)
- [The sidecar container does not work well with istio-proxy sidecar container](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/53)
- [The sidecar container does not respect terminationGracePeriodSeconds when the Pod restartPolicy is OnFailure or Always](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/168)

### Solutions

Unfortunately, there is no good short-term solution or workaround for the above issues due to the restrictions of the sidecar container mode design.

The [sidecar containers KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/753-sidecar-containers) is implemented in this PR [Add SidecarContainers feature](https://github.com/kubernetes/kubernetes/pull/116429).
The GCS FUSE CSI Driver now utilizes the [Kubernetes native sidecar container feature](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/753-sidecar-containers), available in GKE versions 1.29.3-gke.1093000 or later.

> The new feature gate "SidecarContainers" is now available. This feature introduces sidecar containers, a new type of init container that starts before other containers but remains running for the full duration of the pod's lifecycle and will not block pod termination.
The Kubernetes native sidecar container feature introduces sidecar containers, a new type of init container that starts before other containers but remains running for the full duration of the pod's lifecycle and will not block pod termination.

This new feature is a good long-term solution. Instead of injecting the sidecar container as a regular container, we will leverage the new SidecarContainers feature to inject the container as an init container, so that other non-sidecar init container can also use the CSI driver.

We are currently testing the SidecarContainers feature, and will adopt the feature when it is available on GKE.
Instead of injecting the sidecar container as a regular container, the sidecar container is now injected as an init container, so that other non-sidecar init containers can also use the CSI driver. Moreover, the sidecar container lifecycle, such as auto-termination, is managed by Kubernetes.
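
In the Kubernetes native sidecar feature, a sidecar is an entry in `spec.initContainers` whose `restartPolicy` is `Always`. The following Python sketch shows how such containers can be told apart from ordinary init containers in a Pod spec. The Pod spec is a hand-written illustration, the container names and images are assumptions, and the function `native_sidecars` is hypothetical.

```python
from typing import Any, Dict, List

# Illustrative Pod spec fragment. Names and images are assumptions.
POD_SPEC: Dict[str, Any] = {
    "initContainers": [
        # Native sidecar: an init container that keeps running.
        {"name": "gke-gcsfuse-sidecar", "image": "example/sidecar", "restartPolicy": "Always"},
        # Ordinary init container: runs to completion before the workload starts.
        {"name": "prefetch-data", "image": "example/prefetch"},
    ],
    "containers": [
        {"name": "workload", "image": "example/app"},
    ],
}


def native_sidecars(pod_spec: Dict[str, Any]) -> List[str]:
    """Return the names of init containers declared as native sidecars."""
    return [
        container["name"]
        for container in pod_spec.get("initContainers", [])
        if container.get("restartPolicy") == "Always"
    ]


if __name__ == "__main__":
    print(native_sidecars(POD_SPEC))
```

Because init containers start in order, an ordinary init container listed after the sidecar, such as `prefetch-data` in this illustration, starts once the sidecar is already running.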

## Issues in Autopilot clusters

- [Resource limitation for the sidecar container on Autopilot using GPU: 2 CPU and 14GB Memory](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/35)
- [Cannot upload files larger than 10Gi in Autopilot clusters](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/21)

## Other issues

- [Multiple PVs referring to the same bucket does not work](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/48)
21 changes: 16 additions & 5 deletions docs/releases.md
@@ -34,13 +34,24 @@ limitations under the License.
| [v0.1.12](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/releases/tag/v0.1.12) | Released | 2024-01-25 | [v1.4.0](https://github.com/GoogleCloudPlatform/gcsfuse/releases/tag/v1.4.0) | [7898e40bf57f](https://gcr.io/gke-release/gcs-fuse-csi-driver-sidecar-mounter@sha256:7898e40bf57f159dc828511f4217cb42c08fa4df0c9ad732a0b0747b66e415c6) | None | 1.25.16-gke.1268000 | 1.26.12-gke.1111000 | 1.27.9-gke.1092000 | None | 1.29.0-gke.1381000 |
| [v0.1.13](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/releases/tag/v0.1.13) | Released | 2024-02-08 | [v1.4.1](https://github.com/GoogleCloudPlatform/gcsfuse/releases/tag/v1.4.1) | [972699a4bf89](https://gcr.io/gke-release/gcs-fuse-csi-driver-sidecar-mounter@sha256:972699a4bf8973f7614f09908412a1fca24ea939eac2d3fcca599109f71fc162) | None | 1.25.16-gke.1360000 | 1.26.13-gke.1052000 | 1.27.10-gke.1055000 | 1.28.6-gke.1095000 | 1.29.1-gke.1425000 |
| [v0.1.14](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/releases/tag/v0.1.14) | Released | 2024-02-20 | [v1.4.1](https://github.com/GoogleCloudPlatform/gcsfuse/releases/tag/v1.4.1) | [c83609ecf50d](https://gcr.io/gke-release/gcs-fuse-csi-driver-sidecar-mounter@sha256:c83609ecf50d05a141167b8c6cf4dfe14ff07f01cd96a9790921db6748d40902) | None | 1.25.16-gke.1537000 | 1.26.14-gke.1006000 | 1.27.11-gke.1018000 | 1.28.6-gke.1456000 | 1.29.2-gke.1060000 |
| [v1.2.0](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/releases/tag/v1.2.0) | Released | 2024-04-04 | [v2.0.0](https://github.com/GoogleCloudPlatform/gcsfuse/releases/tag/v2.0.0) | [31880114306b](https://gcr.io/gke-release/gcs-fuse-csi-driver-sidecar-mounter@sha256:31880114306b1fb5d9e365ae7d4771815ea04eb56f0464a514a810df9470f88f) | None | TBD | TBD | TBD | TBD | 1.29.3-gke.1093000 |

> Note: The above GKE versions may no longer be valid. Please follow the [GKE documentation](https://cloud.google.com/kubernetes-engine/docs/concepts/release-channels#what_versions_are_available_in_a_channel) to check what versions are available in a channel.
The new CSI driver version will first be available in the GKE Rapid channel on its release date. For the Regular and Stable channels, plan for a 4-week and 12-week wait respectively.
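
The waiting periods above translate into rough planning dates. The Python sketch below applies the 4-week and 12-week figures to a release date. The function name `estimated_channel_dates` is hypothetical, and the results are planning estimates rather than confirmed rollout dates.

```python
from datetime import date, timedelta
from typing import Dict


def estimated_channel_dates(release_date: date) -> Dict[str, date]:
    """Estimate when a driver release reaches each GKE release channel.

    Rapid is taken as the release date; Regular and Stable add the
    4-week and 12-week waits stated in the release notes.
    """
    return {
        "rapid": release_date,
        "regular": release_date + timedelta(weeks=4),
        "stable": release_date + timedelta(weeks=12),
    }


if __name__ == "__main__":
    # v1.2.0 is listed with a release date of 2024-04-04.
    for channel, day in estimated_channel_dates(date(2024, 4, 4)).items():
        print(channel, day.isoformat())
```

For v1.2.0, this gives an estimate of 2024-05-02 for the Regular channel and 2024-06-27 for the Stable channel.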

## Releases

### v1.2.0

- Update gcsfuse to v2.0.0.
- Update golang version to 1.22.2.
- Add GCSFuse file cache features.
- Add volume attributes support.
- Adopt Kubernetes native sidecar container features in GKE 1.29 to support init container volume mounting. Fix the [issue](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/168) where the sidecar container does not respect terminationGracePeriodSeconds when the Pod restartPolicy is OnFailure or Always.
- Add a rate limiter to the CSI node server to avoid GCP API throttling errors.
- Refactor code to increase stability and readability.

### v0.1.14

- Fix sidecar container auto-termination logic for Pods with restart policy OnFailure.
@@ -84,15 +95,15 @@ This release is abandoned.
- Updated go modules.
- Updated gcsfuse version to v1.2.1-gke.0.
- Updated CSI driver golang builder version to go1.21.4.
- Allow users to override sidecar grace-period to fix https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/91.
- Add CSI fsgroup delegation support to fix https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/16.
- Allow users to override sidecar grace-period to fix <https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/91>.
- Add CSI fsgroup delegation support to fix <https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/16>.

### v0.1.6

- Updated go modules.
- Updated sidecar container versions.
- Updated CSI driver golang builder version to go1.21.2.
- Make the sidecar container follow the [Restricted Pod Security Standard](https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted), setting securityContext.capabilities.drop=["ALL"] to fix the issue https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/52
- Make the sidecar container follow the [Restricted Pod Security Standard](https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted), setting securityContext.capabilities.drop=["ALL"] to fix the issue <https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/52>
- Fixed the behavior when users pass "0" to the pod annotation to configure the sidecar container resources, allowing the sidecar container to consume unlimited resources on Standard clusters.
- Fixed sidecar container validation logic in webhook.

@@ -131,7 +142,7 @@ This release is abandoned.
- Fixed copyright information.
- Updated documentation.
- Added ARM node support.
- Fixed issue https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/23.
- Fixed issue <https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/23>.
- Fixed other issues.

### v0.1.2
@@ -151,4 +162,4 @@ This release is abandoned.

### v0.1.0

- Initial alpha release of the Google Cloud Storage FUSE CSI Driver.
- Initial alpha release of the Google Cloud Storage FUSE CSI Driver.
2 changes: 1 addition & 1 deletion docs/terraform.md
@@ -21,7 +21,7 @@ If you are using Terraform to create GKE clusters, use `gcs_fuse_csi_driver_conf

The following example is a `.tf` file excerpt showing how to enable the CSI driver, GKE Workload Identity, and GKE Metadata Server:

```
```terraform
resource "google_container_cluster" "primary" {
# Enable GKE Workload Identity.