Design a vineyard csi driver for workloads based on kubernetes volumes. #1533

Merged: 4 commits merged into v6d-io:main from design-vineyard-csi-driver on Sep 19, 2023

@dashanji dashanji commented Aug 22, 2023

What do these changes do?

  • Design the vineyard csi driver framework including the driver logic and deployment manifests.
  • Create the soft link of vineyard socket for every PersistentVolumeClaim.
  • Map every vineyard object to a PersistentVolumeClaim/PersistentVolume.
  • Introduce https://github.com/kubernetes-csi/csi-test to validate the idempotence of vineyard csi driver API.

Related issue number

Fixes parts of #1528

@netlify

netlify bot commented Aug 22, 2023

Deploy Preview for v6d ready!

Name Link
🔨 Latest commit 7d829cf
🔍 Latest deploy log https://app.netlify.com/sites/v6d/deploys/650050a2a0a1c400087eb28b
😎 Deploy Preview https://deploy-preview-1533--v6d.netlify.app/notes/references/crds

@dashanji dashanji force-pushed the design-vineyard-csi-driver branch 2 times, most recently from ea420b5 to 6958569 August 24, 2023 07:55

@sighingnow sighingnow left a comment

Is there a simple README, docs, or even internal docs that explain how to get started with the csi-driver?

I would like to give it a try.

k8s/Dockerfile Outdated
# Use distroless as minimal base image to package the manager binary
# Refer to https://github.com/GoogleContainerTools/distroless for more details
- FROM gcr.io/distroless/static:nonroot
+ FROM ${BASE_IMAGE}
Member
Do we need other base images?

Member Author
Yes, the image gcr.io/distroless/static:nonroot doesn't have the mount command, and the csi driver image needs it.

Member
Then you should use gcr.io/distroless/base:debug which has busybox installed and there would be a mount command available.

NEVER use your build environments (e.g., golang image) as the base to release your artifacts.

from sklearn.metrics import mean_squared_log_error, mean_absolute_error

def path_to_vineyard_name(path):
    return path.replace("/", "_")
Member
No need for such hacks, as names can contain '/' since #1550.

Member Author
Great patch!

@dashanji dashanji force-pushed the design-vineyard-csi-driver branch 3 times, most recently from 2e1c8bc to a85597a September 5, 2023 11:54
@dashanji dashanji changed the title from "[WIP] Design a vineyard csi driver for workloads based on kubernetes volumes." to "Design a vineyard csi driver for workloads based on kubernetes volumes." Sep 5, 2023

@sighingnow sighingnow left a comment

Some first-glance impressions:

  • The deployment of vineyardd and the csi-driver should be separated, i.e., we first have vineyardd, then, when we want a volume view of those objects, we ask the vineyard-operator to launch/deploy csi agents on the nodes (specifying the namespace and instance name of the Vineyardd CRD that the csi driver will manage).
  • the owner-reference of csi driver deployment should be the corresponding vineyard cluster
  • the owner-reference of those pv/pvc should be the csi-driver deployment

About the PUT/GET api, I suggest something like

  • vineyard.read('/path...managed...by...csi driver')
  • vineyard.write('/path..........', object)

where the corresponding vineyard IPC socket will be resolved from the abstract file inside the volume.
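
The path-based lookup suggested above could be sketched roughly as follows. This is only an illustration, not the driver's implementation: the helper name `resolve_ipc_socket` and the socket file name `vineyard.sock` are assumptions, and the real layout of the volume may differ. The helper walks up from a path inside the volume until it finds a socket file (or a soft link to one) and returns its resolved location.

```python
import os

# Assumed file name for the socket (or its soft link) inside the volume.
SOCKET_NAME = "vineyard.sock"


def resolve_ipc_socket(path, socket_name=SOCKET_NAME):
    """Walk up from `path` until a directory containing `socket_name` is found.

    Returns the resolved (symlink-free) path of the socket file.
    Raises FileNotFoundError if no ancestor directory contains it.
    """
    current = os.path.abspath(path)
    if not os.path.isdir(current):
        # `path` names an object inside the volume; start from its directory.
        current = os.path.dirname(current)
    while True:
        candidate = os.path.join(current, socket_name)
        if os.path.lexists(candidate):
            return os.path.realpath(candidate)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            raise FileNotFoundError(
                "no %s found above %s" % (socket_name, path)
            )
        current = parent
```

A `vineyard.read(path)` style call could then connect to whatever socket this lookup returns before fetching the object named by `path`.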


@sighingnow sighingnow left a comment

I have added some minor comments.

@@ -0,0 +1,271 @@
/*
Member
Move the pkg/ directory to outside pkg/csi/.

			return s[0], s[1], nil
		}
	}
	return "", "", fmt.Errorf("Invalid endpoint: %v", ep)
Member
Avoid fmt.Errorf; use github.com/pkg/errors.Errorf instead, so that errors carry a stack trace for future debugging.

k8s/cmd/main.go Outdated
@@ -27,6 +27,7 @@ import (
gosdklog "github.com/v6d-io/v6d/go/vineyard/pkg/common/log"
"github.com/v6d-io/v6d/k8s/cmd/commands/client"
"github.com/v6d-io/v6d/k8s/cmd/commands/create"
"github.com/v6d-io/v6d/k8s/cmd/commands/csidriver"
Member
I prefer csi over csidriver.

roleRef:
  kind: ClusterRole
  name: vineyard-csi-nodes
  apiGroup: rbac.authorization.k8s.io
Member
Can we have a global service account vineyard (or maybe vineyard-operator) that covers all the permissions we require, and use it anywhere a service account is needed? That would simplify management.

@dashanji dashanji commented Sep 6, 2023

Some first-glance impressions:

  • The deployment of vineyardd and the csi-driver should be separated, i.e., we first have vineyardd, then, when we want a volume view of those objects, we ask the vineyard-operator to launch/deploy csi agents on the nodes (specifying the namespace and instance name of the Vineyardd CRD that the csi driver will manage).
  • the owner-reference of csi driver deployment should be the corresponding vineyard cluster
  • the owner-reference of those pv/pvc should be the csi-driver deployment

About the PUT/GET api, I suggest something like

  • vineyard.read('/path...managed...by...csi driver')
  • vineyard.write('/path..........', object)

where the corresponding vineyard IPC socket will be resolved from the abstract file inside the volume.

Thanks for the advice! It really makes sense to me.

@dashanji dashanji force-pushed the design-vineyard-csi-driver branch 7 times, most recently from 769a4fc to 2894f2c September 15, 2023 09:24

### Write time

| data_multiplier | without vineyard | with vineyard |
Member
data_multiplier -> data scale, and use the data size (e.g., xxx Mi) in this table.

Member Author
Done.

import vineyard

def test_model():
    os.system('echo 3 > /proc/sys/vm/drop_caches')
Member
If you want to drop the page cache, you first need to run a sync command, then echo to ....
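
A minimal sketch of the order suggested here, assuming a Linux host and root privileges: flush dirty pages with a sync first, then write to drop_caches, since the kernel only drops clean pages. The helper name `drop_page_cache` and its parameters are illustrative and are not part of the benchmark script.

```python
import os


def drop_page_cache(drop_caches_path="/proc/sys/vm/drop_caches", level="3"):
    """Flush dirty pages to disk, then ask the kernel to drop its caches.

    Writing "3" requests both the page cache and dentries/inodes to be
    dropped. Writing to the real /proc file requires root.
    """
    # Dirty pages cannot be dropped, so write them back first.
    os.sync()
    with open(drop_caches_path, "w") as f:
        f.write(level + "\n")
```

The path is a parameter only so that the function can be exercised against an ordinary file without root access.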


def test_model():
    os.system('echo 3 > /proc/sys/vm/drop_caches')
    enable_vineyard = os.environ.get('ENABLE_VINEYARD', False)
Member
ENABLE_VINEYARD -> WITH_VINEYARD.

Member Author
Done.
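
A side note on the environment lookup in this hunk: `os.environ.get(name, False)` returns a string whenever the variable is set, so a value such as "false" or "0" is still truthy. A hedged sketch of stricter parsing follows; the helper `env_flag` and its set of accepted values are assumptions for illustration and are not what the benchmark script does.

```python
import os

# Strings treated as "enabled"; this set is an assumption for the sketch.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=None):
    """Return True only when the variable holds a recognised truthy string."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        # Variable not set at all: fall back to the caller's default.
        return default
    return raw.strip().lower() in _TRUE_VALUES
```

With this, `env_flag('WITH_VINEYARD')` stays False when the variable is unset or set to "false".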

k8s/Makefile Outdated
-t $(VINEYARD_CSI_IMAGE); \
else \
docker build -f k8s/Dockerfile . \
- --build-arg BASE_IMAGE=golang:1.19-buster \
+ --build-arg BASE_IMAGE=gcr.io/distroless/base:debug \
Member
You can directly replace the FROM statement in the dockerfile to avoid such a layer of complexity.

@@ -38,6 +38,9 @@ spec:
type: string
type: object
type: array
enableDebugLog:
  default: false
  type: boolean
Member
Can you let user set the log-level (as we previously discussed)?

Member Author
Okay, it looks like users can only set verbose or non-verbose. Do you think EnableVerbose is more meaningful?


var csiExample = util.Examples(`
# start the csidriver with the specific endpoint and node id
vineyardctl csidriver --endpoint=unix:///csi/csi.sock --nodeid=csinode1`)
Member
What does the nodeid mean?


USER root

CMD ["go", "test", "-run", "./..."]
Member
Why do we need a separate Dockerfile for go test?


Based on the above results, we can find that the read time of vineyard is nearly a constant, which is not affected by the data scale. The reason is that the data is stored in the shared memory of vineyard cluster, so it's actually a pointer copy operation.

As a result, we can find that with vineyard, the argo workflow duration of the pipeline is reduced by 10%~20% and the actual execution time of the pipeline is reduced by about 30%.
Member
Put the README in the tutorials directory, the same as you did for the other examples under the k8s/ directory:

https://github.com/v6d-io/v6d/tree/main/docs/tutorials/kubernetes

* Add a new CRD CSIDriver to manage the deployment and status of the vineyard csi driver.
* Add the CSIDriver benchmark test and relevant doc.
* Introduce the kubernetes csi test to enhance the robustness of the vineyard CSI Driver.
* Generate the doc for CSIDriver CRD.

Signed-off-by: Ye Cao <[email protected]>
* Update the golang-1.19 to gcr.io/distroless/base:debug as csidriver's base image.
* Improve the csidriver example readme.

Signed-off-by: Ye Cao <[email protected]>
@dashanji dashanji force-pushed the design-vineyard-csi-driver branch from 05991dd to 61f124d September 18, 2023 08:39
* Add the deploy/delete csidriver API in the vineyardctl.
* Update the CRD API doc and vineyardctl doc.
* Update the helm chart.
* Update the base image from gcr.io/distroless/static:nonroot to gcr.io/distroless/base:debug.

Signed-off-by: Ye Cao <[email protected]>
@dashanji dashanji force-pushed the design-vineyard-csi-driver branch from 61f124d to e6fd19c September 18, 2023 08:51
@dashanji
Member Author

The new commit mainly contains 3 parts:

  • Improve the vineyard csi driver doc and move it to the kubernetes tutorials.
  • Add the deploy/delete csidriver API in the vineyardctl.
  • Update the helm chart.

@sighingnow Could you please take another look? Thanks.

@@ -9,6 +9,7 @@ Vineyard on Kubernetes
./kubernetes/using-vineyard-operator.rst
./kubernetes/ml-pipeline-mars-pytorch.rst
./kubernetes/data-sharing-with-vineyard-on-kubernetes.rst
./kubernetes/speed-up-pipelines-based-on-volumes-with-vineyard-csi-driver.rst
Member
speed-up-pipelines-based-on-volumes-with-vineyard-csi-driver

Efficient data sharing in Kubeflow with Vineyard CSI driver.

Member Author
Done.

k8s/Makefile Outdated
@@ -223,9 +224,11 @@ vendor:
docker-build:
cd .. && \
if docker build --help | grep -q load; then \
- docker build --load -f k8s/Dockerfile . -t $(IMG); \
+ docker build --load -f k8s/Dockerfile . \
+ -t $(IMG); \
Member
Keep this part untouched. No need to add a newline here.

Member Author
Done.

@dashanji dashanji force-pushed the design-vineyard-csi-driver branch from 904092a to 044c84a September 18, 2023 11:23
@sighingnow sighingnow merged commit e5ca1c7 into v6d-io:main Sep 19, 2023
19 of 23 checks passed
@dashanji dashanji deleted the design-vineyard-csi-driver branch September 19, 2023 05:47