Change the API of kubeflow pipeline from vineyard.csi.read/writer to `client.get/put` (#1614)

Signed-off-by: Ye Cao <[email protected]>
dashanji authored Nov 27, 2023
1 parent aadb6dd commit 7a9b3fc
Showing 22 changed files with 855 additions and 32 deletions.
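
The commit title describes moving the Kubeflow example from CSI-based reads and writes to the client's `put`/`get` calls. The sketch below illustrates that pattern only and is not the code from this commit: `InMemoryClient`, `producer_step` and `consumer_step` are hypothetical names, and the in-memory class merely stands in for a real client so the example runs without a vineyard cluster. With a real deployment, the client would presumably come from `vineyard.connect()`, reading the socket path from `VINEYARD_IPC_SOCKET` as the pipeline in this commit sets it.

```python
class InMemoryClient:
    """Hypothetical stand-in for a vineyard client (illustration only)."""

    def __init__(self):
        self._objects = {}
        self._next_id = 0

    def put(self, value):
        # Store the value and hand back an opaque object ID.
        object_id = f"o{self._next_id:016x}"
        self._next_id += 1
        self._objects[object_id] = value
        return object_id

    def get(self, object_id):
        # Look the value up again by its ID.
        return self._objects[object_id]


def producer_step(client, rows):
    """Publish a step's output to the shared store and return its ID."""
    return client.put(rows)


def consumer_step(client, object_id):
    """Fetch the upstream step's output by ID."""
    return client.get(object_id)


if __name__ == "__main__":
    # Assumption: on a real cluster this would be `vineyard.connect()`.
    client = InMemoryClient()
    oid = producer_step(client, [1.0, 2.0, 3.0])
    print(consumer_step(client, oid))
```

The point of the pattern is that a downstream step only needs the object ID (or another agreed handle), not a file path on a mounted volume.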
22 changes: 22 additions & 0 deletions docs/notes/cloud-native/deploy-kubernetes.rst
@@ -5,6 +5,27 @@ Deploy on Kubernetes

Vineyard is managed by the :ref:`vineyard-operator` on Kubernetes.

Quick start
-----------

To install a vineyard cluster quickly, use the following commands.

First, install `vineyardctl`_:

.. code:: bash

    pip3 install vineyard

Then use vineyardctl to install the vineyard cluster:

.. code:: bash

    python3 -m vineyard.ctl install vineyard-cluster --create-namespace

Alternatively, follow the guide below to install the vineyard cluster step
by step.

Install vineyard-operator
-------------------------

@@ -196,5 +217,6 @@ automates much of the boilerplate configuration required when deploying workflow
^^^^^^^^^^^^
:code:`vineyardctl` is the command-line tool for working with the Vineyard Operator.

.. _vineyardctl: https://github.com/v6d-io/v6d/blob/main/k8s/cmd/README.md
.. _kind: https://kind.sigs.k8s.io
.. _CRD: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions
14 changes: 0 additions & 14 deletions docs/notes/developers/build-from-source.rst
@@ -139,20 +139,6 @@ After building the vineyard library successfully, you can package an install whe
python3 setup.py bdist_wheel
Install vineyardctl
-------------------

Vineyardctl is available on the Github release page, you can download the binary as follows:

.. code:: shell
export LATEST_TAG=$(curl -s "https://api.github.com/repos/v6d-io/v6d/tags" | jq -r '.[0].name')
export OS=$(uname -s | tr '[:upper:]' '[:lower:]')
export ARCH=${$(uname -m)/x86_64/amd64}
curl -Lo vineyardctl https://github.com/v6d-io/v6d/releases/download/$LATEST_TAG/vineyardctl-$LATEST_TAG-$OS-$ARCH
chmod +x vineyardctl
sudo mv vineyardctl /usr/local/bin/
Building the documentation
--------------------------
9 changes: 5 additions & 4 deletions docs/tutorials/kubernetes/using-vineyard-operator.rst
@@ -252,7 +252,7 @@ Check the status of all relevant resources managed by the ``vineyardd-sample`` c
.. code:: bash
$ kubectl get all -l app.kubernetes.io/instance=vineyardd -n vineyard-system
$ kubectl get all -l app.kubernetes.io/instance=vineyard-system-vineyardd-sample -n vineyard-system
.. admonition:: Expected output
:class: admonition-details
@@ -307,11 +307,11 @@ First, let's deploy the Python client on two Vineyard nodes as follows.
containers:
- name: vineyard-python
imagePullPolicy: IfNotPresent
image: vineyardcloudnative/vineyard-python:v0.11.4
image: python:3.10
command:
- /bin/bash
- -c
- sleep infinity
- pip3 install vineyard && sleep infinity
volumeMounts:
- mountPath: /var/run
name: vineyard-sock
@@ -341,7 +341,8 @@ Wait for the vineyard python client pod ready.
.. code:: bash
NAME READY STATUS RESTARTS AGE
vineyard-python-client-6fd8c47c98-7btkv 1/1 Running 0 93s
vineyard-python-client-6fd84bc897-27glp 1/1 Running 0 93s
vineyard-python-client-6fd84bc897-tlb22 1/1 Running 0 93s
Use the kubectl exec command to enter the first vineyard python client pod.
2 changes: 1 addition & 1 deletion k8s/examples/vineyard-csidriver/Makefile
@@ -1,4 +1,4 @@
REGISTRY := "ghcr.io/v6d-io/v6d/kubeflow-example"
REGISTRY := "ghcr.io/v6d-io/v6d/csidriver-example"
docker-build:
docker build prepare-data/ -f Dockerfile \
--build-arg APP=prepare-data.py \
@@ -4,23 +4,23 @@
@dsl.container_component
def PreProcess(data_multiplier: int):
return dsl.ContainerSpec(
image = 'ghcr.io/v6d-io/v6d/kubeflow-example/preprocess-data',
image = 'ghcr.io/v6d-io/v6d/csidriver-example/preprocess-data',
command = ['python3', 'preprocess.py'],
args = [f'--data_multiplier={data_multiplier}', '--with_vineyard=True'],
)

@dsl.container_component
def Train():
return dsl.ContainerSpec(
image = 'ghcr.io/v6d-io/v6d/kubeflow-example/train-data',
image = 'ghcr.io/v6d-io/v6d/csidriver-example/train-data',
command = ['python3', 'train.py'],
args = ['--with_vineyard=True'],
)

@dsl.container_component
def Test():
return dsl.ContainerSpec(
image = 'ghcr.io/v6d-io/v6d/kubeflow-example/test-data',
image = 'ghcr.io/v6d-io/v6d/csidriver-example/test-data',
command = ['python3', 'test.py'],
args = ['--with_vineyard=True'],
)
@@ -99,23 +99,23 @@ deploymentSpec:
command:
- python3
- preprocess.py
image: ghcr.io/v6d-io/v6d/kubeflow-example/preprocess-data
image: ghcr.io/v6d-io/v6d/csidriver-example/preprocess-data
exec-test:
container:
args:
- --with_vineyard=True
command:
- python3
- test.py
image: ghcr.io/v6d-io/v6d/kubeflow-example/test-data
image: ghcr.io/v6d-io/v6d/csidriver-example/test-data
exec-train:
container:
args:
- --with_vineyard=True
command:
- python3
- train.py
image: ghcr.io/v6d-io/v6d/kubeflow-example/train-data
image: ghcr.io/v6d-io/v6d/csidriver-example/train-data
pipelineInfo:
description: An example pipeline that trains and logs a regression model.
name: machine-learning-pipeline-with-vineyard
6 changes: 3 additions & 3 deletions k8s/examples/vineyard-csidriver/pipeline-kfp-v2.py
@@ -4,22 +4,22 @@
@dsl.container_component
def PreProcess(data_multiplier: int):
return dsl.ContainerSpec(
image = 'ghcr.io/v6d-io/v6d/kubeflow-example/preprocess-data',
image = 'ghcr.io/v6d-io/v6d/csidriver-example/preprocess-data',
command = ['python3', 'preprocess.py'],
args=[f'--data_multiplier={data_multiplier}'],
)

@dsl.container_component
def Train():
return dsl.ContainerSpec(
image='ghcr.io/v6d-io/v6d/kubeflow-example/train-data',
image='ghcr.io/v6d-io/v6d/csidriver-example/train-data',
command = ['python3', 'train.py'],
)

@dsl.container_component
def Test():
return dsl.ContainerSpec(
image='ghcr.io/v6d-io/v6d/kubeflow-example/test-data',
image='ghcr.io/v6d-io/v6d/csidriver-example/test-data',
command = ['python3', 'test.py'],
)

6 changes: 3 additions & 3 deletions k8s/examples/vineyard-csidriver/pipeline-kfp-v2.yaml
@@ -23,19 +23,19 @@ deploymentSpec:
command:
- python3
- preprocess.py
image: ghcr.io/v6d-io/v6d/kubeflow-example/preprocess-data
image: ghcr.io/v6d-io/v6d/csidriver-example/preprocess-data
exec-test:
container:
command:
- python3
- test.py
image: ghcr.io/v6d-io/v6d/kubeflow-example/test-data
image: ghcr.io/v6d-io/v6d/csidriver-example/test-data
exec-train:
container:
command:
- python3
- train.py
image: ghcr.io/v6d-io/v6d/kubeflow-example/train-data
image: ghcr.io/v6d-io/v6d/csidriver-example/train-data
pipelineInfo:
description: An example pipeline that trains and logs a regression model.
name: machine-learning-pipeline
2 changes: 1 addition & 1 deletion k8s/examples/vineyard-csidriver/prepare-data.yaml
@@ -15,7 +15,7 @@ spec:
spec:
containers:
- name: prepare-data
image: ghcr.io/v6d-io/v6d/kubeflow-example/prepare-data
image: ghcr.io/v6d-io/v6d/csidriver-example/prepare-data
imagePullPolicy: Always
command: ["python3", "/prepare-data.py"]
volumeMounts:
10 changes: 10 additions & 0 deletions k8s/examples/vineyard-kubeflow/Dockerfile
@@ -0,0 +1,10 @@
FROM python:3.10

RUN pip3 install --no-cache-dir pandas requests scikit-learn numpy vineyard

WORKDIR /

ARG APP
ENV APP ${APP}

COPY ${APP} /${APP}
23 changes: 23 additions & 0 deletions k8s/examples/vineyard-kubeflow/Makefile
@@ -0,0 +1,23 @@
REGISTRY := "ghcr.io/v6d-io/v6d/kubeflow-example"
docker-build:
	docker build prepare-data/ -f Dockerfile \
		--build-arg APP=prepare-data.py \
		-t $(REGISTRY)/prepare-data

	docker build preprocess/ -f Dockerfile \
		--build-arg APP=preprocess.py \
		-t $(REGISTRY)/preprocess-data

	docker build train/ -f Dockerfile \
		--build-arg APP=train.py \
		-t $(REGISTRY)/train-data

	docker build test/ -f Dockerfile \
		--build-arg APP=test.py \
		-t $(REGISTRY)/test-data

push-images:
	docker push $(REGISTRY)/prepare-data
	docker push $(REGISTRY)/preprocess-data
	docker push $(REGISTRY)/train-data
	docker push $(REGISTRY)/test-data
88 changes: 88 additions & 0 deletions k8s/examples/vineyard-kubeflow/pipeline-with-vineyard.py
@@ -0,0 +1,88 @@
from kfp import dsl
from kubernetes.client.models import V1EnvVar
import kubernetes as k8s

def PreProcess(data_multiplier: int, registry: str):
    vineyard_volume = dsl.PipelineVolume(
        volume=k8s.client.V1Volume(
            name="vineyard-socket",
            host_path=k8s.client.V1HostPathVolumeSource(
                path="/var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample"
            )
        )
    )

    op = dsl.ContainerOp(
        name='Preprocess Data',
        image = f'{registry}/preprocess-data',
        container_kwargs={
            'image_pull_policy': "Always",
            'env': [V1EnvVar('VINEYARD_IPC_SOCKET', '/var/run/vineyard.sock')]
        },
        pvolumes={
            "/data": dsl.PipelineVolume(pvc="benchmark-data"),
            "/var/run": vineyard_volume,
        },
        command = ['python3', 'preprocess.py'],
        arguments=[f'--data_multiplier={data_multiplier}', '--with_vineyard=True'],
    )
    op.add_pod_label('scheduling.k8s.v6d.io/vineyardd-namespace', 'vineyard-system')
    op.add_pod_label('scheduling.k8s.v6d.io/vineyardd', 'vineyardd-sample')
    op.add_pod_label('scheduling.k8s.v6d.io/job', 'preprocess-data')
    op.add_pod_annotation('scheduling.k8s.v6d.io/required', '')
    return op

def Train(comp1, registry: str):
    op = dsl.ContainerOp(
        name='Train Data',
        image=f'{registry}/train-data',
        container_kwargs={
            'image_pull_policy': "Always",
            'env': [V1EnvVar('VINEYARD_IPC_SOCKET', '/var/run/vineyard.sock')]
        },
        pvolumes={
            "/data": comp1.pvolumes['/data'],
            "/var/run": comp1.pvolumes['/var/run'],
        },
        command = ['python3', 'train.py'],
        arguments=['--with_vineyard=True'],
    )
    op.add_pod_label('scheduling.k8s.v6d.io/vineyardd-namespace', 'vineyard-system')
    op.add_pod_label('scheduling.k8s.v6d.io/vineyardd', 'vineyardd-sample')
    op.add_pod_label('scheduling.k8s.v6d.io/job', 'train-data')
    op.add_pod_annotation('scheduling.k8s.v6d.io/required', 'preprocess-data')
    return op

def Test(comp2, registry: str):
    op = dsl.ContainerOp(
        name='Test Data',
        image=f'{registry}/test-data',
        container_kwargs={
            'image_pull_policy': "Always",
            'env': [V1EnvVar('VINEYARD_IPC_SOCKET', '/var/run/vineyard.sock')]
        },
        pvolumes={
            "/data": comp2.pvolumes['/data'],
            "/var/run": comp2.pvolumes['/var/run']
        },
        command = ['python3', 'test.py'],
        arguments=['--with_vineyard=True'],
    )
    op.add_pod_label('scheduling.k8s.v6d.io/vineyardd-namespace', 'vineyard-system')
    op.add_pod_label('scheduling.k8s.v6d.io/vineyardd', 'vineyardd-sample')
    op.add_pod_label('scheduling.k8s.v6d.io/job', 'test-data')
    op.add_pod_annotation('scheduling.k8s.v6d.io/required', 'train-data')
    return op

@dsl.pipeline(
    name='Machine Learning Pipeline',
    description='An example pipeline that trains and logs a regression model.'
)
def pipeline(data_multiplier: int, registry: str):
    comp1 = PreProcess(data_multiplier=data_multiplier, registry=registry)
    comp2 = Train(comp1, registry=registry)
    comp3 = Test(comp2, registry=registry)

if __name__ == '__main__':
    from kfp import compiler
    compiler.Compiler().compile(pipeline, __file__[:-3]+ '.yaml')
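
The three component functions in pipeline-with-vineyard.py each attach the same `scheduling.k8s.v6d.io/*` labels plus a `required` annotation naming the upstream job. As a possible refactoring sketch (not part of this commit), that repetition could be collected in one place. The helper names `vineyard_scheduling_metadata` and `apply_scheduling_metadata` are hypothetical; the label keys and default values are copied from the file above, and the only assumption about `op` is that it exposes `add_pod_label` and `add_pod_annotation`, as the ops in that file do.

```python
SCHEDULING_PREFIX = "scheduling.k8s.v6d.io"


def vineyard_scheduling_metadata(job, required="",
                                 namespace="vineyard-system",
                                 vineyardd="vineyardd-sample"):
    """Return (labels, annotations) as used by the pipeline components.

    `job` is this step's job name; `required` names the upstream job
    whose output this step consumes (empty for the first step).
    """
    labels = {
        f"{SCHEDULING_PREFIX}/vineyardd-namespace": namespace,
        f"{SCHEDULING_PREFIX}/vineyardd": vineyardd,
        f"{SCHEDULING_PREFIX}/job": job,
    }
    annotations = {f"{SCHEDULING_PREFIX}/required": required}
    return labels, annotations


def apply_scheduling_metadata(op, job, required=""):
    """Attach the labels and annotation to an op (hypothetical helper)."""
    labels, annotations = vineyard_scheduling_metadata(job, required)
    for key, value in labels.items():
        op.add_pod_label(key, value)
    for key, value in annotations.items():
        op.add_pod_annotation(key, value)
    return op
```

With such a helper, `Train` would end with `return apply_scheduling_metadata(op, 'train-data', required='preprocess-data')`, which keeps the job chain (preprocess-data, then train-data, then test-data) visible in one argument per step.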