Skip to content

Commit

Permalink
Add the fluid integration exmaple.
Browse files Browse the repository at this point in the history
Signed-off-by: Ye Cao <[email protected]>
  • Loading branch information
dashanji committed Oct 19, 2023
1 parent 35a9b83 commit 0380a1f
Show file tree
Hide file tree
Showing 8 changed files with 371 additions and 0 deletions.
3 changes: 3 additions & 0 deletions k8s/examples/fluid/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
FROM python:3.10

COPY ./configure-vineyard-socket.py /configure-vineyard-socket.py
2 changes: 2 additions & 0 deletions k8s/examples/fluid/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
build-image:
docker build -t configure-vineyard-socket:latest -f Dockerfile .
101 changes: 101 additions & 0 deletions k8s/examples/fluid/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
## Fluid integration

If you are using [Fluid](https://fluidframework.com/) in your application, now it's a chance to cache your **Python Object** using **Vineyard** based on **Fluid**.

### Prerequisites

- A kubernetes cluster with version >= 1.25.10. If you don't have one by hand, you can refer to the guide [Initialize Kubernetes Cluster](https://v6d.io/tutorials/kubernetes/using-vineyard-operator.html#step-0-optional-initialize-kubernetes-cluster) to create one.

Check warning on line 7 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L7

[list-item-indent] Incorrect list-item indent: add 2 spaces
- Install the [Vineyardctl](https://v6d.io/notes/developers/build-from-source.html#install-vineyardctl) by following the official guide.

Check warning on line 8 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L8

[list-item-indent] Incorrect list-item indent: add 2 spaces

### Install the argo server on Kubernetes

1. Install the argo server on Kubernetes:

Check warning on line 12 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L12

[list-item-indent] Incorrect list-item indent: add 1 space

```bash

Check warning on line 14 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L14

[no-shell-dollars] Do not use dollar signs before shell commands
$ kubectl create namespace argo
$ kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.4.8/install.yaml
```

2. Check the status of the argo server:

Check warning on line 19 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L19

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ kubectl get pod -n argo
NAME READY STATUS RESTARTS AGE
argo-server-7698c96655-xsd5k 1/1 Running 0 10m
workflow-controller-b888f4458-ts58f 1/1 Running 0 10m
```

### Submit the job wihout Vineyard

1. Submit the workflow:

Check warning on line 30 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L30

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ cd k8s/examples/fluid
$ argo submit --watch argo_workflow.yaml
```

### Submit the job with Vineyard


1. Install the vineyard deployment:

Check warning on line 40 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L40

[list-item-indent] Incorrect list-item indent: add 1 space

Check warning on line 40 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L40

[no-consecutive-blank-lines] Remove 1 line before node

```bash
$ vineyardctl deploy vineyard-deployment
```

2. Install the Fluid:

Check warning on line 46 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L46

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ kubectl create ns fluid-system
$ helm repo add fluid https://fluid-cloudnative.github.io/charts
$ helm repo update
$ helm install fluid fluid/fluid
```

3. Build the `configure-vineyard-socket` image and load to the cluster.

Check warning on line 55 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L55

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ cd k8s/examples/fluid && make build-image && kind load docker-image configure-vineyard-socket
```

4. Install the Vineyard profile:

Check warning on line 61 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L61

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ kubectl apply -f vineyard_profile.yaml
```

5. Install the Vineyard Dataset:

Check warning on line 67 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L67

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ kubectl apply -f dataset.yaml
```

6. Check the dataset status and make sure the dataset is in `Bound` status as follows.

Check warning on line 73 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L73

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ kubectl get dataset -A
NAMESPACE NAME UFS TOTAL SIZE CACHED CACHE CAPACITY CACHED PERCENTAGE PHASE AGE
default vineyard [Calculating] N/A N/A N/A Bound 105s
```

After that, the vineyard pv and pvc will be created automatically as follows.

```bash
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
vineyard Bound default-vineyard 100Pi RWX fluid 87s
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
default-vineyard 100Pi RWX Retain Bound default/vineyard fluid 2m43s
```

7. Submit the workflow with Vineyard:

Check warning on line 92 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L92

[list-item-indent] Incorrect list-item indent: add 1 space

```bash
$ argo submit --watch argo_workflow_with_vineyard.yaml
```

### result

The execution time of the workflow without Vineyard is 41s.

Check warning on line 100 in k8s/examples/fluid/README.md

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/README.md#L100

[no-consecutive-blank-lines] Remove 1 line after node

97 changes: 97 additions & 0 deletions k8s/examples/fluid/argo_workflow.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: mlops-
spec:
entrypoint: dag
templates:
- name: producer
volumes:
- name: data
hostPath:
path: /data
type: DirectoryOrCreate
container:
image: python:3.10
volumeMounts:
- name: data
mountPath: /data
command: [bash, -c]
args:
- |
pip install numpy pandas;
cat << EOF >> producer.py
import time
import numpy as np
import pandas as pd
def generate_random_dataframe(num_rows):
return pd.DataFrame({
'Id': np.random.randint(1, 100000, num_rows),
'TotalRooms': np.random.randint(2, 11, num_rows),
"GarageAge": np.random.randint(1, 31, num_rows),
"RemodAge": np.random.randint(1, 31, num_rows),
"HouseAge": np.random.randint(1, 31, num_rows),
"TotalBath": np.random.randint(1, 5, num_rows),
"TotalPorchSF": np.random.randint(1, 1001, num_rows),
"TotalSF": np.random.randint(1000, 6001, num_rows),
"TotalArea": np.random.randint(1000, 6001, num_rows),
'MoSold': np.random.randint(1, 13, num_rows),
'YrSold': np.random.randint(2006, 2022, num_rows),
'SalePrice': np.random.randint(50000, 800001, num_rows),
})
def producer():
st = time.time()
print('Generating data....', flush=True)
df = generate_random_dataframe(1000000000)
ed = time.time()
print('##################################')
print('Generating data time: ', ed - st, flush=True)
st = time.time()
print('Serializing data....', flush=True)
df.to_pickle('/data/df.pkl')
ed = time.time()
print('##################################')
print('Serializing data time: ', ed - st, flush=True)
EOF
python producer.py;
- name: consumer
volumes:
- name: data
hostPath:
path: /data
type: DirectoryOrCreate
container:
image: python:3.10
volumeMounts:
- name: data
mountPath: /data
command: [bash, -c]
args:
- |
pip install pandas;
cat << EOF >> consumer.py
import time
import pandas as pd
def consumer():
st = time.time()
print('Deserializing data....', flush=True)
df = pd.read_pickle('/data/df.pkl')
ed = time.time()
print('##################################')
print('Deserializing data time: ', ed - st, flush=True)
EOF
python consumer.py;
- name: dag
dag:
tasks:
- name: producer
template: producer
- name: consumer
template: consumer
dependencies:
- producer
95 changes: 95 additions & 0 deletions k8s/examples/fluid/argo_workflow_with_vineyard.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: mlops-
spec:
entrypoint: dag
templates:
- name: producer
volumes:
- name: data
persistentVolumeClaim:
claimName: vineyard
container:
image: python:3.10
volumeMounts:
- name: data
mountPath: /data
command: [bash, -c]
args:
- |
pip install vineyard numpy pandas;
cat << EOF >> producer.py
import time
import numpy as np
import pandas as pd
def generate_random_dataframe(num_rows):
return pd.DataFrame({
'Id': np.random.randint(1, 100000, num_rows),
'TotalRooms': np.random.randint(2, 11, num_rows),
"GarageAge": np.random.randint(1, 31, num_rows),
"RemodAge": np.random.randint(1, 31, num_rows),
"HouseAge": np.random.randint(1, 31, num_rows),
"TotalBath": np.random.randint(1, 5, num_rows),
"TotalPorchSF": np.random.randint(1, 1001, num_rows),
"TotalSF": np.random.randint(1000, 6001, num_rows),
"TotalArea": np.random.randint(1000, 6001, num_rows),
'MoSold': np.random.randint(1, 13, num_rows),
'YrSold': np.random.randint(2006, 2022, num_rows),
'SalePrice': np.random.randint(50000, 800001, num_rows),
})
def producer()
st = time.time()
print('Generating data....', flush=True)
df = generate_random_dataframe(1000000000)
ed = time.time()
print('##################################')
print('Generating data time: ', ed - st, flush=True)
st = time.time()
print('Serializing data....', flush=True)
vineyard.csi.write(df, '/data/df.pkl')
ed = time.time()
print('##################################')
print('Serializing data time: ', ed - st, flush=True)
EOF
python producer.py;
- name: consumer
volumes:
- name: data
persistentVolumeClaim:
claimName: vineyard
container:
image: python:3.10
volumeMounts:
- name: data
mountPath: /data
command: [bash, -c]
args:
- |
pip install vineyard pandas;
cat << EOF >> consumer.py
import time
import vineyard
def consumer()
st = time.time()
print('Deserializing data....', flush=True)
df = vineyard.csi.read('/data/df.pkl')
ed = time.time()
print('##################################')
print('Deserializing data time: ', ed - st, flush=True)
EOF
python consumer.py;
- name: dag
dag:
tasks:
- name: producer
template: producer
- name: consumer
template: consumer
dependencies:
- producer
32 changes: 32 additions & 0 deletions k8s/examples/fluid/configure-vineyard-socket.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
import json

with open("/etc/fluid/config.json", "r") as f:

Check warning on line 3 in k8s/examples/fluid/configure-vineyard-socket.py

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/configure-vineyard-socket.py#L3

Using open without explicitly specifying an encoding
lines = f.readlines()

rawStr = lines[0]
print(rawStr)


script = """
#!/bin/sh
set -ex
mkdir -p $targetPath
while true; do
if [ ! -S "$targetPath/vineyard.sock" ]; then
mount --bind $socketPath $targetPath
fi
sleep 10
done
"""

obj = json.loads(rawStr)

with open("mount-vineyard-socket.sh", "w") as f:

Check warning on line 25 in k8s/examples/fluid/configure-vineyard-socket.py

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/configure-vineyard-socket.py#L25

Using open without explicitly specifying an encoding
f.write("targetPath=\"%s\"\n" % obj['targetPath'])
if obj['mounts'][0]['mountPoint'].startswith("local://"):
f.write("socketPath=\"%s\"\n" % obj['mounts'][0]['mountPoint'][len("local://"):])

Check warning on line 28 in k8s/examples/fluid/configure-vineyard-socket.py

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/configure-vineyard-socket.py#L28

Bad indentation. Found 6 spaces, expected 8
else:
f.write("socketPath=\"%s\"\n" % obj['mounts'][0]['mountPoint'])

Check warning on line 30 in k8s/examples/fluid/configure-vineyard-socket.py

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

k8s/examples/fluid/configure-vineyard-socket.py#L30

Bad indentation. Found 6 spaces, expected 8

f.write(script)
18 changes: 18 additions & 0 deletions k8s/examples/fluid/dataset.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
name: vineyard
spec:
mounts:
# This directory should be the same as the vineyard socket directory in the vineyard deployment
- mountPoint: local:///var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
name: vineyard
accessModes:
- ReadWriteMany
---
apiVersion: data.fluid.io/v1alpha1
kind: ThinRuntime
metadata:
name: vineyard
spec:
profileName: vineyard-profile
23 changes: 23 additions & 0 deletions k8s/examples/fluid/vineyard_profile.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
apiVersion: data.fluid.io/v1alpha1
kind: ThinRuntimeProfile
metadata:
name: vineyard-profile
spec:
fileSystemType: fuse
volumes:
- name: vineyard
hostPath:
# This path should be the same as the vineyard socket path in the vineyard deployment
path: /var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
type: DirectoryOrCreate
fuse:
image: configure-vineyard-socket
imageTag: latest
imagePullPolicy: IfNotPresent
volumeMounts:
- name: vineyard-socket
mountPath: /var/run/vineyard-kubernetes/vineyard-system/vineyardd-sample
command:
- sh
- -c
- "python3 /configure-vineyard-socket.py && chmod u+x ./mount-vineyard-socket.sh && ./mount-vineyard-socket.sh"

0 comments on commit 0380a1f

Please sign in to comment.