feat: release v1.5.1
+ support mlu type
+ fix go mod path
+ rename libcndev.so when building images
+ add mlu driver and mcu version info
+ add mlu_container metric and deprecation notice

Signed-off-by: Yuting Tan <[email protected]>
yttan committed Mar 5, 2021
1 parent 7f11950 commit 7747617
Showing 19 changed files with 349 additions and 61 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,13 @@
# Changelog

## v1.5.1

+ Add MLU driver, mcu and mlu type labels
+ Add mlu_container metric. Use `<metric> * on(boardid) group_right ai_mlu_container` to append k8s container info to a metric.
+ **Deprecation:** container_resource_mlu_utilization will be removed in the future
+ **Deprecation:** container_resource_mlu_memory_utilization will be removed in the future
+ **Deprecation:** container_resource_mlu_board_power will be removed in the future

## v1.5.0

+ Open source basic functions.
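The `mlu_container` join described in the v1.5.1 entry can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `mlu_with_container` and the metric name `ai_board_utilization` are hypothetical and not part of this commit. The join relies on the board-level metric and `ai_mlu_container` sharing the `boardid` label, as the example ConfigMap in this commit configures, and it presumably expects `ai_mlu_container` to report a value of 1 so that the multiplication leaves the metric value unchanged.

```shell
#!/bin/sh
# Hypothetical helper: build the PromQL join from the changelog for one metric.
# group_right keeps the labels of the right-hand side (ai_mlu_container),
# so the result carries the pod and container labels.
mlu_with_container() {
    printf '%s * on(boardid) group_right ai_mlu_container\n' "$1"
}

# ai_board_utilization is an assumed metric name used only for illustration.
query=$(mlu_with_container ai_board_utilization)
echo "$query"
```

The resulting expression could then be sent to a Prometheus server, for example with `curl -G --data-urlencode "query=$query" http://localhost:9090/api/v1/query`, where the address is a placeholder.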
4 changes: 2 additions & 2 deletions Makefile
@@ -21,8 +21,8 @@ export CC=aarch64-linux-gnu-gcc
endif

generate:
-mockgen -package mock -destination pkg/mock/cndev.go -mock_names=Cndev=Cndev github.com/cambricon/mlu-exporter/pkg/cndev Cndev
-mockgen -package mock -destination pkg/mock/podrsources.go -mock_names=PodResources=PodResources github.com/cambricon/mlu-exporter/pkg/podresources PodResources
+mockgen -package mock -destination pkg/mock/cndev.go -mock_names=Cndev=Cndev github.com/Cambricon/mlu-exporter/pkg/cndev Cndev
+mockgen -package mock -destination pkg/mock/podrsources.go -mock_names=PodResources=PodResources github.com/Cambricon/mlu-exporter/pkg/podresources PodResources

lint:
golangci-lint run -v
8 changes: 6 additions & 2 deletions README.md
@@ -54,7 +54,7 @@ Use the following command to start the exporter.
docker run -d \
-p 30108:30108 \
--privileged=true \
-cambricon-mlu-exporter:v1.5.0
+cambricon-mlu-exporter:v1.5.1
```

Then use the following command to get the metrics.
@@ -70,7 +70,7 @@ docker run -d \
-p 30108:30108 \
-v examples/metrics.yaml:/var/lib/mlu-exporter/metrics.yaml \
--privileged=true \
-cambricon-mlu-exporter:v1.5.0 \
+cambricon-mlu-exporter:v1.5.1 \
mlu-exporter \
--metrics-config=/var/lib/mlu-exporter/metrics.yaml \
--metrics-path=/metrics \
@@ -124,3 +124,7 @@ kubectl apply -f examples/cambricon-mlu-exporter-sm.yaml
Then check your Prometheus instance for the MLU metrics.

You can also set the command arguments described above in the MLU exporter DaemonSet spec, and set the metrics configuration in the MLU exporter ConfigMap.

## Upgrade Notice

**Please see [changelog](CHANGELOG.md) for deprecation and breaking changes.**
4 changes: 2 additions & 2 deletions build_image.sh
@@ -16,7 +16,7 @@
curpath=$(dirname "$0")
cd "$curpath" || exit 1

-: "${TAG:=v1.5.0}"
+: "${TAG:=v1.5.1}"
: "${ARCH:=amd64}"
: "${LIBCNDEV:=/usr/local/neuware/lib64/libcndev.so}"

@@ -65,7 +65,7 @@ if ! file "$LIBCNDEV" --dereference | grep -q "$file_arch"; then
exit 1
fi

-cp "$LIBCNDEV" "$curpath/libs/linux/$ARCH/"
+cp "$LIBCNDEV" "$curpath/libs/linux/$ARCH/libcndev.so"

echo "Building Cambricon MLU Exporter docker image."
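
The effect of the changed `cp` destination in this hunk can be reproduced in isolation. The sketch below is an illustration only: it works in a throwaway temporary directory, and the versioned file name `libcndev.so.1` is a hypothetical stand-in for whatever path `LIBCNDEV` is set to. Copying into a directory keeps the source base name, while copying to an explicit file path always produces `libcndev.so`.

```shell
#!/bin/sh
# Sketch: why the explicit destination file name matters.
# All paths are temporary; none of them is the real Neuware install.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/old" "$workdir/new"

# Hypothetical versioned library name; the real file name may differ.
LIBCNDEV="$workdir/src/libcndev.so.1"
echo "fake library" > "$LIBCNDEV"

# Before this commit: copying into a directory keeps the source base name.
cp "$LIBCNDEV" "$workdir/old/"

# After this commit: the copy is always named libcndev.so.
cp "$LIBCNDEV" "$workdir/new/libcndev.so"

old_name=$(ls "$workdir/old")
new_name=$(ls "$workdir/new")
echo "old behaviour: $old_name"   # old behaviour: libcndev.so.1
echo "new behaviour: $new_name"   # new behaviour: libcndev.so

rm -rf "$workdir"
```

With the fixed name, whatever consumes `libs/linux/$ARCH/` can presumably refer to `libcndev.so` regardless of how the source file was named.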

79 changes: 77 additions & 2 deletions examples/cambricon-mlu-exporter-cm.yaml
@@ -22,103 +22,164 @@ data:
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
cluster: "coregroup"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
board_health:
name: "board_health"
help: "The health state of Cambricon MLU, 1 means health, 0 means sick"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
physical_memory_total:
name: "mem_total_bytes"
help: "The total physical memory of Cambricon MLU, unit is 'B'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
physical_memory_used:
name: "mem_used_bytes"
help: "The used physical memory of Cambricon MLU, unit is 'B'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
memory_utilization:
name: "mem_utilization"
help: "The memory utilization of Cambricon MLU, unit is '%'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
board_utilization:
name: "board_utilization"
help: "The utilization of Cambricon MLU, unit is '%'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
board_capacity:
name: "board_capacity"
help: "The capacity metric of Cambricon MLU, unit is 'T'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
board_usage:
name: "board_usage"
help: "The usage metric of Cambricon MLU, unit is 'T'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
core_utilization:
name: "core"
help: "The utilization metric of Cambricon MLU core, unit is '%'"
labels:
core: "core"
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
fan_speed:
name: "fan"
help: "Fan speed of Cambricon MLU, unit is 'rpm', '-1' means no fan on MLU"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
board_power:
name: "board_power"
help: "The power usage of Cambricon MLU, unit is 'w'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
board_version:
name: "board_version"
help: "The board version info of Cambricon MLU"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
podresources:
board_allocated:
name: "board_allocated"
help: "The allocated number of Cambricon MLUs in a node"
labels:
model: "model"
mluType: "mlu_type"
driver: "driver"
nodeName: "nodeHostname"
mlu_container:
name: "mlu_container"
help: "The k8s container info of Cambricon MLUs allocated"
labels:
slot: "mlu"
model: "model"
mluType: "mlu_type"
sn: "boardid"
nodeName: "nodeHostname"
mcu: "mcu"
driver: "driver"
namespace: "pod_namespace"
pod: "pod_name"
container: "container_name"
container_mlu_utilization:
name: "container_resource_mlu_utilization"
-help: "The utilization metric of Cambricon MLU in container, unit is '%'"
+help: "Deprecated (this metric will be removed in the future): The utilization metric of Cambricon MLU in container, unit is '%'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "board_id"
nodeName: "nodeHostname"
namespace: "namespace"
@@ -130,6 +191,7 @@ data:
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "board_id"
vfID: "vfid"
nodeName: "nodeHostname"
@@ -138,10 +200,23 @@ data:
container: "containerName"
container_mlu_memory_utilization:
name: "container_resource_mlu_memory_utilization"
-help: "The memory utilization metric of Cambricon MLU in container, unit is '%'"
+help: "Deprecated (this metric will be removed in the future): The memory utilization metric of Cambricon MLU in container, unit is '%'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "board_id"
nodeName: "nodeHostname"
namespace: "namespace"
pod: "pod"
container: "containerName"
container_mlu_board_power:
name: "container_resource_mlu_board_power"
help: "Deprecated (this metric will be removed in the future): The board power usage of Cambricon MLU in container, unit is 'w'"
labels:
slot: "slot"
model: "model"
mluType: "mlu_type"
sn: "board_id"
nodeName: "nodeHostname"
namespace: "namespace"
2 changes: 1 addition & 1 deletion examples/cambricon-mlu-exporter-ds.yaml
@@ -30,7 +30,7 @@ spec:
spec:
containers:
- name: cambricon-mlu-monitor
-image: cambricon-mlu-exporter:v1.5.0
+image: cambricon-mlu-exporter:v1.5.1
imagePullPolicy: IfNotPresent
command:
- /usr/bin/mlu-exporter