Merge branch 'main' into workflow-traces-update
Gregory-Pereira authored Jul 11, 2024
2 parents f290233 + 241e0e4 commit 9389041
Showing 3 changed files with 21 additions and 13 deletions.
2 changes: 1 addition & 1 deletion recipes/computer_vision/object_detection/README.md
@@ -56,7 +56,7 @@ The local Model Service relies on a volume mount to the localhost to access the
 make run
 ```

-As stated above, by default the model service will use [`facebook/detr-resnet-101`](https://huggingface.co/facebook/detr-resnet-101). However you can use other compatabale models. Simply pass the new `MODEL_NAME` and `MODEL_PATH` to the make command. Make sure the model is downloaded and exists in the [models directory](../../../models/):
+As stated above, by default the model service will use [`facebook/detr-resnet-101`](https://huggingface.co/facebook/detr-resnet-101). However you can use other compatible models. Simply pass the new `MODEL_NAME` and `MODEL_PATH` to the make command. Make sure the model is downloaded and exists in the [models directory](../../../models/):

 ```bash
 # from path model_servers/object_detection_python from repo containers/ai-lab-recipes
13 changes: 7 additions & 6 deletions training/amd-bootc/Containerfile
@@ -29,10 +29,11 @@ RUN grep -q /usr/lib/containers/storage /etc/containers/storage.conf || \
 sed -i -e '/additionalimage.*/a "/usr/lib/containers/storage",' \
 /etc/containers/storage.conf && \
 if [ -f "/run/.input/ilab" ]; then \
-cp /run/.input/ilab /usr/local/bin/ilab; \
+cp /run/.input/ilab /usr/bin/ilab; \
 else \
-curl -o /usr/local/bin/ilab "https://raw.githubusercontent.com/containers/ai-lab-recipes/main/training/ilab-wrapper/ilab"; \
-fi
+curl -o /usr/bin/ilab "https://raw.githubusercontent.com/containers/ai-lab-recipes/main/training/ilab-wrapper/ilab"; \
+fi \
+&& chmod +x /usr/bin/ilab

 ARG INSTRUCTLAB_IMAGE="quay.io/ai-lab/instructlab-amd:latest"

@@ -46,9 +47,9 @@ RUN if [ -n "${SSHPUBKEY}" ]; then \
 echo ${SSHPUBKEY} > /usr/ssh/root.keys && chmod 0600 /usr/ssh/root.keys; \
 fi

-RUN sed -i 's/__REPLACE_TRAIN_DEVICE__/cuda/' /usr/local/bin/ilab
-RUN sed -i 's/__REPLACE_CONTAINER_DEVICE__/nvidia.com\/gpu=all/' /usr/local/bin/ilab
-RUN sed -i "s%__REPLACE_CONTAINER_NAME__%${INSTRUCTLAB_IMAGE}%" /usr/local/bin/ilab
+RUN sed -i 's/__REPLACE_TRAIN_DEVICE__/cuda/' /usr/bin/ilab
+RUN sed -i 's/__REPLACE_CONTAINER_DEVICE__/nvidia.com\/gpu=all/' /usr/bin/ilab
+RUN sed -i "s%__REPLACE_CONTAINER_NAME__%${INSTRUCTLAB_IMAGE}%" /usr/bin/ilab

 # Added for running as an OCI Container to prevent Overlay on Overlay issues.
 VOLUME /var/lib/containers
19 changes: 13 additions & 6 deletions training/nvidia-bootc/Containerfile
@@ -138,12 +138,18 @@ RUN mv /etc/selinux /etc/selinux.tmp \
 dnf module enable -y nvidia-driver:${DRIVER_BRANCH} && \
 dnf install -y nvidia-fabric-manager-${DRIVER_VERSION} libnvidia-nscq-${DRIVER_BRANCH}-${DRIVER_VERSION} ; \
 fi \
+# Install rhc connect for insights telemetry gathering
+&& . /etc/os-release && if [ "${ID}" == "rhel" ]; then \
+dnf install -y rhc rhc-worker-playbook; \
+fi \
 && dnf clean all \
 && ln -s ../cloud-init.target /usr/lib/systemd/system/default.target.wants \
 && mv /etc/selinux.tmp /etc/selinux \
 && ln -s /usr/lib/systemd/system/nvidia-toolkit-firstboot.service /usr/lib/systemd/system/basic.target.wants/nvidia-toolkit-firstboot.service \
-&& echo "blacklist nouveau" > /etc/modprobe.d/blacklist_nouveau.conf
-
+&& echo "blacklist nouveau" > /etc/modprobe.d/blacklist_nouveau.conf \
+&& sed '/\[Unit\]/a ConditionPathExists = /dev/nvidia-nvswitchctl' /usr/lib/systemd/system/nvidia-fabricmanager.service \
+&& ln -s /usr/lib/systemd/system/nvidia-fabricmanager.service /etc/systemd/system/multi-user.target.wants/nvidia-fabricmanager.service \
+&& ln -s /usr/lib/systemd/system/nvidia-persistenced.service /etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service

 ARG SSHPUBKEY
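
A note on the `sed` call that adds `ConditionPathExists` in the hunk above: it has no `-i` flag, so as written it would print the edited unit to stdout during the build and leave `nvidia-fabricmanager.service` unchanged. The commit does not say whether that is intended. The sketch below illustrates the difference on a throwaway file. The temporary file and its contents are assumptions for illustration only, and both `-i` and the one-line `a text` form assume GNU sed.

```shell
#!/bin/sh
# Hypothetical stand-in for a systemd unit file (not the real fabricmanager unit).
unit="$(mktemp)"
printf '[Unit]\nDescription=demo unit\n' > "$unit"

# Without -i: sed writes the edited text to stdout; the file itself is untouched.
preview="$(sed '/\[Unit\]/a ConditionPathExists = /dev/nvidia-nvswitchctl' "$unit")"
before="$(grep -c 'ConditionPathExists' "$unit" || true)"

# With -i: the same script is applied to the file in place.
sed -i '/\[Unit\]/a ConditionPathExists = /dev/nvidia-nvswitchctl' "$unit"
after="$(grep -c 'ConditionPathExists' "$unit" || true)"

echo "matches in file before -i: $before, after -i: $after"
rm -f "$unit"
```

If the condition is meant to land in the installed unit, adding `-i` to the `sed` call in the `RUN` chain would be the smallest change; a systemd drop-in would be another option.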

@@ -162,15 +168,16 @@ RUN grep -q /usr/lib/containers/storage /etc/containers/storage.conf || \
 sed -i -e '/additionalimage.*/a "/usr/lib/containers/storage",' \
 /etc/containers/storage.conf && \
 if [ -f "/run/.input/ilab" ]; then \
-cp /run/.input/ilab /usr/local/bin/ilab; \
+cp /run/.input/ilab /usr/bin/ilab; \
 else \
-curl -o /usr/local/bin/ilab "https://raw.githubusercontent.com/containers/ai-lab-recipes/main/training/ilab-wrapper/ilab"; \
-fi
+curl -o /usr/bin/ilab "https://raw.githubusercontent.com/containers/ai-lab-recipes/main/training/ilab-wrapper/ilab"; \
+fi \
+&& chmod +x /usr/bin/ilab

 ARG INSTRUCTLAB_IMAGE="quay.io/ai-lab/instructlab-nvidia:latest"
 ARG GPU_COUNT_COMMAND="nvidia-ctk --quiet cdi list | grep -P nvidia.com/gpu='\\\\d+' | wc -l"

-RUN for i in /usr/local/bin/ilab*; do \
+RUN for i in /usr/bin/ilab*; do \
 sed -i 's/__REPLACE_TRAIN_DEVICE__/cuda/' $i; \
 sed -i 's/__REPLACE_CONTAINER_DEVICE__/nvidia.com\/gpu=all/' $i; \
 sed -i "s%__REPLACE_IMAGE_NAME__%${INSTRUCTLAB_IMAGE}%" $i; \
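
The loop above fills in placeholder tokens in the `ilab` wrapper at build time. A minimal sketch of that substitution follows, run against a throwaway file. The file contents and variable names inside it are assumptions, since the real wrapper is not shown in this diff, and `sed -i` without a suffix assumes GNU sed. The third command uses `%` as the delimiter because the image name contains `/`.

```shell
#!/bin/sh
# Hypothetical stand-in for the ilab wrapper; only the placeholder tokens come from the diff.
wrapper="$(mktemp)"
cat > "$wrapper" <<'EOF'
TRAIN_DEVICE="__REPLACE_TRAIN_DEVICE__"
CONTAINER_DEVICE="__REPLACE_CONTAINER_DEVICE__"
IMAGE_NAME="__REPLACE_IMAGE_NAME__"
EOF

# Same default as the ARG in the Containerfile.
INSTRUCTLAB_IMAGE="quay.io/ai-lab/instructlab-nvidia:latest"

# The three substitutions from the RUN loop, applied to the sample file.
sed -i 's/__REPLACE_TRAIN_DEVICE__/cuda/' "$wrapper"
sed -i 's/__REPLACE_CONTAINER_DEVICE__/nvidia.com\/gpu=all/' "$wrapper"
sed -i "s%__REPLACE_IMAGE_NAME__%${INSTRUCTLAB_IMAGE}%" "$wrapper"

result="$(cat "$wrapper")"
echo "$result"
rm -f "$wrapper"
```

Quoting the file argument, as in `"$wrapper"` here, also guards against paths with spaces, which the unquoted `$i` in the loop does not.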