Add bootc to codegen (#174)
Standardize on the name codegen not code_generation.

Fix errors in other natural_language recipes found while
developing codegen bootc.

Signed-off-by: Daniel J Walsh <[email protected]>
rhatdan authored Apr 8, 2024
1 parent 0ae9386 commit ff31a46
Showing 14 changed files with 263 additions and 10 deletions.

This file was deleted.

55 changes: 55 additions & 0 deletions recipes/natural_language_processing/codegen/Makefile
@@ -0,0 +1,55 @@
APP ?= codegen
MODELIMAGE ?= quay.io/ai-lab/mistral-7b-instruct:latest
APPIMAGE ?= quay.io/ai-lab/${APP}:latest
SERVERIMAGE ?= quay.io/ai-lab/llamacpp-python:latest
SSHPUBKEY ?= $(shell cat ${HOME}/.ssh/id_rsa.pub;)
BOOTCIMAGE ?= quay.io/ai-lab/${APP}-bootc:latest
FROM ?=

.PHONY: build
build:
	podman build -f builds/Containerfile -t ${APPIMAGE} .

.PHONY: bootc
bootc:
	podman build $${FROM:+--from $${FROM}} --cap-add SYS_ADMIN --build-arg "SSHPUBKEY=$(SSHPUBKEY)" -f bootc/Containerfile -t ${BOOTCIMAGE} .

.PHONY: quadlet
quadlet:
	# Modify quadlet files to match the server, model and app image
	mkdir -p build
	sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
	    -e "s|APPIMAGE|${APPIMAGE}|g" \
	    -e "s|MODELIMAGE|${MODELIMAGE}|g" \
	    quadlet/${APP}.image \
	    > build/${APP}.image
	sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
	    -e "s|APPIMAGE|${APPIMAGE}|g" \
	    -e "s|MODELIMAGE|${MODELIMAGE}|g" \
	    quadlet/${APP}.yaml \
	    > build/${APP}.yaml
	cp quadlet/${APP}.kube build/${APP}.kube

.PHONY: install
install:
	wget https://www.slimjetbrowser.com/chrome/files/103.0.5060.53/google-chrome-stable_current_amd64.deb
	sudo dpkg -i google-chrome-stable_current_amd64.deb
	wget https://chromedriver.storage.googleapis.com/103.0.5060.53/chromedriver_linux64.zip
	unzip chromedriver_linux64.zip
	pip install -r tests/requirements.txt

.PHONY: run
run:
	podman run -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://10.88.0.1:8001/v1 ghcr.io/ai-lab-recipes/${APP}

.PHONY: functional_tests
functional_tests:
	python3 -m pytest -vvv --driver=Chrome --driver-path=./chromedriver tests/functional

.PHONY: integration_tests
integration_tests:
	URL=${URL} python3 -m pytest -vvv --driver=Chrome --driver-path=./chromedriver tests/integration

.PHONY: clean
clean:
	rm -rf build
@@ -20,6 +20,7 @@ In order to build this application we will need a model, a Model Service and an
* [Build the AI Application](#build-the-ai-application)
* [Deploy the AI Application](#deploy-the-ai-application)
* [Interact with the AI Application](#interact-with-the-ai-application)
* [Embed the AI Application in a Bootable Container Image](#embed-the-ai-application-in-a-bootable-container-image)

### Download a model

@@ -92,3 +93,50 @@ podman run --rm -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://10.88.0.1:8001
Everything should now be up and running, with the chat application available at [`http://localhost:8501`](http://localhost:8501). With this recipe as a starting point, users should have an easier time customizing and building their own LLM-enabled code generation applications.

_Note: Future recipes will demonstrate integration between locally hosted LLMs and developer productivity tools like VSCode._

### Embed the AI Application in a Bootable Container Image

To build a bootable container image that includes this sample code generation workload as a service that starts when the system boots, `cd` into this folder
and run:


```
make BOOTCIMAGE=quay.io/your/codegen-bootc:latest bootc
```
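
The `bootc` target reads `~/.ssh/id_rsa.pub` by default through the Makefile's `SSHPUBKEY` variable. If your key has a different file name, one option is to pass `SSHPUBKEY` on the command line. The sketch below is only an illustration: `first_pubkey` is a hypothetical helper, and the list of key file names it tries is an assumption, not something the recipe defines.

```bash
# Hypothetical helper: print the first readable public key found
# among a few common file names (the list is an assumption).
first_pubkey() {
    local dir="${1:-$HOME/.ssh}" name
    for name in id_ed25519.pub id_ecdsa.pub id_rsa.pub; do
        if [ -r "$dir/$name" ]; then
            cat "$dir/$name"
            return 0
        fi
    done
    echo "no public key found in $dir" >&2
    return 1
}

# Usage (requires podman and this recipe's Makefile):
#   make SSHPUBKEY="$(first_pubkey)" BOOTCIMAGE=quay.io/your/codegen-bootc:latest bootc
```

Because `SSHPUBKEY` is assigned with `?=` in the Makefile, a value given on the command line takes precedence over the default.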

To replace the base image named in the `FROM` instruction of `bootc/Containerfile`, pass the Makefile's `FROM` option:

```
make FROM=registry.redhat.io/rhel9-beta/rhel-bootc:9.4 BOOTCIMAGE=quay.io/your/codegen-bootc:latest bootc
```
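
The override works through the shell's `${VAR:+word}` expansion in the `bootc` target: `--from` is added to the `podman build` command only when `FROM` is non-empty. The following is a minimal sketch of that behavior in plain shell; `from_args` is a hypothetical helper name used for illustration and is not part of the Makefile.

```bash
# Hypothetical helper: print the extra podman-build arguments for a base image.
# ${base:+...} expands to nothing when base is unset or empty.
from_args() {
    local base="${1:-}"
    printf '%s' "${base:+--from ${base}}"
}

# With an empty value no flag is produced; with a value, "--from <image>" is.
echo "empty FROM: [$(from_args "")]"
echo "set FROM:   [$(from_args "registry.redhat.io/rhel9-beta/rhel-bootc:9.4")]"
```

When `FROM` is left empty, the build falls back to the `FROM` line written in `bootc/Containerfile`.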

The magic happens when you have a bootc enabled system running. If you do, and you'd like to update the operating system to the OS you just built
with the codegen application, it's as simple as ssh-ing into the bootc system and running:

```
bootc switch quay.io/your/codegen-bootc:latest
```

Upon a reboot, you'll see that the codegen service is running on the system.

Check on the service with:

```
ssh user@bootc-system-ip
sudo systemctl status codegen
```

#### What are bootable containers?

What's a [bootable OCI container](https://containers.github.io/bootc/) and what's it got to do with AI?

That's a good question! We think it's a good idea to embed AI workloads (or any workload!) into bootable images at _build time_ rather than
at _runtime_. This extends the benefits, such as portability and predictability, that containerizing applications provides to the operating system.
Bootable OCI images bake exactly what you need to run your workloads into the operating system at build time by using your favorite containerization
tools. Might I suggest [podman](https://podman.io/)?

Once installed, a bootc enabled system can be updated by providing an updated bootable OCI image from any OCI
image registry with a single `bootc` command. This works especially well for fleets of devices that have fixed workloads - think
factories or appliances. Who doesn't want to add a little AI to their appliance, am I right?

Bootable images lend themselves to immutable operating systems, and the more immutable an operating system is, the less that can go wrong at runtime!
57 changes: 57 additions & 0 deletions recipes/natural_language_processing/codegen/bootc/Containerfile
@@ -0,0 +1,57 @@
# Example: an AI powered sample application is embedded as a systemd service
# via Podman quadlet files in /usr/share/containers/systemd
#
# Use build command:
# podman build --build-arg "SSHPUBKEY=$(cat $HOME/.ssh/id_rsa.pub)" -t quay.io/exampleos/myos .
# The --build-arg "SSHPUBKEY=$(cat ~/.ssh/id_rsa.pub)" option inserts your
# public key into the image, allowing root access via ssh.

FROM quay.io/centos-bootc/centos-bootc:stream9
ARG SSHPUBKEY

RUN mkdir /usr/etc-system && \
echo 'AuthorizedKeysFile /usr/etc-system/%u.keys' >> /etc/ssh/sshd_config.d/30-auth-system.conf && \
echo ${SSHPUBKEY} > /usr/etc-system/root.keys && chmod 0600 /usr/etc-system/root.keys

# Pre-pull workload images:
# Comment out the pull commands to keep the bootc image smaller.
# The quadlet .image file (commented out below) pulls the following images
# on boot if they are not pre-pulled here.

ARG RECIPE=codegen
ARG MODELIMAGE=quay.io/ai-lab/mistral-7b-instruct:latest
ARG APPIMAGE=quay.io/ai-lab/${RECIPE}:latest
ARG SERVERIMAGE=quay.io/ai-lab/llamacpp-python:latest

# Add quadlet files to setup system to automatically run AI application on boot
COPY build/${RECIPE}.kube build/${RECIPE}.yaml /usr/share/containers/systemd/

# Modify quadlet files to match the server, model and app image
RUN sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
-e "s|APPIMAGE|${APPIMAGE}|g" \
-e "s|MODELIMAGE|${MODELIMAGE}|g" \
-i \
/usr/share/containers/systemd/${RECIPE}.yaml

# Because images are prepulled, no need for .image quadlet
# COPY build/${RECIPE}.image /usr/share/containers/systemd
# RUN sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
# -e "s|APPIMAGE|${APPIMAGE}|g" \
# -e "s|MODELIMAGE|${MODELIMAGE}|g" \
# -i \
# /usr/share/containers/systemd/${RECIPE}.image

# Setup /usr/lib/containers/storage as an additional store for images.
# Remove once the base images have this set by default.
RUN sed -i -e '/additionalimage.*/a "/usr/lib/containers/storage",' \
/etc/containers/storage.conf

# Added for running as an OCI Container to prevent Overlay on Overlay issues.
VOLUME /var/lib/containers

# Prepull the model, model_server & application images to populate the system.
RUN podman pull --root /usr/lib/containers/storage ${SERVERIMAGE}
RUN podman pull --root /usr/lib/containers/storage ${APPIMAGE}
RUN podman pull --root /usr/lib/containers/storage ${MODELIMAGE}

RUN podman system reset --force 2>/dev/null
94 changes: 94 additions & 0 deletions recipes/natural_language_processing/codegen/bootc/README.md
@@ -0,0 +1,94 @@
## Embed workload (AI sample applications) in a bootable container image

### Create a custom centos-bootc:stream9 image

* [Containerfile](./Containerfile) - embeds an LLM-powered sample code generation application.

Details on the application can be found [in the codegen/README.md](../README.md). By default, this Containerfile includes a model-server
that is meant to run with CPU - no additional GPU drivers or toolkits are embedded. You can substitute the llamacpp_python model-server image
for one that has GPU drivers and toolkits with additional build-args. The `FROM` must be replaced with a base image that has the necessary
kernel drivers and toolkits if building for GPU enabled systems. For an example of an NVIDIA/CUDA base image,
see [NVIDIA bootable image example](https://gitlab.com/bootc-org/examples/-/tree/main/nvidia?ref_type=heads)

In order to pre-pull the workload images, you need to build on the same architecture you're building for.
If not pre-pulling the workload images, you can cross build (i.e., build on a Mac for an x86_64 system).
To build the derived bootc image for x86_64 architecture, run the following:

```bash
cd recipes/natural_language_processing/codegen

# for CPU powered sample LLM application
# to switch to an alternate platform like aarch64, pass --platform linux/arm64
# the --cap-add SYS_ADMIN switch is needed when you are embedding Podman
# commands within the container build. If the registry you are pulling images
# from requires authentication, then you will need to volume mount the
# auth.json file with SELinux separation disabled.
podman login --auth-file auth.json quay.io/yourrepo
podman build --build-arg "SSHPUBKEY=$(cat ~/.ssh/id_rsa.pub)" \
--security-opt label=disable \
-v ./auth.json:/run/containers/0/auth.json \
--cap-add SYS_ADMIN \
-f bootc/Containerfile \
-t quay.io/yourrepo/youros:tag .

# for GPU powered sample LLM application with llamacpp cuda model server
podman build --build-arg "SSHPUBKEY=$(cat ~/.ssh/id_rsa.pub)" \
--build-arg "SERVERIMAGE=quay.io/redhat-et/locallm-llamacpp-cuda-model-server:latest" \
--from <YOUR BOOTABLE IMAGE WITH NVIDIA/CUDA> \
--cap-add SYS_ADMIN \
--platform linux/amd64 \
-f bootc/Containerfile \
-t quay.io/yourrepo/youros:tag .
podman push quay.io/yourrepo/youros:tag
```
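
Because pre-pulling only works when the build host matches the target architecture, it can help to check this before building. The sketch below maps `uname -m` output to the `--platform` values mentioned in the build commands; `platform_for_arch` and `can_prepull` are hypothetical helper names, and only the two architectures named in this README are handled.

```bash
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the podman --platform value used in this README.
platform_for_arch() {
    case "$1" in
        x86_64)        echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)             echo "unknown"; return 1 ;;
    esac
}

# Succeeds only when the host platform equals the target platform,
# i.e. when workload images can be pre-pulled during the build.
can_prepull() {
    local target="$1" host
    host="$(platform_for_arch "$(uname -m)")" || return 1
    [ "$host" = "$target" ]
}

if can_prepull "linux/amd64"; then
    echo "host matches target: pre-pulling is possible"
else
    echo "cross build: comment out the podman pull steps"
fi
```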
### Update a bootc-enabled system with the new derived image
To build a disk image from an OCI bootable image, you can refer to [bootc-org/examples](https://gitlab.com/bootc-org/examples).
For this example, we assume a bootc-enabled system is already running.
On such a system, `bootc switch` retargets the OS to a new bootable OCI image with embedded workloads.
SSH into the bootc-enabled system and run:
```bash
bootc switch quay.io/yourrepo/youros:tag
```
The necessary image layers will be downloaded from the OCI registry, and the system will prompt you to reboot into the new operating system.
From this point, after any subsequent modifications and pushes to the `quay.io/yourrepo/youros:tag` OCI image, your OS can be updated with:
```bash
bootc upgrade
```
### Accessing the embedded workloads
The codegen application can be accessed by visiting port `8501` of the running bootc system.
It runs as a systemd service defined by the Podman quadlet files placed at `/usr/share/containers/systemd/` on the bootc system.
For more information about running containerized applications as systemd services with Podman, refer to this
[Podman quadlet post](https://www.redhat.com/sysadmin/quadlet-podman) or the [Podman documentation](https://podman.io/docs).
To monitor the sample application, SSH into the bootc system and run:
```bash
systemctl status codegen
```
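For a quick scripted check, the unit state and the HTTP port can be tested together. This is a sketch under assumptions: `check_codegen` is a hypothetical helper, and the default URL `http://localhost:8501` is taken from the port the application publishes in the main codegen README.

```bash
# Hypothetical helper: succeed when the unit is active and the URL responds.
# Arguments: unit name (default: codegen), URL (default: http://localhost:8501).
check_codegen() {
    local unit="${1:-codegen}" url="${2:-http://localhost:8501}"
    systemctl is-active --quiet "$unit" || { echo "$unit is not active"; return 1; }
    curl -fsS -o /dev/null "$url" || { echo "$url did not respond"; return 1; }
    echo "$unit is active and $url responds"
}

# Usage on the bootc system:
#   check_codegen
#   check_codegen codegen http://localhost:8501
```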
You can also view the pods and containers that are managed with systemd by running:
```
podman pod list
podman ps -a
```
To stop the sample applications, SSH into the bootc system and run:
```bash
systemctl stop codegen
```
To run the sample application _not_ as a systemd service, stop the service and then
start the pod directly from its Kubernetes YAML:
```bash
podman kube play /usr/share/containers/systemd/codegen.yaml
```
9 changes: 9 additions & 0 deletions recipes/natural_language_processing/codegen/quadlet/README.md
@@ -0,0 +1,9 @@
### Run code-generation as a systemd service

```bash
(cd ../;make quadlet)
sudo cp ../build/codegen.yaml ../build/codegen.kube ../build/codegen.image /usr/share/containers/systemd/
sudo /usr/libexec/podman/quadlet --dryrun #optional
sudo systemctl daemon-reload
sudo systemctl start codegen
```
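
The `make quadlet` step is essentially a placeholder substitution: the words `SERVERIMAGE`, `APPIMAGE` and `MODELIMAGE` in the quadlet templates are replaced with image names. A minimal sketch of that idea follows, assuming a throwaway template file; `render_quadlet` is a hypothetical helper, and unlike the Makefile it applies the `g` flag to all three patterns.

```bash
# Hypothetical helper mirroring the sed call in the Makefile's quadlet target:
# replace the three placeholders in a template and write the result to stdout.
# "|" is used as the sed delimiter because image names contain "/".
render_quadlet() {
    local template="$1" server="$2" app="$3" model="$4"
    sed -e "s|SERVERIMAGE|${server}|g" \
        -e "s|APPIMAGE|${app}|g" \
        -e "s|MODELIMAGE|${model}|g" \
        "$template"
}

# Demonstration on a throwaway template (the real files live in quadlet/).
tmpl="$(mktemp)"
printf 'image: SERVERIMAGE\nimage: APPIMAGE\nimage: MODELIMAGE\n' > "$tmpl"
render_quadlet "$tmpl" \
    quay.io/ai-lab/llamacpp-python:latest \
    quay.io/ai-lab/codegen:latest \
    quay.io/ai-lab/mistral-7b-instruct:latest
rm -f "$tmpl"
```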