Add bootc to codegen (#174)

Standardize on the name codegen, not code_generation. Fix errors in other natural_language recipes found while developing codegen bootc.

Signed-off-by: Daniel J Walsh <[email protected]>
containers · Apr 8, 2024 · ff31a46 · ff31a46
1 parent 0ae9386
commit ff31a46
Showing 14 changed files with 263 additions and 10 deletions.
diff --git a/recipes/natural_language_processing/code_generation/quadlet/README.md b/recipes/natural_language_processing/code_generation/quadlet/README.md
diff --git a/recipes/natural_language_processing/codegen/Makefile b/recipes/natural_language_processing/codegen/Makefile
@@ -0,0 +1,55 @@
+APP ?= codegen
+MODELIMAGE ?= quay.io/ai-lab/mistral-7b-instruct:latest
+APPIMAGE ?= quay.io/ai-lab/${APP}:latest
+SERVERIMAGE ?= quay.io/ai-lab/llamacpp-python:latest
+SSHPUBKEY ?= $(shell cat ${HOME}/.ssh/id_rsa.pub;)
+BOOTCIMAGE ?= quay.io/ai-lab/${APP}-bootc:latest
+FROM ?=
+
+.PHONY: build
+build:
+	podman build -f builds/Containerfile -t ${APPIMAGE} .
+
+.PHONY: bootc
+bootc:
+	podman build $${FROM:+--from $${FROM}} --cap-add SYS_ADMIN --build-arg "SSHPUBKEY=$(SSHPUBKEY)" -f bootc/Containerfile -t ${BOOTCIMAGE} .
+
+.PHONY: quadlet
+quadlet:
+	# Modify quadlet files to match the server, model and app image
+	mkdir -p build
+	sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
+	    -e "s|APPIMAGE|${APPIMAGE}|g" \
+	    -e "s|MODELIMAGE|${MODELIMAGE}|g" \
+	    quadlet/${APP}.image \
+	    > build/${APP}.image
+	sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
+	    -e "s|APPIMAGE|${APPIMAGE}|g" \
+	    -e "s|MODELIMAGE|${MODELIMAGE}|g" \
+	    quadlet/${APP}.yaml \
+	    > build/${APP}.yaml
+	cp quadlet/${APP}.kube build/${APP}.kube
+
+.PHONY: install
+install:
+	wget https://www.slimjetbrowser.com/chrome/files/103.0.5060.53/google-chrome-stable_current_amd64.deb
+	sudo dpkg -i google-chrome-stable_current_amd64.deb
+	wget https://chromedriver.storage.googleapis.com/103.0.5060.53/chromedriver_linux64.zip
+	unzip chromedriver_linux64.zip
+	pip install -r tests/requirements.txt
+
+.PHONY: run
+run: 
+	podman run -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://10.88.0.1:8001/v1 ghcr.io/ai-lab-recipes/${APP}
+
+.PHONY: functional_tests
+functional_tests:
+	python3 -m pytest -vvv --driver=Chrome --driver-path=./chromedriver tests/functional
+
+.PHONY: integration_tests
+integration_tests:
+	URL=${URL} python3 -m pytest -vvv --driver=Chrome --driver-path=./chromedriver tests/integration
+
+.PHONY: clean
+clean:
+	rm -rf build
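The `bootc` target relies on the shell's `${VAR:+word}` expansion, written `$${FROM:+--from $${FROM}}` in the Makefile so that make hands a literal `$` to the shell. The `--from` flag is emitted only when `FROM` is set and non-empty. A minimal sketch of that behaviour follows; `from_flag` is a hypothetical helper used only for illustration and is not part of the recipe.

```bash
#!/usr/bin/env bash
# from_flag is a hypothetical helper, not part of the recipe.
# It prints the optional "--from <image>" argument the same way the
# bootc target builds it: nothing when FROM is unset or empty.
from_flag() {
    local FROM="${1:-}"
    echo "${FROM:+--from ${FROM}}"
}

from_flag ""                                              # prints an empty line
from_flag "registry.redhat.io/rhel9-beta/rhel-bootc:9.4"  # prints the --from flag and image
```

With an empty `FROM`, `podman build` receives no `--from` argument and the `FROM` line in `bootc/Containerfile` is used unchanged.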
diff --git a/...uage_processing/code_generation/README.md → ...ral_language_processing/codegen/README.md b/...uage_processing/code_generation/README.md → ...ral_language_processing/codegen/README.md
@@ -20,6 +20,7 @@ In order to build this application we will need a model, a Model Service and an
 * [Build the AI Application](#build-the-ai-application)
 * [Deploy the AI Application](#deploy-the-ai-application)
 * [Interact with the AI Application](#interact-with-the-ai-application)
+* [Embed the AI Application in a Bootable Container Image](#embed-the-ai-application-in-a-bootable-container-image)
 
 ### Download a model
 
@@ -92,3 +93,50 @@ podman run --rm -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://10.88.0.1:8001
 Everything should now be up and running, with the chat application available at [`http://localhost:8501`](http://localhost:8501). With this starting point established, users should have an easier time customizing and building their own LLM-enabled code generation applications.
 
 _Note: Future recipes will demonstrate integration between locally hosted LLMs and developer productivity tools like VSCode._
+
+### Embed the AI Application in a Bootable Container Image
+
+To build a bootable container image that includes this sample code generation workload as a service that starts when a system is booted, `cd` into this folder
+and run:
+
+
+```
+make BOOTCIMAGE=quay.io/your/codegen-bootc:latest bootc
+```
+
+To replace the base image named in the `FROM` instruction of `bootc/Containerfile`, set the Makefile's `FROM` variable:
+
+```
+make FROM=registry.redhat.io/rhel9-beta/rhel-bootc:9.4 BOOTCIMAGE=quay.io/your/codegen-bootc:latest bootc
+```
+
+The magic happens when you have a bootc enabled system running. If you do, and you'd like to update the operating system to the OS you just built
+with the codegen application, it's as simple as ssh-ing into the bootc system and running:
+
+```
+bootc switch quay.io/your/codegen-bootc:latest
+```
+
+Upon a reboot, you'll see that the codegen service is running on the system.
+
+Check on the service with:
+
+```
+ssh user@bootc-system-ip
+sudo systemctl status codegen
+```
+
+#### What are bootable containers?
+
+What's a [bootable OCI container](https://containers.github.io/bootc/) and what's it got to do with AI?
+
+That's a good question! We think it's a good idea to embed AI workloads (or any workload!) into bootable images at _build time_ rather than
+at _runtime_. This extends the benefits, such as portability and predictability, that containerizing applications provides to the operating system.
+Bootable OCI images bake exactly what you need to run your workloads into the operating system at build time by using your favorite containerization
+tools. Might I suggest [podman](https://podman.io/)?
+
+Once installed, a bootc enabled system can be updated by providing an updated bootable OCI image from any OCI
+image registry with a single `bootc` command. This works especially well for fleets of devices that have fixed workloads - think
+factories or appliances. Who doesn't want to add a little AI to their appliance, am I right?
+
+Bootable images lend themselves to immutable operating systems, and the more immutable an operating system is, the less that can go wrong at runtime!
diff --git a/...ge_processing/code_generation/ai-lab.yaml → ...l_language_processing/codegen/ai-lab.yaml b/...ge_processing/code_generation/ai-lab.yaml → ...l_language_processing/codegen/ai-lab.yaml
diff --git a/recipes/natural_language_processing/codegen/bootc/Containerfile b/recipes/natural_language_processing/codegen/bootc/Containerfile
@@ -0,0 +1,57 @@
+# Example: an AI powered sample application is embedded as a systemd service
+# via Podman quadlet files in /usr/share/containers/systemd
+#
+# Use build command:
+# podman build --build-arg "SSHPUBKEY=$(cat $HOME/.ssh/id_rsa.pub)" -t quay.io/exampleos/myos .
+# The --build-arg "SSHPUBKEY=$(cat ~/.ssh/id_rsa.pub)" option inserts your
+# public key into the image, allowing root access via ssh.
+
+FROM quay.io/centos-bootc/centos-bootc:stream9
+ARG SSHPUBKEY
+
+RUN mkdir /usr/etc-system && \
+    echo 'AuthorizedKeysFile /usr/etc-system/%u.keys' >> /etc/ssh/sshd_config.d/30-auth-system.conf && \
+    echo ${SSHPUBKEY} > /usr/etc-system/root.keys && chmod 0600 /usr/etc-system/root.keys
+
+# pre-pull workload images:
+# Comment out the pull commands to keep the bootc image smaller.
+# If the images are not pre-pulled here, the quadlet .image file
+# (commented out below) pulls them on boot instead.
+
+ARG RECIPE=codegen
+ARG MODELIMAGE=quay.io/ai-lab/mistral-7b-instruct:latest
+ARG APPIMAGE=quay.io/ai-lab/${RECIPE}:latest
+ARG SERVERIMAGE=quay.io/ai-lab/llamacpp-python:latest
+
+# Add quadlet files to setup system to automatically run AI application on boot
+COPY build/${RECIPE}.kube build/${RECIPE}.yaml /usr/share/containers/systemd
+
+# Modify quadlet files to match the server, model and app image
+RUN sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
+    -e "s|APPIMAGE|${APPIMAGE}|g" \
+    -e "s|MODELIMAGE|${MODELIMAGE}|g" \
+    -i \
+    /usr/share/containers/systemd/${RECIPE}.yaml
+
+# Because images are prepulled, no need for .image quadlet
+# COPY build/${RECIPE}.image /usr/share/containers/systemd
+# RUN sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
+#    -e "s|APPIMAGE|${APPIMAGE}|g" \
+#    -e "s|MODELIMAGE|${MODELIMAGE}|g" \
+#    -i \
+#    /usr/share/containers/systemd/${RECIPE}.image
+
+# Setup /usr/lib/containers/storage as an additional store for images.
+# Remove once the base images have this set by default.
+RUN sed -i -e '/additionalimage.*/a "/usr/lib/containers/storage",' \
+        /etc/containers/storage.conf
+
+# Added for running as an OCI Container to prevent Overlay on Overlay issues.
+VOLUME /var/lib/containers
+
+# Prepull the model, model_server & application images to populate the system.
+RUN podman pull --root /usr/lib/containers/storage ${SERVERIMAGE}
+RUN podman pull --root /usr/lib/containers/storage ${APPIMAGE}
+RUN podman pull --root /usr/lib/containers/storage ${MODELIMAGE}
+
+RUN podman system reset --force 2>/dev/null
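The `sed` command that registers `/usr/lib/containers/storage` as an additional image store appends one line after the line matching `additionalimage`. A minimal sketch of its effect on a throwaway file, assuming GNU sed (the one-line `a text` form is a GNU extension) and a simplified stand-in for `storage.conf`; `add_image_store` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# add_image_store is a hypothetical helper, not part of the recipe.
# It edits the given file in place with the same sed expression as the
# Containerfile, appending the store path after the matching line.
add_image_store() {
    sed -i -e '/additionalimage.*/a "/usr/lib/containers/storage",' "$1"
}

conf="$(mktemp)"
# Simplified stand-in for the relevant part of storage.conf.
printf 'additionalimagestores = [\n]\n' > "${conf}"

add_image_store "${conf}"
cat "${conf}"   # the store path now sits between the two original lines
rm -f "${conf}"
```

The pattern matches on the lowercase key name, so the result depends on the base image's `storage.conf` containing such a line.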
diff --git a/recipes/natural_language_processing/codegen/bootc/README.md b/recipes/natural_language_processing/codegen/bootc/README.md
@@ -0,0 +1,94 @@
+## Embed workload (AI sample applications) in a bootable container image
+
+### Create a custom centos-bootc:stream9 image
+
+* [Containerfile](./Containerfile) - embeds an LLM-powered sample code generation application.
+
+Details on the application can be found [in the codegen/README.md](../README.md). By default, this Containerfile includes a model-server
+that is meant to run with CPU - no additional GPU drivers or toolkits are embedded. You can substitute the llamacpp_python model-server image
+for one that has GPU drivers and toolkits with additional build-args. The `FROM` must be replaced with a base image that has the necessary
+kernel drivers and toolkits if building for GPU enabled systems. For an example of an NVIDIA/CUDA base image,
+see [NVIDIA bootable image example](https://gitlab.com/bootc-org/examples/-/tree/main/nvidia?ref_type=heads)
+
+In order to pre-pull the workload images, you need to build from the same architecture you're building for.
+If not pre-pulling the workload images, you can cross build (i.e., build from a Mac for an x86_64 system).
+To build the derived bootc image for x86_64 architecture, run the following:
+
+```bash
+cd recipes/natural_language_processing/codegen
+
+# for CPU powered sample LLM application
+# to switch to an alternate platform like aarch64, pass --platform linux/arm64
+# the --cap-add SYS_ADMIN switch is needed when you are embedding Podman
+# commands within the container build. If the registry you are pulling images
+# from requires authentication, then you will need to volume mount the
+# auth.json file with SELinux separation disabled.
+podman login --auth-file auth.json quay.io/yourrepo
+podman build --build-arg "SSHPUBKEY=$(cat ~/.ssh/id_rsa.pub)" \
+           --security-opt label=disable \
+           -v ./auth.json:/run/containers/0/auth.json \
+           --cap-add SYS_ADMIN \
+           -f bootc/Containerfile \
+           -t quay.io/yourrepo/youros:tag .
+
+# for GPU powered sample LLM application with llamacpp cuda model server
+podman build --build-arg "SSHPUBKEY=$(cat ~/.ssh/id_rsa.pub)" \
+           --build-arg "SERVERIMAGE=quay.io/redhat-et/locallm-llamacpp-cuda-model-server:latest" \
+           --from <YOUR BOOTABLE IMAGE WITH NVIDIA/CUDA> \
+           --cap-add SYS_ADMIN \
+           --platform linux/amd64 \
+           -f bootc/Containerfile \
+           -t quay.io/yourrepo/youros:tag .
+
+podman push quay.io/yourrepo/youros:tag
+```
+
+### Update a bootc-enabled system with the new derived image
+
+To build a disk image from an OCI bootable image, you can refer to [bootc-org/examples](https://gitlab.com/bootc-org/examples).
+For this example, we will assume a bootc enabled system is already running.
+If already running a bootc-enabled OS, `bootc switch` can be used to update the system to target a new bootable OCI image with embedded workloads.
+
+SSH into the bootc-enabled system and run:
+
+```bash
+bootc switch quay.io/yourrepo/youros:tag
+```
+
+The necessary image layers will be downloaded from the OCI registry, and the system will prompt you to reboot into the new operating system.
+From this point, with any subsequent modifications and pushes to the `quay.io/yourrepo/youros:tag` OCI image, your OS can be updated with:
+
+```bash
+bootc upgrade
+```
+
+### Accessing the embedded workloads
+
+The codegen application can be accessed by visiting port `8501` of the running bootc system.
+It runs as a systemd service from the Podman quadlet files placed at `/usr/share/containers/systemd/` on the bootc system.
+For more information about running containerized applications as systemd services with Podman, refer to this
+[Podman quadlet post](https://www.redhat.com/sysadmin/quadlet-podman) or the [Podman documentation](https://podman.io/docs).
+
+To monitor the sample application, SSH into the bootc system and run:
+
+```bash
+systemctl status codegen
+```
+
+You can also view the pods and containers that are managed with systemd by running:
+
+```
+podman pod list
+podman ps -a
+```
+
+To stop the sample applications, SSH into the bootc system and run:
+
+```bash
+systemctl stop codegen
+```
+
+To run the sample application _not_ as a systemd service, stop the service, then
+run the command that matches the application you have embedded:
+
+```bash
+podman kube play /usr/share/containers/systemd/codegen.yaml
+```
diff --git a/...sing/code_generation/builds/Containerfile → ...e_processing/codegen/builds/Containerfile b/...sing/code_generation/builds/Containerfile → ...e_processing/codegen/builds/Containerfile
diff --git a/...g/code_generation/builds/requirements.txt → ...rocessing/codegen/builds/requirements.txt b/...g/code_generation/builds/requirements.txt → ...rocessing/codegen/builds/requirements.txt
diff --git a/...processing/code_generation/codegen-app.py → ...anguage_processing/codegen/codegen-app.py b/...processing/code_generation/codegen-app.py → ...anguage_processing/codegen/codegen-app.py
diff --git a/...ode_generation/llms-vscode-integration.md → ...essing/codegen/llms-vscode-integration.md b/...ode_generation/llms-vscode-integration.md → ...essing/codegen/llms-vscode-integration.md
diff --git a/recipes/natural_language_processing/codegen/quadlet/README.md b/recipes/natural_language_processing/codegen/quadlet/README.md
@@ -0,0 +1,9 @@
+### Run code-generation as a systemd service
+
+```bash
+(cd ../;make quadlet)
+sudo cp ../build/codegen.yaml ../build/codegen.kube ../build/codegen.image /usr/share/containers/systemd/
+sudo /usr/libexec/podman/quadlet --dryrun #optional
+sudo systemctl daemon-reload
+sudo systemctl start codegen
+```
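`make quadlet` fills in the `SERVERIMAGE`, `APPIMAGE` and `MODELIMAGE` placeholders in the quadlet templates before the files are copied into place. The substitution can be tried on its own. A minimal sketch, assuming a made-up template fragment (the real `codegen.yaml` is not shown in this diff); `render_quadlet` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# render_quadlet is a hypothetical helper that applies the same three
# substitutions as the quadlet target: each placeholder name is replaced
# by the corresponding image reference. "|" is the sed delimiter because
# image references contain "/".
render_quadlet() {
    sed -e "s|SERVERIMAGE|${SERVERIMAGE}|" \
        -e "s|APPIMAGE|${APPIMAGE}|g" \
        -e "s|MODELIMAGE|${MODELIMAGE}|g"
}

# Defaults copied from the Makefile, with APP set to codegen.
SERVERIMAGE=quay.io/ai-lab/llamacpp-python:latest
APPIMAGE=quay.io/ai-lab/codegen:latest
MODELIMAGE=quay.io/ai-lab/mistral-7b-instruct:latest

# Made-up template fragment, for illustration only.
printf 'image: SERVERIMAGE\nimage: APPIMAGE\n' | render_quadlet
```

Note that the `SERVERIMAGE` expression has no `g` flag, so only the first occurrence on each line is replaced.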
diff --git a/...ing/code_generation/quadlet/codegen.image → ..._processing/codegen/quadlet/codegen.image b/...ing/code_generation/quadlet/codegen.image → ..._processing/codegen/quadlet/codegen.image
diff --git a/...e_generation/quadlet/codegen.kube.example → ...e_processing/codegen/quadlet/codegen.kube b/...e_generation/quadlet/codegen.kube.example → ...e_processing/codegen/quadlet/codegen.kube
diff --git a/...sing/code_generation/quadlet/codegen.yaml → ...e_processing/codegen/quadlet/codegen.yaml b/...sing/code_generation/quadlet/codegen.yaml → ...e_processing/codegen/quadlet/codegen.yaml