diff --git a/README.md b/README.md
index 9efce2e74..a444f754f 100644
--- a/README.md
+++ b/README.md
@@ -19,7 +19,7 @@ However, each sample application can be paired with a variety of model servers.
 
 Learn how to build and run the llamacpp_python model server by following the [llamacpp_python model server README.](/model_servers/llamacpp_python/README.md).
 
-## Current Recipes:
+## Current Recipes
 
 There are several sample applications in this repository. They live in the [recipes](./recipes) folder. They fall under the categories:
@@ -37,6 +37,23 @@ Learn how to build and run each application by visiting each of the categories a
 the [chatbot recipe](./recipes/natural_language_processing/chatbot).
 
+## Bootable Containers
+
+In each sample application, you'll find an `embed-in-bootable-image` folder.
+What's a [bootable OCI container](https://containers.github.io/bootc/) and what's it got to do with AI?
+
+That's a good question! We think it's a good idea to embed AI workloads (or any workload!) into bootable images at _build time_ rather than
+at _runtime_. This takes the benefits of containerizing applications (portability and predictability, for example) a step further.
+Bootable OCI images bake exactly what you need to run your workloads into the image at build time, using your favorite containerization tools. Might I suggest
+[podman](https://podman.io/)? Once installed, a bootc-enabled system can be updated by providing an updated bootable OCI image from any OCI
+image registry with a single `bootc` command. This works especially well for fleets of devices that have fixed workloads - think
+factories or appliances. Who doesn't want to add a little AI to their appliance, am I right?
+Bootable images lend themselves to immutable operating systems, and the more immutable an operating system is, the less that can go wrong at runtime!
+
+That's why each sample application includes an `embed-in-bootable-image` folder: it provides a `Containerfile`
+that defines how to embed that application into a bootable OCI image.
+
+Check out the [chatbot example](./recipes/natural_language_processing/chatbot/embed-in-bootable-image)!
+
 ## Current Locallm Images built from this repository
 
 Images for many sample applications and models are available in `quay.io`. All currently built images are tracked in
diff --git a/recipes/natural_language_processing/chatbot/embed-in-bootable-image/Containerfile b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/Containerfile
new file mode 100644
index 000000000..3743a2d6d
--- /dev/null
+++ b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/Containerfile
@@ -0,0 +1,27 @@
+# In this example, an AI-powered sample application is embedded as a systemd service
+# by placing podman quadlet files in /usr/share/containers/systemd
+
+FROM quay.io/centos-bootc/centos-bootc:stream9
+# Build like this:
+# podman build --build-arg "sshpubkey=$(cat ~/.ssh/mykey.pub)" -t quay.io/exampleos/example-image .
+# Substitute YOUR public key below; the holder of the matching private key will have root access.
+ARG sshpubkey
+ARG MODEL_SERVER_IMAGE=quay.io/redhat-et/locallm-model-service:latest
+
+RUN mkdir /usr/etc-system && \
+    echo 'AuthorizedKeysFile /usr/etc-system/%u.keys' >> /etc/ssh/sshd_config.d/30-auth-system.conf && \
+    echo $sshpubkey > /usr/etc-system/root.keys && chmod 0600 /usr/etc-system/root.keys
+
+RUN dnf install -y vim && dnf clean all
+
+# Chatbot application
+COPY quadlet/chatbot.kube.example /usr/share/containers/systemd/chatbot.kube
+COPY quadlet/chatbot.yaml /usr/share/containers/systemd/chatbot.yaml
+COPY quadlet/chatbot.image /usr/share/containers/systemd/chatbot.image
+
+# Pre-load the workload images.
+# Comment out the pull commands below to keep the bootc image smaller.
+# With the quadlet .image file above, these will be pulled on boot if not pre-loaded here.
+RUN podman pull quay.io/redhat-et/locallm-mistral-7b-gguf:latest
+RUN podman pull quay.io/redhat-et/locallm-chatbot:latest
+RUN podman pull $MODEL_SERVER_IMAGE
diff --git a/recipes/natural_language_processing/chatbot/embed-in-bootable-image/README.md b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/README.md
new file mode 100644
index 000000000..1b80ebd01
--- /dev/null
+++ b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/README.md
@@ -0,0 +1,88 @@
+## Embed workload (AI sample applications) in a bootable container image
+
+### Create a custom centos-bootc:stream9 image
+
+* [Containerfile](./Containerfile) - embeds an LLM-powered sample chat application.
+
+Details on the application can be found [in the chatbot/README.md](../README.md). By default, this Containerfile includes a model server
+that is meant to run on CPU - no additional GPU drivers or toolkits are embedded. You can substitute the llamacpp_python model server image
+for one that has GPU drivers and toolkits with additional build args. If building for GPU-enabled systems, the `FROM` must be replaced with
+a base image that has the necessary kernel drivers and toolkits. For an example of an NVIDIA/CUDA base image,
+see the [NVIDIA bootable image example](https://gitlab.com/bootc-org/examples/-/tree/main/nvidia?ref_type=heads).
+
+In order to pre-pull the workload images, you need to build on the same architecture you're building for.
+If not pre-pulling the workload images, you can cross-build (i.e., build from a Mac for an x86_64 system).
+To build the derived bootc image for the x86_64 architecture, run the following:
+
+```bash
+cd recipes/natural_language_processing/chatbot/embed-in-bootable-image
+
+# for the CPU-powered sample LLM application
+# to target aarch64 instead, pass --platform linux/arm64
+podman build --build-arg "sshpubkey=$(cat ~/.ssh/id_rsa.pub)" \
+  --cap-add SYS_ADMIN \
+  --platform linux/amd64 \
+  -t quay.io/yourrepo/youros:tag .
+
+# for the GPU-powered sample LLM application with the llamacpp cuda model server
+podman build --build-arg "sshpubkey=$(cat ~/.ssh/id_rsa.pub)" \
+  --build-arg "MODEL_SERVER_IMAGE=quay.io/redhat-et/locallm-llamacpp-cuda-model-server:latest" \
+  --from <gpu-enabled-bootc-base-image> \
+  --cap-add SYS_ADMIN \
+  --platform linux/amd64 \
+  -t quay.io/yourrepo/youros:tag .
+
+podman push quay.io/yourrepo/youros:tag
+```
+
+### Update a bootc-enabled system with the new derived image
+
+To build a disk image from an OCI bootable image, you can refer to [bootc-org/examples](https://gitlab.com/bootc-org/examples).
+For this example, we will assume a bootc-enabled system is already running.
+If you are already running a bootc-enabled OS, `bootc switch` can be used to update the system to target a new bootable OCI image with embedded workloads.
+
+SSH into the bootc-enabled system and run:
+
+```bash
+bootc switch quay.io/yourrepo/youros:tag
+```
+
+The necessary image layers will be downloaded from the OCI registry, and the system will prompt you to reboot into the new operating system.
+From this point on, after any subsequent modifications and pushes to the `quay.io/yourrepo/youros:tag` OCI image, your OS can be updated with:
+
+```bash
+bootc upgrade
+```
+
+### Accessing the embedded workloads
+
+The chatbot can be accessed by visiting port `8150` of the running bootc system.
+The workloads will be running as systemd services from podman quadlet files placed at `/usr/share/containers/systemd/` on the bootc system.
+For more information about running containerized applications as systemd services with podman, refer to this
+[podman quadlet post](https://www.redhat.com/sysadmin/quadlet-podman) or the [podman documentation](https://podman.io/docs).
+
+To monitor the sample application, SSH into the bootc system and run:
+
+```bash
+systemctl status chatbot
+```
+
+You can also view the pods and containers that are managed with systemd by running:
+
+```bash
+podman pod list
+podman ps -a
+```
+
+To stop the sample application, SSH into the bootc system and run:
+
+```bash
+systemctl stop chatbot
+```
+
+To run the sample application _not_ as a systemd service, stop the service, then
+run the appropriate commands based on the application you have embedded:
+
+```bash
+podman kube play /usr/share/containers/systemd/chatbot.yaml
+```
diff --git a/recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/README.md b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/README.md
new file mode 100644
index 000000000..8578ec6ea
--- /dev/null
+++ b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/README.md
@@ -0,0 +1,10 @@
+### Run chatbot as a systemd service
+
+```bash
+cp chatbot.yaml /usr/share/containers/systemd/chatbot.yaml
+cp chatbot.kube.example /usr/share/containers/systemd/chatbot.kube
+cp chatbot.image /usr/share/containers/systemd/chatbot.image
+# optional: verify the generated systemd units
+/usr/libexec/podman/quadlet --dryrun
+systemctl daemon-reload
+systemctl start chatbot
+```
diff --git a/recipes/natural_language_processing/chatbot/quadlet/chatbot.image b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/chatbot.image
similarity index 100%
rename from recipes/natural_language_processing/chatbot/quadlet/chatbot.image
rename to recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/chatbot.image
diff --git a/recipes/natural_language_processing/chatbot/quadlet/chatbot.kube.example b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/chatbot.kube.example
similarity index 100%
rename from recipes/natural_language_processing/chatbot/quadlet/chatbot.kube.example
rename to recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/chatbot.kube.example
diff --git a/recipes/natural_language_processing/chatbot/quadlet/chatbot.yaml b/recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/chatbot.yaml
similarity index 100%
rename from recipes/natural_language_processing/chatbot/quadlet/chatbot.yaml
rename to recipes/natural_language_processing/chatbot/embed-in-bootable-image/quadlet/chatbot.yaml
diff --git a/recipes/natural_language_processing/chatbot/quadlet/README.md b/recipes/natural_language_processing/chatbot/quadlet/README.md
deleted file mode 100644
index 3edb09902..000000000
--- a/recipes/natural_language_processing/chatbot/quadlet/README.md
+++ /dev/null
@@ -1,10 +0,0 @@
-### Run chatbot-langchain as a systemd service
-
-```bash
-cp chatbot.yaml /etc/containers/systemd/chatbot.yaml
-cp chatbot.kube.example /etc/containers/chatbot.kube
-cp chatbot.image /etc/containers/chatbot.image
-/usr/libexec/podman/quadlet --dryrun (optional)
-systemctl daemon-reload
-systemctl start chatbot
-```
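
The quadlet files this change copies into `/usr/share/containers/systemd/` follow podman's quadlet unit format, which the quadlet generator turns into regular systemd services at boot. As a rough sketch only (the actual contents live in the renamed `chatbot.kube.example` and `chatbot.image` files, which this diff does not show), a `.kube` unit pointing at the chatbot's Kubernetes YAML could look something like this:

```ini
# chatbot.kube - a podman quadlet unit; at boot, podman's systemd
# generator turns this file into a chatbot.service unit
[Unit]
Description=Chatbot sample application

[Kube]
# Kubernetes YAML describing the chatbot pod
Yaml=/usr/share/containers/systemd/chatbot.yaml

[Install]
# start the workload automatically on boot
WantedBy=multi-user.target default.target
```

After `systemctl daemon-reload`, the generated `chatbot.service` can be managed with the usual `systemctl status`/`systemctl stop` commands.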