From 46604b75b0e4ce2b569c8f9c2ec4e6afdbfbc75f Mon Sep 17 00:00:00 2001
From: Colin Walters <walters@verbum.org>
Date: Sun, 24 Mar 2024 09:40:49 -0400
Subject: [PATCH] docs: Add a new "build guidance" section

I originally was thinking these docs needed to live in
downstream places but...it will be really helpful
to us to have generic recommended guidance here.

Signed-off-by: Colin Walters <walters@verbum.org>
---
 docs/src/SUMMARY.md        |   4 +
 docs/src/build-guidance.md | 153 +++++++++++++++++++++++++++++++++++++
 2 files changed, 157 insertions(+)
 create mode 100644 docs/src/build-guidance.md

diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md
index cf2b5e989..64e57424b 100644
--- a/docs/src/SUMMARY.md
+++ b/docs/src/SUMMARY.md
@@ -6,6 +6,10 @@
 
 - [Installation](installation.md)
 
+# Building images
+
+- [Building images](build-guidance.md)
+
 # Using bootc
 
 - [Upgrade and rollback](upgrades.md)
diff --git a/docs/src/build-guidance.md b/docs/src/build-guidance.md
new file mode 100644
index 000000000..da8ad9947
--- /dev/null
+++ b/docs/src/build-guidance.md
@@ -0,0 +1,153 @@
+# Generic guidance for building images
+
+The bootc project aims to be as operating system and distribution independent as possible,
+similar to its related projects such as [podman](http://podman.io/) and
+[systemd](https://systemd.io/).
+
+The recommendations for creating bootc-compatible images will in general need to
+be owned by the OS/distribution - in particular the ones who create the default
+bootc base image(s). However, some guidance is very generic to most Linux
+systems (and bootc only supports Linux).
+
+However, let's restate a base goal of this project:
+
+> The original Docker container model of using "layers" to model
+> applications has been extremely successful.  This project
+> aims to apply the same technique for bootable host systems - using
+> standard OCI/Docker containers as a transport and delivery format
+> for base operating system updates.
+
+Every tool and technique for creating application base images
+should apply to the host Linux OS as much as possible.
+
+## Installing software
+
+For package management tools like `apt`, `dnf`, `zypper` etc.
+(generically, `$pkgsystem`) it is very much expected that
+the pattern of
+
+`RUN $pkgsystem install somepackage && $pkgsystem clean all`
+
+type flow Just Works here - the same way as it does for
+"application" container images.  This pattern is really how
+Docker got started.
+
+There's not much special to this that doesn't also apply
+to application containers; but see below.
+
+## systemd units
+
+The model that is most popular in the Docker/OCI world
+is "microservice" style containers with the application as
+pid 1, isolating the applications from each other and
+from the host system - as opposed to "system containers"
+which run an init system like systemd, typically also
+SSH and often multiple logical "application" components
+as part of the same container.
+
+The bootc project generally expects systemd as pid 1,
+and if you embed software in your derived image, the
+default would then be that that software is initially
+launched via a systemd unit.
+
+```dockerfile
+RUN dnf -y install postgresql
+```
+
+A package installed this way would typically also carry a systemd
+unit, and that service will be launched the same way as it would
+on a package-based system.
+
+## Users and groups
+
+This is one of the more complex topics. Generally speaking, bootc has nothing to
+do directly with configuring users or groups; it is a generic OS
+update/configuration mechanism. (There is currently just one small exception in
+that `bootc install` has a special case `--root-ssh-authorized-keys` argument,
+but it's very much optional.)
+
+### Generic base images
+
+Commonly OS/distribution base images will be generic, i.e.
+without any configuration.  It is *strongly recommended*
+to avoid hardcoded passwords and SSH keys with publicly available
+private keys (as Vagrant does) in generic images.
+
+#### Injecting SSH keys via systemd credentials
+
+The systemd project has documentation for [credentials](https://systemd.io/CREDENTIALS/)
+which can be used in some environments to inject a root
+password or SSH authorized_keys.  For many cases, this
+is a best practice.
+
+#### Injecting users and SSH keys via cloud-init, etc.
+
+Many IaaS and virtualization systems are oriented towards a "metadata server"
+(see e.g. [AWS instance metadata](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html))
+whose data is commonly processed by software such as [cloud-init](https://cloud-init.io/)
+or equivalent.
+
+The base image you're using may include such software, or you
+can install it in your own derived images.
+
+#### Adding users and credentials in the image
+
+Relative to package-oriented systems, a new ability is to inject
+users and credentials as part of a derived build:
+
+```dockerfile
+RUN useradd someuser
+```
+
+However, it is important to understand some issues with this:
+
+- Typically user/group IDs are allocated dynamically, and this can result in "drift" (see below)
+- For systems configured with persistent `/home` → `/var/home`, any changes to `/var` made
+  in the container image after initial installation will not be applied on subsequent updates.
+
+##### Using systemd-sysusers
+
+See [systemd-sysusers](https://www.freedesktop.org/software/systemd/man/latest/systemd-sysusers.html).
+
+A key aspect of how this works is that `sysusers` will make changes
+to the traditional `/etc/passwd` file - meaning state is machine local,
+and there can be hysteresis.
+
+##### Using systemd JSON user records
+
+See [JSON user records](https://systemd.io/USER_RECORD/).  Unlike `sysusers`,
+the canonical state for these lives in `/usr`, avoiding potential drift/hysteresis
+across updates and ensuring immutability.
+
+### Machine-local state for users
+
+At this point, it is important to understand the [filesystem](filesystem.md)
+layout - the default is up to the base image.
+
+The default Linux concept of a user has data stored in both `/etc` (`/etc/passwd`, `/etc/shadow` and groups)
+and `/home`.  The choice for how these work is up to the base image, but
+a common default for generic base images is to have both be machine-local persistent state.
+In this model `/home` would be a symlink to `/var/home`.
+
+But it is also valid to default to having e.g. `/home` be a `tmpfs`
+to ensure user data is cleaned up across reboots (this pairs particularly
+well with a transient `/etc`).
+
+#### Injecting users and SSH keys at system provisioning time
+
+For base images where `/etc` and `/var` are configured to persist by default, it
+will then be generally supported to inject users via "installers" such
+as [Anaconda](https://github.com/rhinstaller/anaconda/) (interactively or
+via kickstart) or any others.
+
+Typically, generic installers such as this are designed for "one time bootstrap",
+and again the configuration then becomes mutable machine-local state
+that can be changed "day 2" via some other mechanism.
+
+The simple case is a user with a password - typically the installer helps
+set the initial password, but to change it there is a different in-system
+tool (such as `passwd` or a GUI as part of [Cockpit](https://cockpit-project.org/), GNOME/KDE/etc).
+
+It is intended that these flows work equivalently in a bootc-compatible
+system, to support users directly installing "generic" base images, without
+requiring changes to the tools above.