-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BIOS/Firmware concept #117
base: main
Are you sure you want to change the base?
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,381 @@ | ||
--- | ||
title: Server BIOS/Firmware update | ||
oep-number: 1 | ||
creation-date: 2024-09-02 | ||
status: under review | ||
authors: | ||
- "@aobort" | ||
reviewers: | ||
- | ||
--- | ||
|
||
# OEP-0001: Server BIOS/Firmware update | ||
|
||
## Table of Contents | ||
|
||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [Custom resources](#custom-resources) | ||
- [ServerFirmware](#serverfirmware) | ||
- [ServerFirmwareGroup](#serverfirmwaregroup) | ||
- [DiscoveredFirmware](#discoveredfirmware) | ||
- [Firmware operator](#firmware-operator) | ||
- [server-controller](#server-controller) | ||
- [server-group-controller](#server-group-controller) | ||
- [discovery-controller](#discovery-controller) | ||
- [Update service](#update-service) | ||
- [API server](#api-server) | ||
- [Scheduler](#scheduler) | ||
- [Job runner](#job-runner) | ||
- [Update/scan/discovery job](#updatescandiscovery-job) | ||
- [Request handling](#request-handling) | ||
- [Alternatives](#alternatives) | ||
|
||
## Summary | ||
|
||
Linked issue: [#99 BIOS/Firmware Update](https://github.com/ironcore-dev/metal-operator/issues/99) | ||
|
||
The following is a concept of a solution aimed to solve listed problems in regard to hardware servers' BIOS/Firmware updates. | ||
The following sections guide through: | ||
|
||
- Kubernetes API types, which represent servers' firmware state; | ||
- Kubernetes operator, which reconciles these API types; | ||
- Dedicated service, which provides API to schedule and execute firmware updates on specified servers; | ||
- Communication between operator and update service; | ||
|
||
Throughout this document, the words are used to define and the significance of particular requirements is capitalized: | ||
|
||
- `MUST` or `REQUIRED` means that the item is mandatory requirement; | ||
- `MUST NOT` means that the item is an absolute prohibition; | ||
- `SHOULD` or `RECOMMENDED` means that there may exist valid reasons in particular circumstances for not complying with an item; | ||
- `SHOULD NOT` means that there may exist valid reasons in particular circumstances when listed behavior is acceptable; | ||
- `MAY` or `OPTIONAL` means that the item is truly optional; | ||
|
||
Throughout this document, the following terminology is used: | ||
|
||
- `firmware operator`: the application running as a workload in Kubernetes cluster, interacting with Kubernetes API. It reconciles custom resources (hereafter CR) related to servers' firmware update workflow; | ||
- `update service`: the application running as a workload in Kubernetes cluster, providing API to schedule update jobs, execute these jobs, collect jobs' results and update corresponding Kubernetes objects; | ||
- `update job`: the execution item, which runs concrete implementation of the BIOS/firmware update routine on target hardware server; | ||
- `scan job`: the execution item, which runs concrete implementation for scanning of the firmware installed on target hardware server; | ||
- `discovery job`: discovering of the BIOS/firmware versions available for download or installation. Vendor-specific as an `update job`; | ||
- `update strategy`: the path chosen to apply updates, e.g.: pre-built boot image with updates, docker image with baked updates, vendor-specific CLI tool, etc. | ||
|
||
## Motivation | ||
|
||
It is necessary to provide a robust, reliable and scalable solution to automate servers' firmware updating process. | ||
Aside from that, it SHOULD also be as much kubernetes-native as possible. | ||
It SHOULD provide a clear and concise API. | ||
It SHOULD provide the ability to automate the update process along with the ability to override common settings in particular circumstances for particular servers. | ||
|
||
### Goals | ||
|
||
The following list gives general design goals for BIOS/Firmware updates: | ||
|
||
- the solution SHOULD be vendor-agnostic aside from concrete update job implementation; | ||
- the solution SHOULD allow automated hardware servers' firmware lifecycle maintaining; | ||
- the solution MUST be extensible by the possibility of using plugins for update strategy; | ||
- the solution MUST be extensible by the possibility of adding vendor-specific update jobs; | ||
- the solution SHOULD be as kubernetes-native as possible; | ||
|
||
### Non-Goals | ||
|
||
## Proposal | ||
|
||
### Custom resources | ||
|
||
The following CRs aimed to represent the current state of a particular server, a desired state of a group of servers and available firmware versions for a particular manufacturer-model: | ||
|
||
- [ServerFirmware](#serverfirmware) | ||
- [ServerFirmwareGroup](#serverfirmwaregroup) | ||
- [DiscoveredFirmware](#discoveredfirmware) | ||
|
||
All the following CRs MUST be cluster-scoped. | ||
The firmware versions defined for concrete `ServerFirmware` object MUST take precedence over those provided by corresponding `ServerFirmwareGroup` object. | ||
|
||
#### ServerFirmware | ||
|
||
`ServerFirmware` CR represents the desired state of concrete hardware server. | ||
The `.spec` of this type references the `Server` object, reflects its `.status.bios` field and contains the list of firmwares desired to be installed. | ||
The `.status` of this type contains information about the BIOS/firmware versions which are actually installed on the server. | ||
Aside from that `.spec` contains the scan threshold and the `.status` contains last scan operation timestamp. | ||
These two fields allow calculating whether the scanning for installed firmware is required or not. | ||
The `ServerFirmware` object SHOULD be created along with corresponding `Server` object and MUST be unique across the cluster. | ||
|
||
```yaml | ||
apiVersion: metal.ironcore.dev/v1alpha1 | ||
kind: ServerFirmware | ||
metadata: | ||
name: foo | ||
spec: | ||
scanThreshold: 30m | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is by There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, naming could be better, for sure |
||
serverRef: | ||
name: foo | ||
bios: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we maybe want to incorporate a vendor/manufacturer in this struct as well? I know the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since update service will anyways request for server object - at least to get related bmc for access type and credentials, I think it's not necessary to store manufacturer and model in this resource. But I have no strong opinion on that, since it would not affect anything. |
||
version: 1.0.0 | ||
firmwares: | ||
- name: ssd | ||
manufacturer: ACME Corp. | ||
version: 1.0.0 | ||
- name: nic | ||
manufacturer: Intel | ||
version: 2.0.0 | ||
status: | ||
lastScanTime: 01-01-2001 01:00:00 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we use a transition condition for this instead of having a dedicated status field? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The idea was not to make scans conditional, but to make sure that the scan will be launched if previous run was far ago enough. Hence if we'll rely on the condition's transition timestamp there will be no difference comparing with dedicated field. However this will cause the need of some additional computation of the server's state: by proposed design the update of the status will be done by update server after scan job reports it's results. Therefore, to set proper condition the update server will have to know also the desired state and to compute difference between firmware discovered by the scan job and desired firmware defined in object's spec, instead of just updating status with timestamp and firmware. |
||
bios: | ||
version: 1.0.0 | ||
firmwares: | ||
- name: ssd | ||
manufacturer: ACME Corp. | ||
version: 1.0.0 | ||
- name: nic | ||
manufacturer: Intel | ||
version: 2.0.0 | ||
``` | ||
|
||
#### ServerFirmwareGroup | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we really need a grouping at this point already? Maybe we should start with the Firmware/BIOS version handling on an individual There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, sure. |
||
|
||
`ServerFirmwareGroup` CR represents the groups of hardware servers and their desired firmware versions. | ||
The group of servers MUST be manufacturer- and model-specific to ensure that defined firmware will be applicable. | ||
The `.spec` of this type contains built-in label selector and the list of firmwares desired to be installed. | ||
The `.status` of this type contains the information about number of servers within the defined group which are in desired state. | ||
The group MUST have a unique selector and selectors in different `ServerFirmwareGroup` objects MUST NOT intersect. | ||
|
||
```yaml | ||
apiVersion: metal.ironcore.dev/v1alpha1 | ||
kind: ServerFirmwareGroup | ||
metadata: | ||
name: bar-group | ||
spec: | ||
manufacturer: Lenovo | ||
model: 7x21 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How are Servers identified by model? The current api only mentions the manufacturer. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately, I do not have answer. Regarding Lenovo, we can get server's model from SKU. What about other manufacturers - no idea yet. Maybe some API exists from which it could be retrieved. |
||
serverSelector: | ||
matchLabels: | ||
env: prod | ||
firmwares: | ||
- name: ssd | ||
manufacturer: ACME Corp. | ||
version: 1.0.0 | ||
- name: nic | ||
manufacturer: Intel | ||
version: 2.0.0 | ||
status: | ||
serversInGroup: 4 | ||
updatesApplied: 3 | ||
updatesNotApplied: 1 | ||
``` | ||
|
||
#### DiscoveredFirmware | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we really want to automatically "discover" new versions of a Firmware? Or should it be better instructed from the outside: Like I know my There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I bet we want. At least ops should be able to track new versions and for instance check if there are anything related to critical security issues. |
||
|
||
`DiscoveredFirmware` CR represents discovered firmware versions for a specific manufacturer-model. | ||
The `.spec` of this type contains information about manufacture, concrete model,the desired number of versions to store and the interval between each firmware version discovery run. | ||
The `.status` of this type contains the list of firmwares and the last discovery job run time. | ||
Each entry represents the name of individual firmware and the list of available versions. | ||
The maximum length of this list MUST NOT exceed one defined in `.spec`. | ||
The `DiscoveredFirmware` object SHOULD be created as soon as a new manufacturer-model pair is found and MUST be unique across the cluster. | ||
|
||
```yaml | ||
apiVersion: metal.ironcore.dev/v1alpha1 | ||
kind: DiscoveredFirmware | ||
metadata: | ||
name: baz | ||
spec: | ||
manufacturer: Lenovo | ||
model: 7x21 | ||
discoveryInterval: 24h | ||
versionsToStore: 3 | ||
status: | ||
lastDiscoveryTime: 01-01-2001 01:00:00 | ||
bios: | ||
versions: [1.0.0] | ||
firmwares: | ||
- name: ssd | ||
manufacturer: ACME Corp. | ||
version: [1.0.0, 1.1.0, 1.2.0] | ||
- name: nic | ||
manufacturer: Intel | ||
version: [1.5.0, 1.7.0, 2.0.0] | ||
``` | ||
|
||
### Firmware operator | ||
|
||
This is an application that watches and reconciles CRs listed in the previous section. | ||
It consists of the following controllers: | ||
|
||
- [server-controller](#server-controller) (reconciles `ServerFirmware` CR) | ||
- [server-group-controller](#server-group-controller) (reconciles `ServerFirmwareGroup` CR) | ||
- [discovery-controller](#discovery-controller) (reconciles `DiscoveredFirmware` CR) | ||
|
||
The `server-controller` and `discovery-controller` interacts with update service, whilst `server-group-controller` only updates CRs. | ||
|
||
#### server-controller | ||
|
||
This controller reconciles `ServerFirmware` CR. | ||
When an object of this kind is being reconciled, the controller MUST send a scan request to the update service in case `.status.lastScanTime` exceeds the `.spec.scanThreshold`. | ||
After the object becomes updated with actually installed firmware versions, the controller computes the difference between desired state defined in object's `.spec` and actual state reflected in object's `.status`. | ||
If there is discrepancy between these two states, then `server-controller` MUST send an update request to the update service. | ||
After sending any of the mentioned requests, it MUST stop reconciliation by returning an empty result and nil error in case the request was successful and an error otherwise. | ||
|
||
```mermaid | ||
stateDiagram-v2 | ||
s1: server-controller | ||
s2: last scan within threshold? | ||
s3: scan request | ||
s4: update required? | ||
s5: update request | ||
s6: update service | ||
|
||
[*] --> s1: reconciliation request | ||
s1 --> s2: get object and check conditions | ||
s2 --> s3: No | ||
s2 --> s4: Yes | ||
s3 --> s6 | ||
s6 --> [*] | ||
s4 --> s5: Yes | ||
s5 --> s6 | ||
s4 --> [*]: No | ||
``` | ||
|
||
#### server-group-controller | ||
|
||
This controller reconciles `ServerFirmwareGroup` CR. | ||
When an object of this kind is being reconciled, the controller: | ||
|
||
1. MUST discover all `ServerFirmware` objects that matches the defined label selector; | ||
2. for each discovered object it MUST merge `.spec.firmwares` considering that items defined in `ServerFirmware` object's spec take precedence over those defined in `ServerFirmwareGroup` object's spec; | ||
3. MUST update `ServerFirmware` object's `.spec.firmwares` field with a resulting list of firmware versions; | ||
4. SHOULD update object's `.status` with actual values; | ||
|
||
```mermaid | ||
stateDiagram-v2 | ||
s1: server-group-controller | ||
s2: get matching Server objects | ||
s3: for each Server object | ||
s4: merge firmwares | ||
s5: update Server object | ||
s6: update ServerFirmwareGroup object status | ||
|
||
[*] --> s1: reconciliation request | ||
s1 --> s2: get object | ||
s2 --> s3 | ||
s3 --> s4 | ||
s4 --> s5 | ||
s5 --> s3 | ||
s3 --> s6: all matching servers processed | ||
s6 --> [*] | ||
``` | ||
|
||
#### discovery-controller | ||
|
||
This controller reconciles `DiscoveredFirmware` CR. | ||
It MUST send discovery request to the update service to enqueue a firmware discovery job if `.status.lastDiscoveryTime` is older than `.spec.discoveryInterval`. | ||
After sending the request, it MUST stop reconciliation by returning an empty result and nil error in case the request was successful and an error otherwise. | ||
|
||
```mermaid | ||
stateDiagram-v2 | ||
s1: discovery-controller | ||
s2: last discovery older than defined interval? | ||
s3: discovery request | ||
s4: update service | ||
s5: requeue after t | ||
|
||
[*] --> s1: reconciliation request | ||
s1 --> s2: get object and check conditions | ||
s2 --> s5: No | ||
s2 --> s3: Yes | ||
s3 --> s4 | ||
s4 --> [*] | ||
s5 --> [*] | ||
``` | ||
|
||
### Update service | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the update service stateful? If yes, it's state should likely be in CRDs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's supposed to be stateless. |
||
|
||
This is an application providing an API to schedule, execute and collect results of firmware update, discover and scan jobs. | ||
It consists of the following components: | ||
|
||
- [API server](#api-server) | ||
- [scheduler](#scheduler) | ||
- [job runner](#job-runner) | ||
- [update/scan/discovery jobs](#updatescandiscovery-job) (concrete implementations) | ||
|
||
Update service MUST be extensible in part of the possibility of using plugins for update strategy. | ||
At the same time, the only update strategy can be enabled for a particular update service instance. | ||
Update service MAY also include the component to download and store the firmwares. | ||
|
||
#### API server | ||
|
||
API server exposes update service endpoints and forwards incoming requests to the scheduler. | ||
It MUST expose the following endpoints: | ||
|
||
- Update(UpdateRequest) UpdateResponse; | ||
- `UpdateRequest` MUST contain the reference to concrete `Server` object and the list of the firmware-version to be installed. | ||
- `UpdateResponse` MUST contain the status of the request with error code if any. | ||
- Scan(ScanRequest) ScanResponse; | ||
- `ScanRequest` MUST contain the reference to concrete `Server` object. | ||
- `ScanResponse` MUST contain the status of the request with error code if any. | ||
- Discover(DiscoverRequest) DiscoverResponse; | ||
- `DiscoverRequest` MUST contain the reference to concrete `DiscoveredFirmware` object. | ||
- `DiscoverResponse` MUST contain the status of the request with error code if any. | ||
- UpdateServer(UpdateServerRequest) UpdateServerResponse; | ||
- `UpdateServerRequest` MUST contain the reference to concrete `Server` object and the list of installed firmware-versions. | ||
This endpoint MUST be used by update or scan jobs after a task if finished to send results and invoke the object's update. | ||
- `UpdateServerResponse` MUST contain the status of the request with error code if any. | ||
- UpdateDiscoveredFirmware(UpdateDiscoveredFirmwareRequest) UpdateDiscoveredFirmwareResponse; | ||
- `UpdateDiscoveredFirmwareRequest` MUST contain the reference to concrete `DiscoveredFirmware` object and the list of discovered firmware-versions. | ||
This endpoint MUST be used by discovery jobs after a task is finished to send results and invoke the object's update. | ||
- `UpdateDiscoveredFirmwareResponse` MUST contain the status of the request with error code if any. | ||
|
||
Depending on the type of the request, it SHOULD be forwarded to the corresponding scheduler's queue. | ||
|
||
#### Scheduler | ||
|
||
Scheduler is a component of the update service that is responsible for scheduling jobs: | ||
|
||
- it MUST NOT allow running several jobs on the same target server simultaneously; | ||
- it MAY discard incoming update or scan requests if the same jobs targeting the same server are already scheduled; | ||
- it MAY discard incoming discovery requests if the job targeting the same manufacturer-model pair is already scheduled; | ||
- it MUST have a mechanism to limit the number of parallel jobs; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did you consider how this would work together with #76 ? Like, if we schedule a Firmware Upgrade how do we signal this intent to the Workload or Management cluster? Related to above, we should get some kind of health status from Workload cluster when performing an update on a set of servers one by one. Like, when first server upgrade is finished we ensure that cluster is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The idea is to set "Maintenance" state for the server where updates are scheduled. What about upgrading of the cluster, so not to kill it - this might be implemented in update scheduler if there is any API which could help to determine whether current server is a cluster member. |
||
- it MUST have a mechanism to limit the job queue length; | ||
|
||
Scheduler SHOULD have an embedded job runner component corresponding to the update strategy defined on the update service application start. | ||
|
||
#### Job runner | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it be sufficient, if the scheduler spawns Kubernetes jobs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, the idea is to leverage kubernetes jobs. But the application which job would execute would depend on the update strategy, for instance:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Understood. |
||
|
||
Job runner is a component that acts as a pool to spawn worker for a concrete job, considering the chosen update strategy and target server's manufacturer and model pair. | ||
|
||
- it MUST have a mechanism to store the metadata of spawned jobs; | ||
- it MUST have a mechanism to interrupt a running job; | ||
- it MAY have a mechanism to request the state of a long-running task; | ||
|
||
#### Update/scan/discovery job | ||
|
||
Standalone application which is a concrete implementation of firmware update/scan/discovery for a particular manufacturer or manufacturer-model pair. | ||
Implementation depends on the chosen update strategy. | ||
The application MUST run an API server providing the following endpoints: | ||
|
||
- CancelTask(CancelTaskRequest) CancelTaskResponse; | ||
- `CancelTaskRequest` MAY contain the timeout for graceful stop and a flag to force stop. | ||
- `CancelTaskResponse` MUST contain the status of the request with error code if any. | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we use this to pause upgrades if we find out that firmware has issues and investigation is needed before proceeding with next servers OR issues in cluster are noticed and we want to halt the upgrade process. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the goal for proposed servers grouping - first run updates on dedicated test group. If all is ok, then run updates on prod servers. |
||
Discovery job MAY provide additional functionality to download firmware from manufacturer's servers and to upload it to the local storage. | ||
Local in this context does not refer to the local filesystem but rather to storage running and available from within the infrastructure. | ||
|
||
Job MUST include an embedded client to be able to interact with the API server using endpoints: | ||
|
||
- UpdateServer(UpdateServerRequest) UpdateServerResponse; | ||
- UpdateDiscoveredFirmware(UpdateDiscoveredFirmwareRequest) UpdateDiscoveredFirmwareResponse; | ||
|
||
#### Request handling | ||
|
||
```mermaid | ||
sequenceDiagram | ||
Firmware operator ->>+API server: send request | ||
API server ->>+Scheduler: enqueue request | ||
Scheduler ->>+Job runner: assign free worker | ||
Job runner ->>+Job: spawn concrete job executor | ||
Job ->>-API server: update request | ||
``` | ||
|
||
## Alternatives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative proposal here: How about splitting this resource into a
ServerBios
andServerFirmwares
. Does it make sense to update the BIOS independently from the Firmware of individual components?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my perspective it might make sense only if there will be completely different workflows to update BIOS and other firmware. Otherwise we'll just duplicate controllers and double CRs.