status | title | creation-date | last-updated | authors
---|---|---|---|---
proposed | Workspace Hinting | 2021-09-03 | 2021-10-26 |
- Summary
- Motivation
- Requirements
- Proposal
- Design Details
- Test Plan
- Design Evaluation
- Drawbacks
- Alternatives
- Infrastructure Needed (optional)
- Upgrade & Migration Strategy (optional)
- Implementation Pull request(s)
- Future Work
- References (optional)
Workspaces allow Task authors to declare portions of their Task's filesystem to be supplied at runtime by TaskRuns or PipelineRuns. For example, a Task may accept a credential via an optional Workspace and a TaskRun might supply it from a Secret. Another Task might write source code to a Workspace, and a PipelineRun could bind a Persistent Volume to it so the source can be passed to other PipelineTasks.
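As a minimal sketch of the existing behaviour described above, the following Task declares an optional Workspace for a credential, and a TaskRun binds a Secret to it. The resource names (`fetch-private-repo`, `ssh-creds`, `my-ssh-secret`) and the image are hypothetical, and the `tekton.dev/v1beta1` API version is an assumption based on the date of this proposal.

```yaml
# Task: declares an optional Workspace intended to receive a credential.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: fetch-private-repo          # hypothetical name
spec:
  workspaces:
  - name: ssh-creds                 # hypothetical name
    description: An SSH key used to authenticate with the git server.
    optional: true
    readOnly: true
  steps:
  - name: show-creds
    image: alpine                   # illustrative image
    script: |
      if [ "$(workspaces.ssh-creds.bound)" = "true" ]; then
        ls "$(workspaces.ssh-creds.path)"
      fi
---
# TaskRun: supplies the Workspace from a Secret at runtime.
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: fetch-private-repo-
spec:
  taskRef:
    name: fetch-private-repo
  workspaces:
  - name: ssh-creds
    secret:
      secretName: my-ssh-secret     # hypothetical Secret in the TaskRun's namespace
```

Note that the only statement of intent here is the free-text `description`, which a person can read but a tool cannot reliably interpret.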
Put another way, the interface that Workspaces expose is general-purpose: it caters to a number of largely disjoint use cases. A downside is that Task authors can't communicate a Workspace's intended usage in a machine-readable way. There's no way for an author to indicate "this Workspace is intended to accept a credential" or "this Workspace should be supplied with configuration". Similarly, Pipeline authors have no way to hint that a Workspace is used to shuttle data between Tasks. They can write a human-readable description as part of the workspace declaration, but that is of little use to an automated system constructing TaskRuns and PipelineRuns.
The purpose of this TEP is to allow Task and Pipeline authors to "hint" about the intended purpose of a Workspace. The idea is that if authors can mark Workspaces with a purpose then automated systems could be designed to submit reasonable default bindings for them.
- Goal: provide a way for Tasks and Pipelines to declare the purpose of Workspaces in a machine-readable format.
- Non-goal: adding constraint-checking or any other logic to Pipelines to validate bound Workspaces based on workspace hints. The potential scope of such a feature is deceptively large, so this TEP stays focused on the "external system" / "machine-readable" use case. In future we may want to build higher-level abstractions on top of this proposal that leverage hinting.
The Tekton Workflows project is currently exploring ways to pass Secrets from a high-level Workflow description into a PipelineRun. This is made considerably more difficult because Pipelines can't indicate which of their Workspaces might be the right one to bind those Secrets to. See the Aug 31, 2021 Workflows WG Meeting Notes.
- Hinting must be optional: we don't want to suddenly invalidate every Task or Pipeline that currently includes a Workspace.
At this stage in the proposal we're just capturing some options to consider. As we move to implementable we'll settle this design on one of them and flesh it out more fully.
cf. Pipelines#4083
Allow Task and Pipeline authors to explicitly declare some or all of a default Workspace Binding which PipelineRuns and TaskRuns can use or override:
kind: Task
spec:
  workspaces:
  - name: docker-json
    mountPath: /wherever/docker/json/goes
    default:
      secret:
        secretName: my-docker-json
A TaskRun referencing this Task could either provide a docker-json Workspace or omit it. If omitted, the Task's default would be used.
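The two cases might look roughly as follows. This is a sketch only: it relies on the proposed `default` field, which is not an existing Tekton feature, and the Task name `docker-build` and the Secret `team-registry-creds` are hypothetical.

```yaml
# Case 1: the TaskRun omits docker-json, so under this proposal the
# Task's default Secret (my-docker-json) would be bound.
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: docker-build-default-
spec:
  taskRef:
    name: docker-build              # hypothetical name for the Task above
---
# Case 2: the TaskRun binds docker-json explicitly, overriding the default.
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: docker-build-override-
spec:
  taskRef:
    name: docker-build
  workspaces:
  - name: docker-json
    secret:
      secretName: team-registry-creds   # hypothetical Secret
```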
A Pipeline could take a similar approach with a volumeClaimTemplate:
kind: Pipeline
spec:
  workspaces:
  - name: shared-data
    default:
      volumeClaimTemplate:
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 256M
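A PipelineRun could still override such a default with its own binding. The sketch below uses the existing `persistentVolumeClaim` binding type; the Pipeline name `build-and-test` and the claim name `existing-shared-pvc` are hypothetical, and the override semantics are an assumption about how the proposed `default` would behave.

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: build-and-test-
spec:
  pipelineRef:
    name: build-and-test            # hypothetical name for the Pipeline above
  workspaces:
  # Explicit binding: would take precedence over the Pipeline's
  # default volumeClaimTemplate under this proposal.
  - name: shared-data
    persistentVolumeClaim:
      claimName: existing-shared-pvc    # hypothetical pre-existing PVC
```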
- Very explicit about the Task's expectations for the content provided.
- Pipelines can override a Task's expected types - so, for example, a Task might expect a ConfigMap but a Pipeline might override with a PV instead.
- Offers its own benefits beyond hinting, such as being able to offer a Pipeline that "just works" out of the box without any tricky workspace configuration in the PipelineRun.
- Doesn't preclude adding more explicit hinting later.
- Not just an API change - there will be some logic involved here on the Pipelines controller side to apply the default workspace config to runs.
- Questions around future extensions stemming from this change are quite nuanced:
  - What if an author wants the "default" to actually be a requirement and for the TaskRun to fail if it's missing? For example, a deploy Task requiring a Secret with name "cluster-key" to exist in the TaskRun's namespace.
  - What if an author wants to use a default ConfigMap only if that default exists in the TaskRun's namespace but otherwise a fallback like an emptyDir?
  - What if a Catalog Pipeline author attaches a PersistentVolume type or StorageClass that is only available on a subset of cloud providers?
The precise name of this field can be iterated on but for now let's assume "profile".
A Workspace Declaration in a Task or Pipeline can include a profile field that is a string matching a fixed set of available options:
- "cache" to hint that the workspace will be used as a cache for performance or reproducibility (e.g. a system might require all teams to use a shared node_modules directory when compiling their frontends).
- "config" to hint that the workspace is intended to supply some configuration or settings.
- "credential" to hint that the workspace will be used to perform authenticated actions.
- "data" to hint that the contents are arbitrary data either consumed or produced by the Task.
A third-party system can attach its own meaning to each of these profiles. A "credential" Workspace could be populated from a Secret or Secret-like volume. A "cache" Workspace might be supplied with a long-lived read-only Persistent Volume. A "data" Workspace might be assumed to require an ephemeral Persistent Volume that lives only as long as the PipelineRun. "config" Workspaces could map consistently to ConfigMaps. Importantly: these decisions are left up to the external system or platform. Our own Workflows project may be able to utilize these profiles, for example, to make informed choices when creating a PipelineRun based solely on Pipeline YAML, a supplied list of volumes and a set of Secret references.
Here's an example from a git-clone-like Task that accepts an optional GitHub deploy key:
workspaces:
- name: deploy-key
  readOnly: true
  optional: true
  profile: credential
- name: output
  profile: data
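Given those profiles, an external system might generate a TaskRun along the following lines. This is only one possible interpretation: the mapping from "credential" to a Secret and from "data" to a volumeClaimTemplate is the platform's choice rather than something Tekton would enforce, and the Task name, Secret name and storage size are all hypothetical.

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: git-clone-like-
spec:
  taskRef:
    name: git-clone-like            # hypothetical name for the Task above
  workspaces:
  # profile: credential -> the platform chose to bind a Secret.
  - name: deploy-key
    secret:
      secretName: github-deploy-key     # hypothetical Secret
  # profile: data -> the platform chose an ephemeral volume that is
  # created for this run from a claim template.
  - name: output
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi            # illustrative size
```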
- It's a bit unclear what the incentive for including profiles would be for Catalog Task authors. How would they "figure out" the purpose and correct values to put in here?
This approach would be entirely ad-hoc: Task authors could include hash tags in their Workspaces' description fields. A platform could scan for them and act accordingly. User Interfaces like Hub could be programmed to ignore them or surface them in their own visual component.
Here's what Workspaces for a go-build Task might look like with these:
workspaces:
- name: source-code
  readOnly: true
  description: "The source of a go program. #data"
- name: output
  description: "Compiled binaries will be written here. #data"
- Free-form.
- The set of recognized hash-tags could be specified and validated by Pipelines (#cache, #config, #credential, #data).
- Syntactically different from the "profiles" alternative but otherwise not functionally all that different.
- Sets a precedent for expanding the description of a workspace to include other metadata.
Use an external JSON file or annotations on the Task to describe the extra meaning being given to workspaces.
- Non-API change.
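A sketch of what the annotation variant might look like is shown below. The annotation key `example.dev/workspace-hints` and its JSON value format are invented purely for illustration; neither is defined by Tekton or by this proposal, and a real design would need to agree on both.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: go-build
  annotations:
    # Hypothetical key and value format: maps each Workspace name to a hint.
    example.dev/workspace-hints: '{"source-code": "data", "output": "data"}'
spec:
  workspaces:
  - name: source-code
    readOnly: true
    description: The source of a go program.
  - name: output
    description: Compiled binaries will be written here.
  steps:
  - name: build
    image: golang                   # illustrative image
    workingDir: $(workspaces.source-code.path)
    script: |
      go build -o "$(workspaces.output.path)/" ./...
```

Because annotations are free-form metadata, the Task's `spec` is unchanged and an external system would read the hints without any change to the Tekton API.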
New fields that allow volumes to be bound with different defaults. For example, a credentials field where the bound volumes will by default be mounted as read-only. Example syntax:
workspaces:
- name: data
credentials:
- name: git # volumeMount will default to readOnly:true
- name: shortlivedtoken
  readOnly: false
- Very clear how a volume is intended to be used.
- Not tied to one specific type of volume.
- Not "stringly typed".
- Easy to validate.
- Structurally similar to existing workspaces feature.
- Adding new alternative fields requires API changes.
- Expand support for hinting to include validation, fallback behaviour, a broader range of possible "hints" (e.g. minimum persistent volume size) etc.