Add user-facing doc "how to chain with pipeline"

The doc includes instructions on how to configure a pipeline/task so that Tekton Chains can generate provenance properly.

Signed-off-by: Chuang Wang <[email protected]>
tektoncd · Sep 1, 2023 · fda2c70 · fda2c70
1 parent de28e92
commit fda2c70
Showing 2 changed files with 204 additions and 0 deletions.
diff --git a/docs/how-to-chain-with-pipeline.md b/docs/how-to-chain-with-pipeline.md
@@ -0,0 +1,204 @@
+# How to chain with pipeline
+
+## Goal
+
+This doc includes instructions for how to configure a Tekton Pipeline/Task
+so that Tekton Chains can generate SLSA provenance properly.
+
+## Glossary
+- ***SLSA***: SLSA stands for Supply-chain Levels for Software Artifacts, or SLSA ("salsa"). It’s a security framework, a checklist of standards and controls to prevent tampering, improve integrity, and secure packages and infrastructure. It’s how you get from "safe enough" to being as resilient as possible, at any link in the chain. ([source](https://slsa.dev/))
+- ***Attestation*** ([in-toto attestation](https://github.com/in-toto/attestation/blob/main/spec/README.md)): An in-toto attestation is authenticated metadata about one or more software artifacts. The intended consumers are automated policy engines, such as in-toto-verify and Binary Authorization. There are [a variety of attestations](https://github.com/in-toto/attestation/tree/main/spec/predicates), and the type of attestation is determined by the [predicate](https://github.com/in-toto/attestation/blob/main/spec/v1/predicate.md).
+- ***SLSA Provenance***: SLSA Provenance is an attestation that a build platform generated to describe how an artifact or set of artifacts was produced. As of the date of this writing, there are 3 versions of SLSA provenance: [SLSA V1 (latest)](https://slsa.dev/spec/v1.0/provenance), [SLSA V0.2](https://slsa.dev/spec/v0.2/provenance) and [SLSA V0.1](https://slsa.dev/spec/v0.1/provenance).
+- ***Pipeline-level provenance***: A SLSA provenance that Tekton Chains generates to cover the whole picture of the PipelineRun execution.
+- ***Task-level provenance***: A SLSA provenance that Tekton Chains generates to only include the details of a particular TaskRun execution. It's particularly needed for a standalone TaskRun that is not spawned by a PipelineRun. By contrast, if it's a child TaskRun of a PipelineRun, Task-level provenance will miss the details of other TaskRuns within that Pipeline.
+- ***Input Artifacts***: A canonical term used in this doc to refer to the artifacts that influenced the build process, such as the source code repository, dependencies and so on. It maps to the `resolvedDependencies` field in SLSA v1.0, and to the `materials` field in SLSA v0.1 and v0.2.
+- ***Output Artifacts***: A canonical term used in this doc to refer to the artifacts that the build process produced, e.g. an OCI image. It maps to the `subject` field in all SLSA versions.
+- ***`Results`***: `Results` are Tekton API fields that authors can use to emit some information after a TaskRun/PipelineRun is complete. `Results` can be used to pass along information to different tasks within a pipeline or aggregate different task results to a pipeline result. Check out the [Tekton official doc](https://tekton.dev/docs/pipelines/pipelines/#using-results) for more information. *Note: the API result field is completely different from the [Tekton Results Operator](https://tekton.dev/docs/results/)*.
+- ***Type hinting***: Refers to specially named results/params that enable Tekton Chains to understand the input and output artifacts of a PipelineRun/TaskRun.
+
+
+## How does Tekton Chains work?
+
+Tekton Chains works by reconciling the run of a task or a pipeline. Once the run is observed as `completed`, Tekton Chains takes a snapshot of the completed TaskRun/PipelineRun and performs its core work in this order: ***`formatting`*** (generate the provenance JSON) -> ***`signing`*** (sign the payload using the key configured by the user) -> ***`uploading`*** (upload the provenance and its signature to the storage configured by the user).
+
+![](../images/how-chains-works.png)
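+
+Each of the three stages is driven by the Chains configuration. The ConfigMap below is a minimal sketch rather than a recommended setup: it assumes Chains is installed in the `tekton-chains` namespace and reads a ConfigMap named `chains-config`, and the values shown are only one possible choice. See [config.md](config.md) for the supported options.
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: chains-config
+  namespace: tekton-chains   # assumption: default install namespace
+data:
+  # formatting: the provenance format generated for TaskRuns
+  artifacts.taskrun.format: in-toto
+  # signing: the signer used to sign the payload
+  artifacts.taskrun.signer: x509
+  # uploading: where the provenance and its signature are stored
+  artifacts.taskrun.storage: tekton
+```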
+
+
+
+## How to configure Task/Pipeline
+
+As mentioned in the [Glossary](#glossary), SLSA provenance describes the build process of a particular artifact being produced. While Tekton Chains is able to capture the build process regardless of how the pipeline was configured, the pipeline config must tell Chains what the input and output artifacts are. This is done through type hinting.
+- Task-level Provenance: The type hinting carrying the references of input/output artifacts should be defined in the Task Spec.
+- Pipeline-level Provenance: The type hinting carrying the references of input/output artifacts can be defined either in the Task Spec or in the Pipeline Spec. By default, Chains looks at pipeline-level type hinting only. If the feature flag [`artifacts.pipelinerun.enable-deep-inspection`](config.md#pipelinerun-configuration) is enabled, Chains also inspects each child TaskRun to look for type hinting defined in the Task Spec.
+  > Rule of thumb:
+  >  - If the task used in the pipeline already produces type hinting (e.g. [kaniko task](https://github.com/tektoncd/catalog/tree/main/task/kaniko/0.6#results)), there is no need to propagate these values to the pipeline level. One only needs to enable the feature flag mentioned above to let Chains figure it out itself.
+  >  - If some/all tasks being reused in the pipeline do not produce type hinting and cannot be modified to do so, one needs to propagate the values to the pipeline-level type hinting.
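+
+The two rules of thumb could look like the sketch below. The first document enables deep inspection, assuming the `chains-config` ConfigMap lives in the `tekton-chains` namespace. The second propagates task results to pipeline-level results; the Pipeline name `build-pipeline` and the task name `build` are hypothetical, and the referenced `image-build` task is assumed to emit `first-image-IMAGE_URL` and `first-image-IMAGE_DIGEST`.
+
+```yaml
+# Rule 1: let Chains inspect child TaskRuns for type hinting.
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: chains-config
+  namespace: tekton-chains   # assumption: default install namespace
+data:
+  artifacts.pipelinerun.enable-deep-inspection: "true"
+---
+# Rule 2: propagate task results to pipeline-level type hinting.
+apiVersion: tekton.dev/v1beta1
+kind: Pipeline
+metadata:
+  name: build-pipeline        # hypothetical name
+spec:
+  tasks:
+    - name: build
+      taskRef:
+        name: image-build
+  results:
+    - name: first-image-IMAGE_URL
+      value: $(tasks.build.results.first-image-IMAGE_URL)
+    - name: first-image-IMAGE_DIGEST
+      value: $(tasks.build.results.first-image-IMAGE_DIGEST)
+```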
+
+## How type hinting should be written
+
+Type hinting differs between input and output artifacts, and Chains supports different options for each. However, all input and output artifacts have one thing in common: each must have a URI and digest pair. These are the key components of type hinting.
+
+---
+
+### Input Artifacts
+Input artifacts can be defined either in `params` or `results` using the following specially named pairs. Note that the value for the digest component must be a precise commit SHA. It cannot be a mutable reference such as a tag or branch name.
+
+#### Option 1: string type - `CHAINS-GIT_URL` and `CHAINS-GIT_COMMIT`
+
+In this approach, one can define the URL of the source code repository and the precise commit SHA in type hinting ***named exactly `CHAINS-GIT_URL` and `CHAINS-GIT_COMMIT`*** respectively. These can be either params or results. If one wants to pass a tag or branch name through params instead of a precise commit SHA, it is better to let the clone task report the cloned repo URL and commit SHA and write them to the type hinting results.
+
+<details>
+  <summary>Click me to see an example</summary>
+
+  ```yaml
+  apiVersion: tekton.dev/v1beta1
+  kind: Task
+  metadata:
+    name: git-clone
+  spec:
+    params:
+      - name: url
+        description: Repository URL to clone from.
+        type: string
+        default: "https://github.com/tektoncd/pipeline"
+      - name: revision
+        description: Revision to checkout. (branch, tag, sha, ref, etc...)
+        type: string
+        default: "main"
+    results:
+      - name: CHAINS-GIT_URL
+        type: string
+        description: The precise URL that was fetched by this Task.
+      - name: CHAINS-GIT_COMMIT
+        type: string
+        description: The precise commit SHA that was fetched by this Task.
+    steps:
+      - name: clone
+        # the step will report cloned repo uri and the precise commit SHA and write them to type hinting results.
+        # i.e.
+        # - write `https://github.com/tektoncd/pipeline` to `CHAINS-GIT_URL`
+        # - write `7f2f46e1b97df36b2b82d1b1d87c81b8b3d21601` to `CHAINS-GIT_COMMIT`
+  ```
+
+</details>
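+
+The `clone` step in the example above is left as comments. One possible way to fill it in is sketched below; the `alpine/git` image and the `/workspace/source` checkout directory are assumptions for illustration, not something this doc prescribes. The key point is that each value is written, without a trailing newline, to the path Tekton exposes for the matching result.
+
+```yaml
+steps:
+  - name: clone
+    image: alpine/git          # assumption: any image with git and a shell works
+    script: |
+      #!/bin/sh
+      set -eu
+      git clone "$(params.url)" /workspace/source
+      cd /workspace/source
+      git checkout "$(params.revision)"
+      # Report the URL that was fetched.
+      printf "%s" "$(params.url)" > "$(results.CHAINS-GIT_URL.path)"
+      # Resolve the branch/tag to the precise commit SHA and report it.
+      git rev-parse HEAD | tr -d '\n' > "$(results.CHAINS-GIT_COMMIT.path)"
+```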
+
+#### Option 2: object type (a.k.a dictionary) - `ARTIFACT_INPUTS` with 2 keys `uri` and `digest`
+
+In this approach, one can group the url of the source code repository and the precise commit sha into a single object type hinting. The object type hinting only needs to have the ***suffix `ARTIFACT_INPUTS`*** and have the 2 keys exactly named as `uri` and `digest`. This is particularly useful if there are multiple input artifacts. For example, one object type hinting can be `first_ARTIFACT_INPUTS` and another one is `second_ARTIFACT_INPUTS`.
+
+> Note: 
+> - The digest component must be in the format of `cryptographic hash algorithm name` + `:` + `a valid hex value`, e.g. "sha1:7f2f46e1b97df36b2b82d1b1d87c81b8b3d21601".
+
+
+<details>
+  <summary>Click me to see an example</summary>
+
+  ```yaml
+  apiVersion: tekton.dev/v1beta1
+  kind: Task
+  metadata:
+    name: git-clone
+  spec:
+    params:
+      - name: url
+        description: Repository URL to clone from.
+        type: string
+        default: "https://github.com/tektoncd/pipeline"
+      - name: revision
+        description: Revision to checkout. (branch, tag, sha, ref, etc...)
+        type: string
+        default: "main"
+    results:
+      - name: source_repo_ARTIFACT_INPUTS
+        description: The source code repo artifact
+        type: object
+        properties:
+          uri: {}
+          digest: {}
+    steps:
+      - name: clone
+        # the step will report cloned repo uri and immutable revision and write to source_repo_ARTIFACT_INPUTS.uri and source_repo_ARTIFACT_INPUTS.digest respectively.
+        # i.e.
+        # - write `https://github.com/tektoncd/pipeline` to `source_repo_ARTIFACT_INPUTS.uri`
+        # - write `sha1:7f2f46e1b97df36b2b82d1b1d87c81b8b3d21601` to `source_repo_ARTIFACT_INPUTS.digest`
+  ```
+
+</details>
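+
+An object result is written as a single JSON string to the result path. The fragment below is a sketch of how the `clone` step might emit `source_repo_ARTIFACT_INPUTS`; it assumes the repository has already been checked out to `/workspace/source` (a hypothetical location) and that the `alpine/git` image is acceptable.
+
+```yaml
+steps:
+  - name: clone
+    image: alpine/git          # assumption: any image with git and a shell works
+    script: |
+      #!/bin/sh
+      set -eu
+      # Cloning is omitted here; the checkout is assumed to be in /workspace/source.
+      cd /workspace/source
+      # Write both keys as one JSON object, with the digest prefixed by its algorithm.
+      printf '{"uri":"%s","digest":"sha1:%s"}' \
+        "$(params.url)" "$(git rev-parse HEAD)" \
+        > "$(results.source_repo_ARTIFACT_INPUTS.path)"
+```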
+
+---
+
+### Output Artifacts
+Output artifacts should be defined in `results` only, using the following specially named pairs.
+
+
+#### Option 1: string type - `IMAGE_URL` and `IMAGE_DIGEST`
+In this approach, one can write the URL and digest of an output OCI artifact into 2 results that share the same prefix: the one for the URL has the suffix `IMAGE_URL` and the one for the digest has the suffix `IMAGE_DIGEST`.
+
+
+> Note:
+> - The `IMAGE_URL` component must be a valid container repository URL.
+> - The `IMAGE_DIGEST` component must be in the format of `cryptographic hash algorithm name` + `:` + `a valid hex value`, e.g. "sha256:586789aa031fafc7d78a5393cdc772e0b55107ea54bb8bcf3f2cdac6c6da51ee".
+
+<details>
+  <summary>Click me to see an example</summary>
+
+  ```yaml
+  apiVersion: tekton.dev/v1beta1
+  kind: Task
+  metadata:
+    name: image-build
+  spec:
+    results:
+      - name: first-image-IMAGE_URL
+        type: string
+        description: The precise URL of the OCI image built.
+      - name: first-image-IMAGE_DIGEST
+        type: string
+        description: The algorithm and digest of the OCI image built.
+    steps:
+      - name: build
+        # the step will report the url and digest of the built image to first-image-IMAGE_URL and first-image-IMAGE_DIGEST respectively.
+        # i.e.
+        # - write `gcr.io/foo/bar` to `first-image-IMAGE_URL`
+        # - write `sha256:586789aa031fafc7d78a5393cdc772e0b55107ea54bb8bcf3f2cdac6c6da51ee` to `first-image-IMAGE_DIGEST`
+  ```
+</details>
+
+
+#### Option 2: object type (a.k.a dictionary) - `ARTIFACT_OUTPUTS` with 2 keys `uri` and `digest`
+
+In this approach, one can group the URL and digest of the output artifact into a single object result. The object result only needs to have the ***suffix `ARTIFACT_OUTPUTS`*** and have the 2 keys named exactly `uri` and `digest`. This is particularly useful if there are multiple artifacts produced throughout a task. For example, one object type hinting can be `first_ARTIFACT_OUTPUTS` and another one `second_ARTIFACT_OUTPUTS`.
+
+> Note: 
+> - The digest component must be in the format of `cryptographic hash algorithm name` + `:` + `a valid hex value`, e.g. "sha256:586789aa031fafc7d78a5393cdc772e0b55107ea54bb8bcf3f2cdac6c6da51ee".
+
+
+<details>
+  <summary>Click me to see an example</summary>
+
+  ```yaml
+  apiVersion: tekton.dev/v1beta1
+  kind: Task
+  metadata:
+    name: image-build
+  spec:
+    results:
+      - name: first-ARTIFACT_OUTPUTS
+        description: The first artifact built
+        type: object
+        properties:
+          uri: {}
+          digest: {}
+    steps:
+      - name: build
+        # the step will report the url and digest of the built artifact to first-ARTIFACT_OUTPUTS.uri and first-ARTIFACT_OUTPUTS.digest respectively.
+        # i.e.
+        # - write `gcr.io/foo/bar` to `first-ARTIFACT_OUTPUTS.uri`
+        # - write `sha256:586789aa031fafc7d78a5393cdc772e0b55107ea54bb8bcf3f2cdac6c6da51ee` to `first-ARTIFACT_OUTPUTS.digest`
+  ```
+</details>
+
+
+#### Option 3: string type - `ARTIFACT_URI` and `ARTIFACT_DIGEST`
+
+This works like Option 1, but the results use the suffixes `ARTIFACT_URI` and `ARTIFACT_DIGEST` instead of `IMAGE_URL` and `IMAGE_DIGEST`.
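+
+A Task using this naming might look like the sketch below. It assumes the names are matched by suffix in the same way as in Option 1; the task name `artifact-build`, the `first-` prefix, the `busybox` image and the hard-coded values are placeholders for illustration.
+
+```yaml
+apiVersion: tekton.dev/v1beta1
+kind: Task
+metadata:
+  name: artifact-build         # hypothetical name
+spec:
+  results:
+    - name: first-ARTIFACT_URI
+      type: string
+      description: The precise URI of the artifact built.
+    - name: first-ARTIFACT_DIGEST
+      type: string
+      description: The algorithm and digest of the artifact built.
+  steps:
+    - name: build
+      image: busybox           # assumption: stands in for a real build image
+      script: |
+        #!/bin/sh
+        set -eu
+        # A real build step would report the values it actually produced.
+        printf "%s" "gcr.io/foo/bar" > "$(results.first-ARTIFACT_URI.path)"
+        printf "%s" "sha256:586789aa031fafc7d78a5393cdc772e0b55107ea54bb8bcf3f2cdac6c6da51ee" > "$(results.first-ARTIFACT_DIGEST.path)"
+```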
diff --git a/images/how-chains-works.png b/images/how-chains-works.png