diff --git a/.chloggen/1600-cicd-metrics.yaml b/.chloggen/1600-cicd-metrics.yaml
new file mode 100644
index 0000000000..724de71ed3
--- /dev/null
+++ b/.chloggen/1600-cicd-metrics.yaml
@@ -0,0 +1,26 @@
+# Use this changelog template to create an entry for release notes.
+#
+# If your change doesn't affect end users you should instead start
+# your pull request title with [chore] or use the "Skip Changelog" label.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: enhancement
+
+# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
+component: cicd
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Add CICD metrics
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+# The values here must be integers.
+issues: [1600]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: |
+  Makes the following changes:
+
+  - Add metrics `cicd.pipeline.run.duration`, `cicd.pipeline.run.executing`, `cicd.queue.latency`, `cicd.queue.length`, `cicd.worker.count`, `cicd.errors`.
+  - The CICD attributes `cicd.pipeline.result`, `cicd.worker.state` and `cicd.worker.class` have been added to the registry.
diff --git a/.vscode/settings.json b/.vscode/settings.json
index 069acb27c7..f1709edc98 100644
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -14,5 +14,6 @@
             "model/**/*.yaml"
         ]
     },
-    "json.schemaDownload.enable": true
+    "json.schemaDownload.enable": true,
+    "markdown.extension.toc.levels": "2..6"
 }
diff --git a/docs/attributes-registry/cicd.md b/docs/attributes-registry/cicd.md
index 8b585d8fe4..8e7ef614b5 100644
--- a/docs/attributes-registry/cicd.md
+++ b/docs/attributes-registry/cicd.md
@@ -13,11 +13,27 @@ This group describes attributes specific to pipelines within a Continuous Integr
 | Attribute | Type | Description | Examples | Stability |
 |---|---|---|---|---|
 | `cicd.pipeline.name` | string | The human readable name of the pipeline within a CI/CD system. | `Build and Test`; `Lint`; `Deploy Go Project`; `deploy_to_environment` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `cicd.pipeline.result` | string | The result of a pipeline run. | `success`; `failure`; `timeout`; `skipped` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `cicd.pipeline.run.id` | string | The unique identifier of a pipeline run within a CI/CD system. | `120912` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `cicd.pipeline.task.name` | string | The human readable name of a task within a pipeline. Task here most closely aligns with a [computing process](https://wikipedia.org/wiki/Pipeline_(computing)) in a pipeline. Other terms for tasks include commands, steps, and procedures. | `Run GoLang Linter`; `Go Build`; `go-test`; `deploy_binary` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `cicd.pipeline.task.run.id` | string | The unique identifier of a task run within a pipeline. | `12097` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `cicd.pipeline.task.run.url.full` | string | The [URL](https://wikipedia.org/wiki/URL) of the pipeline run providing the complete address in order to locate and identify the pipeline run. | `https://github.com/open-telemetry/semantic-conventions/actions/runs/9753949763/job/26920038674?pr=1075` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `cicd.pipeline.task.type` | string | The type of the task within a pipeline. | `build`; `test`; `deploy` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `cicd.worker.class` | string | The type of worker / agent used by the CICD system. | `vm`; `pod` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `cicd.worker.state` | string | The state of a CICD worker / agent. | `idle`; `busy`; `down` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+---
+
+`cicd.pipeline.result` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `cancelled` | The pipeline run was cancelled, eg. by a user manually cancelling the pipeline run. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `error` | The pipeline run failed due to an error in the CICD system, eg. due to the worker being killed. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `failure` | The pipeline run did not finish successfully, eg. due to a compile error or a failing test. Such failures are usually detected by non-zero exit codes of the tools executed in the pipeline run. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `skipped` | The pipeline run was skipped, eg. due to a precondition not being met. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `success` | The pipeline run finished successfully. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `timeout` | A timeout caused the pipeline run to be interrupted. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 
 ---
 
@@ -28,3 +44,25 @@ This group describes attributes specific to pipelines within a Continuous Integr
 | `build` | build | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `deploy` | deploy | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `test` | test | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+---
+
+`cicd.worker.class` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `container` | A single container. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `pod` | One or more containers deployed together. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `vm` | A virtual machine or baremetal host. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+---
+
+`cicd.worker.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `busy` | The worker is performing work for the CICD system. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `down` | The worker is not available to the CICD system (disconnected / down). | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `idle` | The worker is not performing work for the CICD system. It is available to the CICD system to perform work on. [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+**[1]:** Pipelines might have conditions on which workers they are able to run so not every worker might be available to every pipeline.
diff --git a/docs/cicd/cicd-metrics.md b/docs/cicd/cicd-metrics.md
index 687eb76e43..1d34365d8a 100644
--- a/docs/cicd/cicd-metrics.md
+++ b/docs/cicd/cicd-metrics.md
@@ -10,6 +10,13 @@ linkTitle: CICD metrics
+- [CICD Metrics](#cicd-metrics)
+  - [Metric: `cicd.pipeline.run.duration`](#metric-cicdpipelinerunduration)
+  - [Metric: `cicd.pipeline.run.executing`](#metric-cicdpipelinerunexecuting)
+  - [Metric: `cicd.queue.latency`](#metric-cicdqueuelatency)
+  - [Metric: `cicd.queue.length`](#metric-cicdqueuelength)
+  - [Metric: `cicd.worker.count`](#metric-cicdworkercount)
+  - [Metric: `cicd.errors`](#metric-cicderrors)
 - [VCS Metrics](#vcs-metrics)
   - [Metric: `vcs.change.count`](#metric-vcschangecount)
   - [Metric: `vcs.change.duration`](#metric-vcschangeduration)
@@ -23,6 +30,222 @@ linkTitle: CICD metrics
+## CICD Metrics
+
+The conventions described in this section are specific to Continuous Integration / Continuous Deployment (CICD) systems.
+
+**Disclaimer:** These are initial CICD metrics and attributes
+but more may be added in the future.
+
+### Metric: `cicd.pipeline.run.duration`
+
+This metric is [recommended][MetricRecommended].
+
+
+
+
+
+
+
+
+| Name     | Instrument Type | Unit (UCUM) | Description    | Stability |
+| -------- | --------------- | ----------- | -------------- | --------- |
+| `cicd.pipeline.run.duration` | Histogram | `s` | Duration of a pipeline run grouped by pipeline and result. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`cicd.pipeline.name`](/docs/attributes-registry/cicd.md) | string | The human readable name of the pipeline within a CI/CD system. | `Build and Test`; `Lint`; `Deploy Go Project`; `deploy_to_environment` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| [`cicd.pipeline.result`](/docs/attributes-registry/cicd.md) | string | The result of a pipeline run. | `success`; `failure`; `timeout`; `skipped` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+---
+
+`cicd.pipeline.result` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `cancelled` | The pipeline run was cancelled, eg. by a user manually cancelling the pipeline run. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `error` | The pipeline run failed due to an error in the CICD system, eg. due to the worker being killed. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `failure` | The pipeline run did not finish successfully, eg. due to a compile error or a failing test. Such failures are usually detected by non-zero exit codes of the tools executed in the pipeline run. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `skipped` | The pipeline run was skipped, eg. due to a precondition not being met. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `success` | The pipeline run finished successfully. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `timeout` | A timeout caused the pipeline run to be interrupted. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+
+
+
+
+
+### Metric: `cicd.pipeline.run.executing`
+
+This metric is [recommended][MetricRecommended].
+
+
+
+
+
+
+
+
+| Name     | Instrument Type | Unit (UCUM) | Description    | Stability |
+| -------- | --------------- | ----------- | -------------- | --------- |
+| `cicd.pipeline.run.executing` | UpDownCounter | `{pipeline_run}` | The number of pipeline runs currently executing. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`cicd.pipeline.name`](/docs/attributes-registry/cicd.md) | string | The human readable name of the pipeline within a CI/CD system. | `Build and Test`; `Lint`; `Deploy Go Project`; `deploy_to_environment` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+
+
+
+
+
+### Metric: `cicd.queue.latency`
+
+This metric is [recommended][MetricRecommended].
+
+
+
+
+
+
+
+
+| Name     | Instrument Type | Unit (UCUM) | Description    | Stability |
+| -------- | --------------- | ----------- | -------------- | --------- |
+| `cicd.queue.latency` | Histogram | `s` | The duration a pipeline run takes from being triggered to the start of execution. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`cicd.pipeline.name`](/docs/attributes-registry/cicd.md) | string | The human readable name of the pipeline within a CI/CD system. | `Build and Test`; `Lint`; `Deploy Go Project`; `deploy_to_environment` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+
+
+
+
+
+### Metric: `cicd.queue.length`
+
+This metric is [recommended][MetricRecommended].
+
+
+
+
+
+
+
+
+| Name     | Instrument Type | Unit (UCUM) | Description    | Stability |
+| -------- | --------------- | ----------- | -------------- | --------- |
+| `cicd.queue.length` | UpDownCounter | `{pipeline_run}` | The number of pipeline runs waiting for their start of execution. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`cicd.pipeline.name`](/docs/attributes-registry/cicd.md) | string | The human readable name of the pipeline within a CI/CD system. | `Build and Test`; `Lint`; `Deploy Go Project`; `deploy_to_environment` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+
+
+
+
+
+### Metric: `cicd.worker.count`
+
+This metric is [recommended][MetricRecommended].
+
+
+
+
+
+
+
+
+| Name     | Instrument Type | Unit (UCUM) | Description    | Stability |
+| -------- | --------------- | ----------- | -------------- | --------- |
+| `cicd.worker.count` | UpDownCounter | `{count}` | The number of workers on the CICD system by class and status. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`cicd.worker.class`](/docs/attributes-registry/cicd.md) | string | The type of worker / agent used by the CICD system. | `vm`; `pod` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| [`cicd.worker.state`](/docs/attributes-registry/cicd.md) | string | The state of a CICD worker / agent. | `idle`; `busy`; `down` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+---
+
+`cicd.worker.class` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `container` | A single container. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `pod` | One or more containers deployed together. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `vm` | A virtual machine or baremetal host. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+---
+
+`cicd.worker.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `busy` | The worker is performing work for the CICD system. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `down` | The worker is not available to the CICD system (disconnected / down). | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `idle` | The worker is not performing work for the CICD system. It is available to the CICD system to perform work on. [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+**[1]:** Pipelines might have conditions on which workers they are able to run so not every worker might be available to every pipeline.
+
+
+
+
+
+
+### Metric: `cicd.errors`
+
+This metric is [recommended][MetricRecommended].
+
+
+
+
+
+
+
+
+| Name     | Instrument Type | Unit (UCUM) | Description    | Stability |
+| -------- | --------------- | ----------- | -------------- | --------- |
+| `cicd.errors` | Counter | `{error}` | The number of errors in the controller of the CICD system. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
+|---|---|---|---|---|---|
+| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Required` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+
+**[1] `error.type`:** The `error.type` SHOULD be predictable, and SHOULD have low cardinality.
+
+When `error.type` is set to a type (e.g., an exception type), its
+canonical class name identifying the type within the artifact SHOULD be used.
+
+Instrumentations SHOULD document the list of errors they report.
+
+The cardinality of `error.type` within one instrumentation library SHOULD be low.
+Telemetry consumers that aggregate data from multiple instrumentation libraries and applications
+should be prepared for `error.type` to have high cardinality at query time when no
+additional filters are applied.
+
+If the operation has completed successfully, instrumentations SHOULD NOT set `error.type`.
+
+If a specific domain defines its own set of error identifiers (such as HTTP or gRPC status codes),
+it's RECOMMENDED to:
+
+- Use a domain-specific attribute
+- Set `error.type` to capture all errors, regardless of whether they are defined within the domain-specific set or not.
+
+---
+
+`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+
+
+
+
+
+
 ## VCS Metrics
 
 The conventions described in this section are specific to Version Control Systems.
diff --git a/model/cicd/metrics.yaml b/model/cicd/metrics.yaml
new file mode 100644
index 0000000000..5524fbe979
--- /dev/null
+++ b/model/cicd/metrics.yaml
@@ -0,0 +1,65 @@
+groups:
+  - id: metric.cicd.pipeline.run.duration
+    type: metric
+    metric_name: cicd.pipeline.run.duration
+    brief: 'Duration of a pipeline run grouped by pipeline and result.'
+    instrument: histogram
+    unit: "s"
+    stability: experimental
+    attributes:
+      - ref: cicd.pipeline.name
+        requirement_level: required
+      - ref: cicd.pipeline.result
+        requirement_level: required
+  - id: metric.cicd.pipeline.run.executing
+    type: metric
+    metric_name: cicd.pipeline.run.executing
+    brief: 'The number of pipeline runs currently executing.'
+    instrument: updowncounter
+    unit: "{pipeline_run}"
+    stability: experimental
+    attributes:
+      - ref: cicd.pipeline.name
+        requirement_level: required
+  - id: metric.cicd.queue.latency
+    type: metric
+    metric_name: cicd.queue.latency
+    brief: 'The duration a pipeline run takes from being triggered to the start of execution.'
+    instrument: histogram
+    unit: "s"
+    stability: experimental
+    attributes:
+      - ref: cicd.pipeline.name
+        requirement_level: recommended
+  - id: metric.cicd.queue.length
+    type: metric
+    metric_name: cicd.queue.length
+    brief: 'The number of pipeline runs waiting for their start of execution.'
+    instrument: updowncounter
+    unit: "{pipeline_run}"
+    stability: experimental
+    attributes:
+      - ref: cicd.pipeline.name
+        requirement_level: recommended
+  - id: metric.cicd.worker.count
+    type: metric
+    metric_name: cicd.worker.count
+    brief: 'The number of workers on the CICD system by class and status.'
+    instrument: updowncounter
+    unit: "{count}"
+    stability: experimental
+    attributes:
+      - ref: cicd.worker.class
+        requirement_level: required
+      - ref: cicd.worker.state
+        requirement_level: required
+  - id: metric.cicd.errors
+    type: metric
+    metric_name: cicd.errors
+    brief: 'The number of errors in the controller of the CICD system.'
+    instrument: counter
+    unit: "{error}"
+    stability: experimental
+    attributes:
+      - ref: error.type
+        requirement_level: required
diff --git a/model/cicd/registry.yaml b/model/cicd/registry.yaml
index a63903d2c6..5155b06452 100644
--- a/model/cicd/registry.yaml
+++ b/model/cicd/registry.yaml
@@ -75,3 +75,78 @@ groups:
         brief: >
           The type of the task within a pipeline.
         examples: ["build", "test", "deploy"]
+      - id: cicd.pipeline.result
+        type:
+          members:
+            - id: success
+              value: success
+              brief: "The pipeline run finished successfully."
+              stability: experimental
+            - id: failure
+              value: failure
+              brief: >-
+                The pipeline run did not finish successfully, eg. due to a compile error or a failing test.
+                Such failures are usually detected by non-zero exit codes of the tools executed in the pipeline run.
+              stability: experimental
+            - id: error
+              value: error
+              brief: >-
+                The pipeline run failed due to an error in the CICD system, eg. due to the worker being killed.
+              stability: experimental
+            - id: timeout
+              value: timeout
+              brief: "A timeout caused the pipeline run to be interrupted."
+              stability: experimental
+            - id: cancelled
+              value: cancelled
+              brief: "The pipeline run was cancelled, eg. by a user manually cancelling the pipeline run."
+              stability: experimental
+            - id: skipped
+              value: skipped
+              brief: "The pipeline run was skipped, eg. due to a precondition not being met."
+              stability: experimental
+        stability: experimental
+        brief: >
+          The result of a pipeline run.
+        examples: ["success", "failure", "timeout", "skipped"]
+      - id: cicd.worker.class
+        type:
+          members:
+            - id: vm
+              value: vm
+              brief: "A virtual machine or baremetal host."
+              stability: experimental
+            - id: container
+              value: container
+              brief: "A single container."
+              stability: experimental
+            - id: pod
+              value: pod
+              brief: "One or more containers deployed together."
+              stability: experimental
+        stability: experimental
+        brief: >
+          The type of worker / agent used by the CICD system.
+        examples: ["vm", "pod"]
+      - id: cicd.worker.state
+        type:
+          members:
+            - id: idle
+              value: idle
+              brief: >-
+                The worker is not performing work for the CICD system.
+                It is available to the CICD system to perform work on.
+              note: "Pipelines might have conditions on which workers they are able to run so not every worker might be available to every pipeline."
+              stability: experimental
+            - id: busy
+              value: busy
+              brief: "The worker is performing work for the CICD system."
+              stability: experimental
+            - id: down
+              value: down
+              brief: "The worker is not available to the CICD system (disconnected / down)."
+              stability: experimental
+        stability: experimental
+        brief: >
+          The state of a CICD worker / agent.
+        examples: ["idle", "busy", "down"]
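
Reviewer note: the requirement levels and well-known values introduced in this change can be checked mechanically. The following is a minimal sketch, not part of the change itself, that transcribes the `required` attributes from `model/cicd/metrics.yaml` and the enum members from `model/cicd/registry.yaml` into plain Python tables. The helper names `missing_required` and `custom_values` are hypothetical and exist only for illustration; nothing here uses an OpenTelemetry SDK.

```python
# Required attributes per metric, transcribed from model/cicd/metrics.yaml in this diff.
# Attributes marked only "recommended" (cicd.pipeline.name on the queue metrics) are omitted.
REQUIRED_ATTRIBUTES = {
    "cicd.pipeline.run.duration": {"cicd.pipeline.name", "cicd.pipeline.result"},
    "cicd.pipeline.run.executing": {"cicd.pipeline.name"},
    "cicd.queue.latency": set(),
    "cicd.queue.length": set(),
    "cicd.worker.count": {"cicd.worker.class", "cicd.worker.state"},
    "cicd.errors": {"error.type"},
}

# Well-known values, transcribed from model/cicd/registry.yaml in this diff.
WELL_KNOWN_VALUES = {
    "cicd.pipeline.result": {
        "success", "failure", "error", "timeout", "cancelled", "skipped",
    },
    "cicd.worker.class": {"vm", "container", "pod"},
    "cicd.worker.state": {"idle", "busy", "down"},
}


def missing_required(metric_name, attributes):
    """Return the sorted names of required attributes absent from `attributes`.

    Hypothetical helper: raises KeyError for metric names not defined in this change.
    """
    return sorted(REQUIRED_ATTRIBUTES[metric_name] - set(attributes))


def custom_values(attributes):
    """Return the attributes whose value is not one of the well-known values.

    Custom values MAY be used per the registry text, so this only flags them
    for review; it does not treat them as errors.
    """
    return {
        name: value
        for name, value in attributes.items()
        if name in WELL_KNOWN_VALUES and value not in WELL_KNOWN_VALUES[name]
    }
```

For example, calling `missing_required("cicd.pipeline.run.duration", {"cicd.pipeline.name": "Lint"})` reports that `cicd.pipeline.result` is absent, while `custom_values` would surface a worker state such as `draining` (an invented value) as a candidate for mapping onto `idle`, `busy`, or `down`.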