Add container metric fields (from ECS) (#282)
Signed-off-by: ChrsMark <[email protected]>
Co-authored-by: Joao Grassi <[email protected]>
ChrsMark and joaopgrassi authored Mar 27, 2024
1 parent 4368358 commit 0941ebb
Showing 5 changed files with 203 additions and 0 deletions.
21 changes: 21 additions & 0 deletions .chloggen/add_new_container_metrics.yaml
@@ -0,0 +1,21 @@
# Use this changelog template to create an entry for release notes.
#
# If your change doesn't affect end users you should instead start
# your pull request title with [chore] or use the "Skip Changelog" label.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: "enhancement"

# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
component: "container"

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Add new container metrics for `cpu`, `memory`, `disk` and `network`"

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [282, 72]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:
9 changes: 9 additions & 0 deletions docs/attributes-registry/container.md
@@ -11,6 +11,7 @@
| `container.command` | string | The command used to run the container (i.e. the command name). [1] | `otelcontribcol` |
| `container.command_args` | string[] | All the command arguments (including the command/executable itself) run by the container. [2] | `[otelcontribcol, --config, config.yaml]` |
| `container.command_line` | string | The full command run by the container as a single string representing the full command. [2] | `otelcontribcol --config config.yaml` |
| `container.cpu.state` | string | The CPU state for this data point. | `user`; `kernel` |
| `container.id` | string | Container ID. Usually a UUID, as for example used to [identify Docker containers](https://docs.docker.com/engine/reference/run/#container-identification). The UUID might be abbreviated. | `a3bf90e006b2` |
| `container.image.id` | string | Runtime specific image identifier. Usually a hash algorithm followed by a UUID. [2] | `sha256:19c92d0a00d1b66d897bceaa7319bee0dd38a10a851c60bcec9474aa3f01e50f` |
| `container.image.name` | string | Name of the image the container was built on. | `gcr.io/opentelemetry/operator` |
@@ -27,4 +28,12 @@ K8s defines a link to the container registry repository with digest `"imageID":
The ID is assigned by the container runtime and can vary in different environments. Consider using `oci.manifest.digest` if it is important to identify the same image in different environments/runtimes.

**[3]:** [Docker](https://docs.docker.com/engine/api/v1.43/#tag/Image/operation/ImageInspect) and [CRI](https://github.com/kubernetes/cri-api/blob/c75ef5b473bbe2d0a4fc92f82235efd665ea8e9f/pkg/apis/runtime/v1/api.proto#L1237-L1238) report those under the `RepoDigests` field.

`container.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `user` | When tasks of the cgroup are in user mode (Linux). When all container processes are in user mode (Windows). |
| `system` | When CPU is used by the system (host OS) |
| `kernel` | When tasks of the cgroup are in kernel mode (Linux). When all container processes are in kernel mode (Windows). |
<!-- endsemconv -->
105 changes: 105 additions & 0 deletions docs/system/container-metrics.md
@@ -0,0 +1,105 @@
<!--- Hugo front matter used to generate the website version of this page:
linkTitle: Container
--->

# Semantic Conventions for Container Metrics

**Status**: [Experimental][DocumentStatus]

## Container Metrics

### Metric: `container.cpu.time`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.cpu.time(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.cpu.time` | Counter | `s` | Total CPU time consumed [1] |

**[1]:** Total CPU time consumed by the specific container on all available CPU cores
<!-- endsemconv -->

<!-- semconv metric.container.cpu.time(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`container.cpu.state`](../attributes-registry/container.md) | string | The CPU state for this data point. A container SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. | `user`; `kernel` | Opt-In |

`container.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `user` | When tasks of the cgroup are in user mode (Linux). When all container processes are in user mode (Windows). |
| `system` | When CPU is used by the system (host OS) |
| `kernel` | When tasks of the cgroup are in kernel mode (Linux). When all container processes are in kernel mode (Windows). |
<!-- endsemconv -->
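
As a rough illustration of how the table above could be populated on Linux, the sketch below turns the text of a cgroup v2 `cpu.stat` file into `container.cpu.time` data points. The helper name `cpu_time_points` and the mapping of `system_usec` to the `kernel` state are assumptions made for this example; they are not defined by this convention or by any SDK. Only points carrying a `container.cpu.state` attribute are emitted, in line with the guidance that a container is described either without state attributes or only with them.

```python
# Hypothetical helper for illustration; not part of the conventions or any SDK.
# Assumes cgroup v2 `cpu.stat` text, which reports `user_usec` and `system_usec`.

# cgroup v2 key -> `container.cpu.state` value (assumed mapping:
# `system_usec` is time spent in kernel mode, so it maps to `kernel`).
CGROUP_KEY_TO_STATE = {"user_usec": "user", "system_usec": "kernel"}


def cpu_time_points(cpu_stat_text):
    """Return (seconds, attributes) pairs for `container.cpu.time`."""
    points = []
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        # Skip anything that is not a "<key> <value>" pair we know how to map,
        # including the aggregate `usage_usec` line.
        if len(parts) != 2 or parts[0] not in CGROUP_KEY_TO_STATE:
            continue
        seconds = int(parts[1]) / 1_000_000  # microseconds -> seconds (unit `s`)
        state = CGROUP_KEY_TO_STATE[parts[0]]
        points.append((seconds, {"container.cpu.state": state}))
    return points


sample = "usage_usec 3500000\nuser_usec 2500000\nsystem_usec 1000000\n"
for value, attrs in cpu_time_points(sample):
    print(value, attrs)
```

Because the instrument is a Counter, each value is the cumulative total read from the file rather than a delta.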

### Metric: `container.memory.usage`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.memory.usage(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.memory.usage` | Counter | `By` | Memory usage of the container. [1] |

**[1]:** Memory usage of the container.
<!-- endsemconv -->

<!-- semconv metric.container.memory.usage(full) -->
<!-- endsemconv -->

### Metric: `container.disk.io`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.disk.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.disk.io` | Counter | `By` | Disk bytes for the container. [1] |

**[1]:** The total number of bytes read/written successfully (aggregated from all disks).
<!-- endsemconv -->

<!-- semconv metric.container.disk.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`disk.io.direction`](../attributes-registry/disk.md) | string | The disk IO operation direction. | `read` | Recommended |
| `system.device` | string | The device identifier | `(identifier)` | Recommended |

`disk.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `read` | read |
| `write` | write |
<!-- endsemconv -->
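
One possible way to derive these data points on Linux is to parse the cgroup v2 `io.stat` file, which lists `rbytes` and `wbytes` per block device. The sketch below is only an illustration: the function name `disk_io_points` is hypothetical, and using the `major:minor` device number as the `system.device` value is an assumption of this example, not something the convention prescribes.

```python
# Hypothetical helper for illustration; not part of the conventions or any SDK.
# Assumes cgroup v2 `io.stat` lines such as:
#   "8:0 rbytes=4096 wbytes=8192 rios=1 wios=2 dbytes=0 dios=0"

# io.stat key -> `disk.io.direction` value
KEY_TO_DIRECTION = (("rbytes", "read"), ("wbytes", "write"))


def disk_io_points(io_stat_text):
    """Return (bytes, attributes) pairs for `container.disk.io`."""
    points = []
    for line in io_stat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        device = fields[0]  # "major:minor", used here as `system.device` (assumption)
        counters = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
        for key, direction in KEY_TO_DIRECTION:
            if key in counters:
                attrs = {"system.device": device, "disk.io.direction": direction}
                points.append((int(counters[key]), attrs))
    return points


sample = "8:0 rbytes=4096 wbytes=8192 rios=1 wios=2 dbytes=0 dios=0\n"
for value, attrs in disk_io_points(sample):
    print(value, attrs)
```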

### Metric: `container.network.io`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.network.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.network.io` | Counter | `By` | Network bytes for the container. [1] |

**[1]:** The number of bytes sent/received on all network interfaces by the container.
<!-- endsemconv -->

<!-- semconv metric.container.network.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`network.io.direction`](../attributes-registry/network.md) | string | The network IO operation direction. | `transmit` | Recommended |
| `system.device` | string | The device identifier | `(identifier)` | Recommended |

`network.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `transmit` | transmit |
| `receive` | receive |
<!-- endsemconv -->
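
As a sketch of how these attributes fit together, the example below parses text in the Linux `/proc/net/dev` format, as it might be read from inside a container's network namespace, and produces one `container.network.io` data point per interface and direction. The function name `network_io_points` is hypothetical, and reading `/proc/net/dev` is only one assumed data source among several.

```python
# Hypothetical helper for illustration; not part of the conventions or any SDK.
# Assumes the Linux `/proc/net/dev` layout: two header lines, then one line per
# interface with 8 receive columns followed by 8 transmit columns.

RX_BYTES_INDEX = 0  # first receive column is bytes
TX_BYTES_INDEX = 8  # first transmit column is bytes


def network_io_points(proc_net_dev_text):
    """Return (bytes, attributes) pairs for `container.network.io`."""
    points = []
    for line in proc_net_dev_text.splitlines()[2:]:  # skip the two header lines
        name, sep, rest = line.partition(":")
        values = rest.split()
        if not sep or len(values) <= TX_BYTES_INDEX:
            continue
        device = name.strip()
        for index, direction in ((RX_BYTES_INDEX, "receive"), (TX_BYTES_INDEX, "transmit")):
            attrs = {"system.device": device, "network.io.direction": direction}
            points.append((int(values[index]), attrs))
    return points


sample = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed\n"
    "  eth0: 5000 50 0 0 0 0 0 0 7000 70 0 0 0 0 0 0\n"
)
for value, attrs in network_io_points(sample):
    print(value, attrs)
```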

[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.22.0/specification/document-status.md
[MetricOptIn]: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.26.0/specification/metrics/metric-requirement-level.md#opt-in
53 changes: 53 additions & 0 deletions model/metrics/container.yaml
@@ -0,0 +1,53 @@
groups:
  # container.cpu.* metrics and attribute group
  - id: metric.container.cpu.time
    type: metric
    metric_name: container.cpu.time
    brief: "Total CPU time consumed"
    note: >
      Total CPU time consumed by the specific container on all available CPU cores
    instrument: counter
    unit: "s"
    attributes:
      - ref: container.cpu.state
        brief: "The CPU state for this data point. A container SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels."
        requirement_level: opt_in

  # container.memory.* metrics and attribute group
  - id: metric.container.memory.usage
    type: metric
    metric_name: container.memory.usage
    brief: "Memory usage of the container."
    note: >
      Memory usage of the container.
    instrument: counter
    unit: "By"

  # container.disk.io.* metrics and attribute group
  - id: metric.container.disk.io
    type: metric
    metric_name: container.disk.io
    brief: "Disk bytes for the container."
    note: >
      The total number of bytes read/written
      successfully (aggregated from all disks).
    instrument: counter
    unit: "By"
    attributes:
      - ref: disk.io.direction
      - ref: system.device

  # container.network.io.* metrics and attribute group
  - id: metric.container.network.io
    type: metric
    metric_name: container.network.io
    brief: "Network bytes for the container."
    note: >
      The number of bytes sent/received
      on all network interfaces
      by the container.
    instrument: counter
    unit: "By"
    attributes:
      - ref: network.io.direction
      - ref: system.device
15 changes: 15 additions & 0 deletions model/registry/container.yaml
@@ -95,3 +95,18 @@ groups:
        brief: >
          Container labels, `<key>` being the label name, the value being the label value.
        examples: [ 'container.label.app=nginx' ]
      - id: cpu.state
        brief: "The CPU state for this data point."
        type:
          allow_custom_values: true
          members:
            - id: user
              value: 'user'
              brief: "When tasks of the cgroup are in user mode (Linux). When all container processes are in user mode (Windows)."
            - id: system
              value: 'system'
              brief: "When CPU is used by the system (host OS)"
            - id: kernel
              value: 'kernel'
              brief: "When tasks of the cgroup are in kernel mode (Linux). When all container processes are in kernel mode (Windows)."
        examples: ["user", "kernel"]
