clean up
lmolkova committed Dec 20, 2024
1 parent b8607f2 commit a3017c2
Showing 1 changed file with 56 additions and 52 deletions.
108 changes: 56 additions & 52 deletions docs/general/how-to-define-semantic-conventions.md
@@ -30,33 +30,36 @@ for the new areas or make substantial changes to the existing ones.
<!-- TODO: add CI check for CODEOWNERS file (when a new area is added) -->
- New conventions SHOULD be defined in YAML files. See [YAML Model for Semantic Conventions](/model/README.md) for the details.
- New conventions SHOULD be defined with `development` stability level.
- New conventions SHOULD include attributes and telemetry signal definitions (spans, metrics, events, resources, profiles).
- New conventions SHOULD include telemetry signal definitions (spans, metrics, events, resources, profiles) and MAY include new attribute definitions.

### Best practices

#### Defining attributes

Reuse existing attributes when possible. Look through [existing conventions](/docs/attributes-registry/) for similar areas,
check out [common attributes](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/attributes.md).
Semantic conventions are encouraged to use attributes from different namespaces.
Semantic conventions authors are encouraged to use attributes from different namespaces.

Introduce new attributes when there is a clear use-case for them. Check if the most of the following applies:
Introduce new attributes when there is a clear use-case for them. Consider adding
them if most of the following apply:

- you see a clear benefit for the end users to have it on their telemetry
- you're going to use this attribute on any spans, metrics, events, resources, or other telemetry signals
- you're going to use this attribute in instrumentations
- They provide a clear benefit to end users by enhancing telemetry.
- You plan to use the attribute in spans, metrics, events, resources, or other telemetry signals.
- The attribute will be utilized in instrumentations.

Postpone adding new attributes if it's not yet clear how beneficial having it on the telemetry is.
Postpone adding new attributes if their benefit to telemetry is not yet clear.

When defining a new attribute
When defining a new attribute:

- follow the [naming guidance](/docs/general/naming.md)
- make sure to provide descriptive `brief` and `note` - it should be clear what this attribute represents.
- If it represents some common concept documented externally, make sure to provide links. For example,
always provide links to attributes describing notions defined in RFCs or other standards.
- If attribute value is likely to contain PII or other sensitive information, make sure to capture it in the `note`.
- Follow the [naming guidance](/docs/general/naming.md)
- Provide descriptive `brief` and `note` sections to clearly explain what the attribute represents.
- If the attribute represents a common concept documented externally, include relevant links.
For example, always link to attributes related to concepts defined in RFCs or other standards.
- If the attribute's value might contain PII or other sensitive information, explicitly call this out in
the `note`.

Include a warning similar to the following: <!-- TODO: update existing semconv -->

Include the following warning <!-- TODO: update existing semconv -->
```yaml
- id: user.full_name
...
@@ -65,53 +68,54 @@ When defining a new attribute
> [!WARNING]
>
> This field contains sensitive (PII) information.
> This attribute contains sensitive (PII) information.
```
- use appropriate [attribute type](https://github.com/open-telemetry/weaver/blob/main/schemas/semconv-syntax.md#type)
- If value has a reasonably short (open or closed) set of possible values, it should be an enum.
- If value is a timestamp, it should be recorded as a string in ISO 8601 format.
- If value is an array of primitives, use array type. Avoid recording arrays as a string
- Use template type to define attributes with variable name (only the last segment of the name is dynamic). It's
useful to record user-defined set of key=value pairs such as HTTP headers.
- Capture complex values as a set of flat attributes. <!-- This may change, check out https://github.com/open-telemetry/semantic-conventions/issues/1669 to monitor the progress -->
- new attributes should always be defined with `development` stability
- provide realistic examples
- Avoid defining attributes with potentially unbound values. For example, strings that are longer than 1KB
or arrays with more than a thousand elements. Such value should be recoded in log/event body instead.

Consider the scope attribute should be applicable in and how it may evolve in the future

- when defining an attribute for a narrow use-case, consider other possible use-cases.
For example, when defining system-specific attribute, check if other systems in this domain would need
a similar attribute in the future.
Or, when defining a boolean flag such as `foo.is_error`, consider if you can represent it, along with
additional details, in a more extensible way, for example, with `foo.status_code` attribute.

- when defining a very broad attribute applicable to multiple domains or systems, check if there are
standards or common best practices in the industry to rely on.
Avoid defining generic attributes that are not grounded by some existing standard.
- Use the appropriate [attribute type](https://github.com/open-telemetry/weaver/blob/main/schemas/semconv-syntax.md#type)
- If the value has a reasonably short (open or closed) set of possible values, define it as an enum.
- If the value is a timestamp, record it as a string in ISO 8601 format.
- For arrays of primitives, use the array type. Avoid recording arrays as a single string.
- Use the template type to define attributes with variable names (only the last segment of the name should be dynamic).
This is useful for capturing user-defined key-value pairs, such as HTTP headers.
- Represent complex values as a set of flat attributes. <!-- This may change, check out https://github.com/open-telemetry/semantic-conventions/issues/1669 to monitor the progress -->
- Define new attributes with `development` stability.
- Provide realistic examples
- Avoid defining attributes with potentially unbounded values, such as strings longer than
1 KB or arrays with more than 1,000 elements. Such values should be recorded in the log or event body instead.
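
The type guidance above can be illustrated with a minimal sketch of a registry group. The `foo.*` group and attribute names are hypothetical and exist only for illustration; the field layout follows the semconv YAML syntax linked above as best understood, so check it against that reference before reusing it.

```yaml
groups:
  - id: registry.foo            # hypothetical group, not part of the real registry
    type: attribute_group
    brief: Hypothetical attributes illustrating the type guidance.
    attributes:
      # Array of primitives: use an array type instead of a delimited string.
      - id: foo.request.tags
        type: string[]
        stability: development
        brief: Tags attached to the request.
        examples: [["batch", "retry"]]
      # Template type: only the last segment of the attribute name is dynamic.
      - id: foo.request.header
        type: template[string[]]
        stability: development
        brief: Foo request headers, `<key>` being the normalized header name.
        examples: [["application/json"]]
      # Timestamp: recorded as a string in ISO 8601 format.
      - id: foo.request.created_at
        type: string
        stability: development
        brief: Time when the request was created, in ISO 8601 format.
        examples: ["2024-12-20T10:15:30Z"]
```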

Consider the scope of the attribute and how it may evolve in the future:

- When defining an attribute for a narrow use case, think about potential broader use cases.
For example, if creating a system-specific attribute, evaluate whether other systems
in the same domain might need a similar attribute in the future.

Similarly, instead of defining a simple boolean flag like `foo.is_error`, consider a
more extensible approach, such as using a `foo.status_code` attribute to include additional details.

- When defining a broad attribute applicable across multiple domains or systems,
check for existing standards or widely accepted best practices in the industry.
Avoid creating generic attributes that are not based on established standards.
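
As a rough sketch of the more extensible alternative to a boolean flag, the hypothetical `foo.status_code` attribute could be modeled as an enum. The member names and values are assumptions chosen for illustration and are not taken from any existing convention.

```yaml
groups:
  - id: registry.foo            # hypothetical group used for illustration only
    type: attribute_group
    brief: Hypothetical attributes illustrating an extensible status attribute.
    attributes:
      - id: foo.status_code
        stability: development
        brief: Outcome of the foo operation.
        # An enum can gain members later; a flag like `foo.is_error` cannot.
        type:
          members:
            - id: ok
              value: "ok"
              brief: The operation completed successfully.
              stability: development
            - id: timeout
              value: "timeout"
              brief: The operation did not complete in time.
              stability: development
            - id: rejected
              value: "rejected"
              brief: The operation was rejected by the remote system.
              stability: development
        examples: ["ok", "timeout"]
```

With this shape, consumers can still derive an error flag from the value, and new outcomes can be added without introducing another attribute.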

> [!NOTE]
>
> When defining conventions for an area with multiple implementations or systems such as databases, identity providers,
> or cloud providers it takes some time to find the right balance between being overly generic vs not generic enough.
> When defining conventions for areas with multiple implementations or systems — such as databases,
> or cloud providers — it can take time to strike the right balance between being
> overly generic and not generic enough.
>
> It's essential to start with experimental conventions, document how these conventions apply to a diverse set
> of provides/systems/libraries, and prototype instrumentations.
> Start with experimental conventions, document how they apply to a diverse range
> of providers, systems, or libraries, and prototype instrumentations.
>
> The end-user experience should be used as the main guiding principle:
> The end-user experience should serve as the primary guiding principle:
>
> - if the attribute is expected to be used on general-purpose metrics for this area,
> consider introducing common attribute.
> - If the attribute is expected to be used in general-purpose metrics for the area,
> consider introducing a common attribute.
>
> For example, almost every messaging system has a notion of queue or topic. The
> queue or topic name is essential on latency or throughput metrics and equally
> important on spans to debug and visualize message flow. This is a good sign
> that we need a generic attribute that represents any type of messaging destination.
> For example, most messaging systems have a concept like a queue or topic.
> Queue or topic names are critical for latency and throughput metrics and
> equally important for spans to debug and visualize message flow.
> This indicates the need for a generic attribute representing any type of messaging destination.
>
> - if the attribute represents something that would be useful in a narrow set of scenarios
> or only on a system-specific metrics/spans/events, it's usually a sign that this
> attribute does not need to be generic.
> - If the attribute represents something useful in a narrow set of scenarios or
> is specific to certain system metrics, spans, or events, it likely does not need to be generic.
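
The messaging example above might look like the following sketch, in which one generic attribute is referenced from both a metric and a span. `messaging.destination.name` is an existing registry attribute; the `foo` metric and span groups are hypothetical, and the exact set of required fields is an assumption to verify against the YAML model documentation.

```yaml
groups:
  # Hypothetical general-purpose metric that reuses the generic attribute.
  - id: metric.foo.client.operation.duration
    type: metric
    metric_name: foo.client.operation.duration
    stability: development
    brief: Duration of foo client operations.
    instrument: histogram
    unit: "s"
    attributes:
      - ref: messaging.destination.name
        requirement_level: recommended

  # Hypothetical span that references the same attribute to trace message flow.
  - id: span.foo.client.publish
    type: span
    span_kind: producer
    stability: development
    brief: Describes a foo publish operation.
    attributes:
      - ref: messaging.destination.name
        requirement_level: recommended
```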

#### Defining spans
