From a3017c26de9ee2014a0c99cced74fb83be399ecd Mon Sep 17 00:00:00 2001
From: Liudmila Molkova
Date: Fri, 20 Dec 2024 14:46:51 -0800
Subject: [PATCH] clean up

---
 .../how-to-define-semantic-conventions.md | 108 +++++++++---------
 1 file changed, 56 insertions(+), 52 deletions(-)

diff --git a/docs/general/how-to-define-semantic-conventions.md b/docs/general/how-to-define-semantic-conventions.md
index 0395913c88..a62e2004f0 100644
--- a/docs/general/how-to-define-semantic-conventions.md
+++ b/docs/general/how-to-define-semantic-conventions.md
@@ -30,7 +30,7 @@ for the new areas or make substantial changes to the existing ones.
 - New conventions SHOULD be defined in YAML files. See [YAML Model for Semantic Conventions](/model/README.md)
   for the details.
 - New conventions SHOULD be defined with `development` stability level.
-- New conventions SHOULD include attributes and telemetry signal definitions (spans, metrics, events, resources, profiles).
+- New conventions SHOULD include telemetry signal definitions (spans, metrics, events, resources, profiles) and MAY include new attribute definitions.
 
 ### Best practices
 
@@ -38,25 +38,28 @@ for the new areas or make substantial changes to the existing ones.
 
 Reuse existing attributes when possible. Look through [existing conventions](/docs/attributes-registry/) for similar areas,
 check out [common attributes](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/attributes.md).
-Semantic conventions are encouraged to use attributes from different namespaces.
+Semantic conventions authors are encouraged to use attributes from different namespaces.
 
-Introduce new attributes when there is a clear use-case for them. Check if the most of the following applies:
+Introduce new attributes when there is a clear use-case for them. Consider adding
+them if most of the following apply:
 
-- you see a clear benefit for the end users to have it on their telemetry
-- you're going to use this attribute on any spans, metrics, events, resources, or other telemetry signals
-- you're going to use this attribute in instrumentations
+- They provide a clear benefit to end users by enhancing telemetry.
+- You plan to use the attribute in spans, metrics, events, resources, or other telemetry signals.
+- The attribute will be utilized in instrumentations.
 
-Postpone adding new attributes if it's not yet clear how beneficial having it on the telemetry is.
+Postpone adding new attributes if their benefit to telemetry is not yet clear.
 
-When defining a new attribute
+When defining a new attribute:
 
-- follow the [naming guidance](/docs/general/naming.md)
-- make sure to provide descriptive `brief` and `note` - it should be clear what this attribute represents.
-  - If it represents some common concept documented externally, make sure to provide links. For example,
-    always provide links to attributes describing notions defined in RFCs or other standards.
-  - If attribute value is likely to contain PII or other sensitive information, make sure to capture it in the `note`.
+- Follow the [naming guidance](/docs/general/naming.md)
+- Provide descriptive `brief` and `note` sections to clearly explain what the attribute represents.
+  - If the attribute represents a common concept documented externally, include relevant links.
+    For example, always link to attributes related to concepts defined in RFCs or other standards.
+  - If the attribute's value might contain PII or other sensitive information, explicitly call this out in
+    the `note`.
+
+    Include a warning similar to the following:
 
-    Include the following warning
     ```yaml
     - id: user.full_name
       ...
@@ -65,53 +68,54 @@ When defining a new attribute
 
         > [!WARNING]
         >
-        > This field contains sensitive (PII) information.
+        > This attribute contains sensitive (PII) information.
     ```
-- use appropriate [attribute type](https://github.com/open-telemetry/weaver/blob/main/schemas/semconv-syntax.md#type)
-  - If value has a reasonably short (open or closed) set of possible values, it should be an enum.
-  - If value is a timestamp, it should be recorded as a string in ISO 8601 format.
-  - If value is an array of primitives, use array type. Avoid recording arrays as a string
-  - Use template type to define attributes with variable name (only the last segment of the name is dynamic). It's
-    useful to record user-defined set of key=value pairs such as HTTP headers.
-  - Capture complex values as a set of flat attributes.
-- new attributes should always be defined with `development` stability
-- provide realistic examples
-- Avoid defining attributes with potentially unbound values. For example, strings that are longer than 1KB
-  or arrays with more than a thousand elements. Such value should be recoded in log/event body instead.
-
-Consider the scope attribute should be applicable in and how it may evolve in the future
-
-- when defining an attribute for a narrow use-case, consider other possible use-cases.
-  For example, when defining system-specific attribute, check if other systems in this domain would need
-  a similar attribute in the future.
-  Or, when defining a boolean flag such as `foo.is_error`, consider if you can represent it, along with
-  additional details, in a more extensible way, for example, with `foo.status_code` attribute.
-
-- when defining a very broad attribute applicable to multiple domains or systems, check if there are
-  standards or common best practices in the industry to rely on.
-  Avoid defining generic attributes that are not grounded by some existing standard.
+- Use the appropriate [attribute type](https://github.com/open-telemetry/weaver/blob/main/schemas/semconv-syntax.md#type)
+  - If the value has a reasonably short (open or closed) set of possible values, define it as an enum.
+  - If the value is a timestamp, record it as a string in ISO 8601 format.
+  - For arrays of primitives, use the array type. Avoid recording arrays as a single string.
+  - Use the template type to define attributes with variable names (only the last segment of the name should be dynamic).
+    This is useful for capturing user-defined key-value pairs, such as HTTP headers.
+  - Represent complex values as a set of flat attributes.
+- Define new attributes with `development` stability.
+- Provide realistic examples
+- Avoid defining attributes with potentially unbounded values, such as strings longer than
+  1 KB or arrays with more than 1,000 elements. Such values should be recorded in the log or event body instead.
+
+Consider the scope of the attribute and how it may evolve in the future:
+
+- When defining an attribute for a narrow use case, think about potential broader use cases.
+  For example, if creating a system-specific attribute, evaluate whether other systems
+  in the same domain might need a similar attribute in the future.
+
+  Similarly, instead of defining a simple boolean flag like `foo.is_error`, consider a
+  more extensible approach, such as using a `foo.status_code` attribute to include additional details.
+
+- When defining a broad attribute applicable across multiple domains or systems,
+  check for existing standards or widely accepted best practices in the industry.
+  Avoid creating generic attributes that are not based on established standards.
 
 > [!NOTE]
 >
-> When defining conventions for an area with multiple implementations or systems such as databases, identity providers,
-> or cloud providers it takes some time to find the right balance between being overly generic vs not generic enough.
+> When defining conventions for areas with multiple implementations or systems — such as databases,
+> or cloud providers — it can take time to strike the right balance between being
+> overly generic and not generic enough.
 >
-> It's essential to start with experimental conventions, document how these conventions apply to a diverse set
-> of provides/systems/libraries, and prototype instrumentations.
+> Start with experimental conventions, document how they apply to a diverse range
+> of providers, systems, or libraries, and prototype instrumentations.
 >
-> The end-user experience should be used as the main guiding principle:
+> The end-user experience should serve as the primary guiding principle:
 >
-> - if the attribute is expected to be used on general-purpose metrics for this area,
->   consider introducing common attribute.
+> - If the attribute is expected to be used in general-purpose metrics for the area,
+>   consider introducing a common attribute.
 >
->   For example, almost every messaging system has a notion of queue or topic. The
->   queue or topic name is essential on latency or throughput metrics and equally
->   important on spans to debug and visualize message flow. This is a good sign
->   that we need a generic attribute that represents any type of messaging destination.
+>   For example, most messaging systems have a concept like a queue or topic.
+>   Queue or topic names are critical for latency and throughput metrics and
+>   equally important for spans to debug and visualize message flow.
+>   This indicates the need for a generic attribute representing any type of messaging destination.
 >
-> - if the attribute represents something that would be useful in a narrow set of scenarios
->   or only on a system-specific metrics/spans/events, it's usually a sign that this
->   attribute does not need to be generic.
+> - If the attribute represents something useful in a narrow set of scenarios or
+>   is specific to certain system metrics, spans, or events, it likely does not need to be generic.
 
 #### Defining spans
 