BREAKING: Introduce common url.* attributes, and improve use of namespacing under `http.*` (#3355)
lmolkova authored and AlexanderWert committed Nov 10, 2023
1 parent be88cc9 commit 6f6e042
Showing 3 changed files with 50 additions and 5 deletions.
6 changes: 3 additions & 3 deletions specification/common/attribute-naming.md
@@ -44,7 +44,7 @@ Names SHOULD follow these rules:
purpose should primarily drive the decision about forming nested namespaces.

- For each multi-word dot-delimited component of the attribute name separate the
words by underscores (i.e. use snake_case). For example `http.status_code`
words by underscores (i.e. use snake_case). For example `http.response.status_code`
denotes the status code in the http namespace.

- Names SHOULD NOT coincide with namespaces. For example if
@@ -96,8 +96,8 @@ denote old attribute names in rename operations).
- Semantic conventions exist for four areas: for Resource, Span, Log, and Metric
attribute names. In addition, for spans we have two more areas: Event and Link
attribute names. Identical namespaces or names in all these areas MUST have
identical meanings. For example the `http.method` span attribute name denotes
exactly the same concept as the `http.method` metric attribute, has the same
identical meanings. For example the `http.request.method` span attribute name denotes
exactly the same concept as the `http.request.method` metric attribute, has the same
data type and the same set of possible values (in both cases it records the
value of the HTTP protocol's request method as a string).

4 changes: 2 additions & 2 deletions specification/common/attribute-requirement-level.md
@@ -38,7 +38,7 @@ For example, [Database semantic convention](../trace/semantic_conventions/databa

## Required

All instrumentations MUST populate the attribute. A semantic convention defining a Required attribute expects an absolute majority of instrumentation libraries and applications are able to efficiently retrieve and populate it, and can additionally meet requirements for cardinality, security, and any others specific to the signal defined by the convention. `http.method` is an example of a Required attribute.
All instrumentations MUST populate the attribute. A semantic convention defining a Required attribute expects an absolute majority of instrumentation libraries and applications are able to efficiently retrieve and populate it, and can additionally meet requirements for cardinality, security, and any others specific to the signal defined by the convention. `http.request.method` is an example of a Required attribute.

_Note: Consumers of telemetry can detect if a telemetry item follows a specific semantic convention by checking for the presence of a `Required` attribute defined by such convention. For example, the presence of the `db.system` attribute on a span can be used as an indication that the span follows database semantics._

@@ -71,4 +71,4 @@ Here are several examples of expensive operations to be avoided by default:

- DNS lookups to populate `server.address` when only an IP address is available to the instrumentation. Caching lookup results does not solve the issue for all possible cases and should be avoided by default too.
- forcing an `http.route` calculation before the HTTP framework calculates it
- reading response stream to find `http.response_content_length` when `Content-Length` header is not available
- reading response stream to find `http.response.body.size` when `Content-Length` header is not available
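
The last bullet can be illustrated with a minimal sketch: the body size is taken from the `Content-Length` header when it is present, and the attribute is simply left unset otherwise, so the response stream is never read just for telemetry. The helper name `body_size_from_headers` and the plain-mapping header representation are assumptions made for this illustration; they are not part of the specification or of any particular instrumentation library.

```python
from typing import Mapping, Optional


def body_size_from_headers(headers: Mapping[str, str]) -> Optional[int]:
    """Return a value for `http.response.body.size`, or None if it is not cheaply known.

    Hypothetical helper: it only inspects headers and never reads the body.
    """
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "content-length":
            try:
                size = int(value.strip())
            except ValueError:
                return None  # malformed header: skip the attribute
            return size if size >= 0 else None
    return None  # header absent: do not read the stream to compute the size
```

A caller would set the attribute only when the helper returns a number, and omit it otherwise.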
45 changes: 45 additions & 0 deletions specification/common/url.md
@@ -0,0 +1,45 @@
# Semantic conventions for URL

**Status**: [Experimental](../document-status.md)

This document defines semantic conventions that describe URL and its components.

<details>
<summary>Table of Contents</summary>

<!-- toc -->

- [Attributes](#attributes)
- [Sensitive information](#sensitive-information)

<!-- tocstop -->

</details>

## Attributes

<!-- semconv url -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `https`; `ftp`; `telnet` | Recommended |
| `url.full` | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [1] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `//localhost` | Recommended |
| `url.path` | string | The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component [2] | `/search` | Recommended |
| `url.query` | string | The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [3] | `q=OpenTelemetry` | Recommended |
| `url.fragment` | string | The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component | `SemConv` | Recommended |

**[1]:** For network calls, URL usually has `scheme://host[:port][path][?query][#fragment]` format, where the fragment is not transmitted over HTTP, but if it is known, it should be included nevertheless.
`url.full` MUST NOT contain credentials passed via URL in form of `https://username:password@www.example.com/`. In such case username and password should be redacted and attribute's value should be `https://REDACTED:REDACTED@www.example.com/`.
`url.full` SHOULD capture the absolute URL when it is available (or can be reconstructed) and SHOULD NOT be validated or modified except for sanitizing purposes.

**[2]:** When missing, the value is assumed to be `/`

**[3]:** Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it.
<!-- endsemconv -->
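
As an illustration of the table and notes above, the following sketch derives the `url.*` attributes from a URL string with Python's standard `urllib.parse`. The function name `url_attributes`, the choice to omit empty components, and the handling of a username without a password are assumptions made for this example, not requirements of the convention.

```python
from urllib.parse import urlsplit, urlunsplit


def url_attributes(raw_url: str) -> dict:
    """Build a dict of url.* attributes from a URL string (hypothetical helper)."""
    parts = urlsplit(raw_url)

    full = raw_url  # left unmodified unless sanitizing is needed
    if "@" in parts.netloc:
        # Credentials are present in the user information subcomponent: redact them.
        userinfo, _, hostport = parts.netloc.rpartition("@")
        # Assumption: a lone username is replaced by a single REDACTED marker.
        redacted = "REDACTED:REDACTED" if ":" in userinfo else "REDACTED"
        full = urlunsplit(
            (parts.scheme, f"{redacted}@{hostport}", parts.path, parts.query, parts.fragment)
        )

    attrs = {"url.full": full}
    # Only non-empty components are recorded; a missing path is assumed to be "/"
    # by consumers, per note [2].
    for key, value in (
        ("url.scheme", parts.scheme),
        ("url.path", parts.path),
        ("url.query", parts.query),
        ("url.fragment", parts.fragment),
    ):
        if value:
            attrs[key] = value
    return attrs
```

For example, `url_attributes("https://username:password@www.example.com/")` yields a `url.full` of `https://REDACTED:REDACTED@www.example.com/`. Query scrubbing (note [3]) is not handled by this sketch.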

## Sensitive information

Capturing URL and its components MAY impose security risk. User and password information, when they are provided in [User Information](https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.1) subcomponent, MUST NOT be recorded.

Instrumentations that are aware of specific sensitive query string parameters MUST scrub their values before capturing `url.query` attribute. For example, native instrumentation of a client library that passes credentials or user location in URL, must scrub corresponding properties.

_Note: Applications and telemetry consumers should scrub sensitive information from URL attributes on collected telemetry. In systems unable to identify sensitive information, certain attribute values may be redacted entirely._
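
A minimal sketch of the query scrubbing described in this section, assuming the instrumentation knows which parameter names are sensitive. The names in `SENSITIVE_PARAMS`, the `REDACTED` placeholder, and the function `scrub_query` are hypothetical choices for this example; the convention does not prescribe them.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical list of parameter names this instrumentation knows to be sensitive.
SENSITIVE_PARAMS = frozenset({"password", "token", "api_key"})


def scrub_query(query: str, sensitive=SENSITIVE_PARAMS) -> str:
    """Return the query string with values of known-sensitive parameters replaced."""
    pairs = parse_qsl(query, keep_blank_values=True)
    scrubbed = [
        (name, "REDACTED" if name.lower() in sensitive else value)
        for name, value in pairs
    ]
    # Note: re-encoding may normalize percent-encoding of the remaining values.
    return urlencode(scrubbed)
```

For example, `scrub_query("q=OpenTelemetry&token=abc123")` returns `q=OpenTelemetry&token=REDACTED`, which could then be recorded as `url.query`. Because the query is parsed and re-encoded, an implementation that needs to keep non-sensitive parts byte-for-byte would have to substitute values in place instead.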
