[processor/deltatocumulative] enhancements for slow-moving ("sparse") counters #36485
Comments
hi! thanks for opening this issue! I have a lot of ideas for this and will write them down when I find the time. In the meantime, wdyt about removing the "new component" from this? I'm fairly sure we can find a place within deltatocumulative / interval for the needed functionality, especially because it sounds rather common and likely happens for a lot of users. New component issues are afaict concrete proposals and we are not quite there yet :)
Done. I agree, but created it that way since it was suggested to use the "proposal" template. Can you please update the labels as needed? I don't have access to add/remove labels.
My vote would be to extend
For 1 (the current behavior), we accumulate datapoints over the interval, replacing older matching datapoints with newer ones. Then at the interval time, we send all our state downstream, and clear our state to empty. Repeat.
For 2, we would again accumulate datapoints over the interval, replacing older matching datapoints with newer ones. At the interval time, we would again send all our state downstream, but we wouldn't clear our state to empty. I.e., we continue re-exporting our state at each interval time. However, this risks OOM, so we would also need a mechanism to remove datapoints that haven't been updated in X time. For example, if a data series hasn't been updated in an hour, remove it.
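To make the two modes above concrete, here is a minimal Python sketch of the state handling. It is a toy model, not collector code: `IntervalState`, `clear_on_flush` and `max_stale_seconds` are hypothetical names chosen for illustration, and series are keyed by a plain string.

```python
from dataclasses import dataclass, field


@dataclass
class IntervalState:
    """Toy model of the two flush modes (hypothetical, not collector code)."""
    clear_on_flush: bool        # True = mode 1, False = mode 2
    max_stale_seconds: float    # hypothetical expiry setting, used in mode 2 only
    # series key -> (latest value, time of last update)
    points: dict = field(default_factory=dict)

    def update(self, key, value, now):
        # A newer datapoint replaces the older one for the same series.
        self.points[key] = (value, now)

    def flush(self, now):
        if not self.clear_on_flush:
            # Mode 2: drop series that have not been updated recently,
            # so the retained state cannot grow without bound.
            self.points = {
                k: (v, t) for k, (v, t) in self.points.items()
                if now - t <= self.max_stale_seconds
            }
        exported = {k: v for k, (v, _) in self.points.items()}
        if self.clear_on_flush:
            # Mode 1: start the next interval with empty state.
            self.points = {}
        return exported


state = IntervalState(clear_on_flush=False, max_stale_seconds=3600)
state.update("requests_total", 5, now=0)
print(state.flush(now=60))    # series is exported
print(state.flush(now=120))   # same series is re-exported without a new update
print(state.flush(now=3720))  # series is older than one hour and has been dropped
```

The expiry check runs before the export, so a series that has gone stale is never re-sent once it passes the threshold.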
We are using the deltatocumulative processor in production but are facing issues with slow-moving (or "sparse") counters.
Current behavior
In the current implementation, the processor only emits the cumulative when an upstream delta is received. If a counter has not been incremented for several reporting intervals this means that no datapoint is emitted downstream for those intervening time windows. Contrast this with a true cumulative counter in Prometheus, where an unchanged value will be sampled on each successive scrape and emitted downstream.
This behavior causes issues when trying to use rate() and increase() in PromQL since there is no previous datapoint within the standard 5 minute lookback window to compare with.
Desired behavior
This could be addressed by an alternative implementation where the cumulative datapoints were instead flushed periodically from a background thread on a fixed interval. This would have the benefit of continuing to emit cumulative counters that have not been recently incremented, but are not yet stale.
Stale timeseries would still be expired much like the current implementation.
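The difference between the two behaviors can be illustrated with a small simulation of sample timestamps. This is a simplified sketch: the function names are hypothetical, times are plain seconds, staleness expiry is ignored, and the window check only models the fact that rate() and increase() need at least two samples inside the queried range.

```python
def emit_on_delta(delta_times):
    """Current behavior: one cumulative sample per received delta."""
    return sorted(delta_times)


def emit_on_interval(delta_times, interval, until):
    """Desired behavior: one sample per tick once the series exists."""
    first = min(delta_times)
    # First tick at or after the first delta (ceiling division).
    start = -(-first // interval) * interval
    return list(range(start, until + 1, interval))


def has_two_samples(sample_times, window_end, window):
    """True if at least two samples fall inside (window_end - window, window_end]."""
    in_window = [t for t in sample_times if window_end - window < t <= window_end]
    return len(in_window) >= 2


# A sparse counter incremented at t=0s and again 30 minutes later.
deltas = [0, 1800]

# Queried at t=1800s with a 5 minute range:
print(has_two_samples(emit_on_delta(deltas), 1800, 300))
print(has_two_samples(emit_on_interval(deltas, 60, 1800), 1800, 300))
```

With emit-on-delta only the sample at t=1800 falls inside the window, so there is nothing to compare against. With a 60 second flush the window holds several samples and the increment becomes visible.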
There are several ways to incorporate this into the existing codebase:
- deltatocumulativeprocessor
- intervalprocessor
- a new processor (deltatocumulativeasyncprocessor?)
The implementation can get fairly tricky because you'd need to retain the resource/scope attributes, etc. from the original metric.
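One way to think about the resource/scope problem is to make that context part of the key under which each cumulative value is stored, so it can be rebuilt when flushing later. The sketch below is a hypothetical Python model, not how the collector represents metrics: `StreamKey`, `CumulativeStore` and `freeze` are made-up names, and attribute values are assumed to be plain strings.

```python
from dataclasses import dataclass


def freeze(attrs):
    """Turn an attribute dict into a hashable, order-independent tuple."""
    return tuple(sorted(attrs.items()))


@dataclass(frozen=True)
class StreamKey:
    resource: tuple     # frozen resource attributes
    scope: tuple        # (scope name, scope version)
    metric: str         # metric name
    attributes: tuple   # frozen datapoint attributes


class CumulativeStore:
    def __init__(self):
        self.totals = {}

    def add(self, resource, scope, metric, attributes, delta):
        # The full context is part of the key, so nothing is lost
        # between receiving a delta and a later periodic flush.
        key = StreamKey(freeze(resource), scope, metric, freeze(attributes))
        self.totals[key] = self.totals.get(key, 0) + delta
        return key

    def snapshot(self):
        """Rebuild (resource, scope, metric, attributes, value) rows for export."""
        return [
            (dict(k.resource), k.scope, k.metric, dict(k.attributes), v)
            for k, v in self.totals.items()
        ]


store = CumulativeStore()
store.add({"service.name": "a"}, ("lib", "1.0"), "requests", {"code": "200"}, 2)
store.add({"service.name": "a"}, ("lib", "1.0"), "requests", {"code": "200"}, 3)
store.add({"service.name": "b"}, ("lib", "1.0"), "requests", {"code": "200"}, 1)
for row in store.snapshot():
    print(row)
```

Two streams that differ only in their resource attributes stay separate, and each exported row carries enough context to be placed back under the right resource and scope.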
Example configuration
Telemetry data types supported
Metrics
Code Owner(s)
@RichieSams @sh0rez
Sponsor (optional)
@RichieSams @sh0rez
Additional context
CNCF Slack Thread