Tracking: Visualize stream graph bottleneck #18176

Closed
6 of 11 tasks
kwannoel opened this issue Aug 22, 2024 · 0 comments

kwannoel commented Aug 22, 2024

Tracking for: #13481

  • Fix the output blocking ratio metrics. See feat(dashboard): visualize average backpressure rather than spot backpressure #18219.
    • format!("avg(rate(stream_actor_output_buffer_blocking_duration_ns{{{}}}[60s])) by (fragment_id, downstream_fragment_id) / 1000000000", srv.prometheus_selector);
    • let start_time = Instant::now();
      dispatcher.dispatch_data(chunk.clone()).await?;
      dispatcher
          .actor_output_buffer_blocking_duration_ns
          .inc_by(start_time.elapsed().as_nanos() as u64);
    • Perhaps we can treat the "no metrics" case as 100% blocking.
    • We must take care to handle the case where very little data is flowing through the graph.
  • dashboard: fetch fragment graph for each streaming job on demand #17510; perf(dashboard): only fetch ids and specified fragments for a relation #18272
  • Support a DDL-level graph with output blocking ratio. This is basically the same as the fragment graph. We can calculate the output blocking ratio from the dispatcher-side metrics on the edges between MVs.
  • Support a DDL-level graph per schema, with output blocking ratio. This is basically the same as the fragment graph. We can calculate the output blocking ratio from the dispatcher-side metrics on the edges between MVs.
  • Support throughput in the stream graph. We can track the maximum throughput seen so far, treat it as the top of the scale, and color-code every edge in the graph relative to that maximum.
  • Support moving from an MV into its fragment graph.
  • Verify it works for join amplification.
  • Verify it works for dirty agg group scenario.
  • Migrate the cloud dashboard.
  • Prometheus data source.
  • Support backpressure graph for creating MVs.
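
A minimal Rust sketch of two calculations from the list above: the averaged output blocking ratio for one edge, and scaling edge colors against the maximum throughput seen so far. `CounterSample`, `blocking_ratio`, and `ThroughputScale` are hypothetical names for illustration, not existing RisingWave or dashboard APIs. It assumes two samples of the cumulative blocking counter per edge and follows the tentative idea above of treating missing metrics as 100% blocking.

```rust
/// Hypothetical sample of the cumulative blocking counter for one edge
/// (fragment_id -> downstream_fragment_id).
#[derive(Clone, Copy)]
struct CounterSample {
    /// Cumulative time spent blocked on output, in nanoseconds.
    blocked_ns: u64,
    /// Time at which the sample was taken, in nanoseconds.
    at_ns: u64,
}

/// Fraction of the window between two samples spent blocked, in [0, 1].
/// This mirrors rate(..._ns[window]) / 1e9: blocked nanoseconds per
/// elapsed nanosecond. Assumption: a missing sample, or an empty window,
/// is reported as fully blocked (1.0).
fn blocking_ratio(prev: Option<CounterSample>, curr: Option<CounterSample>) -> f64 {
    match (prev, curr) {
        (Some(p), Some(c)) if c.at_ns > p.at_ns => {
            // saturating_sub guards against a counter reset between samples.
            let blocked = c.blocked_ns.saturating_sub(p.blocked_ns) as f64;
            let window = (c.at_ns - p.at_ns) as f64;
            (blocked / window).clamp(0.0, 1.0)
        }
        _ => 1.0,
    }
}

/// Tracks the maximum throughput seen so far and maps each edge's
/// throughput to a color intensity in [0, 1] relative to that maximum.
struct ThroughputScale {
    max_seen: f64,
}

impl ThroughputScale {
    fn new() -> Self {
        Self { max_seen: 0.0 }
    }

    /// Updates the running maximum, then returns throughput / max_seen.
    /// Returns 0.0 while no positive throughput has been observed.
    fn intensity(&mut self, throughput: f64) -> f64 {
        if throughput > self.max_seen {
            self.max_seen = throughput;
        }
        if self.max_seen > 0.0 {
            throughput / self.max_seen
        } else {
            0.0
        }
    }
}
```

For example, an edge whose counter advances by 30 s of blocked time over a 60 s window gets a ratio of 0.5. An edge carrying very little data can show a low ratio without being healthy, so the ratio alone does not settle the low-traffic case raised above.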