Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compute: feed MV sink frontier from persist write handle #30662

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

teskje
Copy link
Contributor

@teskje teskje commented Nov 29, 2024

This PR changes the way the MV sink produces updates to the shared sink frontier. Previously it would inspect the persist stream and forward that stream's frontier. The issue with this approach is that, in the face of ingestion spikes, the persist stream could fall behind, which would also make the MV sink's reported write frontier fall behind.

This led to a regression in the CrossJoin feature benchmark. In this benchmark, a large constant MV is created. As soon as the MV dataflow has produced its snapshot and has reported the empty frontier, the controller instructs the replica to drop it. If the reporting of the empty frontier is delayed because the MV sink is busy reading back the entire snapshot from persist, the controller's drop command also gets delayed. The result is that queries against the cluster get slowed down by the MV dataflow's processing, more than they would have been had the MV dataflow been dropped quickly.

Attaching the shared write frontier to the WriteHandle-based frontier stream used by the mint operator avoids the issue.

This PR also makes two minor changes to the MV sink (commits 1 and 3):

  • adjusts trace logging to include the sink ID
  • re-applies changes I originally made to address Dan's review comments in compute: MV sink refresh #26745 but then lost in a rebase

Motivation

  • This PR fixes a recognized bug.

Fixes https://github.com/MaterializeInc/database-issues/issues/8795

Tips for reviewer

Commits are meaningful and can be reviewed separately.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@teskje teskje marked this pull request as ready for review November 30, 2024 08:15
@teskje teskje requested a review from a team as a code owner November 30, 2024 08:15
@teskje teskje requested a review from antiguru November 30, 2024 08:16
teskje and others added 4 commits November 30, 2024 12:21
This commit improves the trace logging in the MV sink v2 by adding a
`sink_id` field.
This commit changes the way the MV sink produces updates to the shared
sink frontier. Previously it would inspect the `persist` stream and
forward that stream's frontier. The issue with this approach is that, in
the face of ingestion spikes, the `persist` stream could fall behind,
which would also make the MV sink's reported write frontier fall behind.

This lead to a regression in the `CrossJoin` feature benchmark. In this
benchmark, a large constant MV is created. As soon as the MV dataflow
has produced its snapshot and has reported the empty frontier, the
controller instructs the replica to drop it. If the reporting of the
empty frontier is delayed because the MV sink is busy reading back the
entire snapshot from persist, the controller's drop command also gets
delayed. The result is that queries against the cluster get slowed down
by the MV dataflow's processing, more than they would have been had the
MV dataflow been dropped quickly.

Attaching the shared write frontier to the `WriteHandle`-based frontier
stream used by the `mint` operator avoids the issue.
This commit re-applies three minor changes that were the result of Dan's
review comments on MaterializeInc#26745 but that somehow got lost in a rebase before
that PR merged.

* Replace `debug_assert`s with `assert`s.
* Don't unnecessarily box operator tokens.
* Add a clarifying comment about valid batch descriptions.
@teskje teskje force-pushed the faster-mv-sink-frontier-reporting branch from 234d320 to ec965a9 Compare November 30, 2024 11:22
@teskje
Copy link
Contributor Author

teskje commented Dec 1, 2024

All nightlies failures are unrelated, afaict. In particular the upsert test failure also occurred in https://buildkite.com/materialize/nightly/builds/10582#_, which ran on a different PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants