KafkaSinkCluster - routing error handling #1659

Merged
merged 3 commits into from
Jun 16, 2024

@rukai rukai commented Jun 12, 2024

A Kafka cluster has a lot of state that shotover must keep track of for routing purposes, such as which brokers hold which partitions.
Currently shotover populates its internal records of this state but has no way to invalidate them.
This PR implements the missing invalidation.

To handle changes to the cluster, we need to handle routing errors, since these indicate that the cluster has changed.

This PR handles the 3 kinds of routing errors as follows:

  • NOT_CONTROLLER: set controller_broker to BrokerId(-1)
  • NOT_COORDINATOR: remove the group from group_to_coordinator_broker
  • NOT_LEADER_OR_FOLLOWER:
    • remove the topic from topic_by_name and topic_by_id
    • alternatively, if supported by this API version, immediately update topic_by_name/topic_by_id as per KIP-951:
      • if the produce response is NOT_LEADER_OR_FOLLOWER and includes a newer leader epoch, update the topic entry with the provided broker ids
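
As a rough illustration of the three cases above, here is a minimal sketch. It is not the code from this PR: the field names `controller_broker`, `group_to_coordinator_broker` and `topic_by_name` come from the description, but the `RoutingState`, `Topic`, `Partition` and `NewLeader` types and the `on_*_error` methods are hypothetical stand-ins, `topic_by_id` is omitted for brevity, and falling back to removal when the leader hint is not newer is an assumption. The numeric values are the standard Kafka protocol error codes.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct BrokerId(pub i32);

// Kafka protocol error codes for the three routing errors.
pub const NOT_LEADER_OR_FOLLOWER: i16 = 6;
pub const NOT_COORDINATOR: i16 = 16;
pub const NOT_CONTROLLER: i16 = 41;

#[derive(Debug, Clone, PartialEq)]
pub struct Partition {
    pub leader_id: BrokerId,
    pub leader_epoch: i32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Topic {
    pub partitions: Vec<Partition>,
}

/// Hypothetical stand-in for the routing state described in this PR.
pub struct RoutingState {
    pub controller_broker: BrokerId,
    pub group_to_coordinator_broker: HashMap<String, BrokerId>,
    pub topic_by_name: HashMap<String, Topic>,
}

/// Hypothetical shape of the KIP-951 leader hint carried by a produce response.
pub struct NewLeader {
    pub partition_index: usize,
    pub leader_id: BrokerId,
    pub leader_epoch: i32,
}

impl RoutingState {
    /// NOT_CONTROLLER: forget the controller by resetting it to BrokerId(-1).
    pub fn on_controller_error(&mut self, error_code: i16) {
        if error_code == NOT_CONTROLLER {
            self.controller_broker = BrokerId(-1);
        }
    }

    /// NOT_COORDINATOR: forget which broker coordinates this group.
    pub fn on_coordinator_error(&mut self, group: &str, error_code: i16) {
        if error_code == NOT_COORDINATOR {
            self.group_to_coordinator_broker.remove(group);
        }
    }

    /// NOT_LEADER_OR_FOLLOWER: update in place if a newer leader epoch is
    /// provided, otherwise drop the topic entry (assumed fallback).
    pub fn on_leader_error(&mut self, topic: &str, error_code: i16, hint: Option<NewLeader>) {
        if error_code != NOT_LEADER_OR_FOLLOWER {
            return;
        }
        if let Some(hint) = hint {
            let partition = self
                .topic_by_name
                .get_mut(topic)
                .and_then(|t| t.partitions.get_mut(hint.partition_index));
            if let Some(partition) = partition {
                if hint.leader_epoch > partition.leader_epoch {
                    partition.leader_id = hint.leader_id;
                    partition.leader_epoch = hint.leader_epoch;
                    return;
                }
            }
        }
        self.topic_by_name.remove(topic);
    }
}
```

In this sketch a removed entry simply forces the next request for that topic or group to look the routing information up again, while the KIP-951 path avoids that extra lookup when the response already names the new leader.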

This PR ensures that every request type we route also has a response handler that invalidates the routing state when we get a routing error.
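
One way to keep that guarantee from regressing is to lean on an exhaustive `match`, so that adding a routed request type without deciding how its routing error is handled fails to compile. The sketch below only illustrates that idea and is not how this PR is structured: `RoutedRequest`, `Invalidation` and `invalidation_for` are hypothetical names, and the request types listed are just a sample of Kafka requests that are sent to a partition leader, a group coordinator, or the controller.

```rust
/// Hypothetical sample of request types that need routing.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RoutedRequest {
    Produce,
    Fetch,
    Heartbeat,
    OffsetCommit,
    CreateTopics,
}

/// Which piece of routing state a routing error invalidates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Invalidation {
    Topic,
    GroupCoordinator,
    Controller,
}

/// Returns the invalidation to perform for a response to `request`, or
/// `None` if `error_code` is not the routing error for that request type.
pub fn invalidation_for(request: RoutedRequest, error_code: i16) -> Option<Invalidation> {
    // Kafka protocol error codes.
    const NOT_LEADER_OR_FOLLOWER: i16 = 6;
    const NOT_COORDINATOR: i16 = 16;
    const NOT_CONTROLLER: i16 = 41;

    // No wildcard arm: a new RoutedRequest variant must be handled here.
    let (routing_error, invalidation) = match request {
        RoutedRequest::Produce | RoutedRequest::Fetch => {
            (NOT_LEADER_OR_FOLLOWER, Invalidation::Topic)
        }
        RoutedRequest::Heartbeat | RoutedRequest::OffsetCommit => {
            (NOT_COORDINATOR, Invalidation::GroupCoordinator)
        }
        RoutedRequest::CreateTopics => (NOT_CONTROLLER, Invalidation::Controller),
    };
    (error_code == routing_error).then_some(invalidation)
}
```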


codspeed-hq bot commented Jun 12, 2024

CodSpeed Performance Report

Merging #1659 will not alter performance

Comparing rukai:routing_error_handling (aa66e59) with main (caa3453)

Summary

✅ 38 untouched benchmarks

@rukai force-pushed the routing_error_handling branch 5 times, most recently from 58181c3 to a0b5e51 (June 14, 2024 00:26)
@rukai mentioned this pull request Jun 14, 2024
12 tasks
@rukai force-pushed the routing_error_handling branch 5 times, most recently from a13058e to 95257a4 (June 14, 2024 01:59)
@rukai marked this pull request as ready for review June 14, 2024 03:37
@rukai force-pushed the routing_error_handling branch from 95257a4 to 4301236 (June 14, 2024 04:36)
@rukai force-pushed the routing_error_handling branch from 4301236 to 6322f6e (June 14, 2024 04:38)
@rukai merged commit b540be8 into shotover:main Jun 16, 2024
41 checks passed
3 participants