Support dispatch schema-level ddl to different topic #11882

Open
CharlesCheung96 opened this issue Dec 16, 2024 · 1 comment
Labels
area/ticdc Issues or PRs related to TiCDC. type/bug The issue is confirmed as a bug.

Comments

CharlesCheung96 commented Dec 16, 2024

What did you do?

  • Create a changefeed with the following command:
./cdc cli changefeed create --server=127.0.0.1:8300 --changefeed-id="kafka-test" --sink-uri="kafka://127.0.0.1:9092/default-topic?&protocol=open-protocol" --config changefeed.toml
  • The configuration in changefeed.toml is:
[sink]
dispatchers = [
    {matcher = ['marvin.*'], topic = "cdc_{schema}_topic", partition = "index-value" },
]
  • Run CREATE DATABASE marvin; in TiDB.

What did you expect to see?

The CREATE DATABASE DDL is dispatched to cdc_marvin_topic.

What did you see instead?

TiCDC dispatches the schema-level DDL to default-topic.

According to the official documentation, this is by design. However, it would make more sense to let users dispatch schema-level DDLs to different topics based on the schema name.

@CharlesCheung96 CharlesCheung96 added area/ticdc Issues or PRs related to TiCDC. type/bug The issue is confirmed as a bug. labels Dec 16, 2024
@wentaojin
Contributor

When data is consumed through the open protocol, database-level DDLs are sent to the default topic, while table-level DDLs and DMLs share one topic. Coordinating the two kinds of DDL is hard for the consumer: it cannot easily tell whether the database-level DDL that a table-level DDL depends on has already arrived in the default topic, and therefore whether the table-level DDL should wait. If both were sent to the same topic, the consumer could reuse the table-level logic, and every event with commitTs <= the DDL's commitTs would be guaranteed to have been sent already.
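
To illustrate the coordination problem, here is a minimal consumer-side sketch. It is not TiCDC or consumer code: the class, its method names, and the idea of tracking a resolved timestamp for the default topic are all assumptions made for illustration. A table-level DDL read from its own topic is held back until the default topic is known to have progressed past that DDL's commitTs.

```python
class TwoTopicConsumer:
    """Hypothetical consumer reading the default topic and a table topic."""

    def __init__(self):
        self.default_resolved_ts = 0  # progress known for the default topic (assumed)
        self.pending = []             # table-level DDLs waiting: (commit_ts, query)
        self.applied = []             # queries applied downstream, in order

    def on_default_topic_resolved(self, ts):
        # The default topic has delivered everything up to ts.
        self.default_resolved_ts = max(self.default_resolved_ts, ts)
        self._drain()

    def on_table_ddl(self, commit_ts, query):
        # A table-level DDL arrived on its own topic.
        self.pending.append((commit_ts, query))
        self._drain()

    def _drain(self):
        # Apply pending DDLs only once the default topic has caught up,
        # so a database-level DDL they depend on cannot still be in flight.
        self.pending.sort()
        while self.pending and self.pending[0][0] <= self.default_resolved_ts:
            self.applied.append(self.pending.pop(0)[1])
```

With a single topic, this cross-topic bookkeeping would not be needed, because the topic's own ordering already covers the dependency.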

Suggestion:
Because dispatchers may contain several rules with different schema or table routing requirements, I suggest attaching database-level DDL dispatching to a specific dispatcher rule, so that the rule controls which topic its database-level DDLs are sent to. Whether this takes effect could be made configurable, for example by adding a parameter such as enable_dispatcher_database:


[sink]
dispatchers = [
    {matcher = ['marvin.*'], topic = "cdc_{schema}_topic", partition = "index-value", enable_dispatcher_database = true},
]
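
A rough sketch of how such a rule could be evaluated for a schema-level DDL follows. It is not TiCDC's implementation: the function name, the dict layout, and the shell-style matching via `fnmatch` are assumptions, and `enable_dispatcher_database` is only the parameter proposed above, not an existing option. A rule is used only if the flag is set and its topic expression can be resolved from the schema name alone.

```python
from fnmatch import fnmatchcase


def topic_for_schema_ddl(schema, dispatchers, default_topic):
    """Pick a topic for a schema-level DDL such as CREATE DATABASE (hypothetical)."""
    for rule in dispatchers:
        # Proposed opt-in flag; rules without it keep today's behavior.
        if not rule.get("enable_dispatcher_database", False):
            continue
        topic_expr = rule["topic"]
        # A schema-level DDL has no table name, so {table} cannot be filled in.
        if "{table}" in topic_expr:
            continue
        for pattern in rule["matcher"]:
            # Match only the schema part of a 'schema.table' pattern.
            schema_pattern = pattern.split(".", 1)[0]
            if fnmatchcase(schema, schema_pattern):
                return topic_expr.replace("{schema}", schema)
    return default_topic


dispatchers = [
    {
        "matcher": ["marvin.*"],
        "topic": "cdc_{schema}_topic",
        "partition": "index-value",
        "enable_dispatcher_database": True,
    },
]
```

Under these assumptions, `topic_for_schema_ddl("marvin", dispatchers, "default-topic")` resolves to `cdc_marvin_topic`, while schemas that match no opted-in rule still fall back to the default topic.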
