[FLINK-36813][cdc-connectors][mysql] support mysql sync part columns #3767

Open
wants to merge 1 commit into
base: master


JNSimba
Member

@JNSimba JNSimba commented Nov 28, 2024

Background
In some scenarios, users want to synchronize only specified columns of a MySQL table rather than every column.

  1. The user only has permission to read some of the columns in MySQL.
  2. A single table has too many columns and the user only wants to synchronize some of them; see, for example, "flink-cdc uses SELECT * FROM TABLE to read data in the full snapshot stage." #3058

Current situation
For the incremental (binlog) stage, configuring Debezium's column.include.list property is already enough to synchronize only some columns; see: https://debezium.io/documentation/reference/1.9/connectors/mysql.html#mysql-property-column-include-list
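
As a rough illustration of what column.include.list expresses, the sketch below treats the property as a comma-separated list of regular expressions matched against fully-qualified column names (database.table.column), which is how the Debezium documentation describes it. The class and method names are hypothetical, the comma splitting is naive, and this is a simplified approximation rather than Debezium's actual matching code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Flink CDC or Debezium.
class ColumnIncludeListSketch {

    /** Stores the include list under the Debezium option name "column.include.list". */
    static Properties debeziumProperties(String includeList) {
        Properties props = new Properties();
        props.setProperty("column.include.list", includeList);
        return props;
    }

    /** Returns the columns whose fully-qualified name fully matches any configured expression. */
    static List<String> includedColumns(
            Properties props, String database, String table, List<String> columns) {
        List<Pattern> patterns = new ArrayList<>();
        // Naive split: an expression that itself contains a comma is not supported here.
        for (String expr : props.getProperty("column.include.list", "").split(",")) {
            if (!expr.trim().isEmpty()) {
                patterns.add(Pattern.compile(expr.trim()));
            }
        }
        if (patterns.isEmpty()) {
            return new ArrayList<>(columns); // no filter configured: keep every column
        }
        List<String> result = new ArrayList<>();
        for (String column : columns) {
            String fullyQualified = database + "." + table + "." + column;
            for (Pattern pattern : patterns) {
                if (pattern.matcher(fullyQualified).matches()) {
                    result.add(column);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties props =
                debeziumProperties("inventory\\.orders\\.id, inventory\\.orders\\.status");
        List<String> all = Arrays.asList("id", "status", "note");
        System.out.println(includedColumns(props, "inventory", "orders", all));
    }
}
```

The database, table, and column names here are made up for the example.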

For the full snapshot stage, MySqlSnapshotSplitReadTask currently reads every column, because the scan query is built with "*" as the projection:

if (isScanningData) {
    // "*" selects every column, regardless of column.include.list
    return buildSelectWithRowLimits(
            tableId, limitSize, "*", Optional.ofNullable(condition), Optional.empty());
}

Solution
We can follow Debezium's RelationalSnapshotChangeEventSource: the user configures column.include.list, MySqlSnapshotSplitReadTask resolves the columns that match it, and those columns are spliced into the SELECT list when the scan SQL is constructed.
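
A minimal sketch of the splicing step, assuming the matching columns have already been resolved into a list. The class, quote, projection, and buildScanSql are hypothetical names, and buildScanSql is only a simplified stand-in for buildSelectWithRowLimits; this is not the code in the PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical illustration of replacing "*" with an explicit column list.
class SnapshotProjectionSketch {

    /** Quotes a MySQL identifier with backticks, doubling any embedded backtick. */
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    /** Falls back to "*" when no columns were selected; otherwise joins the quoted names. */
    static String projection(List<String> includedColumns) {
        if (includedColumns == null || includedColumns.isEmpty()) {
            return "*";
        }
        return includedColumns.stream()
                .map(SnapshotProjectionSketch::quote)
                .collect(Collectors.joining(", "));
    }

    /** Simplified stand-in for the real query builder: SELECT ... FROM ... [WHERE ...] [LIMIT n]. */
    static String buildScanSql(
            String database, String table, String projection,
            Optional<String> condition, int limitSize) {
        StringBuilder sql = new StringBuilder("SELECT ");
        sql.append(projection)
                .append(" FROM ")
                .append(quote(database))
                .append('.')
                .append(quote(table));
        condition.ifPresent(c -> sql.append(" WHERE ").append(c));
        if (limitSize > 0) {
            sql.append(" LIMIT ").append(limitSize);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        String columns = projection(Arrays.asList("id", "status"));
        System.out.println(
                buildScanSql("inventory", "orders", columns, Optional.of("`id` >= ?"), 100));
    }
}
```

Falling back to "*" keeps the current behavior when column.include.list is not set. A real implementation would probably also need to keep the split key columns in the projection even if the user did not list them; that is an assumption about the snapshot split logic and is not handled here.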

@JNSimba
Member Author

JNSimba commented Dec 3, 2024

@leonardBang @ruanhang1993 PTAL, Thanks
