Bug Report
1. Describe the bug
In the later stage of task execution, a large number of ingests fail: every ingest is retried again and again until the task fails.
2. Minimal reproduce step (Required)
Data: 500,000,000 keys, 1 KB per key.
In the initial stage of the task, each partition is imported normally and each import finishes within 3-5 seconds. In the later stage, however, each task (partition) runs for 30-40 minutes, cycling between failure and retry, and finally fails.
3. What did you see instead (Required)
4. What did you expect to see? (Required)
5. What is your migration tool and TiKV version? (Required)
TiKV Online Bulk Load:
Here is my personal analysis:
Although PD scheduling (merge / split) is suspended before the import, TiKV still runs its own split check and splits regions by itself.
The importer obtains the topology of the regions that overlap the imported data only once, before ingest starts.
As data is imported, the region topology changes because of TiKV's internal splits. The importer does not handle this situation and simply retries with the same regions.
When it encounters errors such as EpochNotMatch or NotLeader, tikv-client updates its own view of the region topology, but the TiRegion passed to importerclient was obtained before ingest started and has long been invalid.
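To illustrate the analysis above, here is a minimal, self-contained sketch of the failure mode and one possible remedy. It is a toy model, not the tikv-client or importer code: `FakeCluster`, `Region`, `ingest_range`, and the `refresh` flag are all hypothetical names invented for this example, and integer keys stand in for real key ranges. The only idea it demonstrates is that retrying with a region snapshot taken before a split can never succeed, while re-fetching the overlapping regions after an epoch error can.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    start: int  # inclusive start key (toy integer key space)
    end: int    # exclusive end key
    epoch: int  # version number, bumped whenever the region splits


class EpochNotMatch(Exception):
    """Raised when a request carries a region that no longer exists as-is."""


class FakeCluster:
    """Hypothetical stand-in for the cluster's region topology."""

    def __init__(self) -> None:
        self.regions: List[Region] = [Region(0, 100, 1)]

    def locate(self, start: int, end: int) -> List[Region]:
        # Return the current regions that overlap [start, end).
        return [r for r in self.regions if r.start < end and start < r.end]

    def split(self, key: int) -> None:
        # Simulate a self-triggered split at `key`.
        updated: List[Region] = []
        for r in self.regions:
            if r.start < key < r.end:
                updated.append(Region(r.start, key, r.epoch + 1))
                updated.append(Region(key, r.end, r.epoch + 1))
            else:
                updated.append(r)
        self.regions = updated

    def ingest(self, region: Region) -> None:
        # Reject any request whose region (range or epoch) is out of date.
        if region not in self.regions:
            raise EpochNotMatch(f"stale region: {region}")


def ingest_range(cluster: FakeCluster, start: int, end: int,
                 regions: List[Region], max_retries: int = 3,
                 refresh: bool = True) -> bool:
    """Ingest [start, end) using `regions`; optionally refresh on epoch errors.

    Note: a retry re-sends every region, which is only acceptable here
    because the toy ingest has no side effects.
    """
    for _ in range(max_retries):
        try:
            for r in regions:
                cluster.ingest(r)
            return True
        except EpochNotMatch:
            if refresh:
                # Re-fetch the topology instead of reusing the stale snapshot.
                regions = cluster.locate(start, end)
    return False


cluster = FakeCluster()
snapshot = cluster.locate(0, 100)  # topology obtained once, before ingest
cluster.split(50)                  # a split happens while data is imported

stale_ok = ingest_range(cluster, 0, 100, snapshot, refresh=False)
fresh_ok = ingest_range(cluster, 0, 100, snapshot, refresh=True)
print(stale_ok, fresh_ok)
```

With `refresh=False` every retry reuses the pre-split snapshot, which mirrors the behaviour described above; with `refresh=True` the second attempt uses the two post-split regions. A real fix would also need to re-partition the data to the new region boundaries, which this sketch does not model.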