RecoveryHelper to speed up recovery after restart #471
base: master

Conversation
This patch introduces a change to the WAL and recovery process. The idea is to avoid the expensive scan of all table files, which can take longer than an hour for large tables, and maintain a recovery record in the WAL instead. This record is written at the beginning of the WAL log and is not surrounded by 'begin' and 'end' markers. The following situations are possible:

a. There is no recovery record, and there are normal records in the WAL
b. There is no recovery record and no other records in the WAL
c. There is a recovery record, and there are normal records in the WAL
d. There is a recovery record and no other records in the WAL

Since the recovery record is written at the beginning, it contains the latest offset only when there is nothing else in the log or the other records are invalid (temp files are deleted). So in cases (a), (c), and (d) the recovery process will pick the committed file with the highest offset from the WAL, either from the recovery record or from the normal records. In case (b), when the WAL log is empty or doesn't exist, the latest offset will be discovered through a full recursive folder scan.
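To make the case analysis concrete, here is a minimal sketch of that decision (the file-name offset pattern and helper names are illustrative assumptions, not code from the patch, which operates on WALEntry pairs inside the connector):

import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecoverySketch {
    // Committed file names in the HDFS connector encode an offset range,
    // e.g. topic+0+0000000000+0000000999.avro; this pattern is an assumption
    // about that layout.
    private static final Pattern OFFSET_PATTERN = Pattern.compile("\\+(\\d+)\\.[a-z]+$");

    // Cases a, c, d: both normal records and the recovery record name a
    // committed file, so the highest encoded offset wins. Returns -1 for
    // case b (empty or missing WAL), meaning a full recursive scan is needed.
    static long latestOffsetFromWal(List<String> committedFiles) {
        long maxOffset = -1L;
        for (String file : committedFiles) {
            Matcher m = OFFSET_PATTERN.matcher(file);
            if (m.find()) {
                maxOffset = Math.max(maxOffset, Long.parseLong(m.group(1)));
            }
        }
        return maxOffset;
    }
}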
@confluentinc It looks like @justpresident just signed our Contributor License Agreement. 👍 Always at your service, clabot
@kkonstantine would you please check this out? We've been running this in production for a while now.
@justpresident Thanks for making this PR. I'm curious what sort of speedup you are seeing in your environment?
    return instance;
}

private final Map<TopicPartition, List<String>> files = new HashMap<>();
I think this singleton instance will potentially be accessed by multiple threads when we have multiple tasks running on a single worker. The map probably needs to be a ConcurrentHashMap, or access needs to be protected by locks.
Ah, correct. The way we launch it is quite special: we launch all instances in standalone mode in Kubernetes, one worker per pod, so I had overlooked the possibility of having multiple workers on the same machine. Fixed.
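For reference, a minimal sketch of the thread-safe variant discussed here, assuming the map layout shown in the diff; the eager singleton, the two-argument addFile, and the synchronized lists are illustrative choices, not necessarily what the fix commits:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.common.TopicPartition;

public class RecoveryHelper {
    // Eagerly initialized, so getInstance() needs no locking.
    private static final RecoveryHelper INSTANCE = new RecoveryHelper();

    // ConcurrentHashMap lets multiple tasks on one worker record files safely.
    private final Map<TopicPartition, List<String>> files = new ConcurrentHashMap<>();

    private RecoveryHelper() {}

    public static RecoveryHelper getInstance() {
        return INSTANCE;
    }

    // Two-argument form for illustration; the actual patch's addFile takes
    // only the committed file name.
    public void addFile(TopicPartition tp, String committedFile) {
        files.computeIfAbsent(tp, k -> Collections.synchronizedList(new ArrayList<>()))
             .add(committedFile);
    }
}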
@@ -128,6 +130,9 @@ public void apply() throws ConnectException {
    WALEntry mapKey = new WALEntry(key.getName());
    WALEntry mapValue = new WALEntry(value.getName());
    entries.put(mapKey, mapValue);
    if (value.getName().equals(RecoveryHelper.RECOVERY_RECORD_KEY)) {
Does this do anything? RECOVERY_RECORD_KEY is written to the key here: wal.append(RecoveryHelper.RECOVERY_RECORD_KEY, fileStatusWithMaxOffset.getPath().toString()).
That was a mistake. Fixed it.
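Presumably the fix moves the comparison onto the entry's key, since the recovery record is appended with RECOVERY_RECORD_KEY as the key; a sketch of what that would look like (not the actual committed code):

WALEntry mapKey = new WALEntry(key.getName());
WALEntry mapValue = new WALEntry(value.getName());
entries.put(mapKey, mapValue);
// The recovery record's *key* is RECOVERY_RECORD_KEY; its value holds the
// committed file path written by wal.append(...) at commit time.
if (key.getName().equals(RecoveryHelper.RECOVERY_RECORD_KEY)) {
    RecoveryHelper.getInstance().addFile(value.getName());
}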
@@ -120,6 +121,7 @@ public void apply() throws ConnectException {
    for (Map.Entry<WALEntry, WALEntry> entry: entries.entrySet()) {
        String tempFile = entry.getKey().getName();
        String committedFile = entry.getValue().getName();
        RecoveryHelper.getInstance().addFile(committedFile);
I am concerned that the lists in the map grow without bounds and may eventually cause an OOM if the process runs long enough.
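One illustrative way to bound that growth (an alternative sketch, not part of this patch) is to keep only the latest committed file per partition, so memory stays proportional to the number of partitions rather than the number of commits:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.kafka.common.TopicPartition;

public class BoundedRecoveryHelper {
    // Same assumed offset-in-file-name convention as the earlier sketch.
    private static final Pattern OFFSET_PATTERN = Pattern.compile("\\+(\\d+)\\.[a-z]+$");

    // One entry per partition: O(#partitions) instead of an ever-growing list.
    private final Map<TopicPartition, String> latestFile = new ConcurrentHashMap<>();

    public void addFile(TopicPartition tp, String committedFile) {
        // merge() keeps whichever file name encodes the higher offset
        latestFile.merge(tp, committedFile,
                (oldF, newF) -> extractOffset(newF) > extractOffset(oldF) ? newF : oldF);
    }

    private static long extractOffset(String fileName) {
        Matcher m = OFFSET_PATTERN.matcher(fileName);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }
}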
The speedup of course depends on the number of existing files in the table. The initial scan, which usually takes around an hour for large tables, is eliminated completely. The startup is now instant.
Hello, I don't work with kafka-connect anymore and don't have such a setup with thousands of HDFS files to test, but it seems like the problem was solved in a very similar way in #556.
Roman Studenikin seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have already signed the CLA but the status is still pending? Let us recheck it.