-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC:TiKV RawKV MVCC GC #90
base: master
Are you sure you want to change the base?
Changes from 2 commits
ed3ba92
04c39e3
66a5cb8
92b6b54
2464d0c
2186c22
15573ff
a86c475
552fbd6
03e416f
dcf16a3
89194ba
3489396
b211769
27c71e8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,44 @@ | ||||||
# RFC: TiKV RawKV MVCC GC | ||||||
|
||||||
|
||||||
## Summary | ||||||
Move TiKV MVCC GC worker from TiDB into a group of independent GC worker node role and implement a new GC process in TiKV for RawKV. | ||||||
|
||||||
## Motivation | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The description of GC worker would be better to place in "Background" section.
|
||||||
GC worker is an important component for TiKV that deletes outdated MVCC data so as to not explode the storage. But currently, the GC worker is implemented in TiDB, which makes TiKV not usable without TiDB.And current GC process is just for transaction of TiDB,it's not usable for RawKV. | ||||||
|
||||||
## Background | ||||||
According to the documentation for the current GC worker in a TiDB cluster, the GC process is as follows: | ||||||
|
||||||
In TiDB GC worker leader: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently, TiDB GC worker is not standardized use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have read this document and discussed with @MyonKeminta .For implement this proposal, it need a clearer design and more development. |
||||||
1. Regularly calculates a new timestamp called "safe point", and push the safe point to PD. | ||||||
2. Get the minimal Service safe point among all services from the response of step 2, which is GC safe point . | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
3. Txn GC process: resolve locks and record delete ranges information. | ||||||
|
||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Refer to RunDistributedGCJob, where are two more steps:
|
||||||
In PD leader: | ||||||
1. Receive update safe point requests from TiDB or other tools (e.g. CDC, BR). | ||||||
2. Calculate the minimal timestamp = min(all service safe point, now - gc_life_time). | ||||||
|
||||||
In every TiKV nodes: | ||||||
1. Get GC safe point from PD regularly. | ||||||
2. Deletion will be triggered in CompactionFilter and GcTask thread; | ||||||
|
||||||
## New GC worker architecture | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe it is not a new architecture. We are utilizing the same architecture but another implementation. |
||||||
In a TiKV cluster without TiDB nodes , there are a few different points as follows: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
1. We need to move GC worker into another node role. | ||||||
2. For [API V2](https://github.com/tikv/rfcs/blob/master/text/0069-api-v2.md) .It need gc the earlier version in default cf. But Txn GC worker process will be triggered by WriteCompactionFilter of write cf. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
3. RawKV encoded code of RawValue is different with Txn in TiDB. | ||||||
|
||||||
So we designed a new GC architecture and process for TiKV cluster. | ||||||
|
||||||
## Detailed design | ||||||
For support TiKV cluster deploy without TiDB nodes. | ||||||
1. Add a new node role instead of GC worker in TiDB nodes. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We need to decide exactly which role before merging the RFC. |
||||||
The code of new GC worker,It will be added into [tikv/migration](https://github.com/tikv/migration) | ||||||
2. And for API V2, we need add new CompactionFilter which is named RawGCcompactionFilter, and add a new GCTask type implementation. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is already a CompactionFilter for RawKV. Are we going to add a new one or just modify it? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. TTLCompactionFilter is just for APIV1TTL,In API V2 ,There are several differences with APIV1TTL.I think add a new CompactionFilter is better, @pingyu What do you think? |
||||||
3. GC conditions in RawGCcompactionFilter is: (ts < GCSafePoint) && ( ttl-expired || deleted-mark || not the newest version ). | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Need to clarify the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We may want to remove a key that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should keep such entries. Expired entries are just the same as logical deleted. Miss to capture these entries will cause downstream to return an earlier version, and being inconsistent to upstream. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @pingyu So there will be this case: the CDC may sync a key that is already expired with downstream, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (2) Who will push GCSafePoint forward? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes. |
||||||
1. If the newest version is earlier than GC safe point and it's delete marked or expired ttl,those keys and earlier versions of the same userkey will be sent to a gc scheduler thread to gc asynchronous. | ||||||
|
||||||
## Reference | ||||||
https://docs.google.com/document/d/1jA3lK9QbYlwsvn67wGsSuusD1Dzx7ANq_vya384RBIg/edit#heading=h.rr3hcmc7ejb8 | ||||||
https://docs.pingcap.com/tidb/stable/garbage-collection-overview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the subject of this RFC is for RawKV, I think we are not discussing about "move GC worker from TiDB".
I suggest the main contents to be:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"node role" seems to be not easy to understand.
Suggest to use "component", as TiUP using this word. See tiup.io.