invoices: migrate KV invoices to native SQL for users of KV SQL backends #8831
base: master
Conversation
Please hold off on the next round of reviews as I'm still investigating some performance issues with larger databases.
Thank you for your patience. Tested the PR with large KV invoice datasets and I believe migration performance is adequate. There's no slowdown, and memory use remains constant for a given batch size. PTAL.
This commit adds the migration_tracker table which we'll use to track if a custom migration has already been done.
This commit introduces support for custom, in-code migrations, allowing a specific Go function to be executed at a designated database version during sqlc migrations. If the current database version surpasses the specified version, the migration will be skipped.
This commit separates the execution of SQL and in-code migrations from their construction. This change is necessary because, currently, the SQL schema is migrated during the construction phase in the lncfg package. However, migrations are typically executed when individual stores are constructed within the configuration builder.
Previously we intentionally did not set settled_at and settle_index when inserting a new invoice, as those fields are set when we settle an invoice through the usual invoice update. Since the migration requires that we set these nullable fields, we can safely add them.
Certain invoices may not have a deterministic payment hash. For such invoices we still store the payment hashes in our KV database, but we do not have a sufficient index to retrieve them. This PR adds such an index to the SQL database, which will be used during migration to retrieve payment hashes.
…a hash
The current sqlc GetInvoice query experiences incremental slowdowns during the migration of large invoice databases, primarily due to its complex predicate set. For this specific use case, a streamlined GetInvoiceByHash function provides a more efficient solution, maintaining near-constant lookup times even with extensive table sizes.
This commit runs the invoice migration if the user has a KV SQL backend configured.
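For illustration, below is a minimal, self-contained sketch of the in-code migration mechanism the commits above describe. MigrationConfig, its Version field and ApplyMigrations are names that appear in the review snippets further down; the ID and MigrationFn fields, the package name and the transaction handling are assumptions made for this sketch, not the actual lnd implementation.

package migrations

import (
	"context"
	"database/sql"
	"fmt"
	"sort"
)

// MigrationConfig pairs an in-code migration with the schema version at
// which it should run. Only the Version field name is taken from the PR;
// ID and MigrationFn are assumed for illustration.
type MigrationConfig struct {
	// ID identifies the migration in the migration_tracker table.
	ID string

	// Version is the schema version at which the migration is applied.
	Version int

	// MigrationFn is the Go function executed for this migration.
	MigrationFn func(ctx context.Context, tx *sql.Tx) error
}

// ApplyMigrations runs the given in-code migrations in version order. A
// real implementation would also consult the migration_tracker table to
// skip migrations already recorded as done.
func ApplyMigrations(ctx context.Context, db *sql.DB, currentVersion int,
	migrations []MigrationConfig) error {

	// Sort migrations by version to ensure they are applied in order.
	sort.SliceStable(migrations, func(i, j int) bool {
		return migrations[i].Version < migrations[j].Version
	})

	for _, m := range migrations {
		// Skip the migration if the current database version has
		// already surpassed its target version.
		if m.Version < currentVersion {
			continue
		}

		tx, err := db.BeginTx(ctx, nil)
		if err != nil {
			return err
		}

		if err := m.MigrationFn(ctx, tx); err != nil {
			_ = tx.Rollback()
			return fmt.Errorf("migration %s failed: %w",
				m.ID, err)
		}

		if err := tx.Commit(); err != nil {
			return err
		}
	}

	return nil
}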
Thanks for the updates! Just one or two questions about the strategy here before I do a final, detail-oriented review round. I think the one question about leaning towards duplication rather than adding migration queries to the interface is open for discussion, so I'm very happy to give in there if others disagree with me!
-- migration_id is the id of the migration.
migration_id TEXT PRIMARY KEY,
Still getting to that part of the PR, but assuming that the order of migrations is kept track of at the code level, is it fine that this is text based?
// Version is the schema version at which the migration is applied.
Version int
"schema version" as in up
file number yeah? if so, what about if we have 2 code-level migrations in a row that depend on each-other/where ordering is important?
Hmm, ok, I see the order is gleaned implicitly from the order in which it is passed to ApplyMigrations.
// Sort migrations by version to ensure they are applied in order.
sort.SliceStable(migrations, func(i, j int) bool {
	return migrations[i].Version < migrations[j].Version
})
Slightly scary to me because I think this doesn't account for the case where the versions are equal. I think maybe we should have an explicit order for these code-level migrations.
Basically, I think it would be cool if there was an overall, explicit version for each migration, as one day these can diverge quite a bit, but then there will always be a single absolute DB version that we are talking about. Thinking of a 1:1 map from overall version to migration:

map[OverallVersionNum] -> Migration

where Migration has the fields type = sql/code and a versionNum, where that versionNum is the sql-level version number or the code-level version number. We can persist this overall version and use it to know where to start from.
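A rough sketch of the shape being proposed (all names here are illustrative only, not an existing API):

package migrations

// MigrationType distinguishes sql-file migrations from in-code ones.
type MigrationType int

const (
	// MigrationTypeSQL is a schema migration driven by an .up.sql file.
	MigrationTypeSQL MigrationType = iota

	// MigrationTypeCode is an in-code (Go) migration.
	MigrationTypeCode
)

// Migration describes a single step in the overall migration stream.
type Migration struct {
	// Type says whether this step is a sql or a code migration.
	Type MigrationType

	// VersionNum is the sql-level or code-level version number,
	// depending on Type.
	VersionNum int
}

// migrationStream maps the single absolute DB version to the migration
// that produces it; the overall version is what would be persisted and
// used to know where to start from.
var migrationStream = map[int]Migration{
	1: {Type: MigrationTypeSQL, VersionNum: 1},
	2: {Type: MigrationTypeSQL, VersionNum: 2},
	3: {Type: MigrationTypeCode, VersionNum: 1},
	4: {Type: MigrationTypeSQL, VersionNum: 3},
}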
},
{
	// We use this special case to test that a migration
	// will never be aplied in case the current version is
s/aplied/applied
// Some migrations to use for both the failure and success tests. Note
// that the migrations are not in order to test that they are executed
// in the correct order.
migrations := []MigrationConfig{
I think we should cover the case of having 2 code-level migrations applied directly after each other on the same sql-level version, as sketched below.
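For example, the test slice quoted above could gain two hypothetical entries pinned to the same SQL-level version (field names other than Version are assumed):

// Hypothetical additions: two in-code migrations at the same sql-level
// version, where the execution order between them must still be
// deterministic.
migrations = append(migrations,
	MigrationConfig{
		ID:      "first-code-migration-at-version-3",
		Version: 3,
	},
	MigrationConfig{
		ID:      "second-code-migration-at-version-3",
		Version: 3,
	},
)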
-- invoice_payment_hashes table contains the hash of the invoices. This table
-- is used during KV to SQL invoice migration as in our KV representation we
-- don't have a mapping from hash to add index.
CREATE TABLE IF NOT EXISTS invoice_payment_hashes (
If we go with duplicating DB state (sql files) and other codecs etc. per migration (like we do for our channeldb migrations today), then we would be able to do this, no? It would just mean having some duplication, but it might be worth it so that we don't have to have migration methods on the interface and so that we actually can drop these DBs and keep this "live" version clean.
-- name: GetInvoicePaymentHashByAddIndex :one
SELECT hash
FROM invoice_payment_hashes
Just confirming my understanding: we might have invoices with no add index, right? But that is only the case for invoices that definitely have preimages, so we would never need to actually call this method for those?
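If that understanding is correct, the migration-side lookup might look roughly like the sketch below. Everything here except the GetInvoicePaymentHashByAddIndex query name is an assumption made for illustration (including the interface and the exact signature):

package invoicemig

import (
	"context"
	"crypto/sha256"
	"fmt"
)

// hashQuerier is the small, hypothetical slice of the migration
// transaction that this sketch needs.
type hashQuerier interface {
	GetInvoicePaymentHashByAddIndex(ctx context.Context,
		addIndex uint64) ([]byte, error)
}

// resolvePaymentHash returns the payment hash of a KV invoice being
// migrated: derived from the preimage when one exists, otherwise looked
// up in the temporary invoice_payment_hashes index by add index.
func resolvePaymentHash(ctx context.Context, q hashQuerier,
	preimage []byte, addIndex uint64) ([32]byte, error) {

	// Invoices that carry a preimage have a deterministic hash.
	if preimage != nil {
		return sha256.Sum256(preimage), nil
	}

	// Otherwise fall back to the hash index populated for the
	// migration.
	hash, err := q.GetInvoicePaymentHashByAddIndex(ctx, addIndex)
	if err != nil {
		return [32]byte{}, fmt.Errorf("no hash for add index %d: %w",
			addIndex, err)
	}

	var h [32]byte
	copy(h[:], hash)

	return h, nil
}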
// Clean up the hash index as it's no longer needed.
err = tx.ClearInvoiceHashIndex(ctx)
if err != nil {
	return fmt.Errorf("unable to clear invoice hash "+
		"index: %w", err)
}
We can't drop the table right now as there are queries depending on it, and the query files are not versioned like the migration files.
But we can technically have copied query files per migration, like we do today for channeldb migrations, yeah? I.e. introduce some duplication in order to keep the live version of the interface clean?
ClearInvoiceHashIndex(ctx context.Context) error

GetMigration(ctx context.Context, migrationID string) (
	sqlc.MigrationTracker, error)

UpdateMigration(ctx context.Context,
	arg sqlc.UpdateMigrationParams) error
I think we might just keep them here and remove them in the next version, when we also remove the temp table.
I think I'm maybe struggling to picture this move - can you maybe just explain a bit more what we will do in the next version?
Change Description
This pull request adds the migration of old key-value (KV) invoices to the new native SQL schema when the --db.use-native-sql flag is set, unless the --db.skip-sql-invoice-migration flag is also specified.
Please note that since we currently do not support running on mixed database backends, for users of bbolt or etcd an additional step is required to first migrate their KV database to SQL. For more context, please see lightninglabs/lndinit#21.
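As a usage note, a minimal invocation could look like the following; only the db.use-native-sql and db.skip-sql-invoice-migration flags come from this PR, while the Postgres backend choice is just an example:

lnd --db.backend=postgres --db.use-native-sql

# To opt out of the invoice migration:
lnd --db.backend=postgres --db.use-native-sql --db.skip-sql-invoice-migration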