
invoices: migrate KV invoices to native SQL for users of KV SQL backends #8831

Open
wants to merge 17 commits into
base: master

Conversation

bhandras
Collaborator

@bhandras bhandras commented Jun 12, 2024

Change Description

This pull request adds the migration of old key-value (KV) invoices to the new native SQL schema when the --db.use-native-sql flag is set, unless the --db.skip-sql-invoice-migration flag is also specified.

Please note that since we currently do not support running on mixed database backends for users of bbolt or etcd, an additional step is required to migrate their KV database to SQL first. For more context, please see lightninglabs/lndinit#21.
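The flag interaction described above can be sketched as follows. This is a minimal illustration only: the `Config` struct, its field names, and `shouldMigrateInvoices` are hypothetical and are not lnd's actual configuration types.

```go
package main

import "fmt"

// Config mirrors the two flags named in the description. The struct and its
// field names are hypothetical, not lnd's real config types.
type Config struct {
	UseNativeSQL            bool // --db.use-native-sql
	SkipSQLInvoiceMigration bool // --db.skip-sql-invoice-migration
}

// shouldMigrateInvoices reports whether the KV-to-SQL invoice migration would
// run under the rule stated in the description: native SQL is enabled and the
// skip flag is not set.
func shouldMigrateInvoices(cfg Config) bool {
	return cfg.UseNativeSQL && !cfg.SkipSQLInvoiceMigration
}

func main() {
	cases := []Config{
		{UseNativeSQL: false, SkipSQLInvoiceMigration: false},
		{UseNativeSQL: true, SkipSQLInvoiceMigration: false},
		{UseNativeSQL: true, SkipSQLInvoiceMigration: true},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> migrate=%v\n", c, shouldMigrateInvoices(c))
	}
}
```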

Contributor

coderabbitai bot commented Jun 12, 2024

Important

Review skipped

Auto reviews are limited to specific labels.

🏷️ Labels to auto review (1)
  • llm-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.



@bhandras bhandras self-assigned this Jun 12, 2024
@bhandras bhandras added the database (Related to the database/storage of LND) and migration labels Jun 12, 2024
@bhandras bhandras added this to the 0.19.0 milestone Jun 12, 2024
@bhandras bhandras force-pushed the sql-invoice-migration branch 3 times, most recently from 6682b50 to 338e1f0 Compare June 12, 2024 15:30
@bhandras bhandras force-pushed the sql-invoice-migration branch 3 times, most recently from d2a329f to 6379a8b Compare June 21, 2024 17:37
@bhandras bhandras force-pushed the sql-invoice-migration branch 2 times, most recently from 5fe92e2 to a7bf598 Compare August 14, 2024 09:38
@bhandras bhandras force-pushed the sql-invoice-migration branch 5 times, most recently from b6f0ac8 to b983851 Compare September 17, 2024 14:42
@bhandras bhandras changed the title from "[wip] invoices: migrate KV invoices to native SQL" to "invoices: migrate KV invoices to native SQL for users of KV SQL backends" Sep 17, 2024
@bhandras bhandras marked this pull request as ready for review September 17, 2024 14:51
@bhandras bhandras force-pushed the sql-invoice-migration branch 7 times, most recently from 96f0cbe to bfe4ad5 Compare September 19, 2024 15:08
@bhandras
Collaborator Author

Please hold off with the next round of reviews as I'm still investigating some performance issues with larger databases.

@bhandras bhandras force-pushed the sql-invoice-migration branch 3 times, most recently from 0c5dd72 to a124788 Compare December 2, 2024 19:35
@bhandras
Collaborator Author

bhandras commented Dec 2, 2024

Thank you for your patience. Tested the PR with large KV invoice datasets and I believe migration performance is adequate. There's no slowdown and memory use remains constant given batch size. PTAL.
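For readers wondering why memory use would depend only on the batch size, here is a generic sketch of batched processing. It is not the code in this PR: `migrateInBatches` and its callback signature are hypothetical, and the index range stands in for whatever key range the real migration iterates over.

```go
package main

import "fmt"

// migrateInBatches walks the half-open range [0, total) in chunks of at most
// batchSize items and hands each chunk to migrate. Because only one chunk is
// processed at a time, the working set is bounded by batchSize rather than by
// total. It returns the number of batches that completed.
func migrateInBatches(total, batchSize int,
	migrate func(start, end int) error) (int, error) {

	if batchSize <= 0 {
		return 0, fmt.Errorf("batch size must be positive, got %d",
			batchSize)
	}

	batches := 0
	for start := 0; start < total; start += batchSize {
		end := start + batchSize
		if end > total {
			end = total
		}

		if err := migrate(start, end); err != nil {
			return batches, fmt.Errorf("batch [%d, %d) failed: %w",
				start, end, err)
		}
		batches++
	}

	return batches, nil
}

func main() {
	n, err := migrateInBatches(10, 4, func(start, end int) error {
		fmt.Printf("migrating items [%d, %d)\n", start, end)
		return nil
	})
	fmt.Println(n, err)
}
```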

@bhandras bhandras force-pushed the sql-invoice-migration branch 2 times, most recently from f9842ec to 1c0b28a Compare December 2, 2024 20:18
This commit adds the migration_tracker table which we'll use to track if
a custom migration has already been done.
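A rough sketch of the "run once" behaviour such a tracker enables, using an in-memory map as a stand-in for the migration_tracker table. The type and method names below are hypothetical and do not come from the PR.

```go
package main

import "fmt"

// migrationTracker is an in-memory stand-in for the migration_tracker table.
// The type and its methods are hypothetical.
type migrationTracker struct {
	done map[string]bool
}

func newMigrationTracker() *migrationTracker {
	return &migrationTracker{done: make(map[string]bool)}
}

// runOnce executes fn only if no migration with this ID has been recorded,
// and records the ID after fn succeeds. It reports whether fn was run.
func (t *migrationTracker) runOnce(id string, fn func() error) (bool, error) {
	if t.done[id] {
		return false, nil
	}

	if err := fn(); err != nil {
		return false, fmt.Errorf("migration %q failed: %w", id, err)
	}
	t.done[id] = true

	return true, nil
}

func main() {
	tracker := newMigrationTracker()

	// The second call is a no-op because the ID was already recorded.
	for i := 0; i < 2; i++ {
		ran, err := tracker.runOnce("kv_invoice_migration", func() error {
			return nil
		})
		fmt.Println(ran, err)
	}
}
```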
This commit introduces support for custom, in-code migrations, allowing
a specific Go function to be executed at a designated database version
during sqlc migrations. If the current database version surpasses the
specified version, the migration will be skipped.
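The skip rule in this commit message can be sketched roughly as below. `MigrationConfig` and its `Version` field appear in the snippets quoted in the review; the `Name` and `Fn` fields and the `applyCodeMigrations` helper are assumptions made for illustration, not the PR's implementation.

```go
package main

import "fmt"

// MigrationConfig pairs a schema version with an in-code migration. Only the
// Version field is taken from the reviewed code; Name and Fn are hypothetical.
type MigrationConfig struct {
	Name    string
	Version int
	Fn      func() error
}

// applyCodeMigrations runs the given migrations in slice order, skipping any
// whose Version has already been surpassed by currentVersion. It returns the
// names of the migrations that were executed.
func applyCodeMigrations(currentVersion int,
	migrations []MigrationConfig) ([]string, error) {

	var applied []string
	for _, m := range migrations {
		// The database is already past this version: skip.
		if currentVersion > m.Version {
			continue
		}

		if err := m.Fn(); err != nil {
			return applied, fmt.Errorf("migration %q failed: %w",
				m.Name, err)
		}
		applied = append(applied, m.Name)
	}

	return applied, nil
}

func main() {
	noop := func() error { return nil }
	migrations := []MigrationConfig{
		{Name: "v2", Version: 2, Fn: noop},
		{Name: "v3", Version: 3, Fn: noop},
		{Name: "v5", Version: 5, Fn: noop},
	}

	applied, err := applyCodeMigrations(3, migrations)
	fmt.Println(applied, err)
}
```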
This commit separates the execution of SQL and in-code migrations
from their construction. This change is necessary because,
currently, the SQL schema is migrated during the construction
phase in the lncfg package. However, migrations are typically
executed when individual stores are constructed within the
configuration builder.
Previously we intentionally did not set settled_at and settle_index when
inserting a new invoice as those fields are set when we settle an
invoice through the usual invoice update. As migration requires that we
set these nullable fields, we can safely add them.
Certain invoices may not have a deterministic payment hash. For such
invoices we still store the payment hashes in our KV database, but we do
not have a sufficient index to retrieve them. This PR adds such an index to
the SQL database that will be used during migration to retrieve payment
hashes.
…a hash

The current sqlc GetInvoice query experiences incremental slowdowns during
the migration of large invoice databases, primarily due to its complex
predicate set. For this specific use case, a streamlined GetInvoiceByHash
function provides a more efficient solution, maintaining near-constant
lookup times even with extensive table sizes.
This commit runs the invoice migration if the user has a KV SQL backend
configured.
Collaborator

@ellemouton ellemouton left a comment

Thanks for the updates! Just one or two questions about the strategy here before I do a final, detail orientated review round. I think perhaps the one question re leaning towards duplication rather than adding migration queries to the interface is open for discussion and so very happy to give in there if others disagree with me!

Comment on lines +5 to +6
-- migration_id is the id of the migration.
migration_id TEXT PRIMARY KEY,
Collaborator

Still getting to that part of the PR, but I'm assuming the order of migrations is tracked at the code level, given that this ID is text-based?

Comment on lines +26 to +27
// Version is the schema version at which the migration is applied.
Version int
Collaborator

"schema version" as in up file number, yeah? If so, what about if we have 2 code-level migrations in a row that depend on each other, or where ordering is important?

Collaborator

hmm ok I see the order is gleaned implicitly from the order in which it is passed to ApplyMigrations

Comment on lines +255 to +258
// Sort migrations by version to ensure they are applied in order.
sort.SliceStable(migrations, func(i, j int) bool {
return migrations[i].Version < migrations[j].Version
})
Collaborator

slightly scary to me because I think this doesn't account for the case where the versions are equal. I think maybe we should have an explicit order for these code-level migrations
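To make the equal-version case concrete: `sort.SliceStable` keeps elements with equal keys in their original relative order, so two migrations at the same version run in whatever order the caller happened to pass them. A small sketch, where the `migration` type and helper names are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
)

// migration is a hypothetical stand-in for a code-level migration entry.
type migration struct {
	Name    string
	Version int
}

// sortByVersion returns a copy of ms sorted by Version using a stable sort,
// mirroring the comparison in the reviewed snippet.
func sortByVersion(ms []migration) []migration {
	out := make([]migration, len(ms))
	copy(out, ms)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Version < out[j].Version
	})
	return out
}

// names extracts the migration names in order.
func names(ms []migration) []string {
	res := make([]string, 0, len(ms))
	for _, m := range ms {
		res = append(res, m.Name)
	}
	return res
}

func main() {
	// A and B share version 5; only the input order differs.
	first := sortByVersion([]migration{{"A", 5}, {"B", 5}, {"C", 3}})
	second := sortByVersion([]migration{{"B", 5}, {"A", 5}, {"C", 3}})

	fmt.Println(names(first))  // [C A B]
	fmt.Println(names(second)) // [C B A]
}
```

The sort itself is deterministic, but the relative order of A and B is decided by the caller's slice order rather than by anything recorded in the migrations themselves.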

Collaborator

basically i think it would be cool if there was an overall, explicit version for each migration as one day these can diverge quite a bit but there will always be 1 single absolute DB version that we are talking about then. Thinking like a 1:1 map from: Overall Version to migration:

map[OverallVersionNum] -> Migration

where each Migration has the fields: type = sql/code, and a versionNum that is either the SQL-level or the code-level version number. We can persist this overall version and use it to know where to start from.
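One way to picture this suggestion, as a sketch only: every name below (`MigrationKind`, `Migration`, `pending`) is hypothetical and not part of the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// MigrationKind distinguishes SQL schema migrations from in-code migrations.
type MigrationKind int

const (
	KindSQL MigrationKind = iota
	KindCode
)

// Migration is one step in the overall schedule. VersionNum is the SQL-level
// or code-level version number, depending on Kind.
type Migration struct {
	Kind       MigrationKind
	VersionNum int
}

// pending returns the migrations whose overall version is greater than the
// persisted overall version, in ascending overall-version order.
func pending(all map[int]Migration, persisted int) []Migration {
	keys := make([]int, 0, len(all))
	for v := range all {
		if v > persisted {
			keys = append(keys, v)
		}
	}
	sort.Ints(keys)

	out := make([]Migration, 0, len(keys))
	for _, v := range keys {
		out = append(out, all[v])
	}
	return out
}

func main() {
	// Overall version -> migration. Two code-level migrations (overall
	// versions 3 and 4) sit back to back with an explicit order.
	schedule := map[int]Migration{
		1: {Kind: KindSQL, VersionNum: 1},
		2: {Kind: KindSQL, VersionNum: 2},
		3: {Kind: KindCode, VersionNum: 1},
		4: {Kind: KindCode, VersionNum: 2},
		5: {Kind: KindSQL, VersionNum: 3},
	}

	for _, m := range pending(schedule, 2) {
		fmt.Printf("kind=%d version=%d\n", m.Kind, m.VersionNum)
	}
}
```

With a single persisted overall version, consecutive code-level migrations get an unambiguous order without relying on how they were passed in.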

},
{
// We use this special case to test that a migration
// will never be aplied in case the current version is
Collaborator

s/aplied/applied

// Some migrations to use for both the failure and success tests. Note
// that the migrations are not in order to test that they are executed
// in the correct order.
migrations := []MigrationConfig{
Collaborator

think we should cover the case of having 2 code-level migrations applied directly after each other on the same SQL-level version

-- invoice_payment_hashes table contains the hash of the invoices. This table
-- is used during KV to SQL invoice migration as in our KV representation we
-- don't have a mapping from hash to add index.
CREATE TABLE IF NOT EXISTS invoice_payment_hashes (
Collaborator

if we go with duplicating DB state (sql files) and other codecs etc. per migration (like we do for our channeldb migrations today) then we would be able to do this, no? It would just mean having some duplication, but it might be worth it so that we don't have to have migration methods on the interface and so that we actually can drop these DBs and keep this "live" version clean.

Comment on lines +185 to +187
-- name: GetInvoicePaymentHashByAddIndex :one
SELECT hash
FROM invoice_payment_hashes
Collaborator

just confirming understanding: we might have invoices with no add index right? but that is only the case for invoices that defs have preimages and so we would never need to actually call this method for those?

Comment on lines +454 to +461
// Clean up the hash index as it's no longer needed.
err = tx.ClearInvoiceHashIndex(ctx)
if err != nil {
return fmt.Errorf("unable to clear invoice hash "+
"index: %w", err)
}
Collaborator

we can't drop the table right now as there are queries depending on it and the query files are not versioned like the migration files.

but we can technically have copied query files per migration like we do today for channeldb migrations yeah? ie, introduce some duplication in order to keep the live version of the interface clean?

Comment on lines +135 to +145

ClearInvoiceHashIndex(ctx context.Context) error

GetMigration(ctx context.Context, migrationID string) (
sqlc.MigrationTracker, error)

UpdateMigration(ctx context.Context,
arg sqlc.UpdateMigrationParams) error
Collaborator

I think we might just keep them here and remove in the next version when we also remove the temp table.

I think im maybe struggling to picture this move - can you maybe just explain a bit more what we will do in the next version?

Labels
database (Related to the database/storage of LND), migration
Projects
Status: In Progress

7 participants