[Perf] Add Relationship Index Storing Caveat #1988

Open
wants to merge 3 commits into base: main

Conversation

lalalalatt

No description provided.

@github-actions github-actions bot added the area/datastore Affects the storage system label Jul 18, 2024

github-actions bot commented Jul 18, 2024

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

authzedbot added a commit to authzed/cla that referenced this pull request Jul 18, 2024
@lalalalatt lalalalatt marked this pull request as ready for review July 19, 2024 15:05
@lalalalatt lalalalatt requested a review from a team July 19, 2024 15:05
authzedbot added a commit to authzed/cla that referenced this pull request Jul 23, 2024
@vroldanbet
Contributor

I believe this PR was opened to address #1923

Contributor

@vroldanbet vroldanbet left a comment


I have concerns about this PR. The reason the issue was created in the first place is that the CRDB UI shows this index as recommended. This is likely so that the query can be served entirely from the index, instead of having to go to disk to fetch the rest of the columns.

I have three concerns about the proposal:

  • it drops the previous indexes. This means that until the new indexes are created, the system is left without an index. This has the potential to cause an incident on a production system, as it can push the database past its saturation point.
  • it further increases the cost of the write path, as the system now has to write more information to disk, potentially making writes slower.
  • it increases the amount of storage needed. While it is reasonable to trade disk for query speed, customers running CRDB Dedicated can see their bill explode, as Cockroach Labs puts a hefty price tag on the disk capacity of their managed offerings.

I don't think we can move forward with this without:

  • running a load test to assess how much the queries improve and whether it's worth the tradeoff
  • determining the % increase in disk used
  • finding a way to add those indexes without causing an incident (the easiest solution being to run 2 migrations, or to explore ALTERing the indexes)
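
For context, the shape of change being discussed is roughly the following, shown here as a sketch with illustrative table, column, and index names rather than the exact SpiceDB schema:

    -- existing index: key columns only; the caveat columns still have to be
    -- fetched from the primary index, i.e. an extra trip to disk per row
    CREATE INDEX ix_relation_tuple ON relation_tuple (namespace, object_id, relation);

    -- covering index: same key columns, but the caveat columns are stored in
    -- the index itself, so the query can be answered entirely from the index
    CREATE INDEX ix_relation_tuple_covering ON relation_tuple (namespace, object_id, relation)
        STORING (caveat_name, caveat_context);

The STORING clause is what makes the index larger on disk and more expensive to maintain on writes, which is the tradeoff described above.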

@lalalalatt
Author

I believe this PR was opened to address #1923

Yes, you're right, thanks. I forgot to add the description.

@lalalalatt
Author

I have concerns about this PR. The reason the issue was created in the first place is that the CRDB UI shows this index as recommended. This is likely so that the query can be served entirely from the index, instead of having to go to disk to fetch the rest of the columns.

I have three concerns about the proposal:

  • it drops the previous indexes. This means that until the new indexes are created, the system is left without an index. This has the potential to cause an incident on a production system, as it can push the database past its saturation point.
  • it further increases the cost of the write path, as the system now has to write more information to disk, potentially making writes slower.
  • it increases the amount of storage needed. While it is reasonable to trade disk for query speed, customers running CRDB Dedicated can see their bill explode, as Cockroach Labs puts a hefty price tag on the disk capacity of their managed offerings.

I don't think we can move forward with this without:

  • running a load test to assess how much the queries improve and whether it's worth the tradeoff
  • determining the % increase in disk used
  • finding a way to add those indexes without causing an incident (the easiest solution being to run 2 migrations, or to explore ALTERing the indexes)

I'll try to work on the stress test to measure the two metrics you mentioned.

For the index drop/create issue, I've looked into it: currently it seems impossible to ALTER a standard index into a storing-column index, so that capability might need to be contributed upstream to CockroachDB.
As a workaround for now, I think we can create the storing-column index first and, once it has finished building, safely drop the original index.
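
Roughly, as a sketch with the same illustrative index and column names as above:

    -- create the covering index first; the existing index keeps serving queries
    -- while CRDB backfills the new one online
    CREATE INDEX ix_relation_tuple_covering ON relation_tuple (namespace, object_id, relation)
        STORING (caveat_name, caveat_context);

    -- only after the backfill job has finished (e.g. checked via SHOW JOBS)
    -- drop the original index
    DROP INDEX relation_tuple@ix_relation_tuple;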

@vroldanbet
Contributor

As a workaround for now, I think we can create the storing-column index first and, once it has finished building, safely drop the original index.

I'm not entirely sure this would work either. There are some caveats:

This is unfortunately a difficult migration to run without some careful planning; e.g. we have customers with terabytes' worth of SpiceDB relationships in their Cockroach clusters. This migration would:

  • increase the load on the cluster for a non-trivial amount of time (proportional to the amount of data stored)
  • increase write transaction latency
  • increase the disk capacity required to operate, as a new index covering all tuple columns is added.

I'd like to see these optimizations make it into main, but it's a risky change as-is. In an ideal world:

  • the code can estimate whether there is enough disk available to run the migration
  • the old indexes are only dropped once the new indexes have been validated to be selected by the query planner and to provide a visible performance gain
  • the index creation runs with low priority and limits the impact on the cluster

Perhaps a conservative approach would be:

  • to introduce logic that determines the size of the indexes to be dropped, and checks that at least the same amount plus some headroom (to account for the newly stored caveat columns) is available on the cluster
  • to create the new indexes as NOT VISIBLE to begin with, with a feature flag in SpiceDB that enables/disables use of the new index; enabled by default, but able to be disabled if users detect performance regressions
  • to explore whether using transaction priority buys us anything when it comes to running DDL statements like CREATE INDEX or DROP INDEX
  • to make the new indexes visible in a subsequent SpiceDB release, and only then drop the old indexes safely (see the sketch below)
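
As a sketch of that two-release sequence in CRDB DDL, again with illustrative names (CockroachDB spells the option NOT VISIBLE):

    -- release N: create the covering index hidden from the optimizer, so it is
    -- backfilled without affecting any existing query plans
    CREATE INDEX ix_relation_tuple_covering ON relation_tuple (namespace, object_id, relation)
        STORING (caveat_name, caveat_context)
        NOT VISIBLE;

    -- release N+1: once the index has been validated (and the feature flag has
    -- proven out), make it visible and retire the old index
    ALTER INDEX relation_tuple@ix_relation_tuple_covering VISIBLE;
    DROP INDEX relation_tuple@ix_relation_tuple;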

Would like to hear your thoughts on this issue @josephschorr

@lalalalatt
Author

{
    numTuples:        256,
    bufferTimeout:    1 * time.Nanosecond,
    expectFallBehind: true,
},

If I change numTuples to 512, then the test currently named 256-true would pass, though its generated test name would then read 512-true.

@vroldanbet
Contributor

If I change numTuples to 512, then the test currently named 256-true would pass, though its generated test name would then read 512-true.

I think that's fine: the goal of the test is to push so many changes that the consumer cannot retrieve them from the channel fast enough; a timeout will be triggered if the channel is at capacity, and if enough time elapses the Watch API connection will be aborted. Feel free to increase it, I'd say this is likely a flake.
