[Enhancement]: Refine scalar filtering with better understanding of sort and index #28123

Closed
1 task done
xiaofan-luan opened this issue Nov 2, 2023 · 4 comments
Labels: kind/enhancement, stale

Comments

@xiaofan-luan
Collaborator

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

  1. Data in Milvus should be sorted by primary key (PK).
  2. For operations such as range filters and prefix matching (like a%), we should exploit the sort order when the operation is on the PK.
  3. For term operations, consider using a Bloom filter (BF) to avoid extra filtering.
  4. For range operations, consider using an ordered index.
  5. Delete/MVCC should be handled at lower cost.
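
To illustrate items 2 and 4, here is a minimal sketch of how a range filter and a prefix match reduce to binary searches once the PK column is sorted. It is not Milvus code: `range_offsets`, `prefix_offsets`, and `prefix_upper_bound` are hypothetical names, and the PK column is assumed to be a plain sorted in-memory list.

```python
from bisect import bisect_left, bisect_right


def range_offsets(sorted_pks, lo, hi):
    """Half-open offset span [start, end) of rows with lo <= pk <= hi."""
    return bisect_left(sorted_pks, lo), bisect_right(sorted_pks, hi)


def prefix_upper_bound(prefix):
    """Smallest string greater than every string starting with `prefix`.

    Returns None when no such bound exists (empty prefix, or a prefix made
    only of the maximum code point).
    """
    chars = list(prefix)
    while chars:
        if ord(chars[-1]) < 0x10FFFF:
            chars[-1] = chr(ord(chars[-1]) + 1)
            return "".join(chars)
        chars.pop()
    return None


def prefix_offsets(sorted_pks, prefix):
    """Half-open offset span of string PKs that start with `prefix`."""
    start = bisect_left(sorted_pks, prefix)
    upper = prefix_upper_bound(prefix)
    end = len(sorted_pks) if upper is None else bisect_left(sorted_pks, upper)
    return start, end


int_pks = [1, 3, 5, 7, 9]
str_pks = ["a1", "a2", "ab", "b1", "b2", "c1"]
int_span = range_offsets(int_pks, 3, 7)    # rows holding 3, 5, 7
str_span = prefix_offsets(str_pks, "a")    # rows holding "a1", "a2", "ab"
```

Both lookups cost O(log n) comparisons and return a contiguous span, so no per-row predicate evaluation is needed for the matching rows.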

Why is this needed?

No response

Anything else?

No response

@yah01
Member

yah01 commented Nov 6, 2023

Sorting the data by PK will be done after the L0 segment work, since that work includes the compaction refactor. We will sort the data while flushing/compacting, not while loading: sorting at load time would break the row offsets that the vector index refers to. It's possible to work around this, but it would introduce overhead.
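
A small sketch of the offset problem described above, under the assumption that the vector index stores row offsets into the original (unsorted) layout. The names `sort_by_pk`, `new_to_old`, and `old_to_new` are illustrative and not taken from the Milvus code base.

```python
def sort_by_pk(pks):
    """Sort a PK column and return (sorted_pks, new_to_old offset mapping)."""
    new_to_old = sorted(range(len(pks)), key=pks.__getitem__)
    sorted_pks = [pks[i] for i in new_to_old]
    return sorted_pks, new_to_old


# Original layout: the vector index was built against these offsets.
pks = [30, 10, 20]
sorted_pks, new_to_old = sort_by_pk(pks)

# Invert the permutation so that old offsets can be translated.
old_to_new = [0] * len(new_to_old)
for new_off, old_off in enumerate(new_to_old):
    old_to_new[old_off] = new_off

# A vector search hit at old offset 0 refers to pk 30. Reading the sorted
# column with the same offset would return the wrong row, so every hit
# needs an extra lookup through old_to_new.
hit_old_offset = 0
wrong_pk = sorted_pks[hit_old_offset]
right_pk = sorted_pks[old_to_new[hit_old_offset]]
```

The extra mapping (one integer per row, plus one lookup per hit) is the kind of overhead that sorting at flush/compaction time avoids, because the index is then built against the already sorted layout.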

@yah01
Member

yah01 commented Nov 6, 2023

Query related impl plan:

  • Separate the default PK index from the insert record and move it into the index records, so that filters are processed consistently
  • Leverage the ordered index to process range filters and prefix matching
  • Split the data into multiple chunks, and skip chunks using min/max stats (or a Bloom filter)
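
The last bullet can be sketched as follows: each chunk keeps min/max stats and a Bloom filter, and a term filter only scans chunks that neither check rules out. This is a toy model, not the Milvus implementation; `BloomFilter`, `Chunk`, and `term_filter` are hypothetical names, and the filter size and hash count are arbitrary.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter backed by a Python int used as a bitset."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


class Chunk:
    """A slice of a scalar column with min/max stats and a Bloom filter."""

    def __init__(self, values):
        self.values = list(values)
        self.min = min(self.values)
        self.max = max(self.values)
        self.bloom = BloomFilter()
        for v in self.values:
            self.bloom.add(v)


def term_filter(chunks, value):
    """Return ([(chunk_id, offset), ...], number_of_chunks_scanned)."""
    hits, scanned = [], 0
    for chunk_id, chunk in enumerate(chunks):
        if value < chunk.min or value > chunk.max:
            continue  # pruned by min/max stats
        if not chunk.bloom.might_contain(value):
            continue  # pruned by the Bloom filter
        scanned += 1
        hits.extend((chunk_id, off)
                    for off, v in enumerate(chunk.values) if v == value)
    return hits, scanned


chunks = [Chunk([1, 5, 9]), Chunk([10, 15, 19]), Chunk([20, 25, 29])]
hits, scanned = term_filter(chunks, 15)
```

Min/max stats are most effective when the data is sorted, since chunk ranges then do not overlap. The Bloom filter helps for values that fall inside a chunk's range but are not actually present, at the cost of occasional false positives that still trigger a scan.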

@yah01
Member

yah01 commented Nov 6, 2023

Delete related impl plan:


stale bot commented Dec 7, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale bot added the stale label on Dec 7, 2023
stale bot closed this as completed on Dec 14, 2023