[Enhancement]: Milvus String Scalar filter is very slow #33538
Comments
Milvus string filtering is highly optimized with SIMD and is usually much faster than any other vector DB on the planet. If you can give more information, that would help the investigation. (If you can give advice on how to optimize it, that would be even better.)
Well, can you give me a link to the string filter code? I hope I can check the core implementation. cc @xiaofan-luan
The data size is 330922 rows. I use multi-vector search with a string filter (0.9 selectivity) and get slow performance: it takes 3.23s, while pure multi-vector search (without the filter) takes 2.484s. I will upload the script shortly. (By the way, the average string size in the data is 254.) I use the string filter as
@xiaofan-luan Hi, can you point me to the core code of the string filter in the Milvus project? I would like to study it and try to optimize it.
"Pure multi vector search (without filter) will cost 2.484s": this doesn't seem to make any sense. If you create the right index, Milvus usually takes 10 ms in memory or 50 ms with disk. My suggestion is to start not from the code but from profiling, to see which part takes most of your CPU. And may I know what CPU you are running on? Intel or ARM? How many cores? What index did you use?
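Before profiling on the server side, it can help to measure the filtered and unfiltered queries the same way from the client. The sketch below is only an illustration: `measure_latency` and `filter_overhead` are hypothetical helper names, not part of Milvus or pymilvus, and the callable passed in stands in for whatever search call is being timed. The two timings are the ones reported in this thread.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> Dict[str, float]:
    """Return min/median/max wall-clock seconds over `repeats` calls of fn."""
    for _ in range(warmup):
        fn()  # warm caches so the first timed call is not an outlier
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def filter_overhead(unfiltered_s: float, filtered_s: float) -> float:
    """Extra time of the filtered query as a fraction of the unfiltered time."""
    return (filtered_s - unfiltered_s) / unfiltered_s


# Timings reported in this thread: 2.484 s without the filter, 3.23 s with it.
overhead = filter_overhead(2.484, 3.23)
print(f"the filter adds about {overhead:.0%} on top of the unfiltered search")

# Stand-in workload; replace the lambda with the real search call.
stats = measure_latency(lambda: sum(range(1000)), repeats=3)
```

By this measure the filter accounts for roughly 30% extra time, so most of the 3.23s is spent in the vector search itself, which is consistent with the suggestion to check the index and hardware first.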
What is the Milvus version? Honestly speaking, I don't expect anyone with knowhere can optimize Milvus in one month.
You can try creating a Tantivy index, which may help speed up the like expression.
Well, I use not like '%xxxxxx%', and I can't find the API for creating a Tantivy index in the Milvus docs. Does Milvus generate a bitmap to do the filtered search? cc @xiaofan-luan
like is always slow, no matter which database you are using, especially when you don't specify a prefix.
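The point about prefixes can be illustrated with a small sketch. It is a generic illustration, not Milvus's actual implementation, and the function names are hypothetical: with a sorted list of values, a pattern like 'ap%' can be answered with a binary search plus a short walk, whereas '%ap%' (and not like '%ap%') has to inspect every string.

```python
import bisect
from typing import List


def prefix_matches(sorted_values: List[str], prefix: str) -> List[str]:
    """LIKE 'prefix%' on a sorted list: binary search, then walk the run of matches."""
    lo = bisect.bisect_left(sorted_values, prefix)
    hi = lo
    # All strings sharing the prefix are contiguous in sorted order.
    while hi < len(sorted_values) and sorted_values[hi].startswith(prefix):
        hi += 1
    return sorted_values[lo:hi]


def infix_matches(values: List[str], needle: str) -> List[str]:
    """LIKE '%needle%': no ordering helps, so every value is scanned."""
    return [v for v in values if needle in v]


def not_infix_matches(values: List[str], needle: str) -> List[str]:
    """NOT LIKE '%needle%': also a full scan, keeping the non-matching values."""
    return [v for v in values if needle not in v]


values = sorted(["apple", "apricot", "banana", "grape", "pineapple"])
with_prefix = prefix_matches(values, "ap")
with_infix = infix_matches(values, "ap")
without_infix = not_infix_matches(values, "ap")
```

The prefix lookup touches only the matching run, while both infix variants cost time proportional to the number of rows times the string length, which matters with long strings like the ones described above.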
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is there an existing issue for this?
What would you like to be added?
I need to know where the scalar string filter code is.
Why is this needed?
Performance improvement
Anything else?
No response