[Feature]: Support HNSW SQ #23232

Open
1 task done
xiaofan-luan opened this issue Apr 4, 2023 · 42 comments
Assignees
Labels
hacktoberfest Issues picked by hacktoberfest kind/feature Issues related to feature request from users

Comments

@xiaofan-luan
Collaborator

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

SQ8 and PQ are widely used in ANN search. If you want to understand more about quantization, Faiss is probably one of the best code bases to explore.

HNSW is the fastest index in the open source world, so why not make it work together with SQ and PQ to accelerate it further?
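
For readers new to SQ8, here is a minimal sketch of 8-bit scalar quantization in plain Python. It is not the Faiss or Knowhere implementation; the function names (`sq8_train`, `sq8_encode`, `sq8_decode`) are made up for illustration, and the per-dimension min/max scheme is only one common way to pick the quantization range. Each float becomes one byte, so storage shrinks about 4x compared with float32.

```python
from typing import List, Sequence, Tuple


def sq8_train(vectors: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Learn a per-dimension offset and step size from training vectors."""
    dim = len(vectors[0])
    mins = [min(v[d] for v in vectors) for d in range(dim)]
    maxs = [max(v[d] for v in vectors) for d in range(dim)]
    # 256 levels per dimension; fall back to step 1.0 for a constant dimension.
    steps = [(hi - lo) / 255.0 if hi > lo else 1.0 for lo, hi in zip(mins, maxs)]
    return mins, steps


def sq8_encode(vec: Sequence[float], mins: List[float], steps: List[float]) -> bytes:
    """Map each float to the nearest of 256 levels and store it as one byte."""
    codes = bytearray()
    for x, lo, step in zip(vec, mins, steps):
        level = round((x - lo) / step)
        codes.append(max(0, min(255, level)))  # clamp values outside the trained range
    return bytes(codes)


def sq8_decode(codes: bytes, mins: List[float], steps: List[float]) -> List[float]:
    """Reconstruct an approximate float vector from its byte codes."""
    return [lo + c * step for c, lo, step in zip(codes, mins, steps)]


if __name__ == "__main__":
    data = [[0.0, 10.0], [1.0, 20.0], [0.5, 15.0]]
    mins, steps = sq8_train(data)
    code = sq8_encode(data[2], mins, steps)
    print(len(code), sq8_decode(code, mins, steps))
```

The reconstruction error per dimension is at most half a step for values inside the trained range, which is the accuracy you trade for the smaller footprint.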

Let me know if anyone is interested, and we can offer more help with it.

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

@xiaofan-luan xiaofan-luan added good first issue Good for newcomers kind/feature Issues related to feature request from users labels Apr 4, 2023
@xiaofan-luan xiaofan-luan self-assigned this Apr 4, 2023
@noble-8

noble-8 commented Apr 8, 2023

Hi @xiaofan-luan
I am interested in contributing and would like to know how to help.
Let me know how to get started.

@xiaofan-luan
Collaborator Author

Cool! I think @liliu-z could offer you some help.

@xiaofan-luan
Collaborator Author

Do you have any experience with C++, and are you familiar with the HNSW algorithm yet?

@noble-8

noble-8 commented Apr 8, 2023

Hi,
I took a C++ course in college and have worked primarily as a Java developer for about 3 years, so I feel I can onboard quickly and am comfortable working in C++.
I am really new to this algorithm but am getting up to speed. I am going through this documentation: https://www.pinecone.io/learn/hnsw/
Feel free to point me to other resources.
I hope this is not a dealbreaker!

@xiaofan-luan
Collaborator Author

  1. Start by reading the HNSW code in Knowhere: https://github.com/milvus-io/knowhere/tree/c05c8767f43eaa855c13654804d0bea9cc42c7de/src/index/hnsw
  2. Once you understand HNSW, the next step is to understand how PQ and SQ work. The Faiss wiki will give you the general ideas, and Milvus already includes the Faiss code, which you can reuse: https://github.com/facebookresearch/faiss/wiki
  3. Add index parameters so that Milvus supports PQ and SQ, which will be a trivial task.
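
As a companion to step 2, here is a rough sketch of product quantization (PQ) in plain Python. It is a toy under stated assumptions, not the Faiss code: the helper names (`kmeans`, `pq_train`, `pq_encode`, `pq_decode`) are invented for illustration, and the k-means loop is the simplest possible version. The idea is to split each vector into `m` sub-vectors, learn `k` centroids per sub-space, and store only the centroid index for each sub-vector.

```python
import random
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int, iters: int = 10, seed: int = 0) -> List[List[float]]:
    """Very small Lloyd's k-means; requires k <= len(points)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets: List[List[List[float]]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            buckets[nearest].append(p)
        for c, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid when a bucket is empty
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def pq_train(vectors: List[List[float]], m: int, k: int) -> List[List[List[float]]]:
    """Train one codebook of k centroids for each of the m sub-spaces."""
    dim = len(vectors[0])
    assert dim % m == 0, "dimension must be divisible by m"
    sub = dim // m
    return [kmeans([v[i * sub:(i + 1) * sub] for v in vectors], k) for i in range(m)]


def pq_encode(vec: List[float], codebooks: List[List[List[float]]]) -> List[int]:
    """Replace each sub-vector by the index of its nearest centroid."""
    sub = len(vec) // len(codebooks)
    return [
        min(range(len(cb)), key=lambda c: sq_dist(vec[i * sub:(i + 1) * sub], cb[c]))
        for i, cb in enumerate(codebooks)
    ]


def pq_decode(codes: List[int], codebooks: List[List[List[float]]]) -> List[float]:
    """Concatenate the selected centroids to approximate the original vector."""
    out: List[float] = []
    for c, cb in zip(codes, codebooks):
        out.extend(cb[c])
    return out
```

With `k` at most 256, each sub-vector fits in one byte, so a vector is stored in `m` bytes regardless of its original dimension.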

@liliu-z
Member

liliu-z commented Apr 12, 2023

Hi @noble-8, this is on our roadmap, so please feel free to open a PR against https://github.com/milvus-io/knowhere . I suggest starting with SQ8, which is easier to implement. You are also more than welcome to open another issue in Knowhere for more detailed discussion.

@liliu-z
Member

liliu-z commented Apr 12, 2023

/assign @liliu-z

@noble-8

noble-8 commented Apr 12, 2023

Sounds good. Will do!

@xiaofan-luan xiaofan-luan added the hacktoberfest Issues picked by hacktoberfest label Oct 3, 2023
@xiaofan-luan
Collaborator Author

@noble-8
any progress on it?

@noble-8

noble-8 commented Oct 5, 2023

i could not make any progress. i shall try again, however feel free to reassign this if i do not make a commit

@xiaofan-luan
Collaborator Author

i could not make any progress. i shall try again, however feel free to reassign this if i do not make a commit

Sure, and thanks for the interest!
I would also like to help if you are interested.

@LaPetiteSouris

LaPetiteSouris commented Oct 6, 2023

I just saw that https://github.com/milvus-io/knowhere is now archived. Is this issue still open?

Or should the PR go to https://github.com/zilliztech/Knowhere from now on?

To summarize, to make sure I understand this correctly:

  • (1) Knowhere already implements HNSW, which runs on top of high-dimensional vectors.
  • (2) The PR needs to implement SQ8 using Faiss to quantize the high-dimensional vectors into a compressed form.
  • (3) Somehow make HNSW able to work with the quantized/compressed vectors from (2).

Am I correct?
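
Point (3) can be illustrated with a graph search whose distance function decodes the stored codes on the fly. The sketch below is deliberately simplified and is not Knowhere's or hnswlib's code: it searches a single graph layer (real HNSW has several), uses one shared offset and step for all dimensions, and every name in it (`greedy_search`, `dist`, `decode`, `lo`, `step`) is hypothetical. The query stays in float form and only the stored vectors are quantized.

```python
import heapq
from typing import List, Sequence, Tuple


def decode(code: bytes, lo: float, step: float) -> List[float]:
    """Turn one-byte-per-dimension codes back into approximate floats."""
    return [lo + c * step for c in code]


def dist(query: Sequence[float], code: bytes, lo: float, step: float) -> float:
    """Squared L2 distance between a float query and a quantized vector."""
    return sum((q - x) ** 2 for q, x in zip(query, decode(code, lo, step)))


def greedy_search(
    query: Sequence[float],
    codes: List[bytes],
    neighbors: List[List[int]],
    entry: int,
    ef: int,
    lo: float,
    step: float,
) -> List[Tuple[float, int]]:
    """Best-first search over one graph layer, keeping the ef closest nodes."""
    visited = {entry}
    d0 = dist(query, codes[entry], lo, step)
    candidates = [(d0, entry)]  # min-heap of nodes still to expand
    best = [(-d0, entry)]       # max-heap (negated) of the ef closest so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break  # the closest open candidate is worse than our worst result
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(query, codes[nb], lo, step)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-d, n) for d, n in best)


if __name__ == "__main__":
    # Ten 1-D points at roughly 0, 1, ..., 9, linked in a chain.
    codes = [bytes([i * 10]) for i in range(10)]
    neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)]
    print(greedy_search([7.2], codes, neighbors, entry=0, ef=2, lo=0.0, step=0.1))
```

The graph structure is untouched; only the storage and the distance function change, which is why SQ is the easier quantizer to start with.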

@xiaofan-luan
Collaborator Author

https://github.com/milvus-io/knowhere

It has been archived and moved to https://github.com/zilliztech/knowhere. Sorry for the confusion.

You are correct, my man. We want to add quantization support for the HNSW index and integrate it with Milvus.

@LaPetiteSouris

Thanks. How urgently do you folks need this? My C++ is rusty 😭 so it may take a while (I have Copilot, so that helps 😭).

But I love this challenge.

@LaPetiteSouris

LaPetiteSouris commented Oct 6, 2023

@xiaofan-luan if you folks have patience to spare, then assign this to me

Edit: I tried to hack around, and it seems this is a bit too much for me to take on at this time. I'll pick another good first issue to ramp up.

@jiaoew1991
Contributor

@xiaofan-luan I think the issue is not easy for beginners, it needs lots of knowledge 😅

@xiaofan-luan
Collaborator Author

@xiaofan-luan I think the issue is not easy for beginners, it needs lots of knowledge 😅

Agreed, you might be correct.

SQ alone might be OK?

@xiaofan-luan xiaofan-luan changed the title from "[Feature]: Support HNSW SQ/PQ" to "[Feature]: Support HNSW SQ" Oct 13, 2023
@xiaofan-luan
Collaborator Author

But it is true that this requires a full understanding of Milvus.

@xiaofan-luan xiaofan-luan removed the good first issue Good for newcomers label Oct 13, 2023
@xiaofan-luan
Collaborator Author

Removed the good first issue label.

@zaobao

zaobao commented Nov 3, 2023

I found that HNSW-SQ8 is already available on Zilliz Cloud. Is that true?

@xiaofan-luan
Collaborator Author

Zilliz Cloud doesn't use HNSW. We have an internal index named Cardinal.

@Monster880

Monster880 commented May 16, 2024

Can Milvus use an HNSW PQ index now? @xiaofan-luan

@xiaofan-luan
Collaborator Author

/assign @liliu-z

@xiaofan-luan
Collaborator Author

@liliu-z
Do we have a plan to support HNSW PQ and SQ indexes?

@Monster880

@xiaofan-luan Can you provide some guidance on where modifications are needed to support an HNSW PQ index?

@xiaofan-luan
Collaborator Author

NP, I think Li (@liliu-z) can help with that.

@Monster880

@liliu-z @xiaofan-luan emmm.... where is liliu-z

@liliu-z
Member

liliu-z commented May 30, 2024

@liliu-z @xiaofan-luan emmm.... where is liliu-z

Sure, there are two ways to support HNSW + Quantization:

  1. Make it a new index type for Milvus
  2. Treat it as HNSW with a special config.

We are adopting the first way, so the work includes:

  1. Add quantization support on the algorithm side and expose it as a new index type. This code should live in Knowhere.
  2. Let Milvus know about this new index type.

Here is an example PR for the first step. It supports SQ8 for HNSW on the Knowhere side.
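
To make the difference between the two approaches concrete, here is a purely hypothetical sketch. None of these names come from Milvus or Knowhere: `IndexSpec`, `INDEX_TYPES`, `create_by_type`, `create_by_config`, the `"HNSW_SQ8"` type string, and the `"quantizer"` key are all invented for illustration, and `M` / `efConstruction` are used only as example HNSW build parameters.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class IndexSpec:
    """Hypothetical description of what the index builder would construct."""
    graph: str
    quantizer: str = "none"
    params: Dict[str, int] = field(default_factory=dict)


# Approach 1: every graph + quantizer combination is its own index type name.
INDEX_TYPES: Dict[str, Callable[..., IndexSpec]] = {
    "HNSW": lambda **p: IndexSpec("hnsw", "none", p),
    "HNSW_SQ8": lambda **p: IndexSpec("hnsw", "sq8", p),  # hypothetical name
}


def create_by_type(index_type: str, **params: int) -> IndexSpec:
    if index_type not in INDEX_TYPES:
        raise ValueError(f"unknown index type: {index_type}")
    return INDEX_TYPES[index_type](**params)


# Approach 2: a single HNSW type, with the quantizer chosen by a config key.
def create_by_config(config: Dict[str, object]) -> IndexSpec:
    cfg = dict(config)
    quantizer = str(cfg.pop("quantizer", "none"))
    return IndexSpec("hnsw", quantizer, cfg)


if __name__ == "__main__":
    print(create_by_type("HNSW_SQ8", M=16, efConstruction=200))
    print(create_by_config({"quantizer": "sq8", "M": 16, "efConstruction": 200}))
```

Both calls describe the same index. With the first approach, an unsupported combination fails at the type lookup, and each new index type has to be registered explicitly, which matches step 2 above.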

@Monster880

@liliu-z it seems HNSW_PQ is not using faiss but using hnswlib after quantization...

@xiaofan-luan
Collaborator Author

We now prefer hnswlib over Faiss for HNSW, so we need to backport the PQ and SQ features.

@Monster880

@liliu-z @xiaofan-luan Backporting the PQ feature to hnswlib is too hard for me, but HNSW PQ is very beneficial and important to me, so I hope Milvus can support the HNSW PQ index as soon as possible.

@Monster880

Monster880 commented May 31, 2024

@xiaofan-luan I think maybe it can temporarily support hnsw pq index by faiss

@liliu-z
Member

liliu-z commented Jun 3, 2024

@xiaofan-luan I think maybe it can temporarily support hnsw pq index by faiss

Yes, we are discussing with the Faiss team the possibility of switching to Faiss's HNSW. There are still some gaps in performance, features, and APIs. We will support PQ in hnswlib if we finally decide not to go with Faiss.

We will keep this post updated on any progress.

@Monster880

@liliu-z Got it. I would like to know when we can use the HNSW PQ index in Milvus.

@xiaofan-luan
Collaborator Author

@alexanderguzhva
Please help with this.
We want to make sure Faiss HNSW has similar performance and exactly the same functionality as the current Knowhere implementation.

@Monster880

Monster880 commented Jun 4, 2024

@xiaofan-luan @liliu-z Since knowhere already supports hnsw_sq index, when will milvus support it..

@alexanderguzhva
Contributor

@xiaofan-luan I'm already in the middle of deprecating hnswlib in favor of Faiss; the work has been in progress for some time.

@Monster880

@xiaofan-luan Actually, I want to know when will milvus support hnsw_sq index, since knowhere already supports hnsw_sq index..

@liliu-z
Member

liliu-z commented Jun 5, 2024

@xiaofan-luan Actually, I want to know when will milvus support hnsw_sq index, since knowhere already supports hnsw_sq index..

It has not been fully tested yet. We can try to ship it as a beta feature in the next release, which is 1-2 weeks from now. What do you think, @xiaofan-luan?

@xiaofan-luan
Collaborator Author

@liliu-z please make sure the Knowhere side is ready.

@tedxu could you assign someone to support HNSW PQ/SQ and do some testing?

@Monster880

Monster880 commented Jun 16, 2024

Actually, an offline data pipeline can load the index faster and avoid the index training process. Is there a way for me to train the index offline and load it at container (standalone Milvus) startup? @xiaofan-luan

@xiaofan-luan
Collaborator Author

Supporting this might be one of our goals.

Using Milvus with more offline index nodes should help.

If you already build the index yourself, why not simply serve it with Faiss or HNSW?

8 participants