[Feature]: Join two collections #35500

xiaofan-luan · 2024-08-15T20:43:54Z

Is there an existing issue for this?

I have searched the existing issues

Is your feature request related to a problem? Please describe.

In some use cases, users need to search for the top-k results for each entity of another collection.

This can be called a kNN join or a semantic join.

Simply listing it here to wait for more discussion.
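To make the request concrete, here is a minimal brute-force sketch of a kNN join in plain Python. It is not a Milvus API: the function name `knn_join`, the dict-of-vectors layout, and the squared L2 metric are all assumptions chosen for illustration.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def l2_squared(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_join(
    left: Dict[str, Vector], right: Dict[str, Vector], k: int
) -> Dict[str, List[Tuple[str, float]]]:
    """For every entity in `left`, return its k nearest entities in `right`.

    Hypothetical helper: a brute-force O(|left| * |right|) scan, shown only
    to define the semantics of the join.
    """
    result = {}
    for left_id, left_vec in left.items():
        scored = (
            (l2_squared(left_vec, right_vec), right_id)
            for right_id, right_vec in right.items()
        )
        nearest = heapq.nsmallest(k, scored)
        result[left_id] = [(right_id, dist) for dist, right_id in nearest]
    return result


# Two tiny "collections" keyed by entity id (made-up data).
products = {"p1": (0.0, 0.0), "p2": (5.0, 5.0)}
reviews = {"r1": (0.0, 1.0), "r2": (4.0, 5.0), "r3": (10.0, 10.0)}

joined = knn_join(products, reviews, k=2)
print(joined)
```

A real implementation would replace the inner scan with an index search; the sketch only pins down what the join returns.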

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

chasingegg · 2024-08-16T07:45:54Z

We could have something like batched search in the vector search engine. This is helpful with IVF-related indexes: we can group the queries that probe the same posting lists and do the distance computation as a matrix operation to improve QPS.
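The grouping idea can be sketched as follows, assuming a toy IVF layout with one probed list per query (nprobe = 1). The names `batched_ivf_search`, `centroids`, and `posting_lists` are hypothetical, and the inner per-query loop stands in for what would be a single matrix multiplication in a real engine.

```python
import heapq
from collections import defaultdict


def l2_squared(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(query, centroids):
    """Index of the centroid closest to the query (assumes nprobe = 1)."""
    return min(range(len(centroids)), key=lambda i: l2_squared(query, centroids[i]))


def batched_ivf_search(queries, centroids, posting_lists, k):
    # Step 1: group query indices by the posting list they probe.
    groups = defaultdict(list)
    for qi, query in enumerate(queries):
        groups[nearest_centroid(query, centroids)].append(qi)

    # Step 2: visit each posting list once and serve its whole query group.
    results = [[] for _ in queries]
    for list_id, query_ids in groups.items():
        entries = posting_lists[list_id]  # fetched once per group
        for qi in query_ids:
            scored = ((l2_squared(queries[qi], vec), vid) for vid, vec in entries)
            results[qi] = [(vid, d) for d, vid in heapq.nsmallest(k, scored)]
    return results


# Made-up index: two centroids, two vectors in each posting list.
centroids = [(0.0, 0.0), (10.0, 10.0)]
posting_lists = {
    0: [("a", (0.0, 1.0)), ("b", (1.0, 1.0))],
    1: [("c", (10.0, 9.0)), ("d", (12.0, 12.0))],
}
queries = [(0.0, 0.0), (10.0, 10.0), (1.0, 0.0)]

results = batched_ivf_search(queries, centroids, posting_lists, k=1)
print(results)
```

Here the first and third queries share posting list 0, so that list is read once for both of them.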

xiaofan-luan · 2024-08-16T17:09:13Z

That is exactly what I'm thinking.
To implement this, we need:

LRU on segments (usually we don't need to load everything into main memory)
Batch search on all segments (typically NQ == 100k)
Using GPU or other batch optimizations in the index.
Under this mode, we don't really need to do batch insertion.
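As a rough illustration of the first requirement, a segment-level LRU cache could look like the sketch below. The class `SegmentLRU`, its `loader` callback, and the segment ids are assumptions for illustration and do not reflect how Milvus manages segments internally.

```python
from collections import OrderedDict


class SegmentLRU:
    """Keeps at most `capacity` segments resident, evicting the least recently used."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader          # hypothetical callback: segment_id -> segment data
        self._cache = OrderedDict()   # ordered from least to most recently used
        self.loads = 0                # number of cache misses that triggered a load

    def get(self, segment_id):
        if segment_id in self._cache:
            # Cache hit: mark the segment as most recently used.
            self._cache.move_to_end(segment_id)
            return self._cache[segment_id]
        # Cache miss: load the segment, then evict the oldest one if over capacity.
        data = self.loader(segment_id)
        self.loads += 1
        self._cache[segment_id] = data
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)
        return data

    def resident(self):
        """Segment ids currently in memory, least recently used first."""
        return list(self._cache)


cache = SegmentLRU(capacity=2, loader=lambda sid: f"data-{sid}")
for sid in ["s1", "s2", "s1", "s3", "s2"]:
    cache.get(sid)

print(cache.loads, cache.resident())
```

Running the whole query batch against one segment before touching the next keeps the number of loads close to the number of segments.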

xiaofan-luan · 2024-08-16T17:09:22Z

@liliu-z @chasingegg thoughts on it?

liliu-z · 2024-08-19T04:32:07Z

An async/cron job API is needed.
It is a general operation that can apply to any index and cache strategy (segment LRU, all in memory, etc.), but we have some preferred combinations.
It can follow a Map-Reduce pattern: we first do the batch searches and store the results on a cron job leader node (maybe the delegator), and then do the reduce work on them.
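The reduce step described above can be sketched like this, assuming each batch search (the map step) yields a dict from query id to a list of `(entity_id, distance)` hits. The function name `reduce_topk` and the data layout are hypothetical.

```python
import heapq


def reduce_topk(partials, k):
    """Merge per-segment partial results into a global top-k per query.

    `partials` is a list of dicts, one per segment, each mapping a query id
    to a list of (entity_id, distance) hits. Smaller distance is better.
    """
    merged = {}
    for partial in partials:
        for query_id, hits in partial.items():
            merged.setdefault(query_id, []).extend(hits)
    return {
        query_id: heapq.nsmallest(k, hits, key=lambda hit: hit[1])
        for query_id, hits in merged.items()
    }


# Made-up map outputs from two segments.
segment_a = {"q1": [("a1", 0.2), ("a2", 0.9)], "q2": [("a3", 0.5)]}
segment_b = {"q1": [("b1", 0.4)], "q2": [("b2", 0.1), ("b3", 0.7)]}

final = reduce_topk([segment_a, segment_b], k=2)
print(final)
```

Because each partial result already holds at most k hits per query, the leader only needs to hold k hits per query per segment before reducing.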

xiaofan-luan added the kind/feature label Aug 15, 2024

xiaofan-luan self-assigned this Aug 15, 2024
