Release new docs to master
Milvus-doc-bot authored and Milvus-doc-bot committed Nov 27, 2024
1 parent d34542f commit 8e6688b
Showing 1 changed file with 44 additions and 0 deletions.
44 changes: 44 additions & 0 deletions v2.5.x/site/en/userGuide/schema/sparse_vector.md
@@ -559,3 +559,47 @@ curl --request POST \

For more information on similarity search parameters, refer to [Basic ANN Search](single-vector-search.md).

## Limits

When using sparse vectors in Milvus, consider the following limits:

- Currently, only the __IP__ distance metric is supported for sparse vectors.

- For sparse vector fields, only the __SPARSE_INVERTED_INDEX__ and __SPARSE_WAND__ index types are supported.

- Currently, [range search](range-search.md), [grouping search](grouping-search.md), and [search iterator](with-iterators.md) are not supported for sparse vectors.
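
The first two limits can be checked before an index is created. The following is a minimal sketch in plain Python, not part of the Milvus SDK; the helper name `check_sparse_index_params` is hypothetical, and the supported values are copied from the list above.

```python
# Limits stated above, encoded as plain data (no Milvus connection needed).
SUPPORTED_METRICS = {"IP"}
SUPPORTED_INDEX_TYPES = {"SPARSE_INVERTED_INDEX", "SPARSE_WAND"}


def check_sparse_index_params(index_type: str, metric_type: str) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none)."""
    problems = []
    if metric_type not in SUPPORTED_METRICS:
        problems.append(f"unsupported metric_type for sparse vectors: {metric_type}")
    if index_type not in SUPPORTED_INDEX_TYPES:
        problems.append(f"unsupported index_type for sparse vectors: {index_type}")
    return problems


# Settings that respect the limits produce no problems.
print(check_sparse_index_params("SPARSE_INVERTED_INDEX", "IP"))
# Settings meant for dense vectors are reported.
print(check_sparse_index_params("HNSW", "L2"))
```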

## FAQ

- __What distance metric is supported for sparse vectors?__

Sparse vectors only support the Inner Product (IP) distance metric due to the high dimensionality of sparse vectors, which makes L2 distance and cosine distance impractical.
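
To illustrate what the IP metric computes, here is a minimal sketch in plain Python. It assumes a sparse vector is represented as a `{dimension: value}` dictionary; the function name `sparse_inner_product` is hypothetical and this is not how Milvus computes the score internally. Only dimensions present in both vectors contribute to the result.

```python
def sparse_inner_product(a: dict[int, float], b: dict[int, float]) -> float:
    """Inner product of two sparse vectors stored as {dimension: value}."""
    # Iterate over the vector with fewer entries to do less work.
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[dim] for dim, value in a.items() if dim in b)


query = {1: 0.5, 100: 0.3, 500: 0.8}
doc = {1: 0.2, 500: 0.5, 1000: 0.7}

# Dimensions 1 and 500 overlap; 100 and 1000 contribute nothing.
print(sparse_inner_product(query, doc))
```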

- __Can you explain the difference between SPARSE_INVERTED_INDEX and SPARSE_WAND, and how do I choose between them?__

__SPARSE_INVERTED_INDEX__ is a traditional inverted index, while __SPARSE_WAND__ uses the [Weak-AND](https://dl.acm.org/doi/10.1145/956863.956944) algorithm to reduce the number of full IP distance evaluations during search. __SPARSE_WAND__ is typically faster, but its performance can decline with increasing vector density. To choose between them, conduct experiments and benchmarks based on your specific dataset and use case.
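
As a rough illustration of the inverted-index idea, the sketch below maps each dimension to the documents that have a non-zero value in it and accumulates IP scores only over the dimensions in the query. It is a simplified plain-Python model with hypothetical function names, not the Milvus implementation, and it does not include the pruning that Weak-AND adds on top.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[int, dict[int, float]]) -> dict[int, list[tuple[int, float]]]:
    """Map each dimension to a posting list of (doc_id, value) pairs."""
    index = defaultdict(list)
    for doc_id, vec in docs.items():
        for dim, value in vec.items():
            index[dim].append((doc_id, value))
    return index


def search(index, query: dict[int, float], top_k: int) -> list[tuple[int, float]]:
    """Score documents by inner product, visiting only the query's dimensions."""
    scores = defaultdict(float)
    for dim, q_value in query.items():
        for doc_id, d_value in index.get(dim, []):
            scores[doc_id] += q_value * d_value
    # Highest inner product first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


docs = {0: {1: 1.0, 2: 2.0}, 1: {2: 1.0}, 2: {3: 5.0}}
index = build_inverted_index(docs)
print(search(index, {2: 1.0}, top_k=2))
```

Every posting in the query's dimensions is fully scored here; Weak-AND aims to skip documents that cannot reach the current top-k threshold.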

- __How should I choose the drop_ratio_build and drop_ratio_search parameters?__

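__drop_ratio_build__ sets the proportion of small vector values that are ignored when the index is built, and __drop_ratio_search__ sets the proportion of small values that are ignored during search. Higher ratios generally trade accuracy for lower latency and higher throughput, so the right values depend on the characteristics of your data and on your accuracy requirements. Benchmark a few settings on your own dataset before committing to one.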
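
To get an intuition for the effect of a drop ratio, the sketch below removes the smallest values from a single sparse vector. It assumes the ratio means "the fraction of entries with the smallest magnitude to discard" and applies it per vector; the function name `drop_smallest` is hypothetical, and the exact way Milvus applies these parameters may differ.

```python
def drop_smallest(vec: dict[int, float], drop_ratio: float) -> dict[int, float]:
    """Discard the given fraction of entries with the smallest magnitude."""
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    n_drop = int(len(vec) * drop_ratio)
    # Sort by magnitude, largest first, and keep everything except the tail.
    ranked = sorted(vec.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return dict(ranked[: len(vec) - n_drop])


vec = {1: 0.1, 2: 0.9, 3: 0.5, 4: 0.05}
# With a ratio of 0.5, the two smallest of the four entries are dropped.
print(drop_smallest(vec, 0.5))
```

Comparing search results on the original and the reduced vectors is one way to estimate how much accuracy a given ratio costs on your data.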

- __What data types are supported for sparse embeddings?__

The dimension part must be an unsigned 32-bit integer, and the value part can be a non-negative 32-bit floating-point number.

- __Can the dimension of a sparse embedding be any discrete value within the uint32 space?__

Yes, with one exception. The dimension of a sparse embedding can be any value in the range of `[0, maximum of uint32)`. This means you cannot use the maximum value of uint32.

- __Are searches on growing segments conducted through an index or by brute force?__

Searches on growing segments are conducted through an index of the same type as the sealed segment index. For new growing segments before the index is built, a brute force search is used.

- __Is it possible to have both sparse and dense vectors in a single collection?__

Yes, with multiple vector type support, you can create collections with both sparse and dense vector columns and perform hybrid searches on them.

- __What are the requirements for sparse embeddings to be inserted or searched?__

Sparse embeddings must have at least one non-zero value, and vector indices must be non-negative.
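
The requirements in this FAQ can be combined into a client-side check that runs before insert or search. This is a minimal sketch in plain Python that assumes the `{dimension: value}` dictionary representation; `validate_sparse_embedding` is a hypothetical helper, not a Milvus API, and it does not verify that values fit in 32-bit floating point.

```python
import math

UINT32_MAX = 2**32 - 1  # maximum of uint32, which is itself not allowed as a dimension


def validate_sparse_embedding(vec: dict[int, float]) -> None:
    """Raise ValueError if the embedding violates the rules in this FAQ."""
    # At least one non-zero value is required (an empty vector also fails).
    if not any(value != 0 for value in vec.values()):
        raise ValueError("sparse embedding needs at least one non-zero value")
    for dim, value in vec.items():
        # Dimensions must lie in [0, maximum of uint32).
        if isinstance(dim, bool) or not isinstance(dim, int) or not 0 <= dim < UINT32_MAX:
            raise ValueError(f"dimension out of range: {dim!r}")
        # Values must be finite and non-negative.
        if not math.isfinite(value) or value < 0:
            raise ValueError(f"invalid value at dimension {dim}: {value!r}")


validate_sparse_embedding({0: 0.5, UINT32_MAX - 1: 1.0})  # passes silently
```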
