diff --git a/v2.5.x/site/en/userGuide/schema/sparse_vector.md b/v2.5.x/site/en/userGuide/schema/sparse_vector.md
index fee399fa2..e0062c03c 100644
--- a/v2.5.x/site/en/userGuide/schema/sparse_vector.md
+++ b/v2.5.x/site/en/userGuide/schema/sparse_vector.md
@@ -559,3 +559,47 @@ curl --request POST \
 
 For more information on similarity search parameters, refer to [Basic ANN Search](single-vector-search.md).
 
+## Limits
+
+When using sparse vectors in Milvus, consider the following limits:
+
+- Currently, only the __IP__ distance metric is supported for sparse vectors.
+
+- For sparse vector fields, only the __SPARSE_INVERTED_INDEX__ and __SPARSE_WAND__ index types are supported.
+
+- Currently, [range search](range-search.md), [grouping search](grouping-search.md), and [search iterator](with-iterators.md) are not supported for sparse vectors.
+
+## FAQ
+
+- __What distance metric is supported for sparse vectors?__
+
+    Sparse vectors only support the Inner Product (IP) distance metric due to the high dimensionality of sparse vectors, which makes L2 distance and cosine distance impractical.
+
+- __Can you explain the difference between SPARSE_INVERTED_INDEX and SPARSE_WAND, and how do I choose between them?__
+
+    __SPARSE_INVERTED_INDEX__ is a traditional inverted index, while __SPARSE_WAND__ uses the [Weak-AND](https://dl.acm.org/doi/10.1145/956863.956944) algorithm to reduce the number of full IP distance evaluations during search. __SPARSE_WAND__ is typically faster, but its performance can decline with increasing vector density. To choose between them, conduct experiments and benchmarks based on your specific dataset and use case.
+
+- __How should I choose the drop_ratio_build and drop_ratio_search parameters?__
+
+    The choice of __drop_ratio_build__ and __drop_ratio_search__ depends on the characteristics of your data and your requirements for search latency/throughput and accuracy.
+
+- __What data types are supported for sparse embeddings?__
+
+    The dimension part must be an unsigned 32-bit integer, and the value part can be a non-negative 32-bit floating-point number.
+
+- __Can the dimension of a sparse embedding be any discrete value within the uint32 space?__
+
+    Yes, with one exception. The dimension of a sparse embedding can be any value in the range of `[0, maximum of uint32)`. This means you cannot use the maximum value of uint32.
+
+- __Are searches on growing segments conducted through an index or by brute force?__
+
+    Searches on growing segments are conducted through an index of the same type as the sealed segment index. For new growing segments before the index is built, a brute force search is used.
+
+- __Is it possible to have both sparse and dense vectors in a single collection?__
+
+    Yes, with multiple vector type support, you can create collections with both sparse and dense vector columns and perform hybrid searches on them.
+
+- __What are the requirements for sparse embeddings to be inserted or searched?__
+
+    Sparse embeddings must have at least one non-zero value, and vector indices must be non-negative.
+
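
The constraints listed in the FAQ above (dimension in `[0, maximum of uint32)`, non-negative values, at least one non-zero value) can be checked on the client before inserting. The following is a minimal sketch, not part of the Milvus API: it assumes a sparse embedding is held as a plain Python `{dimension: value}` dictionary, and the names `validate_sparse_embedding` and `sparse_ip` are hypothetical helpers introduced only for illustration. It does not check that values fit in 32-bit floating-point range.

```python
import math

# Largest uint32 value; per the FAQ it is excluded from the valid dimension range.
UINT32_MAX = 2**32 - 1


def validate_sparse_embedding(vec: dict) -> None:
    """Raise ValueError if vec breaks the constraints listed in the FAQ.

    Assumes vec maps integer dimensions to numeric values (hypothetical helper).
    """
    # At least one non-zero value is required; an empty dict fails here too.
    if not any(v != 0 for v in vec.values()):
        raise ValueError("sparse embedding needs at least one non-zero value")
    for dim, value in vec.items():
        # Dimensions must be integers in [0, UINT32_MAX).
        if not isinstance(dim, int) or isinstance(dim, bool) or not 0 <= dim < UINT32_MAX:
            raise ValueError(f"dimension {dim!r} is outside [0, {UINT32_MAX})")
        # Values must be non-negative numbers.
        if math.isnan(value) or value < 0:
            raise ValueError(f"value {value!r} at dimension {dim} must be non-negative")


def sparse_ip(a: dict, b: dict) -> float:
    """Inner product of two sparse vectors: sum over the dimensions they share."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(v * b[d] for d, v in a.items() if d in b)
```

Because only shared dimensions contribute to the sum, `sparse_ip({1: 0.5, 3: 2.0}, {3: 1.5, 7: 4.0})` touches just dimension `3`, which illustrates why the inner product is cheap to evaluate on sparse data.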