Special operations on indexes
There are a few additional operations that are not supported for all types of indexes.
The methods reconstruct and reconstruct_n reconstruct one or several vectors from an index, given their ids.
Example usage: test_index_composite.py
Supported for IndexFlat, IndexIVFFlat (call make_direct_map first), IndexIVFPQ (same), IndexPreTransform (provided the underlying transform supports it).
The method remove_ids removes a subset of vectors from an index. It takes an IDSelector object that is called for every element in the index to decide whether it should be removed. IDSelectorBatch does this for a list of ids. The Python interface constructs this efficiently.
Note that because remove_ids makes a pass over the whole database, it is efficient only when a significant number of vectors need to be removed.
Example: test_index_composite.py
Supported by IndexFlat, IndexIVFFlat, IndexIVFPQ, IDMap.
Note that there is a semantic difference when removing ids from sequential indexes vs. when removing them from an IndexIVF:
- for sequential indexes (IndexFlat, IndexPQ, IndexLSH), the removal operation shifts the ids of vectors above the removed vector id.
- the IndexIVFs store the ids of vectors explicitly, so the ids of other vectors are not changed.
The method range_search returns all vectors within a radius around the query point (as opposed to the k nearest ones). Since the result lists for each query are of different sizes, it must be handled specially:
- in C++ it returns the results in a pre-allocated RangeSearchResult [https://github.com/facebookresearch/faiss/blob/master/AuxIndexStructures.h#L35] structure
- in Python, the results are returned as a triplet of 1D arrays lims, D, I, where the result for query i is in I[lims[i]:lims[i+1]] (indices of neighbors) and D[lims[i]:lims[i+1]] (distances).
Supported by (CPU only): IndexFlat, IndexIVFFlat, IndexScalarQuantizer, IndexIVFScalarQuantizer.
The methods:
- merge_from copies another index into this one and deallocates it on-the-fly. You can use ivflib::merge_into for IndexIVFs wrapped in a pre-transform.
- copy_subset_to copies a subset of this index's codes to another index. Example usage: building indexes on a GPU and moving them to CPU afterwards.
These functions are implemented only for IndexIVF subclasses because they are mainly interesting for large indexes.