This repository has been archived by the owner on Aug 2, 2022. It is now read-only.
As I see, #264 adds Hamming distance to custom scoring, which is a great feature. I also see that nmslib has a bit_hamming space (space_bit_hamming). I think we could add this to the plugin.
I looked at the code in space_bit_hamming and space_bit_hamming_test. Maybe we could add "SpaceBitVector" to the plugin and support the bit_hamming space, which uses a non-optimized index.
I also looked at PR #161, which adds a non-optimized index for "negdotprod", and at nmslib's python_binding code (python_binding_nmslib). Maybe we could add a "save_data" option to the plugin so that it can store both the index and the dataset for a non-optimized index.
So I have submitted a PR for this.
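For readers unfamiliar with the distance being discussed, here is a minimal sketch of bit Hamming distance over packed bit vectors. It is only an illustration in Python, not nmslib's space_bit_hamming implementation or the plugin's code; the function name `bit_hamming` and the choice of `bytes` as the packed representation are assumptions made for this example.

```python
def bit_hamming(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length packed bit vectors.

    Hypothetical helper for illustration; not an nmslib or k-NN plugin API.
    """
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    # XOR leaves a 1 exactly where the two inputs differ; count those 1s.
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(diff).count("1")


# Two 16-bit vectors that differ in every position.
distance = bit_hamming(b"\xff\x00", b"\x00\xff")
```

Packing the bits into machine words and counting the set bits of the XOR is the usual way to keep this distance cheap to evaluate.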
We have a few concerns about the non-optimized index for Hamming distance in nmslib. In Elasticsearch, we store a serialized graph per segment, which means one additional file per knn_vector field. For non-optimized indices like Hamming, we would end up with 2 files per segment: one to store the graph and one for the data (the elements in the graph). For a large data set, it is possible to end up with a large number of segments, which could exhaust the available file descriptors. The PR you mentioned, #161, was put on hold for the very same reason. We worked with the nmslib team to make an optimized index for negative dot product so that there is one file per segment. We will have a new PR that enables negative dot product with an optimized index.
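The file-count concern above can be made concrete with a rough back-of-the-envelope sketch. The segment and field counts below are made-up numbers chosen only for illustration, `knn_files` is a hypothetical helper, and the model assumes every k-NN file of every segment counts against the descriptor budget at the same time, which may overstate real usage.

```python
def knn_files(segments: int, knn_fields: int, files_per_field: int) -> int:
    """Estimate the extra k-NN files on a node (hypothetical model).

    Assumes each segment holds `files_per_field` files for every
    knn_vector field, as described in the comment above.
    """
    return segments * knn_fields * files_per_field


# Illustrative numbers only: 500 segments, 2 knn_vector fields.
optimized = knn_files(segments=500, knn_fields=2, files_per_field=1)
non_optimized = knn_files(segments=500, knn_fields=2, files_per_field=2)
extra = non_optimized - optimized
```

Under these assumptions the non-optimized layout doubles the number of k-NN files, so the extra cost grows linearly with the number of segments.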
There are a couple of suggestions:
1. Enable optimized index support for Hamming in nmslib, and then incorporate the changes into the k-NN plugin.
2. Make use of the custom scoring feature for Hamming.
How about you start with the 2nd approach and let us know if you see any performance concerns with custom scoring for Hamming?
We could then decide whether to have an optimized or non-optimized Hamming index. Until then, we would like to keep your PR (#284) for Hamming support on hold.
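To get a feel for what the 2nd approach costs, here is a minimal sketch of an exact (brute-force) top-k scan by Hamming distance, which is conceptually the kind of work a scoring-based search does over the candidate documents. This is not the plugin's custom scoring code; `hamming`, `exact_top_k`, and the representation of bit vectors as Python integers are assumptions made for this example.

```python
import heapq
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two bit vectors stored as ints."""
    return bin(a ^ b).count("1")


def exact_top_k(query: int, docs: Dict[str, int], k: int) -> List[Tuple[int, str]]:
    """Score every document against the query and keep the k closest.

    Hypothetical helper: cost is linear in the number of documents,
    which is the performance concern worth measuring.
    """
    scored = ((hamming(query, vec), doc_id) for doc_id, vec in docs.items())
    return heapq.nsmallest(k, scored)


# Tiny illustrative dataset of 4-bit vectors.
docs = {"a": 0b1010, "b": 0b1000, "c": 0b0101}
nearest = exact_top_k(0b1011, docs, k=2)
```

Because every candidate is scored, the results are exact, and timing a scan like this on a realistically sized dataset is one way to judge whether an index is needed at all.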