Smarter bulk & unordered scans #43
Those changes should precede #17 to have a finalized scan interface.
ashvardanian added a commit that referenced this issue on Sep 5, 2022
Fix: Python build with new scans #43 Fix: retrieving the gist of document fields.
If the |
DarvinHarutyunyan pushed a commit that referenced this issue on Dec 9, 2022
Fix: Python build with new scans #43 Fix: retrieving the gist of document fields.
DarvinHarutyunyan pushed a commit that referenced this issue on Dec 9, 2022
DarvinHarutyunyan pushed a commit that referenced this issue on Dec 9, 2022
ashvardanian modified the milestones: v0.5: Snapshots, NetworkX, Vector Search; v0.6: Sampling, Replication, Schema Validation, JS (Jan 21, 2023)
Currently, `ukv_scan` only works for fully consistent, sorted exports of keys from collections. With the `bulk` flag we allow prioritizing throughput over consistency, but a point can be made that ML-like pipelines don't need any dependency between operations whatsoever. Instead, they may use scans to uniformly random-sample entries, which would in turn require a full scan of keys. If the user leaves `start_key` unset, we can perform the bulk sampling behind the curtains ourselves. It will make the interface uglier by making one function dual-use, but will keep the interface short. Worth considering.
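To make the sampling idea concrete, here is a minimal sketch of how a uniform sample could be drawn in a single full pass over the keys, using reservoir sampling (Algorithm R). It is not UKV code: `sample_keys` and its parameters are hypothetical names, and the `keys` iterable merely stands in for whatever an unordered full scan would yield.

```python
import random
from typing import Iterable, List, Optional


def sample_keys(keys: Iterable[int], count: int, seed: Optional[int] = None) -> List[int]:
    """Uniformly sample up to `count` keys in one pass (reservoir sampling).

    Hypothetical helper: `keys` stands in for the output of a full,
    unordered scan; nothing here calls into UKV itself.
    """
    rng = random.Random(seed)
    reservoir: List[int] = []
    for seen, key in enumerate(keys):
        if seen < count:
            # Fill the reservoir with the first `count` keys.
            reservoir.append(key)
        else:
            # Pick a slot in [0, seen]; the new key survives
            # with probability count / (seen + 1).
            slot = rng.randint(0, seen)
            if slot < count:
                reservoir[slot] = key
    return reservoir


if __name__ == "__main__":
    # Sample 10 keys out of a simulated collection of 1000 keys.
    print(sample_keys(range(1000), 10, seed=42))
```

The order in which keys arrive does not affect the uniformity of the result, which is why such a pass would not need the sorted, consistent export that the regular scan provides.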