We benchmarked lookups (keys within the set and keys not in the set) and
insert/delete pairs for ARTSet
(our ART implementation), CachingARTSet
(an
ART with a prefix cache), rust's BTreeSet
and rust's HashSet
. We use random
integer keys where the keys are chosen from 0 to the size of the set ("dense")
and where they are chosen from all possible 64-bit integers ("sparse"). We also
include benchmarks for random UTF-8 strings.
Here we see that the ART generally does somewhere between the performance of the BTree and the hash table. The cache is little help for small tables or dense keys. This makes sense, as the dense keys will often share a prefix, making the likely depth of the tree fairly short, while prefix compression will ensure the absolute depth of the tree is quite low when there are few elements. Prefix caching does, however, make a substantial difference for sparse integers in larger tables.
There is a similar story here as to the integer workloads above. The benefit of caching here is, however, more pronounced for both lookups and mutations.