Use Pastva and Henzinger's Unique Node Table #697
Labels
📁 internal
This is where the ✨magic✨happens
✨ optimisation
It's all about speed / space
🎓 student project
Work, work... but academic!
Milestone
In #444 , we added a unique node table for Adiar to skip serializing small BDDs to and from the disk. For simplicity, we reused the node table design from BuDDy. Yet, in [Pastva2023] a different design was suggested. This one has multiple benefits:
free
chain to be created on-the-fly. This solves BuDDy's high initialization time.The design in [Pastva2023] is based on an extremely interesting insight: the number of BDD nodes with a lot of parents are guaranteed to be very few in practice. In fact, 70% of all nodes have a single parent, 20% have two parents, 3% have three, and so on.
Simplified: Linear Buckets
Based on the observation above, we can do as follows:
entry<T>
with a parents pointer. Here, parents points to the start of the bucket of all nodes with this as its child. Specifically, where this node is its minimum child (the use of minimum avoids terminals for all but the bottom-most nodes).In practice, we should expect the
parent
pointer to immediately provide the desired node. This leaves the bottom-most nodes with only terminal children. Since they are not explicit nodes in the table (?), then we need to store two additional parent pointers next to the table. Yet, these terminals can have up to 3n parents (for n variables) when we use the table for both BDDs and ZDDs. Here, my simplified linear bucket above would have terrible performance.Proper: Trie-buckets
Pastva and Henzinger proposed a more sophisticated version of this idea that also solves the above issue with terminals:
Also, replace the next pointer in
entry<T>
withnext_0
, andnext_1
. These two values correspond to edges in something quite similar to a prefix-trie. This trie is traversed bit-by-bit based on the hash-value of the variable label and the index of the other child, i.e. the maximum one. While evaluating this requires a few more CPU cycles (and an extra if-statement) it completely collapses the depth of each buckets.Note: I am unsure how the deal with hash collisions in [Pastva2023]. A possible hash function that is guaranteed to not create collisions is the following: zip the maximum node index with the minimum node level. This uniquely identifies the node in question (together with the
parent
we started from).Note: Note that, unlike a prefix-trie one does not traverse up to a unique prefix before being given the result. Instead, nodes are placed as early as possible on a path that matches its prefix. This is a vital insight to be able to include the next_0 and the next_1 pointers within
entry<T>
.Note: One can maybe draw inspiration from the Anatree I implemented implemented (for an entirely different project) a few years ago.
Alternative : Tree-buckets
A simple tree is also a possibility; if the hash function is "random" enough, then the tree is guaranteed to be relatively balanced.
References
[Lovato2014] Alberto Lovato, Damiano Macedonio, and Fausto Spoto. “A Thread-Safe Library for Binary Decision Diagrams”. In: Software Engineering and Formal Methods (2014)
[Pastva2023] Samuel Pastva and Thomas Henzinger. “Binary Decision Diagrams on Modern Hardware”. In: Proceedings of the 23rd Conference on Formal Methods in Computer-Aided Design (2023)
The text was updated successfully, but these errors were encountered: