Use Pastva and Henzinger's Unique Node Table #697

SSoelvsten · 2024-11-08T09:56:22Z

In #444 , we added a unique node table for Adiar to skip serializing small BDDs to and from the disk. For simplicity, we reused the node table design from BuDDy. Yet, in [Pastva2023] a different design was suggested. This one has multiple benefits:

It is much more cache-friendly, i.e. it decreases (in practice) the number of look-ups in RAM. This makes it much faster in practice.
The buckets are solely stored in allocated nodes, i.e. it allows for the free chain to be created on-the-fly. This solves BuDDy's high initialization time.
The table seems relatively easy to make thread-safe (see also Thread Safety #650). Here, we can also learn quite a bit from the design in [Lovato2014]

The design in [Pastva2023] is based on an extremely interesting insight: the number of BDD nodes with a lot of parents are guaranteed to be very few in practice. In fact, 70% of all nodes have a single parent, 20% have two parents, 3% have three, and so on.

Simplified: Linear Buckets

Based on the observation above, we can do as follows:

Replace the hash pointer in entry<T> with a parents pointer. Here, parents points to the start of the bucket of all nodes with this as its child. Specifically, where this node is its minimum child (the use of minimum avoids terminals for all but the bottom-most nodes).
The next pointer stores a singly-linked list of all other parents with this as its minimum child. Again, whether a new node should be added to the beginning, the end, or should be kept sorted with respect to the nodes and/or the table is up for experiments.

In practice, we should expect the parent pointer to immediately provide the desired node. This leaves the bottom-most nodes with only terminal children. Since they are not explicit nodes in the table (?), then we need to store two additional parent pointers next to the table. Yet, these terminals can have up to 3n parents (for n variables) when we use the table for both BDDs and ZDDs. Here, my simplified linear bucket above would have terrible performance.

Proper: Trie-buckets

Pastva and Henzinger proposed a more sophisticated version of this idea that also solves the above issue with terminals:

Also, replace the next pointer in entry<T> with next_0, and next_1. These two values correspond to edges in something quite similar to a prefix-trie. This trie is traversed bit-by-bit based on the hash-value of the variable label and the index of the other child, i.e. the maximum one. While evaluating this requires a few more CPU cycles (and an extra if-statement) it completely collapses the depth of each buckets.

Note: I am unsure how the deal with hash collisions in [Pastva2023]. A possible hash function that is guaranteed to not create collisions is the following: zip the maximum node index with the minimum node level. This uniquely identifies the node in question (together with the parent we started from).

Note: Note that, unlike a prefix-trie one does not traverse up to a unique prefix before being given the result. Instead, nodes are placed as early as possible on a path that matches its prefix. This is a vital insight to be able to include the next_0 and the next_1 pointers within entry<T>.

Note: One can maybe draw inspiration from the Anatree I implemented implemented (for an entirely different project) a few years ago.

Alternative : Tree-buckets

A simple tree is also a possibility; if the hash function is "random" enough, then the tree is guaranteed to be relatively balanced.

References

[Lovato2014] Alberto Lovato, Damiano Macedonio, and Fausto Spoto. “A Thread-Safe Library for Binary Decision Diagrams”. In: Software Engineering and Formal Methods (2014)
[Pastva2023] Samuel Pastva and Thomas Henzinger. “Binary Decision Diagrams on Modern Hardware”. In: Proceedings of the 23rd Conference on Formal Methods in Computer-Aided Design (2023)

The text was updated successfully, but these errors were encountered:

SSoelvsten added ✨ optimisation It's all about speed / space 📁 internal This is where the ✨magic✨happens 🎓 student project Work, work... but academic! labels Nov 8, 2024

SSoelvsten added this to the v3.0 : Higher Performance for Small Instances milestone Nov 8, 2024

SSoelvsten mentioned this issue Nov 8, 2024

Add a Unique Node Table #444

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use Pastva and Henzinger's Unique Node Table #697

Use Pastva and Henzinger's Unique Node Table #697

SSoelvsten commented Nov 8, 2024 •

edited

Loading

Use Pastva and Henzinger's Unique Node Table #697

Use Pastva and Henzinger's Unique Node Table #697

Comments

SSoelvsten commented Nov 8, 2024 • edited Loading

Simplified: Linear Buckets

Proper: Trie-buckets

Alternative : Tree-buckets

References

SSoelvsten commented Nov 8, 2024 •

edited

Loading