Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Pastva and Henzinger's Unique Node Table #697

Open
SSoelvsten opened this issue Nov 8, 2024 · 0 comments
Open

Use Pastva and Henzinger's Unique Node Table #697

SSoelvsten opened this issue Nov 8, 2024 · 0 comments
Labels
📁 internal This is where the ✨magic✨happens ✨ optimisation It's all about speed / space 🎓 student project Work, work... but academic!

Comments

@SSoelvsten
Copy link
Owner

SSoelvsten commented Nov 8, 2024

In #444 , we added a unique node table for Adiar to skip serializing small BDDs to and from the disk. For simplicity, we reused the node table design from BuDDy. Yet, in [Pastva2023] a different design was suggested. This one has multiple benefits:

  • It is much more cache-friendly, i.e. it decreases (in practice) the number of look-ups in RAM. This makes it much faster in practice.
  • The buckets are solely stored in allocated nodes, i.e. it allows for the free chain to be created on-the-fly. This solves BuDDy's high initialization time.
  • The table seems relatively easy to make thread-safe (see also Thread Safety #650). Here, we can also learn quite a bit from the design in [Lovato2014]

The design in [Pastva2023] is based on an extremely interesting insight: the number of BDD nodes with a lot of parents are guaranteed to be very few in practice. In fact, 70% of all nodes have a single parent, 20% have two parents, 3% have three, and so on.

Simplified: Linear Buckets

Based on the observation above, we can do as follows:

  • Replace the hash pointer in entry<T> with a parents pointer. Here, parents points to the start of the bucket of all nodes with this as its child. Specifically, where this node is its minimum child (the use of minimum avoids terminals for all but the bottom-most nodes).
  • The next pointer stores a singly-linked list of all other parents with this as its minimum child. Again, whether a new node should be added to the beginning, the end, or should be kept sorted with respect to the nodes and/or the table is up for experiments.

In practice, we should expect the parent pointer to immediately provide the desired node. This leaves the bottom-most nodes with only terminal children. Since they are not explicit nodes in the table (?), then we need to store two additional parent pointers next to the table. Yet, these terminals can have up to 3n parents (for n variables) when we use the table for both BDDs and ZDDs. Here, my simplified linear bucket above would have terrible performance.

Proper: Trie-buckets

Pastva and Henzinger proposed a more sophisticated version of this idea that also solves the above issue with terminals:

  • Also, replace the next pointer in entry<T> with next_0, and next_1. These two values correspond to edges in something quite similar to a prefix-trie. This trie is traversed bit-by-bit based on the hash-value of the variable label and the index of the other child, i.e. the maximum one. While evaluating this requires a few more CPU cycles (and an extra if-statement) it completely collapses the depth of each buckets.

    Note: I am unsure how the deal with hash collisions in [Pastva2023]. A possible hash function that is guaranteed to not create collisions is the following: zip the maximum node index with the minimum node level. This uniquely identifies the node in question (together with the parent we started from).

    Note: Note that, unlike a prefix-trie one does not traverse up to a unique prefix before being given the result. Instead, nodes are placed as early as possible on a path that matches its prefix. This is a vital insight to be able to include the next_0 and the next_1 pointers within entry<T>.

    Note: One can maybe draw inspiration from the Anatree I implemented implemented (for an entirely different project) a few years ago.

Alternative : Tree-buckets

A simple tree is also a possibility; if the hash function is "random" enough, then the tree is guaranteed to be relatively balanced.

References

  • [Lovato2014] Alberto Lovato, Damiano Macedonio, and Fausto Spoto. “A Thread-Safe Library for Binary Decision Diagrams”. In: Software Engineering and Formal Methods (2014)

  • [Pastva2023] Samuel Pastva and Thomas Henzinger. “Binary Decision Diagrams on Modern Hardware”. In: Proceedings of the 23rd Conference on Formal Methods in Computer-Aided Design (2023)

@SSoelvsten SSoelvsten added ✨ optimisation It's all about speed / space 📁 internal This is where the ✨magic✨happens 🎓 student project Work, work... but academic! labels Nov 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📁 internal This is where the ✨magic✨happens ✨ optimisation It's all about speed / space 🎓 student project Work, work... but academic!
Projects
None yet
Development

No branches or pull requests

1 participant