Make AtomTable fully safe and prepare for garbage collection #2736

Draft: adri326 wants to merge 14 commits into rebis-dev

@adri326 commented Dec 31, 2024

This is a follow-up to #2727. It adds another layer of indirection to AtomTable, which gives it the ability to verify that Atoms are safe to dereference and the ability to rearrange their layout for garbage collection purposes.


Prior to this change, each Atom contained the offset of its AtomHeader within a buffer. This let us resize the buffer to add new atoms, but it was prone to issues caused by atoms with invalid offsets, and it would not let us defragment the AtomTable once garbage collection is introduced.

                 |   buffer   |
                 +------------+
                 |    ...     |
Atom(0x12580) -> | AtomHeader | @12580
                 | "Hello wo" | @12588
                 | "rld!"____ | @12590
                 |    ...     |

With this change, Atoms store an index into an array of offsets, so verifying that an atom is valid is as simple as checking that its index is within the bounds of this array:

            | offsets |    |   buffer   |
            +---------+    +------------+
            |   ...   |    |    ...     |
Atom(13) -> | 0x12580 | -> | AtomHeader | @12580
            |   ...   |    | "Hello wo" | @12588
            |   ...   |    | "rld!"____ | @12590
            |   ...   |    |    ...     |
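To make the diagram concrete, here is a minimal sketch of the indirection in safe Rust. It is not the code from this PR: `ToyAtomTable`, `ToyAtom`, `add`, and `get` are hypothetical names, and the one-byte length prefix stands in for the real AtomHeader.

```rust
/// Hypothetical, simplified stand-in for the real AtomTable.
struct ToyAtomTable {
    /// Maps an atom index to the byte offset of its entry in `buffer`.
    offsets: Vec<usize>,
    /// Each entry is a one-byte length (toy header) followed by the string bytes.
    buffer: Vec<u8>,
}

/// Hypothetical atom handle: an index into `offsets`, not a buffer offset.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyAtom(usize);

impl ToyAtomTable {
    fn new() -> Self {
        ToyAtomTable { offsets: Vec::new(), buffer: Vec::new() }
    }

    /// Appends a string to the buffer and records its offset.
    fn add(&mut self, s: &str) -> ToyAtom {
        assert!(s.len() <= u8::MAX as usize, "toy header only holds a one-byte length");
        let offset = self.buffer.len();
        self.buffer.push(s.len() as u8);
        self.buffer.extend_from_slice(s.as_bytes());
        self.offsets.push(offset);
        ToyAtom(self.offsets.len() - 1)
    }

    /// Validates the atom with a bounds check on `offsets` before touching the buffer.
    fn get(&self, atom: ToyAtom) -> Option<&str> {
        let offset = *self.offsets.get(atom.0)?;
        let len = *self.buffer.get(offset)? as usize;
        let bytes = self.buffer.get(offset + 1..offset + 1 + len)?;
        std::str::from_utf8(bytes).ok()
    }
}
```

In this sketch, looking up an atom whose index is out of range returns `None` instead of reading arbitrary bytes from the buffer.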

This also lets us safely change the order of atoms within the buffer, by changing the offsets, without needing to modify the existing Atoms, which paves the way towards garbage collection within AtomTable.
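As an illustration of why the indirection helps, the sketch below compacts a buffer and rewrites only the offsets, so existing indices stay valid. Garbage collection is not implemented in this PR; the `compact` and `read` functions, the `live` flags, and the one-byte length prefix are all assumptions made for the example.

```rust
/// Rebuilds `buffer` with only the entries marked live, and rewrites `offsets`
/// so that surviving indices still resolve to the same string.
/// Assumed toy layout: each entry is a one-byte length followed by its bytes.
fn compact(offsets: &mut [Option<usize>], live: &[bool], buffer: &mut Vec<u8>) {
    let mut new_buffer = Vec::with_capacity(buffer.len());
    for (index, slot) in offsets.iter_mut().enumerate() {
        let old = match *slot {
            Some(old) => old,
            None => continue, // already collected earlier
        };
        if !live.get(index).copied().unwrap_or(false) {
            *slot = None; // dead entry: drop it from the table
            continue;
        }
        let len = buffer[old] as usize;
        let new = new_buffer.len();
        new_buffer.extend_from_slice(&buffer[old..old + 1 + len]);
        *slot = Some(new); // only the offset changes, the index does not
    }
    *buffer = new_buffer;
}

/// Resolves an index through `offsets`, returning `None` for invalid or collected entries.
fn read<'a>(offsets: &[Option<usize>], buffer: &'a [u8], index: usize) -> Option<&'a str> {
    let offset = (*offsets.get(index)?)?;
    let len = *buffer.get(offset)? as usize;
    let bytes = buffer.get(offset + 1..offset + 1 + len)?;
    std::str::from_utf8(bytes).ok()
}
```

Handles held elsewhere are plain indices, so they do not need to be visited or patched when entries move.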

One important detail of my implementation is that the offsets array is stored within an Arcu, which means that the read from the buffer in Atom::as_ptr now depends on atom_table.inner.offsets.read(), which is an acquire atomic operation. When a new atom is added to the table with AtomTable::build_with, the write to the buffer is sequenced before atom_table.inner.offsets.replace(new_offsets), so Atom::as_ptr should now observe the changes as expected (as far as my limited experience with atomics goes).
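The ordering argument can be sketched with plain standard-library atomics. This does not use Arcu or any code from this PR: the `Shared` struct, `published_len`, and `publish_and_read` are hypothetical, and a release store on a length field stands in for replacing the offsets array.

```rust
use std::sync::atomic::{AtomicU8, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical shared state: a data buffer plus a "published" marker.
struct Shared {
    buffer: [AtomicU8; 16],
    /// Stand-in for the offsets array: non-zero once the data is published.
    published_len: AtomicUsize,
}

fn publish_and_read() -> Vec<u8> {
    let shared = Arc::new(Shared {
        buffer: std::array::from_fn(|_| AtomicU8::new(0)),
        published_len: AtomicUsize::new(0),
    });

    let writer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let data = b"Hello world!";
            // 1. Write the data first (relaxed: no ordering on its own).
            for (slot, byte) in shared.buffer.iter().zip(data) {
                slot.store(*byte, Ordering::Relaxed);
            }
            // 2. Publish with a release store, sequenced after the writes above.
            shared.published_len.store(data.len(), Ordering::Release);
        })
    };

    let reader = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || loop {
            // The acquire load synchronizes with the release store it reads from,
            // so the buffer writes made before that store are visible here.
            let len = shared.published_len.load(Ordering::Acquire);
            if len != 0 {
                return shared.buffer[..len]
                    .iter()
                    .map(|b| b.load(Ordering::Relaxed))
                    .collect::<Vec<u8>>();
            }
            thread::yield_now();
        })
    };

    writer.join().unwrap();
    reader.join().unwrap()
}
```

Once the reader observes a non-zero length through the acquire load, it cannot see a partially written buffer. That is the property the paragraph above relies on for Atom::as_ptr.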

Instances where the lack of this property causes actual issues are exceedingly rare: a thread would need to somehow get hold of an atom offset created by another thread, without synchronization, and immediately try to read its data.

The Sync-ness of AtomTable is now proved, making it (hopefully) safe :)


Please don't be scared by the number of commits and lines changed; both will go down once #2727 gets merged.
