Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

intptrcast: remove information about dead allocations #3122

Merged
merged 4 commits into from
Oct 19, 2023

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Oct 14, 2023

The discussion in #3103 convinced me we don't have to keep int_to_ptr_map around for dead allocations. But we should not make that depend on the GC, we can just tie it to when the allocation gets freed. That means everything still behaves deterministically, if anything weird happens (but it shouldn't).

r? @saethlin
Only the first and last commit contain logic changes, the 2nd commit just moves code around a bit.

@rustbot
Copy link
Collaborator

rustbot commented Oct 14, 2023

Error: The feature assign is not enabled in this repository.
To enable it add its section in the triagebot.toml in the root of the repository.

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

@RalfJung RalfJung changed the title intrptrcast: remove information about dead allocations intptrcast: remove information about dead allocations Oct 14, 2023
@RalfJung
Copy link
Member Author

Uh, what... I was sure I ran the tests locally...

@RalfJung
Copy link
Member Author

Okay, base_addr needs to stay, but int_to_ptr_map can be removed.

@RalfJung RalfJung force-pushed the intrptrcast-clean branch 2 times, most recently from cf83d11 to 75b92dc Compare October 14, 2023 10:47
@saethlin
Copy link
Member

Okay, base_addr needs to stay, but int_to_ptr_map can be removed.

That's because we need the GC to be able to delete entries from base_addr, right?

@RalfJung
Copy link
Member Author

Yes.

Comment on lines 268 to 280
// We can *not* remove this from `base_addr`, since `addr_from_alloc_id` is called on each
// attempt at a memory access to determine the allocation ID and offset -- and there can
// still be pointers with `dead_id` that one can attempt to use for a memory access.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this is problematic for accesses. Even if we remove an allocation which still has pointers to it, the pointer is still dangling right? I expected the problem here to be converting pointers to freed allocations to integers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When such a dangling pointer gets used, the first thing that happens is that we call ptr_get_alloc, which is expected to return the AllocId and the offset inside the allocation. Once we remove an allocation from base_addr we can no longer determine the offset, so we have a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case would we need to know the offset inside the allocation and also be okay with using a pointer to a deallocated allocation?

Copy link
Member Author

@RalfJung RalfJung Oct 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are implicitly suggesting a rework of the Machine trait here. And it's not super trivial since in error messages we want to distinguish "ptr with no provenance" from "ptr with provenance of a dangling allocation", so just returning None when asked for "what alloc_id + offset is this provenance" would not work.

So this can probably be done but it's a bigger change than what this PR does. So far the API is designed around the idea that "get alloc_id + offset" and "determine whether alloc is live and offset inbounds" are two separate independent steps. That's a nice and simple interface; we'd have to make it more complicated if we want to be able to remove pointers from base_addr on deallocation.

Copy link
Member

@saethlin saethlin Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are implicitly suggesting a rework of the Machine trait here.

I didn't mean to suggest that. I think this whole situation you're describing is part of why I was so confused, I don't know the interpreter internals well enough to reason from them as opposed to from the language. And to this specific question of mine above, I was half expecting you to produce some obscure Rust code in answer.

Can you adjust this comment a bit to indicate that the design of the interpreter requires us to be able to do "get alloc_id + offset" before we do UB detection? Perhaps something like

        // We can *not* remove this from `base_addr`, since the interpreter design requires that we be able to retrieve
        // an AllocId + offset for any memory access *before* we check if the access is valid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me with any such tweak

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay. :) I incorporated that into the comment.

@RalfJung RalfJung force-pushed the intrptrcast-clean branch 2 times, most recently from ca60134 to f22fede Compare October 15, 2023 17:15
@RalfJung
Copy link
Member Author

@saethlin any objections to landing this, any remaining ideas for how to improve this?

I think making base_addr cleanable is a larger refactor and I'm not yet convinced it's worth the impact on the rest of the code. It's definitely not something I want to do in this PR.

@bors
Copy link
Contributor

bors commented Oct 19, 2023

☔ The latest upstream changes (presumably #3130) made this pull request unmergeable. Please resolve the merge conflicts.

@RalfJung
Copy link
Member Author

@bors r=saethlin

@bors
Copy link
Contributor

bors commented Oct 19, 2023

📌 Commit f2cb752 has been approved by saethlin

It is now in the queue for this repository.

@bors
Copy link
Contributor

bors commented Oct 19, 2023

⌛ Testing commit f2cb752 with merge e9e1b3d...

@bors
Copy link
Contributor

bors commented Oct 19, 2023

☀️ Test successful - checks-actions
Approved by: saethlin
Pushing e9e1b3d to master...

@bors bors merged commit e9e1b3d into rust-lang:master Oct 19, 2023
7 checks passed
@RalfJung RalfJung deleted the intrptrcast-clean branch October 20, 2023 06:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants