Fix incorrect handling of brk syscall when shrinking the heap #2776

Merged: 5 commits merged into AFLplusplus:main on Dec 19, 2024

Conversation

@cube0x8 (Contributor) commented Dec 17, 2024

Previously, the QEMU snapshot hook for the brk syscall only handled cases where the heap was growing. Each brk call would incrementally update the mapping in the snapshot by assuming that the new brk value was always larger than the previous one.

This is not always the case, though, because the heap can also shrink.

Now:

  • The initial_target_brk is stored in SnapshotModule
  • When updating the mapping (change_mapped), initial_target_brk is used as start_addr, and the new brk value as end_addr.
  • If the new brk value is less than initial_target_brk, the mapping remains unchanged, matching QEMU's behavior in that case (see the sketch below).
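
Here is a minimal, self-contained sketch of that logic. The stub types and the `on_brk_return` name below are illustrative only, not the real LibAFL definitions; only `change_mapped`, `initial_target_brk`, and `MmapPerms::ReadWrite` mirror the names discussed in this PR:

```rust
// Illustrative sketch only; SnapshotModule and MmapPerms here are local stubs,
// not the real LibAFL types.
enum MmapPerms {
    ReadWrite,
}

struct SnapshotModule {
    initial_target_brk: u64,
}

impl SnapshotModule {
    fn change_mapped(&mut self, start: u64, len: usize, _perms: Option<MmapPerms>) {
        println!("snapshot heap interval: {start:#x}..{:#x}", start + len as u64);
    }

    // Called after QEMU has handled the target's brk syscall.
    fn on_brk_return(&mut self, new_brk: u64) {
        if new_brk > self.initial_target_brk {
            // Map the whole interval [initial_target_brk, new_brk), whether the
            // heap grew or shrank since the previous brk call.
            let len = (new_brk - self.initial_target_brk) as usize;
            self.change_mapped(self.initial_target_brk, len, Some(MmapPerms::ReadWrite));
        }
        // If new_brk <= initial_target_brk the mapping is left unchanged,
        // matching QEMU's behaviour in that case.
    }
}

fn main() {
    let mut snap = SnapshotModule { initial_target_brk: 0x1 };
    snap.on_brk_return(0x4); // heap grows:   interval 0x1..0x4
    snap.on_brk_return(0x3); // heap shrinks: interval 0x1..0x3
}
```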

@tokatoka requested a review from rmalmain on December 17, 2024, 20:09
@domenukk (Member)

Won't this mask some potential bugs that access memory out of the page? I guess they could be found with QAsan or similar, so it may not be a huge issue, just pointing it out.

@cube0x8 (Contributor, Author) commented Dec 18, 2024

Hey @domenukk

Hmmm, I don't see how this could mask potential bugs.

QEMU is responsible for intercepting the target's brk syscall and making the heap segment grow or shrink by calling target_mmap and target_munmap respectively. If a process accesses an address outside the mapped region, it will crash.

On the LibAFL side, we just follow QEMU's behaviour to keep the snapshot healthy: we intercept brk syscalls and make the snapshotted brk interval grow or shrink accordingly.

@domenukk (Member)

Well you don't shrink the page, right? Ergo, anything accessing data on the shrunk part of the page will continue to work in our case, but crash in a real target, or am I missing something?

@domenukk (Member)

Ah, QEMU never shrinks anyway? Well ...

@cube0x8 (Contributor, Author) commented Dec 18, 2024

> Ah, QEMU never shrinks anyway? Well ...

Why? Yes, QEMU does shrink: https://github.com/AFLplusplus/qemu-libafl-bridge/blob/b01a0bc334cf11bfc5e8f121d9520ef7f47dbcd1/linux-user/syscall.c#L843

And that's the point: we weren't shrinking on the snapshot side. Before, it was like this:

h.change_mapped(h.brk, (result - h.brk) as usize, Some(MmapPerms::ReadWrite));

with h.brk == QEMU's old_brk and result == QEMU's new_brk. This means we always assumed new_brk > old_brk, which is not always the case. Here is what happened when, for example:
old_brk == 0x4
new_brk == 0x3
initial_brk == 0x1

QEMU handled it correctly with target_munmap(new_brk, old_brk - new_brk);, but we were doing:

h.change_mapped(0x4, -0x1 as usize, Some(MmapPerms::ReadWrite));

where -0x1 cast to usize wraps around to a huge mapping length. Do you see the error now?
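
To make the failure mode concrete, here is a minimal standalone demonstration (assuming 64-bit guest addresses on a 64-bit host, and using `wrapping_sub` to mimic the unchecked subtraction) of what the old length computation produced when the heap shrank:

```rust
fn main() {
    let old_brk: u64 = 0x4; // h.brk before the syscall
    let new_brk: u64 = 0x3; // result returned by the brk syscall

    // The old code computed the mapping length as (result - h.brk) as usize.
    // With unsigned addresses the subtraction underflows, so the cast yields
    // an enormous length instead of a negative one.
    let len = new_brk.wrapping_sub(old_brk) as usize;
    assert_eq!(len, usize::MAX); // 0xffff_ffff_ffff_ffff on a 64-bit host
    println!("bogus mapping length: {len:#x}");
}
```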

Now we keep initial_brk in the SnapshotModule and we just do new_brk - initial_brk:

h.change_mapped(0x1, 0x2 as usize, Some(MmapPerms::ReadWrite));

> Well you don't shrink the page, right? Ergo, anything accessing data on the shrunk part of the page will continue to work in our case, but crash in a real target, or am I missing something?

Yes, we are shrinking now, but I don't fully understand your "will continue to work in our case". If a target accessed an unmapped region (the part that was unmapped by the shrink), it would crash and LibAFL would catch and report the crash. Here we are talking about snapshot handling, though: this bug caused the snapshot Intervals in LibAFL to no longer correspond to the real target's mapping, causing errors (and SIGSEGVs) during snapshot restore.

I hope I could explain myself better this time. I'm here if you have other questions :)

@rmalmain (Collaborator) left a comment

Looks OK to me.
@cube0x8 is right imo, we need to take brk updates into account when the heap shrinks as well.

@@ -11,7 +11,7 @@ use crate::cargo_add_rpath;

 pub const QEMU_URL: &str = "https://github.com/AFLplusplus/qemu-libafl-bridge";
 pub const QEMU_DIRNAME: &str = "qemu-libafl-bridge";
-pub const QEMU_REVISION: &str = "b01a0bc334cf11bfc5e8f121d9520ef7f47dbcd1";
+pub const QEMU_REVISION: &str = "e558cafe7cf565465e9850ccb310c9d40eecc723";

A Collaborator commented on this line:

can you update this to 06bf8facec33548840413fba1b20858f58e38e2d please?

@domenukk merged commit e46cf8a into AFLplusplus:main on Dec 19, 2024. 103 checks passed.