Overlay handling with fuse-overlayfs #1062

Merged
merged 36 commits into from
Sep 20, 2024
be28a5f
feat: Implement FUSE-based overlay mount for containerexec and runexec
Jul 2, 2024
0facbcf
feat: Support FUSE-based overlay mount for benchexec
Jul 2, 2024
85f02ca
Only if the kernel overlay fails, try using fuse-overlayfs
Jul 7, 2024
69e929e
refactor: added user-visible messages and some refactoring
younghojan Jul 10, 2024
9818716
feat: Clear ambient capabilities in drop_capabilities() and add const…
Aug 4, 2024
3f69c41
Merge branch 'main' into gsoc-overlay-handling-with-fuse-overlayfs-dev
younghojan Aug 4, 2024
4ee99a3
Merge branch 'main' into gsoc-overlay-handling-with-fuse-overlayfs-dev
PhilippWendler Aug 6, 2024
a699b2f
chore: Fix bug in cap_permitted_to_ambient function
Aug 11, 2024
8006c20
fix: Use single fusermount for all fuse-based overlays, and avoid mix…
Aug 11, 2024
195e4d0
chore: Add functions and extracted some code into functions, add comm…
Aug 13, 2024
00f9cb8
chore: Refactor some functions related to fuse-based overlay mounts a…
Aug 13, 2024
328aad4
chore: Refactor functions related to fuse-based overlay mounts and im…
Aug 13, 2024
de86749
chore: Refactor functions related to fuse-based overlay mounts and im…
Aug 14, 2024
a308c46
chore: Replace f-string in logging.debug with %s formatting
Aug 15, 2024
b941329
Add fuse-overlayfs to our recommended dependencies
PhilippWendler Aug 16, 2024
2529120
Update documentation on kernel overlayfs vs. fuse-overlayfs
PhilippWendler Aug 16, 2024
dde34ea
test: Add tests for checking fuse-overlayfs functionality and triple-…
younghojan Aug 17, 2024
b1a02d6
fix: Specify stdin=subprocess.DEVNULL when launching the fuse-overlay…
Aug 22, 2024
1f6d696
feat: Check if fuse-overlayfs meets the minimum version requirement, …
Aug 26, 2024
ba6bb91
Merge branch 'main' into gsoc-overlay-handling-with-fuse-overlayfs-dev
PhilippWendler Aug 26, 2024
dc482b2
fix: fix issue of checking for fuse-overlayfs functionality outside o…
Aug 28, 2024
e0aec8c
chore: Refactor and improve test_triple_nested_runexec
Aug 29, 2024
e0833b3
chore: Refactor fuse-overlayfs setup and error handling
Aug 29, 2024
147b4e2
Merge 'main' into gsoc-overlay-handling-with-fuse-overlayfs-dev
PhilippWendler Sep 2, 2024
b63db00
Refactor and improve fuse-overlay related tests
Sep 2, 2024
38a0508
Omit test_triple_nested_runexec when coverage testing
Sep 4, 2024
5d2a349
Refactor COV_CORE_SOURCE environment variable handling
Sep 4, 2024
a8a3516
Safely encode string for fuse-overlayfs paths
Sep 5, 2024
34f57f1
Refactor determine_directory_mode function for fuse-overlayfs compati…
Sep 5, 2024
2fd26ff
Refactor file handling in test_runexecutor.py for better readability
Sep 5, 2024
88db419
Refactor overlay mount error handling for better compatibility
Sep 5, 2024
2f9d52e
Fix typo
Sep 15, 2024
1c49af2
Refactor handling of COV_CORE_SOURCE environment variable in TestRunE…
Sep 15, 2024
ea92000
Change internal paths used for fuse-overlayfs mounts
PhilippWendler Sep 19, 2024
ec11b7f
Add logging about why fuse-overlayfs is used
PhilippWendler Sep 19, 2024
33249f1
Detect and error out if temp is not hidden and we use fuse-overlayfs
PhilippWendler Sep 19, 2024
335 changes: 298 additions & 37 deletions benchexec/container.py

Large diffs are not rendered by default.

5 changes: 4 additions & 1 deletion benchexec/containerized_tool.py
@@ -206,7 +206,6 @@ def _init_container(

# Container config
container.setup_user_mapping(os.getpid(), uid, gid)
_setup_container_filesystem(temp_dir, dir_modes, container_system_config)
if container_system_config:
socket.sethostname(container.CONTAINER_HOSTNAME)
if not network_access:
@@ -225,6 +224,10 @@ def _init_container(
os.waitpid(pid, 0)
os._exit(0)

# We set up the container's filesystem in the child process.
# Delaying this until after the fork avoids the "Transport endpoint not connected" issue.
_setup_container_filesystem(temp_dir, dir_modes, container_system_config)

# Finalize container setup in child
container.mount_proc(container_system_config) # only possible in child
container.drop_capabilities()
13 changes: 13 additions & 0 deletions benchexec/libc.py
@@ -184,14 +184,27 @@ class CapData(_ctypes.Structure):
    _ctypes.POINTER(CapData * 2),
]

capget = _libc.capget
"""Get the capabilities of the current thread."""
capget.errcheck = _check_errno
capget.argtypes = [
    _ctypes.POINTER(CapHeader),
    _ctypes.POINTER(CapData * 2),
]

LINUX_CAPABILITY_VERSION_3 = 0x20080522 # /usr/include/linux/capability.h
LINUX_CAPABILITY_U32S_3 = 2 # /usr/include/linux/capability.h
CAP_SYS_ADMIN = 21 # /usr/include/linux/capability.h
PR_CAP_AMBIENT = 47 # /usr/include/linux/prctl.h
PR_CAP_AMBIENT_RAISE = 2 # /usr/include/linux/prctl.h
PR_CAP_AMBIENT_CLEAR_ALL = 4 # /usr/include/linux/prctl.h

prctl = _libc.prctl
"""Modify options of processes: http://man7.org/linux/man-pages/man2/prctl.2.html"""
prctl.errcheck = _check_errno
prctl.argtypes = [c_int, c_ulong, c_ulong, c_ulong, c_ulong]


# /usr/include/linux/prctl.h
PR_SET_DUMPABLE = 4
PR_GET_SECCOMP = 21
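
The capability constants in this hunk relate to the two-element `CapData` array that `capget` exchanges with the kernel. As a minimal sketch, assuming version 3 of the capability ABI with two 32-bit words per capability set, the following shows how a capability number such as `CAP_SYS_ADMIN` maps to a word index and a bit mask. The helper name `cap_word_and_mask` is hypothetical and not part of BenchExec.

```python
LINUX_CAPABILITY_U32S_3 = 2  # number of 32-bit words per capability set
CAP_SYS_ADMIN = 21


def cap_word_and_mask(cap):
    """Return (word index, bit mask) for a capability number.

    Hypothetical helper for illustration only; not taken from BenchExec.
    """
    if not 0 <= cap < 32 * LINUX_CAPABILITY_U32S_3:
        raise ValueError(f"capability {cap} out of range")
    return cap // 32, 1 << (cap % 32)


word, mask = cap_word_and_mask(CAP_SYS_ADMIN)
print(word, hex(mask))
```

Capabilities numbered 32 and above land in the second word, which is why the structure is declared as `CapData * 2`.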
124 changes: 124 additions & 0 deletions benchexec/test_runexecutor.py
@@ -1199,6 +1199,130 @@ def test_uptime_without_lxcfs(self):
uptime, 10, f"Uptime {uptime}s unexpectedly low in container"
)

    def test_fuse_overlay(self):
        if not container.get_fuse_overlayfs_executable():
            self.skipTest("fuse-overlayfs not available")
        with tempfile.TemporaryDirectory(prefix="BenchExec_test_") as temp_dir:
            test_file_path = os.path.join(temp_dir, "test_file")
            with open(test_file_path, "wb") as test_file:
                test_file.write(b"TEST_TOKEN")

            self.setUp(
                dir_modes={
                    "/": containerexecutor.DIR_READ_ONLY,
                    "/home": containerexecutor.DIR_HIDDEN,
                    "/tmp": containerexecutor.DIR_HIDDEN,
                    temp_dir: containerexecutor.DIR_OVERLAY,
                },
            )
            result, output = self.execute_run(
                "/bin/sh",
                "-c",
                f"if [ $({self.cat} {test_file_path}) != TEST_TOKEN ]; then exit 1; fi; \
                {self.echo} TOKEN_CHANGED >{test_file_path}",
            )
            self.check_result_keys(result, "returnvalue")
            self.check_exitcode(result, 0, "exit code of inner runexec is not zero")
            self.assertTrue(
                os.path.exists(test_file_path),
                f"File '{test_file_path}' removed, output was:\n" + "\n".join(output),
            )
            with open(test_file_path, "rb") as test_file:
                test_token = test_file.read()
            self.assertEqual(
                test_token.strip(),
                b"TEST_TOKEN",
                f"File '{test_file_path}' content is incorrect. Expected 'TEST_TOKEN', but got:\n{test_token}",
            )

    def test_triple_nested_runexec(self):
        if not container.get_fuse_overlayfs_executable():
            self.skipTest("missing fuse-overlayfs")

        # Check if COV_CORE_SOURCE environment variable is set and remove it.
        # This is necessary because the coverage tool will not work in the nested runexec.
        coverage_env_var = os.environ.pop("COV_CORE_SOURCE", None)

        with tempfile.TemporaryDirectory(prefix="BenchExec_test_") as temp_dir:
            overlay_dir = os.path.join(temp_dir, "overlay")
            os.makedirs(overlay_dir)
            test_file = os.path.join(overlay_dir, "TEST_FILE")
            output_dir = os.path.join(temp_dir, "output")
            os.makedirs(output_dir)
            mid_output_file = os.path.join(output_dir, "mid_output.log")
            inner_output_file = os.path.join(output_dir, "inner_output.log")
            with open(test_file, "w") as f:
                f.write("TEST_TOKEN")
                f.seek(0)

            outer_cmd = [
                "python3",
                runexec,
                "--full-access-dir",
                "/",
                "--overlay-dir",
                overlay_dir,
                "--full-access-dir",
                output_dir,
                "--hidden-dir",
                "/tmp",
                "--output",
                mid_output_file,
                "--",
            ]
            mid_cmd = [
                "python3",
                runexec,
                "--full-access-dir",
                "/",
                "--overlay-dir",
                overlay_dir,
                "--full-access-dir",
                output_dir,
                "--hidden-dir",
                "/tmp",
                "--output",
                inner_output_file,
                "--",
            ]
            inner_cmd = [
                "/bin/sh",
                "-c",
                f"if [ $({self.cat} {test_file}) != TEST_TOKEN ]; then exit 1; fi; {self.echo} TOKEN_CHANGED >{test_file}",
            ]
            combined_cmd = outer_cmd + mid_cmd + inner_cmd

            self.setUp(
                dir_modes={
                    "/": containerexecutor.DIR_FULL_ACCESS,
                    "/tmp": containerexecutor.DIR_HIDDEN,
                    overlay_dir: containerexecutor.DIR_OVERLAY,
                    output_dir: containerexecutor.DIR_FULL_ACCESS,
                },
            )
            outer_result, outer_output = self.execute_run(*combined_cmd)
            self.check_result_keys(outer_result, "returnvalue")
            self.check_exitcode(
                outer_result, 0, "exit code of outer runexec is not zero"
            )
            with open(mid_output_file, "r") as f:
                self.assertIn("returnvalue=0", f.read().strip().splitlines())
            self.assertTrue(
                os.path.exists(test_file),
                f"File '{test_file}' removed, output was:\n" + "\n".join(outer_output),
            )
            with open(test_file, "r") as f:
                test_token = f.read()
            self.assertEqual(
                test_token.strip(),
                "TEST_TOKEN",
                f"File '{test_file}' content is incorrect. Expected 'TEST_TOKEN', but got:\n{test_token}",
            )

        # Restore COV_CORE_SOURCE environment variable
        if coverage_env_var is not None:
            os.environ["COV_CORE_SOURCE"] = coverage_env_var


class _StopRunThread(threading.Thread):
def __init__(self, delay, runexecutor):
2 changes: 1 addition & 1 deletion debian/control
@@ -20,7 +20,7 @@ Package: benchexec
Architecture: all
Pre-Depends: ${misc:Pre-Depends}
Depends: ${python3:Depends}, python3-pkg-resources, ${misc:Depends}, ucf
Recommends: cpu-energy-meter, libseccomp2, lxcfs, python3-coloredlogs, python3-pystemd
Recommends: cpu-energy-meter, fuse-overlayfs (>= 1.10), libseccomp2, lxcfs, python3-coloredlogs, python3-pystemd
Description: Framework for Reliable Benchmarking and Resource Measurement
BenchExec allows benchmarking non-interactive tools on Linux systems.
It measures CPU time, wall time, and memory usage of a tool,
17 changes: 12 additions & 5 deletions doc/INSTALL.md
@@ -20,6 +20,7 @@ SPDX-License-Identifier: Apache-2.0

The following packages are optional but recommended dependencies:
- [cpu-energy-meter] will let BenchExec measure energy consumption on Intel CPUs.
- [fuse-overlayfs] (version 1.10 or newer) allows using the overlay directory mode for containers in cases where the kernel-based overlayfs does not work.
- [libseccomp2] provides better container isolation.
- [LXCFS] provides better container isolation.
- [coloredlogs] provides nicer log output.
@@ -115,7 +116,7 @@ Of course you can also install BenchExec in a virtualenv if you are familiar wit
On systems without systemd you can omit the `[systemd]` part.

Please make sure to configure cgroups as [described below](#setting-up-cgroups)
and install [cpu-energy-meter], [libseccomp2], [LXCFS], and [pqos_wrapper] if desired.
and install [cpu-energy-meter], [fuse-overlayfs], [libseccomp2], [LXCFS], and [pqos_wrapper] if desired.

### Containerized Environments

@@ -137,7 +138,7 @@ otherwise pip will try to download and build this module,
which needs a compiler and several development header packages.

Please make sure to configure cgroups as [described below](#setting-up-cgroups)
and install [cpu-energy-meter], [libseccomp2], [LXCFS], and [pqos_wrapper] if desired.
and install [cpu-energy-meter], [fuse-overlayfs], [libseccomp2], [LXCFS], and [pqos_wrapper] if desired.


## Kernel Requirements
@@ -155,7 +156,7 @@ on **Linux 5.11 or newer**, so we suggest at least this kernel version.
And if your system is using cgroups v2 (cf. below),
the full feature set requires **Linux 5.19 or newer**.

On kernels than 5.11, you need to avoid using the overlay filesystem (cf. below),
On kernels older than 5.11, you need to avoid using the kernel-based overlay filesystem (cf. below),
all other features are supported.
However, we strongly recommend using **Linux 4.14 or newer**
because it reduces the overhead of BenchExec's memory measurements and limits.
@@ -188,8 +189,13 @@ that are not usable on all distributions by default:
- **Unprivileged Overlay Filesystem**: This is only available since Linux 5.11
(kernel option `CONFIG_OVERLAY_FS`),
but also present in all Ubuntu kernels, even older ones.
Users of older kernels on other distributions can still use container mode, but have to choose a different mode
of mounting the file systems in the container, e.g., with `--read-only-dir /` (see below).
Users of older kernels on other distributions can still use container mode,
but have to install [fuse-overlayfs] or choose a different mode
of mounting the file systems in the container, e.g., with `--read-only-dir /`
(cf. [container configuration](container.md#directory-access-modes)).
Note that the kernel-based overlayfs does not support some specific configurations
(such as the default mode of overlay for `/`),
so [fuse-overlayfs] is often useful or required anyway.
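
The relationship between the two backends (try the kernel-based overlayfs first and use fuse-overlayfs only if that fails, as the commit list of this pull request describes) can be sketched as follows. This is an illustration under assumptions, not BenchExec's implementation: the function name and the two callables `kernel_mount` and `fuse_mount` are hypothetical placeholders supplied by the caller.

```python
def mount_overlay_with_fallback(kernel_mount, fuse_mount=None):
    """Try the kernel overlay first and fall back to fuse-overlayfs on OSError.

    Both arguments are hypothetical zero-argument callables.
    Returns the name of the backend that succeeded.
    """
    try:
        kernel_mount()
        return "kernel"
    except OSError:
        if fuse_mount is None:
            # No fallback available: report the original kernel error.
            raise
        fuse_mount()
        return "fuse-overlayfs"
```

Passing `fuse_mount=None` models a system without fuse-overlayfs installed, in which case the kernel error propagates unchanged and the user has to pick a different directory mode.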

If container mode does not work, please check the [common problems](container.md#common-problems).

@@ -382,6 +388,7 @@ Please refer to the [development instructions](DEVELOPMENT.md).

[coloredlogs]: https://pypi.org/project/coloredlogs/
[cpu-energy-meter]: https://github.com/sosy-lab/cpu-energy-meter
[fuse-overlayfs]: https://github.com/containers/fuse-overlayfs
[libseccomp2]: https://github.com/seccomp/libseccomp
[LXCFS]: https://github.com/lxc/lxcfs
[pqos]: https://github.com/intel/intel-cmt-cat/tree/master/pqos
3 changes: 2 additions & 1 deletion doc/benchexec-in-container.md
@@ -47,7 +47,8 @@ or
```
docker run --privileged --cap-drop=all -t my-container benchexec <arguments>
```

If you want BenchExec to use `fuse-overlayfs` in the container,
also specify `--device /dev/fuse`.
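
Whether the device actually arrived inside the container can be checked before starting a run. The snippet below is a minimal sketch; the helper name `fuse_device_usable` is hypothetical, and it only tests that the device node exists and is readable and writable for the current user, not that a FUSE mount will succeed.

```python
import os


def fuse_device_usable(path="/dev/fuse"):
    """Return True if the FUSE device node exists and is readable and writable."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


print(fuse_device_usable())
```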

## BenchExec in Interactive Containers

15 changes: 11 additions & 4 deletions doc/container.md
@@ -69,7 +69,9 @@ For each directory in the container one of the following four access modes can b
Writes to this directory will not be visible on the host.
- **read-only**: This directory is visible in the container, but read-only.
- **overlay**: This directory is visible in the container and
an [overlay filesystem](https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt)
an overlay filesystem (either from the
[kernel](https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt)
or [fuse-overlayfs])
is layered on top of it that redirects all write accesses.
This means that write accesses are possible in the container, but the effect of any write
is not visible on the host, only inside the container, and not written to disk.
@@ -205,13 +207,13 @@ You can still use BenchExec if you completely disable the container mode with `-
#### `Failed to configure container: [Errno 19] Creating overlay mount for '...' failed: No such device`
Your kernel does not support the overlay filesystem,
please check the [system requirements](INSTALL.md#kernel-requirements).
You can use a different access mode for directories, e.g., with `--read-only-dir /`.
You can use [fuse-overlayfs] or a different access mode for directories, e.g., with `--read-only-dir /`.
If some directories need to be writable, specify other directory modes for these directories as described above.

#### `Failed to configure container: [Errno 1] Creating overlay mount for '...' failed: Operation not permitted`
Your kernel does not allow mounting the overlay filesystem inside a container.
For this you need either Ubuntu or kernel version 5.11 or newer.
Alternatively, if you cannot use either,
For this you need either Ubuntu, [fuse-overlayfs], or kernel version 5.11 or newer.
Alternatively, if you cannot use any of these,
you can use a different access mode for directories, e.g., with `--read-only-dir /`.
If some directories need to be writable, specify other directory modes for these directories as described above.

@@ -226,6 +228,9 @@ Another limitation of the kernel is that one can only nest overlays twice,
so if you want to run a container inside a container inside a container,
at least one of these needs to use a non-overlay mode for this path.

We recommend installing [fuse-overlayfs] version 1.10 or newer,
which supports all of these use cases.
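
A version requirement like this one is easy to get wrong with string or float comparison, because `1.9` would then sort above `1.10`. The sketch below compares integer tuples instead. It assumes that the version output contains a line of the form `fuse-overlayfs: version X.Y`; that format and both function names are assumptions for illustration and are not taken from BenchExec.

```python
import re

MIN_VERSION = (1, 10)


def parse_fuse_overlayfs_version(output):
    """Extract (major, minor) from version output, or None if not found."""
    match = re.search(r"fuse-overlayfs: version (\d+)\.(\d+)", output)
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(output, minimum=MIN_VERSION):
    """Return True if the parsed version is at least the given minimum."""
    version = parse_fuse_overlayfs_version(output)
    return version is not None and version >= minimum
```

With this comparison, an output reporting version 1.9 is rejected and one reporting 1.13 is accepted, while unparseable output is treated as not meeting the requirement.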

#### `Cannot change into working directory inside container: [Errno 2] No such file or directory`
Either you have specified an invalid directory as working directory with `--dir`,
or your current directory on the host is hidden inside the container
@@ -253,3 +258,5 @@ If it still occurs, please attach to all child process of BenchExec
with `sudo gdb -p <PID>`, get a stack trace with `bt`,
and [report an issue](https://github.com/sosy-lab/benchexec/issues/new) with as much information as possible.
BenchExec will usually be able to continue if the hanging child process is killed.

[fuse-overlayfs]: https://github.com/containers/fuse-overlayfs