Add an internal erofs writer implementation #56

Open
allisonkarlitskaya opened this issue Dec 9, 2024 · 2 comments · May be fixed by #57

Comments

@allisonkarlitskaya
Collaborator

We're starting to get increasing friction with using mkcomposefs, mostly related to Debian/Ubuntu:

  • we have to install it from git in GitHub Actions because Ubuntu doesn't have a package
  • for the same reason, it's blocking "Add ubuntu example image" (#45)
  • it would be nice if we could take our main rust workflow back out of the container and use it directly on GitHub

At the same time I've been worried about the specification of the containers.composefs.fsverity attribute, since it's basically currently defined in terms of "run mkcomposefs and measure the output". Particularly because erofs is rather sparsely documented...

As such, a dual goal:

  • add an in-tree erofs writer
  • produce a detailed document describing how to create a composefs erofs image such that someone doing an independent implementation would have a decent chance of getting the same output as us

I have a WIP branch here, and have made quite a bit of progress: https://github.com/allisonkarlitskaya/composefs-rs/tree/erofs/src/erofs

I think it would be extremely nice if this writer produced the same output as libcomposefs. Testing the two implementations against each other would be amazing.

We should try to do one of these two things:

  • make our erofs writer exactly compatible with the erofs writer in composefs
  • use our independent re-implementation as input to the discussion in "Game out a plan for a 1.1 format" (composefs#198), with the goal of modifying libcomposefs to output a new format

Or, maybe some mix of the two.

@cgwalters
Collaborator

> we have to install it from git in GitHub Actions because Ubuntu doesn't have a package

One thing I might ask here is for podman to try taking a hard dependency on composefs, which would help act as a forcing function.

> make our erofs writer exactly compatible with the erofs writer in composefs

Let's please please not bifurcate our small ecosystem more. I obviously love the Rust usage, and there's clear value in having a new independent implementation indeed, but if it bifurcates that's a problem.

There's a lot of subtle details indeed when you get into the erofs writing especially around corner cases like "quoting" overlayfs xattrs (see containers/composefs#288 for example).

I have no problem with continuing experimentation in this direction but I'd really like to just push harder to ship composefs in more distributions as the short and even medium term baseline.

@allisonkarlitskaya
Collaborator Author

> Let's please please not bifurcate our small ecosystem more. I obviously love the Rust usage, and there's clear value in having a new independent implementation indeed, but if it bifurcates that's a problem.

I'm very cognizant of this concern and it's why I've waited so long to do this. It finally got to the point where the effort I was putting into the workarounds was more than the effort of just doing the thing (and having a Rust implementation of the writer is, itself, quite valuable). It's definitely my goal that the two implementations end up being functionally equivalent, and I think the fact that we will then be able to test them against each other is extremely valuable.

> There's a lot of subtle details indeed when you get into the erofs writing especially around corner cases like "quoting" overlayfs xattrs (see containers/composefs#288 for example).

Ya. I've learned an awful lot...

https://github.com/allisonkarlitskaya/composefs-rs/blob/e724ba3c2628c7c7924ad1287cc6bf3e4df87cf5/src/erofs/COMPOSEFS.md#special-handling-for-overlayfs
https://github.com/allisonkarlitskaya/composefs-rs/blob/e724ba3c2628c7c7924ad1287cc6bf3e4df87cf5/src/erofs/mod.rs#L399
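
To illustrate the kind of "quoting" being discussed, here is a rough sketch. It assumes the escaping works by inserting an extra `overlay.` component after the `trusted.overlay.` prefix, so that overlayfs can strip it again when the image is mounted. The function name and constants are hypothetical and are not taken from either implementation; the linked COMPOSEFS.md is the place to check the actual rules.

```rust
/// Prefix that overlayfs reserves for its own metadata xattrs.
const OVERLAY_PREFIX: &str = "trusted.overlay.";

/// Escaped form of the prefix (an assumption of this sketch).
const ESCAPED_PREFIX: &str = "trusted.overlay.overlay.";

/// Hypothetical helper: returns the name under which an xattr from the
/// source tree would be stored in the image. Names outside the overlayfs
/// namespace pass through unchanged.
fn escape_overlay_xattr(name: &str) -> String {
    match name.strip_prefix(OVERLAY_PREFIX) {
        // Re-attach the remainder of the name behind the escaped prefix.
        Some(rest) => format!("{}{}", ESCAPED_PREFIX, rest),
        None => name.to_string(),
    }
}
```

Under these assumptions an already-escaped name simply gains one more `overlay.` level, which keeps the mapping reversible.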

Another example is what to do with (0, 0) character devices so that overlayfs doesn't interpret them as whiteouts. Those are currently handled with a very elegant todo!() :)

https://github.com/allisonkarlitskaya/composefs-rs/blob/e724ba3c2628c7c7924ad1287cc6bf3e4df87cf5/src/erofs/mod.rs#L390
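
For reference, the detection half of that problem is small. The sketch below only shows how a writer might recognize an inode that overlayfs would treat as a whiteout; it does not attempt the rewriting step, which is the part still marked `todo!()`. The function name is hypothetical, and the mode constants are the usual Linux values.

```rust
/// Mask selecting the file-type bits of `st_mode`.
const S_IFMT: u32 = 0o170000;
/// File-type bits for a character device.
const S_IFCHR: u32 = 0o020000;

/// Hypothetical helper: true if the inode is a character device whose
/// device number is (0, 0), which overlayfs interprets as a whiteout.
/// An `rdev` of 0 encodes major 0 and minor 0.
fn looks_like_whiteout(mode: u32, rdev: u64) -> bool {
    (mode & S_IFMT) == S_IFCHR && rdev == 0
}
```

A writer would call this on every inode of the source tree and apply whatever special handling the format ends up specifying when it returns true.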

All of these weird "special cases" are a big part of my motivation here. We really need to document this stuff, because currently the only documentation is a large file written in C. This is why I'm being very deliberate about writing the documentation and the new code at the same time: it makes it less likely to miss something. And if we can get the new code to produce bit-for-bit identical results to the version in libcomposefs then it's further assurance that the documentation is at least kinda matching reality.
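
A bit-for-bit comparison like that could be as simple as the following sketch. It assumes both images have already been read into memory; the function name is made up for illustration and is not part of either codebase.

```rust
/// Hypothetical helper: returns the offset of the first byte at which the
/// two images differ, or `None` if they are bit-for-bit identical. If one
/// image is a strict prefix of the other, the offset is the shorter length.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    a.iter()
        .zip(b.iter())
        .position(|(x, y)| x != y)
        .or_else(|| {
            if a.len() != b.len() {
                Some(a.len().min(b.len()))
            } else {
                None
            }
        })
}
```

Reporting the offset rather than just a boolean makes it easier to map a mismatch back to a specific part of the erofs layout.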

@allisonkarlitskaya allisonkarlitskaya linked a pull request Dec 16, 2024 that will close this issue
allisonkarlitskaya added a commit to allisonkarlitskaya/composefs-rs that referenced this issue Dec 16, 2024
This introduces experimental code for writing erofs images ourselves,
instead of using the external mkcomposefs CLI.

It's currently disabled by default.  You can test it by setting the
`COMPOSEFS_FORMAT=new` environment variable.

This currently produces a different output than the output of
mkcomposefs, which is why it's gated behind an environment variable.
The plan is to add a compatibility mode to our internal writer code so
that it produces output that is as similar as possible and then switch
over to using it once we are convinced that it's equivalent.  Then the
`COMPOSEFS_FORMAT=` variable will disable this compatibility mode.

There's also a `COMPOSEFS_DUMP_EROFS=1` environment variable (which
works with both `mkcomposefs` and our internal code) which will dump the
erofs layout for diffing.  There's also a standalone `erofs-debug`
binary that will do the same.

Additionally, this introduces two new files in docs:
 - a detailed description of the parts of erofs that we use
 - a document which attempts to describe the decisions made in creating
   an erofs composefs image (in terms of which order the files are in,
   etc).

The main idea here is to start a serious effort towards standardizing
the composefs label we want to start adding to container images: it
should be possible to define what will be in that label by way of
documentation instead of saying "run this software and use the output".
These two new documents, taken together with the existing "oci.md" form
a rough (and still incomplete) outline for that.

Many thanks to Gao Xiang <[email protected]> for helping clarify many
points about the erofs file format for the documentation.

Closes containers#56

Signed-off-by: Allison Karlitskaya <[email protected]>
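
The `COMPOSEFS_FORMAT=new` gating described in the commit message could look roughly like the sketch below. Only the variable name and the `new` value come from the commit message; the enum, the function names, and the fallback for any other value are assumptions made for illustration.

```rust
/// Which erofs writer to use (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
enum Writer {
    /// Shell out to the external mkcomposefs CLI (the default).
    Mkcomposefs,
    /// Use the experimental in-tree writer.
    Internal,
}

/// Maps the value of COMPOSEFS_FORMAT to a writer. Treating every value
/// other than "new" as the default is an assumption of this sketch.
fn select_writer(format: Option<&str>) -> Writer {
    match format {
        Some("new") => Writer::Internal,
        _ => Writer::Mkcomposefs,
    }
}

/// Reads the environment variable and picks the writer accordingly.
fn writer_from_env() -> Writer {
    select_writer(std::env::var("COMPOSEFS_FORMAT").ok().as_deref())
}
```

Keeping the decision in a function that takes the value as a parameter makes it testable without touching the process environment.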