-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unpacking tar archive with long link in it results in truncated filenames #369
Comments
tar::Archive.unpack()
an archive with long link in itThe tar crate is currently truncating filenames that are using the GNU "long link" feature alexcrichton/tar-rs#369. The fix is to shell out to system tar instead.
The tar crate is currently truncating filenames that are using the GNU "long link" feature alexcrichton/tar-rs#369. The fix is to shell out to system tar instead.
I found several reports online of problems with the maven-assembly-plugin or the plexus-archiver that it seems to use to produce a tarball, but nothing exactly matches this. The fact that other tar libraries work properly would seem to implicate tar-rs. However, the It would not seem to be a JRuby bug, since we don't do the tarball construction ourselves, but if there's a problem with an upstream plugin or some configuration change needed to omit |
I found out what's happening here while trying to debug a separate issue, the reproducer was very useful to rule out my issue as distinct and to find what is wrong here. the entry header is not what tar-rs is expecting, this patch makes it work correctly: diff --git a/src/header.rs b/src/header.rs
index 8e39ab6..a2ea05b 100644
--- a/src/header.rs
+++ b/src/header.rs
@@ -193,11 +193,18 @@ impl Header {
fn is_ustar(&self) -> bool {
let ustar = unsafe { cast::<_, UstarHeader>(self) };
- ustar.magic[..] == b"ustar\0"[..] && ustar.version[..] == b"00"[..]
+ ustar.magic[..] == b"ustar\0"[..]
+ && (ustar.version[..] == b"00"[..] || ustar.version[..] == b"\0\0"[..])
} I was comparing the implementation to CPython's, and from what I can see (unless I missed some indirection), CPython doesn't actually check the version bytes. I'm also not sure what the best way forward is for handling this is. I can't find anything about why these bytes might be different than expected, if not just a mixup of |
@RazerM, thanks a lot for posting that patch. I does solve my problem. I have created a PR for the solution, and hopefully it will merged in some form. |
Context
I'm using
tar
to extract the contents downloaded from https://repo1.maven.org/maven2/org/jruby/jruby-dist/9.4.8.0/jruby-dist-9.4.8.0-bin.tar.gz for re-packaging it and uploading it to S3 https://github.com/heroku/docker-heroku-ruby-builder/blob/9e64b4401be4df7c158c9253290f6a3248927023/shared/src/lib.rs#L24-L31.Unfortunately I've learned that this file uses ././@LongLink and it seems that the rust tar archive truncates the file by default. To demonstrate the issue I made a reproduction https://github.com/schneems/tar_long_link_repro.
Reproduction
Then run it:
Expected
I expect that
tmp/rust-extracted/jruby-9.4.8.0/lib/ruby/stdlib/bundler/vendor/molinillo/lib/molinillo/delegates
will contain a filespecification_provider.rb
.Actual
It does not, the command fails an assertion:
Note the filename seems truncated:
More
I see there that the header and entry inside of tar are aware of the concept of long names (https://docs.rs/tar/0.4.41/tar/struct.Header.html?search=long), but I'm not sure how to unpack a Gnu tar file that has them. If the feature exists, we could possibly update the documentation to make it clearer.
tar-rs
seems to not show the same issue with it https://github.com/projg2/tar-test-inputs/tree/a2110a6ffec66690123335f34a84626d62a7b541/long-pathsThe text was updated successfully, but these errors were encountered: