Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very basic Pax header writer #382

Merged
merged 2 commits into from
Oct 24, 2024

Conversation

gwitrand-ovh
Copy link
Contributor

Basic implem of PaxBuilder that can add the header with the data provided, implem on tar::builder

Today the lib is able to read the pax headers, even if they have arbitrary values but can't write some headers back. This update lets you do that very simply.

Usage:

let mut pax_builder: PaxBuilder = PaxBuilder::new();

let value: String = "a_value".to_string();
let key: String = "a_key".to_string();

pax_builder.add(&key, &value);

let res_write_pax: Result<(), io::Error> = my_open_tar_archive.append_pax_extensions(&pax_builder);

Copy link
Collaborator

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! Procedural notes:

  • If you wouldn't mind please squash your commits, I don't think any of the 4 stand alone
  • Can you rework your commit message to be "Linux kernel style" which I try to use, see e.g. https://cbea.ms/git-commit/

I'd propose something like this:

Add support for writing PAX headers

While this crate has always supported reading these attributes,
writing them was up to the user. Add an API to do this.

src/pax.rs Outdated Show resolved Hide resolved
src/pax.rs Outdated Show resolved Hide resolved
src/pax.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking requested changes per above

@gwitrand-ovh gwitrand-ovh force-pushed the pax_headers_basic branch 3 times, most recently from 47f0045 to 92af5e5 Compare October 22, 2024 08:29
@gwitrand-ovh
Copy link
Contributor Author

Thanks for your contribution! Procedural notes:

* If you wouldn't mind please squash your commits, I don't think any of the 4 stand alone

* Can you rework your commit message to be "Linux kernel style" which I try to use, see e.g. https://cbea.ms/git-commit/

I'd propose something like this:

Add support for writing PAX headers
While this crate has always supported reading these attributes,
writing them was up to the user. Add an API to do this.

Marking requested changes per above

All comments fixed.
Also updated the tests and code doc to comply.

This crate currently supports reading the PAX headers. This adds a way to write
some PAX headers in a Builder.
Also implemented tests to showcase usage
@cgwalters
Copy link
Collaborator

Can you add these changes? First we don't need to require passing owned data, and according to the spec the value can be arbitrary binary data.

diff --git a/src/pax.rs b/src/pax.rs
index 5a06967..d149428 100644
--- a/src/pax.rs
+++ b/src/pax.rs
@@ -154,9 +154,9 @@ impl<T: Write> crate::Builder<T> {
     /// Takes in an iterator over the list of headers to add to convert it into a header set formatted.
     ///
     /// Returns io::Error if an error occurs, else it returns ()
-    pub fn append_pax_extensions<'a>(
+    pub fn append_pax_extensions<'key, 'value>(
         &mut self,
-        headers: impl IntoIterator<Item = (String, String)>,
+        headers: impl IntoIterator<Item = (&'key str, &'value [u8])>,
     ) -> Result<(), io::Error> {
         // Store the headers formatted before write
         let mut data: Vec<u8> = Vec::new();
@@ -172,7 +172,9 @@ impl<T: Write> crate::Builder<T> {
                 max_len *= 10;
             }
             let len = rest_len + len_len;
-            write!(&mut data, "{} {}={}\n", len, key, value)?;
+            write!(&mut data, "{} {}=", len, key)?;
+            data.extend_from_slice(value);
+            data.push(b'\n');
         }
 
         // Ignore the header append if it's empty.
diff --git a/tests/all.rs b/tests/all.rs
index 9f93de4..3caff4b 100644
--- a/tests/all.rs
+++ b/tests/all.rs
@@ -932,12 +932,12 @@ fn pax_simple_write() {
     let file: File = t!(File::create(&pax_path));
     let mut ar: Builder<BufWriter<File>> = Builder::new(BufWriter::new(file));
 
-    let mut pax_builder: Vec<(String, String)> = Vec::new();
-    let key: String = "arbitrary_pax_key".to_string();
-    let value: String = "arbitrary_pax_value".to_string();
-    pax_builder.push((key, value));
+    let pax_extensions = [
+        ("arbitrary_pax_key", b"arbitrary_pax_value".as_slice()),
+        ("SCHILY.xattr.security.selinux", b"foo_t"),
+    ];
 
-    t!(ar.append_pax_extensions(pax_builder.into_iter()));
+    t!(ar.append_pax_extensions(pax_extensions));
     t!(ar.append_file("test2", &mut t!(File::open(&pax_path))));
     t!(ar.finish());
     drop(ar);
@@ -950,9 +950,11 @@ fn pax_simple_write() {
     assert!(pax_headers.is_some(), "pax_headers is None");
     let mut pax_headers = pax_headers.unwrap();
     let pax_arbitrary = t!(pax_headers.next().unwrap());
-
     assert_eq!(pax_arbitrary.key(), Ok("arbitrary_pax_key"));
     assert_eq!(pax_arbitrary.value(), Ok("arbitrary_pax_value"));
+    let xattr = t!(pax_headers.next().unwrap());
+    assert_eq!(xattr.key().unwrap(), pax_extensions[1].0);
+    assert_eq!(xattr.value_bytes(), pax_extensions[1].1);
 
     assert!(entries.next().is_none());
 }

Since we can keep the ownership of the data when calling append_pax_extensions, we can use &str and &[u8] according to the lib documentation
Copy link
Collaborator

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two minor nits but they're not important so merging, we can do followup fixes if we want.

Thanks for your contribution!

@@ -1,7 +1,8 @@
extern crate tar;
extern crate tempfile;

use std::fs::{create_dir, File};
use std::fs::create_dir;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care too much but this change seem spurious i.e. unnecessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't realize at the time, not intended. Won't fix since it's not that important

@@ -145,3 +146,50 @@ impl<'entry> PaxExtension<'entry> {
self.value
}
}

/// Extension trait for `Builder` to append PAX extended headers.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't an extension trait anymore

@cgwalters cgwalters merged commit adbbc29 into alexcrichton:main Oct 24, 2024
7 checks passed
@cgwalters
Copy link
Collaborator

When I was playing with this I noticed that my suggestion of using key: &str clashes with the existing PaxExtensions reader, which gives back &[u8].

I dunno how much of a real world problem this is. It does sure look to me like in e.g. the python tar parser it's actually expected to be UTF-8.

@cgwalters
Copy link
Collaborator

For general interest I used this to fix a bug in my software - thank you for submitting this patch! cc ostreedev/ostree-rs-ext#679

@gwitrand-ovh
Copy link
Contributor Author

gwitrand-ovh commented Nov 25, 2024

When I was playing with this I noticed that my suggestion of using key: &str clashes with the existing PaxExtensions reader, which gives back &[u8].

I dunno how much of a real world problem this is. It does sure look to me like in e.g. the python tar parser it's actually expected to be UTF-8.

This could be fixed, but I'm not sure it would add any "bonus" as str is used for an utf8 string for the key anyway.
[u8] doesn't implement Write so it would need to be converted I guess

@gwitrand-ovh
Copy link
Contributor Author

Made a PR if want to switch the usage of key but it might be a breaking change if already used by someone

https://github.com/alexcrichton/tar-rs/pull/391/files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants