Support archiving hard links #119

Open
ids1024 opened this issue Jul 6, 2017 · 5 comments

Comments

@ids1024
Contributor

ids1024 commented Jul 6, 2017

Tar (at least GNU tar) seems to notice when files being archived are hard links, and stores them as such, extracting as hard links.

I'm not quite sure how this should be implemented (and besides, I don't need it yet), but it should have an option like the symlink option in #117.

@alexcrichton
Owner

Makes sense to me!

@BrianOn99

BrianOn99 commented Nov 4, 2017

I looked into how busybox tar and GNU tar do it. They record in a HashSet the (inode, device number) of every file they have written whose hard link count is greater than 1. When they see the same (inode, device number) a second time, only the header is written, with typeflag set to 1 and an appropriate linkname.
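
The bookkeeping described above can be sketched roughly as follows. This is only an illustration, not code from busybox, GNU tar, or this crate: `FileId`, `Entry`, `LinkTracker` and `classify_fs` are hypothetical names. It also uses a HashMap rather than a HashSet, on the assumption that the first path has to be remembered so it can be used as the linkname later.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Identifies a file on a Unix system: (device number, inode number).
type FileId = (u64, u64);

/// What the archiver should do with a path it is about to write.
#[derive(Debug, PartialEq)]
enum Entry {
    /// First sighting, or a file with a single link: write header and data.
    Regular,
    /// Already archived under the given path: write only a header
    /// with typeflag '1' and that path as the linkname.
    HardLink(PathBuf),
}

/// Hypothetical helper that remembers which multiply-linked files were seen.
#[derive(Default)]
struct LinkTracker {
    seen: HashMap<FileId, PathBuf>,
}

impl LinkTracker {
    fn classify(&mut self, path: &Path, id: FileId, nlink: u64) -> Entry {
        // Files with one link can never show up again under another name,
        // so they are not tracked at all.
        if nlink <= 1 {
            return Entry::Regular;
        }
        if let Some(first) = self.seen.get(&id) {
            return Entry::HardLink(first.clone());
        }
        self.seen.insert(id, path.to_path_buf());
        Entry::Regular
    }
}

/// Unix-only convenience wrapper that reads the identifiers from the filesystem.
#[cfg(unix)]
#[allow(dead_code)]
fn classify_fs(tracker: &mut LinkTracker, path: &Path) -> std::io::Result<Entry> {
    use std::os::unix::fs::MetadataExt;
    let meta = std::fs::symlink_metadata(path)?;
    Ok(tracker.classify(path, (meta.dev(), meta.ino()), meta.nlink()))
}
```

A real implementation would probably also want to skip directories and decide what to do on platforms without inode numbers; the sketch leaves both out.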

The appropriate place to set the typeflag is append_fs in Builder, because that is where the header is prepared. However, for append_fs to know that the entry is a hard link, Builder has to tell it, which means adding a parameter to append_fs. That would give append_fs 7 parameters, which may be too many, and every caller such as append_path would have to pass the parameter along too. So I am considering moving append_fs inside impl Builder:

fn append_fs(&mut self,
             path: &Path,
             meta: &fs::Metadata,
             read: &mut dyn Read,
             link_name: Option<&Path>) -> io::Result<()> {

That way dst, mode and the new HashSet of (inode, device number) can be retrieved from self.

That would be a big change, as other module-level functions such as append_path would have to become methods too. Is this acceptable, or is there a better approach?
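
To make the proposal a bit more concrete, here is a rough, std-only sketch of the idea that shared state lives on the builder while only per-entry data is passed in. `SketchBuilder` and its fields are hypothetical stand-ins, not the crate's actual Builder. The device/inode pair and link count are passed as plain integers instead of `fs::Metadata` so the example runs without touching the filesystem, and a `log` vector stands in for real header writing.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Write};
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the crate's Builder; not its real definition.
struct SketchBuilder<W: Write> {
    /// Output the archive data is written to.
    dst: W,
    /// (device, inode) -> first path archived for that file.
    seen: HashMap<(u64, u64), PathBuf>,
    /// Records what would have been written as headers (illustration only).
    log: Vec<String>,
}

impl<W: Write> SketchBuilder<W> {
    fn new(dst: W) -> Self {
        SketchBuilder { dst, seen: HashMap::new(), log: Vec::new() }
    }

    /// Only per-entry data is passed in; `dst` and `seen` come from `self`,
    /// so callers do not have to thread them through.
    fn append_fs(&mut self,
                 path: &Path,
                 id: (u64, u64),
                 nlink: u64,
                 read: &mut dyn Read) -> io::Result<()> {
        if nlink > 1 {
            if let Some(first) = self.seen.get(&id) {
                // A real builder would emit a header with typeflag '1'
                // and `first` as the linkname, and no file data.
                self.log.push(format!("link {} -> {}", path.display(), first.display()));
                return Ok(());
            }
            self.seen.insert(id, path.to_path_buf());
        }
        // First sighting: record the entry and copy the file contents.
        self.log.push(format!("file {}", path.display()));
        io::copy(read, &mut self.dst)?;
        Ok(())
    }
}
```

Appending the same (device, inode) pair twice with a link count of 2 would copy the data once and record the second path as a link to the first.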

@alexcrichton
Owner

@BrianOn99 sounds great to me!

@BrianOn99

@alexcrichton Then I will take this issue, making sure not to change the external API.

@the8472

the8472 commented Apr 19, 2019

See #192. Extending append_fs would be too high-level for callers that are not building an archive from the filesystem but are passing in partial headers, paths and readers directly.


4 participants