Recompression: Inexpensive expansion for all methods in squashfs #29

Open
M-Gonzalo opened this issue Apr 5, 2018 · 1 comment
Labels: feature (This has to be implemented), good first issue (Good for newcomers)

M-Gonzalo commented Apr 5, 2018

Squashfs is a read-only filesystem that is frequently used to transparently compress whole operating systems on live portable media, to distribute software in Snap and AppImage formats, and to efficiently store large multimedia archives. It divides the data into fairly small blocks and then compresses them with one of six algorithms:

  • gzip
  • LZ4
  • LZMA
  • LZMA2
  • LZO
  • zstd

If Fairytale were able to recompress them, it could save several GB on a sysadmin's drive. But doing so by brute-force guessing the precise method that was used to create the file ranges from extremely impractical to virtually impossible.

Luckily, there's no need to do that. Squashfs records the compressor and any non-default compression options in the archive itself, so recompressing a SQS file is just a matter of decompressing the streams and copying the flags. Identifying the method requires no search, so its cost is O(1). Just to make sure, I had a private conversation with the author, Phillip Lougher:

As I understand the docs, squashfs now stores the options used to compress every block. [...] My question is: Does squashfs really store all compression options inside the final archive?


Yes it does.

If a user has specified non-default compression options, these are
stored in the final archive.

If the user has used the default compression options, these are not
stored in the archive, because no stored compression options indicate
defaults were used. If the defaults were used then storing them is
unnecessary, because the software knows what the defaults are.

The presence of compression options is indicated by setting the
SQUASHFS_COMP_OPT bit (bit 10) in the Squashfs flags field in the
Squashfs superblock.

If that is set, then the compression options are stored immediately
after the superblock. The size and structure of the compression
options vary depending on which compressor was used to compress the
filesystem. The compressor used is stored in the "compression" field
in the superblock.

If all the various compressors are enabled and compiled into
Mksquashfs, then mksquashfs -info will list the compression options
supported by each compressor, and the defaults (which is used, means
no compression options will be stored).

Unsquashfs -stat will report the compressor used and any non-default
compression options used by the filesystem. If no compression options
are reported by -stat then the default options were used.
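
The check described in the reply can be sketched in a few lines of Python. This is only an illustration, not Fairytale code: it assumes a little-endian Squashfs 4.x image whose 96-byte superblock follows the field order used by the kernel and squashfs-tools, and the function name `read_compression_info` and the compressor id table are written here from that assumption rather than taken from this thread.

```python
import struct

# Assumed Squashfs 4.x superblock layout: 5 x u32, 6 x u16, 8 x u64,
# all little-endian, 96 bytes in total.
SUPERBLOCK = struct.Struct("<5I6H8Q")
SQUASHFS_MAGIC = 0x73717368      # reads as b"hsqs" on disk
SQUASHFS_COMP_OPT = 1 << 10      # bit 10 of the flags field, as described above

# Compressor ids as assumed from squashfs-tools.
COMPRESSORS = {1: "gzip", 2: "lzma", 3: "lzo", 4: "xz", 5: "lz4", 6: "zstd"}


def read_compression_info(superblock: bytes):
    """Return (compressor name, True if non-default options are stored)."""
    fields = SUPERBLOCK.unpack_from(superblock)
    # Field order: magic, inodes, mkfs_time, block_size, fragments,
    # compression, block_log, flags, ...
    magic, compression, flags = fields[0], fields[5], fields[7]
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian Squashfs 4.x image")
    name = COMPRESSORS.get(compression, f"unknown ({compression})")
    return name, bool(flags & SQUASHFS_COMP_OPT)
```

Calling it on the first 96 bytes of an image (for example `read_compression_info(open("image.squashfs", "rb").read(96))`, with a hypothetical file name) tells a recompressor which codec to use and whether it must also read an options record or can fall back to that codec's defaults.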

M-Gonzalo added the good first issue and feature labels Apr 5, 2018
M-Gonzalo self-assigned this Apr 5, 2018

M-Gonzalo (Collaborator, Author) commented:

From https://www.kernel.org/doc/Documentation/filesystems/squashfs.txt

 ---------------
|  superblock   |
|---------------|
|  compression  |
|    options    |
|---------------|
|  datablocks   |
|  & fragments  |
|---------------|
|  inode table  |
|---------------|
|   directory   |
|     table     |
|---------------|
|   fragment    |
|    table      |
|---------------|
|    export     |
|    table      |
|---------------|
|    uid/gid    |
|  lookup table |
|---------------|
|     xattr     |
|     table     |
 ---------------
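
Once the compressor and its options are known, each data block in the layout above can be checked for a lossless round trip before the recompressed form is trusted. The sketch below covers only the gzip case and uses Python's `zlib` module. The function name `gzip_block_roundtrips` is hypothetical, and the defaults of level 9 and a 15-bit window are an assumption about mksquashfs defaults, not something stated in this thread. A bit-exact match also depends on using the same zlib implementation that produced the image.

```python
import zlib


def gzip_block_roundtrips(block: bytes, level: int = 9, wbits: int = 15) -> bool:
    """Decompress one zlib-framed block, recompress it, and compare bytes.

    level and wbits stand in for the options recorded in the archive
    (or the compressor defaults when none are stored).
    """
    raw = zlib.decompress(block, wbits)
    comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
    rebuilt = comp.compress(raw) + comp.flush()
    # Only a byte-identical result lets the original block be restored later.
    return rebuilt == block
```

A block that fails this check would have to be stored as-is, or retried with other parameters, which is exactly the guessing that the stored options are meant to avoid.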
