Significantly reduce size of ext3 .img assets #263

Merged
merged 1 commit into `main` from `edmorley/reduce-img-file-size` on Mar 15, 2024

Conversation


@edmorley edmorley commented Mar 14, 2024

Our base images are built as Docker images and, on release, published to Docker Hub.

However, the Heroku platform currently doesn't use those images directly. Instead, during a release, `ext3`-formatted `.img` files are generated from each Docker image, gzipped, and uploaded to S3. At runtime each image is then mounted as a loopback device. For more background on this, see:
#42 (comment)
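
As a rough illustration of the runtime side of that flow (the paths and the image filename below are made up for the example; the exact mount options used by the platform aren't documented here):

```
# Hypothetical runtime-side sketch: decompress and loopback-mount an image
# read-only. Paths and filenames are illustrative, not taken from this repo.
gunzip --keep /var/cache/heroku-22.img.gz
mkdir --parents /tmp/heroku-22
mount -o loop,ro /var/cache/heroku-22.img /tmp/heroku-22
```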

Previously each `.img` file was created at a fixed size of 2,400 MiB, due to the `bs=100M count=24` arguments passed to `dd` (24 x 100 MiB blocks):
https://manpages.ubuntu.com/manpages/jammy/en/man1/dd.1.html
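
In other words, the previous approach boiled down to something like the following simplified sketch; the output path and the `mkfs` invocation are assumptions, only the `dd` arguments come from this PR:

```
# Simplified sketch of the previous fixed-size approach (assumed details noted above).
dd if=/dev/zero of=/tmp/heroku-22.img bs=100M count=24   # always 2,400 MiB of zeros
mkfs.ext3 -F /tmp/heroku-22.img                          # -F: target is a regular file
```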

However, this is significantly oversized - for example, the Heroku-20 run image's utilisation is only 29%:

```
$ df --human-readable /tmp/heroku-20
Filesystem  Size  Used Avail Use% Mounted on
/dev/loop3  2.3G  654M  1.6G  29% /tmp/heroku-20
```

This 1.6 GiB of free space is not required, since the image will be mounted read-only at runtime (the app's own storage lives in separate mounts).

At first glance this over-sizing might not seem like an issue, since `dd` was invoked with `if=/dev/zero`, so the empty space is zeroed out and therefore compresses very well (for example, the Heroku-22 run image gzips down to 216 MiB) - meaning bytes over the wire and S3 storage costs are not impacted.
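
That behaviour is easy to reproduce locally (a throwaway example, not part of the build scripts):

```
# Zero-filled space is nearly free to gzip: 100 MiB of zeros compresses to
# roughly 100 KiB, which is why the padded images were cheap to store on S3.
dd if=/dev/zero of=/tmp/zeros.img bs=1M count=100
gzip --keep /tmp/zeros.img
ls -lh /tmp/zeros.img /tmp/zeros.img.gz
```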

However, on the runtime hosts these images have to be stored/used uncompressed on EBS-backed storage - and in large quantity, due to the number of permutations of stack versions/variants we've accumulated over time (`cedar{,-14}`, `heroku-{16,18,20,22,24}{,-build}`). In addition, during a base image release, the Common Runtime hosts have to store both the old and new releases on disk side by side until old dynos cycle - meaning the high-water mark for disk usage is doubled for each non-EOL stack.

The recent addition of Heroku-24 to staging (which increased the storage requirements high-water mark by 9.4 GiB due to the above) resulted in disk space exhaustion on one partition of some of the single-dyno dedicated instance types:
https://salesforce-internal.slack.com/archives/C01R6FJ738U/p1710236577625989

I did some research to check whether there was a specific reason for the over-sizing, and found that the current `bs=100M count=24` arguments date back to 2011:
https://github.com/heroku/capturestack/commit/8821890894a7521791e81e8bf8f6ab2b31c93c8e

The 2,400 MiB figure seems to have been picked fairly arbitrarily - presumably to roughly fit the larger images at that time, with some additional headroom. In addition, I doubt disk usage was a concern back then, since there weren't yet the single-dyno instance types (which have less allocated storage than the multi-tenant instances) or the 12 stack versions/variants we've accumulated since.

As such, rather than increase the allocated EBS storage fleet-wide to support the Heroku-24 rollout, we can offset the increase for Heroku-24 (and in fact significantly reduce overall storage requirements) by instead dynamically sizing the `.img` files, basing their size on that of the base image contents they hold.

To do this I've chosen to create the `.img` file at an appropriate size up-front, rather than try to shrink it afterwards, since the process of shrinking would be fairly involved (e.g. https://superuser.com/a/1771500), require a lot more research/testing, and only gain us a couple of MiB of additional savings. The `.img` file format will also eventually be sunset with the move to CNBs / OCI images instead of slugs.
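
The dynamic sizing works roughly like the sketch below. The variable names, the headroom calculation, and the choice of `fallocate` are illustrative assumptions; only the general measure-then-allocate approach (using the tools linked at the end of this description) comes from this PR.

```
# Hedged sketch of dynamically sizing the image file; names and the headroom
# factor are assumptions made for this example.
contents_mb="$(du --summarize --block-size=1M "${BASE_IMAGE_ROOTFS}" | cut -f1)"
img_size_mb="$((contents_mb + (contents_mb / 50) + 16))"  # ~2% + small fixed headroom for fs metadata
fallocate --length "${img_size_mb}M" "${IMG_PATH}"
mkfs.ext3 -F "${IMG_PATH}"
mount -o loop "${IMG_PATH}" "${MOUNT_DIR}"
cp --archive "${BASE_IMAGE_ROOTFS}/." "${MOUNT_DIR}/"
df --human-readable "${MOUNT_DIR}"   # print utilisation, as in the Before/After output below
```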

I've also added the printing of disk utilisation during the `.img` generation process (example), which allows us to see the changes in image size:

### Before

```
Filesystem  Size  Used Avail Use% Mounted on
/dev/loop3  2.3G  654M  1.6G  29% /tmp/heroku-20
/dev/loop3  2.3G  1.5G  770M  67% /tmp/heroku-20-build
/dev/loop3  2.3G  661M  1.6G  30% /tmp/heroku-22
/dev/loop3  2.3G  1.1G  1.2G  46% /tmp/heroku-22-build
/dev/loop3  2.3G  669M  1.6G  30% /tmp/heroku-24
/dev/loop3  2.3G  1.2G  1.1G  51% /tmp/heroku-24-build

Total: 14,400 MiB
```

### After

```
Filesystem  Size  Used Avail Use% Mounted on
/dev/loop3  670M  654M  8.7M  99% /tmp/heroku-20
/dev/loop3  1.6G  1.5G   23M  99% /tmp/heroku-20-build
/dev/loop3  678M  660M   11M  99% /tmp/heroku-22
/dev/loop3  1.1G  1.1G  6.8M 100% /tmp/heroku-22-build
/dev/loop3  686M  669M   10M  99% /tmp/heroku-24
/dev/loop3  1.2G  1.2G   11M 100% /tmp/heroku-24-build

Total: 6,027 MiB
```

Therefore, across those 6 actively updated (non-EOL) stack variants we save 8.2 GiB, which translates to a 16.4 GiB reduction in the high-water mark storage requirements for every Common Runtime instance in the fleet, and an 8.2 GiB reduction for every Private Spaces runtime node (which receive updates via the AMI so don't have double the images during new releases).

There is also potentially another ~6.5 GiB of savings to be had from repacking the `.img` files for the last release of each of the 6 EOL stack versions/variants; however, since those stacks are no longer built/released, that would need a more involved repacking approach. (Plus, since those stacks aren't updated, they don't double the usage requirements for Common Runtime during releases, so the realised overall reduction in storage requirements would be less.)

Docs for the various related tools:
https://manpages.ubuntu.com/manpages/jammy/en/man1/du.1.html
https://manpages.ubuntu.com/manpages/jammy/en/man1/df.1.html
https://manpages.ubuntu.com/manpages/jammy/en/man1/dd.1.html
https://manpages.ubuntu.com/manpages/jammy/en/man1/fallocate.1.html

GUS-W-15245261.

@edmorley edmorley self-assigned this Mar 14, 2024
@edmorley edmorley marked this pull request as ready for review March 14, 2024 19:35
@edmorley edmorley requested a review from a team as a code owner March 14, 2024 19:35
@runesoerensen runesoerensen left a comment

This is great! A much better way to address the issues that Heroku-24 would otherwise cause than simply increasing EBS volume sizes.

@edmorley edmorley merged commit bb17881 into main Mar 15, 2024
4 checks passed
@edmorley edmorley deleted the edmorley/reduce-img-file-size branch March 15, 2024 12:49
edmorley added a commit that referenced this pull request Mar 18, 2024
* Removes the redundant `64` suffix from the image filenames/mount
  directories on disk. This doesn't affect the filename on S3. 
* Adds comments to the `mkfs` / `tune2fs` usages, since their
  purpose and arguments are IMO not immediately obvious.

Docs for the related tools:
https://manpages.ubuntu.com/manpages/jammy/en/man8/mkfs.8.html
https://manpages.ubuntu.com/manpages/jammy/en/man8/tune2fs.8.html
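
For context, the commented `mkfs` / `tune2fs` usage is presumably along these lines (a hedged sketch; the exact flags in the repo may differ):

```
# -F: proceed even though the target is a regular file rather than a block device.
mkfs.ext3 -F "${IMG_PATH}"
# Disable mount-count and time-based fsck checks; the image is only ever
# mounted read-only, so periodic checking isn't useful.
tune2fs -c 0 -i 0 "${IMG_PATH}"
```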

Split out of #263.

GUS-W-15245261.