
[POC]: Add esbuild and update enhanced dockerfile #906

Closed
wants to merge 15 commits

Conversation


@branberry branberry commented Sep 14, 2023

Notes

This PR was an experiment to see if we could drastically reduce the build time of our docker image deploys.

Currently, the builds have been modestly reduced, from 18-25 minutes down to 12-15 minutes. The builds also seem more consistent: previously, even the fastest builds without any optimizations only got down to about the 17-minute mark.

The following strategies were employed:

  1. Utilize Docker's caching mechanisms to cut back on steps that needed to be run
  2. Decrease the bundle size of the deployed assets
  3. Use a smaller base image to further reduce the final image size

Caching

For point one, this required a reorganization of the stages so that we do not trigger a cache invalidation every time we deploy. In particular, COPY commands are a common source of cache invalidation.

For more background, if a layer has a change that invalidates the cache for that layer, then all downstream layers must be re-run. Here's a nifty illustration from the Docker site:
[image: Docker docs diagram of layer cache invalidation]

In our case, this required a bit of shuffling of the stages. The stage that uses the ubuntu-20.04 image was moved up to the top, since that stage simply installs dependencies with fixed versions, making it a perfect candidate for caching. This means we need to ensure that we copy all of the dependencies from this stage to the final stage that runs the application.
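
As a rough sketch of that ordering (stage names, image tags, and build commands below are assumptions for illustration, not the exact contents of this PR's Dockerfile), the rarely-changing installs sit at the top and the frequently-changing COPY of source code sits as late as possible:

```dockerfile
# Illustrative sketch only -- not the exact stages in this PR.

# Early stage: pinned OS-level dependencies. Nothing here changes between deploys,
# so Docker can keep reusing these cached layers.
FROM ubuntu:20.04 AS dependencies
RUN apt-get update && apt-get install -y --no-install-recommends python2.7 \
    && rm -rf /var/lib/apt/lists/*

# Later stage: application code. The COPY of source invalidates the cache from this
# point down, but the dependency layers above are untouched.
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
```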

Unfortunately, my attempts to configure the cache to work with GitHub Actions and ECR didn't seem to work as I expected. The layers are always re-run, and so I'll have to do more digging to see how to properly configure the caching when running in a workflow.

Caching works locally, but that requires no configuration since each build is stateful, i.e., the previously built image already exists in local storage. GitHub Actions runners, on the other hand, start with a clean slate and have no access to the previously built Docker image. There are custom GitHub Actions that handle Docker layer caching, but there's the extra wrinkle that we deploy with AWS CDK, which handles all of the Docker commands itself.

Another potential approach could be to manually build the image ourselves, and then simply reference that built image in the CDK.

Bundling

The next optimization was to reduce the size of the bundles, using esbuild. Currently, we use the TypeScript compiler to compile our TypeScript into JavaScript, then copy the package*.json files and install the node modules in the final stage. This means the final build includes a node_modules directory for the root project and for each of the modules/, which all adds up to roughly 800 MB of data. Using esbuild, we are able to package all of the projects into a bundle of approximately 25 MB:

[image: bundle size output (~25 MB total)]
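
For context, here is a hedged sketch of what that bundling step can look like inside the image build (the entry point, esbuild flags, and image tags are assumptions for illustration, not necessarily what this PR uses):

```dockerfile
# Bundle stage: esbuild emits a single self-contained JS file, so node_modules
# never needs to reach the final stage.
FROM node:18 AS bundle
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Entry point and flags are assumptions for this sketch.
RUN npx esbuild src/index.ts --bundle --platform=node --outfile=dist/app.js

# Final stage carries only the small bundle instead of ~800 MB of node_modules.
FROM node:18-alpine
WORKDIR /app
COPY --from=bundle /app/dist/app.js ./app.js
CMD ["node", "app.js"]
```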

And we can see that the final image size is greatly reduced in large part due to the reduced bundle size:

(Values are in MB. The top value is the image using the smaller bundling approach plus a smaller base image; the bottom value is the previous image size.)
[image: image size comparison]

While the reduced image size is nice, it didn't bring down the build times as much as I would have hoped.

Base Image Change

The last change was to shrink the base image that the container uses. This happened in tandem with the stage re-shuffling. Previously, the Ubuntu image was the base of the final stage, and it's quite large by default: with no dependencies installed, it's about 78 MB. Something to note is that we use the Ubuntu image to install the dependencies Giza needs, since Giza uses Python 2.7. Alpine images are much smaller (typically ~5 MB), but unfortunately Python 2.7 is no longer supported there and cannot be installed from Alpine's package manager. Fortunately, we can still use the Ubuntu base image in an earlier stage to build the dependencies we need, and then copy them over to the final stage, which uses an Alpine Linux base image.
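
A minimal sketch of that copy-from-Ubuntu-into-Alpine pattern (package names and paths are illustrative, and the real Dockerfile has to handle whatever Giza's Python 2.7 toolchain actually requires):

```dockerfile
# Ubuntu stage exists only to install the Python 2.7 dependencies that Alpine's
# package manager can no longer provide.
FROM ubuntu:20.04 AS python-deps
RUN apt-get update && apt-get install -y --no-install-recommends python2.7 \
    && rm -rf /var/lib/apt/lists/*

# Final stage uses the much smaller Alpine base and only receives prebuilt files.
FROM alpine:3.18
# Caveat: a glibc-built interpreter copied onto musl-based Alpine also needs its
# shared libraries, so in practice more than these two paths must come across.
COPY --from=python-deps /usr/bin/python2.7 /usr/bin/python2.7
COPY --from=python-deps /usr/lib/python2.7 /usr/lib/python2.7
```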

NOTE: This also means that the container is running on a different Linux distribution, and the behavior may not be the same. At this time, I have not tested builds, only confirmed that the tasks start without error. Further testing will need to be done to confirm that the Autobuilder still works as intended.

@branberry branberry closed this Sep 14, 2023
@branberry branberry reopened this Sep 14, 2023
@branberry branberry closed this Sep 14, 2023
@branberry branberry reopened this Sep 14, 2023
@branberry branberry closed this Sep 14, 2023
@branberry branberry reopened this Sep 14, 2023
@branberry branberry closed this Sep 14, 2023
@branberry branberry reopened this Sep 14, 2023
@branberry branberry closed this Sep 15, 2023
@branberry branberry reopened this Sep 15, 2023
@branberry branberry closed this Sep 15, 2023
@branberry branberry reopened this Sep 15, 2023
@branberry branberry closed this Sep 15, 2023
@branberry branberry reopened this Sep 15, 2023
@github-actions

The URL for your feature branch webhook is https://z00gepi0cj.execute-api.us-east-2.amazonaws.com/prod/

@branberry branberry closed this Oct 13, 2023