[Docker] reduce image size, use targeted builds #354
Comments
Yes, the way the current build process is set up leads to some redundancy between |
If you want to have a look at a targeted build via gradle that only builds the |
Regarding the resources, one option is a "lazy download" strategy, where Cineast downloads missing resources at runtime. The benefit is that we could build significantly smaller images, and use cases that do not require the deep learning features would skip this lengthy download. The downside is that the resources would no longer be contained in the image by default.
The "lazy download" strategy would be very easy to implement in the Docker image. I'm just worried that it would slow down the (re)start of the application. Can you share some details about the stage at which those resources are required?
I just noticed that the releases also contain libraries like JUnit, Mockito etc. Is this intentional? They are not huge, but I doubt that anyone is using them in a production environment.
Generally, all resources are used by features, either during extraction or at runtime. I like a lazy download strategy, and I'm also open to caching those resources in an attached volume. As for the second point, I'm sure there's some optimization to be done w.r.t. library naming, e.g. no testing library is required at normal runtime / in a production environment. Feel free to open a PR for both issues.
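For illustration, the "lazy download" idea with a cache directory (for example an attached volume) could be sketched roughly as follows. This is only a minimal sketch under assumptions: `LazyResource` and `ensure` are hypothetical names that do not exist in Cineast, and the source location of each resource is left to the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration only; not part of Cineast. */
public final class LazyResource {

  private LazyResource() {}

  /**
   * Returns {@code target} if it already exists (e.g. cached in an attached volume);
   * otherwise fetches it from {@code source} first.
   */
  public static Path ensure(Path target, URI source) throws IOException {
    if (Files.exists(target)) {
      return target; // cache hit: no download, so restarts stay fast
    }
    Path parent = target.toAbsolutePath().getParent();
    Files.createDirectories(parent);
    // Write to a temporary file in the same directory first, so an
    // interrupted download never leaves a partial file at the target path.
    Path tmp = Files.createTempFile(parent, "download-", ".part");
    try (InputStream in = source.toURL().openStream()) {
      Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
      Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    } finally {
      Files.deleteIfExists(tmp); // no-op after a successful move
    }
    return target;
  }
}
```

With this shape, only the first start after deploying an empty volume pays the download cost; later (re)starts hit the cached file. Features that never call `ensure` for a given resource never trigger its download.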
The Docker v3.12.4 image is > 4GB and keeps growing. To compare, the `eclipse-temurin:17-jre` base image is only 90MB.

Docker images should only contain the required runtime libraries, and the aim of containers is to be lightweight. Even on a modern system, it takes several minutes to download and extract the `vitrivr/cineast` image. This makes deployment harder and leads to longer downtimes. Furthermore, if you use cloud-native infrastructure, this could lead to higher costs due to the bandwidth and storage requirements. Since all dependencies are compressed into two single JARs, it's also impossible for Docker to cache or deduplicate image layers.

I took a look into the image: 2GB are used by `resources` (I'm not sure if there is any room for optimization) and the `cineast-api.jar` and `cineast-cli.jar` are about 1GB each. When you look into the JARs, you will see that they share a large amount of data. If you look further into that data, most of it is native binaries (e.g. tensorflow or ffmpeg). It even contains the same libraries for multiple platforms, like x86, arm, windows, macos etc.

So here are some thoughts:
`linux/amd64` and therefore it should only contain the libraries for `linux/amd64`. Is there a way to use targeted builds with gradle for Docker? In the future, `docker buildx` could be used to support other platforms.
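To quantify how much of a fat JAR is taken up by native binaries for other platforms, a small report along the following lines might help. It is a rough sketch, not an existing tool: `NativeLibReport` is a hypothetical name, and guessing the platform from the file extension is a simplifying assumption (it cannot tell x86 from arm, for instance).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for illustration only; not part of Cineast. */
public final class NativeLibReport {

  private NativeLibReport() {}

  /** Sums the uncompressed size of native libraries in a JAR, grouped by guessed OS. */
  public static Map<String, Long> bytesByKind(Path jar) throws IOException {
    Map<String, Long> totals = new TreeMap<>();
    try (ZipFile zip = new ZipFile(jar.toFile())) {
      Enumeration<? extends ZipEntry> entries = zip.entries();
      while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        if (entry.isDirectory()) {
          continue;
        }
        String kind = kindOf(entry.getName());
        if (kind != null) {
          // getSize() returns -1 when the size is unknown; count that as 0.
          totals.merge(kind, Math.max(entry.getSize(), 0L), Long::sum);
        }
      }
    }
    return totals;
  }

  /** Heuristic (assumption): infer the target OS from the file extension. */
  static String kindOf(String name) {
    String n = name.toLowerCase(Locale.ROOT);
    if (n.endsWith(".dll")) return "windows";
    if (n.endsWith(".dylib") || n.endsWith(".jnilib")) return "macos";
    if (n.endsWith(".so") || n.contains(".so.")) return "linux";
    return null; // not a recognised native library
  }
}
```

Running this against `cineast-api.jar` and `cineast-cli.jar` would give a first estimate of how much a `linux/amd64`-only build could save. Everything reported outside the `linux` bucket is a candidate for exclusion.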