Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prune unused artifacts from non-static builds #59

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Conversation

henriquesimoes
Copy link
Collaborator

Both dynamic-link and no-build targets generate unnecessarily large IOC images due to the trivial COPY done from /opt. Clean up everything from there and /usr/local directory before doing the COPY to the IOC image.

Clean up strategy proposed here is the following:

  • Remove entirely any module that does not have a shared library linked against binaries under the IOC target directories (i.e. /opt/$REPONAME or $RUNDIR);
  • Prune directories from used modules and EPICS base not containing shared libraries, EPICS databases or autosave requirement files;
  • Remove any shared library not linked against binaries under IOC target directories;
  • Remove all statically linked libraries;

For pruning module directories, another approach that could be followed is to delegate this responsibility to install_module. This way, we can have contextualized pruning for modules. On the other hand, this cannot be as aggressive as implemented in this proposal, as build-dependent files (the ones defining the build-system itself, for instance) must exist until the IOC build is complete.

I have tested this with ad-aravis-epics-ioc, and it shrinks the image size from 1.33GB down to 322MB.

This should be further tested and extensively reviewed, so that no artifact is removed unexpectedly.

@henriquesimoes henriquesimoes marked this pull request as draft April 16, 2024 18:10
@henriquesimoes
Copy link
Collaborator Author

I'm waiting for #57 to be merged to rebase it. As of now, this would mean another step is introduced which removes all IOCs installed in the base image.

@henriquesimoes henriquesimoes changed the title Prune unused artifacts from IOC images Prune unused artifacts from non-static builds Apr 16, 2024
@henriquesimoes henriquesimoes mentioned this pull request Apr 16, 2024
@henriquesimoes henriquesimoes linked an issue Apr 17, 2024 that may be closed by this pull request
Copy link
Collaborator Author

@henriquesimoes henriquesimoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added comments to points that I wasn't sure what is the best way to deal with, so that we can discuss if there are relevant or not and how to improve, if that's the case.

In general, I feel like I could have written helper functions that could be more easily shared among the steps. But that's not a concrete thing to propose anything. Let me know if you feel the same.

base/Dockerfile Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
Copy link
Member

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first two commits are correct and can be merged separately, if you'd like to split them.

Copy link
Member

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Dockerfile parts of the third commit are ok.


Commit message nits:

It's not the caching that's the issue, it's simply how container images are layered:

The clean up phase must be executed in the build stage (before the final
copy), otherwise the COPY layer would still make the image size big.
This implies `no-build` target needs to copy from a pruned version of
the base image, which is a new stage.

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@henriquesimoes henriquesimoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I'm starting to think that we should provide a way to explicitly skip a pruning step, possibly for a given target directory.

This would give users some tooling for managing corner cases, without relying on immediate fixes in lnls-prune-artifacts.

REPONAME: mca
REPONAME: epics/modules/mca
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I don't like very much the idea of reusing REPONAME. It seems awkward to me. I'd rather have them specified by a variable whose name makes sense at all.

I did so because it was easier to test other things first. Let me know what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Do you think it would lead to special casing these modules in the image Dockerfile?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would lead to special casing these modules in the image Dockerfile?

They kinda already have such special handling... all of them are no-build targets. They of course currently have almost the same (compose variables) API of other targets. I say "almost" because they diverged by not using REPONAME before, and now by not meaning "base name to where to clone the repository" but rather "location inside /opt where important artifacts live".

I'm not sure if I answered your question at all, but I do feel like we have always being dealing with a special case.

Copy link
Member

@ericonr ericonr Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I am essentially wondering if we shouldn't use some other name instead of REPONAME, since these images are a special case.

Copy link
Collaborator Author

@henriquesimoes henriquesimoes Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to APP_DIRS. It should encode full paths now. It is named after a plural noun; after all, we can have multiple directories that we want to ensure are properly working. This should work as a palliative measure for spurious cases where we clean up too much. Any directory listed there will have its belonging module removed from pruning (both module-based and directory-based).

Maybe we could be even more explicit about its usage regarding "not pruning it and keeping its executable binaries working". Any opinion and suggestion about that?

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
Copy link
Member

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing isn't clear to me. From targets, you are obtaining unused modules, to then obtain used modules, to finally obtain unused modules again, and remove them. Is that logic necessary, can we simplify it? If you already have these answers, you can add them to the commit message.

Commit message nit:

EPICS base is assumed to be always needed,

always be needed

It is assumed we are using glibc implementation

we are using glibc's ldd

For the second commit, since we are removing stuff after building, static libraries can always be removed, notjust for non-static builds. I think it's only the commit message that's wrong.

Nit:

Static libraries are not used neither during compile time nor runtime for non-static builds.

Static libraries are not used during compile nor runtime.

Static libraries are used neither during compile nor runtime.

All AR archives are assumed to be static libraries,

All files with .a extension are assumed to be static libraries,

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
@henriquesimoes henriquesimoes force-pushed the prune branch 3 times, most recently from c34d5c2 to d74b331 Compare November 6, 2024 21:29
@gustavosr8
Copy link
Contributor

nit from 13072de:
s/clean-up/cleanup
s/clean up/cleanup
s/so that it contains a valid mapping of all modules/so it contains a valid mapping of all modules
s/assumes the inspected binaries/assumes that the inspected binaries
s/It is assumed we are using glibc's/It is assumed that we are using glibc's

@henriquesimoes
Copy link
Collaborator Author

Thanks, @gustavosr8. I've also made remove_unused_shared_libs respect libraries inside APP_DIRS. This allows users to specify the path where their dlopen(3)ed libraries live. Can you test this again with the pyDevSup-based IOC?

@gustavosr8
Copy link
Contributor

Thanks, @gustavosr8. I've also made remove_unused_shared_libs respect libraries inside APP_DIRS. This allows users to specify the path where their dlopen(3)ed libraries live. Can you test this again with the pyDevSup-based IOC?

Now it seems to work! Thanks :)
Just a reminder to document the use of APP_DIRS ;)

Copy link
Contributor

@gustavosr8 gustavosr8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add instructions to put /opt/epics/modules/pyDevSup/python3.9 under APP_DIRS (or any analogous var) when building an IOC that uses the pyDevSup module.

Copy link
Member

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message nits:

"workaround" should be "work around", it's being used as a verb.

"a "import"" should use "an"

base/lnls-prune-artifacts.sh Show resolved Hide resolved
images/docker-compose-mca.yml Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
@ericonr
Copy link
Member

ericonr commented Dec 9, 2024

I'm force pushing into the PR, after rebasing it and fixing some of my own nits.

Original commit: 0a9c79e

@henriquesimoes
Copy link
Collaborator Author

Changed the last implementation bits now. I learned something new during this round: calling a bash function outside a subshell does not create another variable scope. I was assuming it did until now. If I knew this from start, I would have defined several other variables explicitly as local just to avoid overwriting something by mistake in the future.

I haven't done so now, though, to reduce the amount of noise in range diffs. I've only done so now where it would be a bug otherwise. Let me know if you think it is interesting to refactor that before merging.

From my point of view, documentation is the last topic to be covered in this PR.

henriquesimoes and others added 10 commits December 13, 2024 16:28
Scripts that are executed only during IOC build steps don't need to be
copied early in the base image build. In addition, doing so invalidates
cache whenever such scripts are touched. Install them in the base image
in the last steps, so that cache can be more efficiently used, speeding
up local and CI builds.
Remove EPICS modules that do not have a dynamic library linked to any
executables in the target IOC. This shrinks the final image size by
removing useless artifacts. We introduce APP_DIRS (application
directories) variable to specify paths where relevant binaries are, when
no-build target is used, so the proper cleanup can happen in such cases
as well; APP_DIRS replaces REPONAME for these cases, since the defined
REPONAME was unused. For dynamic-build, it encodes any additional
directories besides the base COPY'ed directory to be considered.

Cleanup phase must be executed in the build stage (before the final
copy), otherwise the COPY layer would still make the image size big.
This implies `no-build` target needs to copy from a pruned version of
the base image, which is a new stage.

The list of modules to be removed is taken from the RELEASE file, so
that it contains a valid mapping of all modules that have been
installed. EPICS base is assumed to always be needed, even though some
binaries in it can certainly be thrown away given the IOC linkage (for
instance, PVAccess binaries when the IOC uses only Channel Access).

Linkage information is taken from ldd(1), which assumes that inspected
binaries are safe to be potentially executed. It is assumed that we are
using glibc's ldd, which provides a straightforward way to query all
binaries at once. Moreover, we ignore not-found library entries to
properly cover binaries already shipped in repositories, such as
libopcua.so, which might not contain (properly) defined rpaths.

To avoid implementing a function that returns "all EPICS modules whose
path is a prefix from the used library paths", we instead reuse
`filter_out_paths()` by exploting the fact that it filters *out*
everything we care about when given `linked_libs`. Taking its difference
to the original set gives us what we need without implementing the
variant of `filter_out_paths()`.

The pruning script requires `set -E` in order to propagate errors from
functions called inside subshells.
Static libraries are not used during runtime, and not even compile-time
for non-static builds. On the other hand, they take up a large amount of
storage: about 312MB for EPICS base and 222MB for modules. Remove them
all right before finishing non-static build stages, avoiding their copy
to IOC images. This is not needed for static-link targets, since only
the cleaned IOC directory is copied, already leaving behind all static
libraries.

All files with .a are assumed to be static libraries, which should be a
good assumption in GNU/Linux. Other static artifacts, such as object
files, are correctly removed by the build system clean target and are
not handled explictly here for simplification. A lint script for the
build system can be implemented in the future if this eventually becomes
false.

Files installed into /usr/local aren't copied into the final IOC images
yet. So adding it into the argument list doesn't alter the behavior for
now, it is simply future proofing.
Shared libraries may be installed either in /opt or /usr/local/lib but
may not be used by the target binaries from the IOC image. Remove them
during the prune phase before copying both directories to resulting
image to shrink the final image size. Both actual versioned binary and
its symbolic links are removed to keep the filesystem consistent.

When deciding the set of "used libraries", we assume all libraries
inside any APP_DIRS are used. This allows one to specify a dlopen(3)'ed
library path in APP_DIRS, and ensure it is kept in the resulting image.

Like static libraries, dynamic libraries installed into /usr/local
aren't copied into the final IOC images yet. So adding it into the
argument list doesn't alter the behavior for now, it is simply future
proofing.
Modules may contain several artifacts, including configuration files,
graphical interface files and other repository artifacts that do not
need to be in the IOC image.

Remove them all except the ones containing EPICS database (`.db`,
`.template` and `.substitutions`), autosave requirement (`.req`), or
command (`.cmd`) files, besides shared libraries. Binaries directory
(`bin`) is also removed, as only $REPONAME and $RUNDIR should contain
target executables, which are filtered out from the list.

Command files are kept, because they might be snippets to be sourced by
the IOC's own main command file. ACFs were considered, but ultimately
considered unlikely to be used this way.

Because we call prune_directories() directly (without a subshell), we
must ensure `target` and `keep_paths` are created in the local scope,
avoiding corrupting the callee variables.

Co-authored-by: Érico Nogueira <[email protected]>
EPICS base has rather large configuration files for build, and other
repository files, which are not needed in the IOC images. Prune them
after building the IOCs, shrinking by 40MB the final image size.

Prune is performed with the same script as modules, which discards all
executables in `bin` (~15MB), as well as Perl scripts. This should be
fine considering that `static-link` target also does not preserve EPICS
binaries in the resulting image.

It is also susceptible to be skipped during module directory pruning if
a target specifies it. This may be useful if we ever consider making a
derivative image only with its tools.
When pruning modules, some of them might contain special cases where it
is harder to automatically detect in the pruning algorithm. While users
could workaround this by specifying such special paths under APP_DIRS,
this can be handled here by adding extra tooling to be used by modules.

Introduce a way for modules to specify extra paths that must be kept,
besides those detected by lnls-prune-artifacts. Do this through a
.lnls-keep-paths file that specifies all relative paths (directories or
files) whose contents must be kept. Paths may be globs, so that they are
resolved based on their bash expansion, avoiding hard-coded options when
ambiguity can be automatically resolved.

These exceptions are considered for all ancestors of a candidate to
removal, so that they are not restricted to EPICS modules themselves.
This eases implementation because we won't need to detect which parent
is a module and should contain .lnls-keep-paths, and instead simply
traverse the whole ancestor list looking if any of them defines one.
Some of the dynamic libraries built by pyDevSup are not directly linked
into the IOC binary, but rather dlopen(3)'ed by Python interpreter when
a "import" statement is reached. To avoid having all users of pyDevSup
manually specify which ones to keep, define everything inside
python3.<minor-version>/linux-<architecture> to be kept by
lnls-prune-artifacts.

We use globs here (whose expansion is deferred) to avoid changing the
minor version every distro upgrade.
pvagw isn't interactive, so we should use it as the entrypoint directly.

Also fix missing runtime package.
Pruning artifacts embeds several heuristics about what is worth keeping,
and what should be removed after build. Such predefined pruning steps
might miss corner cases where defining extra APP_DIRS is not enough.
Introduce a knob to skip the pruning steps entirely to give users a
immediate last resort option to use before a corrected version is
eventually released.
Users should know that an automatic step is executed to make the image
small, because this might break their application in (hopefully) rare
situations. Also teach them it is possible to extend the paths where
important binaries are, and also skip pruning entirely if needed.
@henriquesimoes henriquesimoes marked this pull request as ready for review December 17, 2024 16:19
Comment on lines +56 to +64
For non-static builds, an automatic pruning step is executed to make the IOC
image size small by removing unused artifacts. This step preserves all ELF
executables inside `RUNDIR` and the IOC repository, as well as their dependent
EPICS modules based on the linkage information. Additional paths can be
provided in `args` under `APP_DIRS` key to extend the list of where important
applications and shared libraries are, including any `dlopen(3)`ed libraries.
Pruning step can be skipped entirely by providing the `SKIP_PRUNE=1` argument,
but this is highly discouraged as it will increase significantly the image
size.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it is interesting to specify we also prune directories from modules in use?

I don't like the idea of replicating what types of files are kept, but it may be okay to say they are also pruned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prune unused modules from dynamic-link
3 participants