Skip to content

Latest commit

 

History

History
100 lines (73 loc) · 3.9 KB

Build Cache.md

File metadata and controls

100 lines (73 loc) · 3.9 KB

The Omnibus Build Cache

The omnibus project includes a project build caching mechanism that reduces the time it takes to rebuild a project when only a few components need to be rebuilt.

The cache uses git to snapshot the project tree after each software component build. The entire contents of the project's install_dir is included in the snaphot.

When rebuilding, omnibus walks the linearized component list and replays the cache until the first item is encountered that needs to be rebuilt. Items are rebuilt if the code changes (different version or different git SHA) or if the build instructions change (edit to config/software/$COMPONENT).

Location of Cache Data

The default location of the cache (which is just a bare git repository) is /var/cache/omnibus/cache/git_cache/$INSTALL_DIR. You can customize the location of the cache in the omnibus.rb config file using the key git_cache_dir. For example:

git_cache_dir "/opt/omnibus-caches"

How It Works

Omnibus begins by loading all projects (see lib/omnibus.rb#expand_software). The dependencies of each project are loaded using recursively_load_dependency to capture the transitive dependencies of a given project dependency. The result is that each project has a list of components in "install order" such that all dependencies come before the things that depend on them. Components are de-duplicated on the way in and not added if already present (which will occur when two components share a common dependency).

The actual build order is determined by library.rb#build_order which does a small reordering to ensure that components that are explicitly listed in the project file come last. Since the first cache miss causes the system to rebuild everything further to the right in the component list, you want to have your most frequently changed components last to minimize rebuild churn.

Lightweight git tags are used as cache keys. After building a component, a tag name is computed in GitCache#tag. The tag name has the format #{name}-#{version}-#{digest} where name and version map to the component that was just built and digest is a SHA256 value. The digest is computed by string-ifying the name/version pairs of all components built prior to this component (keeping the order of course) and prepending the entire contents of the component's config/software/$NAME.rb file. Important aspects of the cache key scheme:

  • A change in the order or in the name/version of component built before the component will invalidate the cache key.
  • A change in the component's software config (build instructions) invalidates the cache. This is done conservatively; the entire file contents are included in the digest computation so even a whitespace change will invalidate the key.
  • A change in the version of the component itself will invalidate the key.

You can inspect the cache keys like this:

git --git-dir=$CACHE_PATH tag -l

You can manually remove a cache entry using git tag --delete $TAG.

Build Slaves

You can share the cache among build slaves using git clone/fetch.

Using git bundles is another option (if you're using Jenkins, the bundle can be persisted as an artifact):

  • Backup: git --git-dir=$CACHE_PATH bundle create $CACHE_BUNDLE --tags
  • Restore: git clone --mirror $CACHE_BUNDLE $CACHE_PATH

(CACHE_BUNDLE is anywhere you're persisting the bundle to)

Important Caveat

Note that Omnibus caches instructions exactly as they will be executed (e.g. Omnibus caches make -j 9, it doesn't cache make -j #{workers}).

So, if you intend to share your build cache among slaves, you should ensure that their build environments are identical (failing which you'll bust the cache with commands that depend on the environment).

Mac OS X Caveats

When running a build vm on OS X, note that you will likely run into trouble if you attempt to locate your cache on your local filesystem if it is not case sensitive.