-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate Inventory & Sha7 appended filenames for binaries #51
Conversation
Currently the Ruby buildapack is tightly coupled to the URLs that are generated by this repo. I am in the process of introducing an inventory file that the buildpack can use as a lookup. To start that process this commit introduces generating tgz files with the first 7 SHA256 characters appended. This allows us to update the same version number in the future without worrying that it will break future digest checks of the URL. With this change the builder will continue to generate the original URLs, but it will also generate "<filename>-<sha256>.tgz" files as well and append this information to a `jruby_inventory.toml` file.
- We now create only one sha named archive for jruby builds since both ARM and AMD point to the same source. - Binary `inventory_check` takes in a file and validates that all checksums are valid.
Sidenote, the check to validate that the URLs are valid works! https://github.com/heroku/docker-heroku-ruby-builder/actions/runs/10102146736/job/27937182426?pr=51
2cb828c
to
19bfff6
Compare
Before: ``` drwxr-xr-x 2 root root 4096 Jul 25 22:34 . drwxr-xr-x 3 runner docker 4096 Jul 25 22:32 .. -rw-r--r-- 1 root root 24061018 Jul 25 22:34 ruby-3.2.3.tgz ``` With these permissions we were unable to write to the directory. After: After ``` drwxrwxrwx 2 root root 4096 Jul 27 02:53 . drwxr-xr-x 3 runner docker 4096 Jul 27 02:51 .. -rw-r--r-- 1 root root 24061325 Jul 27 02:53 ruby-3.2.3.tgz ``` Now we can copy the file to a new filename in the same directory.
``` error: the borrowed expression implements the required traits --> ruby_executable/src/bin/ruby_build.rs:143:24 | 143 | docker_run.arg(&format!( | ________________________^ 144 | | "./make_ruby.sh {} {}", 145 | | input_tar.display(), 146 | | output_tar.display() 147 | | )); | |_________^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrows_for_generic_args = note: `-D clippy::needless-borrows-for-generic-args` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_borrows_for_generic_args)]` help: change this to | 143 ~ docker_run.arg(format!( 144 + "./make_ruby.sh {} {}", 145 + input_tar.display(), 146 + output_tar.display() 147 ~ )); | error: could not compile `ruby_executable` (bin "ruby_build" test) due to 1 previous error warning: build failed, waiting for other jobs to finish... ```
a83e17d
to
e3cc816
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d probably have preferred less duplication between ruby_build.rs
and jruby_build.rs
. Seems to me those could actually be merged almost entirely, and at least the newly introduced logic is identical.
Probably more of a matter of style (and not super important in this context/for what this repository does), just seems like there’s higher than necessary risk that the two implementations will fall out of sync over time.
I’d also have preferred to see all the code being included in this PR (and not in the GemVersion
crate), but I understand the intention is to move that in the future/this is acceptable for now, so going to approve :)
The Ruby buildpack is tightly coupled to the URLs generated by this repo. This change introduces an inventory/manifest file that can be used as a layer of indirection. This is a pattern used by several other buildpacks:
The file acts as a lookup, so I can say "find the entry with version 3.3.3 from
ruby_inventory.toml
forheroku-24
witharm64
CPU architecture and it will return an entry containing a URL and a checksum. The URL says where to find that artifact and the checksum validates that the artifact hasn't been modified between when it was added to the inventory and now.It also allows us to decouple modifying the S3 bucket with rolling out the version on Heroku via the buildpack. In effect this allows us to make an automated release pipeline (in the future) where source changes can be detected automatically on the ruby-lang source site and built and uploaded to S3. This will trigger a manifest entry to be added to the files which will (eventually) trigger a PR to the Ruby buildpack. A single engineer can review that, approve and merge it, and then deploy the change. Versus today, the process requires two engineers for compliance. (non-automated PRs require approval of a second engineer that didn't write the change).
Rebuild support
As we roll this out we have to preserve the current binary filenames or it will break existing buildpacks. However we also need to introduce a new filename to allow for multiple binary builds of the same version. Details:
Currently the same version of Ruby can be re-built which will over-write the existing version. This should produce the same output, but we've not done any work on byte-for-byte reproducible builds and even a single bit changed (due to a timestamp, for example) would cause a different SHA. That means that we cannot ship a manifest file that points directly at the current URLs i.e.
https://heroku-buildpack-ruby.s3.us-east-1.amazonaws.com/heroku-24/arm64/ruby-3.3.4.tgz
otherwise if the build is re-run the SHA will change and the inventory will no longer be valid.There is a need to be able to generate a binary multiple times in the event of a problem. A recent example is #49 where a jruby tar file was generated with truncated filenames.
This is solved by appending the first 7 characters of the sha to the file so now when 3.3.4 is built it will generate both:
heroku-24/arm64/ruby-3.3.4.tgz
heroku-24/arm64/ruby-3.3.4-<sha7>.tgz
likeheroku-24/arm64/ruby-3.3.4-7bebeee.tgz
The second tar file's checksum should never change. This will allow us to roll out a manifest file with checksum validation enabled and also allow us to re-generate binaries if needed.