refactor: MEGA refactor #89

jjmaestro · 2024-09-19T00:32:48Z

Note

Stacked on top of #88

<PLACEHOLDER>: PR description will be a combination of all of the commits, but I wanted to get a draft PR first.

See bazelbuild/bazel#20369.

While working on the refactoring, I kept hitting rebase conflicts due to issues with the MODULE lock. I guess it's because the Bazel version in the e2e tests is much lower than the current one with the fix (7.2), but still, I don't think having the lock in e2e testing adds much.

PR GoogleContainerTools#73 added the `_resolve` and this breaks the buildozer fix / autofix

thesayyn · 2024-09-19T15:47:23Z

Please create these as individual PRs. We won't be reviewing a PR this big.

It seemed like 75afff9 in GoogleContainerTools#47 added the new locks but as new files, that is, the old ones were left behind.

Add support for MODULE.bazel to the lock script and avoid printing an unnecessary (and annoying 😅) error when building in a "modern repo".

Add a `nolock` attribute to avoid getting annoying DEBUG messages for repos that we explicitly want to run without a lock.

* make apt/tests more readable by factoring out the parameters
* add a "test suite macro" in each test file that groups all of the unit tests in the file and prepends a "test suite prefix". IMHO this is better than using `unittest.suite` because we provide better naming than the automated `_test<NUMBER>`, plus these better names are actual targets that can be executed one-by-one by name.

Add other nolock tests to exercise the package repos (the templates, etc).

* separate the script into a template file so it's easier to shellcheck and syntax highlight in editors.
* shellcheck the script and remove all SC2086 warnings ("Double quote to prevent globbing and word splitting")
* improve the buildozer help messages in the copy.sh script:
  * reduce duplication of the buildozer command
  * add a clearer autofix bazel run command that can be easily copy-pasted
* change some of the variable names in the copy.sh template for longer, easier to understand names (repo_name >> name, lock_label >> label)
* move repo_name and workspace_relative_path into variables to reduce line length and improve readability

While working with some flaky mirrors and trying to figure out why they were failing, I found the _fetch_package_index code a bit hard to follow, so here's my attempt at streamlining it a bit:

* Change the general flow of the for-loop so that we can directly set the reasons for failure in failed_attempts
* Remove integrity both as an argument and as a return value since neither is ever used.
* Return the content of the Packages index instead of the path, since we already have the repository_context. This way we don't need rctx anywhere else.
* Reword failure messages, adding more context and debug information
* Shorter lines and templated strings, trying to make the code easier to read and follow.
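To illustrate the flow described above, here is a minimal Python sketch (not the actual Starlark code from this PR). The `download` callable, the function signature, and the order of the `xz` / `gz` attempts are assumptions made for the example; the URL follows the standard Debian `dists/<dist>/<component>/binary-<arch>/` layout. Starlark has no exceptions, so the real code necessarily signals failure differently.

```python
def fetch_package_index(download, base_url, dist, comp, arch):
    """Return the text of the Packages index, trying each compression in turn.

    `download` is a hypothetical callable that returns the decompressed
    index content for a URL, or raises OSError when the attempt fails.
    """
    failed_attempts = []

    for ext in ("xz", "gz"):
        url = "{}/dists/{}/{}/binary-{}/Packages.{}".format(
            base_url, dist, comp, arch, ext
        )
        try:
            # Success: hand back the content itself, not a path.
            return download(url)
        except OSError as exc:
            # Record the reason right where the failure happens.
            failed_attempts.append("{}: {}".format(url, exc))

    raise RuntimeError(
        "Failed to fetch the Packages index for {}/{}/{}. Attempts:\n  {}".format(
            dist, comp, arch, "\n  ".join(failed_attempts)
        )
    )
```

Because the function returns content rather than a path, callers in this sketch never need the download handle again, which is the same motivation given above for not passing rctx around.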

* remove the `state` "intermediary `struct`" in `package_resolution.bzl` since it wasn't used / needed.
* refactor and move _set_dict from `util.bzl` to a `_package_set` method
* use `dict .get()` with default values instead of the "nested `if`s"
* rename `package()` to `package_get` and make it return all package versions when the version is not specified so we can remove `_package_versions`
* reorder `(name, version, arch)` args to match the order of the index keys `(arch, name, version)`
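A rough Python sketch of the index accessors described above, assuming the index is a plain nested dict keyed by `(arch, name, version)`. The function bodies are illustrative stand-ins, not the code in this PR, and `setdefault` is used here only because it is the shortest way to show the nesting.

```python
def package_set(index, arch, name, version, package):
    """Store `package` at index[arch][name][version], creating levels as needed."""
    index.setdefault(arch, {}).setdefault(name, {})[version] = package


def package_get(index, arch, name, version=None):
    """Return one package, or every known version when `version` is omitted."""
    # Chained .get() calls with defaults replace the nested `if`s.
    versions = index.get(arch, {}).get(name, {})
    if version is None:
        return list(versions.values())
    return versions.get(version)
```

With this shape, a separate "list all versions" helper is unnecessary: calling `package_get(index, "amd64", "bash")` without a version returns every stored version, and an unknown arch or name yields an empty list instead of an error.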

Add testing for package_index mocking the external / side effects (downloads, decompression, etc).

The package resolution debugging that e.g. checks the package dependencies should all be within package_resolution.bzl _resolve_all() and not "leak out" returning the information that's only needed for debugging / logging.

Also:

* reduce the verbosity of the optional dependencies warning by just printing one message per root package instead of one per package.
* break up the long lines to build the error messages and remove the "# buildifier: disable=print".

* move version constraint parsing from package_resolution to its own _parse_version_and_constraint method in version.bzl
* refactor _version_relop into a compare method in version.bzl plus a VERSION_OPERATORS dict so that (1) we use the operator strings everywhere and (2) we can use the keys to validate the operators.
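The idea of one operator table serving both comparison and validation can be sketched in Python as follows. This is an illustration under assumptions, not the version.bzl code: `_cmp` is a deliberately simplified stand-in that only orders dotted integers (real Debian version ordering also handles epochs, revisions and `~`), and the input shape accepted by `parse_version_and_constraint` is a guess. The five operator strings are the Debian relation operators.

```python
def _cmp(a, b):
    """Simplified stand-in: compares dotted integer versions only."""
    ta = tuple(int(part) for part in a.split("."))
    tb = tuple(int(part) for part in b.split("."))
    return (ta > tb) - (ta < tb)


# Maps each operator string to a predicate on the three-way comparison result.
VERSION_OPERATORS = {
    "<<": lambda c: c < 0,
    "<=": lambda c: c <= 0,
    "=": lambda c: c == 0,
    ">=": lambda c: c >= 0,
    ">>": lambda c: c > 0,
}


def compare(va, op, vb):
    """Evaluate `va op vb`, rejecting operators that are not in the table."""
    if op not in VERSION_OPERATORS:
        raise ValueError("unknown version operator: {!r}".format(op))
    return VERSION_OPERATORS[op](_cmp(va, vb))


def parse_version_and_constraint(text):
    """Split e.g. 'libc6 (>= 2.17)' into ('libc6', ('>=', '2.17'))."""
    text = text.strip()
    if "(" not in text:
        return text, None
    name, _, rest = text.partition("(")
    op, _, version = rest.rstrip(")").strip().partition(" ")
    # The dict keys double as the list of valid operators.
    if op not in VERSION_OPERATORS:
        raise ValueError("unknown version operator: {!r}".format(op))
    return name.strip(), (op, version.strip())
```

Note that this sketch expects a space between the operator and the version; a constraint written as `(>=2.17)` would be rejected as an unknown operator.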

Previously we had:

```starlark
pkgindex = package_index.new(rctx, sources = sources, archs = manifest["archs"])
pkgresolution = package_resolution.new(index = pkgindex)
```

And none of the code of package_resolution was used anywhere but in resolve.bzl and after initializing the `pkgindex`. Also, it makes sense since we are building the index from the manifest and once we have the index we use it to resolve the packages and populate the lock.

Clean up resolve.bzl and package_index.bzl by moving all of the manifest functionality to a separate manifest.bzl file where we now do all of the work to generate the lock: manifest parsing, validation and the package index and resolution. IMHO this is how it should be because the lock is the "frozen state" of the manifest.

* _parse() parses the YAML
* _from_dict validates the manifest dict and does the rest of the changes that we need to produce a manifest struct
* add extra validation for e.g. duplicated architectures
* _lock is the only method that's exposed to the outside and it encapsulates all of the other parts, calling _from_dict and all of the package index and resolution, to produce the lock file.
* move get_dupes to util.bzl
* refactor the "source" struct into the new manifest where we can now centralize a lot of the structure and logic spread across multiple parts of the code.
* remove yq_toolchain_prefix since it's always "yq" and, looking at GH code search, this seems to be a copy-paste leftover from rules_js (or the other way around)... the code is always the same and it never receives a string different from "yq".
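As an illustration of the duplicate-architecture validation mentioned above, here is a small Python sketch. `get_dupes` shares its name with the helper moved to util.bzl, but this body is an assumption about what such a helper does; `validate_archs` and its error messages are hypothetical.

```python
def get_dupes(items):
    """Return each item that appears more than once, in first-repeat order."""
    seen = set()
    dupes = []
    for item in items:
        if item in seen and item not in dupes:
            dupes.append(item)
        seen.add(item)
    return dupes


def validate_archs(manifest):
    """Hypothetical check on the `archs` key of an already-parsed manifest dict."""
    archs = manifest.get("archs", [])
    if not archs:
        raise ValueError("manifest must list at least one architecture")
    dupes = get_dupes(archs)
    if dupes:
        raise ValueError("duplicated architectures: {}".format(", ".join(dupes)))
    return archs
```

Failing early here means a manifest such as `archs: [amd64, arm64, amd64]` is rejected before any index is downloaded or resolved.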

* move all of the "package logic" to pkg.bzl
* The v2 lockfile format:
  * doesn't need the fast_package_lookup dict because it's already using a dict to store the packages.
  * has the dependencies sorted so the lockfile now has stable serialization and the diffs of the lock are actually usable and useful to compare with the changes to the manifest.
  * removes the package and dependency key from the lockfile, now it's done via an external function (make_deb_import_key in deb_import.bzl)
* Remove add_package_dependency from the lockfile API. Now, the package dependencies are passed as an argument to add_package. This way, the lockfile functionality is fully contained in lockfile.bzl and e.g. we can remove the "consistency checks" that were only needed because users could forget to add the dependency as a package to the lockfile.
* Ensure backwards-compatibility by internally converting lock v1 to v2. Also, when a lock is set and it's in v1 format, there's a reminder that encourages the users to run @repo//:lock to update the lockfile format.
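To make the v1 to v2 conversion and the stable serialization concrete, here is a Python sketch. The field names and both lock layouts below are hypothetical, chosen only to show the three properties listed above (packages stored in a dict, dependencies sorted, no stored keys); the real formats live in lockfile.bzl.

```python
import json


def lock_v1_to_v2(lock):
    """Convert a (hypothetical) v1 lock dict to a (hypothetical) v2 layout."""
    if lock.get("version") == 2:
        return lock

    packages = {}
    for pkg in lock["packages"]:
        # Drop the stored "key" fields and sort so output order is deterministic.
        deps = sorted(
            [
                {"name": dep["name"], "version": dep["version"]}
                for dep in pkg.get("dependencies", [])
            ],
            key=lambda dep: (dep["name"], dep["version"]),
        )
        packages.setdefault(pkg["arch"], {})[pkg["name"]] = {
            "version": pkg["version"],
            "dependencies": deps,
        }
    return {"version": 2, "packages": packages}


def dumps(lock):
    """Serialize with sorted keys so equal locks produce identical text."""
    return json.dumps(lock, indent=2, sort_keys=True) + "\n"
```

In this sketch, two v1 locks that differ only in the order their dependencies were recorded serialize to the same v2 text, so a diff of the lock only shows changes that come from the manifest.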

By separating the migration from the previous commit we get to:

1. in the previous commit, run all tests with the new code while locks are still v1
2. update the locks in this commit to v2 so we can then re-run all tests in the final state.

…t.bzl

Refactor the package repo templates into their own methods and massively clean up the `for`-loop in `_deb_package_index_impl`. IMHO overall there's now a much better and clearer separation of concerns between the "index repo" (`apt/private/index.bzl`) and the "package repos" (`apt/private/deb_import.bzl`).

jjmaestro · 2024-09-19T16:59:36Z

@thesayyn ok, will do!

But just FYI I thought it would actually be much easier to do it this way since I've already separated the small fixes in #87, the small features in #88, then all of #89 is literally refactoring without adding any functionality, and then #90 has the flat repo feature + NVIDIA fix.

Here, each commit is small and "self-contained": it aims to clean up or improve a specific part and has a proper commit message explaining everything done in it (or so I hope!), so it can easily be reviewed one-by-one in the PR.

Anyway, I'll start breaking it up then!

jjmaestro added 2 commits September 18, 2024 16:46

fix: repo name in copy.sh script

35003bb

PR GoogleContainerTools#73 added the `_resolve` and this breaks the buildozer fix / autofix

jjmaestro mentioned this pull request Sep 19, 2024

feat: support flat repos #90

Closed

jjmaestro added 16 commits September 19, 2024 17:20

fix: remove dead locks

847f6e8

It seemed like 75afff9 in GoogleContainerTools#47 added the new locks but as new files, that is, the old ones were left behind.

feat: add support for MODULE.bazel to the lock copy.sh script

1235379

Add support for MODULE.bazel to the lock script and avoid printing an unnecessary (and annoying 😅) error when building in a "modern repo".

feat: avoid DEBUG messages for lockless repos

55c332a

Add a `nolock` attribute to avoid getting annoying DEBUG messages for repos that we explicitly want to run without a lock.

test: add a bullseye_nolock package to the tests

a48fe7d

Add other nolock tests to exercise the package repos (the templates, etc).

tests: package_index_test

626b53e

Add testing for package_index mocking the external / side effects (downloads, decompression, etc).

chore: migrate repo locks to v2

3de33d8

By separating the migration from the previous commit we get to:

1. in the previous commit, run all tests with the new code while locks are still v1
2. update the locks in this commit to v2 so we can then re-run all tests in the final state.

jjmaestro force-pushed the refactor-mega-refactor branch from ec818c3 to 6fb7ef8 September 19, 2024 16:24

jjmaestro closed this Sep 19, 2024

jjmaestro deleted the refactor-mega-refactor branch September 19, 2024 17:51
