Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[build] Support bzlmod for building and installing #22271

Merged
merged 1 commit into from
Dec 10, 2024

Conversation

jwnimmer-tri
Copy link
Collaborator

@jwnimmer-tri jwnimmer-tri commented Dec 4, 2024

Towards #20731. See https://bazel.build/external/migration and https://bazel.build/external/overview#bzlmod for background. As of the Bazel 9 release (late 2025), Bazel's (already deprecated) support for repository rules will be removed entirely, so Drake needs to port to bzlmod over the next ~12 months. This is the first step -- adding a MODULE.bazel file and some related tweaks. Future work involves converting our externals from repository rules to modules.

Note that this commit does not pass all tests when bzlmod is enabled; doing so requires enhancements to our runfiles handling (#22269).


This change is Reviewable

@jwnimmer-tri jwnimmer-tri added priority: medium release notes: none This pull request should not be mentioned in the release notes status: do not review labels Dec 4, 2024
jwnimmer-tri

This comment was marked as outdated.

@jwnimmer-tri

This comment was marked as outdated.

@jwnimmer-tri
Copy link
Collaborator Author

@drake-jenkins-bot linux-jammy-unprovisioned-gcc-wheel-experimental-release please

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 21 files at r1, 8 of 10 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: needs platform reviewer assigned, needs at least two assigned reviewers

@jwnimmer-tri
Copy link
Collaborator Author

@drake-jenkins-bot mac-arm-sonoma-unprovisioned-clang-wheel-experimental-release please

@xuchenhan-tri
Copy link
Contributor

+@xuchenhan-tri

Copy link
Contributor

@xuchenhan-tri xuchenhan-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkpoint. I'll come back on Monday for the test.

Reviewed 10 of 21 files at r1, 7 of 10 files at r2, 3 of 4 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: 6 unresolved discussions, LGTM missing from assignee xuchenhan-tri(platform), needs at least two assigned reviewers


-- commits line 15 at r4:
BTW, I can't seem to observe the effect of this.

I tried running bazel run //:install with and without this commit and melded the results.

Code quote:

- Adjust install rule so installed paths are "drake" not "_main".

CMakeLists.txt line 548 at r4 (raw file):

file(CREATE_LINK "${PROJECT_SOURCE_DIR}/.bazeliskrc" drake_build_cwd/.bazeliskrc SYMBOLIC)
file(CREATE_LINK "${PROJECT_SOURCE_DIR}/MODULE.bazel" drake_build_cwd/MODULE.bazel SYMBOLIC)
file(CREATE_LINK "${PROJECT_SOURCE_DIR}/WORKSPACE" drake_build_cwd/WORKSPACE SYMBOLIC)

BTW, why do we still need the WORKSPACE file now that bzlmod is enabled for CMake?


tools/workspace/default.bzl line 112 at r4 (raw file):

load("//tools/workspace/zlib:repository.bzl", "zlib_repository")

# This is the list of modules that our MODULE.bazel already incorporates.

nit this is not exactly accurate. There's also rules_java that's module only.

Suggestion:

This is the list of modules to be excluded when bzlmod is enabled because our MODULE.bazel already incorporates them.

tools/py_toolchain/BUILD.bazel line 26 at r4 (raw file):

)

py_exec_tools_toolchain(

BTW, TBH, I still haven't quite understand the toolchain change. Can you explain again?


tools/workspace/rules_python/repository.bzl line 29 at r4 (raw file):

        name,
        mirrors = None,
        _constants_only = False):

BTW, it took me a while (and the context of a previous PR) to understand this argument (and why it's named as such). It needs some documentation.


MODULE.bazel line 18 at r4 (raw file):

bazel_dep(name = "rules_java", version = "8.6.1")
bazel_dep(name = "rules_license", version = "1.0.0")
bazel_dep(name = "rules_python", version = "0.40.0")

BTW, is this an exhaustive list of all of our external dependencies that are registered in the Bazel Central Registry?

Copy link
Collaborator Author

@jwnimmer-tri jwnimmer-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r3, 1 of 1 files at r4, 3 of 3 files at r5, all commit messages.
Reviewable status: 5 unresolved discussions, LGTM missing from assignee xuchenhan-tri(platform), needs at least two assigned reviewers


-- commits line 15 at r4:

Previously, xuchenhan-tri wrote…

BTW, I can't seem to observe the effect of this.

I tried running bazel run //:install with and without this commit and melded the results.

Yes, this commit preserves the current behavior.

If you revert the changes to the second hunk of the drake/BUILD.bazel patch (the install(name = "install_files", ...) thing) while keeping the rest of the PR the same (i.e., with --enable_bzlmod=true), then I believe you would see the install tree have a couple of files in _main instead of drake.

It's possible that only a cmake ../drake && make install shows the erroneous behavior (running the install from CMake vs Bazel) but IIRC both were broken.


CMakeLists.txt line 548 at r4 (raw file):

Previously, xuchenhan-tri wrote…

BTW, why do we still need the WORKSPACE file now that bzlmod is enabled for CMake?

Without a WORKSPACE file to start with, it seems that Bazel (in certain cases?) does not even look for the WORKSPACE.bzlmod file. If we remove this line, a subset of the CI builds will fail (IIRC only one of the pre-merge jobs, but several of the post-merge ones). In other words, it serves as a sentinel, but never gets run.


MODULE.bazel line 18 at r4 (raw file):

Previously, xuchenhan-tri wrote…

BTW, is this an exhaustive list of all of our external dependencies that are registered in the Bazel Central Registry?

Not yet. After the next PR (#22269), I believe we'll have all of the starlark dependencies from BCR (i.e., rules and toolchains) listed in MODULE.bazel.

The general idea of this whole train is that we will incrementally keep adding more and more dependencies into MODULE.bazel (so that add_default_workspace(bzlmod = True) does less and less work -- by growing the list of REPOS_ALREADY_PROVIDED_BY_BAZEL_MODULES) until we can delete WORKSPACE.bzlmod entirely, at which point we are purely either 100% workspace or 100% bzlmod. At that point (or maybe a little sooner), we can deprecate our support for workspace for eventual removal entirely.

I could put some of that backstory into either the issue, or a comment in this file, if that helps?


tools/py_toolchain/BUILD.bazel line 26 at r4 (raw file):

Previously, xuchenhan-tri wrote…

BTW, TBH, I still haven't quite understand the toolchain change. Can you explain again?

This one is slightly different than RobotLocomotion/drake-external-examples#344.

Over there, we disabled the exec interpreter. Here, this is the magic incantation for "make the exec interpreter be the same as the normal interpreter". (This stanza might have worked over there, too, it just took me a while to find this one so I never tried it in DEE.)

The exec toolchain is described a bit here: https://rules-python.readthedocs.io/en/latest/toolchains.html#exec-tools-toolchain-type

Basically, it's the python interpreter which is used internal to the build, as opposed to the normal python interpreter that we're targeting (e.g., the one used to run/test pydrake and its tutorials).


tools/workspace/default.bzl line 112 at r4 (raw file):

Previously, xuchenhan-tri wrote…

nit this is not exactly accurate. There's also rules_java that's module only.

Actually, that's a good point. If I adjust this list to exactly match MODULE.bazel, then the test will get a smidge simpler and the comment here doesn't need to dance around the laser beams.

Done.


tools/workspace/rules_python/repository.bzl line 29 at r4 (raw file):

Previously, xuchenhan-tri wrote…

BTW, it took me a while (and the context of a previous PR) to understand this argument (and why it's named as such). It needs some documentation.

Done

@jwnimmer-tri
Copy link
Collaborator Author

+@ggould-tri for platform review per schedule (tomorrow), please.

Copy link
Contributor

@xuchenhan-tri xuchenhan-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r5, all commit messages.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee ggould-tri(platform)


-- commits line 15 at r4:

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

Yes, this commit preserves the current behavior.

If you revert the changes to the second hunk of the drake/BUILD.bazel patch (the install(name = "install_files", ...) thing) while keeping the rest of the PR the same (i.e., with --enable_bzlmod=true), then I believe you would see the install tree have a couple of files in _main instead of drake.

It's possible that only a cmake ../drake && make install shows the erroneous behavior (running the install from CMake vs Bazel) but IIRC both were broken.

Ok, I can reproduce the error from bazel build. I don't understand why the diff fixes the error though... Care to explain (perhaps in the BUILD file as comments)?


CMakeLists.txt line 548 at r4 (raw file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

Without a WORKSPACE file to start with, it seems that Bazel (in certain cases?) does not even look for the WORKSPACE.bzlmod file. If we remove this line, a subset of the CI builds will fail (IIRC only one of the pre-merge jobs, but several of the post-merge ones). In other words, it serves as a sentinel, but never gets run.

Ok, interesting. Thanks!


MODULE.bazel line 18 at r4 (raw file):

I could put some of that backstory into either the issue, or a comment in this file, if that helps?

I don't think that's necessary, but certainly go ahead and do that if you think it might help somebody.

The reason that I asked is because the list seems so much shorter than our full set of dependencies...


tools/py_toolchain/BUILD.bazel line 26 at r4 (raw file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

This one is slightly different than RobotLocomotion/drake-external-examples#344.

Over there, we disabled the exec interpreter. Here, this is the magic incantation for "make the exec interpreter be the same as the normal interpreter". (This stanza might have worked over there, too, it just took me a while to find this one so I never tried it in DEE.)

The exec toolchain is described a bit here: https://rules-python.readthedocs.io/en/latest/toolchains.html#exec-tools-toolchain-type

Basically, it's the python interpreter which is used internal to the build, as opposed to the normal python interpreter that we're targeting (e.g., the one used to run/test pydrake and its tutorials).

I think I understood. Maybe I'll come to appreciate this more fully another day.

Copy link
Collaborator Author

@jwnimmer-tri jwnimmer-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r6, all commit messages.
Reviewable status: LGTM missing from assignee ggould-tri(platform)


-- commits line 15 at r4:

Previously, xuchenhan-tri wrote…

Ok, I can reproduce the error from bazel build. I don't understand why the diff fixes the error though... Care to explain (perhaps in the BUILD file as comments)?

Done (new comment in BUILD.bazel).


MODULE.bazel line 18 at r4 (raw file):

Previously, xuchenhan-tri wrote…

I could put some of that backstory into either the issue, or a comment in this file, if that helps?

I don't think that's necessary, but certainly go ahead and do that if you think it might help somebody.

The reason that I asked is because the list seems so much shorter than our full set of dependencies...

Indeed, a TODO would be very helpful to future readers. Added that now.


tools/py_toolchain/BUILD.bazel line 26 at r4 (raw file):

Previously, xuchenhan-tri wrote…

I think I understood. Maybe I'll come to appreciate this more fully another day.

I probably should have also said -- most acutely, this exec interpreter is the interpreter used by the mkdoc tool to produce documentation_pybind.h; if it ends up being set incorrectly (i.e., if we don't set it here) then we can't even compile pydrake.

Copy link
Contributor

@xuchenhan-tri xuchenhan-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r6, all commit messages.
Reviewable status: LGTM missing from assignee ggould-tri(platform)

Copy link
Collaborator Author

@jwnimmer-tri jwnimmer-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: LGTM missing from assignee ggould-tri(platform)

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong: ; I learned a lot from this PR.

Reviewed 2 of 3 files at r5, 4 of 4 files at r6, all commit messages.
Reviewable status: 4 unresolved discussions


a discussion (no related file):
minor: The name MODULE.bazel doesn't match the patterns of bazel_lint.bzl and so is excluded from buildifier linting. Not sure if that's intentional but I suspect it isn't.


tools/skylark/drake_py.bzl line 153 at r6 (raw file):

            main = main,
            deps = deps,
            precompile = "disabled",  # To avoid *.pyc conflicts.

minor: I don't know what this means. Is this a reference to the last item of https://rules-python.readthedocs.io/en/stable/precompiling.html ?

(It seems harmless, just mysterious)

Code quote:

# To avoid *.pyc conflicts.

tools/workspace/default.bzl line 390 at r6 (raw file):

        toolchain_excludes = [],
        mirrors = DEFAULT_MIRRORS,
        bzlmod = False):

BTW: In the next file, in _canonicalize_workspace, you rely on bzlmod being explicitly set by name; this default therefore cannot be used correctly in the obvious case.

But possibly this default is meant to be a cmake-only affordance? But given that cmake is already getting anomalous treatment, it might be better to fix the at most one implicated callsite.

Author's discretion, but IMO considering that we're string-matching calls to this function we should probably be explicit.

Suggestion:

        *, bzlmod):

MODULE.bazel line 25 at r6 (raw file):

# TODO(#20731) Move all of our dependencies from WORKSPACE.bzlmod into this file, so
# that downstream projects can consume Drake exclusively via bzlmod (and so that we
# can delete our WORKSPACE files prior to Bazel 9 which drops suppose for it).

BTW, this comment is wrapped to 85 characters.

Code quote:

# TODO(#20731) Move all of our dependencies from WORKSPACE.bzlmod into this file, so
# that downstream projects can consume Drake exclusively via bzlmod (and so that we
# can delete our WORKSPACE files prior to Bazel 9 which drops suppose for it).

See https://bazel.build/external/migration for background.

Note that this commit does not pass all tests when bzlmod is enabled;
doing so requires enhancements to our runfiles handling, so for now
bzlmod is still disabled by default (except for CMake installs).

Details:
- Add MODULE.bazel and WORKSPACE.bzlmod for bzlmod use.
  - Note that WORKSPACE.bzlmod is a copy of WORKSPACE with a tweaked
    overview comment and a single functional change to our bzlmod flag.
  - Fix latent bug in repository_excludes for rules_python.
- Adjust install rule so installed paths are "drake" not "_main".
- Disable drake_py add_test_rule pyc files (for newer rules_python).
- Tweak local_config_cc spelling again for bzlmod.
- Add linter for module<->workspace syncing.
- Turn on bzlmod for CMake to achieve test coverage in CI.
Copy link
Collaborator Author

@jwnimmer-tri jwnimmer-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r7, all commit messages.
Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees ggould-tri(platform),xuchenhan-tri(platform)


tools/skylark/drake_py.bzl line 153 at r6 (raw file):

Previously, ggould-tri wrote…

minor: I don't know what this means. Is this a reference to the last item of https://rules-python.readthedocs.io/en/stable/precompiling.html ?

(It seems harmless, just mysterious)

I added more commentary.

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r7, all commit messages.
Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees ggould-tri(platform),xuchenhan-tri(platform)

@jwnimmer-tri jwnimmer-tri merged commit 6914106 into RobotLocomotion:master Dec 10, 2024
10 checks passed
@jwnimmer-tri jwnimmer-tri deleted the bzlmod-install branch December 10, 2024 17:01
RussTedrake pushed a commit to RussTedrake/drake that referenced this pull request Dec 15, 2024
…2271)

See https://bazel.build/external/migration for background.

Note that this commit does not pass all tests when bzlmod is enabled;
doing so requires enhancements to our runfiles handling, so for now
bzlmod is still disabled by default (except for CMake installs).

Details:
- Add MODULE.bazel and WORKSPACE.bzlmod for bzlmod use.
  - Note that WORKSPACE.bzlmod is a copy of WORKSPACE with a tweaked
    overview comment and a single functional change to our bzlmod flag.
  - Fix latent bug in repository_excludes for rules_python.
- Adjust install rule so installed paths are "drake" not "_main".
- Disable drake_py add_test_rule pyc files (for newer rules_python).
- Tweak local_config_cc spelling again for bzlmod.
- Add linter for module<->workspace syncing.
- Turn on bzlmod for CMake to achieve test coverage in CI.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: medium release notes: none This pull request should not be mentioned in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants