Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path mapping #70

Merged
merged 17 commits into from
Nov 18, 2024
Merged

Path mapping #70

merged 17 commits into from
Nov 18, 2024

Conversation

jjudd
Copy link

@jjudd jjudd commented Nov 15, 2024

This gets us path mapping support and a handful of other niceities

…apping

Path mapping requires a custom Java toolchain where
javac_supports_worker_multiplex_sandboxing is set to True.

There was an issue where rules_jvm_external came before rules_java in
the WORKSPACE and it called the default rules_java toolchain setup
functions, preventing our custom toolchain from being used correctly.

This in turn broke path mapping.
…ferent configurations

Currently we compare the absolute path in the deps checker. This causes
us problems when there are configuration specific parts of one input and
not the other. This happens when there we've written a configuration
specific part of a path to disk and are comparing it to a path mapped
argument.

I imagine this is not the only place we'll run into issues with this.
@jjudd jjudd requested a review from jadenPete November 15, 2024 12:41
# rules_license
rules_license_tag = "1.0.0"

http_archive(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we using this for?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dependency depended upon this and without setting the version explicitly things were failing. I'm very much looking forward to moving to bzlmod, so we don't have to play this game as much.

@@ -16,7 +16,7 @@ def phase_coverage_jacoco(ctx, g):
return

toolchain = ctx.toolchains["//rules/scala:toolchain_type"]
worker_inputs, _, worker_input_manifests = ctx.resolve_command(
worker_inputs, _ = ctx.resolve_tools(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, can you explain what this does?
How is it different from passing toolchain.code_coverage_configuration.instrumentation_worker in the tools argument of the action below?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolve_command returns a 3 element tuple and we never used the last element. The last element is also empty. This method also requires bash to be installed on Windows.

The Bazel doc also says to consider using resolve_tools if that meets your needs, so I moved to that instead. https://bazel.build/rules/lib/builtins/ctx#resolve_command

In contrast to ctx.resolve_command, this method does not require that Bash be installed on the machine, so it's suitable for rules built on Windows.

Copy link

@jadenPete jadenPete Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why ctx.resolve_tools exists at all? The Bazel documentation didn't make that very clear. In other words, how is this:

worker_inputs, _ = ctx.resolve_tools(
    tools = [toolchain.code_coverage_configuration.instrumentation_worker],
)

ctx.actions.run(
    ...,
    inputs = ... + worker_inputs.to_list(),
)

different from this:

ctx.actions.run(
    ...,
    tools = [toolchain.code_coverage_configuration.instrumentation_worker.files_to_run],
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, doesn't Bazel automatically add the runfiles of the executable to the list of files needed by the action? Why is it necessary to call ctx.resolve_tools and add the files it outputs to the inputs of the action?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have a point. FWIW here's a message from a Bazel Slack thread:

Based on a quick reading of the code, I think that the tools attribute of ctx.actions.run does pretty much the same thing as manually calling ctx.resolve_tools and passing the result to inputs and input_manifests: In both cases, the FilesToRunProvider of the individual targets is checked and its runfiles are added.

It only matters if you accept e.g. a list of labels considered tools (for example when implementing a genrule-like rule).

Here's the design doc as well: https://docs.google.com/document/d/1xPsvTY-vWav9zX--7ieXjUilcl7M46m-88_oNV0QhEU/edit?tab=t.0#heading=h.5mcn15i0e1ch

Link to the Slack thread: https://bazelbuild.slack.com/archives/CA31HN1T3/p1684317657372349

outputs = [in_out_pair[1] for in_out_pair in in_out_pairs],
executable = toolchain.code_coverage_configuration.instrumentation_worker.files_to_run.executable,
input_manifests = worker_input_manifests,
executable = toolchain.code_coverage_configuration.instrumentation_worker.files_to_run,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If toolchain.code_coverage_configuration.instrumentation_worker.files_to_run.executable were provided, would we need to add worker_inputs to the inputs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx.actions.run can accept a FileToRun provider, so there was no need for us to be passing in the .executable. This code is equivalent, but just uses less code. https://bazel.build/rules/lib/builtins/actions#run

We were passing in a File with the .executable, but the FilesToRun provider works as well.

I figure if it works with less code, then great. No downsides as far as I know.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we compare the absolute path in the deps checker. This causes
us problems when there are configuration specific parts of one input and
not the other. This happens when there we've written a configuration
specific part of a path to disk and are comparing it to a path mapped
argument.

I wonder if this is why the dependency checking tests have been flaky, and why those tests are now succeeding on this branch. By the way, you may want to re-enable tests/dependencies/indirect/test.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind; you did that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally thought the same thing, but the dependency checking tests were flaky because of the multiplex sandbox bug. I think there's a Bazel bug and have seen some reports on the Bazel GitHub that seem suspiciously similar to what we're running into. At this point, I'm planning to try things out with Bazel 8 and see if the bug persists. I hope it just goes away with Bazel 8.

rules/scala.bzl Outdated
@@ -408,7 +408,7 @@ _scala_repl_private_attributes = _dicts.add(
_runtime_private_attributes,
{
"_runner": attr.label(
cfg = "host",
cfg = "exec",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be host because this is run when the target is run, not when it's built?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cfg = "host" is going away. https://bazel.build/reference/command-line-reference#flag--incompatible_disable_starlark_host_transitions

Which is why I made these changes.

If set to true, rule attributes cannot set 'cfg = "host"'. Rules should set 'cfg = "exec"' instead.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. Shouldn't it be cfg = "target"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I think you're right. I'll change this.

if g.classpaths.srcs:
args.add_joined("--srcs", g.classpaths.srcs, join_with = " ")
else:
fail("Empty srcs list passed to bootstrap compiler")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until now, our rules have supported an empty source list. Shouldn't it continue to be valid? Or does the Scala compiler not support this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove the fail here. I think whether it will work depends on what the other arguments are.

shell_args = ctx.actions.args()
shell_args.add(ctx.executable._zipper)
shell_args.add_all([gendir], expand_directories = False)
shell_args.add(gendir.short_path)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be shell_args.add([gendir], map_each = _short_path)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand things, short_path doesn't include any configuration specific parts of the path, so it doesn't need to be done the same way.

From the state of path mapping thread:

Generally speaking, all functions that return a path string that may contain a configuration prefix such as bazel-out/darwin-amd64-fastbuild/bin must only be called in map_each callbacks, where they are automatically path mapped by Bazel.

and https://bazel.build/rules/lib/builtins/File#short_path

The path of this file relative to its root. This excludes the aforementioned root, i.e. configuration-specific fragments of the path. This is also the path under which the file is mapped if it's in the runfiles of a binary.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. We discussed this over our call and I understand better what short_path does now.

args = ctx.actions.args()
args.add(ctx.file._runner)
args.add(ctx.workspace_name)
args.add(manifest.short_path)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be args.add([manifest], map_each = _short_path).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment response as above.

args = ctx.actions.args()
args.add(ctx.file._testrunner)
args.add(ctx.workspace_name)
args.add(manifest.short_path)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be args.add([manifest], map_each = _short_path).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same response as above.

@@ -4,6 +4,6 @@
echo "class A$RANDOM" > Example.gen.scala

rm -fr "$(bazel info execution_root)/.bazel-zinc"
bazel build --noworker_sandboxing --worker_extra_flag=ScalaCompile=--persistence_dir=.bazel-zinc :lib
bazel build --experimental_output_paths=off --noworker_sandboxing --worker_extra_flag=ScalaCompile=--persistence_dir=.bazel-zinc :lib

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why'd we turn this setting off?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not remember and will poke at this. I think this test fails with path mapping on, but I could be wrong about that.

In general the incremental Zinc compilation tests just disable everything like path mapping, sandboxing, etc. in order to function.

That said, now that I know everything else is working I'll go try the test with path mapping on and see what happens.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed that path mapping needs to be disabled for the incremental compilation stuff to work at all as it reaches outside Bazel's sandbox and thus things like path mapping trip it up.

I have some code in phase_zinc_compile that should disable path mapping if using a toolchain that has incremental compilation enabled. Problem is, we're just using the default toolchain in this test and using command line options to disable sandboxing and path mapping.

I don't think I care enough about the incremental compilation stuff to improve this as we're likely ripping that feature out soon.

We replace the string ${workDir} with the workDir path. Problem is,
sometimes the workDir path is " " (the current directory). If that's the
case, then we replace ${workDir} with a blank string. Using an absolute
path prevents that blank string from happening.
Because we're using strings, names like zinc_3 would likely conflict in
other repos.
For some reason multiplex sandboxing seems to have issues with files
going missing. It's been happening for a while, but intermittently. It
got much worse recently.

Not sure if this is a rule set bug or a Bazel bug. Regardless, disabling
multiplex sandboxing fixes it.
It runs on the target architecture rather than the exec architecture, so
this makes more sense.
@jjudd jjudd force-pushed the path-mapping-stacked-toolchain branch from c33607b to ee6bbb7 Compare November 16, 2024 02:06
@jjudd jjudd merged commit ae550b8 into lucid-master Nov 18, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants