Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose a SemanticDB provider in the semanticdb rule phase #60

Merged
merged 3 commits into from
Nov 12, 2024

Conversation

jadenPete
Copy link

This also makes the phase work under path mapping.

@jadenPete jadenPete requested a review from jjudd October 21, 2024 15:34
@jadenPete jadenPete self-assigned this Oct 21, 2024
Copy link

@jjudd jjudd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a pass through the review and things mostly LGTM. I have some concerns around performance, but no concerns about correctness.

)

def _semanticdb_directory_from_file(file):
return file.path[:file.path.find("META-INF") - 1]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work with path mapping? If so, can this be changed to use short_path instead, so we don't accidentally match on any config/architecture dependent parts of the path?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably should've mentioned this in a comment, but I acknowledge this is pretty janky and could match on the wrong part of the path. Unfortunately, we're pretty limited in what we can do here. From the documentation on Args#add_all:

To avoid unintended retention of large analysis-phase data structures into the execution phase, the map_each function must be declared by a top-level def statement; it may not be a nested function closure by default.

It does work with path mapping because this function will be called at execution time, not analysis time. In fact, this sort of approach is recommended for use with path mapping:

Generally speaking, all functions that return a path string that may contain a configuration prefix such as bazel-out/darwin-amd64-fastbuild/bin must only be called in map_each callbacks, where they are automatically path mapped by Bazel.

Using short_path doesn't work because the compiler expects the output base to be in the path as its working directory is not the output base. I can do this instead, though:

def _semanticdb_directory_from_file(file):
    return "{}/{}".format(file.root.path, file.short_path[:file.short_path.find("META-INF") - 1])


options.flatMap {
case s"-P:semanticdb:targetroot:$path" =>
case scala2SemanticDbTargetRootRegex(path) =>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be slow and it's going to get slower the more cases we have here. If we have lots of regexes here you could be evaluating a lot of regexes for every compiler option that has a path in it.

Could we instead make --compiler_option_referencing_path with a format of --compiler_option_referencing_path=<template_compiler_option> <path> to avoid the regex?

We could do something like --compiler_option_referencing_path=-P:semanticdb:targetroot:${path} <path> and then replace ${path} with the path. We do something vaguely similar to this already for --group in the deps checker.

That transforms this from something where, worst case, we evaluate n regexes to something where we evaluate 0 regexes and instead just do a string replace.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize reading more that this doesn't handle your -sourceroot case, but that can easily be handled with something like -sourceroot:${workdir} and then on every string we'd be doing two string replaces: one for the path and one for workdir.

If that's too slow then you could have a separate argument list for --compiler_option_referencing_workdir and just do the workdir there. That way you're not doing unnecessary string replace calls.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea; I didn't think about the case where we add more compiler options to this list. I'll do that.

List(s"-P:semanticdb:sourceroot:${workDir.toString}", s"-P:semanticdb:targetroot:${adjustStringPath(path)}")

case s"-semanticdb-target:$path" =>
case scala3SemanticDbTargetRootRegex(path) =>
List(s"-semanticdb-target:${adjustStringPath(path)}", s"-sourceroot:${workDir.toString}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to ask this when we added the semanticdb functionality, but does it matter if the sourceroot ceases to exist after the semanticdb call? Because there's no guarantee the sandbox directory will exist long term.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. SemanticDB information is separate from sources and from what I understand, the SemanticDB files don't include the source root path. The sandbox directory only needs to live as long as the compiler.

workRequest.plugins.map(p => s"-Xplugin:$p") ++
workRequest.compilerOptions ++
workRequest.compilerOptionsReferencingPaths,
).toArray,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the single .toArray optimization here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏


case s"-semanticdb-target:$path" =>
List(s"-semanticdb-target:${adjustStringPath(path)}", s"-sourceroot:${workDir.toString}")
private val pathPlaceholderRegex = """\$\{path\}""".r
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is used

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I'll remove it.

@jadenPete
Copy link
Author

Squashed.

@jadenPete jadenPete merged commit 3fe0077 into lucid-master Nov 12, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants