Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework how bazel stamping work in the repository #20881

Merged
merged 10 commits into from
Feb 2, 2024

Conversation

pamaury
Copy link
Contributor

@pamaury pamaury commented Jan 18, 2024

This PR reworks completely how stamping works in the repository:

  • It introduces stamp.bzl which is a module to handle stamping. Notably, it allows rules to add stamp parameters
    that behaves like the usual bazel conversion (0 means no stamping, 1 is always stamping, -1 follows the command line).
  • It fixes several python tool and bazel rule to use this module so that they only stamp if required to do so.
  • It removes some targets that were always stamping and have now become useless

What does this change do:
With this PR, there are two main changes:

  • Certain headers generated by reggen (autogen_hjson_header, the tock files) do not contain stamping information anymore by default. If one requires stamping, see below.
  • The topgen code has been updated but it is not called direcly by bazel (it is called by make) so the default behaviour remains the same and this does not affect bazel caching anyway.
  • Opentitantool is not stamped anymore by default so the version command will print:
{
  version: null,
  revision: null,
  clean: null
}

The build timestamp is removed also since it adds no useful information.

Open question:
The chip_info section of the ROM still includes the git sha by default. We could change this behaviour by setting stamp = -1 in the definition of autogen_chip_info_src but the rom_chip_info.py script will error out because it always exactly a valid SHA to embed. This is problematic though because it will prevent proper caching of the ROM and the tests. I think we should not stamp by default but I see at least two options to do that:

  • make the rom_chip_info.py script use a default value if none is provided: this is easy but implicit/hidden and easy to forget, I am not a fan of this idea.
  • modify the stamping code so that we explicitely pass a default version file when stamping is disabled and we store a default value of the sha in the ROM when stamping is disabled. This produces the same effect but with the added benefit that this is a lot more explicit. I envision something like this:
autogen_chip_info_src = rule(
    implementation = _chip_info_src,
    attrs = {
        "_tool": attr.label(
            default = "//util:rom_chip_info",
            executable = True,
            cfg = "exec",
        ),
    } | stamp_attr(-1, "//rules:stamp_flag", "//path/to/default/version/file/for/rom_chip_info"),

Summary
I believe thoses changes are acceptable since by default (i.e. in CI or locally), embedding this information does not really make sense. However, when doing a release, we definitely want to have this information. We need to make sure that various release flows add the --stamp parameters to recover the previous behaviour. Alternatively, we could make stamping the default and add --nostamp in CI.

@pamaury pamaury requested review from nbdd0121 and jwnrt January 18, 2024 15:58
@pamaury pamaury changed the title Bazel stamp Rework how bazel stamping work in the repository Jan 18, 2024
Copy link
Contributor

@jwnrt jwnrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes LGTM but I'm not sure I understand what happens when stamping is disabled. Will parse_version_file not find the version file and return an empty dictionary? How does that translate to the data that gets inserted into files, i.e. what will this Tock comment look like?

https://github.com/lowRISC/opentitan/blob/master/util/reggen/gen_tock.py#L398

@pamaury
Copy link
Contributor Author

pamaury commented Jan 22, 2024

Yes parse_version_file will return an empty dictionary and it will print something like:

// Generated register constants for rv_plic.
// Built for <unknown>
// https://github.com/lowRISC/opentitan/tree/<unknown>
// Tree status: <unknown>
// Build date: <something>

I think this is a bit ugly so my plan is to go through the python code and make them not generate those string at all if there is no information to display. I think this is more reasonable than displaying the above, what do you think?

@jwnrt
Copy link
Contributor

jwnrt commented Jan 22, 2024

Yes, I agree this sounds like a good plan.

The one case you may have issues with is that year that gets inserted into the Tock copyright header. I think the correct year for that header should be the year that the template for the generated file was written by a human anyway and it shouldn't be updated every time the file is generated.

Anyway, maybe that's not an issue and you can just leave the year out entirely.

@pamaury
Copy link
Contributor Author

pamaury commented Jan 22, 2024

Yes, in the current code (not yet pushed), I am leaving out the year entirely. Anyway, most files in our repository do not include a year in the copyright.


with open(path, "rt") as f:
version = f.read().strip()
version = version_info.get('BUILD_SCM_REVISION', "8badF00d")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer the current approach where a version file is produced to them, instead of having all these tools having to parse volatile-status.txt.

I think having an extra step in the middle (like we currently do), have an advantage that if we need stamping, we can still have some cache hit if some bits of volatile-status.txt changed, but the needed bits happen to stay the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment though, we have a mix of tools using those "derived" files and some other parsing volatile-status.txt (or a copy thereof). This also means that we would have to use "fake" values when stamping is disabled which I don't find to be particularly useful.
Also, some tools like reggen use at least four different pieces of information so that would require to pass four different files on the command-line which I am not sure is much better than passing a single file with a known format?
What I am proposing is that all tools use the same format which is parsed by the same code contained in a python library so there is no code duplication.

rules/stamp.bzl Show resolved Hide resolved
Several tools need to extract some information from bazel's version
file so create small library for that purpose to avoid duplicating
code.

Signed-off-by: Amaury Pouly <[email protected]>
The purpose of this module is to enabled rules to detect whether
stamping is enabled on the command-line or not. It requires some
hackery inspired by a bazel bug thread and which is very similar
to how rules_rust handles this as well.

Signed-off-by: Amaury Pouly <[email protected]>
With this change, `autogen_hjson_header` will now only pass
stamping information to regtool if required on the command-line.

Signed-off-by: Amaury Pouly <[email protected]>
util/topgen/rust.py Outdated Show resolved Hide resolved
@pamaury
Copy link
Contributor Author

pamaury commented Jan 23, 2024

I have done some fixes to topgen and also remove the use of the timestamp entire: I agree with @nbdd0121 that the generation time and date seems irrelevant, what matters is the git revision.

Copy link
Contributor

@jwnrt jwnrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the style error, nice work!

With this change, `autogen_chip_info_src` will only pass version
information if stamping is enabled on the command-line. This
commit changes the meaning of the `--ot_version_file` flag of
`//util:rom_chip_info`. Previously, it was a path to a file
that contains the git sha. It is now a path to the bazel version
info file which is parsed using the `version_info` library.

Note: with this change, unless stamping is enabled, the ROM will
not contained the git sha in the chip_info anymore.

Signed-off-by: Amaury Pouly <[email protected]>
This dependency is not used because the version information
comes from `stamp-env.txt` which is handled directly by
`rust_binary`.

Signed-off-by: Amaury Pouly <[email protected]>
With the previous changes, those targets have become useless and
can be removed.

Signed-off-by: Amaury Pouly <[email protected]>
If stamping is disable, version_stamp will be an empty dictionary
and the code will print something like
```
// Generated register constants for rv_plic.
// Built for <unknown>
// https://github.com/lowRISC/opentitan/tree/<unknown>
// Tree status: <unknown>
// Build date: <current date>
```
This is unhelpful and still prints the current time which is defeats
the point of not stamping. The new code will not print anything
when a piece of information is not available.

Signed-off-by: Amaury Pouly <[email protected]>
Instead of treating version_stamp like a dictionary with specific
name entries, turn it into a class with proper accessors so that
the particular representation of the version information is
abstracted.

Signed-off-by: Amaury Pouly <[email protected]>
@pamaury
Copy link
Contributor Author

pamaury commented Jan 23, 2024

I had to make some changes so that the generator emits exactly the same code as previously (otherwise I would have to regenerate some files for no good reason). I also reverted partially the change to rom_info so that it always stamps (see updated text at the top).

Allow opentitantool to be built with stamping disabled, and disable
it by default. Update the `version` command to handle the case of
missing information. The build time/date is removed since it adds
no useful information.

Signed-off-by: Amaury Pouly <[email protected]>
@pamaury
Copy link
Contributor Author

pamaury commented Jan 23, 2024

I have updated the PR to disable stamping in opentitantool by default as well.

pub clean: Option<bool>,
}

fn parse_variable(content: &str) -> Option<String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a hack. Could you make it so that if stamping is disabled, the stamp-env is not given at all? Then you can just use std::option_env!(...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this is not possible with the way rules_rust handles rustc_env_file: stamping is handled by replacing {BUILD_SCM_REVISION} and others literally in the env file. If stamping is disabled, the replacement does not occur. But the content of the stamp-env.txt file does not depend on whether stamping is enabled or not. This, admitedly, make the way stamping is handled by rustc not so useful.
We could add custom feature to our rules_rust fork so that writing "{BUILD_SCM_REVISION?}" deletes the variables if not present. Maybe this could even be accepted upstream?

Alternatively, this could be achieved with some custom bazel rule to produce either an empty file or a version file, depending on whether stamping is enabled or not, and then give the target to rustc_env_file, but this could be more trouble than it is worth. Something like:

version_file_or(
    name = "ott_version_file",
    default_file = "//path/to/empty/file",
)
genrule(
  name = "ott_env_file",
  src = [":ott_version_file"],
  outs = ["stamp-env.txt"],
  cmd = "<inset awk script>", # convert from "KEY VALUE" to "KEY=VALUE"
)

Copy link
Contributor

@a-will a-will left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Starlark and BUILD files seem good to me. I didn't get a chance to look at the opentitantool bits.

log.info("{} {}", k, v)
except ValueError:
raise SystemExit(sys.exc_info()[1])
version_stamp = version_file.VersionInformation(args.version_stamp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but at some point, we may want to also accept user-provided versioning info. If others are to use a release from this repo for tools and IP, the opentitan git revision is not necessarily the source for it (if a release is even encapsulated in a git repo at all!).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's a good point. To some extent this is already possible by customizing the util/get_workspace_status.sh script but if necessary, we could introduce a bazel mecanism to override this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this is quibbling on terms, but the user-provided versioning info is encoded in BUILD_GIT_VERSION. That value is the output of git describe and communicates the last git tag, the "distance" from the tag (ie: number of commits) and a git object identifier of the current head at build time.

In the case where one is building exactly at the tag, the extra info (distance, object id) are omitted and the value is exactly the name of the tag.

IMHO, the tag name is the user-provided versioning info as users are responsible for naming the tags. Is there some additional perspective about what we should consider as user-provided versioning info?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I'm talking about when the bazel workspace is no longer in a git repo. Once you make a release tarball, there's no more git describe. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a fair point, we do have any mecanism that works outside the git repository. It seems to me that the purpose of a release could actually include the production of a file that captures the output of the workspace status during the release, and that is then used by bazel instead of running the script. I am not entirely certain this could be achieved without the passing a config file to bazel or releasing a modified BUILD/WORKSPACE file.

@pamaury pamaury marked this pull request as ready for review February 1, 2024 10:03
@pamaury pamaury requested review from msfschaffner, cfrantz and a team as code owners February 1, 2024 10:03
@pamaury pamaury added the CherryPick:earlgrey_es_sival This PR should be cherry-picked to earlgrey_es_sival label Feb 1, 2024
@jwnrt jwnrt merged commit fa1aab4 into lowRISC:master Feb 2, 2024
33 checks passed
Copy link

github-actions bot commented Feb 2, 2024

Successfully created backport PR for earlgrey_es_sival:

@jwnrt
Copy link
Contributor

jwnrt commented Feb 2, 2024

I'm seeing these warnings in the Bazel output with this change applied, but they don't look fatal. Were these always here or from this PR?

I don't think they warrant reverting, but we should address them soon.

INFO: From Action hw/ip/edn/data/edn_regs.h:
opentitan/util/reggen/gen_cheader.py:215: UserWarning: Cannot generate a module define of type logic [31:0]
  warnings.warn("Cannot generate a module define of type {}"
INFO: From Action hw/top_earlgrey/alert_handler_regs.h:
opentitan/util/reggen/gen_cheader.py:215: UserWarning: Cannot generate a module define of type logic [NAlerts-1:0][NLpgWidth-1:0]
  warnings.warn("Cannot generate a module define of type {}"
opentitan/util/reggen/gen_cheader.py:215: UserWarning: Cannot generate a module define of type logic [NAlerts-1:0]
  warnings.warn("Cannot generate a module define of type {}"
INFO: From Action hw/top_earlgrey/alert_handler_regs.h:
opentitan/util/reggen/gen_cheader.py:215: UserWarning: Cannot generate a module define of type logic [NAlerts-1:0][NLpgWidth-1:0]
  warnings.warn("Cannot generate a module define of type {}"
opentitan/util/reggen/gen_cheader.py:215: UserWarning: Cannot generate a module define of type logic [NAlerts-1:0]
  warnings.warn("Cannot generate a module define of type {}"
INFO: From Action hw/ip/edn/data/edn_regs.h:
opentitan/util/reggen/gen_cheader.py:215: UserWarning: Cannot generate a module define of type logic [31:0]
  warnings.warn("Cannot generate a module define of type {}"

@pamaury
Copy link
Contributor Author

pamaury commented Feb 2, 2024

I have seen these warnings for ages on my machine.

@nbdd0121
Copy link
Contributor

nbdd0121 commented Feb 2, 2024

Yeah, these are unlikely to be relevant.

@aning12345
Copy link

Hi all, I have an ERROR while building the latest version:
ERROR: Traceback (most recent call last):
File "/rv/opentitan-20240307/opentitan/rules/autogen.bzl", line 75, column 7, in
} | stamp_attr(-1, "//rules:stamp_flag"),

Error: unsupported binary operation: dict | dict

@pamaury
Copy link
Contributor Author

pamaury commented Mar 7, 2024

Hi all, I have an ERROR while building the latest version: ERROR: Traceback (most recent call last): File "/rv/opentitan-20240307/opentitan/rules/autogen.bzl", line 75, column 7, in } | stamp_attr(-1, "//rules:stamp_flag"),

Error: unsupported binary operation: dict | dict

Hi, just to clarify what do you mean by "the latest version"? Are you using the bazelisk.sh script in the repository? If not, which version of bazel?
Per the bazel documentation: The union expression x | y yields a new dictionary by combining two existing dictionaries. So this is surprising!

@a-will
Copy link
Contributor

a-will commented Mar 7, 2024

Hi all, I have an ERROR while building the latest version: ERROR: Traceback (most recent call last): File "/rv/opentitan-20240307/opentitan/rules/autogen.bzl", line 75, column 7, in } | stamp_attr(-1, "//rules:stamp_flag"),

Error: unsupported binary operation: dict | dict

Sounds like the incorrect bazel version. The union operator ostensibly became available in 6.0.0, which is still older than the required version (6.2.1).

@aning12345
Copy link

yes, I update bazel version, it is ok now. BTW, if my network is limited, how to resolve the Error as below:
Error downloading [https://static.rust-lang.org/dist/2023-10-05/rustc-nightly-x86_64-unknown-linux-gnu.tar.gz] to /rv/.cache/bazel/_bazel_ning/1b677cabd696bf267ab38566915a2e95/external/rustfmt_nightly-2023-10-05__x86_64-unknown-linux-gnu_tools/temp16369494883206175619/rustc-nightly-x86_64-unknown-linux-gnu.tar.gz: connect timed out

please comment ,thank you very much

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CherryPick:earlgrey_es_sival This PR should be cherry-picked to earlgrey_es_sival
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants