Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyc: add arg reset-m-time #26

Closed
wants to merge 1 commit into from

Conversation

prestist
Copy link
Contributor

In ostree systems, the mtime gets confused and results in a re-compile of the .pyc files due to how the files are linked. Add support to identify and set mtime to zero to avoid this issue.

In ostree systems, the mtime gets confused and results in a re-compile of the .pyc files due to how the files are linked. Add support to identify and set mtime to zero to avoid this issue.

Co-authored-by: Luke Yang <[email protected]>
@prestist
Copy link
Contributor Author

@keszybz we are wanting to leverage add-determinism to rpm-ostree and ostree-rs-ext and were wondering if there would be a way to use the current version of add-determinism as a crate or if we needed to expand some of the calls by making it more suitable for this.

The two use cases we see listed in the readme dont really follow ours.

What are your thoughts? and if its okay can we expand the api to allow it to be consumed in this manner?

@@ -700,8 +700,28 @@ impl PycParser {
assert_eq!(data == self.data, removed_count == 0);
Ok((removed_count > 0, data))
}

fn clear_mtime(&mut self) -> Result<(bool, Vec<u8>)> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pass the existing data and have_mod as refs here, no need to clone.

@keszybz
Copy link
Owner

keszybz commented Jul 12, 2024

What are your thoughts? and if its okay can we expand the api to allow it to be consumed in this manner?

Yes.

In ostree systems, the mtime gets confused and results in a re-compile of the .pyc files due to how the files are linked. Add support to identify and set mtime to zero to avoid this issue.

Can you explain what the issue is? I thought ostree uses mtime of 0, but I'd like to understand the issue more fully.

@prestist
Copy link
Contributor Author

prestist commented Jul 12, 2024

What are your thoughts? and if its okay can we expand the api to allow it to be consumed in this manner?

Yes.

In ostree systems, the mtime gets confused and results in a re-compile of the .pyc files due to how the files are linked. Add support to identify and set mtime to zero to avoid this issue.

Can you explain what the issue is? I thought ostree uses mtime of 0, but I'd like to understand the issue more fully.

Sure, its a looong thread found here ostreedev/ostree#1469

Once we add the support to add-determinism

we want to use it in rpm-ostree post-process phases, before ostree commit, probably around https://github.com/coreos/rpm-ostree/blob/edd1183a116d5c152419832ab8b845c6298a57bf/rust/src/composepost.rs#L1202

Then we are going to add it to ostree container commit to make this work for container builds as well. See: ostreedev/ostree-rs-ext#388 and probably implemented around here https://github.com/ostreedev/ostree-rs-ext/blob/main/lib/src/commit.rs

@@ -713,21 +733,29 @@ impl super::Processor for Pyc {

fn process(&self, input_path: &Path) -> Result<super::ProcessResult> {
let (mut io, input) = InputOutputHelper::open(input_path, self.config.check)?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: some whitespace error here

let mut parser = PycParser::from_file(input_path, input)?;
if parser.version < (3, 0) {
return Ok(super::ProcessResult::Noop); // We don't want to touch python2 files
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idem

@travier
Copy link
Contributor

travier commented Jul 15, 2024

keszybz added a commit that referenced this pull request Jul 17, 2024
Pyc files store the mtime, size, and optionally hash of the original
py file. Mtime is used to validate the cached bytecode: the mtime
of the .py file (if present) must be equal to the stored value in
the .pyc file. If not, the pyc file is considered stale.

On ostree systems, all mtimes are set to 0. The problem is that the
.py file is created first, then the byte compilation step creates
.pyc with some embedded timestamp, and later ostree discards mtimes
on all files. On the end system, the bytecode cache is considered
invalid.

This new handler does the two very simple things:
1. zero out the mtime in the .pyc file
2. set mtime (in the filesystem) to zero for the .py file

This second step is also done by ostree, so strictly speaking, it's
not necessary. But it's easy to do it, and this way the bytecode
still matches the file on disk, so if we called Python after this
operation, it would be able to use the bytecode and wouldn't try to
rewrite it.

Replaces #26.
See ostreedev/ostree#1469,
https://gitlab.com/fedora/bootc/tracker/-/issues/3,
https://pagure.io/workstation-ostree-config/pull-request/505.
@keszybz
Copy link
Owner

keszybz commented Jul 17, 2024

Please see #27. It implements this idea, but the implementation is quite different.

@prestist
Copy link
Contributor Author

prestist commented Jul 17, 2024

Please see #27. It implements this idea, but the implementation is quite different.

@keszybz Firstly, thank you for taking the time to implement it. It was fun to try and figure out how to work with this repo, sorry if we did not follow the form of it. Your changes look appropriate for our use. The only thing outstanding would be a way to consume the project via rust rather then cli, though I dont think its a blocker either way.

@prestist
Copy link
Contributor Author

Closing as it is being replaced by #27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants