-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure proper repair of wheels #20
Comments
This is broadly speaking desirable but I have some implementation concerns. If you prototype this I can look into it. |
Sure! I had a revelation – instead of making the |
So, would you be okay with GitHub-hosted CI, and with the general proposition to add CI to this repository? |
No, I don't think that will work, because it's important that wheels uploaded to PyPI are reproducible by anyone who wants, with the exact same SHA checksum. So if we use As for CI, I'm not concerned about having it, and I don't want it to become a maintenance overhead. |
Also, |
As far as I understand,
This will be much more difficult, however... since all four of these tools, on any platform, will do
and
i.e., the macOS wheels before and after have different checksums, and this shall similarly be so for Linux. I checked by renaming the file into the previous name again, and the checksum remained the same as the newly modified one, so, the wheel was definitely unzipped and zipped back again during repair. Adding CI indeed brings maintenance overhead. One option that we have still remains, and should work for all of the cases, is hardcoding the new tags (at least the |
wheels, which seems all incorrect to me – I'll file an issue on their tracker. The workflow run, here: https://github.com/agriyakhetarpal/zig-pypi/actions/runs/8841793735 (PR linked above) has the required output, so I think this should be the diff (the ZIG_PYTHON_PLATFORMS = {
'x86_64-windows': 'win_amd64',
'x86-windows': 'win32',
- 'x86_64-macos': 'macosx_10_9_x86_64',
- 'aarch64-macos': 'macosx_11_0_arm64',
+ 'x86_64-macos': 'macosx_11_7_x86_64',
+ 'aarch64-macos': 'macosx_11_7_arm64',
'i386-linux': 'manylinux_2_12_i686.manylinux2010_i686.musllinux_1_1_i686',
# renamed i386 to x86 since v0.11.0, i386 was last supported in v0.10.1
'x86-linux': 'manylinux_2_12_i686.manylinux2010_i686.musllinux_1_1_i686',
- 'x86_64-linux': 'manylinux_2_12_x86_64.manylinux2010_x86_64.musllinux_1_1_x86_64',
+ 'x86_64-linux': 'manylinux_2_12_x86_64.manylinux2010_x86_64.musllinux_1_1_x86_64.manylinux_2_5_x86_64.manylinux1_x86_64',
'aarch64-linux':
'manylinux_2_17_aarch64.manylinux2014_aarch64.musllinux_1_1_aarch64',
'armv7a-linux': 'manylinux_2_17_armv7l.manylinux2014_armv7l.musllinux_1_1_armv7l', While I may not be totally sure about Linux, with my macOS machine I can confirm whether Here's what macOS arm64 wheel$ otool -l dist/ziglang-0.12.0-py3-none-macosx_11_7_arm64/ziglang/zig
macOS amd64 wheel$ otool -l dist/ziglang-0.12.0-py3-none-macosx_11_7_x86_64/ziglang/zig returns
Therefore, both of them were likely built on a macOS 14 Sonoma device with I'll revise this in a PR for now – so, it's up to you if you wish to keep this issue open, since a restricted set of platform tags provides benefits in other ways, such as lesser problems for those with older Linux systems (because they won't be able to download the wheel at all). I'm by no means an expert on debugging GLIBC compatibility issues, and a very novice user of Zig, so I can only answer as much about the Python packaging ecosystem more. |
This commit, as noted in ziglang#20, makes sure that the correct minimum macOS version is asserted at the time when someone tries to download and install the wheels via pip or another package or dependency management tool.
This commit, as noted in #20, makes sure that the correct minimum macOS version is asserted at the time when someone tries to download and install the wheels via pip or another package or dependency management tool.
I think what will happen is that unless we yank and re-publish all the wheels (which raises questions about whether Zig has changed the deployment target--I suspect it did) it'll just result in an older wheel being downloaded. |
As long as this process is completely deterministic that's fine, but if it's not I'm not going to use it since it breaks reproducibility.
I don't have time to do this but I can ping you about it. |
Fair enough – I confirmed with a few more tests and it does break reproducibility, since repairing the same wheel multiple times displays a different SHA-256 checksum each time, regardless of whether I use
Much appreciated :) I am happy to help out with every new release of Zig. |
I suspected this would be the case if something unpacks and repacks the archive, since most people are quite blasé about implementing that and don't care much about timestamps or file ordering. |
Yes, it's most certainly doing that – in cases where libraries are needed to be vendored into the wheel, it has to be unzipped and a hidden
We should be able to do this on our own with a shell script or two, but I feel that ensuring reproducibility is important in general – if these wheel repair tools can offer this use case upstream, it would be highly beneficial. I'll put in a feature request there. |
This repo is supposed to work on Windows (not Cygwin or MinGW), so I don't think a shell script cuts it. I'm also not completely sold on the approach of "repairing" the wheel, vs. producing a correct one in first place. |
Ah, apologies if I was terse. With this, I meant having a PowerShell script as well, since it's now the default shell on Windows. A general Python script would be a better cross-platform alternative, in any case.
I agree, this is just one of the solutions. I think that out of all of the solutions, it's still better to dry-run repair, and adjust platform tags manually, at least not until this is added as a feature. Therefore, I can offer no better than that at this time. Although, I wonder if you could version the script with tags for this repository. That would ensure a higher level of reproducibility. For example, changes like #21 have bumped the macOS minimum version, and if someone now wants to build a bundle of macOS wheels for older Zig versions, say, 0.7.0, they'll receive this new constraint too – but Zig 0.7.0 might be compatible with older versions of macOS, too say, 10.15, (I haven't checked though) and having 11.7 here renders the wheel to be incorrectly tagged). Asking users to check out a particular tag is better than asking them to check out an older commit. That way, you can specify lower and upper bounds for Zig wheels for a particular version of the script, i.e., highlighting what they can build wheels for – the layperson's time machine. This might be too much to maintain, though, so I understand if you don't wish to do this and would prefer responding to users individually if someone runs into this problem and then chooses to report it. |
The problem with tagging is that e.g. I backfilled the 32-bit wheels for 0.11.0 when you asked for it... |
Ah, well, as long as there are not a lot of users for the repository (and if there won't be many), tags are not needed and it's possible to edit the script and publish the wheels manually :) (and many thanks for uploading those!) |
Yeah, it's kind of an error-prone process in first place. One thing that can be done with CI is OIDC based "trusted publishing". I think PyPI did not have it originally when I came up wit this project or I'd have used it. In that case, there is a transparency record published in a log, referring to the git repo, CI workflow, CI workflow logs, etc every time a package is published. |
That works, too. Additionally, please consider signing the binaries with Sigstore, that way they are verifiable (if you publish the relevant signature files to GitHub Releases). I would be happy to set CI up for you for both of these things. |
I think we can start with a workflow responding to I agree that reproducibility is probably not enough and I'd really rather not be on a critical path to accurate delivery of opaque binaries. |
Thank you for your comments. I agree |
(Hi - author of repairwheel here) Are you seeing this when repairing a single wheel multiple times (i.e., repairing an already-repaired wheel), or are you seeing different output when repairing the same original wheel multiple times? If it's the latter I'd definitely be interested in more details. |
Hi, @jvolkman! Thanks for chiming in – I just tried to build the wheel here again with
I then copied this wheel to a particular directory, repaired it with First attempt
The SHA-256 checksum of the wheel:
and I repaired the same wheel, again, this time into a different directory (not repairing an already repaired wheel): Second time
and the checksum is indeed the same:
So, yes, you're right – I was indeed repairing an already-repaired wheel multiple times, and repairing the same unrepaired wheel is deterministic, which is good – resolves many of the problems here. However, as a by-product, the issue that I noticed is about the wheel is receiving a |
Same thing with the |
Got it. Good to hear about the reproducible repair. I'd guess the other issue is just that repairing musl wheels isn't currently supported, and the failure manifests as that ancient linux tag being added. The problem is that I don't have a good solution, but it would be nice to support musl wheels somehow. Feel free to open an issue to track that if you'd like. |
I notice that the wheels currently have hardcoded platform tags that do not check for GLIBC version constraints or the macOS version constraints, and therefore might not be repaired fully, i.e., additional tags can be added for the Linux wheels and macOS wheels can better reflect the minimum required OSX version.
A quick experiment with
delocate
locally led me to find out that the minum required macOS version is macOS X 11.7 for both M-series and Intel wheels, and not 11.0 and 10.9 respectively. On my local machine (macOS), it is not possible to useauditwheel
because of hard-coded checks to stop usage outside of Linux platforms, therefore, I tested using the cross-platform alternativerepairwheel
, which granted additional tags to some of the Linux wheels:linux_1_1_aarch64
and thelinux_2_17_armv7l
tags for some reason, which won't be accepted on PyPI. I think that is a bug onrepairwheel
's side.auditwheel
might be more robust in this case.For Windows wheels,
delvewheel
showcased that no external DLLs are required or need to be mangled, as expected, and did not modify the wheel.I would like to propose that an additional step in the
make_wheels.py
file after the wheels have been downloaded can do this action, andrepairwheel
can be added as a dependency inpyproject.toml
. It would also make sense to provide an option through the command-line argument parser to not repair the wheels if needed, but I would argue that it's most likely better to have repaired wheels by default. Owing to how the binary inside the wheel can change during linkage, I would recommend this as an additional step instead of hardcoding the new tags in the values of theZIG_PYTHON_PLATFORMS
dictionary (those can then remain the same, or be updated).Footnotes
https://manylinuxinspector.joerick.me/ ↩
https://mayeut.github.io/manylinux-timeline/ ↩
The text was updated successfully, but these errors were encountered: