[WIP] DEBUG only {2023.06,2023a} PyTorch-bundle v2.1.2 #603
base: 2023.06-software.eessi.io
- PR to help debug various issues when building PyTorch-bundle
- includes a fix for `find_library` provided by `ctypes.util` which prevented importing `sndfile`
- includes a fix for `aarch64/generic` where importing `sentencepiece` led to the error `libtcmalloc_minimal.so.4: cannot allocate memory in static TLS block`
- includes a fix for the extension `torchvision` where some library was not compiled with `jpeg` support, hence some tests failed
Initially we'll build only for

bot: build arch:x86_64/amd/zen2 repo:eessi.io-2023.06-software
The two jobs (12607 and 12608) that did not include any fixes both failed in the sanity check for

We repeat the build for the same architectures.

bot: build arch:x86_64/amd/zen2 repo:eessi.io-2023.06-software
Rebuilding for

bot: build arch:x86_64/amd/zen2 repo:eessi.io-2023.06-software
Maybe related to:
…-layer into debug-2023.06-software.eessi.io-PyTorch-2.1.2-foss-2023a
- PR EESSI#655 implements a general fix for the import error
Rebuilding after #655 got merged to verify if the

bot: build arch:x86_64/amd/zen2 repo:eessi.io-2023.06-software
…-layer into debug-2023.06-software.eessi.io-PyTorch-2.1.2-foss-2023a
Rebuilding after changes have been minimised (only hook for SentencePiece kept for now) and #660 has been ingested...

bot: build arch:x86_64/amd/zen2 repo:eessi.io-2023.06-software
…-layer into debug-2023.06-software.eessi.io-PyTorch-2.1.2-foss-2023a
Revisit switching off TCMALLOC...

bot: build arch:aarch64/generic repo:eessi.io-2023.06-software
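For illustration, switching off TCMALLOC could be expressed as a small build hook. This is only a sketch under assumptions: the function name, the plain-dict stand-in for an easyconfig, and the `cpu_target` argument are hypothetical, and it assumes that SentencePiece's CMake build honours the option `SPM_ENABLE_TCMALLOC`. It is not necessarily the hook used in this PR.

```python
def disable_tcmalloc_for_sentencepiece(ec, cpu_target):
    """Hypothetical hook: add a CMake option that turns TCMalloc off.

    `ec` is a plain dict standing in for an easyconfig (assumption),
    `cpu_target` is a string such as "aarch64/generic".
    """
    # Only touch SentencePiece, and only on aarch64 targets.
    if ec.get("name") != "SentencePiece":
        return ec
    if not cpu_target.startswith("aarch64/"):
        return ec

    # Assumed CMake option name; append it once to the configure options.
    option = "-DSPM_ENABLE_TCMALLOC=OFF"
    configopts = ec.get("configopts", "")
    if option not in configopts:
        ec["configopts"] = (configopts + " " + option).strip()
    return ec
```

Restricting the change to `aarch64/` targets keeps builds for other architectures untouched, which matches the idea of enabling fixes one at a time.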
Maybe switch off the following
The main purpose of this PR is to facilitate debugging various issues when building PyTorch-bundle and demonstrating approaches that could solve the issues. It is expected that the fixes provided here are not final.
- includes a fix for `find_library` provided by `ctypes.util` which prevented importing `soundfile`
- includes a fix for `aarch64/{generic,neoverse_n1,neoverse_v1}` where importing `sentencepiece` led to the error `libtcmalloc_minimal.so.4: cannot allocate memory in static TLS block`
- includes a fix for the extension `torchvision` where some library was not compiled with `jpeg` support, hence some tests failed

Initially we will disable all fixes, build for selected architectures and document the errors. We then enable fixes one-by-one and document the results (some error fixed, some new errors, ...).
Note, see the original PR for PyTorch-bundle (#585) for additional discussion about some of the issues listed above.
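To make the `find_library` issue easier to reproduce, here is a minimal sketch of a lookup that first asks `ctypes.util.find_library` and, if that returns nothing, scans a list of directories for a matching shared object. The helper name `find_library_with_fallback`, the `extra_dirs` parameter and the choice of environment variables are assumptions made for illustration; this is not the fix contained in this PR.

```python
import ctypes.util
import os


def find_library_with_fallback(name, extra_dirs=()):
    """Return a library name or path for `name`, or None if nothing is found."""
    # First try the standard library helper.
    found = ctypes.util.find_library(name)
    if found:
        return found

    # Fallback (assumption): scan extra directories plus the directories
    # listed in LD_LIBRARY_PATH and LIBRARY_PATH for lib<name>.so*.
    search_dirs = list(extra_dirs)
    for var in ("LD_LIBRARY_PATH", "LIBRARY_PATH"):
        search_dirs += [d for d in os.environ.get(var, "").split(os.pathsep) if d]

    prefix = "lib" + name + ".so"
    for directory in search_dirs:
        if not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            if entry.startswith(prefix):
                return os.path.join(directory, entry)
    return None
```

Calling `find_library_with_fallback("sndfile")` in the build environment shows whether the standard lookup or the directory scan locates the library, which can help narrow down why the import fails.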