Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jaxlib v0.4.32 #281

Merged
merged 12 commits into from
Oct 14, 2024
Merged

Conversation

regro-cf-autotick-bot
Copy link
Contributor

It is very likely that the current package version for this feedstock is out of date.

Checklist before merging this PR:

  • Dependencies have been updated if changed: see upstream
  • Tests have passed
  • Updated license if changed and license_file is packaged

Information about this PR:

  1. Feel free to push to the bot's branch to update this PR if needed.
  2. The bot will almost always only open one PR per version.
  3. The bot will stop issuing PRs if more than 3 version bump PRs generated by the bot are open. If you don't want to package a particular version please close the PR.
  4. If you want these PRs to be merged automatically, make an issue with @conda-forge-admin,please add bot automerge in the title and merge the resulting PR. This command will add our bot automerge feature to your feedstock.
  5. If this PR was opened in error or needs to be updated please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.

Pending Dependency Version Updates

Here is a list of all the pending dependency version updates for this repo. Please double check all dependencies before merging.

Name Upstream Version Current Version
bazel 7.3.1 Anaconda-Server Badge
cudnn 9.3.0.75 Anaconda-Server Badge

This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues! This PR was generated by - please use this URL for debugging.

@conda-forge-webservices
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • No valid build backend found for Python recipe for package jaxlib using pip. Python recipes using pip need to explicitly specify a build backend in the host section. If your recipe has built with only pip in the host section in the past, you likely should add setuptools to the host section of your recipe.

@xhochy
Copy link
Member

xhochy commented Sep 25, 2024

It doesn't build locally for because of conda-forge/binutils-feedstock#76

Copy link
Contributor

github-actions bot commented Sep 27, 2024

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@xhochy
Copy link
Member

xhochy commented Oct 1, 2024

This now uses a new hermetic CUDA approach outlined here: https://github.com/openxla/xla/blob/main/docs/hermetic_cuda.md we need to adjust our CUDA builds to it.

Comment on lines 23 to +45
export CUDA_HOME="${BUILD_PREFIX}/targets/x86_64-linux"
export TF_CUDA_PATHS="${BUILD_PREFIX}/targets/x86_64-linux,${PREFIX}/targets/x86_64-linux"
# Needed for some nvcc binaries
export PATH=$PATH:${BUILD_PREFIX}/nvvm/bin
# XLA can only cope with a single cuda header include directory, merge both
rsync -a ${PREFIX}/targets/x86_64-linux/include/ ${BUILD_PREFIX}/targets/x86_64-linux/include/
# Needed for some nvcc binaries
export PATH=$PATH:${BUILD_PREFIX}/nvvm/bin
# XLA can only cope with a single cuda header include directory, merge both
rsync -a ${PREFIX}/targets/x86_64-linux/include/ ${BUILD_PREFIX}/targets/x86_64-linux/include/

# Although XLA supports a non-hermetic build, it still tries to find headers in the hermetic locations.
# We do this in the BUILD_PREFIX to not have any impact on the resulting jaxlib package.
# Otherwise, these copied files would be included in the package.
rm -rf ${BUILD_PREFIX}/targets/x86_64-linux/include/third_party
mkdir -p ${BUILD_PREFIX}/targets/x86_64-linux/include/third_party/gpus/cuda/extras/CUPTI
cp -r ${PREFIX}/targets/x86_64-linux/include ${BUILD_PREFIX}/targets/x86_64-linux/include/third_party/gpus/cuda/
cp -r ${PREFIX}/targets/x86_64-linux/include ${BUILD_PREFIX}/targets/x86_64-linux/include/third_party/gpus/cuda/extras/CUPTI/
mkdir -p ${BUILD_PREFIX}/targets/x86_64-linux/include/third_party/gpus/cudnn
cp ${PREFIX}/include/cudnn.h ${BUILD_PREFIX}/targets/x86_64-linux/include/third_party/gpus/cudnn/
export LOCAL_CUDA_PATH="${BUILD_PREFIX}/targets/x86_64-linux"
export LOCAL_CUDNN_PATH="${PREFIX}/targets/x86_64-linux"
export LOCAL_NCCL_PATH="${PREFIX}/targets/x86_64-linux"
mkdir -p ${BUILD_PREFIX}/targets/x86_64-linux/bin
ln -s $(which ptxas) ${BUILD_PREFIX}/targets/x86_64-linux/bin/ptxas
ln -s $(which nvlink) ${BUILD_PREFIX}/targets/x86_64-linux/bin/nvlink
ln -s $(which fatbinary) ${BUILD_PREFIX}/targets/x86_64-linux/bin/fatbinary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am sorry to see this is needed 😢

Is it able to parse CUDA_HOME with multiple paths. IOW would this be possible?

export CUDA_HOME="${PREFIX}:${PREFIX}/targets/x86_64-linux:${BUILD_PREFIX}/targets/x86_64-linux"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, in the end the bazel scripts do some globbing over $LOCAL_CUDA_PATH/bin* to determine the available CUDA binaries. You can do a bit of trickery and use one of LOCAL_CUDNN_PATH / LOCAL_NCCL_PATH to add a third directory but this then sadly led to the wrong protobuf headers being picked up.

Comment on lines -85 to +90
- cuda-nvcc # [(cuda_compiler_version or "").startswith("12")]
# Workaround for https://github.com/conda-forge/cuda-cupti-feedstock/issues/14
- cuda-cupti >=12.0.90,<13.0a0 # [(cuda_compiler_version or "").startswith("12")]
- cuda-nvcc-tools # [(cuda_compiler_version or "").startswith("12")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Uwe! 🙏

Looks good (as does the host addition above)

@@ -57,6 +60,7 @@ requirements:
- cuda-cudart-dev # [(cuda_compiler_version or "").startswith("12")]
- cuda-nvml-dev # [(cuda_compiler_version or "").startswith("12")]
- cuda-nvtx-dev # [(cuda_compiler_version or "").startswith("12")]
- cuda-nvcc-tools # [(cuda_compiler_version or "").startswith("12")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to confirm that JAX is using nvvm from here? Does it have some config paths we can check in the test phase?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea.

@mwittgen would you know how to do this/where to look? We sadly don't have a GPU device in the CI to test it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah was hoping this kind of check would not actually use the GPU. Sort of like python -m sysconfig for JAX. Poking around it looks like they have jax.print_environment_info(). Maybe we could try that?

If we are unable to figure out an appropriate config check, we could configure CI to use a GPU. Looks like some work has already been done here. So this would just be adding another runner type

Another option we have made heavy use of elsewhere is caching the built packages on CI and then downloading them for local testing ( conda-forge/conda-forge.github.io#1424 ). This may be more useful in the end as it is easier to iterate

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poking around it looks like they have jax.print_environment_info(). Maybe we could try that?

I looked at its implementation and it doesn't provide anything useful for the case we are testing on.

If we are unable to figure out an appropriate config check, we could configure CI to use a GPU. Looks like some work has already been done here. So this would just be adding another runner type

This seems like rather expensive to add a GPU here to do a simple test.

Overall, it seems that this will need input from someone where it doesn't work. The current PR is an improvement on the status quo and I don't want to hold up the release anymore, thus I'm going to merge.

@xhochy xhochy merged commit 9ead0a1 into conda-forge:main Oct 14, 2024
18 checks passed
@regro-cf-autotick-bot regro-cf-autotick-bot deleted the 0.4.32_h06a6e9 branch October 14, 2024 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants