Update Required for PyTorch and CUDA Versions to Support NVIDIA H100 GPUs and Resolve Dependency Issues #132
The main issues arise when running amptorch on the GPU. This is not currently tested by the provided unit tests. Changing the tests to use the GPU instead of the CPU results in the following error for multiple tests:
Running the
For the installation, I used PyTorch 2.1.0 with CUDA 11.8 on Python 3.9.19. I installed the other dependencies exactly as specified in env_gpu.yml, such as Skorch 0.10, NumPy 1.20, ASE 3.21, etc.
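For anyone trying to reproduce this setup, a small script along the following lines could confirm that the installed packages match the versions listed above. This is only a sketch: the `PINS` table is copied from the versions quoted in this comment (not read from env_gpu.yml), and the helper names `installed_version` and `matches_pin` are made up for illustration.

```python
from importlib import metadata

# Versions quoted in the comment above (assumption: PyPI distribution names).
PINS = {"torch": "2.1.0", "skorch": "0.10", "numpy": "1.20", "ase": "3.21"}


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pin):
    """Check that the installed version starts with the pinned components.

    A local suffix such as "+cu118" is ignored, so "2.1.0+cu118" matches "2.1.0",
    and "1.20.3" matches the shorter pin "1.20".
    """
    if installed is None:
        return False
    base = installed.split("+")[0]
    pin_parts = pin.split(".")
    return base.split(".")[: len(pin_parts)] == pin_parts


if __name__ == "__main__":
    for name, pin in PINS.items():
        found = installed_version(name)
        status = "ok" if matches_pin(found, pin) else "MISMATCH"
        print(f"{name}: expected {pin}, found {found} [{status}]")
```

Pasting the output of such a check into a report makes it easier to tell whether a failure comes from amptorch itself or from a dependency that drifted from the pinned version.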
Thanks for pointing this out! Unfortunately, we don't currently have the funding or bandwidth to officially maintain AmpTorch. @nicoleyghu may have some insights or be able to take a look, but since she has graduated, the response time may be slow. You may also want to check out the FAIR chem repo (https://github.com/FAIR-Chem/fairchem), which is actively maintained by the chemistry team at Meta and has many similar tools available.
Are there plans to update the PyTorch and CUDA versions for this installation to the latest releases, such as PyTorch 2.3.1 and CUDA 11.8? My research would greatly benefit from using the NVIDIA H100 GPUs provided by my institution, which are not supported by the current repository versions.
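To illustrate the H100 point, here is a rough sketch of how one might check whether a given PyTorch build was compiled for the GPU in the machine. It relies on `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`; the helper `arch_supported` and its matching rule (same-major `sm_XY` binaries, or any older `compute_XY` PTX) are a simplification I am assuming here, not an official compatibility test. An H100 reports compute capability (9, 0).

```python
import importlib.util


def arch_supported(capability, arch_list):
    """Rough check: can a build with `arch_list` run on a GPU of `capability`?

    capability: (major, minor) tuple, e.g. (9, 0) for an H100.
    arch_list:  strings such as "sm_80" or "compute_90".
    Simplified rule (assumption): an sm_XY binary runs on a device with the
    same major version and an equal or higher minor version; compute_XY PTX
    can be JIT-compiled for any device of equal or higher capability.
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():
            continue  # skip variants such as "sm_90a"
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report():
    """Print what the local PyTorch build says about GPU 0, if there is one."""
    if importlib.util.find_spec("torch") is None:
        print("PyTorch is not installed in this environment.")
        return
    import torch

    print("PyTorch:", torch.__version__, "| built with CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("No usable CUDA device found.")
        return
    capability = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    print("Device capability:", capability, "| build architectures:", archs)
    print("Supported by this build:", arch_supported(capability, archs))


if __name__ == "__main__":
    report()
```

If the last line prints `False` on an H100 node, the installed wheel was most likely built without `sm_90`, which would be consistent with the problem described here.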
Additionally, compiling dependencies like torch-scatter on Windows requires outdated compilers, such as VS 2015 SDK, leading to issues on newer systems. Using the latest versions of these dependencies resolves compatibility and compilation issues but causes errors in amptorch due to moved/renamed functions and incorrect data types.
I have attempted to fix some of these errors, but doing so is challenging and requires deep insight into the amptorch code.
Any feedback would be much appreciated!
[Edit]: Now that Skorch 1.0.0 has been released, I highly recommend upgrading to this stable version instead of using the beta 0.10.0.