GPU is not detected in R, but appears in python. #1456
Can you confirm that the R session is indeed finding the correct python env? What is the output of |
yes, I only created 1 conda env called keras
|
What a curious bug, thanks for reporting. Just to rule some things out:
R -q -e 'keras3::install_keras()'
R -q -e 'library(reticulate); use_virtualenv("r-keras"); import("tensorflow")$config$list_physical_devices()' |
Thanks for the reply.
> Sys.getenv("CUDA_VISIBLE_DEVICES")
[1] ""
I tried the shell command to install keras, and it ended up the same.
evan@DESKTOP-KGBNUBC:~$ R -q -e 'library(reticulate); use_virtualenv("r-keras"); import("tensorflow")$config$list_physical_devices()'
> library(reticulate); use_virtualenv("r-keras"); import("tensorflow")$config$list_physical_devices()
2024-06-12 03:25:20.236712: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-12 03:25:20.782594: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-06-12 03:25:21.546573: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
[[1]]
PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')
One thing different is, in this run
evan@DESKTOP-KGBNUBC:~$ source .virtualenvs/r-keras/bin/activate
(r-keras) evan@DESKTOP-KGBNUBC:~$ pip list | grep tensor
tensorboard 2.16.2
tensorboard-data-server 0.7.2
tensorflow-cpu 2.16.1
tensorflow-datasets 4.9.6
tensorflow-io-gcs-filesystem 0.37.0
tensorflow-metadata 1.15.0
(r-keras) evan@DESKTOP-KGBNUBC:~$ pip list | grep cuda
(r-keras) evan@DESKTOP-KGBNUBC:~$
I dug a little to find out that the
evan@DESKTOP-KGBNUBC:~$ lspci
4d66:00:00.0 SCSI storage controller: Red Hat, Inc. Virtio console (rev 01)
6e30:00:00.0 System peripheral: Red Hat, Inc. Virtio file system (rev 01)
d98b:00:00.0 3D controller: Microsoft Corporation Device 008e
evan@DESKTOP-KGBNUBC:~$ nvidia-smi
Wed Jun 12 03:43:34 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.52.01 Driver Version: 555.99 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4060 Ti On | 00000000:01:00.0 On | N/A |
| 39% 37C P8 10W / 160W | 1657MiB / 8188MiB | 20% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 66 G /Xwayland N/A |
+-----------------------------------------------------------------------------------------+
Then I installed keras again with |
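The `pip list` output above shows `tensorflow-cpu` and no CUDA packages in the `r-keras` environment. A minimal sketch of that check as a script follows. The function name `classify_tf_install` is hypothetical, and the rule it applies is only a heuristic: it assumes a GPU-enabled pip install brings in packages named like `nvidia-cudnn-cu12`, and it says nothing about a system-wide CUDA install.

```shell
#!/bin/sh
# Hypothetical helper: classify a TensorFlow install from `pip list` output
# read on stdin. Prints "gpu-capable", "cpu-only", or "unknown".
# Heuristic only: "gpu-capable" means the plain `tensorflow` package is
# present together with at least one pip-installed nvidia-*-cuNN package.
classify_tf_install() {
    awk '
        $1 == "tensorflow-cpu"        { cpu = 1 }
        $1 == "tensorflow"            { tf = 1 }
        $1 ~ /^nvidia-.*-cu[0-9]+$/   { cuda = 1 }
        END {
            if (tf && cuda)      print "gpu-capable"
            else if (cpu || tf)  print "cpu-only"
            else                 print "unknown"
        }
    '
}

# Usage, inside the activated environment:
#   pip list | classify_tf_install
```

On the listing above this would report `cpu-only`, which matches the `CUDA_ERROR_NO_DEVICE` result from that environment.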
I'll try to get on a Windows machine tomorrow and see if I can reproduce. |
Just an update about what I've tried. After a whole system reinstall (including the WSL Ubuntu), I found out that I can't see the GPU in Python either.
NVIDIA_DIR=$(dirname $(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)")))
for dir in $NVIDIA_DIR/*; do
if [ -d "$dir/lib" ]; then
export LD_LIBRARY_PATH="$dir/lib:$LD_LIBRARY_PATH"
fi
done
I'm not sure if this is vital to ensure Python can see the GPU; it apparently has no effect in R. |
Thanks, I can reproduce. This seems to be specific to TF 2.16; the GPU is visible with the identical setup using TF 2.15. It seems that we need to do some more work on WSL to help TensorFlow discover the NVIDIA shared libraries. (Note: we already work around some deficiencies by creating symlinks to the NVIDIA shared libraries in the TensorFlow virtual env. This works on Linux, but is apparently not sufficient on WSL.) For now, you can fix this by running the following in WSL before starting the R session (or by setting the env vars in the R session before reticulate has initialized Python).
#!/bin/sh
# Store original LD_LIBRARY_PATH
export ORIGINAL_LD_LIBRARY_PATH="${LD_LIBRARY_PATH}"
# Get the CUDNN directory
CUDNN_DIR=$(dirname $(dirname $(python -c "import nvidia.cudnn; print(nvidia.cudnn.__file__)")))
# Set LD_LIBRARY_PATH to include CUDNN directory
export LD_LIBRARY_PATH=$(find ${CUDNN_DIR}/*/lib/ -type d -printf "%p:")${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# Get the ptxas directory
PTXAS_DIR=$(dirname $(dirname $(python -c "import nvidia.cuda_nvcc; print(nvidia.cuda_nvcc.__file__)")))
# Set PATH to include the directory containing ptxas
export PATH=$(find ${PTXAS_DIR}/*/bin/ -type d -printf "%p:")${PATH:+:${PATH}}
Note, there is nothing specific to conda here. We still recommend using a virtualenv if possible. I'll push an update soon making sure that the R package does this work so users don't have to. |
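For anyone adapting the workaround, the path-building step can also be written as a small function. This is only a sketch, not what the R package does: `prepend_lib_dirs` is a hypothetical name, and it assumes the same layout the script above relies on (one subdirectory per NVIDIA package, each with its own `lib/`). It uses a plain glob loop instead of `find -printf`, so it does not depend on GNU find and never leaves an empty entry in the path.

```shell
#!/bin/sh
# Hypothetical helper: prepend every "<base>/*/lib" directory to a
# colon-separated path string without introducing empty entries.
# Usage: prepend_lib_dirs BASE_DIR CURRENT_VALUE
prepend_lib_dirs() {
    base=$1
    result=$2
    for dir in "$base"/*/lib; do
        # Skip an unexpanded glob and anything that is not a directory.
        [ -d "$dir" ] || continue
        # Add the separator only when there is something to separate from.
        result="$dir${result:+:$result}"
    done
    printf '%s\n' "$result"
}

# Example wiring (assumes the nvidia pip packages are importable from the
# active Python environment, as in the script above):
#   NVIDIA_DIR=$(dirname "$(dirname "$(python -c 'import nvidia.cudnn; print(nvidia.cudnn.__file__)')")")
#   export LD_LIBRARY_PATH=$(prepend_lib_dirs "$NVIDIA_DIR" "$LD_LIBRARY_PATH")
```

Because each directory is prepended, the directory that sorts last ends up first in the result. That should not matter here, since each `lib/` holds different libraries.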
Thanks a lot! That saves me from learning python again...😂 |
This is fixed on main now; the workaround should no longer be necessary. Please install the development version and reinstall keras+tensorflow to test it out.
remotes::install_github("rstudio/keras3")
keras3::install_keras()
# new R session
library(keras3) # load hook hints to reticulate to use_virtualenv("r-keras")
tensorflow::tf$config$list_physical_devices() |
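To check the result from the WSL shell rather than by eye, the device listing can be piped through a small filter. This is a sketch under one assumption: that each device is printed on its own line in the `PhysicalDevice(name=..., device_type='GPU')` form seen earlier in this thread. `count_gpus` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that describe a GPU device.
# Assumes one PhysicalDevice(...) entry per line, as in the output above.
count_gpus() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # `|| true` keeps the function usable under `set -e`.
    grep -c "device_type='GPU'" || true
}

# Usage (requires R with keras3 installed):
#   R -q -e 'library(keras3); tensorflow::tf$config$list_physical_devices()' | count_gpus
```

A count of 0 means only the CPU was listed, as in the failing runs above.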
Hi there,
I recently started moving my training environment to WSL2 to keep pace with keras3.
After following the installation guide, I successfully installed TensorFlow in my conda environment through the command
However, when I checked tf.config in R, I found that the GPU was not detected.
I tested some code and keras worked just fine on the CPU.
Then I turned to Python to get more details. Surprisingly, the GPU showed up there.
I googled for a while and found nothing similar to this. Should I not install TF into a conda environment?
Thanks in advance for any advice.
session info is here: