Hi,
I have been facing this issue for some time and have not been able to fix it.
Python version: 3.7
Operating System: Linux
TensorFlow version: 1.14.0
CUDA version: 10.0
Description
I keep getting this warning and then the execution crashes at Epoch 1.
What I Did
import tensorflow as tf
if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")
The output shows that TensorFlow detects and uses the GPUs fine:
2019-10-03 13:11:01.720688: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-10-03 13:11:01.768834: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2596780000 Hz
2019-10-03 13:11:01.771431: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56157647a930 executing computations on platform Host. Devices:
2019-10-03 13:11:01.771460: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2019-10-03 13:11:01.772877: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-10-03 13:11:04.249822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:04:00.0
2019-10-03 13:11:04.250926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 1 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:05:00.0
2019-10-03 13:11:04.251999: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 2 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:09:00.0
2019-10-03 13:11:04.253103: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 3 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:0a:00.0
2019-10-03 13:11:04.254193: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 4 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:85:00.0
2019-10-03 13:11:04.255276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 5 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:86:00.0
2019-10-03 13:11:04.255566: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-10-03 13:11:04.256938: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-10-03 13:11:04.258142: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-10-03 13:11:04.258427: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-10-03 13:11:04.260019: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-10-03 13:11:04.261283: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-10-03 13:11:04.265096: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-10-03 13:11:04.277832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0, 1, 2, 3, 4, 5
2019-10-03 13:11:04.277873: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-10-03 13:11:04.284987: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-03 13:11:04.285005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0 1 2 3 4 5
2019-10-03 13:11:04.285013: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N Y Y Y N N
2019-10-03 13:11:04.285018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 1: Y N Y Y N N
2019-10-03 13:11:04.285023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 2: Y Y N Y N N
2019-10-03 13:11:04.285028: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 3: Y Y Y N N N
2019-10-03 13:11:04.285033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 4: N N N N N Y
2019-10-03 13:11:04.285040: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 5: N N N N Y N
2019-10-03 13:11:04.293727: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 7647 MB memory) -> physical GPU (device: 0, name: Tesla M60, pci bus id: 0000:04:00.0, compute capability: 5.2)
2019-10-03 13:11:04.296282: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:1 with 7647 MB memory) -> physical GPU (device: 1, name: Tesla M60, pci bus id: 0000:05:00.0, compute capability: 5.2)
2019-10-03 13:11:04.298803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:2 with 7647 MB memory) -> physical GPU (device: 2, name: Tesla M60, pci bus id: 0000:09:00.0, compute capability: 5.2)
2019-10-03 13:11:04.301310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:3 with 7647 MB memory) -> physical GPU (device: 3, name: Tesla M60, pci bus id: 0000:0a:00.0, compute capability: 5.2)
2019-10-03 13:11:04.303979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:4 with 7647 MB memory) -> physical GPU (device: 4, name: Tesla M60, pci bus id: 0000:85:00.0, compute capability: 5.2)
2019-10-03 13:11:04.306456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:5 with 7647 MB memory) -> physical GPU (device: 5, name: Tesla M60, pci bus id: 0000:86:00.0, compute capability: 5.2)
2019-10-03 13:11:04.310204: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56157ab4cab0 executing computations on platform CUDA. Devices:
2019-10-03 13:11:04.310223: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Tesla M60, Compute Capability 5.2
2019-10-03 13:11:04.310229: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (1): Tesla M60, Compute Capability 5.2
2019-10-03 13:11:04.310234: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (2): Tesla M60, Compute Capability 5.2
2019-10-03 13:11:04.310239: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (3): Tesla M60, Compute Capability 5.2
2019-10-03 13:11:04.310244: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (4): Tesla M60, Compute Capability 5.2
2019-10-03 13:11:04.310249: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (5): Tesla M60, Compute Capability 5.2
2019-10-03 13:11:04.314251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:04:00.0
2019-10-03 13:11:04.315484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 1 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:05:00.0
2019-10-03 13:11:04.316567: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 2 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:09:00.0
2019-10-03 13:11:04.317632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 3 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:0a:00.0
2019-10-03 13:11:04.318705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 4 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:85:00.0
2019-10-03 13:11:04.319780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 5 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:86:00.0
2019-10-03 13:11:04.319806: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-10-03 13:11:04.319820: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-10-03 13:11:04.319833: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-10-03 13:11:04.319846: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-10-03 13:11:04.319859: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-10-03 13:11:04.319872: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-10-03 13:11:04.319885: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-10-03 13:11:04.332488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0, 1, 2, 3, 4, 5
2019-10-03 13:11:04.332811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-03 13:11:04.332823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0 1 2 3 4 5
2019-10-03 13:11:04.332830: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N Y Y Y N N
2019-10-03 13:11:04.332835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 1: Y N Y Y N N
2019-10-03 13:11:04.332840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 2: Y Y N Y N N
2019-10-03 13:11:04.332845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 3: Y Y Y N N N
2019-10-03 13:11:04.332850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 4: N N N N N Y
2019-10-03 13:11:04.332856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 5: N N N N Y N
2019-10-03 13:11:04.340711: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 7647 MB memory) -> physical GPU (device: 0, name: Tesla M60, pci bus id: 0000:04:00.0, compute capability: 5.2)
2019-10-03 13:11:04.341796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:1 with 7647 MB memory) -> physical GPU (device: 1, name: Tesla M60, pci bus id: 0000:05:00.0, compute capability: 5.2)
2019-10-03 13:11:04.342889: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:2 with 7647 MB memory) -> physical GPU (device: 2, name: Tesla M60, pci bus id: 0000:09:00.0, compute capability: 5.2)
2019-10-03 13:11:04.343989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:3 with 7647 MB memory) -> physical GPU (device: 3, name: Tesla M60, pci bus id: 0000:0a:00.0, compute capability: 5.2)
2019-10-03 13:11:04.345103: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:4 with 7647 MB memory) -> physical GPU (device: 4, name: Tesla M60, pci bus id: 0000:85:00.0, compute capability: 5.2)
2019-10-03 13:11:04.346189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:5 with 7647 MB memory) -> physical GPU (device: 5, name: Tesla M60, pci bus id: 0000:86:00.0, compute capability: 5.2)
Default GPU Device: /device:GPU:0
I set the gpu argument of TGANModel to '/GPU:0' and also tried '/device:GPU:0'.
However, I get the same warning, and the run still crashes during the first epoch.
I also uninstalled and reinstalled tensorflow-gpu and TGAN to check, but that did not help.
Regards,
Nabaruna
Would you mind sharing a short code snippet that shows the exact arguments that you use when creating the TGAN instance and calling the fit and sample methods?
We will then try to reproduce the error to be able to assist you better.
Also, regarding the GPU usage, please check this other issue: #34
So, basically, the gpu argument is now ignored, and all that matters with regard to GPU usage is whether you have installed tensorflow or tensorflow-gpu.
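Since the gpu argument no longer has any effect, one possible way to control which GPU is used is the standard CUDA environment variable CUDA_VISIBLE_DEVICES, set before TensorFlow is imported. The snippet below is only a rough sketch of that idea and is not part of TGAN: the helper names restrict_visible_gpus and report_tensorflow_gpu are made up for illustration, and it assumes the tf.test functions available in TensorFlow 1.14.

```python
import os


def restrict_visible_gpus(device_ids):
    """Hypothetical helper: expose only the given GPU indices to CUDA.

    Sets CUDA_VISIBLE_DEVICES, which has to happen before TensorFlow
    is imported (or at least before it initializes the GPUs).
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


def report_tensorflow_gpu():
    """Hypothetical helper: print whether the installed TensorFlow can use a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
        return
    # False for the CPU-only 'tensorflow' package, True for 'tensorflow-gpu'.
    print("Built with CUDA:", tf.test.is_built_with_cuda())
    # Empty string when no GPU is visible to TensorFlow.
    print("GPU device name:", tf.test.gpu_device_name() or "<none>")


# Assumption for illustration: keep only the first Tesla M60 visible.
restrict_visible_gpus([0])

if __name__ == "__main__":
    report_tensorflow_gpu()
```

If the crash persists with a single visible GPU, that would suggest it is not related to device selection, and the full traceback from the first epoch would be the next thing to look at.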