Symptom
2021-06-04 16:54:01.790711: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
0%| | 0/102 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "get_dr_txt.py", line 120, in <module>
    frcnn.detect_image(image_id,image)
  File "get_dr_txt.py", line 64, in detect_image
    rpn_pred = self.model_rpn.predict(photo)
  File "/home/user/anaconda3/envs/gu_keras/lib/python3.7/site-packages/keras/engine/training.py", line 1462, in predict
    callbacks=callbacks)
  File "/home/user/anaconda3/envs/gu_keras/lib/python3.7/site-packages/keras/engine/training_arrays.py", line 324, in predict_loop
    batch_outs = f(ins_batch)
  File "/home/user/anaconda3/envs/gu_keras/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py", line 3476, in __call__
    run_metadata=self.run_metadata)
  File "/home/user/anaconda3/envs/gu_keras/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv1/convolution}}]]
     [[classification/Reshape/_1971]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv1/convolution}}]]
0 successful operations.
0 derived errors ignored.
Solution
Explicitly set which GPU the process is allowed to use:
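import os
# Set these before TensorFlow/Keras initializes the GPU (safest: before importing them).
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # number GPUs by PCI bus ID so indices match nvidia-smi
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only GPU 0 to this process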
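
The same settings can be wrapped in a small helper that is called at the very top of the script. This is only a sketch: `select_gpu` and its parameters are hypothetical names, not part of get_dr_txt.py. The optional `allow_growth` switch is also an assumption that goes beyond the fix above: it sets the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable, which TensorFlow versions that support it read to allocate GPU memory on demand instead of reserving it all at startup. That may help if the cuDNN handle fails because the card is short on free memory, but it was not needed in my case.

```python
import os


def select_gpu(index="0", allow_growth=False):
    """Hypothetical helper: restrict this process to a single GPU.

    Call it before TensorFlow/Keras initializes CUDA, ideally before
    importing them, otherwise the settings may be ignored.
    """
    # Number GPUs by PCI bus ID so the index matches what nvidia-smi shows.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Expose only the chosen GPU to this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    if allow_growth:
        # Assumption, not part of the original fix: ask TensorFlow to grow
        # GPU memory on demand rather than grabbing it all up front.
        os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


select_gpu("0")
# import keras / tensorflow only after this point
```

To pick a different card, pass its index as shown by nvidia-smi, for example `select_gpu(1)`.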