Support of CuDNN8 #7000
- switch to the cudnnFind* API instead of cudnnGet*, which was removed in cudnn8
- fixed cudnn version search
- the algorithm search happens only when the shape has really changed
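The "search only when the shape really changed" idea from the commit message can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `AlgoSearchCache` and `NeedsSearch` are hypothetical names, and the actual cuDNN search call is left out so the snippet stays self-contained.

```cpp
#include <vector>

// Hypothetical helper (not Caffe's actual code): remembers the last blob
// shape for which an algorithm search ran, so the expensive search can be
// skipped when a reshape leaves the shape unchanged.
class AlgoSearchCache {
 public:
  // Returns true when `shape` differs from the cached shape (or nothing is
  // cached yet) and stores it; returns false when the shape is unchanged.
  bool NeedsSearch(const std::vector<int>& shape) {
    if (valid_ && shape == last_shape_) {
      return false;  // same shape as last time: reuse the chosen algorithm
    }
    last_shape_ = shape;
    valid_ = true;
    return true;  // caller would run the cuDNN algorithm search here
  }

 private:
  std::vector<int> last_shape_;
  bool valid_ = false;
};
```

A layer's reshape routine would presumably call `NeedsSearch` with the current N, C, H, W values and only query cuDNN when it returns true.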
Anybody here?
@artyom-beilis Thanks for your patch! I have tried it, as well as #6970, but ran into high memory usage with cudnn8.
Hi,
I noticed larger memory use as well. It looks related to cudnn8 in general: I see a clear difference when I build the same code with cudnn7 vs cudnn8.
Also make sure you use the latest alignment fix, i.e. the latest branch: https://github.com/artyom-beilis/caffe/commits/fixes_for_cudnn8_bvlc_master
Also, caffe in general is a memory hog. AFAIR I noticed the difference in memory use between cudnn7 and cudnn8 with other frameworks as well.
Artyom
@artyom-beilis Thanks for your patch! I have tried it, as well as #6970, but ran into high memory usage with cudnn8.
After some tests I tried a model with a single conv layer and a (20 * 3 * 1280 * 720) input; it's the "head" of a ResNet used for a detection task. With cuda 10 and cudnn 7.6 I observed about 1.7 GB usage for a forward pass; with cuda 11 and cudnn8, ~2.6 GB. Maybe this comparison is not fully fair, because different GPUs were used: a Titan XP in the first case and a 3060 in the second.
Have you seen something like this with 3070 and 1080? Thank you!
Could you tell me more about the other frameworks? I tried to find mentions of similar GPU memory problems, but without success.
I don't really remember. It was either pytorch or mxnet. It was a long time ago.
Anyway, thank you! )
…ch, so switched to cudnnGet*_v7 API instead of much heavier cudnnFind and query optimal algorithm on _any_ reshape - not ignoring batch size reduction
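For context, the `cudnnGet*_v7` calls fill an array of per-algorithm results ordered by cuDNN's heuristic estimate, without benchmarking, which is why they are lighter than `cudnnFind*`. Picking an algorithm from such a list might look like the sketch below. It is only an illustration under stated assumptions: `AlgoPerf` is a simplified stand-in for cuDNN's perf result struct, `PickAlgo` is a hypothetical helper, and the workspace limit is an assumed constraint rather than something taken from this commit.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for a cuDNN perf result entry (assumption: only the
// fields needed for selection are modelled here).
struct AlgoPerf {
  int algo;            // algorithm identifier
  bool ok;             // true if the algorithm is usable for this shape
  std::size_t memory;  // workspace bytes the algorithm needs
};

// Hypothetical helper: returns the first usable algorithm whose workspace
// fits into `workspace_limit`, or -1 if none fits. `results` is assumed to
// be ordered best-first, as returned by the query.
int PickAlgo(const std::vector<AlgoPerf>& results,
             std::size_t workspace_limit) {
  for (const AlgoPerf& r : results) {
    if (r.ok && r.memory <= workspace_limit) {
      return r.algo;
    }
  }
  return -1;  // caller has to fall back to a default algorithm
}
```

Re-running this selection on every reshape, including reshapes that only reduce the batch size, matches the behaviour the commit message describes.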
Following this as current caffe I built with
I tried the proposed changes to make cuDNN8 work but it does not work and the training immediately ends with the following error:
Ubuntu 20.04. A build without cuDNN runs without problems.
Support of CuDNN8
Some of the API used by Caffe was removed in cudnn8. Without this change it is impossible to run Caffe on the Ampere architecture.
It required:
The change was tested on