cuDNN, short for CUDA Deep Neural Network library, is a GPU-accelerated library of deep learning primitives built on top of CUDA and developed by NVIDIA. It serves as a building block for deep learning and machine learning frameworks.
On top of the core library, the cuDNN Frontend API is a C++ header-only library that wraps the cuDNN C backend API.
Deep learning framework creators and researchers use cuDNN for high-performance GPU acceleration. With cuDNN, they can focus on developing software and training neural networks instead of spending time tuning low-level GPU performance. cuDNN-accelerated frameworks include Keras, Caffe2, MXNet, PyTorch, MATLAB, TensorFlow, and a few more.
Features of cuDNN
- Tensor Core acceleration for popular convolution types, such as 2D, 3D, and depth-wise separable
- Optimized kernels for computer vision and speech models, such as ResNet, EfficientNet, EfficientDet, ResNeXt, SSD, etc.
- Supports FP32, FP16, BF16, and TF32 floating-point formats
- Supports INT8 and UINT8 integer formats
- Arbitrary dimension ordering, striding, and sub-regions for 4D tensors, which makes it easy to integrate into any neural network implementation
- Accelerates fused operations on CNN architectures
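To make the "arbitrary dimension ordering and striding" point more concrete, here is a minimal sketch in plain Python (no GPU or cuDNN required) of how per-dimension strides describe the same logical 4D tensor in two memory layouts. The helper names `packed_strides` and `offset` are hypothetical and exist only for this illustration. In cuDNN's C API, strides of this kind are what a call such as `cudnnSetTensor4dDescriptorEx` takes as its `nStride`, `cStride`, `hStride`, and `wStride` arguments.

```python
def packed_strides(dims, layout):
    """Return strides (in elements) for a fully packed 4D tensor.

    dims   -- dict with sizes for the logical axes "N", "C", "H", "W"
    layout -- memory order from outermost to innermost, e.g. "NCHW" or "NHWC"
    """
    strides = {}
    step = 1
    # Walk from the innermost (fastest-varying) axis outwards.
    for axis in reversed(layout):
        strides[axis] = step
        step *= dims[axis]
    return strides


def offset(index, strides):
    """Linear element offset of a logical (n, c, h, w) index."""
    return sum(index[axis] * strides[axis] for axis in "NCHW")


dims = {"N": 2, "C": 3, "H": 4, "W": 5}
nchw = packed_strides(dims, "NCHW")  # channels-first layout
nhwc = packed_strides(dims, "NHWC")  # channels-last layout

# The same logical element lives at different memory offsets per layout.
index = {"N": 0, "C": 1, "H": 2, "W": 3}
print("NCHW strides:", nchw, "offset:", offset(index, nchw))
print("NHWC strides:", nhwc, "offset:", offset(index, nhwc))
```

Because the logical dimensions stay the same and only the strides change, a library that accepts explicit strides can work on either layout without the caller copying data. A sub-region can be described the same way: keep the parent tensor's strides, shrink the dimensions, and start from an offset base pointer.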