Deep learning applications need complex, multi-stage data processing pipelines that include loading, decoding, resizing, and other augmentations. When these pipelines run on the CPU, they can become a bottleneck that limits the performance and scalability of training and inference.
The NVIDIA Data Loading Library (DALI), a library for data loading and pre-processing, addresses these problems. It offers highly optimized building blocks for loading and processing sound, video, and images.
- GPU Library: DALI
- Author: NVIDIA
- Initial Release: N/A
- Type: GPU library that accelerates data processing for deep learning training and inference
- License: Apache License 2.0
- Language: Python, C++, and others
- GitHub: NVIDIA/DALI has 3.6k stars, 45 releases, and 63 contributors
- Examples: for MXNet, TensorFlow, and PyTorch
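To give a feel for the building blocks, here is a minimal sketch of an image pipeline that loads, decodes, and resizes JPEG files. It assumes the `nvidia-dali` package and a CUDA-capable GPU are available. The helper name `build_image_pipeline`, the directory layout (one sub-folder per class), and the 224x224 target size are illustrative choices, not something prescribed by DALI.

```python
def build_image_pipeline(image_dir, batch_size=8, num_threads=4, device_id=0):
    """Return a DALI pipeline that loads, decodes, and resizes JPEG images.

    `image_dir` is assumed to contain one sub-folder per class label.
    """
    # Imported lazily so this module can be loaded on machines without DALI.
    from nvidia.dali import pipeline_def, fn, types

    @pipeline_def
    def image_pipeline():
        # Read encoded files and integer labels from the directory tree.
        jpegs, labels = fn.readers.file(file_root=image_dir, random_shuffle=True)
        # "mixed" takes CPU input and produces the decoded image on the GPU.
        images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
        # Resize every image to a fixed size (illustrative value).
        images = fn.resize(images, resize_x=224, resize_y=224)
        return images, labels

    return image_pipeline(
        batch_size=batch_size, num_threads=num_threads, device_id=device_id
    )
```

On a machine with DALI installed, a batch could then be fetched with `pipe = build_image_pipeline("path/to/images")`, followed by `pipe.build()` and `images, labels = pipe.run()`.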
Features of DALI
- Uses its own execution engine, built to maximize the throughput of the input pipeline.
- Provides features such as prefetching, batch processing, and parallel execution.
- DALI-based data processing pipelines are portable because they can easily be retargeted to PaddlePaddle, TensorFlow, PyTorch, and MXNet.
- Open source and easy-to-use Python API
- Supports multiple data formats – COCO, LMDB, TFRecord, RecordIO, JPEG, JPEG 2000, FLAC, OGG, WAV, H.264, VP9, and HEVC.
- Scales across multiple GPUs
- Accelerates image classification workloads
- Extensible for user-specific custom operators
- Easy integration with NVIDIA Triton Inference Server through the DALI Triton Backend.
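The prefetching and batch processing mentioned in the feature list can be illustrated with a small, library-independent sketch. This is not how DALI is implemented. It uses only the Python standard library, and the function name `prefetching_batches` and the `prefetch_depth` parameter are hypothetical. A background thread groups samples into batches and keeps a few of them ready in a queue, so the consumer rarely waits for data.

```python
import queue
import threading


def prefetching_batches(samples, batch_size, prefetch_depth=2):
    """Yield lists of up to `batch_size` samples prepared by a background thread."""
    buffer = queue.Queue(maxsize=prefetch_depth)  # holds batches that are ready
    done = object()  # sentinel that marks the end of the data

    def producer():
        batch = []
        for sample in samples:
            batch.append(sample)
            if len(batch) == batch_size:
                buffer.put(batch)  # blocks while the buffer is full
                batch = []
        if batch:
            buffer.put(batch)  # final, possibly smaller batch
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


if __name__ == "__main__":
    for batch in prefetching_batches(range(5), batch_size=2):
        print(batch)
```

While the consumer works on one batch, the producer is already preparing the next ones, which is the basic idea behind overlapping data preparation with training.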