NVIDIA DALI

Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding, resizing, and other augmentations. When these pipelines run on the CPU, they become a bottleneck that limits the performance and scalability of training and inference.

The NVIDIA Data Loading Library (DALI) addresses this problem. It is a library for data loading and pre-processing that offers highly optimized building blocks for loading and processing audio, video, and image data.

Project Background

    • GPU Library:  DALI
    • Author: NVIDIA
    • Initial Release: N/A 
    • Type: GPU-accelerated library for data loading and pre-processing in deep learning training and inference
    • License: Apache License 2.0
    • Language: Python, C++, and others
    • GitHub: NVIDIA/DALI has 3.6k stars, 45 releases, and 63 contributors
    • Examples: available for MXNet, TensorFlow, and PyTorch

Features of DALI

  • Features a dedicated execution engine built to maximize the throughput of the input pipeline.
  • Provides features such as prefetching, batch processing, and parallel execution.
  • DALI-based data processing pipelines are portable because they can easily be retargeted to PaddlePaddle, TensorFlow, PyTorch, and MXNet.
  • Open source and easy-to-use Python API
  • Supports multiple data formats – COCO, LMDB, TFRecord, RecordIO, JPEG, JPEG 2000, FLAC, OGG, WAV, H.264, VP9, and HEVC.
  • Scalable across different GPUs
  • Accelerates image classification workloads.
  • Extensible for user-specific custom operators
  • Easy integration with the NVIDIA Triton Inference Server through the DALI Triton backend.
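To illustrate why prefetching and parallel execution matter, here is a conceptual sketch in pure Python of a loader that prepares batches on a background thread so the consumer never waits on data preparation. This is not DALI code, only a standard-library analogy for the pattern DALI implements with its GPU-accelerated engine (the doubling step stands in for real decode/augment work):

```python
import queue
import threading

def prefetching_loader(dataset, batch_size, queue_depth=2):
    """Yield batches prepared by a background producer thread.

    The producer "processes" samples (here: doubles them) and groups
    them into batches, keeping up to queue_depth batches ready ahead
    of the consumer -- a toy version of pipeline prefetching.
    """
    q = queue.Queue(maxsize=queue_depth)
    sentinel = object()  # signals end of the dataset

    def producer():
        batch = []
        for sample in dataset:
            batch.append(sample * 2)  # stand-in for decode/augment work
            if len(batch) == batch_size:
                q.put(batch)
                batch = []
        if batch:                     # flush the final partial batch
            q.put(batch)
        q.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = q.get()
        if batch is sentinel:
            return
        yield batch

batches = list(prefetching_loader(range(10), batch_size=4))
print(batches)  # [[0, 2, 4, 6], [8, 10, 12, 14], [16, 18]]
```

In DALI, the same overlap of data preparation with consumption happens automatically, with the heavy per-sample work offloaded to the GPU instead of a Python thread.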
