Release DALI v0.12.0 · NVIDIA/DALI

Bug fixes

Remove dependency on gitlab-master in DALI TF (#1038)
Added include(CheckSymbolExists) to CMakeLists (#1035)
Fix uninitialized number of dimensions in TensorListShape. (#1023)
Add const-qualifiers to TensorShape first and last functions. (#1020)
Add missing bracket in the BoxEncoder docs (#1018)
Adjust epsilon in tests. (#1017)
Add ASAN support, fix reported problems in the unit tests (#362)
Fix for OF test (#1008)
Fix nvjpeg_decoder legacy api build (#1006)
Fix scratchpad allocation in CropMirrorNormalize (#1000)
Fix Resize ratio calculation (#997)
Add missing device guard in the reader prefetch thread (#978)
Optical flow test fix (#976)
Make errors from build_helper propagate correctly (#961)
Add casting to float before normalization in SliceFlipNormalizePermute tests (#974)
Fix displacement filter (#524)
Fix output allocation in operator benchmark (#959)
Handle NULL pointer in ctypes_void_ptr (#965)
Fix error of indexing shape in Optical Flow (#1087)
Reduced batch size to avoid out of memory condition in 19.07 container

Improvements

Create pull request template (#1039)
Add environment variables to DALI TF build image (#1034)
Replace HostDecoder and nvJPEGDecoder with generic ImageDecoder (#1028)
Add a deprecation warning when a deprecated operator is used (#1030)
Expose and document fine-grained control API for pipeline run (#972)
Use TensorListShape for TensorList shape (#1025)
Rework nvidia-dali-tf-plugin build (#1007)
Span improvements. (#1032)
Add ImageDecoder operator, selecting implementation based on device argument (#995)
Removed unified memory from resampling filters. (#1026)
Add mechanism to mark an operator as deprecated in favor of another one (#1001)
Add matrix types + tests. (#1014)
Use TensorShape in dali::Tensor (#1015)
Introduce number of samples to TensorListShape (#1010)
Video reader label (#998)
Add path to json in case of error in the COCO reader (#1011)
Add vector types. (#1009)
Add no squeeze option and dynamic shape for MXNet and PyTorch plugins (#988)
Update test_python_function_operator.py (#880)
Restructure subdirectories in nvjpeg decoder (#999)
Add printing of error string enums with nvJPEG error codes (#983)
Remove deprecated __init__ usage from backend (#993)
Replace usage of NormalizePermute by CropMirrorNormalize (#994)
Remove OldCropMirrorNormalize (#992)
Optimize python operator outputs copy. (#958)
Rework how DALI handles py_buffer format string (#985)
Improve obtaining TensorFlow build flags for prebuilt DALI plugins (#963)
Replace CropMirrorNormalize with new implementation (#989)
Add COCO tfrecord support (#979)
Add test cases for Flip operator (#973)
Add NewCropMirrorNormalize GPU (#970)
Read COCO categories from json file in COCOReader (#986)
Add -std=c++14 to cuda nvcc flags in custom plugin example (#984)
Add max_size upper bound option to Resize with resize_short (#960)
Enable no-crop by default in NewCropMirrorNormalize (#977)
Change type traits to use C++14 library aliases. (#975)
Use c++14 standard (#971)
Change storage device from boolean to enum in workspace (#967)
Add new SliceFlipNormalizePermute CPU kernel. (#949)
Remove lint from the default target list (#964)
Add split_scenes and transcode_scenes doc in Superres example (#944)
Update libjpeg-turbo to 2.0.2 version (#951)
Add lint as a first-class, separate CMake target (#952)
Create test_optical_flow.py (#911)
Adjust TensorFlow ResNet50 example to 1.14 version API (#1081)
Change test prefix from L*_ to TL*_ (#1069)

Breaking API changes

CPU operators have moved from per-sample processing (each sample is processed all the way through the pipeline before the next one starts) to batch processing (all samples in a batch are processed by one operator before moving on to the next operator). This may cause a small performance degradation in some use cases. In the long term, however, it enables optimizations that are currently unavailable, as well as operations that need to see the whole batch during processing (such as random sample blending within a batch).
Deprecated _run, _share_outputs and _release_outputs in favor of schedule_run, share_outputs and release_outputs
Replaced HostDecoder and nvJPEGDecoder with the generic ImageDecoder. ImageDecoder is the recommended operator for image decoding; the old API will be removed in the future.
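
The last two changes above can be sketched together in Python. This is a minimal illustration, not an official migration recipe: `build_decode_pipeline`, `run_iterations`, the `consume` callback, and the `file_root` argument are hypothetical names introduced here, and the pipeline assumes a directory of JPEG files readable by `FileReader`. The mapping of `device="cpu"` to the former HostDecoder and `device="mixed"` to the former nvJPEGDecoder is an assumption based on ImageDecoder selecting its implementation from the device argument.

```python
def build_decode_pipeline(file_root, batch_size=8, num_threads=2, device_id=0):
    """Build a small reader + ImageDecoder pipeline (hypothetical helper)."""
    # DALI is imported lazily so that run_iterations() below can be used
    # on machines without DALI or a GPU.
    import nvidia.dali.ops as ops
    import nvidia.dali.types as types
    from nvidia.dali.pipeline import Pipeline

    class DecodePipeline(Pipeline):
        def __init__(self):
            super(DecodePipeline, self).__init__(batch_size, num_threads, device_id)
            self.reader = ops.FileReader(file_root=file_root)
            # Assumption: device="cpu" stands in for HostDecoder,
            # device="mixed" stands in for nvJPEGDecoder.
            self.decode = ops.ImageDecoder(device="mixed", output_type=types.RGB)

        def define_graph(self):
            jpegs, labels = self.reader()
            return self.decode(jpegs), labels

    pipe = DecodePipeline()
    pipe.build()
    return pipe


def run_iterations(pipe, iterations, consume):
    """Drive `pipe` with schedule_run/share_outputs/release_outputs."""
    results = []
    for _ in range(iterations):
        pipe.schedule_run()               # start processing one batch
        outputs = pipe.share_outputs()    # get outputs; buffers are not released yet
        results.append(consume(outputs))  # use or copy the data while it is valid
        pipe.release_outputs()            # return the buffers to the pipeline
    return results
```

A call such as `run_iterations(build_decode_pipeline("/path/to/images"), 10, consume)` would then process ten batches, where the path and `consume` are placeholders. Anything `consume` returns should not reference the shared buffers after `release_outputs` has been called.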

Known issues:

The new video reader operator requires NVIDIA Video Codec SDK support on the platform. NVIDIA GPU Cloud (NGC) optimized containers lack this functionality in the default configuration prior to 19.01. To enable it, run the container with the ‘video’ capability enabled, i.e.:
-e "NVIDIA_DRIVER_CAPABILITIES=compute,utility,video"
The video loader operator requires that key frames occur at least every 10 to 15 frames of the video stream. If key frames occur less frequently, the returned frames may be out of sync.
The DALI TensorFlow plugin may not be compatible with the TensorFlow 1.14.0 release. The plugin requires that the gcc compiler matching the one used to build TensorFlow (gcc 4.8.4 or gcc 4.8.5, depending on the particular version) be present on the system.
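
For the video capability issue above, a quick check from inside the container can confirm whether the environment variable was passed through. The following is a minimal sketch: the helper name `video_capability_enabled` is hypothetical, and treating the value `all` as including `video` is an assumption about the NVIDIA container runtime rather than something stated in these notes.

```python
import os


def video_capability_enabled(value=None):
    """Return True if NVIDIA_DRIVER_CAPABILITIES appears to grant 'video'."""
    if value is None:
        # Read the variable set via `-e NVIDIA_DRIVER_CAPABILITIES=...`
        value = os.environ.get("NVIDIA_DRIVER_CAPABILITIES", "")
    capabilities = {item.strip().lower() for item in value.split(",") if item.strip()}
    # Assumption: "all" enables every driver capability, including video.
    return "video" in capabilities or "all" in capabilities


if __name__ == "__main__":
    if video_capability_enabled():
        print("video capability is enabled")
    else:
        print("video capability is missing; the video reader may not work")
```

This only inspects the environment variable; it does not verify that the video codec libraries are actually available in the container.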

Binary builds

Install via pip for CUDA 9:
pip install --extra-index-url http://developer.download.nvidia.com/compute/redist/cuda/9.0 nvidia-dali==0.12.0
or for CUDA 10:
pip install --extra-index-url http://developer.download.nvidia.com/compute/redist/cuda/10.0 nvidia-dali==0.12.0

Or use direct download links (CUDA 9.0):

Or use direct download links (CUDA 10.0):

FFmpeg source code:

This software uses code of FFmpeg licensed under the LGPLv2.1 and its source can be downloaded here
