Release v4.4.0 · tensorflow/datasets

API:

Add PartialDecoding support, to decode only a subset of the features (for performance)
Catalog now exposes links to KnowYourData visualisations
tfds.as_numpy supports datasets with None
Datasets generated with disable_shuffling=True are now read in generation order.
Loading datasets from files now supports custom tfds.features.FeatureConnector
tfds.testing.mock_data now supports:
- non-scalar tensors with dtype tf.string
- builder_from_files and path-based community datasets
File format is now automatically restored (for datasets generated with tfds.builder(..., file_format=)).
Many new reinforcement learning datasets
Various bug fixes and internal improvements like:
- Dynamically set the number of worker threads during extraction
- Update the progress bar during download even if downloads are cached
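
As an illustration of the PartialDecoding item above, here is a minimal sketch of loading a dataset while decoding only some of its features. The helper names `partial_spec` and `load_partial` are hypothetical and not part of TFDS; the sketch assumes `tfds.decode.PartialDecoding` accepts a `{feature_name: True}` dict and is passed through the `decoders=` argument of `tfds.load`.

```python
def partial_spec(feature_names):
    """Build a {feature_name: True} dict marking the features to decode."""
    return {name: True for name in feature_names}


def load_partial(name, split, feature_names):
    """Load a dataset, decoding only the requested top-level features."""
    # Imported lazily so the helper above stays usable without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(
        name,
        split=split,
        decoders=tfds.decode.PartialDecoding(partial_spec(feature_names)),
    )
```

For example, `load_partial('mnist', 'train', ['image'])` would be expected to yield examples containing only the `image` feature ('mnist' is just an example dataset name).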

Dataset creation:

Add tfds.features.LabeledImage for semantic segmentation (like image but with additional info.features['image_label'].name label metadata)
Add float32 support for tfds.features.Image (e.g. for depth maps)
All FeatureConnectors can now have a None dimension anywhere (previously restricted to the first position).
tfds.features.Tensor() can have an arbitrary number of dynamic dimensions (Tensor(..., shape=(None, None, 3, None)))
tfds.features.Tensor can now be serialised as bytes, instead of float/int values (to allow better compression): Tensor(..., encoding='zlib')
Add a script to add TFDS metadata files to existing TFRecord files (see doc).
New guide on common implementation gotchas
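
To give an intuition for the bytes encoding mentioned above (Tensor(..., encoding='zlib')), here is a standard-library sketch of serialising float32 values as compressed bytes and restoring them. It illustrates the general idea only and is not the TFDS implementation; the `encode` and `decode` helpers are hypothetical names.

```python
import array
import zlib


def encode(values):
    """Pack the values as float32 bytes, then compress them with zlib."""
    raw = array.array('f', values).tobytes()
    return zlib.compress(raw)


def decode(blob):
    """Inverse of encode: decompress, then reinterpret the bytes as float32."""
    out = array.array('f')
    out.frombytes(zlib.decompress(blob))
    return list(out)


# A mostly constant row (e.g. from a depth map) compresses well.
values = [0.0] * 1000 + [1.5, 2.5]
blob = encode(values)
raw_size = len(values) * array.array('f').itemsize
restored = decode(blob)
```

Note that the byte string alone does not carry the tensor shape, so a real encoder also has to record the shape whenever it is not fully static.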

Thank you all for your support and contribution!
