High level dataset API overview
This wiki page gives you a basic understanding of the high-level dataset API.
For more detailed examples, see webknossos-cuber/tests/test_dataset.py.
There are three dataset types (WKDataset, TiffDataset, and TiledTiffDataset), which support a very similar interface.
The main difference between the TiffDataset and the TiledTiffDataset is that the TiffDataset stores each z-layer in a single image, whereas the TiledTiffDataset divides each z-layer into multiple images.
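To make the difference concrete, the following sketch counts the image files needed for one z-layer under each scheme. `images_per_z_layer` is a hypothetical helper, not part of wkcuber. It assumes tiles lie on a regular grid that starts at the origin and that partially filled edge tiles still get their own file.

```python
import math
from typing import Optional, Tuple


def images_per_z_layer(
    layer_size_xy: Tuple[int, int],
    tile_size: Optional[Tuple[int, int]] = None,
) -> int:
    """Hypothetical helper: number of image files needed for one z-layer.

    tile_size=None models the TiffDataset case (one image per z-layer);
    an (x, y) tuple models the TiledTiffDataset case.
    """
    if tile_size is None:
        return 1
    # Round up so that partially covered tiles at the border are counted.
    tiles_x = math.ceil(layer_size_xy[0] / tile_size[0])
    tiles_y = math.ceil(layer_size_xy[1] / tile_size[1])
    return tiles_x * tiles_y


print(images_per_z_layer((250, 200)))            # 1
print(images_per_z_layer((250, 200), (32, 64)))  # 32
```

Under these assumptions, a 250 x 200 layer with a tile_size of (32, 64) is split into 8 x 4 tiles.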
The essential operations for datasets are creating, opening, reading data, and writing data. The datasource-properties.json file is updated automatically.
Here are some examples for working with the high-level dataset API:
Creating a WKDataset:
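from os import path
# Import path assumed from the wkcuber package layout; adjust it to your installed version
from wkcuber.api.Dataset import WKDataset
ds = WKDataset.create("path_to_dataset/wk_dataset", scale=(1, 1, 1))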
ds.add_layer("color", "color")
ds.get_layer("color").add_mag("1")
ds.get_layer("color").add_mag("2-2-1")
# The directories are created automatically
assert path.exists("path_to_dataset/wk_dataset/color/1")
assert path.exists("path_to_dataset/wk_dataset/color/2-2-1")
assert len(ds.properties.data_layers) == 1
assert len(ds.properties.data_layers["color"].wkw_magnifications) == 2
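The magnification names used above ("1" and "2-2-1") double as directory names. As a rough illustration of how such a name can be read, the sketch below converts it into per-axis factors. `parse_mag` is a hypothetical helper written for this page, not the library's own parser. It assumes that a single number means the same factor on all three axes and that a dashed name lists the x, y, and z factors in that order.

```python
from typing import Tuple


def parse_mag(name: str) -> Tuple[int, int, int]:
    """Hypothetical helper: turn a mag name into per-axis factors (x, y, z)."""
    parts = [int(p) for p in name.split("-")]
    if len(parts) == 1:
        # A single number is read as the same factor on every axis.
        return (parts[0], parts[0], parts[0])
    if len(parts) == 3:
        return (parts[0], parts[1], parts[2])
    raise ValueError(f"Unexpected mag name: {name!r}")


print(parse_mag("1"))      # (1, 1, 1)
print(parse_mag("2-2-1"))  # (2, 2, 1)
```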
Similar to the WKDataset, this also works for TiffDatasets:
ds = TiffDataset.create("path_to_dataset/tiff_dataset", scale=(1, 1, 1))
ds.add_layer("color", Layer.COLOR_TYPE)
ds.get_layer("color").add_mag("1")
ds.get_layer("color").add_mag("2-2-1")
...
To create a TiledTiffDataset, you also have to specify the tile_size:
ds = TiledTiffDataset.create(
    "./testoutput/TiledTiffDataset",
    scale=(1, 1, 1),
    tile_size=(32, 64),
    pattern="{xxx}/{yyy}/{zzz}.tif",
)
ds.add_layer("color", Layer.COLOR_TYPE)
ds.get_layer("color").add_mag("1")
ds.get_layer("color").add_mag("2-2-1")
...
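The pattern argument controls where each tile image is stored. The sketch below shows one plausible way a pattern such as "{xxx}/{yyy}/{zzz}.tif" could be expanded into a relative file path. `expand_pattern` is a hypothetical helper, not wkcuber's implementation, and it assumes that the number of repeated letters in a placeholder sets the zero-padding width.

```python
import re


def expand_pattern(pattern: str, x: int, y: int, z: int) -> str:
    """Hypothetical helper: fill a tile pattern with zero-padded indices."""
    values = {"x": x, "y": y, "z": z}

    def fill(match):
        letters = match.group(1)
        # The number of repeated letters sets the zero-padding width.
        return str(values[letters[0]]).zfill(len(letters))

    return re.sub(r"\{(x+|y+|z+)\}", fill, pattern)


print(expand_pattern("{xxx}/{yyy}/{zzz}.tif", 1, 2, 6))  # 001/002/006.tif
```

With a slash-separated pattern like this one, each tile ends up in a nested x/y directory, whereas an underscore-separated pattern keeps all tiles of a magnification in one directory.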
Opening datasets:
wk_ds = WKDataset("path_to_dataset/wk_dataset")
...
tiff_ds = TiffDataset("path_to_dataset/tiff_dataset")
...
tiled_tiff_ds = TiledTiffDataset("path_to_dataset/tiled_tiff_dataset")
...
Reading and writing data (this also works the same way for the TiffDataset and TiledTiffDataset):
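import numpy as np
wk_ds = WKDataset("path_to_dataset/wk_dataset")
mag = wk_ds.add_layer("another_layer", Layer.COLOR_TYPE, num_channels=3).add_mag("1")
# Random uint8 data with shape (channels, x, y, z)
data = (np.random.rand(3, 250, 250, 250) * 255).astype(np.uint8)
mag.write(data)
# Reading with size=(250, 250, 10) returns only the first 10 z-slices
assert np.array_equal(data[:, :, :, :10], mag.read(size=(250, 250, 10)))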
The high-level dataset API also introduces the concept of a View. A View is a handle to a specific bounding box in the dataset. Views can be used to read and write data. The advantage is that a View can be passed around, for example to a function that processes only that region.
wk_view = WKDataset("path_to_dataset/wk_dataset").get_view(
    "another_layer",
    "1",
    size=(32, 32, 32),
    offset=(10, 10, 10),
)
data = (np.random.rand(3, 20, 20, 20) * 255).astype(np.uint8)
wk_view.write(data)
...
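The idea of a handle that can be passed around can be illustrated without the library. The sketch below is a simplified stand-in, not the actual View class. `SketchView`, its dict-based voxel store, and `fill_corner` are all hypothetical, and the bounds check is a choice made for this sketch only.

```python
from typing import Dict, Tuple

Vec3 = Tuple[int, int, int]


class SketchView:
    """Hypothetical stand-in for a View: a bounding box over a shared voxel store."""

    def __init__(self, store: Dict[Vec3, int], offset: Vec3, size: Vec3) -> None:
        self.store = store    # shared with the "dataset" and any other views
        self.offset = offset  # absolute position of the box's corner
        self.size = size      # extent of the box along x, y, z

    def _to_absolute(self, rel: Vec3) -> Vec3:
        # In this sketch, coordinates outside the box are rejected.
        if any(r < 0 or r >= s for r, s in zip(rel, self.size)):
            raise IndexError(f"{rel} is outside the view of size {self.size}")
        return (
            self.offset[0] + rel[0],
            self.offset[1] + rel[1],
            self.offset[2] + rel[2],
        )

    def write(self, rel: Vec3, value: int) -> None:
        self.store[self._to_absolute(rel)] = value

    def read(self, rel: Vec3) -> int:
        # Voxels that were never written read as 0 in this sketch.
        return self.store.get(self._to_absolute(rel), 0)


def fill_corner(view: SketchView) -> None:
    # Works on whatever region it is handed; it needs no dataset path or offset.
    view.write((0, 0, 0), 255)


store: Dict[Vec3, int] = {}
view = SketchView(store, offset=(10, 10, 10), size=(32, 32, 32))
fill_corner(view)
print(store[(10, 10, 10)])  # 255
```

The function receiving the view only deals with coordinates relative to the box, which is what makes such handles convenient to distribute across workers or helper functions.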
The TiledTiffDataset also supports a method to return the data of a specific tile:
tiled_tiff_ds = TiledTiffDataset.create(
    "path_to_dataset/tiled_tiff_dataset",
    scale=(1, 1, 1),
    tile_size=(32, 64),
    pattern="{xxxx}_{yyyy}_{zzzz}.tif",
)
mag = tiled_tiff_ds.add_layer("color", "color").add_mag("1")
data = (np.random.rand(250, 200, 10) * 255).astype(np.uint8)
mag.write(data, offset=(5, 5, 5))
assert mag.get_tile(1, 1, 6).shape == (1, 32, 64, 1)
# get_tile returns the content of the image with the specified x-, y-, and z-value
assert np.array_equal(
    mag.get_tile(1, 2, 6)[0, :, :, 0],
    TiffReader("path_to_dataset/tiled_tiff_dataset/color/1/0001_0002_0006.tif").read(),
)
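To see which tiles a write such as the one above would touch, the following sketch computes the overlapped tile indices in the x-y plane. `tiles_touched` is a hypothetical helper, not part of wkcuber. It assumes that tile indices are zero-based and that the tile grid starts at the dataset origin.

```python
from typing import List, Tuple


def tiles_touched(
    offset_xy: Tuple[int, int],
    shape_xy: Tuple[int, int],
    tile_size: Tuple[int, int],
) -> List[Tuple[int, int]]:
    """Hypothetical helper: (tile_x, tile_y) indices overlapped by a write."""
    first_x = offset_xy[0] // tile_size[0]
    first_y = offset_xy[1] // tile_size[1]
    # The last voxel written sits at offset + shape - 1 along each axis.
    last_x = (offset_xy[0] + shape_xy[0] - 1) // tile_size[0]
    last_y = (offset_xy[1] + shape_xy[1] - 1) // tile_size[1]
    return [
        (tx, ty)
        for tx in range(first_x, last_x + 1)
        for ty in range(first_y, last_y + 1)
    ]


touched = tiles_touched(offset_xy=(5, 5), shape_xy=(250, 200), tile_size=(32, 64))
print(len(touched))       # 32
print((1, 2) in touched)  # True
```

Under these assumptions, both tiles queried with get_tile above, (1, 1) and (1, 2), lie inside the written region, and z = 6 falls within the written z-range of 5 to 14.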