Commit

Rename all the things.
Tony Tung committed May 16, 2018
1 parent 413e566 commit e39f1ae
Showing 15 changed files with 191 additions and 186 deletions.
58 changes: 29 additions & 29 deletions README.rst
Expand Up @@ -5,12 +5,12 @@ Sliced imaging data
Background
==========

If we want to store imaging data on the cloud and allow scientists to experiment with this data with interactive local tools (e.g., Jupyter notebooks), we should provide an easy interface to retrieve this data. We should future-proof this model for extremely large images with multiple dimensions, where users may want to pull slices of this data without having to download the entire image.
If we want to store imaging data on the cloud and allow scientists to experiment with this data with interactive local tools (e.g., Jupyter notebooks), we should provide an easy interface to retrieve this data. We should future-proof this model for extremely large image sets with multiple dimensions, where users may want to pull slices of this data without having to download the entire image set.

Design
------

Images will be stored in a tiled format such that ranged requests can be used to efficiently fetch slices of the data. The tiles of the image are described by a manifest, which is itself broken up into multiple files for easy consumption.
An image set will be stored in a tiled format such that ranged requests can be used to efficiently fetch slices of the data. The tiles of the image set are described by a manifest, which is itself broken up into multiple files for easy consumption.

There should be a python API that allows users to point at an image set, ranges across multiple dimensions, and yields the data in numpy format. The python API should retrieve the table of contents, calculate the objects needed, fetch them in parallel, decode them, and slice out the data needed.

Expand All @@ -19,40 +19,41 @@ Locating a tile

The location of each tile is given in coordinates and indices. Coordinates give the location of the tile in geometric space, and indices give its location in non-geometric space. Together, coordinates and indices resolve exactly where the tile is in the n-dimensional space.

Format
------
Manifest
--------

Each image should have a manifest, which is a hierarchical tree of JSON table-of-contents documents. The leaf documents (`Image partition`_) contain a list of tiles. The non-leaf documents (`TOC partition`_) contain a map from an arbitrary unique name (within the space of the entire image) to relative paths or URLs containing either other `TOC partitions`__ or `image partitions`__.
Each image set is described by a manifest, which is a hierarchical tree of JSON table-of-contents documents. The leaf documents (`tile sets`__) describe a list of tiles. The non-leaf documents (`collections`__) contain a map from an arbitrary unique name (within the space of the entire image) to relative paths or URLs containing either other `collections`__ or `tile sets`__.

__ `TOC partition`_
__ `Image partition`_
__ `Tile Set`_
__ `Collection`_
__ `Collection`_
__ `Tile Set`_

.. _`TOC partition`:
.. _`Collection`:

TOC partition
~~~~~~~~~~~~~
Collection
~~~~~~~~~~

TOC partitions should have the following fields:
A collection should have the following fields:

=================== ====== ======== =================================================================================
Field Name Type Required Description
------------------- ------ -------- ---------------------------------------------------------------------------------
version string Yes Semantic versioning of the file format.
tocs dict Yes Map of names to relative paths or URLs of `image partitions`__ or
`TOC partition`__.
contents dict Yes Map of names to relative paths or URLs of `collections`__ or `tile sets`__.
extras dict No Additional application-specific payload. The vocabulary and the schema are
uncontrolled.
=================== ====== ======== =================================================================================

__ `Image partition`_
__ `TOC partition`_
__ `Collection`_
__ `Tile Set`_

.. _`Image partition`:
.. _`Tile Set`:

Image partition
~~~~~~~~~~~~~~~
Tile Set
~~~~~~~~

Image partitions should have the following fields:
A tile set should have the following fields:

=================== ====== ======== =================================================================================
Field Name Type Required Description
Expand All @@ -61,7 +62,7 @@ version string Yes Semantic versioning of the file format.
dimensions list Yes Names of the dimensions. Dimensions must include `x` and `y`.
tiles dict Yes See Tiles_
shape dict Yes Maps each non-geometric dimension to the possible number of values for that
dimension for the tiles in this `Image partition TOC`_.
dimension for the tiles in this `Tile Set`_.
default_tile_shape tuple No Default pixel dimensions of a tile, ordered as x, y.
default_tile_format string No Default file format of the tiles.
zoom dict No See Zoom_
Expand All @@ -83,18 +84,17 @@ file string Yes Relative path to the file.
coordinates dict Yes Maps each of the dimensions in geometric space, either `x`, `y`, or `z`, to either a
single dimension value, or a tuple specifying the range for that dimension. The `x` and
`y` coordinates must be provided as ranges. Each of the dimensions here must be
specified in the `Image partition TOC`_.
specified in the `Tile Set`_.
indices dict Yes Maps each of the dimensions *not* in geometric space to the value for this tile. Each
of the dimensions here must be specified in the `Image partition TOC`_. The values of
the indices must be non-negative integers, and every value up to but not including the
maximum specified in the `shape` field of the `Image partition TOC`_ must be
represented.
of the dimensions here must be specified in the `Tile Set`_. The values of the indices
must be non-negative integers, and every value up to but not including the maximum
specified in the `shape` field of the `Tile Set`_ must be represented.
tile_shape tuple No Pixel dimensions of a tile, ordered as x, y. If this is not provided, it defaults to
`default_tile_shape` in the `Image partition TOC`_). If neither is provided, the tile
shape is inferred from actual file.
`default_tile_shape` in the `Tile Set`_). If neither is provided, the tile shape is
inferred from actual file.
tile_format string No File format of the tile. If this is not provided, it defaults to `default_tile_format`
in the `Image partition TOC`_). If neither is provided, the tile format is inferred
from actual file.
in the `Tile Set`_). If neither is provided, the tile format is inferred from actual
file.
sha256 string No SHA256 checksum of the tile data.
extras dict No Additional application-specific payload. The vocabulary and the schema are
uncontrolled.
Expand Down
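
The `indices` rule in the README tables above (non-negative integers, with every value below the maximum given in `shape` represented) could be checked roughly as follows. This is only a sketch under stated assumptions: the helper `missing_indices`, the dimension names `hyb` and `ch`, the file names, and the version string are hypothetical; `tiles` is assumed to be iterable as a list of tile dicts; and the rule is read as applying to each dimension separately. It is not the library's own validation code.

```python
def missing_indices(tileset):
    """Return {dimension: sorted list of index values no tile uses}.

    Hypothetical helper: `tileset` is a plain dict shaped like the
    tile set document described in the README tables.
    """
    missing = {}
    for dim, size in tileset["shape"].items():
        # Collect every index value that some tile uses for this dimension.
        seen = {tile["indices"][dim] for tile in tileset["tiles"]}
        absent = sorted(set(range(size)) - seen)
        if absent:
            missing[dim] = absent
    return missing


# Hypothetical tile set: `shape` promises 3 values of "hyb" and 2 of "ch",
# but no tile carries hyb == 2.
tileset = {
    "version": "0.0.0",  # assumed value, for illustration only
    "dimensions": ["x", "y", "hyb", "ch"],
    "shape": {"hyb": 3, "ch": 2},
    "tiles": [
        {"file": "t0.tiff",
         "coordinates": {"x": [0.0, 1.0], "y": [0.0, 1.0]},
         "indices": {"hyb": 0, "ch": 0}},
        {"file": "t1.tiff",
         "coordinates": {"x": [0.0, 1.0], "y": [0.0, 1.0]},
         "indices": {"hyb": 0, "ch": 1}},
        {"file": "t2.tiff",
         "coordinates": {"x": [0.0, 1.0], "y": [0.0, 1.0]},
         "indices": {"hyb": 1, "ch": 0}},
    ],
}

print(missing_indices(tileset))
```

An empty result would mean every index value promised by `shape` appears in at least one tile.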
8 changes: 4 additions & 4 deletions slicedimage/__init__.py
@@ -1,18 +1,18 @@
from __future__ import absolute_import, division, print_function, unicode_literals

from ._formats import ImageFormat
from ._imagepartition import ImagePartition
from ._collection import Collection
from ._tile import Tile
from ._tocpartition import TocPartition
from ._tileset import TileSet
from .io import Reader, Writer, v0_0_0


__all__ = [
Collection,
ImageFormat,
ImagePartition,
Reader,
Tile,
TocPartition,
TileSet,
Writer,
v0_0_0,
]
36 changes: 36 additions & 0 deletions slicedimage/_collection.py
@@ -0,0 +1,36 @@
from __future__ import absolute_import, division, print_function, unicode_literals

from ._tileset import TileSet


class Collection(object):
def __init__(self, extras=None):
self.extras = extras
self._partitions = dict()

def validate(self):
pass

def add_partition(self, name, partition):
self._partitions[name] = partition

def all_tilesets(self):
"""Return all tilesets in this collection, directly or indirectly, as (name, tileset) tuples."""
for name, partition in self._partitions.items():
if isinstance(partition, Collection):
for descendant_name, descendant_tileset in partition.all_tilesets():
yield descendant_name, descendant_tileset
elif isinstance(partition, TileSet):
yield name, partition

def find_tileset(self, name):
for partition_name, image_partition in self.all_tilesets():
if name == partition_name:
return image_partition
return None

def tiles(self, filter_fn=lambda _: True):
result = []
for partion_name, image_partition in self.all_tilesets():
result.extend(image_partition.tiles(filter_fn))
return result
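
To illustrate how the recursive traversal in `all_tilesets` behaves, here is a self-contained sketch. The two classes are simplified stand-ins that mirror the shape of the code in this diff, not the real `slicedimage` classes, and the partition names and tile names are made up. Tiles are plain strings purely for illustration.

```python
class TileSet(object):
    """Stand-in leaf node: holds a list of tiles (plain strings here)."""

    def __init__(self, tiles):
        self._tiles = list(tiles)

    def tiles(self, filter_fn=lambda _: True):
        return [tile for tile in self._tiles if filter_fn(tile)]


class Collection(object):
    """Stand-in non-leaf node: maps names to Collections or TileSets."""

    def __init__(self):
        self._partitions = dict()

    def add_partition(self, name, partition):
        self._partitions[name] = partition

    def all_tilesets(self):
        # Depth-first walk: recurse into nested collections, yield leaves.
        for name, partition in self._partitions.items():
            if isinstance(partition, Collection):
                for pair in partition.all_tilesets():
                    yield pair
            elif isinstance(partition, TileSet):
                yield name, partition

    def tiles(self, filter_fn=lambda _: True):
        result = []
        for _, tileset in self.all_tilesets():
            result.extend(tileset.tiles(filter_fn))
        return result


# Hypothetical tree: root -> {"fov_000" -> {"nuclei"}, "dots"}
root = Collection()
child = Collection()
child.add_partition("nuclei", TileSet(["n0.tiff", "n1.tiff"]))
root.add_partition("fov_000", child)
root.add_partition("dots", TileSet(["d0.tiff"]))

names = sorted(name for name, _ in root.all_tilesets())
print(names)
print(sorted(root.tiles(lambda tile: tile.startswith("n"))))
```

Note that, in this sketch as in the diff, a nested tile set is yielded under its own name only; the name of the enclosing collection (`fov_000` here) does not appear in the result.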
12 changes: 6 additions & 6 deletions slicedimage/_imagepartition.py → slicedimage/_tileset.py
@@ -1,13 +1,13 @@
from __future__ import absolute_import, division, print_function, unicode_literals


from ._typeformatting import format_imagepartition_dimensions, format_imagepartition_shape
from ._typeformatting import format_tileset_dimensions, format_tileset_shape


class ImagePartition(object):
class TileSet(object):
def __init__(self, dimensions, shape, default_tile_shape=None, default_tile_format=None, extras=None):
self.dimensions = format_imagepartition_dimensions(dimensions)
self.shape = format_imagepartition_shape(shape)
self.dimensions = format_tileset_dimensions(dimensions)
self.shape = format_tileset_shape(shape)
self.default_tile_shape = tuple() if default_tile_shape is None else tuple(default_tile_shape)
self.default_tile_format = default_tile_format
self.extras = {} if extras is None else extras
Expand All @@ -23,8 +23,8 @@ def add_tile(self, tile):

def tiles(self, filter_fn=lambda _: True):
"""
Return the tiles in this image partition. If a filter_fn is provided, only the tiles for which filter_fn
returns True are returned.
Return the tiles in this tileset. If a filter_fn is provided, only the tiles for which filter_fn returns True
are returned.
"""
for tile in self._tiles:
if filter_fn(tile):
Expand Down
36 changes: 0 additions & 36 deletions slicedimage/_tocpartition.py

This file was deleted.

9 changes: 5 additions & 4 deletions slicedimage/_typeformatting.py
Expand Up @@ -17,16 +17,17 @@ def format_tile_dimensions(tile_dimensions):
return result


def format_imagepartition_dimensions(imagepartition_dimensions):
def format_tileset_dimensions(tileset_dimensions):
"""
Given an iterable of strings or enums, return a frozenset consisting of the same values, except all converted to
strings.
"""
return frozenset(_str_or_enum_to_str(imagepartition_dimension)
for imagepartition_dimension in imagepartition_dimensions)
return frozenset(
_str_or_enum_to_str(tileset_dimension)
for tileset_dimension in tileset_dimensions)


def format_imagepartition_shape(d):
def format_tileset_shape(d):
"""
Given a dictionary mapping keys to values, where the keys may either be strings or enums, return a new dictionary
where the keys are all converted to strings.
Expand Down
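
The docstrings above describe converting a mix of strings and enums into strings. The helper `_str_or_enum_to_str` is not shown in this diff, so the sketch below is a guess at the general idea rather than the actual implementation: the `Dim` enum, the function names, and the choice of using the enum's `value` are all assumptions.

```python
import enum


class Dim(enum.Enum):
    """Hypothetical enum of dimension names, for illustration only."""
    X = "x"
    Y = "y"


def str_or_enum_to_str(value):
    # Assumption: an enum member is represented by its `value`.
    return value.value if isinstance(value, enum.Enum) else value


def format_dimensions(dimensions):
    """Return a frozenset of the dimensions, all converted to strings."""
    return frozenset(str_or_enum_to_str(dim) for dim in dimensions)


def format_shape(shape):
    """Return a new dict with the keys of `shape` converted to strings."""
    return {str_or_enum_to_str(key): size for key, size in shape.items()}


print(format_dimensions([Dim.X, "y", "hyb"]))
print(format_shape({Dim.X: 4, "hyb": 2}))
```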
12 changes: 7 additions & 5 deletions slicedimage/cli/checksum.py
Expand Up @@ -8,9 +8,11 @@
class ChecksumCommand(CliCommand):
@classmethod
def register_parser(cls, subparser_root):
checksum_command = subparser_root.add_parser("checksum", help="Read a TOC file and add missing checksums.")
checksum_command.add_argument("in_url", help="URL for the source TOC file")
checksum_command.add_argument("out_path", help="Path to write TOC file with checksums")
checksum_command = subparser_root.add_parser(
"checksum",
help="Read a partition file and add missing checksums.")
checksum_command.add_argument("in_url", help="URL for the source partition file")
checksum_command.add_argument("out_path", help="Path to write partition file with checksums")
checksum_command.add_argument("--pretty", action="store_true", help="Pretty-print the output file")

return checksum_command
Expand All @@ -24,11 +26,11 @@ def run_command(cls, args):
slicedimage,
args.out_path,
pretty=args.pretty,
tile_opener=identity_file_namer,
tile_opener=fake_file_opener,
tile_writer=null_writer)


def identity_file_namer(toc_path, tile, ext):
def fake_file_opener(partition_path, tile, ext):
class fake_handle(object):
def __init__(self, name):
self.name = name
Expand Down
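
The `checksum` command is described as adding missing checksums, and the README defines `sha256` as the SHA256 checksum of the tile data. How the command computes it is not visible in this diff. A minimal sketch with the standard `hashlib` module might look like the following, where the helper names `sha256_of` and `add_missing_checksum` and the dict-shaped tile are hypothetical.

```python
import hashlib
import io


def sha256_of(handle, chunk_size=65536):
    """Hash a binary file-like object in chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def add_missing_checksum(tile, handle):
    """Fill in tile["sha256"] only if the tile does not already have one."""
    if "sha256" not in tile:
        tile["sha256"] = sha256_of(handle)
    return tile


# Hypothetical tile record; in-memory bytes stand in for a real tile file.
tile = {"file": "t0.tiff"}
add_missing_checksum(tile, io.BytesIO(b"example tile bytes"))
print(tile["sha256"])
```

Reading in chunks keeps memory use bounded for large tiles; a tile that already carries a checksum is left untouched.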