HDF5 Image Metadata Standardization

Petr Baudis edited this page Sep 11, 2013 · 14 revisions

##Introduction

NEMALOAD handles large volumes of image data, so a common storage convention is needed. This document specifies the standard metadata layout for HDF5 image files. It is subject to change.

##General Structure

###Filename

The HDF5 file should be created with the same base name as the original file, with the original extension replaced by .hdf5.
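As a minimal sketch of this naming rule, the helper below derives the HDF5 filename with Python's `pathlib`. The function name `hdf5_name` is hypothetical. It treats only the last extension as "the original extension" (so `stack.ome.tif` becomes `stack.ome.hdf5`), which is an assumption rather than something this specification states.

```python
from pathlib import Path


def hdf5_name(original: str) -> str:
    """Return the HDF5 filename for an original image file (hypothetical helper)."""
    # with_suffix() swaps only the final extension; any directory part is kept.
    return str(Path(original).with_suffix(".hdf5"))


print(hdf5_name("capture_001.tif"))  # capture_001.hdf5
```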

###Group

All datasets and subgroups containing image frames should reside in a group called images. Other groups may exist, containing auxiliary information.

###Attributes

All files, regardless of image type, shall store the following attributes on the images group:

  • createdAt: A string containing the time of file creation, in the format YYYY-MM-DDTHH:MM:SS[timezone] (must conform to ISO 8601).
  • numFrames: The number of frames, as an integer.
  • originalName: A string containing the name of the original file from which the data was derived, including its original extension.
  • opticalSystem: A string identifying the type of data; at this time either LS (light-sheet) or LF (light-field).
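The following sketch shows one way the four required attributes could be assembled before they are written to the images group. The helper name `common_attrs` and its parameters are hypothetical. Using UTC for the timestamp is also an assumption, since the specification only requires ISO 8601.

```python
from datetime import datetime, timezone
from pathlib import Path

OPTICAL_SYSTEMS = {"LS", "LF"}  # light-sheet, light-field


def common_attrs(original_path, num_frames, optical_system, created=None):
    """Build the attributes required on the images group (hypothetical helper)."""
    if optical_system not in OPTICAL_SYSTEMS:
        raise ValueError(f"opticalSystem must be one of {sorted(OPTICAL_SYSTEMS)}")
    if created is None:
        # ASSUMPTION: timestamps are recorded in UTC.
        created = datetime.now(timezone.utc)
    return {
        # ISO 8601 with an explicit offset, e.g. 2013-09-11T12:00:00+00:00
        "createdAt": created.replace(microsecond=0).isoformat(),
        "numFrames": int(num_frames),
        # File name only (no directory), keeping the original extension.
        "originalName": Path(original_path).name,
        "opticalSystem": optical_system,
    }


attrs = common_attrs("raw/capture_001.tif", 120, "LF")
for key, value in attrs.items():
    print(f"{key}: {value}")
```

If h5py is used to write the file, a dictionary like this could be copied onto the group through its `attrs` mapping.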

##Light-field images

###Image group

The light-field image datasets shall reside within the images group. The images group carries the following optical parameter attributes:

  • op_pitch: The microlens array pitch in µm.
  • op_flen: The microlens focal length in µm.
  • op_mag: The objective magnification, as a multiplier.
  • op_na: The objective numerical aperture (NA).
  • op_medium: The refractive index of the medium.

###Frame dataset

The frame datasets are stored in the images group and named by their sequence number, starting with zero. Each dataset is a 2D matrix representing a grayscale image of the light-field capture.

Note that our tools interpret this dataset in an unusual way: before the lens grid is interpreted, the X and Y axes are swapped, which effectively transposes the matrix.
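To make the layout concrete, here is a minimal sketch that writes two light-field frames with h5py and NumPy and then reads one back with the axes swapped. The choice of h5py, the in-memory file, the `uint16` frame type, and all attribute values are assumptions made for illustration. None of them is mandated by this specification or taken from real calibration data.

```python
import h5py
import numpy as np

# In-memory file so the sketch leaves nothing on disk; a real file would be
# opened with a path such as "capture_001.hdf5" and no driver arguments.
with h5py.File("sketch.hdf5", "w", driver="core", backing_store=False) as f:
    images = f.create_group("images")

    # Optical parameters (placeholder values only).
    images.attrs["op_pitch"] = 150.0   # microlens array pitch, um
    images.attrs["op_flen"] = 3000.0   # microlens focal length, um
    images.attrs["op_mag"] = 40.0      # objective magnification
    images.attrs["op_na"] = 0.95       # objective NA
    images.attrs["op_medium"] = 1.0    # refractive index of the medium

    # Frames are named by sequence number, starting with zero.
    for n in range(2):
        raw = np.full((4, 6), n, dtype=np.uint16)
        images.create_dataset(str(n), data=raw)

    frame_names = sorted(images.keys(), key=int)
    raw_shape = images["0"].shape
    # Swap the axes before any lens-grid interpretation.
    transposed_shape = images["0"][()].T.shape

print(frame_names, raw_shape, transposed_shape)
```

Sorting the names with `key=int` keeps frames in sequence order once there are ten or more of them, since plain string ordering would place "10" before "2".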

###Autorectification group

Rectification information describes the lens grid configuration in the light-field capture. The lens grid is rectangular and regular, but may be rotated around its center and have arbitrary spacing in both directions (though we assume only small tilt and similar spacing in both directions).

The lens grid configuration is usually determined automatically, and its parameters are stored in the autorectification group. The group holds no dataset, but carries the following attributes:

  • x_offset: X coordinate of the center of the central lenslet in the (not yet cropped) light-field capture.
  • y_offset: Y coordinate of the center of the central lenslet in the (not yet cropped) light-field capture.
  • right_dx: X displacement from the center of a lenslet to the center of its neighbor on the right.
  • right_dy: Y displacement from the center of a lenslet to the center of its neighbor on the right.
  • down_dx: X displacement from the center of a lenslet to the center of its neighbor below.
  • down_dy: Y displacement from the center of a lenslet to the center of its neighbor below.

These coordinates are interpreted after the axes have been transposed. Therefore, right_dx actually moves along the Y axis in the raw HDF5 data matrix.
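The sketch below illustrates how these six attributes could be combined to locate any lenslet center. It assumes the grid is generated by adding integer multiples of the "right" and "down" step vectors to the central lenslet position, which follows from the regular-grid description above but is not spelled out as a formula in this specification. The class `LensGrid`, its methods, and the numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LensGrid:
    """Attributes of the autorectification group."""
    x_offset: float
    y_offset: float
    right_dx: float
    right_dy: float
    down_dx: float
    down_dy: float

    def center(self, col: int, row: int) -> tuple:
        """Center of the lenslet `col` steps right and `row` steps down of the
        central lenslet, in transposed-image coordinates."""
        x = self.x_offset + col * self.right_dx + row * self.down_dx
        y = self.y_offset + col * self.right_dy + row * self.down_dy
        return (x, y)

    def center_raw(self, col: int, row: int) -> tuple:
        """The same point in the raw HDF5 matrix, where X and Y are swapped."""
        x, y = self.center(col, row)
        return (y, x)


# Placeholder values for illustration only.
grid = LensGrid(x_offset=512.0, y_offset=384.0,
                right_dx=16.0, right_dy=0.5,
                down_dx=-0.5, down_dy=16.0)
print(grid.center(0, 0))      # (512.0, 384.0)
print(grid.center(1, 0))      # (528.0, 384.5)
print(grid.center_raw(1, 0))  # (384.5, 528.0)
```

In the last line the step to the right shows up mainly in the second (Y) component of the raw coordinates, matching the note about right_dx above.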

This group is optional, but any tool that needs to interpret the light-field capture will require it for operation.

###Crop Window group

In some cases the edge of the capture dataset contains irrelevant or noisy information (e.g. a ragged edge) and it is desirable to crop it for the purpose of further automated processing.

In that case, the crop window coordinates may be stored in the cropwindow group. This group holds no dataset, but carries the following attributes:

  • x0: X coordinate of the top left corner of the crop window.
  • y0: Y coordinate of the top left corner of the crop window.
  • x1: X coordinate of the bottom right corner of the crop window.
  • y1: Y coordinate of the bottom right corner of the crop window.

These coordinates are interpreted before the axes are transposed: (x0, y0) is the top left corner in the original HDF5 data matrix.

This group is completely optional; the whole dataset will be processed if no crop window is set.
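A crop window could be applied to a raw frame roughly as follows. This sketch rests on two assumptions that the specification does not state: that X indexes columns and Y indexes rows of the raw matrix, and that the bottom right corner (x1, y1) is exclusive. The helper name `crop_frame` is hypothetical.

```python
import numpy as np


def crop_frame(raw, window=None):
    """Crop a raw (untransposed) frame; return it unchanged if no window is set."""
    if window is None:
        return raw
    x0, y0, x1, y1 = (int(window[k]) for k in ("x0", "y0", "x1", "y1"))
    # ASSUMPTION: rows are Y, columns are X, and (x1, y1) is exclusive.
    return raw[y0:y1, x0:x1]


raw = np.arange(48).reshape(6, 8)
cropped = crop_frame(raw, {"x0": 1, "y0": 2, "x1": 7, "y1": 5})
print(cropped.shape)          # (3, 6)
print(crop_frame(raw).shape)  # (6, 8)
```

Because the window is defined on the untransposed matrix, cropping here happens before any axis swap.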

##Light-sheet images

###Images group

The light-sheet image datasets also reside within the images group.

###Channel subgroup

Within the images group there is one subgroup per channel. Channel 0 corresponds to green and contains a Ca2+ marker corresponding to neuron activity; channel 1 corresponds to red and reflects statically stained cells.

###Chunk subgroup

Each channel contains one subgroup per chunk. A chunk is a time-localized depth sweep, that is, the data that would produce a single "video frame" after 3D reconstruction. All light-sheet chunk groups shall carry the following attribute:

  • ls_chunk_filename: The original filename of the chunk, including its original TIFF file extension.

###Frame dataset

Each chunk contains a set of frame datasets, each a 2D grayscale map representing a particular channel of a fixed-depth slice of the sample. All light-sheet frame datasets shall carry the following attributes:

  • ls_ver: The light-sheet version type.
  • ls_n: The frame number.
  • ls_offset: The offset of the frame.
  • ls_channel: The channel of the frame.
  • ls_time: The time of the frame.
  • ls_z_request: The requested z value (regular).
  • ls_z_measured: The measured z value (should theoretically match the requested z value, but most images have poor calibration).
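A simple conformance check for the light-sheet attributes might look like the sketch below. It works on any mapping, so a plain dictionary is used here. The helper name `missing_attrs` and the sample values and types are hypothetical, as the specification does not define value types for these attributes.

```python
CHUNK_ATTRS = ("ls_chunk_filename",)
FRAME_ATTRS = ("ls_ver", "ls_n", "ls_offset", "ls_channel",
               "ls_time", "ls_z_request", "ls_z_measured")


def missing_attrs(attrs, required):
    """Return the required attribute names that are absent from a mapping."""
    return [name for name in required if name not in attrs]


# Placeholder frame attributes; values and types are illustrative only.
frame_attrs = {"ls_ver": 1, "ls_n": 0, "ls_offset": 0, "ls_channel": 0,
               "ls_time": 0.0, "ls_z_request": 10.0}
print(missing_attrs(frame_attrs, FRAME_ATTRS))  # ['ls_z_measured']
print(missing_attrs({"ls_chunk_filename": "chunk_000.tif"}, CHUNK_ATTRS))  # []
```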