Restructured the documentation for datasets.
Update #127
eugenwintersberger committed Nov 10, 2017
1 parent f24500d commit 160019d
Showing 4 changed files with 225 additions and 205 deletions.
2 changes: 2 additions & 0 deletions doc/source/users_guide/CMakeLists.txt
@@ -5,6 +5,8 @@ set(SOURCES index.rst
groups.rst
overview.rst
dataspace.rst
dataspace_custom_types.rst
dataspace_selections.rst
)

add_sphinx_source(${SOURCES})
209 changes: 4 additions & 205 deletions doc/source/users_guide/dataspace.rst
@@ -189,217 +189,16 @@ STL container to obtain all simple dataspaces in a collection sdfsdfsf
return space.type() == Type::SIMPLE;
});
Dataspace type trait
====================

When working with user-defined types, a new type trait to create a dataspace
must be provided if something other than a scalar dataspace should be
returned for this type.

As an example we consider here a trait for a 3x3 matrix type. The C++ class
template for such a class could look like this

.. code-block:: cpp

   #include <array>

   template<typename T> class Matrix
   {
     private:
       std::array<T,9> data_;
     public:
       T *data();
       const T *data() const;
   };

As a dataspace for such a type we would like a simple dataspace
of fixed shape 3x3. The type trait that must be provided could
look like this

.. code-block:: cpp

   #include <h5cpp/hdf5.hpp>

   namespace hdf5 {
   namespace dataspace {

   // Matrix is a class template, so the trait is a partial specialization.
   template<typename T> class TypeTrait<Matrix<T>>
   {
     public:
       using DataspaceType = Simple;

       static DataspaceType create(const Matrix<T> &)
       {
         return Simple({3,3});
       }

       static void *ptr(Matrix<T> &value)
       {
         return reinterpret_cast<void*>(value.data());
       }

       static const void *cptr(const Matrix<T> &value)
       {
         return reinterpret_cast<const void*>(value.data());
       }
   };

   } // namespace dataspace
   } // namespace hdf5

Selections
==========

Selections in HDF5 allow the user to read or write only specific data to or
from a file. This is particularly useful if the total size of a dataset
is too large to fit into memory or if only specific data is required
to perform a particular action.


.. figure:: ../images/hdf5_selections.svg
   :align: center
   :width: 60%

HDF5 provides two types of selections

* *hyperslabs* (:cpp:class:`hdf5::dataspace::Hyperslab`), which are
  multidimensional selections comparable to the array slicing and
  indexing features that numpy arrays provide in Python
* *point selections* (:cpp:class:`hdf5::dataspace::Points`), which allow picking
  individual elements from a dataset.

All selections derive from :cpp:class:`hdf5::dataspace::Selection`. This
class basically provides a single method to apply a selection on a dataspace.


.. attention::

   Currently only hyperslabs are implemented in *h5cpp*.


Applying a selection
--------------------

To apply a selection, use the :cpp:class:`SelectionManager`
interface provided by a :cpp:class:`Dataspace` via the public member
:cpp:member:`Dataspace::selection`.

.. figure:: ../images/hdf5_selection_manager.svg
   :align: center
   :width: 75%

A selection can be applied like this

.. code-block:: cpp

   dataspace::Dataspace file_space = dataset.dataspace();
   dataspace::Hyperslab slab(...);
   file_space.selection(dataspace::SelectionOperation::SET,slab);

.. important::

   Both :cpp:class:`Dataspace` and :cpp:class:`SelectionManager` have a
   :cpp:func:`size` method, but their return values can differ.
   If no selection is applied, both methods return the same value.
   If a selection is applied, :cpp:func:`Dataspace::size` still returns
   the total number of elements described by the dataspace while
   :cpp:func:`SelectionManager::size` returns the number of selected elements.

.. code-block:: cpp

   dataspace::Simple space({1024});
   std::cout<<space.size()<<std::endl;            // would print 1024
   std::cout<<space.selection.size()<<std::endl;  // would print 1024
   space.selection.none();
   std::cout<<space.size()<<std::endl;            // would print 1024
   std::cout<<space.selection.size()<<std::endl;  // would print 0

Multiple selections can be applied to a single dataspace. How
the different selections are combined to form the set of
selected elements is controlled by *selection operations*, which
are defined by the :cpp:enum:`SelectionOperation` enumeration.
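
To illustrate how selection operations combine an existing selection with a
new one, the following minimal sketch models selections as plain sets of
element indices. It does not use *h5cpp*. The ``Op`` enumeration, ``IndexSet``
and ``combine`` are hypothetical names, and the operations follow the
semantics of the HDF5 ``H5S_SELECT_*`` operators, on the assumption that
:cpp:enum:`SelectionOperation` mirrors them.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical model: a selection is the set of selected element indices.
using IndexSet = std::set<int>;

// Hypothetical enumeration modeled on the HDF5 H5S_SELECT_* operators.
enum class Op { SET, OR, AND, XOR, NOTB, NOTA };

// Combine the current selection with an incoming one.
IndexSet combine(Op op, const IndexSet &current, const IndexSet &incoming)
{
  IndexSet result;
  auto out = std::inserter(result, result.end());
  switch (op)
  {
    case Op::SET:   // replace the current selection
      result = incoming;
      break;
    case Op::OR:    // elements in either selection
      std::set_union(current.begin(), current.end(),
                     incoming.begin(), incoming.end(), out);
      break;
    case Op::AND:   // elements in both selections
      std::set_intersection(current.begin(), current.end(),
                            incoming.begin(), incoming.end(), out);
      break;
    case Op::XOR:   // elements in exactly one of the selections
      std::set_symmetric_difference(current.begin(), current.end(),
                                    incoming.begin(), incoming.end(), out);
      break;
    case Op::NOTB:  // elements of the current selection not in the new one
      std::set_difference(current.begin(), current.end(),
                          incoming.begin(), incoming.end(), out);
      break;
    case Op::NOTA:  // elements of the new selection not in the current one
      std::set_difference(incoming.begin(), incoming.end(),
                          current.begin(), current.end(), out);
      break;
  }
  return result;
}
```

With a current selection ``{0,1,2,3}`` and an incoming selection ``{2,3,4}``,
``Op::OR`` yields ``{0,1,2,3,4}`` while ``Op::AND`` yields ``{2,3}``.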

Hyperslab selections
--------------------

Hyperslabs allow fairly complex multidimensional selections in a dataspace.
They are characterized by four quantities

* *offset*: the starting index of the hyperslab in the dataspace
* *block*: the number of elements along each dimension of the original dataspace
  in a single block
* *count*: the number of blocks along each dimension
* *stride*: the offset between consecutive blocks.

Let's have a look at the following example with an original dataspace of shape
(9,10).

.. figure:: ../images/hyperslab_1.svg
   :align: center
   :width: 65%

The selected elements are denoted by the red rectangles. Such a hyperslab would
have the following parameters

* *offset* = [1,1]
* *block* = [1,2]
* *count* = [3,3]
* *stride* = [2,1]

To construct such a hyperslab you could use

.. code-block:: cpp

   dataspace::Simple space({9,10});
   Dimensions offset{1,1};
   Dimensions block{1,2};
   Dimensions count{3,3};
   Dimensions stride{2,1};
   dataspace::Hyperslab slab{offset,block,count,stride};

For details on how to manipulate or alter an instance of
:cpp:class:`dataspace::Hyperslab`, see the API documentation.
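
To make the roles of the four quantities more concrete, the sketch below
enumerates the elements a two-dimensional hyperslab would select. It is a
standalone model and not part of *h5cpp*: ``Index2D``, ``SlabParameters``,
``selected_along`` and ``selected_elements`` are hypothetical names. It
assumes the HDF5 convention that the stride is measured from the first
element of one block to the first element of the next, so blocks do not
overlap as long as the stride is at least as large as the block.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not h5cpp types.
using Index2D = std::array<std::size_t, 2>;

struct SlabParameters
{
  Index2D offset;  // index of the first selected element
  Index2D block;   // number of elements per block along each dimension
  Index2D count;   // number of blocks along each dimension
  Index2D stride;  // assumed: distance between the starts of adjacent blocks
};

// Indices selected along a single dimension.
std::vector<std::size_t> selected_along(const SlabParameters &slab,
                                        std::size_t dim)
{
  std::vector<std::size_t> indices;
  for (std::size_t c = 0; c < slab.count[dim]; ++c)
    for (std::size_t b = 0; b < slab.block[dim]; ++b)
      indices.push_back(slab.offset[dim] + c * slab.stride[dim] + b);
  return indices;
}

// All (row, column) pairs selected by the hyperslab, in row-major order.
std::vector<Index2D> selected_elements(const SlabParameters &slab)
{
  std::vector<Index2D> elements;
  for (std::size_t row : selected_along(slab, 0))
    for (std::size_t column : selected_along(slab, 1))
      elements.push_back(Index2D{row, column});
  return elements;
}
```

For instance, with the illustrative values ``SlabParameters{{1,1},{1,2},{3,3},{2,3}}``
(chosen for this sketch, not taken from the figure) the selected rows are 1, 3
and 5 and the selected columns are 1, 2, 4, 5, 7 and 8, which gives 18 elements.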

As this is quite a lot of code, there are two additional constructors
which cover common but much simpler selection scenarios.
The first one covers the selection of a single contiguous region of data
within the dataset. For the example above, this could look like this
.. figure:: ../images/hyperslab_2.svg
   :align: center
   :width: 65%

For this purpose there is a two-argument constructor which takes only
the *offset* and the *block*; everything else is set internally to 1.

.. code-block:: cpp

   Dimensions offset{1,1};
   Dimensions block{4,5};
   dataspace::Hyperslab{offset,block};

In some application domains such a selection would be called a
*region of interest* or *ROI*.

.. toctree::
   :maxdepth: 2

   dataspace_custom_types
   dataspace_selections

The second selection scenario is a number of blocks of size 1 along each
dimension separated by a particular stride.

.. figure:: ../images/hyperslab_3.svg
   :align: center
   :width: 65%

The constructor call for such a selection would look like this

.. code-block:: cpp

   Dimensions offset{1,1};
   Dimensions stride{2,3};
   Dimensions count{3,3};
   dataspace::Hyperslab{offset,count,stride};

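
Before applying such a selection it can be useful to check how many elements
it covers and whether it fits into the dataspace. The following standalone
sketch shows one way to compute both. ``Dims``, ``selected_size`` and ``fits``
are hypothetical helpers and not *h5cpp* functions, and the bounds check
assumes that the stride is measured from the start of one block to the start
of the next.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a list of per-dimension values.
using Dims = std::vector<std::size_t>;

// Number of selected elements: product of block * count over all dimensions.
std::size_t selected_size(const Dims &block, const Dims &count)
{
  std::size_t size = 1;
  for (std::size_t i = 0; i < block.size(); ++i)
    size *= block[i] * count[i];
  return size;
}

// True if the last selected index stays inside the shape in every dimension.
bool fits(const Dims &shape, const Dims &offset, const Dims &block,
          const Dims &count, const Dims &stride)
{
  for (std::size_t i = 0; i < shape.size(); ++i)
  {
    if (block[i] == 0 || count[i] == 0)
      continue;  // nothing is selected along this dimension
    const std::size_t last =
        offset[i] + (count[i] - 1) * stride[i] + block[i] - 1;
    if (last >= shape[i])
      return false;
  }
  return true;
}
```

For the strided selection above (blocks of size 1, *offset* ``{1,1}``, *count*
``{3,3}``, *stride* ``{2,3}``) in a dataspace of shape (9,10), this sketch
reports 9 selected elements that fit into the dataspace. Raising the count
along the first dimension to 5 would place the last block at row index 9,
which is outside the dataspace.
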
Point selections
----------------

.. todo:: write this section


56 changes: 56 additions & 0 deletions doc/source/users_guide/dataspace_custom_types.rst
@@ -0,0 +1,56 @@
==================================
Using dataspaces with custom types
==================================

When working with user-defined types, a new type trait to create a dataspace
must be provided if something other than a scalar dataspace should be
returned for this type.

As an example we consider here a trait for a 3x3 matrix type. The C++ class
template for such a class could look like this

.. code-block:: cpp

   #include <array>

   template<typename T> class Matrix
   {
     private:
       std::array<T,9> data_;
     public:
       T *data();
       const T *data() const;
   };

As a dataspace for such a type we would like a simple dataspace
of fixed shape 3x3. The type trait that must be provided could
look like this

.. code-block:: cpp

   #include <h5cpp/hdf5.hpp>

   namespace hdf5 {
   namespace dataspace {

   // Matrix is a class template, so the trait is a partial specialization.
   template<typename T> class TypeTrait<Matrix<T>>
   {
     public:
       using DataspaceType = Simple;

       static DataspaceType create(const Matrix<T> &)
       {
         return Simple({3,3});
       }

       static void *ptr(Matrix<T> &value)
       {
         return reinterpret_cast<void*>(value.data());
       }

       static const void *cptr(const Matrix<T> &value)
       {
         return reinterpret_cast<const void*>(value.data());
       }
   };

   } // namespace dataspace
   } // namespace hdf5

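
The mechanism behind such a trait can be tried out without *h5cpp*. The
following minimal sketch uses hypothetical stand-ins (``Shape``,
``ShapeTrait`` and ``shape_of`` are not library names) to show how a primary
template provides the scalar default while a partial specialization for
``Matrix<T>`` overrides it with a 3x3 shape.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dataspace: an empty dims list means "scalar".
struct Shape
{
  std::vector<std::size_t> dims;
};

// Primary template: every type is treated as a scalar by default.
template<typename T> struct ShapeTrait
{
  static Shape create(const T &) { return Shape{}; }
};

template<typename T> class Matrix
{
  private:
    std::array<T, 9> data_{};
  public:
    T *data() { return data_.data(); }
    const T *data() const { return data_.data(); }
};

// Partial specialization: any Matrix<T> is described by a fixed 3x3 shape.
template<typename T> struct ShapeTrait<Matrix<T>>
{
  static Shape create(const Matrix<T> &) { return Shape{{3, 3}}; }
  static void *ptr(Matrix<T> &value) { return value.data(); }
  static const void *cptr(const Matrix<T> &value) { return value.data(); }
};

// Generic code asks the trait and never needs to know the concrete type.
template<typename T> Shape shape_of(const T &value)
{
  return ShapeTrait<T>::create(value);
}
```

Calling ``shape_of(42)`` in this sketch yields a shape without dimensions,
while ``shape_of(Matrix<double>{})`` yields the dimensions ``{3,3}``. Generic
input and output code can therefore handle both cases through the same call.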