The Implementing A DelayedArray Backend vignette only covers how to implement a backend for read access only. Backends that support saving DelayedArray objects (a.k.a. realization backends) are not covered yet. Reasons for this are various: no demand so far, exact procedure still kind of a work-in-progress and subject to changes, lack of time, etc...
In the meantime, I'm putting some material here (and will move it to the Implementing A DelayedArray Backend vignette as time allows).
Say we want to implement a realization backend for the ADS format (the imaginary format made up for the Implementing A DelayedArray Backend vignette). The 2 core things we need to implement are:
1. A RealizationSink subclass for the ADS backend. RealizationSink is a virtual class defined in the DelayedArray package (in R/RealizationSink-class.R). By analogy with the HDF5RealizationSink class defined in the HDF5Array package (in R/writeHDF5Array.R), let's assume that the RealizationSink subclass for the ADS backend will be called ADSRealizationSink.
2. Coercion methods from ADSRealizationSink to ADSArraySeed, ADSArray, and DelayedArray.
RealizationSink subclass
The purpose of an ADSRealizationSink object is to point to a new ADS dataset and allow writing blocks of data to it. The class definition for ADSRealizationSink would typically look something like:
    setClass("ADSRealizationSink",
        contains="RealizationSink",
        representation(
            dim="integer",      # Naming this slot "dim" makes dim() work out of the box.
            dimnames="list",
            type="character",   # Single string.
            ## Additional slots would typically include the path or connection to a file....
            ...
        )
    )
Then we need a constructor function for these objects. The constructor should be named after the class and its first 3 arguments should be dim, dimnames, and type. It can have more arguments, but those must be optional: calling ADSRealizationSink() with the first 3 arguments only (i.e. ADSRealizationSink(dim, dimnames, type)) should work. Furthermore, every call to ADSRealizationSink() should create a new dataset that is ready to be written to.
ADSRealizationSink objects must support the following operations (via defining appropriate methods):
- dim(), dimnames(), and type(). These should return the values that were passed to the call to ADSRealizationSink() that was used to create the object.
- write_block(). This is a generic defined in the DelayedArray package in R/read_block.R.
- close(). This base R S3 generic is promoted to an S4 generic in the DelayedArray package in R/RealizationSink-class.R. A default method for RealizationSink objects is provided and does nothing (no-op). Implement a method for ADSRealizationSink objects only if some connection needs to be closed and/or other resources need to be released after writing the data is complete and before the ADSRealizationSink object can be turned into an ADSArraySeed object for reading.
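To make these requirements a bit more concrete, here is a minimal sketch of a sink that writes to an in-memory array instead of a file. ToyRealizationSink and its env slot are made up for this illustration (a real ADSRealizationSink would hold a path or connection instead). The sketch assumes a version of DelayedArray where write_block() has the signature (sink, viewport, block) and where start() and end() work on an ArrayViewport object.

```r
library(DelayedArray)

## Hypothetical in-memory stand-in for ADSRealizationSink.
setClass("ToyRealizationSink",
    contains="RealizationSink",
    representation(
        dim="integer",
        dimnames="list",
        type="character",
        env="environment"    # assumption: holds the array being written
    )
)

## Named after the class; first 3 arguments are dim, dimnames, and type.
## Every call creates a new, empty "dataset" that is ready to be written to.
ToyRealizationSink <- function(dim, dimnames=NULL, type="double")
{
    dim <- as.integer(dim)
    if (is.null(dimnames))
        dimnames <- vector("list", length(dim))
    env <- new.env(parent=emptyenv())
    env$data <- array(vector(type, prod(dim)), dim=dim)
    new("ToyRealizationSink", dim=dim, dimnames=dimnames, type=type, env=env)
}

setMethod("dimnames", "ToyRealizationSink", function(x) x@dimnames)
setMethod("type", "ToyRealizationSink", function(x) x@type)

## Write one block at the position described by the viewport.
setMethod("write_block", "ToyRealizationSink",
    function(sink, viewport, block)
    {
        s <- start(viewport)
        e <- end(viewport)
        Nindex <- lapply(seq_along(s), function(i) s[i]:e[i])
        data <- get("data", envir=sink@env)
        data <- do.call(`[<-`,
                        c(list(data), Nindex, list(value=as.array(block))))
        assign("data", data, envir=sink@env)
        sink
    }
)

## Write a 3 x 2 block into columns 2-3 of a new 3 x 4 integer sink.
sink <- ToyRealizationSink(c(3L, 4L), type="integer")
viewport <- ArrayViewport(c(3L, 4L), IRanges(c(1, 2), c(3, 3)))
sink <- write_block(sink, viewport, matrix(1:6, nrow=3))
```

No close() method is defined here because the default no-op is enough: nothing needs to be released once writing is complete.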
Coercion methods from ADSRealizationSink to ADSArraySeed, ADSArray, and DelayedArray
From ADSRealizationSink to ADSArraySeed
Think of an ADSRealizationSink object as a "write" connection to an ADS dataset. Think of an ADSArraySeed object as a "read only" connection to an ADS dataset. The purpose of the coercion from ADSRealizationSink to ADSArraySeed is to change the nature of this connection from "write" to "read only" and to produce an object that can be wrapped into a DelayedArray object.
From ADSRealizationSink to ADSArray and DelayedArray
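A basic example
Once we have the above (i.e. ADSRealizationSink objects, ADSRealizationSink() constructor, and the 3 coercion methods), we can realize an arbitrary DelayedArray object x as a new pristine ADSArray object x2 with a small helper function, called realize_as_ADSArray() here, that creates a sink, writes x to it block by block, closes it, and coerces it to ADSArray.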
DelayedArray:::BLOCK_write_to_sink() reads blocks from x, realizes them in memory, and writes them to sink (with write_block()).
Note that realize_as_ADSArray() also works on an ordinary array or any array-like object that supports extract_array(), not just on a DelayedArray object.
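The sketch below illustrates the overall create/write/close/coerce pattern. It is not the ADS implementation: it uses a hypothetical in-memory sink (ToyRealizationSink, made up for this example), coerces directly to DelayedArray, and spells out the block loop by hand with exported functions rather than calling the internal DelayedArray:::BLOCK_write_to_sink() helper. It assumes a version of DelayedArray that provides defaultAutoGrid() and where write_block() has the signature (sink, viewport, block).

```r
library(DelayedArray)

## Condensed, hypothetical in-memory sink standing in for ADSRealizationSink.
setClass("ToyRealizationSink",
    contains="RealizationSink",
    representation(dim="integer", dimnames="list", type="character",
                   env="environment")
)

ToyRealizationSink <- function(dim, dimnames=NULL, type="double")
{
    dim <- as.integer(dim)
    env <- new.env(parent=emptyenv())
    env$data <- array(vector(type, prod(dim)), dim=dim, dimnames=dimnames)
    if (is.null(dimnames))
        dimnames <- vector("list", length(dim))
    new("ToyRealizationSink", dim=dim, dimnames=dimnames, type=type, env=env)
}

setMethod("write_block", "ToyRealizationSink",
    function(sink, viewport, block)
    {
        s <- start(viewport)
        e <- end(viewport)
        Nindex <- lapply(seq_along(s), function(i) s[i]:e[i])
        data <- get("data", envir=sink@env)
        data <- do.call(`[<-`,
                        c(list(data), Nindex, list(value=as.array(block))))
        assign("data", data, envir=sink@env)
        sink
    }
)

## Switch from "write" to "read" by wrapping the written data.
setAs("ToyRealizationSink", "DelayedArray",
    function(from) DelayedArray(get("data", envir=from@env))
)

realize_as_toy <- function(x, grid=defaultAutoGrid(x))
{
    sink <- ToyRealizationSink(dim(x), dimnames(x), type(x))
    for (bid in seq_along(grid)) {
        viewport <- grid[[bid]]
        block <- read_block(x, viewport)    # realize one block in memory
        sink <- write_block(sink, viewport, block)
    }
    close(sink)    # default no-op method for RealizationSink objects
    as(sink, "DelayedArray")
}

## Realize a DelayedArray object that carries a delayed operation,
## using a grid of 2 x 2 blocks so that several blocks get written.
x <- DelayedArray(matrix(1:12, nrow=3)) + 1L
x2 <- realize_as_toy(x, grid=RegularArrayGrid(dim(x), spacings=c(2L, 2L)))
```

A file-based backend would follow the same steps, with the sink constructor creating the file and the coercion returning an object that reads from it.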
Add some convenience
Now we can build some convenience on top of this.
One basic convenience is a coercion method from ANY to ADSArray that just does what realize_as_ADSArray() does.
Unfortunately, trying to coerce a DelayedArray or DelayedMatrix object to ADSArray would produce a broken object unless we also define explicit coercion methods from DelayedArray and DelayedMatrix to ADSArray.
So in the same way that an array-like object x (ordinary array or DelayedArray object) can be realized as an HDF5Array or RleArray object with as(x, "HDF5Array") or as(x, "RleArray"), now it can also be realized as an ADSArray object with as(x, "ADSArray").
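For comparison, this is what that user-facing coercion looks like with the RleArray backend, which ships with the DelayedArray package. The input data is made up for the example.

```r
library(DelayedArray)

## A small DelayedMatrix object carrying a delayed operation.
m <- matrix(rep(c(0L, 5L), each=6), nrow=4)
x <- DelayedArray(m) * 2L

## Realize it with the RleArray backend via as().
x2 <- as(x, "RleArray")
```

An ADS backend with the coercion methods described above would be used the same way, with as(x, "ADSArray").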
Real examples of realization backends
Refer to R/writeHDF5Array.R and R/writeTENxMatrix.R in the HDF5Array package for the implementation of the HDF5Array and TENxMatrix realization backends.
Note that you can use supportedRealizationBackends() to see the list of realization backends currently supported.