Add conversion factor to waveform columns #1422
@rly This would be a bit more proper to utilize the
{'name': 'waveforms', 'description': waveforms_desc, 'index': 2,
'class': MeasurementData, 'unit': 'volts', 'conversion': 1., 'offset': 0.}
@rly I had assumed this logic would propagate to https://github.com/catalystneuro/hdmf/blob/dev/src/hdmf/common/table.py#L473-L480 and allow the values to be specified at `__init__` of the `UnitsTable` in this fashion, but it's not recognizing that this subclass of `VectorData` has any additional arguments - any ideas?
How would you recommend passing these values (conversion + offset and maybe units) both for the initiation of the `UnitsTable` (including cases where no units have waveforms added later), as well as manual user specification of these values (possibly during `add_unit`)?
`allow_extra=True` needs to be added to the docval of `DynamicTable.add_column`, i.e.,
{'name': 'col_cls', 'type': type, 'default': VectorData,
'doc': ('class to use to represent the column data. If table=True, this field is ignored and a '
'DynamicTableRegion object is used. If enum=True, this field is ignored and a EnumData '
'object is used.')},
allow_extra=True)
def add_column(self, **kwargs): # noqa: C901
The tests still fail with that because `conversion=1.` is being passed but a `VectorData` is expected based on the spec.
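To illustrate what an `allow_extra`-style switch changes, here is a minimal pure-Python sketch. It is a toy stand-in, not hdmf's `docval` implementation: `check_kwargs`, `ToyColumn`, and `ToyTable` are hypothetical names invented for this example. It only shows the general idea that unknown keyword arguments are either rejected or forwarded to the column class.

```python
import functools


def check_kwargs(*allowed, allow_extra=False):
    """Toy docval-like decorator (hypothetical, NOT hdmf's implementation)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, **kwargs):
            extra = set(kwargs) - set(allowed)
            # Without allow_extra, any undeclared keyword argument is an error.
            if extra and not allow_extra:
                raise TypeError(f"unrecognized argument(s): {sorted(extra)}")
            return func(self, **kwargs)
        return wrapper
    return decorator


class ToyColumn:
    """Hypothetical column class that accepts additional attributes."""
    def __init__(self, name, **extra):
        self.name = name
        self.extra = extra  # e.g. unit, conversion, offset


class ToyTable:
    """Hypothetical table whose add_column forwards extra keyword arguments."""
    def __init__(self):
        self.columns = {}

    @check_kwargs("name", "col_cls", allow_extra=True)
    def add_column(self, **kwargs):
        name = kwargs.pop("name")
        col_cls = kwargs.pop("col_cls", ToyColumn)
        # Whatever is left over is passed through to the column class.
        self.columns[name] = col_cls(name, **kwargs)
        return self.columns[name]


table = ToyTable()
column = table.add_column(name="waveforms", unit="volts", conversion=1.0, offset=0.0)
```

Passing the extra arguments through the decorator is only half of the problem discussed here: the column class and the spec still have to accept them, which is why the tests continue to fail.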
@@ -173,13 +173,14 @@ def test_add_waveforms(self):
[1, 2, 3] # spike 4
]
]
ut.add_unit(waveforms=wf1)
ut.add_unit(waveforms=wf1, unit='volts', conversion=1., offset=0.)
@rly I was wondering if it would be easiest to just have these specified either (a) the first time waveforms are passed in `add_unit`, (b) every time they are passed in `add_unit`, or (c) on the first call of `Units(...)`, with defaults set in case no waveforms are intended to be added to the table (which is also kind of what I was trying to do above).
It depends on what use case we want to support: a single conversion and offset for all units, or a conversion and offset for each unit. In my experience, the latter case, where different units need different conversion and offset values, is much less common, so I think we need not support it by default. In that case, we could do something like what has been implemented for the waveform unit and rate, where the value is set in the constructor.
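A rough sketch of the "single conversion and offset for all units" option, with the values fixed in the constructor. `ToyUnits` and its parameter names (`waveform_unit`, `waveform_conversion`, `waveform_offset`) are hypothetical and are not the PyNWB API. The linear convention `raw * conversion + offset` is also an assumption here.

```python
class ToyUnits:
    """Hypothetical table with one conversion and offset shared by all units."""

    def __init__(self, waveform_unit="volts", waveform_conversion=1.0,
                 waveform_offset=0.0):
        # Table-level attributes: set once, applied to every unit's waveforms.
        self.waveform_unit = waveform_unit
        self.waveform_conversion = waveform_conversion
        self.waveform_offset = waveform_offset
        self._raw_waveforms = []

    def add_unit(self, waveforms):
        # Only raw samples are stored per unit; no per-unit scaling arguments.
        self._raw_waveforms.append(list(waveforms))

    def get_waveforms(self, index):
        # Scale raw samples into physical units on read.
        return [sample * self.waveform_conversion + self.waveform_offset
                for sample in self._raw_waveforms[index]]


units = ToyUnits(waveform_conversion=0.5, waveform_offset=1.0)
units.add_unit([2, 4])
scaled = units.get_waveforms(0)
```

With this layout `add_unit` keeps its existing signature, and the defaults (`1.0` and `0.0`) leave the data unchanged when no scaling is specified.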
@@ -1,6 +1,6 @@
# minimum versions of package dependencies for installing PyNWB
h5py==2.10 # support for selection of datasets with list of indices added in 2.10
hdmf==3.1.1
hdmf @ git+https://github.com/catalystneuro/hdmf.git@add_measurement_vector_data
Unfortunately this will not clone the repo with the submodule, so the CI breaks
Maybe it needs a `--recurse-submodules` option on it? I've never worked with this many nested git submodules before, lol
Motivation
For large amounts of recording data, even adding waveform snippets to the columns of a `Units` table can be a significant task. One step towards reducing unnecessary data inflation is to add attributes for scaling factors, similar to the behavior of other `TimeSeries`-like objects, that allow the data to be stored in some minimal base type, while the conversion factor then scales it into the specified scientific units.

I've added these attributes and attempted to propagate them through the IO, mirroring the patterns established by the `waveform_rate` attribute.

Sister PR: NeurodataWithoutBorders/nwb-schema#491
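The storage argument in the motivation can be sketched with the standard library alone. This is an illustration, not code from this PR: the sample values and the `conversion` factor are made up, and the linear convention `raw * conversion + offset` is assumed.

```python
from array import array


def to_physical_units(raw, conversion=1.0, offset=0.0):
    """Scale raw stored samples into physical units (assumed linear convention)."""
    return [sample * conversion + offset for sample in raw]


# Hypothetical waveform snippet stored as signed 16-bit integer counts.
raw_waveform = array("h", [-120, 0, 250, 1000])
conversion = 1e-6  # hypothetical volts per count
offset = 0.0

waveform_volts = to_physical_units(raw_waveform, conversion, offset)

# Compare the footprint of the raw integers with the same data as 64-bit floats.
raw_bytes = raw_waveform.itemsize * len(raw_waveform)
float_bytes = array("d", waveform_volts).itemsize * len(waveform_volts)
```

Keeping the integer samples and a single pair of scaling attributes avoids writing the larger floating-point copy to disk.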
How to test the behavior?
Behavior is showcased in the steps, identical to `waveform_rate`. What is currently untested in both cases is the default behavior of merely calling `nwbfile.add_unit()` the first time in a fresh NWBFile, which auto-generates a blank `Units` table. We should discuss how attributes of that table ought to be set in that situation (probably just by always being sure to define `nwbfile.Units = Units(**my_attributes)` before adding any actual units).
Checklist
flake8 from the source directory.