Add conversion factor to waveform columns #1422

CodyCBakerPhD · 2022-01-10T18:33:05Z

Motivation

For large recordings, even adding waveform snippets to the columns of a Units table can inflate the file significantly. One step towards reducing that unnecessary inflation is to add scaling-factor attributes, similar to those of other TimeSeries-like objects, so that the data can be stored in a minimal base type while the conversion factor scales it into the specified scientific units.
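As a rough illustration of the scaling idea, here is a minimal sketch in plain Python. It assumes the convention value = raw * conversion + offset; the function name and the sample numbers are hypothetical and not part of this PR.

```python
def scale_waveform(raw_samples, conversion=1.0, offset=0.0):
    """Scale raw stored samples into physical units: raw * conversion + offset."""
    return [sample * conversion + offset for sample in raw_samples]


# Hypothetical example: integer ADC counts at 0.195 microvolts per count.
raw = [-100, 0, 250]
volts = scale_waveform(raw, conversion=0.195e-6, offset=0.0)
```

Only the small integers in `raw` would need to be written to disk; the two floats carry the information needed to recover volts.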

I've added these attributes and attempted to propagate them through the IO, mirroring the patterns established by the waveform_rate attribute.

Sister PR: NeurodataWithoutBorders/nwb-schema#491

How to test the behavior?

Behavior is showcased in the steps, identical to waveform_rate. What is currently untested in both cases is the default behavior of merely calling nwbfile.add_unit() the first time in a fresh NWBFile, which auto-generates a blank Units table. We should discuss how attributes of that table ought to be set in that situation (probably just by always being sure to define nwbfile.Units = Units(**my_attributes) before adding any actual units).
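The untested situation can be pictured with a toy model. This is only a sketch: `ToyFile` and `ToyUnits` are hypothetical stand-ins, not the pynwb classes, and they only mimic the "auto-generate a blank table on first add_unit" pattern described above.

```python
class ToyUnits:
    """Toy stand-in for a Units table; not the pynwb class."""

    def __init__(self, waveform_rate=None, waveform_unit="volts"):
        self.waveform_rate = waveform_rate
        self.waveform_unit = waveform_unit
        self.rows = []


class ToyFile:
    """Toy stand-in for NWBFile that lazily creates its units table."""

    def __init__(self):
        self.units = None

    def add_unit(self, **row):
        if self.units is None:
            # Auto-generated table: only default attributes are available here.
            self.units = ToyUnits()
        self.units.rows.append(row)


# Case 1: rely on the auto-generated table; attributes stay at their defaults.
lazy_file = ToyFile()
lazy_file.add_unit(spike_times=[0.1, 0.2])

# Case 2: define the table first, then add units.
explicit_file = ToyFile()
explicit_file.units = ToyUnits(waveform_rate=30000.0)
explicit_file.add_unit(spike_times=[0.1, 0.2])
```

In the first case there is no point at which the caller can supply table-level attributes, which is the gap raised for discussion.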

Checklist

Did you update CHANGELOG.md with your changes?
Have you checked our Contributing document?
Have you ensured the PR clearly describes the problem and the solution?
Is your contribution compliant with our coding style? This can be checked running flake8 from the source directory.
Have you checked to ensure that there aren't other open Pull Requests for the same change?

CodyCBakerPhD · 2022-01-10T18:43:56Z

@rly It would be more proper to use the MeasurementData extension to VectorData that is being proposed at the higher hdmf level. While it's clear how to specify that in the nwb-schema, it's not as clear to me how to actually adjust the columns of the DynamicTable here in pynwb to use it.

CodyCBakerPhD · 2022-01-20T20:22:07Z

src/pynwb/misc.py

+        {'name': 'waveforms', 'description': waveforms_desc, 'index': 2,
+         'class': MeasurementData, 'unit': 'volts', 'conversion': 1., 'offset': 0.}


@rly I had assumed this logic would propagate to https://github.com/catalystneuro/hdmf/blob/dev/src/hdmf/common/table.py#L473-L480 and allow the values to be specified at __init__ of the UnitsTable in this fashion, but it's not recognizing that this subclass of VectorData has any additional arguments - any ideas?

How would you recommend passing these values (conversion + offset and maybe units), both at initialization of the UnitsTable (including cases where no units have waveforms added later) and for manual user specification (possibly during add_unit)?

allow_extra=True needs to be added to the docval of DynamicTable.add_column, i.e.,

            {'name': 'col_cls', 'type': type, 'default': VectorData,
             'doc': ('class to use to represent the column data. If table=True, this field is ignored and a '
                     'DynamicTableRegion object is used. If enum=True, this field is ignored and a EnumData '
                     'object is used.')},
            allow_extra=True)
    def add_column(self, **kwargs):  # noqa: C901

The tests still fail with that change, because conversion=1. is being passed but a VectorData is expected based on the spec.
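The intent behind allowing extra arguments can be sketched with a toy table. This is an analogy only, not hdmf's implementation: `ToyTable`, `ToyVectorData`, and `ToyMeasurementData` are hypothetical names, and plain `**extra` stands in for what allow_extra=True would permit in docval.

```python
class ToyVectorData:
    """Toy base column class; accepts only name and description."""

    def __init__(self, name, description):
        self.name = name
        self.description = description


class ToyMeasurementData(ToyVectorData):
    """Toy column subclass that also carries unit, conversion, and offset."""

    def __init__(self, name, description, unit, conversion=1.0, offset=0.0):
        super().__init__(name, description)
        self.unit = unit
        self.conversion = conversion
        self.offset = offset


class ToyTable:
    def __init__(self):
        self.columns = {}

    def add_column(self, name, description, col_cls=ToyVectorData, **extra):
        # Extra keyword arguments are passed through to the column class.
        column = col_cls(name=name, description=description, **extra)
        self.columns[name] = column
        return column


table = ToyTable()
waveforms = table.add_column(
    name="waveforms",
    description="toy waveform column",
    col_cls=ToyMeasurementData,
    unit="volts",
    conversion=1.0,
    offset=0.0,
)

# A column class that does not accept the extras rejects them.
try:
    table.add_column(name="plain", description="no extras accepted", conversion=1.0)
    rejected = False
except TypeError:
    rejected = True
```

Forwarding only helps when the chosen column class actually accepts the extra arguments, so the column class and the spec have to agree.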

CodyCBakerPhD · 2022-01-20T20:23:52Z

tests/unit/test_misc.py

@@ -173,13 +173,14 @@ def test_add_waveforms(self):
                    [1, 2, 3]   # spike 4
                ]
            ]
-        ut.add_unit(waveforms=wf1)
+        ut.add_unit(waveforms=wf1, unit='volts', conversion=1., offset=0.)


@rly I was wondering whether it would be easiest to have these specified (a) the first time waveforms are passed in add_unit, (b) every time they are passed in add_unit, or (c) on the first call of Units(...), with defaults set in case no waveforms are ever added to the table (which is also roughly what I was trying to do above).

It depends on what use case we want to support: a single conversion and offset for all units, or a conversion and offset for each unit. In my experience, the latter case, where there is more than one set of conversion and offset across the units, is much less common, so I think we need not support it by default. In that case, we could do something like what has been implemented for waveform unit and rate, where it is set on the constructor.
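A table-level setting of this kind could look roughly like the sketch below. The class and the `waveform_conversion` / `waveform_offset` argument names are hypothetical, chosen here to mirror the existing waveform unit and rate attributes; this is not the pynwb API.

```python
class ToyUnitsTable:
    """Toy table with one conversion and offset shared by all units."""

    def __init__(self, waveform_unit="volts", waveform_conversion=1.0, waveform_offset=0.0):
        self.waveform_unit = waveform_unit
        self.waveform_conversion = waveform_conversion
        self.waveform_offset = waveform_offset
        self._raw_waveforms = []

    def add_unit(self, waveforms):
        # Store the raw samples unchanged; scaling happens only on read.
        self._raw_waveforms.append(waveforms)

    def get_scaled_waveforms(self, index):
        # Apply the single table-level conversion and offset to one unit.
        return [
            [sample * self.waveform_conversion + self.waveform_offset for sample in spike]
            for spike in self._raw_waveforms[index]
        ]


ut = ToyUnitsTable(waveform_conversion=2.0, waveform_offset=1.0)
ut.add_unit(waveforms=[[1, 2, 3], [4, 5, 6]])
scaled = ut.get_scaled_waveforms(0)
```

Supporting a per-unit conversion would instead require an extra column (or per-row arguments to add_unit), which is the less common case described above.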

rly · 2022-01-20T21:33:56Z

requirements-min.txt

@@ -1,6 +1,6 @@
 # minimum versions of package dependencies for installing PyNWB
 h5py==2.10  # support for selection of datasets with list of indices added in 2.10
-hdmf==3.1.1
+hdmf @ git+https://github.com/catalystneuro/hdmf.git@add_measurement_vector_data


Unfortunately this will not clone the repo with the submodule, so the CI breaks

Maybe it needs a --recurse-submodules option on it? I've never worked with this many nested git submodules before, lol

CodyCBakerPhD added 2 commits January 10, 2022 17:40

added waveform conversion/offset details

4418738

mirrored waveform attribute changes

055835b

CodyCBakerPhD mentioned this pull request Jan 10, 2022

Add waveform conversion factor and offset to schema NeurodataWithoutBorders/nwb-schema#491

Closed

3 tasks

CodyCBakerPhD added 5 commits January 20, 2022 19:19

hdmf level changes

c156277

integrating with MeasurementData

a6e03ee

updated for actual usage of MeasurementData class

fc7587b

remove unused logic

14f649c

remove unused attributes

2c5a2e5

CodyCBakerPhD commented Jan 20, 2022


CodyCBakerPhD and others added 3 commits January 20, 2022 20:26

fix deps

c430a6c

Update .gitmodules

2ceb1ba

Update .gitmodules

4aee7df

rly reviewed Jan 20, 2022


CodyCBakerPhD self-assigned this Apr 20, 2022

CodyCBakerPhD closed this Sep 6, 2024

CodyCBakerPhD deleted the change_waveform_column_to_measurement branch September 6, 2024 19:49
