
Added Digital Compression #414

Open · wants to merge 2 commits into master
Conversation

@clackner-gpa (Member) commented Jan 31, 2025

This adds a new compression option for data blobs.

It is used if fewer than 10% of points represent actual changes in value (e.g. digitals).

It compresses the data into a byte array containing:

- Number of samples (int)
- SeriesID (int)
- First timestamp (long)
- Ticks between samples (long)
- Offset and scaling for values (2 doubles)
- And, for every change in value, a pair of Timestamp (ulong) and Value (ushort)

With this PR, the only visible difference is the length of the data blob.
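For reference, here is a rough sketch of how a blob with that layout could be produced with a BinaryWriter. The names and method shape are mine for illustration only, not the actual implementation in this PR:

```csharp
using System;
using System.IO;

public static class DigitalCompressionSketch
{
    // Illustrative encoder for the blob layout described above (hypothetical names).
    public static byte[] Encode(int sampleCount, int seriesID, DateTime firstTimestamp,
        long ticksBetweenSamples, double offset, double scale,
        (ulong Timestamp, ushort Value)[] changes)
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            // Header
            writer.Write(sampleCount);          // # of samples (int)
            writer.Write(seriesID);             // SeriesID (int)
            writer.Write(firstTimestamp.Ticks); // first timestamp (long)
            writer.Write(ticksBetweenSamples);  // ticks between samples (long)
            writer.Write(offset);               // decompression offset (double)
            writer.Write(scale);                // decompression scale (double)

            // One (Timestamp, Value) pair per change in value
            foreach ((ulong timestamp, ushort value) in changes)
            {
                writer.Write(timestamp);        // ulong
                writer.Write(value);            // ushort
            }

            return stream.ToArray();
        }
    }
}
```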

decompressedValue = decompressionScale * compressedValue + decompressionOffset;
while (time > (lastTime + samplingrate))
{
    lastTime += samplingrate;
Member
I wonder if rounding errors could introduce additional data points between state changes. Do you think it might cause problems downstream if points.Count > m_samples?

Member Author

m_samples gets removed from the blob, so I am not sure that even matters.

By definition we assume these "Digitals" are evenly sampled, so it will generate an evenly sampled flat line until it hits a state change. In theory it is possible that we lose some points in between or generate extra ones, but since it's a flat line anyway I am not too concerned about that.

Occasionally we may compress a latched analog this way, which I suppose could be an issue, but I believe we already assume a fixed sampling rate in a few places when doing math, so I am not sure that really matters either.
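Roughly what I mean, as a sketch with made-up names (not the actual decode routine in this PR):

```csharp
using System;
using System.Collections.Generic;

public static class FlatLineSketch
{
    // Illustrative reconstruction of an evenly sampled flat line between state changes.
    public static List<(DateTime Time, double Value)> Reconstruct(
        DateTime firstTimestamp, long ticksBetweenSamples,
        double decompressionScale, double decompressionOffset,
        IEnumerable<(DateTime Time, ushort CompressedValue)> changes)
    {
        List<(DateTime Time, double Value)> points = new List<(DateTime Time, double Value)>();
        TimeSpan samplingRate = TimeSpan.FromTicks(ticksBetweenSamples);
        DateTime lastTime = firstTimestamp;
        double lastValue = 0.0D; // placeholder until the first decoded change

        foreach ((DateTime time, ushort compressedValue) in changes)
        {
            // Back-fill evenly spaced flat-line points until the next state change
            while (time > lastTime + samplingRate)
            {
                lastTime += samplingRate;
                points.Add((lastTime, lastValue));
            }

            // Apply the scale/offset transform at the state change
            lastValue = decompressionScale * compressedValue + decompressionOffset;
            points.Add((time, lastValue));
            lastTime = time;
        }

        // If ticksBetweenSamples does not evenly divide the span between changes,
        // points.Count can drift slightly from the original sample count, which is
        // the rounding concern raised above.
        return points;
    }
}
```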

Member

Off the top of my head, my main concern would be this.

// If the data being added matches the parameters for this data group, add the data to the data group
// Note that it does not have to match Asset
if (startTime == m_startTime && endTime == m_endTime && samples == m_samples)
{
    m_dataSeries.Add(dataSeries);
    return true;
}

return false;

If the actual number of decoded data points doesn't match DataGroup.m_samples, then the DataGroup.Add() method will fail to even include the digital series in the DataGroup. This is called directly by DataGroup.FromData() so the decoded data would just be silently dropped on the floor, and we would be wondering why it's not getting returned to the visualization.

Also, I suppose I'm assuming that there will be a follow-up PR that includes changes to DataGroup.FromData()?

@clackner-gpa (Member Author)
Note that with the switch to using ulong for timestamps, we add an extra byte per state change. We could reduce this further and optimize for space, but I don't think it's worth it considering there will be very few state changes in a true digital. Anything we save over a full standard blob is already a bonus.
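For a rough sense of scale (my arithmetic from the layout in the description, not measured): the header works out to 40 bytes, and each state change costs 10 bytes with a ulong timestamp and ushort value.

```csharp
public static class BlobSizeSketch
{
    // Rough size math based on the layout in the PR description (illustrative only).
    public const int HeaderBytes =
        sizeof(int) +          // # of samples
        sizeof(int) +          // SeriesID
        sizeof(long) +         // first timestamp
        sizeof(long) +         // ticks between samples
        2 * sizeof(double);    // offset + scale  => 40 bytes

    public const int BytesPerStateChange =
        sizeof(ulong) +        // timestamp
        sizeof(ushort);        // value           => 10 bytes
}
```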
