🚀 Feature Request
Add sparse NumPy arrays as a supported field
Motivation
I was trying to serialize ~150 overlapping binary masks per image (think SAM auto-generated masks) to the Streaming format. With overlapping masks I can't save them as a single image (one pixel can carry several values) or as an n-layer TIFF (too big), so I opted for RLE, which is usually stored as JSON.
Since saving 200+ JSON dicts per image also seems inefficient, I thought about encoding each RLE as a 1D vector [Size_X, Size_Y, RLE_Int64...] and saving all of them together as one NumPy array. At the moment this array needs to be zero-padded, because the RLE length differs with mask size and location.
The above encoding seems to work fine and is fast.
The problem: RLE indices get large, so int64 is necessary, which makes padding every row to the longest RLE quite wasteful. If I could use sparse NumPy arrays, I would not need to pad to the longest sequence.
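For concreteness, a minimal sketch of the encoding I mean (the helper names and the exact RLE layout here are just illustrations, not a fixed format):

```python
import numpy as np

def rle_encode(mask: np.ndarray) -> np.ndarray:
    """Encode a 2D binary mask as [Size_X, Size_Y, run lengths...] (int64)."""
    flat = mask.ravel().astype(np.int64)
    starts = np.flatnonzero(np.diff(flat)) + 1          # where a new run begins
    bounds = np.concatenate(([0], starts, [flat.size]))
    runs = np.diff(bounds)                              # alternating 0/1 run lengths
    if flat[0] == 1:                                    # keep the 0-run first by convention
        runs = np.concatenate(([0], runs))
    return np.concatenate((np.asarray(mask.shape, dtype=np.int64), runs))

def pad_and_stack(rles: list[np.ndarray]) -> np.ndarray:
    """Zero-pad each RLE vector to the longest one and stack into one array."""
    out = np.zeros((len(rles), max(map(len, rles))), dtype=np.int64)
    for i, r in enumerate(rles):
        out[i, : len(r)] = r
    return out

# ~150 overlapping masks per image -> one padded int64 array
rng = np.random.default_rng(0)
masks = []
for _ in range(150):
    m = np.zeros((1024, 1024), dtype=bool)
    x, y = rng.integers(0, 768, size=2)
    m[x : x + 256, y : y + 256] = True                  # overlapping square blobs
    masks.append(m)
packed = pad_and_stack([rle_encode(m) for m in masks])
print(packed.shape, packed.dtype)                       # (150, longest_rle) int64
```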
[Optional] Implementation
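I'm not familiar enough with the internals to propose a real design, but one possible route (purely an assumption on my part): a sparse array could be (de)serialized through its dense component arrays, which scipy.sparse already exposes, so it might reduce to the existing NumPy array field support:

```python
import numpy as np
from scipy import sparse

def split_csr(m: sparse.csr_matrix) -> dict[str, np.ndarray]:
    # A CSR matrix is fully described by three dense 1D arrays plus its shape,
    # so each piece could be stored with the existing ndarray support.
    return {
        "data": m.data,
        "indices": m.indices,
        "indptr": m.indptr,
        "shape": np.asarray(m.shape, dtype=np.int64),
    }

def join_csr(parts: dict[str, np.ndarray]) -> sparse.csr_matrix:
    return sparse.csr_matrix(
        (parts["data"], parts["indices"], parts["indptr"]),
        shape=tuple(parts["shape"]),
    )

padded = sparse.csr_matrix(np.array([[3, 0, 0], [5, 7, 0], [9, 0, 2]]))
assert (join_csr(split_csr(padded)) != padded).nnz == 0   # lossless round trip
```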
Additional context
Since Streaming tries to keep data that belongs together physically close, I don't even know whether sparse arrays are achievable. Maybe a workaround with existing custom data structures is possible, but I don't think that would be optimal. Built-in compression might also already be good enough.
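To make the custom-data-structure idea concrete, this is roughly what I mean (again just a sketch, names made up): concatenating the variable-length RLEs into one flat array plus an offsets array would avoid the padding entirely:

```python
import numpy as np

def pack_ragged(rles: list[np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
    """Store variable-length int64 vectors as (values, offsets), no padding."""
    lengths = np.array([len(r) for r in rles], dtype=np.int64)
    offsets = np.concatenate(([0], np.cumsum(lengths)))
    return np.concatenate(rles).astype(np.int64), offsets

def unpack_ragged(values: np.ndarray, offsets: np.ndarray) -> list[np.ndarray]:
    return [values[offsets[i] : offsets[i + 1]] for i in range(len(offsets) - 1)]

rles = [np.arange(n, dtype=np.int64) for n in (5, 2, 9)]  # different lengths
values, offsets = pack_ragged(rles)
assert all(np.array_equal(a, b) for a, b in zip(rles, unpack_ragged(values, offsets)))
```

Two small arrays per sample would also keep the data physically adjacent, which should play well with the locality goal mentioned above.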