Sparse Numpy Arrays #780

Open
Matagi1996 opened this issue Sep 10, 2024 · 0 comments
Labels
enhancement New feature or request


🚀 Feature Request

Add sparse NumPy arrays as a supported field type.

Motivation

I was trying to serialize about 150 overlapping binary masks per image (think SAM auto-generated masks) to the streaming format. Because the masks overlap, I can't save them as a single image (one pixel would need several values), and an n-layer TIFF is too big, so I opted for RLE encoding, which is usually stored as JSON.
Since saving 200+ JSON dicts per image also seems inefficient, I thought about storing each RLE as a 1D vector [Size_X, Size_Y, RLE_Int64] and saving all of them together as one NumPy array. At the moment this array has to be zero-padded, because the length of an RLE encoding varies with mask size and location.
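
A minimal sketch of the padded encoding described above, for illustration only. The exact RLE layout is an assumption: each vector is taken to be `[size_x, size_y, start_0, length_0, start_1, length_1, ...]` with the two sizes read from `mask.shape`, and the helper names `rle_encode`, `pad_rles`, and `rle_decode` are hypothetical, not part of any library.

```python
import numpy as np

def rle_encode(mask: np.ndarray) -> np.ndarray:
    """Encode a 2D binary mask as [size_x, size_y, start, length, ...] (assumed layout)."""
    flat = mask.astype(bool).ravel()
    # Surround with False so every run has a detectable start and end.
    bounded = np.concatenate(([False], flat, [False]))
    changes = np.flatnonzero(bounded[1:] != bounded[:-1])
    starts, ends = changes[0::2], changes[1::2]
    runs = np.stack([starts, ends - starts], axis=1).ravel()
    return np.concatenate((mask.shape, runs)).astype(np.int64)

def pad_rles(rles):
    """Stack variable-length RLE vectors into one zero-padded int64 array."""
    longest = max(len(r) for r in rles)
    out = np.zeros((len(rles), longest), dtype=np.int64)
    for i, r in enumerate(rles):
        out[i, :len(r)] = r
    return out

def rle_decode(vec: np.ndarray) -> np.ndarray:
    """Rebuild the mask; trailing (0, 0) padding pairs are harmless no-ops."""
    size_x, size_y = int(vec[0]), int(vec[1])
    flat = np.zeros(size_x * size_y, dtype=bool)
    for start, length in vec[2:].reshape(-1, 2):
        flat[start:start + length] = True
    return flat.reshape(size_x, size_y)

# Two small overlapping-style masks with different numbers of runs.
mask_a = np.zeros((4, 5), dtype=bool)
mask_a[1, 1:4] = True
mask_b = np.zeros((4, 5), dtype=bool)
mask_b[0, 0] = True
mask_b[2:4, 2] = True

padded = pad_rles([rle_encode(mask_a), rle_encode(mask_b)])
print(padded.shape)  # one row per mask, as wide as the longest RLE
```

With this layout the zero padding decodes to empty runs, so the padded rows can be decoded without storing their true lengths.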

The above encoding seems to work fine and seems fast.
The problem is that RLE indexes get big, so int64 is necessary, which makes padding to the longest RLE quite wasteful. If I could use sparse NumPy arrays, I would not need to pad the array to the longest sequence.
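
To make the padding cost concrete, here is a small sketch of one padding-free alternative: storing all RLE vectors in a single flat `values` array plus an `offsets` array (the same idea CSR sparse matrices use for their rows). This is only an illustration, not an existing feature; `pack_ragged` and `unpack_ragged` are hypothetical helper names, and the example vectors are made up.

```python
import numpy as np

def pack_ragged(rles):
    """Concatenate variable-length vectors into (values, offsets) without padding."""
    lengths = np.array([len(r) for r in rles], dtype=np.int64)
    # offsets[i]:offsets[i + 1] is the slice of `values` belonging to vector i.
    offsets = np.concatenate(([0], np.cumsum(lengths)))
    values = np.concatenate(rles).astype(np.int64)
    return values, offsets

def unpack_ragged(values, offsets):
    """Split the flat buffer back into the original vectors."""
    return [values[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]

# Made-up RLE vectors of different lengths.
rles = [
    np.array([4, 5, 6, 3]),
    np.array([4, 5, 0, 1, 12, 1, 17, 1]),
    np.array([4, 5]),
]

values, offsets = pack_ragged(rles)

# Element counts: padded to the longest vector vs. flat buffer plus offsets.
padded_elems = len(rles) * max(len(r) for r in rles)
ragged_elems = values.size + offsets.size
print(padded_elems, ragged_elems)
```

Both `values` and `offsets` are ordinary dense 1D arrays, so they could be stored as two regular NumPy fields per sample.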

[Optional] Implementation

Additional context

Since streaming tries to keep data that belongs together as close together as possible, I don't even know whether sparse arrays are achievable. Maybe there is a workaround using custom data structures that already exist, but I don't think that would be optimal. Built-in compression might also already be good enough.
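
One rough way to check whether compression already absorbs the padding is to compress a zero-padded array and compare sizes. The sketch below uses `zlib` from the Python standard library purely as a stand-in for whatever compression the streaming format applies, and the array contents are synthetic (random run values, random row lengths), so the numbers it prints say nothing about real masks.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for ~150 RLE vectors padded to the longest one.
n_masks, longest = 150, 2000
lengths = rng.integers(10, longest + 1, size=n_masks)
padded = np.zeros((n_masks, longest), dtype=np.int64)
for i, n in enumerate(lengths):
    padded[i, :n] = rng.integers(0, 2**40, size=n)

raw = padded.tobytes()
compressed = zlib.compress(raw, 6)  # 6 is zlib's usual default level
print(len(raw), len(compressed))

# Round trip to confirm nothing is lost.
restored = np.frombuffer(zlib.decompress(compressed), dtype=np.int64).reshape(padded.shape)
```

Long runs of zero bytes compress very well, so the padded array shrinks on disk, but the full padded array still has to be materialized in memory when it is written and after it is read.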

Matagi1996 added the enhancement label Sep 10, 2024