Reduce memory consumption #40

Open
jkh1 opened this issue Mar 11, 2019 · 5 comments

Comments

@jkh1

jkh1 commented Mar 11, 2019

EBImage requires too much memory to read TIFF files containing multiple images, even though the images fit comfortably in memory. For example, reading a TIFF file of size 250x250x1000 requires 6 GB of memory, although the resulting EBImage object takes only 1.5 GB.
Would it be possible to implement a virtual image stack like in ImageJ? Or at least a form of lazy loading that only accesses the required subset of the data?

@aoles
Owner

aoles commented Mar 11, 2019

Hi Jean-Karim,

thank you for your feedback! Could you maybe provide a test image illustrating the issue?

For working with large image stacks, in particular for subsetting them, you might want to try out RBioFormats.

Cheers,
Andrzej

@jkh1
Author

jkh1 commented Mar 11, 2019

Hi Andrzej,

Here is a test image: https://oc.embl.de/index.php/s/heFlwuQwZOozPnS

This is for use with RStudio and/or Shiny. I am looking into RBioFormats to read one slice at a time, but for the whole stack under RStudio, RBioFormats is worse: it appears more memory-hungry and quickly runs over the Java memory limit, even after specifying a large one, e.g. options(java.parameters = "-Xmx12g"). Maybe there's a memory leak somewhere, as it doesn't seem to release all the memory it used after rm(img) and gc().
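
For reference, a minimal sketch of the slice-at-a-time approach mentioned above. It rests on two assumptions worth checking against ?read.image: that read.image() accepts a `subset` list keyed by dimension name, and that the stack is stored along the `z` dimension (it could equally be `t` for some files). `read_slice` and `stack.tif` are hypothetical names used only for illustration.

```r
# Set the JVM heap size BEFORE rJava (and therefore RBioFormats) is loaded;
# the JVM picks up this option only when it is started.
options(java.parameters = "-Xmx12g")

# Hypothetical helper: read a single slice instead of the whole stack.
# Assumption: read.image() takes a `subset` list keyed by dimension name.
read_slice <- function(path, z) {
  RBioFormats::read.image(path, subset = list(z = z))
}

# Usage (not run here, needs RBioFormats and a real file):
# img <- read_slice("stack.tif", 1)
```

If the option is set after the JVM has already been initialized in the session, it will most likely have no effect, so restarting R before changing it is the safer route.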

@aoles
Owner

aoles commented Mar 11, 2019

Thank you! I will do some memory profiling and get back to you once I know more. The problem here is that both EBImage and RBioFormats use external libraries to read the data, and their raw output needs to be processed further before being exposed as an Image object. Maybe there are still some optimizations that could be done on our side.

@aoles
Owner

aoles commented Mar 12, 2019

Interesting - apparently the problematic part is actually constructing the resulting S4 Image object via a call to new. It seems as if the core R initialization method was introducing some data duplication. I'm not sure about the details though, will need to look into this.
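
One way to look for this kind of duplication is base R's tracemem(), which reports each time a traced object is copied. The sketch below uses a stand-in class, `DemoImage`, that extends "array" with one extra slot; it is hypothetical and is not EBImage's actual class definition, so whatever it reports only hints at what may happen with the real Image class.

```r
library(methods)

# Stand-in S4 class extending "array" with one extra slot
# (hypothetical; not EBImage's actual Image class).
setClass("DemoImage", contains = "array", slots = c(colormode = "integer"))

x <- array(runif(1e5), dim = c(100L, 100L, 10L))

# tracemem() prints a message whenever the traced object is duplicated.
# It is only available if R was built with memory profiling support.
if (capabilities("profmem")) tracemem(x)

# Construct the S4 object; any copies of x made here are reported.
img <- new("DemoImage", .Data = x, colormode = 0L)

if (capabilities("profmem")) untracemem(x)

# Compare the size of the raw array with that of the wrapped object.
print(object.size(x))
print(object.size(img))
```

The number of copies reported can differ between R versions, so the output is best read as a diagnostic rather than a fixed result.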

@muschellij2

Just +1'ing here. In readImage, you can set either all = TRUE or all = FALSE. Would it be possible to pass an index or a set of indices to all, so that only a subset of the images is loaded? This would be immensely useful for large image stacks.

For example, imread in MATLAB can read a specific image from a multi-image file via its Index argument (https://www.mathworks.com/help/matlab/ref/imread.html#btnczv9-1-Index) and also a specific region via PixelRegion (https://www.mathworks.com/help/matlab/ref/imread.html#btnczv9-1-PixelRegion-dup1).
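
As a possible stopgap for the index-based reading requested above, the tiff package can be called directly. This sketch assumes an installed tiff version whose readTIFF() accepts an integer vector of 1-based frame indices for `all`; older versions may only accept TRUE or FALSE. The file is a small synthetic stack written to a temporary path, not the test image from this thread.

```r
# Sketch only: requires the 'tiff' package, and assumes readTIFF()'s `all`
# argument accepts an integer vector of frame indices.
if (requireNamespace("tiff", quietly = TRUE)) {
  path <- tempfile(fileext = ".tif")

  # Write a small 5-frame grayscale stack so there is something to read.
  frames <- lapply(1:5, function(i) matrix(i / 10, nrow = 4, ncol = 6))
  tiff::writeTIFF(frames, path)

  # Read only frames 2 and 4; the result is a list with one entry per frame.
  subset_frames <- tiff::readTIFF(path, all = c(2L, 4L))
  print(length(subset_frames))
  print(dim(subset_frames[[1]]))
}
```

The matrices returned this way are plain R arrays rather than Image objects, so they would still need to be converted, and possibly reoriented, before use with EBImage functions.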

3 participants