Problem
When using a transform that relies on lightly.transforms.gaussian_blur.GaussianBlur or lightly.transforms.solarization.RandomSolarization, e.g. DINOTransform, the input must first be converted to a PIL.Image, because the Gaussian blur and the solarization use PIL.ImageFilter.GaussianBlur and PIL.ImageOps.solarize, respectively. For training, the PIL.Image then has to be converted back to a tensor.
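For illustration, a minimal sketch of the round trip as it stands today (the patch tensor, its shape, and its values are made up; the point is only that DINOTransform needs PIL input and returns tensor views):

```python
import torch
from torchvision.transforms.functional import to_pil_image

from lightly.transforms.dino_transform import DINOTransform

# A patch that was read directly into a tensor, e.g. by a patch-based reader
# (shape and values are placeholders for illustration).
patch = torch.rand(3, 224, 224)

transform = DINOTransform()

# The blur/solarization steps expect PIL input, so the tensor has to be
# converted first; the transform then converts each view back to a tensor.
views = transform(to_pil_image(patch))
```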
Background
It might be beneficial to get rid of this conversion for performance reasons. Sometimes a different reader such as pyvips/openslide/cucim is needed to extract only a patch of an image, because the full image is too big to fit in memory (e.g. in computational pathology or remote sensing). In these cases, patches are often extracted and immediately converted to torch tensors so that torchvision can transform them. Doing all transforms on tensors avoids converting to other formats such as PIL and improves performance.
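As an illustration of that workflow, a rough sketch with pyvips (the file name and crop window are placeholders, and this assumes a recent pyvips version that supports direct conversion to numpy):

```python
import numpy as np
import pyvips
import torch

# Open the slide lazily and read only a small region instead of the full image
# ("slide.tif" and the crop window are placeholders).
slide = pyvips.Image.new_from_file("slide.tif")
region = slide.crop(0, 0, 256, 256)  # left, top, width, height

# Recent pyvips versions expose the image buffer to numpy, so the patch can go
# straight into a float tensor in (C, H, W) layout without a PIL detour.
patch = torch.from_numpy(np.asarray(region)).permute(2, 0, 1).float() / 255.0
```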
Alternative
Torchvision provides seemingly fast (jitted) implementations for solarization [1] and Gaussian blurring [2].
I'm not aware of any other transforms in the lightly package that rely on PIL.
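As a point of comparison, a minimal sketch of the tensor-based torchvision counterparts (the kernel size, sigma, and threshold values here are arbitrary, not the values lightly uses):

```python
import torch
from torchvision.transforms import functional as F

# A float image tensor in [0, 1]; no PIL conversion involved.
patch = torch.rand(3, 224, 224)

# Tensor-based equivalents of PIL.ImageFilter.GaussianBlur and PIL.ImageOps.solarize.
blurred = F.gaussian_blur(patch, kernel_size=[23, 23], sigma=[0.5])
solarized = F.solarize(patch, threshold=0.5)  # for float tensors the threshold is in [0, 1]
```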
There is a previous discussion on supporting tensors as transform inputs: #791.
I think we can switch to the torchvision solarization and blur implementations. When making the change, we should take #1052 into account and make sure we don't re-introduce a change in the blurring behavior.
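A rough sketch of what such a tensor-based blur could look like; the class and parameter names are hypothetical (not lightly's current API), and the kernel-size rule is only a placeholder for whatever behavior #1052 settled on:

```python
import torch
from torchvision.transforms import functional as F


class TensorGaussianBlur:
    """Hypothetical tensor-based blur built on torchvision, not lightly's actual API."""

    def __init__(self, prob: float = 0.5, sigmas: tuple = (0.1, 2.0)):
        self.prob = prob
        self.sigmas = sigmas

    def __call__(self, image: torch.Tensor) -> torch.Tensor:
        # Apply the blur only with probability `prob`.
        if torch.rand(1).item() > self.prob:
            return image
        sigma = torch.empty(1).uniform_(*self.sigmas).item()
        # Placeholder kernel-size rule; the real rule must match the blurring
        # behavior fixed in #1052 so that change is not re-introduced.
        kernel_size = 2 * int(3 * sigma) + 1
        return F.gaussian_blur(image, kernel_size=[kernel_size, kernel_size], sigma=[sigma])
```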
References
[1] torchvision solarize: implementation and docs
[2] torchvision Gaussian blur: implementation and docs