Visual saliency is the quality that makes the important parts of an image stand out from their neighbors and grab our attention. Detecting it reduces the amount of visual data that has to be processed, which is crucial because images can then be processed without knowing their actual contents. In this project, we detect the salient regions of an image using dynamic mode decomposition (DMD) and its variants (rSVD-DMD, TDMD). A novel method of image representation is employed to make DMD applicable to static images. We use color and luminance information to generate a saliency map. The complex and non-linear nature of the human visual system is modeled by exploiting different color spaces to generate a color-based saliency map. The human eye is more sensitive to brightness than to color, and this fact is used to generate luminance saliency maps. The method is computationally inexpensive, straightforward, and produces full-resolution saliency maps. The validity of the generated saliency map is estimated using standard performance measures: F-measure, precision, recall, mean absolute error (MAE), and area under the ROC curve.
Need for DMD:
Conventional methods use either the spatial components or the temporal components to generate saliency maps, while DMD takes both into account. DMD is also computationally efficient and gives better results than the conventional methods.
The main objective of this project is to exploit the analytical power of DMD and its variants, randomized SVD DMD (rSVD-DMD) and total DMD (TDMD), to generate saliency maps, and to analyze the effectiveness of each variant both visually and quantitatively. The project also aims to analyze color spaces in order to improve image representation by separating luminance and chrominance before generating the saliency map. Finally, the spatial information of the generated saliency map is used to enhance it further.
Dynamic mode decomposition (DMD) is a leading algorithm for decomposing complex systems into spatiotemporal coherent structures, or modes. These modes are then used for short-term future state prediction and control. In general, DMD identifies modes that are spatially correlated and oscillate at a fixed frequency in time with a growing or decaying envelope. As a result, DMD is often regarded as a combination of:
- Model reduction technique
- Fourier transforms in time
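The description above can be made concrete with a minimal sketch of standard (exact) DMD in NumPy. The function name `dmd`, the toy two-dimensional system, and the chosen rank are illustrative assumptions, not the project's actual code. The magnitude of each eigenvalue gives the growing or decaying envelope, and its angle gives the oscillation frequency.

```python
import numpy as np

def dmd(X, Y, rank):
    """Exact DMD sketch: eigenvalues and modes of the best-fit map Y ~ A X."""
    # Model reduction step: truncated SVD of the first snapshot matrix.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    # Project A onto the leading left singular vectors: U* Y V S^-1.
    A_tilde = (U.conj().T @ Y @ Vh.conj().T) / s
    # Eigenvalues encode growth/decay (magnitude) and frequency (angle).
    eigvals, W = np.linalg.eig(A_tilde)
    # Lift the reduced eigenvectors back to the full state space.
    modes = ((Y @ Vh.conj().T) / s) @ W
    return eigvals, modes

# Toy data (assumption): a decaying rotation, so the true eigenvalues are
# 0.95 * exp(+-0.3j).
theta = 0.3
A_true = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
snapshots = np.empty((2, 30))
snapshots[:, 0] = [1.0, 0.5]
for k in range(1, 30):
    snapshots[:, k] = A_true @ snapshots[:, k - 1]

# Consecutive snapshot pairs: column k of Y follows column k of X in time.
X, Y = snapshots[:, :-1], snapshots[:, 1:]
eigvals, modes = dmd(X, Y, rank=2)
```

For a static image there is no time axis, so the columns of the data matrix have to come from the image representation itself. How that matrix is built is specific to the project and is not reproduced here.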
In this project, we investigate the efficacy of DMD and its variants, the randomized singular value decomposition based DMD (rSVD-DMD) and the total DMD (TDMD), for detecting the salient regions of an image.
Singular value decomposition (SVD) is a matrix factorization technique that can be used to compute a low-rank approximation of a data matrix. Randomization is introduced to obtain this low-rank approximation faster and more efficiently. The rSVD factorization is done as follows:
- For a given input matrix, a low dimensional subspace is constructed
- Standard factorization is performed for the constructed basis by confining the input matrix to the low dimensional subspace.
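The two steps above can be sketched in NumPy as follows. This is a minimal illustration of the generic randomized SVD scheme, not the project's implementation; the function name `randomized_svd`, the `oversample` parameter, and the synthetic test matrix are assumptions.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, seed=0):
    """Randomized SVD sketch returning a rank-`rank` factorization of A."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    k = min(rank + oversample, n)
    # Step 1: build a low-dimensional subspace that captures the range of A
    # by multiplying with a random Gaussian test matrix and orthonormalizing.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)
    # Step 2: restrict A to that subspace and run a standard SVD on the
    # much smaller matrix B.
    B = Q.conj().T @ A
    Ub, s, Vh = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vh[:rank, :]

# Synthetic matrix (assumption) with exact rank 5.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, s, Vh = randomized_svd(A, rank=5)
approx = (U * s) @ Vh
```

Oversampling by a few extra columns is a common way to make the random subspace more likely to capture the dominant singular directions; the value 10 here is only a conventional default.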
Here, an RGB image of size m × n is converted to different color spaces such as CIELab, YUV, and YCbCr. These color spaces represent the image better for this task because they separate luminance from chrominance, which correspond to the eye's receptivity to brightness and to color information respectively.
- Channels used to generate the saliency map based on luminance:
  - Y1 from YUV
  - Y2 from YCbCr
  - L from CIELab
- Channels used to generate the saliency map based on color:
  - U and V from YUV
  - Cb and Cr from YCbCr
  - a and b from CIELab
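The channel split listed above can be sketched for the two linear color spaces with plain NumPy. The BT.601 coefficient matrices, the zero-centered chroma (no offset added), the helper name `split_luma_chroma`, and the random test image are assumptions for illustration. CIELab is left out because it needs a non-linear transform, for which a library routine such as `skimage.color.rgb2lab` would typically be used.

```python
import numpy as np

# BT.601 analog YUV coefficients (assumed variant).
RGB_TO_YUV = np.array([
    [ 0.299,    0.587,    0.114  ],
    [-0.14713, -0.28886,  0.436  ],
    [ 0.615,   -0.51499, -0.10001],
])

# Full-range (JPEG-style) YCbCr coefficients, without the usual chroma offset.
RGB_TO_YCBCR = np.array([
    [ 0.299,     0.587,     0.114   ],
    [-0.168736, -0.331264,  0.5     ],
    [ 0.5,      -0.418688, -0.081312],
])

def split_luma_chroma(rgb, matrix):
    """Convert an (m, n, 3) float RGB image in [0, 1]; return (luma, chroma)."""
    converted = rgb @ matrix.T           # per-pixel 3x3 linear transform
    return converted[..., 0], converted[..., 1:]

# Random stand-in for an RGB image (assumption).
rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))

y1, uv = split_luma_chroma(image, RGB_TO_YUV)        # Y1, then U and V
y2, cbcr = split_luma_chroma(image, RGB_TO_YCBCR)    # Y2, then Cb and Cr
```

With these coefficients Y1 and Y2 use the same luma weights, so they only differ once range scaling or offsets are applied.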
Salient regions have varying degrees of apparentness in different color channels, and they are often prominent in more than one channel. From the above analysis, we observe that salient regions are clearer in the b (CIELab), V (YUV), and Cr (YCbCr) channels for images with a clear distinction between the salient part and the background. Similarly, salient regions are clearer in the a (CIELab), U (YUV), and Cb (YCbCr) channels for images that have a blurred background.
We can observe that the SVD-reconstructed images have more salient features. Also, as the number of singular values used for reconstruction increases, the salient region becomes clearer while the background remains relatively static after a set of iterations.
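A reconstruction from the leading singular values can be sketched as below. The helper name `svd_reconstruct` and the random stand-in channel are assumptions; with a real image channel, the reconstructions for increasing `k` are what would be inspected visually.

```python
import numpy as np

def svd_reconstruct(channel, k):
    """Rebuild a 2-D channel from its k largest singular values."""
    U, s, Vh = np.linalg.svd(channel, full_matrices=False)
    # Scale the first k left singular vectors, then project back.
    return (U[:, :k] * s[:k]) @ Vh[:k, :]

# Random stand-in for a single image channel (assumption).
rng = np.random.default_rng(0)
channel = rng.random((32, 48))

# Frobenius reconstruction error for an increasing number of singular values.
ranks = (1, 5, 20, 32)
errors = [np.linalg.norm(channel - svd_reconstruct(channel, k)) for k in ranks]
```

The error shrinks as more singular values are kept and vanishes once all of them are used.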
In this project, we use the ECSSD (Extended Complex Scene Saliency Dataset). It consists of 1000 images, including many semantically meaningful and structurally complex images for evaluation. They are natural images, with textures and structures common to real-world scenes. Several examples with their masks are provided. We used 100 random images from this dataset for quantitative comparison.
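For the quantitative comparison, precision, recall, F-measure, and MAE can be computed per image from a saliency map and its ground-truth mask. The sketch below is an assumption about how such an evaluation might look: the fixed threshold of 0.5, the weighting `beta_sq=0.3` (a value often used in saliency work), and the function name `saliency_metrics` are not taken from the project. The area under the ROC curve is omitted for brevity.

```python
import numpy as np

def saliency_metrics(saliency, mask, threshold=0.5, beta_sq=0.3):
    """Return (precision, recall, F-measure, MAE) for one saliency map.

    saliency: float array in [0, 1]; mask: binary ground-truth array.
    """
    pred = saliency >= threshold          # binarize the saliency map
    gt = mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()   # correctly detected salient pixels
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom > 0 else 0.0
    # MAE compares the continuous map with the mask, without thresholding.
    mae = np.abs(saliency - gt.astype(float)).mean()
    return precision, recall, f_measure, mae

# Tiny hand-made example (assumption).
saliency = np.array([[0.9, 0.8],
                     [0.2, 0.6]])
mask = np.array([[1, 1],
                 [0, 0]])
precision, recall, f_measure, mae = saliency_metrics(saliency, mask)
```

Averaging these values over the evaluated images gives dataset-level scores.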
Standard DMD:
The proposed method is targeted especially at natural scenes. It relies on general image information and is suboptimal, and can fail, for images with a highly textured or highly structured background that has the same color distribution as the foreground. DMD captures the difference between consecutive columns in the data matrix, and the modes bounded away from the origin are detected as the salient region. When the background and foreground distributions overlap heavily and the background is highly textured, color-based and intensity-based image representations become unsuitable for the application of DMD.
From the experiments performed, we conclude that standard DMD offers satisfactory performance. rSVD-DMD performs worst because it is affected by fluctuations and uncertainties in the measured data.