
Commit fb5d0db: version 2.1 (retuned weights)
Parent: 9851c9a

2 files changed: +265, -154 lines changed


README.md

Lines changed: 107 additions & 35 deletions

@@ -1,6 +1,6 @@

# SSIMULACRA 2 - Structural SIMilarity Unveiling Local And Compression Related Artifacts

Perceptual metric developed by Jon Sneyers (Cloudinary) in July-October 2022, updated in April 2023.

## Usage

@@ -22,7 +22,7 @@ SSIMULACRA 2 is based on the concept of the multi-scale structural similarity index

computed in a perceptually relevant color space, adding two other (asymmetric) error maps, and
aggregating using two different norms.

- XYB color space (rescaled to a 0..1 range and with B-Y)
- Three error maps:
  - SSIM map (with a corrected SSIM formula that avoids applying gamma correction twice)
  - 'blockiness/ringing' map (error means distorted has edges where original is smooth)

@@ -31,15 +31,44 @@ aggregating using two different norms.

- Downscaling is done in linear color (i.e. the perceptually correct way)
- For each of these `6*3*3=54` maps, two norms are computed: 1-norm (mean) and 4-norm
- A weighted sum of these `54*2=108` norms leads to the final score (sketched in code below)
- Weights were tuned based on a large set of subjective scores (CID22, TID2013, KADID-10k, KonFiG-IQA),
  including images compressed with JPEG, JPEG 2000, JPEG XL, WebP, AVIF, HEIC, and various artificial distortions.
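
The aggregation step is easy to sketch. The snippet below is a simplified Python illustration, not the reference C++ implementation: the function name, the placeholder maps, and the uniform weights are made up, and the real metric additionally remaps the weighted sum before reporting the final score.

```
import numpy as np

def aggregate_norms(error_maps, weights):
    """Combine per-map norms into a single raw score.

    error_maps: 54 two-dimensional error maps (6 scales x 3 XYB components x 3 map types).
    weights: 108 tuned weights, one per (map, norm) pair.
    """
    norms = []
    for err in error_maps:
        abs_err = np.abs(err)
        norms.append(abs_err.mean())                 # 1-norm: mean absolute error
        norms.append((abs_err ** 4).mean() ** 0.25)  # 4-norm: emphasizes the worst regions
    return float(np.dot(weights, norms))             # weighted sum of 54*2 = 108 norms

# Example with placeholder data:
maps = [np.random.rand(64, 64) for _ in range(54)]
raw_score = aggregate_norms(maps, np.ones(108) / 108.0)
```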

Final results after tuning:

SSIMULACRA 2.1:

Dataset | KRCC | SRCC | PCC
-- | -- | -- | --
CID22 | 0.6903 | 0.8805 | 0.8583
TID2013 | 0.6590 | 0.8445 | 0.8471
KADID-10k | 0.6175 | 0.8133 | 0.8030
KonFiG(F) | 0.7668 | 0.9194 | 0.9136

SSIMULACRA 2.0:

Dataset | KRCC | SRCC | PCC
-- | -- | -- | --
CID22 | 0.6934 | 0.8820 | 0.8601
TID2013 | 0.6322 | 0.8194 | 0.8103
KADID-10k | 0.5870 | 0.7851 | 0.7018
KonFiG(F) | 0.7813 | 0.9280 | 0.8710

The weight tuning was done by running a Nelder-Mead simplex search, optimizing to minimize MSE and
to maximize Kendall and Pearson correlation, on training data consisting of the CID22 training set,
TID2013, KADID-10k, and KonFiG (F boosting).
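
A rough sketch of this kind of tuning loop, using SciPy's Nelder-Mead implementation: the data arrays, the relative weighting of the correlation terms, and the optimizer options are placeholders; only the general shape (minimize MSE while maximizing Kendall and Pearson correlation) follows the description above.

```
import numpy as np
from scipy.optimize import minimize
from scipy.stats import kendalltau, pearsonr

# Placeholder training data: one 108-element norm vector and one subjective score per image.
rng = np.random.default_rng(0)
norms = rng.random((1000, 108))     # hypothetical per-image norms
mos = rng.random(1000) * 100.0      # hypothetical subjective scores (0..100)

def loss(weights):
    pred = norms @ weights
    mse = np.mean((pred - mos) ** 2)
    tau = kendalltau(pred, mos).correlation
    r = pearsonr(pred, mos)[0]
    # Trade MSE off against rank/linear correlation; the 100.0 factor is arbitrary.
    return mse - 100.0 * (tau + r)

result = minimize(loss, x0=np.full(108, 0.01), method='Nelder-Mead',
                  options={'maxiter': 5000, 'xatol': 1e-6, 'fatol': 1e-6})
tuned_weights = result.x
```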

Changes compared to SSIMULACRA 2.0:

- weights retuned to correlate better with other datasets
- changed the range of the 3 components to ensure they are in 0..1 so the SSIM formula makes sense
- added a polynomial remapping of the error score to allow a better fit to datasets with higher distortions (see the sketch below)
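
The idea of such a remapping can be illustrated as follows. The degree, the synthetic data, and the use of `np.polyfit` are assumptions chosen for the sketch; they are not the actual remapping or coefficients used by SSIMULACRA 2.1.

```
import numpy as np

# Synthetic placeholder data: raw aggregated error per image and matching subjective scores.
raw = np.linspace(0.0, 1.5, 300)                              # hypothetical raw metric values
mos = np.clip(100.0 - 90.0 * raw + 15.0 * raw ** 2, 0, 100)   # hypothetical subjective scores

# Fit a low-degree polynomial that remaps raw error onto the subjective scale, so that
# strongly distorted images are no longer squeezed into a narrow score range.
remap = np.poly1d(np.polyfit(raw, mos, deg=3))

print(remap(0.25))   # remapped score for a raw error of 0.25
```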

Changes compared to the [original version (SSIMULACRA 1)](https://github.com/cloudinary/ssimulacra):

- works in XYB color space instead of CIE Lab
- linear downscaling

@@ -54,47 +83,90 @@ Changes compared to the [original version](https://github.com/cloudinary/ssimulacra):

## Metric performance

These tables show how well various metrics correlate with subjective scores, in terms of the
Kendall Rank Correlation Coefficient (KRCC), Spearman Rank Correlation Coefficient (SRCC),
and Pearson Correlation Coefficient (PCC), for several IQA datasets.
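
For reference, all three coefficients can be computed directly with SciPy; the metric and MOS arrays below are placeholders.

```
import numpy as np
from scipy.stats import kendalltau, spearmanr, pearsonr

# Placeholder arrays: one metric score and one subjective score (MOS) per image.
metric_scores = np.array([82.1, 74.5, 61.0, 90.3, 55.7])
subjective_mos = np.array([80.0, 70.0, 65.0, 95.0, 50.0])

krcc = kendalltau(metric_scores, subjective_mos).correlation  # Kendall rank correlation
srcc = spearmanr(metric_scores, subjective_mos).correlation   # Spearman rank correlation
pcc = pearsonr(metric_scores, subjective_mos)[0]              # Pearson (linear) correlation

print(f"KRCC={krcc:.4f}  SRCC={srcc:.4f}  PCC={pcc:.4f}")
```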

TID2013:

Metric | KRCC | SRCC | PCC
-- | -- | -- | --
PSNR-Y | 0.4699 | 0.6394 | 0.4280
PSNR-HVS | 0.5464 | 0.6980 | 0.6846
SSIM | 0.5707 | 0.7552 | 0.7640
MS-SSIM | 0.6068 | 0.7868 | 0.7802
VMAF | 0.5608 | 0.7439 | 0.7728
SSIMULACRA 2 | 0.6322 | 0.8194 | 0.8103
SSIMULACRA 2.1 | 0.6590 | 0.8445 | 0.8471
DSSIM | -0.6984 | -0.8710 | -0.8021
Butteraugli (3-norm) | -0.4935 | -0.6639 | -0.4878
PSNR (ImageMagick) | 0.4958 | 0.6869 | 0.6601

KADID-10k:

Metric | KRCC | SRCC | PCC
-- | -- | -- | --
PSNR-Y | 0.4555 | 0.6319 | 0.5932
PSNR-HVS | 0.4229 | 0.5927 | 0.5949
SSIM | 0.5889 | 0.7806 | 0.6576
MS-SSIM | 0.6466 | 0.8359 | 0.6836
VMAF | 0.5343 | 0.7253 | 0.7185
SSIMULACRA 2 | 0.5870 | 0.7851 | 0.7018
SSIMULACRA 2.1 | 0.6175 | 0.8133 | 0.8030
DSSIM | -0.6679 | -0.8561 | -0.6544
Butteraugli (3-norm) | -0.3846 | -0.5430 | -0.4424
PSNR (ImageMagick) | 0.4876 | 0.6757 | 0.6214

KonFiG-IQA: (Experiment I, F boosting, clamping negative JND (better than reference) to zero)

Metric | KRCC | SRCC | PCC
-- | -- | -- | --
PSNR-Y | 0.5871 | 0.7598 | 0.6968
PSNR-HVS | 0.7798 | 0.9277 | 0.8453
SSIM | 0.6156 | 0.7795 | 0.7052
MS-SSIM | 0.6635 | 0.8299 | 0.6834
VMAF | 0.3866 | 0.4906 | 0.4630
SSIMULACRA 2 | 0.7813 | 0.9280 | 0.8710
SSIMULACRA 2.1 | 0.7668 | 0.9194 | 0.9136
DSSIM | -0.7595 | -0.9147 | -0.6730
Butteraugli (3-norm) | -0.7710 | -0.9238 | -0.7587
PSNR (ImageMagick) | 0.6531 | 0.8248 | 0.7218

CID22 full set: (22k subjective scores)

Metric | KRCC | SRCC | PCC
-- | -- | -- | --
PSNR-Y | 0.4452 | 0.6246 | 0.5901
PSNR-HVS | 0.6076 | 0.8100 | 0.7559
SSIM | 0.5628 | 0.7577 | 0.7005
MS-SSIM | 0.5596 | 0.7551 | 0.7035
VMAF | 0.6176 | 0.8163 | 0.7799
SSIMULACRA 2 | 0.6934 | 0.8820 | 0.8601
SSIMULACRA 2.1 | 0.6903 | 0.8805 | 0.8583
DSSIM | -0.6428 | -0.8399 | -0.7813
Butteraugli (3-norm) | -0.6547 | -0.8387 | -0.7903
PSNR (ImageMagick) | 0.3472 | 0.5002 | 0.4817

CID22 validation set: (4292 subjective scores, not used for tuning)

Metric | KRCC | SRCC | PCC
-- | -- | -- | --
PSNR-Y | 0.4734 | 0.6577 | 0.6354
PSNR-HVS | 0.6199 | 0.8224 | 0.7848
SSIM | 0.6028 | 0.7871 | 0.7647
MS-SSIM | 0.5915 | 0.7781 | 0.7601
VMAF | 0.5880 | 0.7884 | 0.7502
SSIMULACRA 2 | 0.7033 | 0.8854 | 0.8745
SSIMULACRA 2.1 | 0.7077 | 0.8904 | 0.8787
DSSIM | -0.6807 | -0.8722 | -0.8220
Butteraugli (3-norm) | -0.6102 | -0.7938 | -0.7450
PSNR (ImageMagick) | 0.3491 | 0.4995 | 0.5013

## Building
