<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta name="description" content="Video Event Detection" />
<meta name="keywords" content="Constraint Flow" />
<meta name="author" content="Wei Zhen" />
<link rel="stylesheet" href="projects/css/SPR.css" type="text/css" />
<title>Semantic Preserving Retargeting</title>
</head>
<body>
<!-- --------------------------------
-
- header
-
---- --------------------------------
-->
<div id="header">
<div class="wrap">
<div id="intro">
<h1 align="center" id="logo">Semantic Preserving Retargeting</h1>
<div align="center">
<table width="60%" border="0" align="center" cellpadding="0" cellspacing="0">
<tr>
<td width="25%" height="30" align='center'><a target="_blank">Si Liu<sup>1</sup></a></td>
<td width="25%" height="30" align='center'><a target="_blank">Zhen Wei<sup>2</sup></a></td>
<td width="25%" height="30" align='center'><a target="_blank">Yao Sun<sup>1</sup></a></td>
<td width="25%" height="30" align='center'><a target="_blank">Xinyu Ou<sup>3</sup></a></td>
</tr>
</table>
<table>
<tr>
<td height="20" colspan="4" align='center'><sup>1</sup>Institute of Information Engineering, Chinese Academy of Sciences, Beijing, P.R.China</td>
<p align="center">
</p>
</tr>
<tr>
<td height="20" colspan="4" align='center'><sup>2</sup>YingCai Honors School, University of Electronic Science and Technology of China, Sichuan, P.R.China</td>
<p align="center">
</p>
</tr>
<tr>
<td height="20" colspan="4" align='center'><sup>3</sup>School of Computer Science and Technology, Huazhong University of
Science and Technology, Hubei, P.R.China</td>
<p align="center">
</p>
</tr>
</table>
</div>
</div>
<div class="nline1"></div>
</div>
</div>
<!-- --------------------------------
-
- Abstract
-
---- --------------------------------
-->
<div id="cont">
<div class="wrap">
<h2 id="subject">Abstract</h1>
<p align="justify" style="text-indent:2em">
Image retargeting adapts images of arbitrary size to display devices (e.g., cell phones,
TV monitors) with possibly different resolutions. To fit the target resolution,
certain less important pixels must be distorted, so the key
problem is to determine the importance of each
pixel; the importance scores of all pixels form an importance map.
Unlike traditional methods, which generate the importance
map in a bottom-up manner such as estimating eye
fixations, the proposed Semantic Preserving Retargeting
(SPR) method generates the importance map
based on a top-down criterion: the target image should retain
as much of the semantic meaning embedded in the original image
as possible. To this end, we first extract the semantic meaning
conveyed in the original image using 5 state-of-the-art deep image understanding modules:
image classification, scene classification, action classification, object detection, and image parsing.
Each module generates its own importance map, where
larger values indicate more semantic meaning carried by the
corresponding pixels, and these local maps are then fused into a global importance map.
Extensive experiments are conducted
on the RetargetMe benchmark (80 images) and our collected
Semantic-Retarget dataset (1,080 images). Results from
Amazon Mechanical Turk show the significant advantage of
our SPR method over state-of-the-art image retargeting
methods.
</p>
<div class="line"></div>
</div>
</div>
<!-- --------------------------------
-
- The S-Retarget dataset
-
---- --------------------------------
-->
<div id="cont">
<div class="wrap">
<h2 id="subject">The S-Retarget Dataset</h2>
<p align="justify" style="text-indent:2em">
To comprehensively evaluate different image retargeting methods, we
built the Semantic-Retarget (S-Retarget) dataset, which contains 1,527 images
and is much larger than existing retargeting datasets. The images were manually
selected from existing datasets and search engines, and each has an aspect ratio
greater than 1 to better fit the retargeting problem. According to their contents,
all images are divided into 6 categories: single person, multiple people,
single object, multiple objects, indoor scene, and outdoor scene.
</p>
<h3 id="subject">Annotation</h3>
<p align="justify" style="text-indent:2em">
To annotate the importance map for each image, we first asked 5 annotators to write down
the semantic meaning of the image. The annotators then rated every pixel by its degree of
relevance to the global semantics, using three values: 0 (not relevant), 0.5 (moderately relevant),
and 1 (very relevant). The final importance map
is obtained by averaging the annotations at each pixel, as sketched below.
</p>
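<p align="justify" style="text-indent:2em">
As a minimal sketch of the averaging step (not the released annotation tool; the array
shapes and the random ratings are hypothetical stand-ins), the final importance map can
be computed with NumPy as follows:
</p>
<pre>
<code>
import numpy as np

# Hypothetical stand-ins for the 5 annotators' relevance maps of one image;
# each pixel is rated 0 (not relevant), 0.5 (moderately relevant) or 1 (very relevant).
annotations = [np.random.choice([0.0, 0.5, 1.0], size=(480, 640)) for _ in range(5)]

# The final importance map is the per-pixel average over all annotators.
importance_map = np.mean(np.stack(annotations, axis=0), axis=0)
</code>
</pre>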
<p align="justify" style="text-indent:2em">
Compared with existing saliency datasets, images in the S-Retarget dataset are not required to
contain dominantly large objects, and the ground truths are <strong class="fig-label">annotations with weights</strong>,
which carry more information than binary masks.
</p>
<p style="text-align:center">
<img src="projects/images/SPR/annotation_extended.png" alt="" width="800" height="461" align="bottom" />
</p>
<div class="caption">
<p class="caption-content">
<strong class="fig-label">Figure 1</strong>.
Some instances of annotation in the S-Retarget dataset. a) The original image; b) Annotations
from two annotators; c) Final annotation; d) Image description. The caption under each image gives the
description used for the <strong class="fig-label">annotations with weights</strong>;
<font color="red">red and bold</font> words stand for essentially salient objects
and <font color="blue">blue and italic</font> words stand for less salient objects.
</p>
</div>
<h3 id="subject">Download</h3>
<p align="justify" style="text-indent:2em">
The S-Retarget dataset will be released soon.
</p>
<div class="line"></div>
</div>
</div>
<!-- --------------------------------
-
- Architecture Overview
-
---- --------------------------------
-->
<div id="cont">
<div class="wrap">
<h2 id="subject">Architecture Overview</h2>
<p align="justify" style="text-indent:2em">
The goal of SPR is to make the target image preserve
the global semantics as much as possible. Technically, we
need to generate a global importance map that assigns higher
weights to the pixels expressing the global semantics.
</p>
<p align="justify" style="text-indent:2em">
Generating a global importance map consists of two steps:
1) decomposing the global semantics into several local semantics, and 2) fusing
the importance maps induced by the local semantics into a global importance
map. The two processes are shown in the following two figures.
</p>
<p style="text-align:center">
<img src="projects/images/SPR/framework.png" alt="" width="800" height="282" align="bottom" />
</p>
<div class="caption">
<p class="caption-content">
<strong class="fig-label">Figure 1</strong>.
The framework of generating local importance maps.
</p>
</div>
<p style="text-align:center">
<img src="projects/images/SPR/fusion_network.png" alt="" width="850" height="280" align="bottom" />
</p>
<div class="caption">
<p class="caption-content">
<strong class="fig-label">Figure 2</strong>.
The fusion network to synthesize global importance map.
</p>
</div>
<p align="justify" style="text-indent:2em">
The fused importance map, which captures the
context among local semantics and assigns higher weights
to the pixels corresponding to the global semantics, can then be fed
into any subsequent target image generation method.
</p>
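<p align="justify" style="text-indent:2em">
To make the pipeline concrete, the following sketch fuses the 5 local importance maps with a
fixed pixel-wise weighted sum; this is only an illustrative stand-in, since the actual
combination is learned by the fusion network in Figure 3 rather than fixed by hand:
</p>
<pre>
<code>
import numpy as np

def fuse_local_maps(local_maps, weights):
    """Fuse per-module importance maps (HxW arrays) into one global map.

    local_maps: maps from the image, scene and action classification,
    object detection, and image parsing modules.
    weights: hypothetical per-module weights; the paper learns the
    fusion with a network instead of fixing them.
    """
    fused = sum(w * m for w, m in zip(weights, local_maps))
    # Normalize to [0, 1] so the map can feed any retargeting carrier.
    return (fused - fused.min()) / (fused.max() - fused.min() + 1e-8)
</code>
</pre>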
<div class="line"></div>
</div>
</div>
<div id="footer">
</div>
<div id="cont">
<div class="wrap">
<h2 id="subject">Performance</h2>
<h3 id="subject">Quantified Evaluation</h3>
<p align="justify" style="text-indent:2em">
We compare our importance maps with state-of-the-art saliency
generation methods under the four evaluation criteria of the MIT saliency
benchmark<sup>[1]</sup> and the Mean Absolute Error (MAE). For EMD, KL, and MAE, lower is
better, while for CC and SIM, higher is better.
</p>
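<p align="justify" style="text-indent:2em">
For reference, MAE, CC, and SIM can be computed between a predicted map and the ground truth
as sketched below (standard definitions; EMD and KL are omitted for brevity, and both inputs
are assumed to be HxW arrays on the same scale):
</p>
<pre>
<code>
import numpy as np

def mae(pred, gt):
    # Mean Absolute Error; lower is better.
    return np.mean(np.abs(pred - gt))

def cc(pred, gt):
    # Pearson linear Correlation Coefficient; higher is better.
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    g = (gt - gt.mean()) / (gt.std() + 1e-8)
    return np.mean(p * g)

def sim(pred, gt):
    # Histogram intersection (SIM); each map is normalized to sum to 1.
    p = pred / (pred.sum() + 1e-8)
    g = gt / (gt.sum() + 1e-8)
    return np.minimum(p, g).sum()
</code>
</pre>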
<p align="justify" style="text-indent:2em">
It can be observed that our fusion network performs best on
the S-Retarget dataset under all evaluation metrics.
</p>
<p align="center">
<strong class="fig-label">Table 1. </strong>
Quantitative evaluation of importance map regression on the validation set of the S-Retarget dataset.
<img src="projects/images/SPR/table_importance_map_baseline.png" alt="" width="500" height="334" align="bottom" />
</p>
<h3 id="subject">Retargeting Results</h3>
<p align="justify" style="text-indent:2em">
We applied our system to the S-Retarget dataset as well as the RetargetMe<sup>[2]</sup> dataset. The following retargeting
results show that our method better preserves the semantic meaning of the images.
</p>
<p align="center">
<img src="projects/images/SPR/system_compare2.png" alt="" width="800" height="677" align="bottom" />
</p>
<div class="caption">
<p class="caption-content">
<strong class="fig-label">Figure 3</strong>.
Comparisons with SOAT, ISC, Multi-operator, Warp, AAD , OSS on S-Retarget dataset. 6 rows show the results for single person,
multiple people, single object, multiple objects, indoor scene and outdoor scene, respectively
</p>
</div>
<p align="center">
<img src="projects/images/SPR/figure12-2.png" alt="" width="800" height="365" align="bottom" />
</p>
<div class="caption">
<p class="caption-content">
<strong class="fig-label">Figure 4</strong>.
Results on RetargetMe dataset. Target images are obtained by using 3 retargeting methods (AAD, Multi-Op, and IF, and 9 importance
maps ( eDN, GC, oriIF, DNEF, RCC, fine-tuned MC, fine-tuned Mr-cnn, fine-tuned SalNet and our method).
</p>
</div>
<p align="justify" style="text-indent:2em">
We also conducted human evaluations on Amazon Mechanical Turk (AMT). Our target image and the result
of a baseline are shown in random order to the AMT workers, who are asked to select the better one.
The evaluation results are shown below. Each entry reports a pairwise result; for example, the
number “2985(255)” means our result is preferred 2,985 times while the corresponding baseline method
is favored 255 times.
</p>
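<p align="justify" style="text-indent:2em">
Expressed as a rate, “2985(255)” means our method wins 2985 / (2985 + 255) &#8776; 92% of the
pairwise comparisons; the hypothetical helper below (not part of the released code) converts
any such pair of counts:
</p>
<pre>
<code>
def preference_rate(ours, baseline):
    # Fraction of pairwise AMT comparisons won by our method.
    return ours / (ours + baseline)

print(preference_rate(2985, 255))  # ~0.921
</code>
</pre>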
<p align="center">
<strong class="fig-label">Table 2. </strong>
Comparison between our importance map and 8 baseline maps when combined with 3 different carriers on the S-Retarget dataset.
<img src="projects/images/SPR/Table2.png" alt="" width="600" height="548" align="bottom" />
</p>
<p align="center">
<strong class="fig-label">Table 3. </strong>
Comparisons with state-of-the-art retargeting systems on the S-Retarget dataset.
<img src="projects/images/SPR/Table3.png" alt="" width="500" height="48" align="bottom" />
</p>
<p align="center">
<strong class="fig-label">Table 4. </strong>
Comparison between our importance map and 8 baseline maps when combined with 3 carriers on the RetargetMe dataset.
<img src="projects/images/SPR/Table4.png" alt="" width="600" height="101" align="bottom" />
</p>
<div class="line"></div>
</div>
<!-- --------------------------------
-
- Paper
-
---- --------------------------------
-->
<div id="cont">
<div class="wrap">
<h2 id="subject">Paper</h2>
<div class="paper-info-box" align="center">
<div class="paper-title-box">
Semantic Preserving Retargeting
</div>
<div class="paper-author-box">
Si Liu, Zhen Wei, Yao Sun, Xinyu Ou, Xiaochun Cao, Shuicheng Yan
</div>
</div>
<div align="center">
<table width="100%" border="0" align="center" cellpadding="0" cellspacing="0">
<tr>
<td width="18%" align='center'> <a target="_blank"> <div align="center"> <img src='projects/images/SPR/paper.png' width="140" height="198"></div></a></td>
<td width="82%">
<pre>
<code>
@article{liu2016semantic,
  title={Semantic Preserving Retargeting},
  author={Liu, Si and Wei, Zhen and Sun, Yao and Ou, Xinyu and Cao, Xiaochun and Yan, Shuicheng},
  journal={arXiv preprint arXiv:1605.xxxxxx},
  year={2016}
}
</code>
</pre>
</td>
</tr>
<tr>
<td width="19%" align='center'> <a target="_blank">[arxiv preprint]</a></td>
</tr>
</table>
</div>
<div class="line"></div>
</div>
</div>
<!-- --------------------------------
-
- Code & Model
-
---- --------------------------------
-->
<div id="cont">
<div class="wrap">
<h2 id="subject">Code & Model</h2>
<p align="justify">
The code and trained model for the proposed method will be released soon.
</p>
<div class="line"></div>
</div>
</div>
<!-- --------------------------------
-
- References
-
---- --------------------------------
-->
<div id="cont">
<div class="wrap">
<h2 id="subject">References</h2>
<ul class="ref-list" align="justify">
<li class="ref-item">Z. Bylinskii, T. Judd, A. Borji, L. Itti, F. Durand, A. Oliva and A. Torralba. MIT Saliency Benchmark. <a href="http://saliency.mit.edu/">http://saliency.mit.edu/</a>.</li>
<li class="ref-item">R. Michael, G. Diego, S. Olga and S. Ariel. A comparative study of image retargeting. In TOG, 2010.</li>
</ul>
<div class="line"></div>
</div>
</div>
<div align="center">
<table width="40%" border="0" align="center" cellpadding="0" cellspacing="0">
<tr>
<td width="25%" height="30" align='center'><font size="2" color="gray">2016, by 魏震<font></td>
<td width="25%" height="30" align='center'>
<a href="http://www.clustrmaps.com/map/spretarget.com/project.html" title="Visitor Map for spretarget.com/project.html">
<img src="//www.clustrmaps.com/map_v2.png?u=TG9s&d=aaM25bS88IavThrLJBjr1ALgexcUhP_ucMybKb4ZGwU" />
</a>
</td>
</tr>
</table>
</div>
</body>
</html>