header
as6325400 committed Apr 23, 2024
1 parent c78c99a commit 95634c8
Showing 2 changed files with 13 additions and 11 deletions.
22 changes: 12 additions & 10 deletions components/Header.vue
@@ -3,20 +3,22 @@ const config = reactive({
   title: 'MeDM<div>Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance</div>',
   venue: 'AAAI 2024',
   authors: [
-    { text: 'Ernie Chu', homepage: 'https://ernestchu.github.io', mark: '' },
-    { text: 'Tzuhsuan Huang', homepage: '', mark: '' },
-    { text: 'Shuo-Yen Lin', homepage: '', mark: '' },
-    { text: 'Jun-Cheng Chen', homepage: 'https://www.citi.sinica.edu.tw/pages/pullpull', mark: '' },
+    { text: 'Jia-Wei Liao', homepage: 'https://jwliao1209.github.io/', mark: '' },
+    { text: 'Winston Wang', homepage: '', mark: '' },
+    { text: 'Tzu-Sian Wang', homepage: '', mark: '' },
+    { text: 'Li-Xuan Peng', homepage: '', mark: '' },
+    { text: 'Cheng-Fu Chou', homepage: '', mark: '' },
+    { text: 'Jun-Cheng Chen', homepage: 'https://www.citi.sinica.edu.tw/pages/pullpull/', mark: '' },
   ],
   affiliations: [
-    { text: 'Research Center for Information Technology Innovation, Academia Sinica', mark: '' }
+    { text: 'Research Center for Information Technology Innovation, Academia Sinica National Taiwan University', mark: '' }
   ],
   links: [
-    { text: 'Paper', url: 'https://doi.org/10.1609/aaai.v38i2.27899', icon: ['ai', 'ai-doi'] },
-    { text: 'Poster', url: '/medm-poster.pdf', icon: ['fa-regular', 'fa-file-pdf'] },
-    { text: 'arXiv', url: 'https://arxiv.org/abs/2308.10079', icon: ['ai', 'ai-arxiv'] },
-    { text: 'Code', url: 'https://github.com/aiiu-lab/MeDM', icon: ['fa-brands', 'fa-github'] },
-    { text: 'Citation', url: 'https://github.com/aiiu-lab/MeDM?tab=readme-ov-file#citation', icon: ['fa-brands', 'fa-github'] },
+    // { text: 'Paper', url: 'https://doi.org/10.1609/aaai.v38i2.27899', icon: ['ai', 'ai-doi'] },
+    // { text: 'Poster', url: '/medm-poster.pdf', icon: ['fa-regular', 'fa-file-pdf'] },
+    { text: 'arXiv', url: 'https://arxiv.org/abs/2403.15878', icon: ['ai', 'ai-arxiv'] },
+    { text: 'Code', url: 'https://github.com/jwliao1209/DiffQRCode', icon: ['fa-brands', 'fa-github'] },
+    { text: 'Citation', url: 'https://github.com/jwliao1209/DiffQRCode', icon: ['fa-brands', 'fa-github'] },
   ],
 })
2 changes: 1 addition & 1 deletion components/Main.vue
@@ -41,7 +41,7 @@ function shuffle(array) {

 <div class="section-title">Abstract</div>
 <p>
-MeDM utilizes pre-trained image Diffusion Models for video-to-video translation with consistent temporal flow. The proposed framework can render videos from scene position information, such as a normal G-buffer, or perform text-guided editing on videos captured in real-world scenarios. We employ explicit optical flows to construct a practical coding that enforces physical constraints on generated frames and mediates independent frame-wise scores. By leveraging this coding, maintaining temporal consistency in the generated videos can be framed as an optimization problem with a closed-form solution. To ensure compatibility with Stable Diffusion, we also suggest a workaround for modifying observed-space scores in latent-space Diffusion Models. Notably, MeDM does not require fine-tuning or test-time optimization of the Diffusion Models.
+QR codes, prevalent in daily applications, lack visual appeal due to their conventional black-and-white design. Integrating aesthetics while maintaining scannability poses a challenge. In this paper, we introduce a novel diffusion-model-based aesthetic QR code generation pipeline, utilizing a pre-trained ControlNet and guided iterative refinement via a novel classifier guidance (SRG) based on the proposed Scanning-Robust Loss (SRL), tailored to the QR code mechanism, which ensures both aesthetics and scannability. To further improve scannability while preserving aesthetics, we propose a two-stage pipeline with Scanning-Robust Perceptual Guidance (SRPG). Moreover, we can further enhance the scannability of the generated QR code by post-processing it with the proposed Scanning-Robust Projected Gradient Descent (SRPGD) technique, which is based on SRL and has proven convergence. Extensive quantitative, qualitative, and subjective experiments demonstrate that the proposed approach can generate diverse aesthetic QR codes with flexibility in detail. In addition, our pipeline outperforms existing models in terms of Scanning Success Rate (SSR), achieving 86.67% (+40%) with comparable aesthetic scores. The pipeline combined with SRPGD further achieves 96.67% (+50%).
 </p>
 <div class="image">
 <ImageZoom src="/images/girl.jpg" :options="{ background: imageOverlayColor }" />
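The SRPGD post-processing described in the new abstract is, at its core, projected gradient descent on a scanning-robustness loss over the pixel domain. The following is a minimal sketch of that idea only; the real Scanning-Robust Loss operates on QR-code modules and decoder thresholds, so here a toy per-pixel quadratic loss pulling each pixel toward a binary module target stands in for SRL, and `srpgd`, `step`, and `iters` are hypothetical names, not the paper's API.

```python
def srpgd(pixels, target, step=0.1, iters=100):
    """Toy projected-gradient-descent refinement.

    Minimizes 0.5 * (x - target)^2 per pixel as a stand-in for the
    Scanning-Robust Loss, projecting onto the valid range [0, 1]
    after every gradient step.
    """
    x = list(pixels)
    for _ in range(iters):
        # Gradient of the stand-in loss with respect to each pixel.
        grad = [xi - ti for xi, ti in zip(x, target)]
        # Gradient step followed by projection onto the pixel box [0, 1].
        x = [min(1.0, max(0.0, xi - step * gi)) for xi, gi in zip(x, grad)]
    return x

# Nudge three pixels toward their binary QR-module targets.
refined = srpgd([0.3, 0.9, 0.5], [0.0, 1.0, 1.0])
```

Because the step map is a contraction and the projection set is convex, this toy loop converges for any `step` in (0, 1]; the paper's convergence claim for SRPGD plays the analogous role for the real SRL.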
