StyleMaster: Stylize Your Video with Artistic Generation and Translation
Zixuan Ye1 †, Huijuan Huang2✉, Xintao Wang2, Pengfei Wan2, Di Zhang2, Wenhan Luo1✉
1 Hong Kong University of Science and Technology
2 Kuaishou Technology
† Intern at KwaiVGI, Kuaishou Technology
✉ Corresponding Author
- [2024.12.11] arXiv preprint is available.
Welcome to StyleMaster! StyleMaster focuses on style control, i.e., generating or translating a video to match the style of a given reference image. StyleMaster preserves local textures and enhances global style representations. Additionally, a motion adapter and a gray tile ControlNet are employed to improve motion quality and provide precise content guidance.
- Local Patch Selection: Overcomes content leakage in style transfer by selecting patches with lower similarity to the text prompt.
- Global Style Extraction: Uses a projection module after CLIP, supervised by illusion datasets.
- Motion Adapter: Improves motion quality during inference and helps strengthen the degree of stylization.
- Gray Tile ControlNet: Provides accessible yet precise content guidance for video style transfer.
- High-Quality Video Generation: Generates videos with high style similarity to the reference image and achieves ideal translation results.
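The local patch selection idea above can be illustrated with a minimal sketch. It is not the StyleMaster implementation: the function name `select_style_patches`, the `keep_ratio` parameter, and the use of plain NumPy arrays as stand-ins for patch and text embeddings are all assumptions made for illustration. The sketch only shows the general principle of keeping the patches whose embeddings have the lowest cosine similarity to the text prompt embedding, on the assumption that such patches carry style rather than content.

```python
import numpy as np


def select_style_patches(patch_emb, text_emb, keep_ratio=0.5):
    """Hypothetical helper: keep the patches least similar to the text prompt.

    patch_emb: (num_patches, dim) array of patch embeddings (assumed input).
    text_emb:  (dim,) array holding the text prompt embedding (assumed input).
    keep_ratio: fraction of patches to keep (illustrative parameter).
    Returns the sorted indices of the kept patches and all similarities.
    """
    patch_emb = np.asarray(patch_emb, dtype=np.float64)
    text_emb = np.asarray(text_emb, dtype=np.float64)
    eps = 1e-12  # guards against division by zero for all-zero vectors

    # Cosine similarity between every patch and the text prompt.
    patch_unit = patch_emb / (np.linalg.norm(patch_emb, axis=1, keepdims=True) + eps)
    text_unit = text_emb / (np.linalg.norm(text_emb) + eps)
    similarity = patch_unit @ text_unit

    # Keep the k patches with the lowest similarity (at least one).
    k = max(1, int(round(keep_ratio * similarity.shape[0])))
    lowest_first = np.argsort(similarity, kind="stable")
    return np.sort(lowest_first[:k]), similarity


if __name__ == "__main__":
    # Toy 2-D embeddings; real CLIP embeddings are much higher dimensional.
    patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    prompt = [1.0, 0.0]
    kept, sims = select_style_patches(patches, prompt, keep_ratio=0.5)
    print("kept patch indices:", kept.tolist())
    print("similarities:", np.round(sims, 3).tolist())
```

In this sketch the kept patches would then be passed on as local style features, while the patches that align closely with the prompt are dropped to reduce content leakage.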
We also encourage readers to follow other exciting master-series works:
- 3DTrajMaster: control multiple entity motions in 3D space (6DoF) for text-to-video generation
- SynCamMaster: multi-camera synchronized video generation from diverse viewpoints
and other exciting stylization methods:
- StyleCrafter: the first work to achieve style control in video generation
- InstantStyle: exciting image stylization work
@article{ye2024stylemaster,
title={StyleMaster: Stylize Your Video with Artistic Generation and Translation},
author={Ye, Zixuan and Huang, Huijuan and Wang, Xintao and Wan, Pengfei and Zhang, Di and Luo, Wenhan},
journal={arXiv preprint arXiv:2412.07744},
year={2024}
}