Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
- Dec 12, 2024: 💻 Add Project Page
- Dec 5, 2024: 🤗 Paper release
- Infinity-2B (Text-to-Image Model)
- Web Demo
- Inference
- Checkpoints
We present Infinity, a Bitwise Visual AutoRegressive Modeling framework capable of generating high-resolution, photorealistic images following language instructions. Infinity refactors the visual autoregressive model under a bitwise token prediction framework with an infinite-vocabulary classifier and a bitwise self-correction mechanism. By theoretically expanding the tokenizer vocabulary size to infinity in the Transformer, our method unlocks substantially stronger scaling capabilities than vanilla VAR. Extensive experiments indicate that Infinity outperforms autoregressive text-to-image models by large margins and matches or exceeds leading diffusion models. Without extra optimization, Infinity generates a 1024
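One way to read "bitwise token prediction" is that a token drawn from a vocabulary of size 2^d is predicted as d separate binary decisions instead of one choice among 2^d entries. The sketch below is only an illustration of that idea under stated assumptions, not the Infinity implementation: the function names, the choice of two logits per bit, and the sizes `d` and `hidden` are all hypothetical. It shows the index/bit round trip and how the output-layer weight count grows linearly in d for the bitwise head but exponentially for a conventional index classifier.

```python
# Illustrative sketch only: names, sizes, and the two-logits-per-bit head
# are assumptions for this example, not taken from the Infinity code base.

def index_to_bits(index: int, d: int) -> list[int]:
    """Write a token index in [0, 2**d) as d bits, most significant first."""
    if not 0 <= index < 2 ** d:
        raise ValueError("index out of range for d bits")
    return [(index >> shift) & 1 for shift in range(d - 1, -1, -1)]


def bits_to_index(bits: list[int]) -> int:
    """Inverse of index_to_bits: fold the bits back into an integer index."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


def index_classifier_weights(hidden: int, d: int) -> int:
    """Weights of a single linear layer with one logit per vocabulary entry."""
    return hidden * 2 ** d


def bitwise_classifier_weights(hidden: int, d: int) -> int:
    """Weights of d independent binary classifiers (assumed two logits per bit)."""
    return hidden * 2 * d


if __name__ == "__main__":
    d, hidden = 16, 1024  # hypothetical sizes chosen for the example
    bits = index_to_bits(40000, d)
    assert bits_to_index(bits) == 40000  # the round trip is lossless
    print("index classifier weights:  ", index_classifier_weights(hidden, d))
    print("bitwise classifier weights:", bitwise_classifier_weights(hidden, d))
```

Because the bitwise head scales with d rather than 2^d, increasing the number of bits per token stays cheap at the output layer, which is the intuition behind letting the vocabulary grow without bound.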
This project is licensed under the MIT License - see the LICENSE file for details.