- Global Attention Mechanism: Retain Information to Enhance Channel-Spatial Interactions (https://arxiv.org/abs/2112.05561)
- U-Net: Convolutional Networks for Biomedical Image Segmentation (https://arxiv.org/abs/1505.04597)
- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (https://arxiv.org/abs/1606.03657)
- Transformers are Graph Neural Networks (https://arxiv.org/abs/2506.22084)
- Xception: Deep Learning with Depthwise Separable Convolutions (https://arxiv.org/abs/1610.02357)
- Attention Is All You Need (https://arxiv.org/abs/1706.03762)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805)
- Improving Language Understanding by Generative Pre-Training (https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
- Learning Transferable Visual Models From Natural Language Supervision (https://arxiv.org/abs/2103.00020)
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces (https://arxiv.org/abs/2312.00752)
- High-Resolution Image Synthesis with Latent Diffusion Models (https://arxiv.org/abs/2112.10752)
- Denoising Diffusion Probabilistic Models (https://arxiv.org/abs/2006.11239)
- Diffusion Models Beat GANs on Image Synthesis (https://arxiv.org/abs/2105.05233)
- Flow Matching for Generative Modeling (https://arxiv.org/abs/2210.02747)
- Efficiently Modeling Long Sequences with Structured State Spaces (https://arxiv.org/pdf/2111.00396)