[19] EsViT : Efficient Self-supervised Vision Transformers for Representation Learning #19

Open
Dongwoo-Im opened this issue Jan 31, 2023 · 1 comment

Dongwoo-Im commented Jan 31, 2023

Links

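One-line summary

  • A paper from Microsoft that uses a multi-stage ViT such as Swin Transformer as the backbone and adds a region-based pre-training task, achieving SOTA in self-supervised learning with little additional memory or compute cost.

Reason for selection

  • I read it because it presents region-based learning, a concept I was not familiar with. (Having read it, the concept turns out not to be entirely new. The authors argue that existing augmentation-based self-supervised learning has focused on view-level learning, which is why its dense prediction performance is weak.)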
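
To make the view-level vs. region-level distinction concrete, here is a minimal NumPy sketch of what a region-matching objective could look like. It is not taken from this issue or from the official code; it reflects my understanding of the general idea: each region (patch) feature in one view is matched to its most similar region in the other view by cosine similarity, and a cross-entropy is applied between the matched pair. The function names, the temperature values, and the student/teacher naming are all assumptions for illustration.

```python
import numpy as np

def log_softmax(x, temperature):
    # Numerically stable log-softmax over the last axis.
    x = x / temperature
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

def region_matching_loss(student_feats, teacher_feats,
                         student_logits, teacher_logits,
                         student_temp=0.1, teacher_temp=0.04):
    """Illustrative region-level loss (hypothetical helper, assumed temperatures).

    student_feats:  (T_s, D) region features from one augmented view
    teacher_feats:  (T_t, D) region features from the other view
    student_logits: (T_s, K) projection-head outputs per student region
    teacher_logits: (T_t, K) projection-head outputs per teacher region
    """
    # Cosine similarity between every student region and every teacher region.
    s = student_feats / np.linalg.norm(student_feats, axis=1, keepdims=True)
    t = teacher_feats / np.linalg.norm(teacher_feats, axis=1, keepdims=True)
    similarity = s @ t.T                      # shape (T_s, T_t)

    # For each student region, pick the best-matching teacher region.
    match = similarity.argmax(axis=1)         # shape (T_s,)

    # Cross-entropy between the matched teacher distribution and the student one.
    p_teacher = np.exp(log_softmax(teacher_logits[match], teacher_temp))
    log_p_student = log_softmax(student_logits, student_temp)
    loss = -(p_teacher * log_p_student).sum(axis=1).mean()
    return float(loss), match
```

A view-level loss would compare only one pooled vector per view; the sketch above instead produces one loss term per region, which is the kind of signal the authors argue is missing for dense prediction.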

Dongwoo-Im commented Feb 24, 2023

notion link
