Hi, I recently read a paper ( Run Away From your Teacher: a New Self-Supervised Approach Solving the Puzzle of BYOL ) that expands the work on self-supervised algorithms and provides a new perspective on the BYOL methodology. While reading it, I thought it would be quite easy to implement their proposal in a usable form, using this repository as a base.
If anyone seeing this PR has read the paper in question, or wants to take a look at it, you can have a look at the `alignment_loss` and `cross_model_loss` functions, as well as the `forward` method in the `RAFT` class, and verify that the implementation agrees with the description in the paper. Everything looks in order to me, but some confirmation never hurts.
The only slight concern I have about this PR is that it may be beyond the scope of this repository, which would come to resemble a collection of BYOL-related paper implementations. @lucidrains the decision is up to you, but since 99% of the code is shared between this RAFT implementation and the BYOL one, I decided to make a PR instead of just copy-pasting all your code to a brand new project.
Another notable implementation detail is that I renamed `loss_fn` to `byol_loss` in order to avoid confusion and keep the naming consistent with `raft_loss`.
What follows is the commit message; GitHub pasted it here automatically, so I might as well leave it:
Add implementation for the algorithm described by the paper
"Run Away From your Teacher: Understanding BYOL by a Novel
Self-Supervised Approach" ( https://arxiv.org/abs/2011.10944 )
The RAFT class is essentially a copy-paste of the BYOL class,
with slight changes to the `forward` method, which computes a
different loss function, making use of the new `raft_loss`
function. RAFT's loss is the difference of two losses: an
"alignment loss" between the projection of two different
augmented views of the same image, to be minimized, and the
"cross-model loss", to be maximized, which is the distance
between the online and target representation of the same input,
averaged over the two different views.
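For readers who have not opened the diff, the loss described above can be sketched roughly as follows. This is a dependency-free illustration in plain Python, not the code in this PR: the function signatures, the argument names, and the choice of the BYOL-style normalized distance (2 minus twice the cosine similarity) are assumptions made for the example, and the real implementation operates on batched PyTorch tensors.

```python
import math


def normalized_distance(x, y):
    """Assumed distance: 2 - 2 * cosine_similarity(x, y), as in BYOL's loss."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 2.0 - 2.0 * dot / (norm_x * norm_y)


def alignment_loss(online_one, online_two):
    """Distance between the online projections of the two augmented views."""
    return normalized_distance(online_one, online_two)


def cross_model_loss(online_one, target_one, online_two, target_two):
    """Online-vs-target distance on the same view, averaged over both views."""
    return 0.5 * (
        normalized_distance(online_one, target_one)
        + normalized_distance(online_two, target_two)
    )


def raft_loss(online_one, online_two, target_one, target_two):
    """Alignment term (minimized) minus cross-model term (maximized)."""
    return alignment_loss(online_one, online_two) - cross_model_loss(
        online_one, target_one, online_two, target_two
    )


# Hypothetical 2-d projections, chosen only to exercise the functions.
example = raft_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
```

Minimizing this quantity pulls the two online views together while pushing the online network away from the target network, which is the "run away from your teacher" idea in the title.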
Tell me what you think :)