Hello,
Thanks for the great work.
As asked before (here), I do not see why, in several methods such as GraNd and the submodular functions, you use the concatenation of the loss gradient and its multiplication with the last feature embedding, as shown here:
You are essentially using the last-layer features scaled by the gradient. Is there a reason you chose this instead of what is common in the literature, such as the gradient with respect to the last-layer parameters?
Thanks!
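For reference, here is a minimal, self-contained sketch of the construction being asked about. This is not DeepCore's actual code; the sizes (`batch_size`, `embedding_dim`, `num_classes`) and the toy `nn.Linear` head are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, embedding_dim, num_classes = 4, 8, 3   # hypothetical sizes
embeddings = torch.randn(batch_size, embedding_dim)  # penultimate-layer features h
last_layer = nn.Linear(embedding_dim, num_classes)   # toy classifier head (W, b)
targets = torch.randint(num_classes, (batch_size,))

logits = last_layer(embeddings)
loss = F.cross_entropy(logits, targets, reduction="sum")

# Per-sample gradient of the loss w.r.t. the logits, shape (B, C)
logit_grads = torch.autograd.grad(loss, logits)[0]

# Outer product of the logit gradient with the embedding, shape (B, C, E):
# the "last-layer features scaled by the gradient" described in the question
scaled_features = logit_grads.unsqueeze(2) * embeddings.unsqueeze(1)

# Concatenation of the logit gradient and the flattened outer product, shape (B, C + C*E)
per_sample_vec = torch.cat([logit_grads, scaled_features.flatten(1)], dim=1)

# GraNd-style score: the l2 norm of that vector for each sample
grand_scores = per_sample_vec.norm(p=2, dim=1)
print(grand_scores)
```

For a linear head z = W h + b, the chain rule gives ∂L/∂b = ∂L/∂z and ∂L/∂W = (∂L/∂z) hᵀ for each sample, so the concatenated vector above can be compared directly against the gradient with respect to the last layer's parameters mentioned in the question.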
…503)
We had our own version of PatrickZH/DeepCore#11 because our version of their implementation confused where the inversion is placed. I thought it through and believe we do not need any inversion; I added some comments explaining the reasoning.
Note that this does not address PatrickZH/DeepCore#13!
@PatrickZH @Chengcheng-Guo Hello, we would also be curious why you did not just use the last-layer gradients but chose this form instead. Could you share some thoughts with us?