A question about reinitialize #9
Hi,
@siahuat0727 I may have found a bug: in the original paper, the author says there are two networks, the original network and the pruned one, but your code seems to have just one network. One consequence is that the pruning does not work properly: you prune the conv kernels by setting them to 0, but after that you train the network as usual, and the conv layers update all of their weights, including the pruned ones!
@siahuat0727 I think that in order to solve this problem, you need to freeze the pruned channels inside the kernels after pruning, or create another pruned network, copy the original network's weights (except the pruned channels) into it, and train this small pruned network, as the author describes in the paper.
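The second suggestion could be sketched roughly like this (a hypothetical helper, not code from this repo: it builds a narrower `Conv2d` and copies over only the un-pruned output channels):

```python
# Hypothetical sketch of the "small sub-network" idea: create a new Conv2d
# with fewer output channels and copy only the kept channels' weights into it.
import torch
import torch.nn as nn

def make_pruned_conv(conv: nn.Conv2d, keep: torch.Tensor) -> nn.Conv2d:
    """Return a new conv containing only the output channels where `keep` is True."""
    new_conv = nn.Conv2d(conv.in_channels, int(keep.sum()),
                         conv.kernel_size, conv.stride,
                         conv.padding, bias=conv.bias is not None)
    with torch.no_grad():
        # Index along the output-channel dimension to drop pruned kernels.
        new_conv.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias[keep])
    return new_conv
```

Training this smaller conv then only ever touches the surviving channels, so no gradient masking is needed.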
@sharpstill Lines 68 to 73 in b0f46f4
@siahuat0727 I still think this step has a problem: because of what you use here, you end up freezing all the weights of the conv layers (Line 266 in b0f46f4).
I mean you need to freeze only the pruned channels of the conv kernels, not all of them. I recommend considering the suggestion from my last response (use a newly created small sub-network, copy the un-pruned weights into it, then train this sub-network). Another question: some conv layers have a bias, and your W.grad[mask[name]] = 0 does not handle the bias case.
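If you keep the single-network approach instead, a hypothetical gradient-masking helper (not code from this repo; `mask` is assumed to map layer name to a boolean tensor over output channels) would need to zero both the weight and the bias gradients of the pruned channels after each backward():

```python
# Hypothetical sketch: zero the gradients of pruned channels after backward(),
# so the optimizer never updates weights that were pruned to zero.
import torch
import torch.nn as nn

def freeze_pruned_gradients(model: nn.Module, mask: dict):
    """Zero weight AND bias gradients of pruned output channels."""
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d) and name in mask:
            pruned = mask[name]  # bool tensor of shape (out_channels,)
            if module.weight.grad is not None:
                module.weight.grad[pruned] = 0
            # The bias case the comment above points out:
            if module.bias is not None and module.bias.grad is not None:
                module.bias.grad[pruned] = 0
```

Called between `loss.backward()` and `optimizer.step()`, this keeps the pruned channels (and their biases) at zero while the remaining channels train normally. Note that optimizers with momentum or weight decay may still move masked weights unless their state is handled too.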
Furthermore, the BN layers are also affected by the channel count: since you train with all of the channels, the BN layers won't work properly either.
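In the sub-network approach, the BatchNorm following a pruned conv would also have to shrink. A hypothetical helper (again, not code from this repo) could copy its affine parameters and running statistics for the kept channels only:

```python
# Hypothetical sketch: shrink a BatchNorm2d to match the surviving channels,
# keeping its learned affine parameters and running statistics.
import torch
import torch.nn as nn

def make_pruned_bn(bn: nn.BatchNorm2d, keep: torch.Tensor) -> nn.BatchNorm2d:
    """Return a new BN over only the channels where `keep` is True."""
    new_bn = nn.BatchNorm2d(int(keep.sum()))
    with torch.no_grad():
        new_bn.weight.copy_(bn.weight[keep])
        new_bn.bias.copy_(bn.bias[keep])
        new_bn.running_mean.copy_(bn.running_mean[keep])
        new_bn.running_var.copy_(bn.running_var[keep])
    return new_bn
```

If pruned channels are merely zeroed in a single network, the BN running mean and variance for those channels keep being updated from all-zero activations, which is one way the statistics end up wrong.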
Hi, Lines 161 to 166 in b0f46f4
I think the only difference is the bias term, but I don't think that's the point. You can try it and see the difference.
Thank you for your reply. I will try my solution and report the result so we can compare it with yours.
Thanks. Looking forward to your experiment.
Hi, I'm a little confused about the reinitialization.
After reinitializing, the weights of the network change immediately, so the accuracy may drop at the reinitialization point. Could you tell me why the curve is continuous?