Naming layers and getting trainer weights #335
-
class BiLSTMClassificationModel(Block):
I want to attach the trainer to only the rest of the network WITHOUT adding self.embedding. I'm unable to define: bilstmclassificationmodel12_ (
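For context, here is a minimal sketch of the kind of model being discussed; the layer names and sizes are illustrative assumptions, not taken from the original post:

```python
import mxnet as mx
from mxnet.gluon import Block, nn, rnn

class BiLSTMClassificationModel(Block):
    def __init__(self, vocab_size, embed_size, hidden_size, num_classes, **kwargs):
        super(BiLSTMClassificationModel, self).__init__(**kwargs)
        with self.name_scope():
            # Embedding that the trainer should NOT update.
            self.embedding = nn.Embedding(vocab_size, embed_size)
            # "Rest" of the network, which the trainer should update.
            self.encoder = rnn.LSTM(hidden_size, bidirectional=True, layout='NTC')
            self.output = nn.Dense(num_classes)

    def forward(self, x):
        emb = self.embedding(x)               # (batch, seq) -> (batch, seq, embed)
        enc = self.encoder(emb)               # (batch, seq, 2 * hidden)
        return self.output(enc.mean(axis=1))  # average over time, then classify
```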
Replies: 4 comments
-
If the goal is to freeze the weights of the embedding layer, see #333.
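For reference, a minimal sketch of the freezing approach, assuming the hypothetical model above with an `embedding` child block:

```python
# Skip gradient computation (and hence updates) for the embedding parameters.
model.embedding.collect_params().setattr('grad_req', 'null')
```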
-
Even otherwise, can you give an example of how to use `with self.name_scope('REST'):` without using Sequential()?
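A sketch of one way to do this without Sequential(), grouping the non-embedding layers in their own child Block (the names and the `rest_` prefix are assumptions); note that Gluon's `name_scope()` itself takes no argument, and a custom prefix is passed to the Block constructor instead:

```python
from mxnet.gluon import Block, nn, rnn

class RestOfNetwork(Block):
    def __init__(self, hidden_size, num_classes, **kwargs):
        super(RestOfNetwork, self).__init__(**kwargs)
        with self.name_scope():  # name_scope() takes no argument
            self.encoder = rnn.LSTM(hidden_size, bidirectional=True, layout='NTC')
            self.output = nn.Dense(num_classes)

    def forward(self, x):
        return self.output(self.encoder(x).mean(axis=1))

# In the parent model's __init__ (inside its own name_scope):
#     self.embedding = nn.Embedding(vocab_size, embed_size)
#     self.rest = RestOfNetwork(hidden_size, num_classes, prefix='rest_')
# model.rest.collect_params() then selects exactly the non-embedding parameters.
```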
-
Since collect_params() returns a ParameterDict, which is a dictionary, you can add elements to it. For example: `d = net.layer1.collect_params()` followed by `d.update(net.layer2.collect_params())`. As long as you don't add the embedding layer's parameters to the dictionary passed to the trainer, it won't update them. I'd still recommend setting grad_req to 'null' for the layers you don't intend to update, so that the gradient calculation can be skipped.
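A minimal sketch of this approach end to end, assuming the hypothetical BiLSTMClassificationModel sketched earlier in the thread:

```python
import numpy as np
import mxnet as mx
from mxnet import autograd, gluon

# Build and initialize the (hypothetical) model sketched above.
model = BiLSTMClassificationModel(vocab_size=10000, embed_size=100,
                                  hidden_size=128, num_classes=2)
model.initialize(mx.init.Xavier())

# Collect only the parameters the trainer should update.
params = model.encoder.collect_params()
params.update(model.output.collect_params())

# The embedding parameters are not in `params`, so the trainer never updates them.
trainer = gluon.Trainer(params, 'adam', {'learning_rate': 1e-3})

# Also skip gradient computation for the frozen embedding.
model.embedding.collect_params().setattr('grad_req', 'null')

# One illustrative training step on random data.
data = mx.nd.array(np.random.randint(0, 10000, size=(8, 20)))  # (batch, seq)
label = mx.nd.array(np.random.randint(0, 2, size=(8,)))
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
with autograd.record():
    loss = loss_fn(model(data), label)
loss.backward()
trainer.step(batch_size=8)
```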
-
Closing now. Let me know if you have more questions along these lines.