Naming layers and getting trainer weights #335
-
class BiLSTMClassificationModel(Block):
I want to attach the trainer to only the rest of the network WITHOUT adding self.embedding. I'm unable to define: bilstmclassificationmodel12_ (
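For context, here is a minimal sketch of the kind of model being discussed; the layer names and sizes are illustrative assumptions, not taken from the original post:

```python
import mxnet as mx
from mxnet.gluon import Block, nn, rnn

class BiLSTMClassificationModel(Block):
    def __init__(self, vocab_size, embed_size, hidden_size, num_classes, **kwargs):
        super(BiLSTMClassificationModel, self).__init__(**kwargs)
        with self.name_scope():
            # Embedding that the trainer should NOT update.
            self.embedding = nn.Embedding(vocab_size, embed_size)
            # "Rest" of the network, which the trainer should update.
            self.encoder = rnn.LSTM(hidden_size, bidirectional=True, layout='NTC')
            self.output = nn.Dense(num_classes)

    def forward(self, x):
        emb = self.embedding(x)               # (batch, seq) -> (batch, seq, embed)
        enc = self.encoder(emb)               # (batch, seq, 2 * hidden)
        return self.output(enc.mean(axis=1))  # average over time, then classify
```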
Replies: 4 comments
-
If the goal is to freeze the weights of the embedding layer, see #333.
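For reference, a minimal sketch of the freezing approach, assuming the hypothetical model above with an `embedding` child block:

```python
# Skip gradient computation (and hence updates) for the embedding parameters.
model.embedding.collect_params().setattr('grad_req', 'null')
```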
-
Even otherwise, can you give an example of how to use `with self.name_scope('REST'):` without using Sequential()?
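A sketch of one way to do this without Sequential(), grouping the non-embedding layers in their own child Block (the names and the `rest_` prefix are assumptions); note that Gluon's `name_scope()` itself takes no argument, and a custom prefix is passed to the Block constructor instead:

```python
from mxnet.gluon import Block, nn, rnn

class RestOfNetwork(Block):
    def __init__(self, hidden_size, num_classes, **kwargs):
        super(RestOfNetwork, self).__init__(**kwargs)
        with self.name_scope():  # name_scope() takes no argument
            self.encoder = rnn.LSTM(hidden_size, bidirectional=True, layout='NTC')
            self.output = nn.Dense(num_classes)

    def forward(self, x):
        return self.output(self.encoder(x).mean(axis=1))

# In the parent model's __init__ (inside its own name_scope):
#     self.embedding = nn.Embedding(vocab_size, embed_size)
#     self.rest = RestOfNetwork(hidden_size, num_classes, prefix='rest_')
# model.rest.collect_params() then selects exactly the non-embedding parameters.
```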
-
Since collect_params() returns a ParameterDict, which is a dictionary, you can add elements to it. For example: `d = net.layer1.collect_params()` followed by `d.update(net.layer2.collect_params())`. As long as you don't add the embedding layer's parameters to the dictionary passed to the trainer, it won't update them. I'd still recommend setting grad_req to 'null' for the layers you don't intend to update, so that the gradient calculation can be skipped.
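A minimal sketch of this approach end to end, assuming the hypothetical BiLSTMClassificationModel sketched earlier in the thread:

```python
import numpy as np
import mxnet as mx
from mxnet import autograd, gluon

# Build and initialize the (hypothetical) model sketched above.
model = BiLSTMClassificationModel(vocab_size=10000, embed_size=100,
                                  hidden_size=128, num_classes=2)
model.initialize(mx.init.Xavier())

# Collect only the parameters the trainer should update.
params = model.encoder.collect_params()
params.update(model.output.collect_params())

# The embedding parameters are not in `params`, so the trainer never updates them.
trainer = gluon.Trainer(params, 'adam', {'learning_rate': 1e-3})

# Also skip gradient computation for the frozen embedding.
model.embedding.collect_params().setattr('grad_req', 'null')

# One illustrative training step on random data.
data = mx.nd.array(np.random.randint(0, 10000, size=(8, 20)))  # (batch, seq)
label = mx.nd.array(np.random.randint(0, 2, size=(8,)))
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
with autograd.record():
    loss = loss_fn(model(data), label)
loss.backward()
trainer.step(batch_size=8)
```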
-
Closing now. Let me know if you have more questions along these lines.