
Integrate bolts + torch hub #442

Open
edenlightning opened this issue Dec 10, 2020 · 19 comments

@edenlightning
Contributor

No description provided.

@Borda
Member

Borda commented Dec 23, 2020

Well, we can register the Bolts models for torch.hub, but for producing pre-trained weights we still need some heavy GPU machines...
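
For reference, registering models for torch.hub is done through a hubconf.py at the repository root. A rough, hypothetical sketch of what that could look like for Bolts (the entry-point name, the VAE choice, and the weights URL are made up for illustration; Bolts does not ship such a file today):

# hubconf.py -- hypothetical sketch, not an existing file in lightning-bolts
import torch

dependencies = ["torch", "pytorch_lightning", "pl_bolts"]

def vae_cifar10(pretrained=False, **kwargs):
    """Hypothetical torch.hub entry point exposing a Bolts model."""
    from pl_bolts.models.autoencoders import VAE

    model = VAE(input_height=32, **kwargs)
    if pretrained:
        # Placeholder URL; real weights would need the heavy GPU training mentioned above.
        state_dict = torch.hub.load_state_dict_from_url("https://example.com/vae_cifar10.pt")
        model.load_state_dict(state_dict)
    return model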

@Borda Borda added the enhancement New feature or request label Dec 23, 2020
@oke-aditya
Contributor

oke-aditya commented Jan 7, 2021

Can we do the vice versa too?

  1. Load a model from torch.hub.
  2. Train / fine-tune with PyTorch Lightning.

I would be highly interested in implementing such a feature.
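
A rough sketch of that direction, assuming a torchvision ResNet from torch.hub and a made-up LightningModule wrapper (HubFinetuner and the hyper-parameters are illustrative, not an existing Bolts API):

import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl

class HubFinetuner(pl.LightningModule):
    """Hypothetical wrapper: load a torch.hub model, then fine-tune it with Lightning."""

    def __init__(self, repo="pytorch/vision:v0.10.0", name="resnet18", num_classes=10, lr=1e-3):
        super().__init__()
        # 1. Load a pre-trained model from torch.hub.
        self.model = torch.hub.load(repo, name, pretrained=True)
        # Replace the classification head for the new task.
        self.model.fc = nn.Linear(self.model.fc.in_features, num_classes)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.model(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

# 2. Fine-tune with the Lightning Trainer (dataloaders omitted):
# trainer = pl.Trainer(max_epochs=5)
# trainer.fit(HubFinetuner(), train_dataloaders=...)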

@stale stale bot added the won't fix This will not be worked on label Mar 8, 2021
@Borda Borda added this to the v0.4 milestone Mar 8, 2021
@Lightning-Universe Lightning-Universe deleted a comment from stale bot Mar 8, 2021
@Borda Borda removed the won't fix This will not be worked on label Mar 8, 2021
@Programmer-RD-AI
Contributor

hi,
I would like to help with this issue.

With best regards,
Ranuga

@Borda
Member

Borda commented Oct 24, 2021

hi, I would like to help with this issue.

Great! Let's sync up also with the Bolts refactoring =)

@oke-aditya
Contributor

oke-aditya commented Oct 24, 2021

Just for information, there is currently a refactor of torchvision.models going on, available in the prototype folder.

So the API with the hub might change.

Edit:
Also a small note, torchvision detection models do not work with Hub.

Let me know if I can help.

P.S. A book on PyTorch Lightning will be out at the end of this year!

@Programmer-RD-AI
Contributor

I will start working on this.

:)

With best regards,
Ranuga

@Programmer-RD-AI
Contributor

Hi,
I want to know: what is the issue with using a torch.hub model in PyTorch Lightning and fine-tuning it?

With best regards,
Ranuga

@oke-aditya
Contributor

Torch hub allows you to load the model, but you need to do model surgery to specify the number of classes, etc.

I have an example for DETR:

https://github.com/oke-aditya/quickvision/blob/master/quickvision/models/detection/detr/model_factory.py

We can load the DETR backbone, but we need to adjust the head classifier for our own number of classes.
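
A hedged sketch of the kind of surgery meant here, loosely following the quickvision factory linked above (the class_embed attribute name comes from the facebookresearch/detr reference implementation; num_classes is assumed for illustration):

import torch
import torch.nn as nn

# Load DETR with a ResNet-50 backbone from torch.hub.
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)

num_classes = 5  # your own number of classes
hidden_dim = model.class_embed.in_features
# Replace the classification head; DETR keeps one extra slot for the "no object" class.
model.class_embed = nn.Linear(hidden_dim, num_classes + 1)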

Similarly for CNNs, one needs to load the backbone and modify the head classifier for a custom num_classes. You also need to freeze / unfreeze layers during transfer learning and fine-tuning.

We can think about this a little bit more; this is something Flash does well, I think.

cc @Borda @kaushikb11 @Programmer-RD-AI @akihironitta

@Programmer-RD-AI
Contributor

ok, thank you @oke-aditya
I will try to fix the issue.

@oke-aditya
Contributor

Since a single PR will not be a solution,
I would suggest proposing a brief prototype (probably a branch here or a new repo) and letting the maintainers have a look.
I would also suggest checking over Slack / with Borda whether this is part of the PL plans for moving ahead with Bolts.

@Programmer-RD-AI
Contributor

ok thank you @oke-aditya

@Programmer-RD-AI
Contributor

Programmer-RD-AI commented Oct 26, 2021

Hi,
I am currently building a demo of this, and my question is whether I can do:

from torch.nn import Linear
from torchvision.models import googlenet

model = googlenet().to(device)  # device and classes are assumed to be defined elsewhere
print(model)  # prints the model architecture
model.fc = Linear(model.fc.in_features, len(classes))

and then use the model as usual.
I am just a bit confused, that's why.
Thank you.

@oke-aditya
Contributor

Yes, you can, and this is the correct way. But note that the fc attribute applies to GoogLeNet and ResNet; for models like MobileNet it is called classifier or something else (please check). For CNNs it is simple to just modify the last layer to support a different number of classes.
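
For example (a small sketch; num_classes is assumed, and the exact attribute name is worth double-checking per architecture):

import torch.nn as nn
from torchvision.models import googlenet, mobilenet_v2

num_classes = 10  # assumed for illustration

# GoogLeNet / ResNet expose the classification head as `fc`:
model = googlenet(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)

# MobileNetV2 exposes it as `classifier` (Dropout + Linear), so replace its last element:
model = mobilenet_v2(pretrained=True)
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, num_classes)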

@Programmer-RD-AI
Contributor

Programmer-RD-AI commented Oct 26, 2021

Hi,
I usually add, in __init__:

self.output = Linear(1000, len(classes))

and in forward:

preds = self.tl_model(X)
preds = self.output(preds)

I don't know if this is the best way, but when I am testing TL models this is what I use.

@oke-aditya
Contributor

Hi!
I think you are adding an additional Linear layer on top of the existing fully connected layer. This is not the best way to do transfer learning; it works fine in practice, but you end up with an extra fully connected layer, which means roughly 1000 * num_classes additional parameters (your previous fc layer is Linear(x, 1000) and you are stacking Linear(1000, num_classes) on top of it).

The better way is to edit the existing layer, replacing it with Linear(in_features, num_classes). This does not increase the number of parameters drastically.
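
A quick sketch of the head-size difference, assuming a ResNet-50-style backbone with 2048 features and num_classes = 10 for illustration:

import torch.nn as nn

num_classes = 10  # assumed for illustration; 2048 is the ResNet-50 fc input size

# Stacking keeps the original 1000-way fc and adds another layer on top of it:
stacked = nn.Sequential(nn.Linear(2048, 1000), nn.Linear(1000, num_classes))

# Replacing swaps the original fc for a single small layer:
replaced = nn.Linear(2048, num_classes)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(stacked), count(replaced))  # ~2.06M vs ~20.5K parameters in the head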

Thanks for asking

@Programmer-RD-AI
Contributor

Programmer-RD-AI commented Oct 26, 2021

Hi,
Sorry for asking so many questions, but I am confused, that's why.

For freezing layers:

model = googlenet()
model.some_fc.requires_grad_(False)  # freezes that layer's parameters

and for fine-tuning:

model = googlenet()
# from: model.some_fc = Linear(512, 985)
# to:   model.some_fc = Sequential(Linear(512, 1024), Linear(1024, 985))

So what are the features I need to create?

I am sorry for asking so many questions.

Thank you.

@oke-aditya
Contributor

oke-aditya commented Oct 26, 2021

OK, so let me elaborate a bit more.

Let me explain the transfer learning scenarios. These examples are written for CNNs, but they more or less generalize to other models too. Note that when we are doing transfer learning, it means we are using the pre-trained weights, hence pretrained=True in all cases.

The first two scenarios are well described in the Transfer Learning Tutorial (a great one by @chsasank, one of the best in this field!).

  1. Simply re-training the model with pretrained=True.

This is the simplest approach; we aren't freezing the backbone. Refer here in the tutorial.

import torch.nn as nn
from torchvision.models import resnet50

model = resnet50(pretrained=True)
in_features = model.fc.in_features
model.fc = nn.Linear(in_features, num_classes)

Simply train the model. We train each and every parameter, with the only difference being that the head outputs num_classes instead of 1000.
It is a naive approach; it works fine and can give you decent results. It will take a lot of time though (you are training a whole model anyway, and have a lot of parameters to train).

  2. Training only the head (using the backbone as a fixed feature extractor).

Refer here in the tutorial

This is what you tried above. Here we are interested in only training the classification head of the network. We freeze the backbone of the model.

import torch.nn as nn
from torchvision.models import resnet50

model = resnet50(pretrained=True)

# Freeze all the parameters.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze the head by replacing it; the new layer's parameters
# have requires_grad=True by default and output num_classes.
in_features = model.fc.in_features
model.fc = nn.Linear(in_features, num_classes)

# You may prefer to add an extra fully connected layer, but that isn't needed in most cases.
# Left to you; many don't prefer it, as it can cause a large increase in parameters.
# It works well if you have BERT / millions of params in the backbone, where adding a few
# hundred params in the head of the model won't make a big difference.
# Basically: number of params with pre-trained weights >>> number of fully connected params.

# Adding an extra fc layer to the head:
in_features = model.fc.in_features
model.fc = nn.Sequential(
    nn.Linear(in_features, hidden_params),
    # Many prefer dropout in between to avoid over-fitting:
    # nn.Dropout(0.2),
    nn.Linear(hidden_params, num_classes),
)
  3. Unfreezing layers / blocks one by one.

This is where fine-tuning comes into play; we really want to make the most of every block of the network.

You can first freeze the backbone and train the head as in Strategy 2.

This can be trained for a few epochs with a decent learning rate of 1e-3.

Here is the second training routine.

Now you want to unfreeze specific blocks, say the last few conv layers (or the last residual block) in ResNet.
You would unfreeze only them.
Continue training with a slightly lower lr of 1e-4, and for more epochs.

You may unfreeze more blocks, or probably stop here; it is very much left to you.
Note that after unfreezing a block you keep training progressively (you don't re-freeze the Linear layers when you unfreeze the conv blocks).
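
A minimal sketch of this progressive schedule, assuming a torchvision ResNet-50 where layer4 is the last residual block (the optimizer choice, num_classes, and epoch counts are illustrative):

import torch
import torch.nn as nn
from torchvision.models import resnet50

num_classes = 10  # assumed for illustration

model = resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Stage 1: freeze everything except the new head, train with lr=1e-3 for a few epochs.
for param in model.parameters():
    param.requires_grad = False
for param in model.fc.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)
# ... train for a few epochs ...

# Stage 2: also unfreeze the last residual block and continue with a lower lr for more epochs.
for param in model.layer4.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-4)
# ... keep training; you may unfreeze layer3, layer2, ... later in the same way ...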

I don't know if there is any other way of doing transfer learning (I haven't seen any other approach); these work well in practice.

P.S.

First of all, my appreciation to you! You are a very young developer (I guess 14), and I'm super excited that you know so much at such a tender age! At your age I was probably more interested in knowing how to install an anti-virus and knew nothing about coding (forget a GitHub account, I didn't even know the word GitHub).
Have a great and bright future ahead! I wish you success.

@Programmer-RD-AI
Contributor

Programmer-RD-AI commented Oct 27, 2021

OK, thank you, I can understand the issue now.
I will start working on it.

Thank you very much @oke-aditya

@Programmer-RD-AI
Contributor

hi,

Again, I am really sorry for asking so many questions, but I am not understanding this correctly.

So what I need to implement is:

  1. Simply re-training the model with pretrained=True.
  2. Training only the head (backbone as a fixed feature extractor).
  3. Unfreezing layers/blocks one by one.

I need to implement the above features in Lightning Bolts in an easy-to-use way.

Is my understanding correct?
I am so sorry for asking so many questions.

If not, what are the specific things I need to work on or implement?

With best regards,
Ranuga

@Borda Borda modified the milestones: v0.4, v0.5 Nov 26, 2021
@Borda Borda pinned this issue Mar 9, 2022