Deal with overfitting #34
Nice experimenting and troubleshooting! It looks to me that the learning rate was the major culprit for overfitting at this point. 1e-4 is a good value, very frequently used (and we might add lr decay as well later on). 3 layers for the transformer encoder and decoder also makes sense for now. The dropout should probably be increased since we observe overfitting; 0.3 up to 0.5 are potentially good values to try.

At this stage it would be good to organize how we track multiple training experiments. TensorBoard is a good option, we just need to also save hyperparameter values that will help us filter experiments. Could you share the one you are currently using?

I think at this point it would be worth looking at converting all tunable variables of the network into hyperparameters, for example the number of transformer layers, number of attention heads, etc. Those could either be arguments to the training script or an experiment configuration file, as in the sketch below. New issue: #35

Good work, very encouraging initial results!
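A minimal sketch of what this could look like, assuming PyTorch and its bundled `torch.utils.tensorboard` writer; the script structure, argument names, and the stubbed-in validation loss are illustrative assumptions, not existing project code:

```python
# hypothetical train.py sketch: expose tunable values as CLI arguments
# and log them to TensorBoard alongside the metrics they produced
import argparse

from torch.utils.tensorboard import SummaryWriter


def parse_args():
    parser = argparse.ArgumentParser(description="training hyperparameters")
    parser.add_argument("--learning-rate", type=float, default=1e-4)
    parser.add_argument("--dropout", type=float, default=0.3)
    parser.add_argument("--num-encoder-layers", type=int, default=3)
    parser.add_argument("--num-decoder-layers", type=int, default=3)
    parser.add_argument("--num-attention-heads", type=int, default=8)
    return parser.parse_args()


def main():
    args = parse_args()

    # one log subdirectory per experiment keeps runs separate in the dashboard
    run_name = f"lr{args.learning_rate}_drop{args.dropout}_enc{args.num_encoder_layers}"
    writer = SummaryWriter(log_dir=f"runs/{run_name}")

    # ... build the model and dataloaders from args, run the training loop,
    # calling writer.add_scalar("loss/validation", loss, epoch) each epoch ...
    final_val_loss = 0.0  # placeholder for the real final validation loss

    # add_hparams() stores the hyperparameter values next to the final
    # metrics, so runs can be filtered and compared in the HPARAMS tab
    writer.add_hparams(
        {
            "learning_rate": args.learning_rate,
            "dropout": args.dropout,
            "num_encoder_layers": args.num_encoder_layers,
            "num_decoder_layers": args.num_decoder_layers,
            "num_attention_heads": args.num_attention_heads,
        },
        {"hparam/val_loss": final_val_loss},
    )
    writer.close()


if __name__ == "__main__":
    main()
```

The same argparse namespace could later be serialized to an experiment configuration file (YAML or JSON) instead of passing everything on the command line; either way the full set of tunable values ends up recorded with each run.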
Almost forgot:
This is correct, the training set is relatively small at this point. We should get better results if we add more chromosomes to the dataset. New issue: #36
I meant the URL to the TensorBoard dashboard, if you are using the official public one (https://tensorboard.dev/) and uploading the logs there. Or are you running TensorBoard locally?
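If it is the latter, a local dashboard is usually started with the `tensorboard --logdir runs` CLI command; as a sketch, the same thing can be done from Python (assuming the logs live in a directory named `runs`):

```python
# launch TensorBoard programmatically; equivalent to running
# `tensorboard --logdir runs` in a terminal
from tensorboard import program

tb = program.TensorBoard()
tb.configure(argv=[None, "--logdir", "runs"])
url = tb.launch()  # returns e.g. "http://localhost:6006/"
print(f"TensorBoard listening on {url}")
```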
Hi @williamstark01, I tested two different hyperparameter configurations to deal with the overfitting. The orange run is the first time I trained our model: learning rate 0.001, 6 transformer encoder-decoder layers. When I found our model was overfitting, I found a related issue in DETR suggesting that small datasets can lead to this problem, so I changed the number of layers from 6 to 3. The blue run uses a learning rate of 0.001 with 3 transformer encoder-decoder layers. For the red run, I changed the dropout from 0.2 to 0.1 and also lowered the learning rate to 0.0001.
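For reference, a minimal sketch of how the red run's settings would map onto a plain PyTorch `nn.Transformer`; the embedding size and attention head count are illustrative assumptions, since they are not stated in this thread:

```python
import torch
from torch import nn

# the red run's settings from the experiments above
model = nn.Transformer(
    d_model=256,           # assumed embedding size, not stated in the thread
    nhead=8,               # assumed number of attention heads
    num_encoder_layers=3,  # reduced from 6 to 3 for the small dataset
    num_decoder_layers=3,
    dropout=0.1,           # reduced from 0.2
)

# learning rate lowered from 1e-3 to 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```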


The related issue link: facebookresearch/detr#342
I think we can add more chromosome data to train our model; the COCO dataset has up to 330k images, while we only have 8.9k samples for training. Is my understanding correct?