How retrain with the best cell in custom dataset? #97

Open
NdaAzr opened this issue May 5, 2020 · 8 comments

@NdaAzr

NdaAzr commented May 5, 2020

I am wondering how I can re-train the best cell and best architecture on my custom dataset.

After running train_autodeeplab.py, which searches for the best cell and architecture, I used train.py to re-train the model. I could not find any argument in config_utils.re_train_autodeeplab for passing in the new cell and new architecture. However, in retrain_model.new_model I can see two functions, get_default_cell and get_default_arch.

Could you please explain how to re-train on a custom dataset?

@Jayant1234

Jayant1234 commented May 13, 2020

We have to pass the paths of genotype.npy, network_path.npy and network_path_space.npy as the cel_arch, net_arch and network_path arguments in config_utils.re_train_autodeeplab. The network_path argument does not exist yet, so you will need to add it in build_autodeeplab.py. As a result, if you just run python train.py, the default architecture is used instead of the best searched one. @NoamRosenberg Correct me if I am wrong.
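
A minimal sketch of the wiring described above, assuming the three .npy files are passed as command-line paths and that an unset path means falling back to a built-in default. The flag spellings, the `load_or_default` helper and the placeholder default array are hypothetical and not taken from the repository; the real argument names should be checked in config_utils.re_train_autodeeplab.

```python
import argparse

import numpy as np


def build_parser():
    # Hypothetical flags: check the real spellings in the retrain config.
    parser = argparse.ArgumentParser()
    parser.add_argument("--net_arch", default=None, type=str)
    parser.add_argument("--cell_arch", default=None, type=str)
    parser.add_argument("--network_path", default=None, type=str)
    return parser


def load_or_default(path, default):
    """Return the array stored at `path`, or `default` when no path is given."""
    if path is None:
        return default
    return np.load(path)


# With no flags, every path is None, so the default is used. This mirrors
# the behaviour described above for a plain `python train.py`.
args = build_parser().parse_args([])
placeholder_cell = np.zeros((10, 2), dtype=np.int64)  # not the repo's default
cell = load_or_default(args.cell_arch, default=placeholder_cell)
print(cell.shape)
```

Passing an explicit list to `parse_args` keeps the example independent of the real command line; in a training script it would be called without arguments.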

@NdaAzr
Author

NdaAzr commented May 17, 2020

@Jayant1234 Thanks a lot. Did you try running the search yourself? I am wondering how long it took. I set the epoch parameters as in the paper (epoch = 40 and alpha epoch = 20). Once training reaches the alpha epoch stage, each epoch takes about 37 hours, which means the next 20 epochs would take more than 600 hours, whereas the paper reports 3 GPU days. My images are 2D grayscale medical images of size 256 * 256. Could you please let me know your timing if you have done the search stage?

@Jayant1234

I have done the search. Apart from a minor problem with the decoder input size, which I needed to increase by a factor of 2, the code works perfectly. I used epoch = 60 and alpha epoch = 20. I managed to search for the best model in less than 36 hours, but my dataset is really small. I used a Tesla P100 with 50 GB of RAM. Is it taking you 37 hours to run a single alpha epoch? What are your dataset size, image size, GPU and RAM size? Are you using smaller images for the search process, i.e. 320x320? Try smaller images and see if that works.

@NdaAzr
Author

NdaAzr commented May 17, 2020

My dataset has 595 images. I cannot go smaller than 256 * 256. Yes, each epoch after epoch 20 takes about 36 hours, while epochs 1 to 19 were very quick, about 15 minutes each. I am using a Titan V with 12 GB of memory.
Did you run it on multiple GPUs? Have you tried this repository with multiple GPUs?

One more question: how did you save genotype.npy, network_path.npy and network_path_space.npy (for cel_arch, net_arch and network_path) as npy files? How can I extract the best cell and network from checkpoint.pth and model_best.pth and save them as npy files?

Thank you in advance,
Neda

@Jayant1234

You have to run the 'decode' script as shown below; see the Architecture Search part of the README. It will produce all the .npy files.
CUDA_VISIBLE_DEVICES=0 python decode_autodeeplab.py --dataset cityscapes --resume /AutoDeeplabpath/checkpoint.pth.tar
After saving, I just used np.load.

I just ran on one GPU. There seems to be some problem with your code. Check what's happening in that one epoch.
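
One way to check where the time goes in a slow epoch is to time each phase of a training step separately. The sketch below uses only the standard library; the phase names and the `time.sleep` calls standing in for real work are placeholders, not code from this repository. On a GPU, kernels run asynchronously, so a synchronization call (for example `torch.cuda.synchronize()` in PyTorch) would be needed before reading the clock for the numbers to be meaningful.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

totals = defaultdict(float)  # accumulated wall-clock seconds per phase


@contextmanager
def timed(phase):
    """Add the wall-clock time spent inside the block to totals[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[phase] += time.perf_counter() - start


# Placeholder loop: replace the sleeps with the real steps of one batch.
for _ in range(3):
    with timed("weight_step"):
        time.sleep(0.01)
    with timed("alpha_step"):
        time.sleep(0.02)

# Report phases from slowest to fastest.
for phase, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{phase}: {seconds:.3f} s")
```

If one phase dominates only after epoch 20, that points at whatever starts running at the alpha epoch stage rather than at data loading or the ordinary weight update.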

@NdaAzr
Author

NdaAzr commented May 17, 2020

@Jayant1234 Thanks a lot. Sure, I am going to check what is happening in that epoch. Thanks again.

@NdaAzr
Author

NdaAzr commented May 18, 2020

@Jayant1234, I am wondering what each number means in the cell structure output, and how we can print the genotype directly after the search stage instead of running decode.
For example, here is the output for the new cell structure (genotype); what does each number mean?
([[ 1, 5], [ 0, 4], [ 2, 4], [ 3, 0], [ 5, 4], [ 7, 4], [11, 7], [12, 4], [17, 4], [18, 2]], dtype=int64)

I need to produce something similar to this figure from the Auto-DeepLab paper.
[image]

I would greatly appreciate any help.
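
A possible reading of the genotype array above, offered as an assumption rather than something confirmed in this thread: in DARTS-style cell encodings, each row is [edge index, operation index], and the edges feeding intermediate node i occupy a contiguous block of 2 + i slots (the two cell inputs plus the earlier intermediate nodes). The example is consistent with that layout, since it has exactly two rows in each of the ranges 0-1, 2-4, 5-8, 9-13 and 14-19. The sketch below decodes rows under that assumption; `decode_genotype` and its `op_names` parameter are hypothetical, and the actual operation names would have to come from the repository's own list of primitives.

```python
import numpy as np


def decode_genotype(genotype, steps=5, op_names=None):
    """Map each [edge, op] row to (node, input, op) under a DARTS-style layout.

    Assumption: node i (0-based) owns 2 + i consecutive edge slots, where
    inputs 0 and 1 are the two cell inputs and input 2 + j is node j.
    """
    # First edge slot of every node: 0, 2, 5, 9, 14 for steps=5.
    offsets = np.concatenate(([0], np.cumsum([2 + i for i in range(steps)])))
    decoded = []
    for edge, op in np.asarray(genotype).tolist():
        # Index of the last offset that is <= edge gives the owning node.
        node = int(np.searchsorted(offsets, edge, side="right")) - 1
        source = edge - int(offsets[node])
        name = op_names[op] if op_names is not None else f"op_{op}"
        decoded.append((node, source, name))
    return decoded


genotype = np.array([[1, 5], [0, 4], [2, 4], [3, 0], [5, 4],
                     [7, 4], [11, 7], [12, 4], [17, 4], [18, 2]])
for node, source, name in decode_genotype(genotype):
    print(f"node {node} <- input {source} via {name}")
```

The resulting (node, input, operation) triples are the edge list of the cell graph, so they could be passed to any graph-drawing tool to produce a figure like the one in the paper.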

@NdaAzr
Author

NdaAzr commented Jan 19, 2021

@Jayant1234 I am wondering how I can visualise the best cell and best network found by the search. After decoding, it only prints numbers such as ([[ 1, 5], [ 0, 4], [ 2, 4], [ 3, 0], [ 5, 4], [ 7, 4], [11, 7], [12, 4], [17, 4], [18, 2]]).
I would appreciate any advice. Many thanks.
