How to train a model with huge classes #13670
Replies: 5 comments
-
Hey, this is the MXNet Label Bot.
-
This also seems like a very good topic for discuss.mxnet.io. You might catch the interest of a wider audience there. I think what you're looking for is the […]
-
@mxnet-label-bot add [question]
-
Thanks, that helps a lot. Putting the huge layer on the CPU would be OK, but training speed may not be as fast as on a GPU. Is there a way to split the single huge layer across GPUs?
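Splitting one huge output layer by class across devices is usually called model parallelism. As a rough sketch of the math involved (plain NumPy, hypothetical helper name; each shard stands in for one GPU's slice of the weight matrix, and only per-sample scalars need to cross device boundaries):

```python
import numpy as np

def sharded_softmax_cross_entropy(features, weight_shards, label):
    """Cross-entropy over a huge softmax whose weight matrix is split
    column-wise (by class) into shards, one per "device".

    features:      (batch, dim) embeddings
    weight_shards: list of (dim, classes_i) arrays, one per shard
    label:         (batch,) global class indices
    """
    # 1) each shard computes logits only for its own slice of classes
    logit_shards = [features @ w for w in weight_shards]          # (batch, classes_i)

    # 2) combine via a numerically stable global log-sum-exp:
    #    only one max and one sum per sample leave each shard
    local_max = np.stack([l.max(axis=1) for l in logit_shards])   # (shards, batch)
    g_max = local_max.max(axis=0)                                 # (batch,)
    local_sum = np.stack([np.exp(l - g_max[:, None]).sum(axis=1)
                          for l in logit_shards])                 # (shards, batch)
    log_z = g_max + np.log(local_sum.sum(axis=0))                 # (batch,)

    # 3) fetch the true-class logit from the shard that owns that class
    offsets = np.cumsum([0] + [w.shape[1] for w in weight_shards])
    true_logit = np.empty(len(label))
    for i, lab in enumerate(label):
        s = np.searchsorted(offsets, lab, side="right") - 1
        true_logit[i] = logit_shards[s][i, lab - offsets[s]]

    # softmax cross-entropy: log Z minus the true-class logit
    return (log_z - true_logit).mean()
```

This returns exactly the same loss as an unsharded softmax; the point is that no single device ever materializes the full (batch, num_classes) logit matrix or the full weight matrix.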
-
I think that you can do this using the […]. I know that there are a few approaches to approximating softmax for speedup. It might be worth exploring whether any of these reduce the memory footprint as well.
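One family of such approximations is sampled softmax: score only the true classes plus a small set of randomly drawn negatives, so memory scales with the sample count rather than the full class count. A minimal NumPy sketch (hypothetical function name; this simplified version omits the sampling-probability correction that production implementations apply):

```python
import numpy as np

def sampled_softmax_loss(features, weight, label, num_sampled, rng):
    """Approximate softmax cross-entropy by scoring only the true classes
    plus `num_sampled` randomly drawn negative classes, shared per batch.
    Note: omits the log-probability correction for the sampler."""
    num_classes = weight.shape[1]
    neg = rng.choice(num_classes, size=num_sampled, replace=False)
    cols = np.concatenate([label, neg])          # classes actually scored
    sub_w = weight[:, cols]                      # (dim, batch + num_sampled)
    logits = features @ sub_w                    # small logit matrix

    # column i holds the true-class logit for sample i (labels come first)
    true_logit = logits[np.arange(len(label)), np.arange(len(label))]

    # stable log-sum-exp over the sampled subset only
    m = logits.max(axis=1)
    log_z = m + np.log(np.exp(logits - m[:, None]).sum(axis=1))
    return (log_z - true_logit).mean()
```

With millions of classes, scoring a few thousand sampled columns instead of the full matrix cuts both the logit memory and the backward pass cost dramatically, at the price of a biased gradient if the correction term is skipped.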
-
For example, I am training a face recognition model with millions of IDs. Besides using triplet loss, I would like to use softmax-based losses such as ArcFace loss, AM-Softmax, and so on. However, with such a huge number of classes, GPU memory will be insufficient. Is there a way to train a model like this? Maybe splitting the softmax layer across multiple GPUs would work; I wonder whether MXNet supports this. Or can I put this part on the CPU, and the other layers on GPUs?
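To see why the classifier alone exhausts a single GPU, it helps to do the memory arithmetic: weights, gradients, and optimizer state for the final fully-connected layer all scale with num_classes × embedding_dim. A back-of-the-envelope estimate (hypothetical helper; assumes float32 and two extra per-parameter copies, as Adam keeps):

```python
def classifier_memory_gb(num_classes, emb_dim, bytes_per_param=4,
                         optimizer_copies=2):
    """Rough memory for the final fully-connected layer alone:
    weights + gradients + optimizer state (e.g. 2 copies for Adam)."""
    params = num_classes * emb_dim
    total_bytes = params * bytes_per_param * (1 + 1 + optimizer_copies)
    return total_bytes / 1024**3

# e.g. 2 million identities with 512-d embeddings:
# weights alone are 2e6 * 512 * 4 B ~ 3.8 GiB, and roughly 15 GiB
# once gradients and Adam state are included -- more than many
# single GPUs can spare alongside the backbone and activations.
```

This is why class-sharded (model-parallel) softmax, CPU placement of the classifier, or an approximate softmax is typically needed at this scale.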