How to train a model with huge classes #13670
Replies: 5 comments
-
Hey, this is the MXNet Label Bot.
-
This also seems like a very good topic for discuss.mxnet.io. You might catch the interest of a wider audience there. I think what you're looking for is the […]
-
@mxnet-label-bot add [question]
-
Thanks, that helps a lot. Putting the huge layer on the CPU would be OK, but training speed may not be as fast as on a GPU. Is there a way to split the single huge layer across GPUs?
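Splitting one huge output layer by class across devices is usually called model parallelism. As a rough sketch of the math involved (plain NumPy, hypothetical helper name; each shard stands in for one GPU's slice of the weight matrix, and only per-sample scalars need to cross device boundaries):

```python
import numpy as np

def sharded_softmax_cross_entropy(features, weight_shards, label):
    """Cross-entropy over a huge softmax whose weight matrix is split
    column-wise (by class) into shards, one per "device".

    features:      (batch, dim) embeddings
    weight_shards: list of (dim, classes_i) arrays, one per shard
    label:         (batch,) global class indices
    """
    # 1) each shard computes logits only for its own slice of classes
    logit_shards = [features @ w for w in weight_shards]          # (batch, classes_i)

    # 2) combine via a numerically stable global log-sum-exp:
    #    only one max and one sum per sample leave each shard
    local_max = np.stack([l.max(axis=1) for l in logit_shards])   # (shards, batch)
    g_max = local_max.max(axis=0)                                 # (batch,)
    local_sum = np.stack([np.exp(l - g_max[:, None]).sum(axis=1)
                          for l in logit_shards])                 # (shards, batch)
    log_z = g_max + np.log(local_sum.sum(axis=0))                 # (batch,)

    # 3) fetch the true-class logit from the shard that owns that class
    offsets = np.cumsum([0] + [w.shape[1] for w in weight_shards])
    true_logit = np.empty(len(label))
    for i, lab in enumerate(label):
        s = np.searchsorted(offsets, lab, side="right") - 1
        true_logit[i] = logit_shards[s][i, lab - offsets[s]]

    # softmax cross-entropy: log Z minus the true-class logit
    return (log_z - true_logit).mean()
```

This returns exactly the same loss as an unsharded softmax; the point is that no single device ever materializes the full (batch, num_classes) logit matrix or the full weight matrix.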
-
I think that you can do this using the […]. I know that there are a few approaches to approximating softmax for speedup. It might be worth exploring whether any of these reduce the memory footprint as well.
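One family of such approximations is sampled softmax: score only the true classes plus a small set of randomly drawn negatives, so memory scales with the sample count rather than the full class count. A minimal NumPy sketch (hypothetical function name; this simplified version omits the sampling-probability correction that production implementations apply):

```python
import numpy as np

def sampled_softmax_loss(features, weight, label, num_sampled, rng):
    """Approximate softmax cross-entropy by scoring only the true classes
    plus `num_sampled` randomly drawn negative classes, shared per batch.
    Note: omits the log-probability correction for the sampler."""
    num_classes = weight.shape[1]
    neg = rng.choice(num_classes, size=num_sampled, replace=False)
    cols = np.concatenate([label, neg])          # classes actually scored
    sub_w = weight[:, cols]                      # (dim, batch + num_sampled)
    logits = features @ sub_w                    # small logit matrix

    # column i holds the true-class logit for sample i (labels come first)
    true_logit = logits[np.arange(len(label)), np.arange(len(label))]

    # stable log-sum-exp over the sampled subset only
    m = logits.max(axis=1)
    log_z = m + np.log(np.exp(logits - m[:, None]).sum(axis=1))
    return (log_z - true_logit).mean()
```

With millions of classes, scoring a few thousand sampled columns instead of the full matrix cuts both the logit memory and the backward pass cost dramatically, at the price of a biased gradient if the correction term is skipped.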
-
For example, I am training a face recognition model with millions of IDs. Besides using triplet loss, I would like to use softmax-based losses such as ArcFace loss, AM-Softmax, and so on. However, with such a huge number of classes, GPU memory will be insufficient. Is there a way to train a model like this? Maybe splitting the softmax layer across multiple GPUs would work; I wonder whether MXNet supports this. Or can I put this part on the CPU, and the other layers on GPUs?
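To see why the classifier alone exhausts a single GPU, it helps to do the memory arithmetic: weights, gradients, and optimizer state for the final fully-connected layer all scale with num_classes × embedding_dim. A back-of-the-envelope estimate (hypothetical helper; assumes float32 and two extra per-parameter copies, as Adam keeps):

```python
def classifier_memory_gb(num_classes, emb_dim, bytes_per_param=4,
                         optimizer_copies=2):
    """Rough memory for the final fully-connected layer alone:
    weights + gradients + optimizer state (e.g. 2 copies for Adam)."""
    params = num_classes * emb_dim
    total_bytes = params * bytes_per_param * (1 + 1 + optimizer_copies)
    return total_bytes / 1024**3

# e.g. 2 million identities with 512-d embeddings:
# weights alone are 2e6 * 512 * 4 B ~ 3.8 GiB, and roughly 15 GiB
# once gradients and Adam state are included -- more than many
# single GPUs can spare alongside the backbone and activations.
```

This is why class-sharded (model-parallel) softmax, CPU placement of the classifier, or an approximate softmax is typically needed at this scale.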