DenseNet with shared memory #8683
-
That was a typo in the paper. Were you looking for this? https://github.com/taineleau/densenet.mxnet
-
No, I believe that is the original DenseNet implementation. Also, the fact that there is a Concat operation in the code (line 66 in densenet.py), `data = mx.symbol.Concat(data, Block, name=name + '_concat%d' % (i + 1))`, indicates that it does not use shared memory on the GPU.
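For context, the pattern that line sits in looks roughly like the following (a paraphrase for illustration, not the repository's exact code; the identifiers are made up). Each iteration feeds the concatenated result back in, so every Concat materializes a fresh, wider feature map on the device:
```python
import mxnet as mx

# Illustrative paraphrase of a naive dense block. Every Concat allocates a
# new, wider output buffer, which is where the memory growth comes from.
def dense_block(data, num_layers, growth_rate, name):
    for i in range(num_layers):
        block = mx.sym.Convolution(data=data, kernel=(3, 3), pad=(1, 1),
                                   num_filter=growth_rate)
        block = mx.sym.Activation(data=block, act_type='relu')
        data = mx.symbol.Concat(data, block, name=name + '_concat%d' % (i + 1))
    return data
```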
-
@apache/mxnet-committers: This issue has been inactive for the past 90 days. It has no label and needs triage. For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.
-
Hi,
I have an issue with GPU memory running out when building a larger DenseNet.
After searching the web, I found that this is a general issue with how DenseNets are typically implemented; see here:
https://arxiv.org/abs/1707.06990
Sadly, the MXNet link in the paper no longer exists.
As a novice just starting with MXNet, and with somewhat limited Python experience, I started with a simple example aiming to get shared memory working; Fig. 3 in the paper illustrates the goal well. The example first:
```python
import mxnet as mx

batchsize = 1
# as an example, assume a [batchsize x 3 x 32 x 32] input, as in CIFAR-10
input = mx.sym.Variable("data")
conv1 = mx.sym.Convolution(data=input, kernel=(3, 3), num_filter=8, pad=(1, 1))
relu1 = mx.sym.Activation(data=conv1, act_type="relu")
# concat of input and output of first layer
concat1 = mx.sym.Concat(input, relu1, dim=1, name='layer1')
conv2 = mx.sym.Convolution(data=concat1, kernel=(3, 3), num_filter=8, pad=(1, 1))
relu2 = mx.sym.Activation(data=conv2, act_type="relu")
# concat of concat1 and output from second layer
concat2 = mx.sym.Concat(concat1, relu2, dim=1, name='layer2')
# the size of concat2 here would be [batchsize x (3+8+8) x 32 x 32]
```
This works, but runs into memory issues on the GPU for larger models (due to, quoting the paper, the "quadratic memory dependency with respect to feature maps").
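To make the quadratic dependency concrete, a quick back-of-the-envelope count (plain Python; the input channels c0, growth rate k, and depth L are just illustrative values): the l-th concat holds c0 + l*k channels, so keeping every concat output alive costs on the order of k * L^2 channels' worth of feature maps.
```python
# Back-of-the-envelope count of channels held by all concat outputs of
# one dense block; c0, k, L are illustrative values.
c0, k, L = 3, 8, 100          # input channels, growth rate, number of layers
total = sum(c0 + l * k for l in range(1, L + 1))
print(total)                  # 40700 -> grows as O(k * L**2)
```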
Now to my initial, and possibly naive, attempt to build shared memory:
```python
import mxnet as mx

batchsize = 1
# pre-make the full volume
volume = mx.sym.Variable("volume", shape=(batchsize, 3 + 8 + 8, 32, 32))
input = mx.sym.Variable("data")
# assign the image to the first 3 channels
volume[:, 0:3, :, :] = input
```
Leading to this error at the last line:
```
TypeError: 'Symbol' object does not support item assignment
```
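For what it's worth, reading a channel slice is expressible in the symbolic API via slice_axis; it is writing into a slice that I could not find a symbolic counterpart for:
```python
# reading the first three channels is fine in the symbolic API...
first3 = mx.sym.slice_axis(volume, axis=1, begin=0, end=3)
# ...but there appears to be no symbolic way to write into a slice,
# hence the TypeError above
```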
The second step would be to do a convolution on the first three channels of the volume, and so on; a sketch of that plan follows below.
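The plan does seem to go through in the imperative NDArray API, where item assignment is supported. A minimal, forward-only sketch (all shapes, the layer count, and the random weights are purely illustrative):
```python
import mxnet as mx

# Forward-only sketch of the shared-volume idea (Fig. 3 of the paper) in
# the imperative NDArray API. No backward pass is attempted here.
batchsize, k, num_layers = 1, 8, 2
channels = 3 + k * num_layers

volume = mx.nd.zeros((batchsize, channels, 32, 32))
data = mx.nd.random.uniform(shape=(batchsize, 3, 32, 32))
volume[:, 0:3, :, :] = data  # NDArrays, unlike Symbols, allow item assignment

# one 3x3 conv per layer; the input width grows by k each time
weights = [mx.nd.random.normal(scale=0.01, shape=(k, 3 + i * k, 3, 3))
           for i in range(num_layers)]

for i, w in enumerate(weights):
    filled = 3 + i * k                       # channels written so far
    x = volume[:, 0:filled, :, :]            # read the shared prefix
    y = mx.nd.Convolution(data=x, weight=w, kernel=(3, 3), pad=(1, 1),
                          num_filter=k, no_bias=True)
    y = mx.nd.relu(y)
    volume[:, filled:filled + k, :, :] = y   # write where Concat would allocate
```
Two caveats I am aware of: multi-axis slicing in MXNet may return a copy rather than a view, and as far as I can tell autograd does not track in-place writes like these, so this only captures the bookkeeping; the paper additionally recomputes the cheap concatenation/BatchNorm outputs during the backward pass to realize the actual savings.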
Any guidance and help getting a shared-memory, GPU-memory-friendly DenseNet in place would be appreciated.