Add only works for two layers of the same size #50

AndreJFBico · 2017-07-28T10:13:04Z

Hi, I'm working on setting up a fast style transfer network in code instead of importing it from a .pb file, so that I can start improving it.

However, I'm having some issues trying to set up a residual layer. Here's the network:

        styleNet.start
            ->> Convolution(convSize: ConvSize(outputChannels: 32, kernelSize: 9, stride: 1), neuronType: .none, id: "Variable_0")
            ->> InstanceNorm(shiftModifier: "1", scaleModifier: "2", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), neuronType: .none, id: "Variable_3")
            ->> InstanceNorm(shiftModifier: "4", scaleModifier: "5", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 2), neuronType: .none, id: "Variable_6")
            ->> InstanceNorm(shiftModifier: "7", scaleModifier: "8", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> ResidualLayer(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), layers:
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, id: "Variable_9")
                    ->> InstanceNorm(shiftModifier: "10", scaleModifier: "11", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, id: "Variable_12")
                    ->> InstanceNorm(shiftModifier: "13", scaleModifier: "14", id: "Variable_"))

When initializing, the following error appears: assertion failed: Add works for two layers of the same size: file /Users/Andre/Downloads/Bender-fix-issue-38/Sources/Layers/Add.swift, line 23

I also printed the layers in the network:


"PRINTING LAYERS"
": Bender.Start"
": Bender.Convolution"
": Bender.InstanceNorm"
": Bender.Neuron"
": Bender.Convolution"
": Bender.InstanceNorm"
": Bender.Neuron"
": Bender.Convolution"
": Bender.InstanceNorm"
": Bender.Neuron"
": Bender.Dummy"
": Bender.Identity"
": Bender.Add"
assertion failed: Add works for two layers of the same size: file /Users/Andre/Downloads/Bender-fix-issue-38/Sources/Layers/Add.swift, line 23

If you're wondering what kind of network I'm trying to emulate, it's this one:
https://github.com/lengstrom/fast-style-transfer/blob/master/src/transform.py

def net(image):
    conv1 = _conv_layer(image, 32, 9, 1)
    conv2 = _conv_layer(conv1, 64, 3, 2)
    conv3 = _conv_layer(conv2, 128, 3, 2)
    resid1 = _residual_block(conv3, 3)
    resid2 = _residual_block(resid1, 3)
    resid3 = _residual_block(resid2, 3)
    resid4 = _residual_block(resid3, 3)
    resid5 = _residual_block(resid4, 3)
    conv_t1 = _conv_tranpose_layer(resid5, 64, 3, 2)
    conv_t2 = _conv_tranpose_layer(conv_t1, 32, 3, 2)
    conv_t3 = _conv_layer(conv_t2, 3, 9, 1, relu=False)
    preds = tf.nn.tanh(conv_t3) * 150 + 255./2
    return preds

Note: I have changed the InstanceNorm input to allow specific shift/scale modifiers; it's a temporary way to set up the weight IDs.

bryant1410 · 2017-07-28T10:49:13Z

The layers do sanity checks on the sizes. In this case it is the Add, which comes from ResidualLayer.

So it seems that what comes before the residual layer and what it yields have different sizes. I'm trying to figure out why, as the network looks fine.
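
For reference, the sizes that should reach the Add can be traced by hand. The sketch below is not Bender code: it assumes SAME padding (as the TensorFlow network in transform.py uses), an RGB input, and that InstanceNorm and ReLU leave the shape unchanged. The function names are made up for illustration.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a convolution with TensorFlow-style SAME padding."""
    return math.ceil(size / stride)


def trace_shapes(height, width):
    """Return the (height, width, channels) shapes of the two Add inputs
    for the first residual block of the network in this issue."""
    shape = (height, width, 3)  # assumed RGB input
    # (output channels, stride) of the three convolutions before the block
    for channels, stride in [(32, 1), (64, 2), (128, 2)]:
        shape = (same_padding_output(shape[0], stride),
                 same_padding_output(shape[1], stride),
                 channels)
    skip = shape  # what the identity branch hands to Add
    branch = skip
    # the two stride-1 convolutions inside the residual block
    for channels, stride in [(128, 1), (128, 1)]:
        branch = (same_padding_output(branch[0], stride),
                  same_padding_output(branch[1], stride),
                  channels)
    return skip, branch


skip, branch = trace_shapes(256, 256)
print(skip, branch)
```

Under these assumptions both Add inputs come out as 64 x 64 x 128 for a 256 x 256 image, so the network definition alone would not be expected to produce a size mismatch.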

bryant1410 · 2017-07-28T11:14:48Z

A few tips while I tackle it 😄

You can create your own blocks, as in the file you cite, by extending CompositeLayer and ResidualLayer.

Convolution should not use bias here, as the example you provide uses TF convolutions, which have no bias.

I'm noticing there's a useless convSize in ResidualLayer. Gonna remove it.

AndreJFBico · 2017-07-28T15:47:01Z

The outputSize is passed down from the start layer: each time initialize is called, it is taken from incoming[0].outputSize.

However, for the Add itself, incoming[1] is nil.

bryant1410 · 2017-07-28T16:04:03Z

So Add does not have 2 inputs?

AndreJFBico · 2017-07-28T16:16:42Z

The Add has 2 inputs; the output size of the second input is nil, while the output size of the first is correct.
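
The distinction made here (a missing size versus two different sizes) can be illustrated with a toy check. This is only a sketch with hypothetical names (`ToyLayer`, `check_add_inputs`); it does not reproduce the assertion in Bender's Add.swift.

```python
from typing import List, Optional, Tuple

Size = Tuple[int, int, int]  # (height, width, channels)


class ToyLayer:
    """Hypothetical stand-in for a layer; output_size stays None until set."""

    def __init__(self, name: str, output_size: Optional[Size] = None):
        self.name = name
        self.output_size = output_size


def check_add_inputs(incoming: List[ToyLayer]) -> Size:
    """Validate the inputs of an element-wise add and report why they fail."""
    missing = [layer.name for layer in incoming if layer.output_size is None]
    if missing:
        # The case described in this comment: one input never got a size.
        raise ValueError("output size not initialized for: " + ", ".join(missing))
    sizes = {layer.output_size for layer in incoming}
    if len(sizes) != 1:
        # A genuine mismatch between two known sizes.
        raise ValueError("inputs have different sizes: " + str(sorted(sizes)))
    return incoming[0].output_size


identity = ToyLayer("identity", (64, 64, 128))
norm = ToyLayer("instance_norm")  # size was never propagated
try:
    check_add_inputs([identity, norm])
except ValueError as error:
    print(error)
```

Reporting the two cases separately makes it clear whether the network definition or the size propagation is at fault.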

dernster · 2017-07-31T17:18:45Z

@AndreJFBico Do you have a repo we can take a look at?

AndreJFBico · 2017-08-01T09:44:04Z

@dernster Yeah, I can provide it. I'll update the main post when it uploads.

In the meantime, this is the network I'm running right now.

        //We have in total 48 layers and 48 weight variables
        styleNet.start
            ->> Convolution(convSize: ConvSize(outputChannels: 32, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_0")
            ->> InstanceNorm(shiftModifier: "1", scaleModifier: "2", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_3")
            ->> InstanceNorm(shiftModifier: "4", scaleModifier: "5", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_6")
            ->> InstanceNorm(shiftModifier: "7", scaleModifier: "8", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> [Identity(), (
                 Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_9")
                    ->> InstanceNorm(shiftModifier: "10", scaleModifier: "11", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_12")
                    ->> InstanceNorm(shiftModifier: "13", scaleModifier: "14", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_15")
                    ->> InstanceNorm(shiftModifier: "16", scaleModifier: "17", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_18")
                    ->> InstanceNorm(shiftModifier: "19", scaleModifier: "20", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_21")
                    ->> InstanceNorm( shiftModifier: "22", scaleModifier: "23", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_24")
                    ->> InstanceNorm(shiftModifier: "25", scaleModifier: "26", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_27")
                    ->> InstanceNorm(shiftModifier: "28", scaleModifier: "29", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_30")
                    ->> InstanceNorm(shiftModifier: "31", scaleModifier: "32", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_33")
                    ->> InstanceNorm(shiftModifier: "34", scaleModifier: "35", id: "Variable_")
                    ->> Neuron( type: .relu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_36")
                    ->> InstanceNorm(shiftModifier: "37", scaleModifier: "38", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> ConvTranspose(size: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), id: "Variable_39")
            ->> InstanceNorm(shiftModifier: "40", scaleModifier: "41", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> ConvTranspose(size: ConvSize(outputChannels: 32, kernelSize: 3, stride: 2), id: "Variable_42")
            ->> InstanceNorm(shiftModifier: "43", scaleModifier: "44", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 3, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_45")
            ->> InstanceNorm( shiftModifier: "46", scaleModifier: "47", id: "Variable_")
            ->> Neuron(type: .tanh)
            ->> ImageLinearTransform()

It runs, although the end result still has some strange artifacts and it crashes on input resolutions higher than 256 (but those are separate issues). The way I circumvented the Add issue was to add an Identity layer after the InstanceNorm layer; that way the outputSize is passed correctly to the Add layer.

bryant1410 · 2017-08-01T12:56:34Z

If the output is not OK, maybe you can yield the outputs of different layers and compare them to the Python implementation to see whether they match, in order to find where the error is.

But have you tried saving the protobuf from the Python code with benderthon and loading it with Bender?
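
A layer-by-layer comparison like this could be sketched as follows. It assumes the outputs of both implementations have already been dumped as flat lists of floats keyed by layer name, in network order; the function name, the tolerance, and the sample values are made up for illustration.

```python
def first_mismatch(reference, candidate, tolerance=1e-3):
    """Return the name of the first layer whose outputs differ by more than
    `tolerance` (or differ in length), or None if every layer matches.

    Both arguments map layer name -> flat list of floats, in network order.
    """
    for name, expected in reference.items():
        actual = candidate[name]
        if len(actual) != len(expected):
            return name
        worst = max((abs(a - e) for a, e in zip(actual, expected)), default=0.0)
        if worst > tolerance:
            return name
    return None


# Made-up values, only to show the call.
python_outputs = {"conv1": [0.10, 0.20], "conv2": [0.50, 0.70]}
bender_outputs = {"conv1": [0.10, 0.20], "conv2": [0.50, 0.95]}
print(first_mismatch(python_outputs, bender_outputs))
```

The first layer reported is where the two implementations start to diverge, which narrows the search to that layer or its weights.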

AndreJFBico · 2017-08-01T13:06:17Z

Good suggestion. As for importing a protobuf: yes, I have done that and it works properly. It still crashes with an input size of 1024 for some reason, but for input images smaller than 512 it works OK.

The reason I'm trying to define the network in code is so that I can understand it better and also work with it.

dernster · 2017-08-10T19:53:21Z

@AndreJFBico Hi! We couldn't reproduce the original issue; could you please provide a repo with it? Additionally, an example of a larger input size causing a crash would be helpful.

backnotprop · 2017-09-19T18:39:05Z

@AndreJFBico did you figure this out?

@dernster I recreated this by generating a model from fast-style-transfer, using benderthon to get the pb, and running the code taken from the example in my own project with the new pb. The only pb that works is g_and_w2, the one provided with the Bender style example.

I also get the same error when I make the swap in the example project.

bryant1410 · 2017-09-19T19:01:36Z

Yeah, we are now able to reproduce the error, but we still don't know what the cause is. And we don't have an ETA for this currently.

If you want, you can go ahead and try to find it. We'll check it out when we have time (I hope it's in the following weeks).

AndreJFBico · 2017-09-19T22:58:45Z

I'm sorry that I never got that repo ready; I started changing the core of Bender itself and no longer had things as they were originally.

Well, I sort of figured it out; however, I couldn't make it work with the .pb file straight off the bat.
And since I wanted to experiment with the code itself, I converted lengstrom's fast style transfer network to Bender's.

        styleNet.start
            ->> Convolution(convSize: ConvSize(outputChannels: 32, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_0")
            ->> InstanceNorm(shiftModifier: "1", scaleModifier: "2", id: "Variable_")
            ->> Neuron( type: .elu)
            ->> Convolution(convSize: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_3")
            ->> InstanceNorm(shiftModifier: "4", scaleModifier: "5", id: "Variable_")
            ->> Neuron( type: .elu)
            ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 2), neuronType: .none, useBias: false, id: "Variable_6")
            ->> InstanceNorm(shiftModifier: "7", scaleModifier: "8", id: "Variable_")
            ->> Neuron( type: .elu)
            ->> [Identity(), (
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_9")
                    ->> InstanceNorm(shiftModifier: "10", scaleModifier: "11", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_12")
                    ->> InstanceNorm(shiftModifier: "13", scaleModifier: "14", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_15")
                    ->> InstanceNorm(shiftModifier: "16", scaleModifier: "17", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_18")
                    ->> InstanceNorm(shiftModifier: "19", scaleModifier: "20", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_21")
                    ->> InstanceNorm( shiftModifier: "22", scaleModifier: "23", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_24")
                    ->> InstanceNorm(shiftModifier: "25", scaleModifier: "26", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_27")
                    ->> InstanceNorm(shiftModifier: "28", scaleModifier: "29", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_30")
                    ->> InstanceNorm(shiftModifier: "31", scaleModifier: "32", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> [Identity(),(
                Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_33")
                    ->> InstanceNorm(shiftModifier: "34", scaleModifier: "35", id: "Variable_")
                    ->> Neuron( type: .elu)
                    ->> Convolution(convSize: ConvSize(outputChannels: 128, kernelSize: 3, stride: 1), neuronType: .none, useBias: false, id: "Variable_36")
                    ->> InstanceNorm(shiftModifier: "37", scaleModifier: "38", id: "Variable_")
                    ->> Identity())]
            ->> Add()
            ->> ConvTranspose(size: ConvSize(outputChannels: 64, kernelSize: 3, stride: 2), id: "Variable_39")
            ->> InstanceNorm(shiftModifier: "40", scaleModifier: "41", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> ConvTranspose(size: ConvSize(outputChannels: 32, kernelSize: 3, stride: 2), id: "Variable_42")
            ->> InstanceNorm(shiftModifier: "43", scaleModifier: "44", id: "Variable_")
            ->> Neuron( type: .relu)
            ->> Convolution(convSize: ConvSize(outputChannels: 3, kernelSize: 9, stride: 1), neuronType: .none, useBias: false, id: "Variable_45")
            ->> InstanceNorm( shiftModifier: "46", scaleModifier: "47", id: "Variable_")
            ->> Neuron(type: .tanh)
            ->> ImageLinearTransform()

It's a one-to-one conversion of the following file: https://github.com/lengstrom/fast-style-transfer/blob/master/src/transform.py

I also used benderthon to export the layer weights individually; I also had to alter its conversion script, as only some weights require transposing.

I also changed a bit how the weights are looked up, but the principle should be the same.

About the "Add works for two layers of the same size" issue: I figured out that I had to add an Identity layer after the InstanceNorm layer before an Add layer; that way the issue disappeared.

I tried looking into the code for how the layers are set up but found nothing problematic; somehow the second incoming node of the Add layer loses its convSize attribute, and I never really found out why.

If they manage to fix this somehow, I bet your conversion should be simple and painless.
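
As an illustration of what such a transposition involves (not the actual benderthon change, which is not shown in this thread): TensorFlow's conv2d stores filters as [height, width, in_channels, out_channels], while MPSCNNConvolution expects [out_channels, height, width, in_channels]. One-dimensional parameters, such as the InstanceNorm scale and shift, have no layout to reorder. The function name below is hypothetical.

```python
def hwio_to_ohwi(flat, height, width, in_channels, out_channels):
    """Reorder a flat row-major [H, W, I, O] weight buffer into [O, H, W, I]."""
    result = [0.0] * len(flat)
    for h in range(height):
        for w in range(width):
            for i in range(in_channels):
                for o in range(out_channels):
                    # index of element (h, w, i, o) in the source layout
                    src = ((h * width + w) * in_channels + i) * out_channels + o
                    # index of the same element in the target layout
                    dst = ((o * height + h) * width + w) * in_channels + i
                    result[dst] = flat[src]
    return result


# 1x1 kernel with 2 input and 3 output channels; values encode their source index.
weights = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(hwio_to_ohwi(weights, 1, 1, 2, 3))
```

After reordering, the two input-channel weights of each output channel sit next to each other, which is the grouping the target layout requires.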

Cheers.

backnotprop · 2017-09-20T14:57:02Z

Does Bender by chance have notes on how they generated their pb files?

bryant1410 · 2017-09-20T15:45:12Z

The MNIST sample came from the benderthon sample. The style transfer one probably came directly from the repo we were talking about; I don't know why it isn't working now. But we will take a look at this error as soon as we can.

bryant1410 · 2017-09-20T16:36:11Z

@mdramos I updated benderthon so the sample uses a simpler way to generate the protobuf. Maybe take a look at it.

backnotprop · 2017-09-20T17:23:34Z

OK, thanks! Yeah, I actually got benderthon working just fine yesterday.

PIRANAVARUBAN · 2018-05-21T12:16:19Z

I am using the lengstrom project. While using the protobuf file I got:

"Fatal error: Index out of range"
var kernelWidth: Int {
    return Int(dim[1].size)
}

What is the issue?

Attached my Pb model

test.pb.zip

PIRANAVARUBAN · 2018-05-21T12:26:27Z

@mdramos What changes did you make to get it working?

mats-claassen · 2018-05-21T13:00:29Z

The issue with your graph seems to be the ExpandDims at the beginning. I am looking into it.

mats-claassen · 2018-05-21T14:06:30Z

There are two issues with your graph. The first one is fixed with #111 and has to do with the ExpandDims.

The second one is that Mul and Add with a scalar are not supported yet. From what I saw, they are only used at the end to scale the final result. What you can do there is cut the graph after the Tanh node (when freezing) and then add a postprocessing layer to do the scaling, like this:

Neuron(type: ActivationNeuronType.custom(neuron: MPSCNNNeuronLinear(device: Device.shared, a: 2.0, b: -1)), id: "scale_neuron")

where a and b are the scale and offset.

If there are any other questions please open a new issue.

PIRANAVARUBAN · 2018-05-23T11:10:21Z

Working ....
preds= tf.add(tf.nn.tanh(conv_t3)*150, 255./2,name="preds")

How did you calculate a and b as 2 and -1? There is a difference between the output images in Python and on iOS.
What values do we need to use for a and b?

mats-claassen · 2018-05-25T11:31:32Z

2 and -1 are just an example.

The MPSCNNNeuronLinear documentation says it calculates ax + b, so yours would be something like 150 and 255/2.
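
The arithmetic can be checked with a small sketch: a linear neuron applied to the Tanh output with a = 150 and b = 255/2 should reproduce the `preds` expression from transform.py. The function names are made up for illustration.

```python
import math


def tf_postprocess(x):
    """The final expression in transform.py: tanh(x) * 150 + 255/2."""
    return math.tanh(x) * 150 + 255.0 / 2


def linear_neuron(y, a, b):
    """What a linear neuron computes on its input y: a * y + b."""
    return a * y + b


a, b = 150.0, 255.0 / 2  # b is 127.5

# The graph is cut after Tanh, so the neuron receives tanh(x) as its input.
for x in (-2.0, 0.0, 0.5, 3.0):
    assert math.isclose(linear_neuron(math.tanh(x), a, b), tf_postprocess(x))
```

Since tanh lies in (-1, 1), these values map the output into the range (-22.5, 277.5).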

dernster added the bug label Aug 10, 2017

bryant1410 changed the title ~~Add works for two layers of the same size~~ Add only works for two layers of the same size Sep 20, 2017

j005u mentioned this issue Oct 26, 2017

Where were the style transfer example models taken from? #49

Closed
