Custom cost function is not being used for training #218

Open
xafilox opened this issue Mar 27, 2017 · 8 comments

Comments

xafilox commented Mar 27, 2017

Hi.

I have implemented my own cost function, but I have realized it is only being used for printing, not for computing the error for gradient descent; in fact, the metric actually used during training is the default one (accuracy).

The reason I say this is that I tried always returning the value 1 (return [(:EuclideanDist, 1)]) from the mx.get method, and I still get exactly the same results, when in that case the model should not be able to learn anything at all.

Thanks for your help.

using MXNet
using Distances

import MXNet.mx: get, reset!, update!

redirect_stderr(STDOUT)

srand(1234)

# Evaluation metric: mean Euclidean distance (in metres) between predicted
# and true coordinates.
type EuclideanDist <: mx.AbstractEvalMetric
    loss_sum  :: Float64
    n_sample :: Int

    EuclideanDist() = new(0.0, 0)
end

function mx.update!(metric :: EuclideanDist, labels :: Vector{mx.NDArray}, preds :: Vector{mx.NDArray})
    preds  = copy(preds)
    labels  = copy(labels)
    
    @assert length(labels) == length(preds)
    
    loss = 0.0
    for (label, pred) in zip(labels, preds)
        @mx.nd_as_jl ro=(label, pred) begin
            for elem in 1:size(label)[2]
                _label = label[:, elem]
                _pred = pred[:, elem]
                _euc = euclidean(
                    [_label[1]/10000 * training_deg_to_m_lat, _label[2]/10000 * training_deg_to_m_long],
                    [_pred[1]/10000 * training_deg_to_m_lat,  _pred[2]/10000 * training_deg_to_m_long])
                loss += _euc
            end
        end
    end

    metric.loss_sum += loss
    metric.n_sample += size(labels[1])[2]
end

function mx.get(metric :: EuclideanDist)
    distance  = metric.loss_sum / metric.n_sample
    return [(:EuclideanDist, distance)]
end

function mx.reset!(metric :: EuclideanDist)
    metric.loss_sum  = 0.0
    metric.n_sample = 0
end

data = mx.Variable(:data)  # Do not change the name
lbl  = mx.Variable(:softmax_label) # Do not change the name
fc1  = mx.FullyConnected(data, name=:fc1, num_hidden=512)
act1 = mx.Activation(fc1, name=:relu1, act_type=:relu)
fc2  = mx.FullyConnected(act1, name=:fc2, num_hidden=512)
act2 = mx.Activation(fc2, name=:relu2, act_type=:relu)
fc3  = mx.FullyConnected(act2, name=:fc3, num_hidden=128)
act3 = mx.Activation(fc3, name=:relu3, act_type=:relu)
fc4  = mx.FullyConnected(act3, name=:fc4, num_hidden=32)
act4 = mx.Activation(fc4, name=:relu4, act_type=:relu)
fc5  = mx.FullyConnected(act4, name=:fc5, num_hidden=2)
mlp  = mx.LinearRegressionOutput(fc5, lbl, name=:linear)

# data provider
train_provider = mx.ArrayDataProvider(Array(training_data)', Array(training_labels)', batch_size = 100, shuffle = true)
eval_provider = mx.ArrayDataProvider(Array(validation_data)', Array(validation_labels)', batch_size = 100, shuffle = true)

# setup model
model = mx.FeedForward(mlp, context=mx.gpu(1))

# optimizer
optimizer = mx.ADAM()

# Initializer
#initializer = mx.XavierInitializer(distribution = mx.xv_uniform, regularization = mx.xv_avg, magnitude = 3)
initializer = mx.UniformInitializer(0.01)

# fit parameters
a = mx.fit(model, optimizer, train_provider, eval_data=eval_provider, initializer=initializer, n_epoch=200, eval_metric=EuclideanDist())
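
(The training_* / validation_* arrays and the two deg_to_m scale constants are defined earlier in my script; to run the snippet above on its own, you could stub them with something like this.)

# Hypothetical stand-ins for the data and constants defined elsewhere in my script:
training_data,   training_labels   = rand(1000, 16), rand(1000, 2)
validation_data, validation_labels = rand(200, 16),  rand(200, 2)
training_deg_to_m_lat  = 111000.0   # rough metres per degree of latitude
training_deg_to_m_long = 79000.0    # rough metres per degree of longitude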
@iblislin (Member) commented:

I guess you need this: http://mxnet.io/how_to/new_op.html


iblislin commented Mar 27, 2017

Note that mx.LinearRegressionOutput is used as the loss function. It is what triggers backpropagation and updates the weights, IIRC, so you need to write your own output layer and replace it.
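
Roughly, this is the loss that LinearRegressionOutput bakes into the graph (my reading of libmxnet's regression output op, so treat it as an assumption rather than a reference):

# Plain-Julia sketch of what LinearRegressionOutput contributes to training:
pred  = randn(2, 100)                 # network output: 2 targets × batch of 100
label = randn(2, 100)

loss = sum((pred .- label) .^ 2) / size(pred, 2)   # squared error, averaged over the batch
grad = pred .- label                               # gradient w.r.t. pred used in backprop

# The eval_metric passed to mx.fit never enters this computation; it only
# consumes (label, pred) pairs after the forward pass to report a score.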

@iblislin (Member) commented:

Just discovered that there is a MakeLoss helper for creating a custom loss function, but it still seems buggy at the moment.
So... it seems writing your own layer in C++ is currently the only way to do it.
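
For reference, once it works the composition would look roughly like this (untested, and I am assuming mx.square and element-wise - are exposed for SymbolicNode):

# Untested sketch: swap LinearRegressionOutput for a MakeLoss-wrapped distance.
# Whatever symbol MakeLoss wraps is treated as the loss value, so its gradient
# (not the eval metric) is what drives the weight updates.
fc5  = mx.FullyConnected(act4, name=:fc5, num_hidden=2)
lbl  = mx.Variable(:softmax_label)
diff = fc5 - lbl                               # element-wise difference
mlp  = mx.MakeLoss(mx.square(diff), name=:euclidean_loss)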


xafilox commented Mar 27, 2017

Thanks a lot! Is it possible to create new operators from Julia? I have tried mx.operator.CustomOp and mx.CustomOp, and neither of them seems to exist.

@Petterhg commented:

I have the same issue! Is there really no simpler way?
I have a multi-label regression problem where I need to specify my own loss function. Where can you find the loss function implemented for each of the available output layers?

@iblislin (Member) commented:

Seems there is no wrapper for creating a CustomOp yet ... 😕


xafilox commented Mar 27, 2017

Ohh, what a pity :(
Thanks for your help @iblis17

@vchuravy (Collaborator) commented:

CustomOp support is a longer-term project, see #166.
If anybody is interested in having this, helping with #173 is a good place to start.
