loss function #70
Hi @saeedalahmari3,

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

Compared to its traditional definition, this dice_coef is already smooth and differentiable. Your proposed 0.5*(1 - dice_coef)^2 essentially wraps a squared error around it, along the lines of:

def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

Of course, you could even try to use the mean_squared_error above. For more loss functions see here. But maybe I misunderstand: why would you need any extra transformation? As @jocicmarko says, it could work, and your function stays differentiable.
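For context, here is a minimal runnable sketch of how the negated Dice coefficient discussed in this thread is typically wired up as a training loss. The smooth = 1. value, the dice_coef_loss name, and the commented compile call are illustrative assumptions, not quoted from the repository:

from keras import backend as K

smooth = 1.  # assumed value; the thread only refers to it as `smooth`

def dice_coef(y_true, y_pred):
    # the smooth Dice coefficient quoted above
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_loss(y_true, y_pred):
    # minimizing the negative coefficient maximizes the overlap
    return -dice_coef(y_true, y_pred)

# model.compile(optimizer='adam', loss=dice_coef_loss, metrics=[dice_coef])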
@jmargeta, I'm sorry if this is a trivial question, but how does the addition of the smooth term make this function differentiable?
@BrownPanther, adding a small constant to the denominator prevents division by zero when K.sum(y_true_f) + K.sum(y_pred_f) equals 0. It would otherwise be a point with no defined derivative. Even if the sum is a very small positive number, adding eps dampens the dramatic changes in gradient that could otherwise be caused by single-pixel changes in the prediction or ground truth.
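To make the division-by-zero point concrete, here is a small numeric check with toy NumPy arrays (purely illustrative; dice_np mirrors the Keras function above):

import numpy as np

smooth = 1.

def dice_np(y_true, y_pred):
    intersection = np.sum(y_true * y_pred)
    return (2. * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

empty = np.zeros(16)
print(dice_np(empty, empty))         # 1.0 -- without the smooth term this would be 0/0
print(dice_np(empty, empty + 0.01))  # ~0.86 -- a few spurious pixels only lower the score gradually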
@jmargeta thanks for the explanation! But even before the addition of the smooth term, the dice function is differentiable because the last layer's output is a sigmoid/softmax, i.e. probabilities rather than 0 or 1, correct? The smooth term just helps with the gradient flow - is this understanding correct?
@BrownPanther Yes, even without the smooth term the function itself would be differentiable almost everywhere (except for the point where K.sum(y_true_f) + K.sum(y_pred_f) equals 0). Using sigmoid/softmax in the last layer does not influence the differentiability of the dice function itself. Passing a continuous input into a differentiable function results in a continuous change of its output and can indeed help with the gradient flow. In the end, perfect differentiability is often not such a big deal.
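A quick way to see that the gradients stay finite for probability-valued predictions is to evaluate them directly. This is a sketch that assumes the graph-mode Keras backend in use around the time of this thread (K.gradients does not work under eager execution), and it repeats the dice_coef definition so the snippet runs on its own:

from keras import backend as K

smooth = 1.  # assumed value

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

y_true = K.variable([[0., 1., 1., 0.]])
y_pred = K.variable([[0.1, 0.8, 0.6, 0.2]])  # sigmoid-style probabilities, not hard 0/1

grad = K.gradients(-dice_coef(y_true, y_pred), [y_pred])[0]
print(K.eval(grad))  # finite values everywhere; the smooth term keeps the denominator away from 0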
Hello,
I didn't understand this loss function:
return -dice_coef(y_true, y_pred)
For backpropagation I think we need a differentiable loss function, for instance:
return 0.5*math.pow(1-dice_coef(y_true, y_pred),2)
Is that true?
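For what it's worth, the squared variant can be written as a Keras loss too, but note that math.pow operates on plain Python numbers, so on a symbolic tensor it needs a backend op such as K.square. A sketch, assuming the dice_coef defined earlier in this thread (squared_dice_loss is a hypothetical name):

from keras import backend as K

def squared_dice_loss(y_true, y_pred):
    # the squared form from the question, written with backend ops so it works on tensors
    return 0.5 * K.square(1. - dice_coef(y_true, y_pred))

Either way, as the replies above point out, the plain -dice_coef(y_true, y_pred) is already differentiable, so the extra squaring is not required for backpropagation.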