
Implement a loss function for classification models #117

Open
DilumAluthge opened this issue Sep 15, 2020 · 7 comments · May be fixed by #118

@DilumAluthge (Collaborator) commented Sep 15, 2020

We currently have an example of a loss function for regression models; specifically, we implement the root mean squared error (RMSE).

However, we don't currently have an example of a loss function for classification models.

We need to:

  1. Decide which loss function we want to implement.
  2. Implement it.
  3. Add it to the multinomial logistic regression example.
@DilumAluthge (Collaborator, Author)

Here is our implementation of the RMSE:

```julia
import MLJBase
import MonteCarloMeasurements
import Statistics

# Callable structs for the three flavors of the RMSE loss.
struct RMSDistribution end
struct RMSExpected end
struct RMSMedian end

const rms_distribution = RMSDistribution()
const rms_expected = RMSExpected()
const rms_median = RMSMedian()

# RMSE computed on particles, returning a distribution over the loss.
function (::RMSDistribution)(ŷ::MonteCarloMeasurements.MvParticles, y::Vector{<:Real})
    return MLJBase.rms(ŷ, y)
end

# Point estimates: mean and median of the RMSE distribution.
function (::RMSExpected)(ŷ::MonteCarloMeasurements.MvParticles, y::Vector{<:Real})
    return Statistics.mean(rms_distribution(ŷ, y))
end

function (::RMSMedian)(ŷ::MonteCarloMeasurements.MvParticles, y::Vector{<:Real})
    return Statistics.median(rms_distribution(ŷ, y))
end
```

@DilumAluthge (Collaborator, Author)

@cscherrer Any thoughts on a good loss function for the multinomial classification problem? Some options include:

  1. Brier score
  2. Cross entropy loss

Any other options?
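
For reference, the cross-entropy (log loss) option is just the negative log of the probability the model assigns to the observed class. A minimal sketch (illustrative helper, not this repo's API):

```julia
# Illustrative multiclass cross-entropy (log) loss for one observation:
# `p` is the vector of predicted class probabilities and `yidx` the
# index of the observed class. Lower is better; 0 is optimal.
cross_entropy(p::AbstractVector{<:Real}, yidx::Integer) = -log(p[yidx])

cross_entropy([0.7, 0.2, 0.1], 1)  # ≈ 0.357
cross_entropy([1.0, 0.0, 0.0], 1)  # perfect prediction gives zero loss
```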

@cscherrer (Owner)

Either of those would be good, or an asymmetric loss could be interesting. I'd think this must come up a lot in medical applications, right?

@DilumAluthge (Collaborator, Author)

Yeah, in binary classification problems (e.g. mortality prediction), we often want a loss function that penalizes underprediction more heavily than overprediction.

I think for the multinomial example, we can just use something simple and symmetric. Later, we can add a binary classification problem with class imbalance and think about an asymmetric loss function for that problem.
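
An asymmetric binary loss along these lines could be sketched as follows (purely illustrative; `asymmetric_log_loss` and its default weights are hypothetical, not part of this repo):

```julia
# Hypothetical asymmetric binary log loss: penalize underprediction of
# the positive class (e.g. predicted survival for a patient who died)
# more heavily than overprediction. `p` is the predicted probability of
# the positive class, `y` ∈ {0, 1} the observed label, and
# `w_fn > w_fp` weights false negatives above false positives.
function asymmetric_log_loss(p::Real, y::Integer; w_fn = 5.0, w_fp = 1.0)
    ϵ = eps()  # avoid log(0) at p = 0 or p = 1
    return -(w_fn * y * log(p + ϵ) + w_fp * (1 - y) * log(1 - p + ϵ))
end

asymmetric_log_loss(0.1, 1)  # confident miss of a positive: large loss
asymmetric_log_loss(0.9, 0)  # equally confident false alarm: smaller loss
```

With `w_fn > w_fp`, confidently missing a positive case costs several times more than an equally confident false alarm.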

@DilumAluthge (Collaborator, Author)

Let's go with the Brier score. For consistency with MLJ, we should implement it the same way they do (https://github.com/alan-turing-institute/MLJBase.jl/blob/5e5d1cda3b555510df1de4b125a5e320c11f6256/src/measures/finite.jl#L103-L131):

```julia
"""
    BrierScore(; distribution=UnivariateFinite)(ŷ, y [, w])

Given an abstract vector of distributions of type `distribution`,
and an abstract vector of true observations `y`, return the
corresponding Brier (aka quadratic) scores. Weight the scores using
`w` if provided.

Currently only `distribution=UnivariateFinite` is supported, which is
applicable to supervised models with `Finite` target scitype. In this
case, if `p(y)` is the predicted probability for a single
observation `y`, and `C` all possible classes, then the corresponding
Brier score for that observation is given by

    2p(y) - \\left(\\sum_{η ∈ C} p(η)^2\\right) - 1

Note that `BrierScore() = BrierScore{UnivariateFinite}` has the alias
`brier_score`.

Warning. Here `BrierScore` is a "score" in the sense that bigger is
better (with 0 optimal, and all other values negative). In Brier's
original 1950 paper, and many other places, it has the opposite sign,
despite the name. Moreover, the present implementation does not treat
the binary case as special, so that the score may differ, in that
case, by a factor of two from usage elsewhere.

For more information, run `info(BrierScore)`.
"""
```
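
A self-contained sketch of that per-observation formula (illustrative helper, not MLJBase's implementation):

```julia
# Illustrative per-observation Brier score, following the MLJ sign
# convention quoted above (bigger is better, 0 optimal, all other
# values negative):
#   2p(y) - Σ_{η ∈ C} p(η)^2 - 1
# `p` holds the predicted class probabilities; `yidx` indexes the
# observed class.
function brier_score(p::AbstractVector{<:Real}, yidx::Integer)
    return 2p[yidx] - sum(abs2, p) - 1
end

brier_score([1.0, 0.0, 0.0], 1)  # perfect prediction: 0.0
brier_score([0.5, 0.3, 0.2], 1)  # ≈ -0.38
```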

@DilumAluthge (Collaborator, Author)

I think this is blocked by #93

Once #93 is solved, I can just get the prediction for μ in the form of particles. Once I have the particles for μ, I can just put that directly into the formula for the Brier score.

@cscherrer (Owner)

Sounds good
