
Implement a loss function for classification models #117

Open
DilumAluthge opened this issue Sep 15, 2020 · 7 comments · May be fixed by #118

@DilumAluthge (Collaborator) commented Sep 15, 2020

We currently have an example of a loss function for regression models; specifically, we implement the root mean squared error (RMSE).

However, we don't currently have an example of a loss function for classification models.

We need to:

  1. Decide which loss function we want to implement.
  2. Implement it.
  3. Add it to the multinomial logistic regression example.
@DilumAluthge (Collaborator, Author)

Here is our implementation of the RMSE:

```julia
import MLJBase
import MonteCarloMeasurements
import Statistics

# Callable structs for the three flavors of the RMSE loss.
struct RMSDistribution end
struct RMSExpected end
struct RMSMedian end

const rms_distribution = RMSDistribution()
const rms_expected = RMSExpected()
const rms_median = RMSMedian()

# RMSE computed on particles, returning a distribution over the loss.
function (::RMSDistribution)(ŷ::MonteCarloMeasurements.MvParticles, y::Vector{<:Real})
    return MLJBase.rms(ŷ, y)
end

# Point estimates: mean and median of the RMSE distribution.
function (::RMSExpected)(ŷ::MonteCarloMeasurements.MvParticles, y::Vector{<:Real})
    return Statistics.mean(rms_distribution(ŷ, y))
end

function (::RMSMedian)(ŷ::MonteCarloMeasurements.MvParticles, y::Vector{<:Real})
    return Statistics.median(rms_distribution(ŷ, y))
end
```

@DilumAluthge (Collaborator, Author)

@cscherrer Any thoughts on a good loss function for the multinomial classification problem? Some options include:

  1. Brier score
  2. Cross entropy loss

Any other options?
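
For reference, the cross-entropy (log loss) option is just the negative log of the probability the model assigns to the observed class. A minimal sketch (illustrative helper, not this repo's API):

```julia
# Illustrative multiclass cross-entropy (log) loss for one observation:
# `p` is the vector of predicted class probabilities and `yidx` the
# index of the observed class. Lower is better; 0 is optimal.
cross_entropy(p::AbstractVector{<:Real}, yidx::Integer) = -log(p[yidx])

cross_entropy([0.7, 0.2, 0.1], 1)  # ≈ 0.357
cross_entropy([1.0, 0.0, 0.0], 1)  # perfect prediction gives zero loss
```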

@cscherrer (Owner)

Either of those would be good, or an asymmetric loss could be interesting. I'd think this must come up a lot in medical applications, right?

@DilumAluthge (Collaborator, Author)

Yeah, in binary classification problems (e.g. mortality prediction), we often want a loss function that penalizes underprediction more heavily than overprediction.

I think for the multinomial example, we can just use something simple and symmetric. Later, we can add a binary classification problem with class imbalance and think about an asymmetric loss function for that problem.
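
An asymmetric binary loss along these lines could be sketched as follows (purely illustrative; `asymmetric_log_loss` and its default weights are hypothetical, not part of this repo):

```julia
# Hypothetical asymmetric binary log loss: penalize underprediction of
# the positive class (e.g. predicted survival for a patient who died)
# more heavily than overprediction. `p` is the predicted probability of
# the positive class, `y` ∈ {0, 1} the observed label, and
# `w_fn > w_fp` weights false negatives above false positives.
function asymmetric_log_loss(p::Real, y::Integer; w_fn = 5.0, w_fp = 1.0)
    ϵ = eps()  # avoid log(0) at p = 0 or p = 1
    return -(w_fn * y * log(p + ϵ) + w_fp * (1 - y) * log(1 - p + ϵ))
end

asymmetric_log_loss(0.1, 1)  # confident miss of a positive: large loss
asymmetric_log_loss(0.9, 0)  # equally confident false alarm: smaller loss
```

With `w_fn > w_fp`, confidently missing a positive case costs several times more than an equally confident false alarm.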

@DilumAluthge (Collaborator, Author)

Let's go with the Brier score. For consistency with MLJ, we should implement it the same way they do (https://github.com/alan-turing-institute/MLJBase.jl/blob/5e5d1cda3b555510df1de4b125a5e320c11f6256/src/measures/finite.jl#L103-L131):

```julia
"""
    BrierScore(; distribution=UnivariateFinite)(ŷ, y [, w])

Given an abstract vector of distributions of type `distribution`,
and an abstract vector of true observations `y`, return the
corresponding Brier (aka quadratic) scores. Weight the scores using
`w` if provided.

Currently only `distribution=UnivariateFinite` is supported, which is
applicable to supervised models with `Finite` target scitype. In this
case, if `p(y)` is the predicted probability for a single
observation `y`, and `C` all possible classes, then the corresponding
Brier score for that observation is given by

    2p(y) - \\left(\\sum_{η ∈ C} p(η)^2\\right) - 1

Note that `BrierScore() = BrierScore{UnivariateFinite}` has the alias
`brier_score`.

Warning. Here `BrierScore` is a "score" in the sense that bigger is
better (with 0 optimal, and all other values negative). In Brier's
original 1950 paper, and many other places, it has the opposite sign,
despite the name. Moreover, the present implementation does not treat
the binary case as special, so that the score may differ, in that
case, by a factor of two from usage elsewhere.

For more information, run `info(BrierScore)`.
"""
```
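
A self-contained sketch of that per-observation formula (illustrative helper, not MLJBase's implementation):

```julia
# Illustrative per-observation Brier score, following the MLJ sign
# convention quoted above (bigger is better, 0 optimal, all other
# values negative):
#   2p(y) - Σ_{η ∈ C} p(η)^2 - 1
# `p` holds the predicted class probabilities; `yidx` indexes the
# observed class.
function brier_score(p::AbstractVector{<:Real}, yidx::Integer)
    return 2p[yidx] - sum(abs2, p) - 1
end

brier_score([1.0, 0.0, 0.0], 1)  # perfect prediction: 0.0
brier_score([0.5, 0.3, 0.2], 1)  # ≈ -0.38
```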

@DilumAluthge (Collaborator, Author)

I think this is blocked by #93

Once #93 is solved, I can just get the prediction for μ in the form of particles. Once I have the particles for μ, I can just put that directly into the formula for the Brier score.

@cscherrer (Owner)

Sounds good
