Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: Add full likelihood components for likelihoods, random effects and priors/penalties #553

Open
Cole-Monnahan-NOAA opened this issue Jan 12, 2024 · 6 comments
Assignees
Labels
hard substantial (days-weeks) work statistics related to logL wishlist request new feature; bigger than revision; OK to remove after adding to Milestone

Comments

@Cole-Monnahan-NOAA
Copy link

Cole-Monnahan-NOAA commented Jan 12, 2024

Describe the solution you would like.

I suggest that SS3 should modify all objective function components (i.e., data likelihoods, priors, and hierarchical penalties) to use the full log-densities statements. E.g. dnorm and dmultinom type statements should be used in place of norm2 which drops the constants (example here).

Describe alternatives you have considered

One could meticulously add constants back but that is not ideal and would make the code harder to read and be more bug prone.

Statistical validity, if applicable

The approaches are equivalent up to a constant in the penalized maximum likelihood approach. However, this change would have three benefits as I see it.

  1. If a user wants to estimate a process error (sigmaR, SD for time-varying quantities, etc.) then the SD in those statements are no longer constant and the log-density calculations are incorrect and the estimate will be biased. Converting to the R-style dnorm statements will allow for improved options for estimating process errors in e.g. a Bayesian context.
  2. Improved interpretation of the NLL value when using more than one data likelihood. If a user tries the same model with multinomial and the D-M and the constants are not added then it's not clear what the change in NLL means. The constants must be included to compare among likelihoods (e.g., Burnham and Anderson 2014).
  3. The other advantage is code readability and transparency. Most users will be more familiar with dnorm(0,x,SD) and how to interpret that, and it is more clear what is a data likelihood vs prior.

Describe if this is needed for a management application

No response

Additional context

I realize this would require a big change and break all the tests (presuming they check for changes in the NLL). But the inference should be identical for all penalized ML models.

Also see request by quang-huynh for proper lognormal prior #558. Which should be done as an added prior option and the documentation will need full explanation of the difference.

@Rick-Methot-NOAA Rick-Methot-NOAA added wishlist request new feature; bigger than revision; OK to remove after adding to Milestone hard substantial (days-weeks) work statistics related to logL labels Jan 12, 2024
@Rick-Methot-NOAA
Copy link
Collaborator

This indeed would be a large undertaking. SS3 has been penalized likelihood from day 1. Jim Thorson also has asked for this change.

@kellijohnson-NOAA
Copy link
Contributor

@Rick-Methot-NOAA with your permission, I would like to attempt to work on this once I have the hake stock assessment finished (mid-February 2024). I am sure that I will have lots of questions and need help but at least you wouldn't be tasked with doing the whole thing.

@Rick-Methot-NOAA
Copy link
Collaborator

Glad you asked because I almost assigned it to you as a co-lead myself. So, you are welcome to be fully involved. First step can be assembling a list of current code features that will be affected.

@Cole-Monnahan-NOAA
Copy link
Author

I don't think any features will be affected unless I'm missing something? I'm happy to help where I can. It seems prudent to list the changes needed and then walk through them one at a time using the test suite to ensure no changes to parameter estimates.

@Rick-Methot-NOAA
Copy link
Collaborator

features affected:
multinomial logL includes offset for logL of a perfect fit so final value approaches 0 from below
some logL components have a -log(s) that can be invoked by the sd_offset setting
etc.

@Cole-Monnahan-NOAA
Copy link
Author

Great points Rick. Many of these are features of the objfun I don't understand so that'll be good to have some others put eyes on it and understand it better.

@kellijohnson-NOAA I'll wait for your lead on how to proceed. I consider this a priority but non-urgent issue, personally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hard substantial (days-weeks) work statistics related to logL wishlist request new feature; bigger than revision; OK to remove after adding to Milestone
Projects
Status: No status
Status: No status
Development

No branches or pull requests

3 participants