Add log(Pval) term to lognormal prior #558

quang-huynh · 2024-01-31T02:14:52Z

Concisely describe what has been changed/addressed in the pull request.

Add log(pval) to negative log prior for lognormal distribution

What tests have been done?

Where are the relevant files?

What tests/review still need to be done?

Is there an input change for users to Stock Synthesis?

No input change.

Additional information (optional).

Current negative log prior is of the form 0.5 * square(log(x) - mu). Since x is estimated in normal space, an additional term log(x) from the lognormal distribution is needed.

Some R code to verify:

x <- seq(0.01, 1, 0.005)
logmu <- 0
sd <- 1

# True mode of lognormal distribution
f_x <- dlnorm(x, logmu, sd)
x[which.max(f_x)]

# Discrepancy in the mode of the distribution with SS3 lognormal prior
g_x <- -0.5 * (log(x) - logmu)^2/sd/sd
x[which.max(g_x)]

# Mode from proposed correction matches the answer
h_x <- -0.5 * (log(x) - logmu)^2/sd/sd - log(x)
x[which.max(h_x)]

Rick-Methot-NOAA

Hello Quang. Thanks for considering this issue. It cannot be accepted as is because it does not provide backward compatibility for all existing model applications. I also will want the logic and rationale of the proposed change to be documented more thoroughly.

Cole-Monnahan-NOAA · 2024-01-31T20:29:33Z

My initial impression is to agree with this request from a statistical standpoint. When used as a prior, the pdf is a function of the state variable which in this case is some parameter say 'p'. The mu and sd terms are fixed (inputted by the user). The log(p) component thus varies as p does during optimization (or MCMC) and thus the prior density needs it to be accurate.

@quang-huynh 's code shows this, but another way to demonstrate it is to integrate the model. With no other sources of information the posterior will just be the prior. So I modified the example to get posteriors and compare to the analytical prior. See code below which produces the following plot.

I think this could be wrapped into issue #553 which will overhaul the objective function calculations. This is a good example of why using dlnorm would be preferable. Probably the shortest path forward is to provide a new prior option that uses dlnorm and leave the old ones in with a note in the documentation.

Curious to hear if there are other considerations.

library(adnuts)

tpl <- '
DATA_SECTION
  init_number logmu;
  init_number sd;

PARAMETER_SECTION
  init_bounded_number p1(0,500);
  init_bounded_number p2(0,500);
  init_bounded_number p3(0,500);
  objective_function_value jnll;

PROCEDURE_SECTION
  jnll=  0.5*square(log(p1)-logmu)/square(sd);
  jnll += dlnorm(p2,logmu,sd);
  jnll += 0.5*square(log(p3)-logmu)/square(sd)+log(p3);
'

logmu <- 1.1
logsd <- 1.12
writeLines(as.character(c(logmu, sd)), con='lnpriors.dat')
## prior mean and SD in log space
writeLines(tpl, con='lnpriors.tpl')
system("admb lnpriors.tpl")

## sample lots of draws for clarity
post <- sample_nuts('lnpriors', cores=1,
                   init=function() c(1,1,1)) |>
  as.data.frame()

par(mfcol=c(2,3))
labs <- c('SS3', 'dlnorm', 'SS3 corrected')
logp <- seq(qnorm(.001, logmu, logsd), qnorm(.999, logmu,logsd), len=1000)
p <- exp(logp)
for(i in 1:3){
  hist(post[,i], freq=FALSE, breaks=30,
       main=paste(labs[i], 'natural space'), xlab='p')
  lines(p, dlnorm(p,logmu,logsd), lwd=2)
  hist(log(post[,i]), freq=FALSE, breaks=30,
       main=paste(labs[i], 'log space'), xlab='log p')
  lines(log(p), dnorm(log(p),logmu,logsd), lwd=2)
}

Rick-Methot-NOAA · 2024-01-31T22:57:14Z

closing this PR and redirecting discussion to #553

quang-huynh · 2024-02-01T22:49:25Z

The explanation from @Cole-Monnahan-NOAA is much better than my terse code!

On the math side, if we take the log of the lognormal probability density function and drop constants, then 0.5*square(log(x)-logmu)/square(sd) is fine as a likelihood (x is a data point and logmu is a model prediction), but the log(x) term can't be dropped if the pdf is used as a prior (x is estimated, logmu is fixed).

Rick-Methot-NOAA · 2024-02-01T23:01:42Z

I understand, but SS3 is rarely used in a Bayesian mode and the value of 0.5*square(log(parm)-logmu)/square(sd) is simply a component of the objective function. So, does the correct approach depend upon whether the model is operating in bayesian mode vs maximum posterior density mode? SS3 was designed as a MPD model and has never been thoroughly examined for bayesian use, as Cole (and Jim Thorson) has proposed for #553

quang-huynh · 2024-02-01T23:18:59Z

No, I don't think the approach changes with MPD vs. MCMC.

The current SS3 equation generates a larger prior mode and SD than what the user specifies (both my and Cole's figures show this). This has been my experience with a couple assessments where the posterior mean and MPD for natural mortality was much higher than the prior mean for what I thought would be a relatively uninformative data set.

x <- seq(0.01, 3, 0.01)
logmu <- 0
sd <- 1

# True mode of lognormal distribution
f_x <- dlnorm(x, logmu, sd, log = TRUE)
x[which.max(f_x)]

# Discrepancy in the mode of the distribution with SS3 lognormal prior
g_x <- -0.5 * (log(x) - logmu)^2/sd/sd
x[which.max(g_x)]

# Mode from proposed correction matches the answer
h_x <- -0.5 * (log(x) - logmu)^2/sd/sd - log(x)
x[which.max(h_x)]

plot(x, f_x - max(f_x), ylab = "Prior density function", xlab = "Parameter value", ylim = c(-2, 0))
lines(x, g_x - max(g_x), col = 2)
lines(x, h_x - max(h_x), col = 3)
legend('bottomright', c("Lognormal distribution", "Current SS3", "SS3 corrected"), col = 1:3, pch = c(1, NA, NA), lty = 1)

Cole-Monnahan-NOAA · 2024-02-02T00:59:27Z

I agree that it does not depend on how the objective function will be used. The fundamental problem is that it's putting a normal prior in log space, but the manual suggests it's a lognormal prior. So the user will get a different answer than the expect, even if they are optimizing. You can see this from my example where those independent parameters the MLE will just be the mode which you can visualize, and the current one (on the left) would give a biased MLE for the parameter without any data. I turned on estimation in the above example and got

# Number of parameters = 3 Objective function value = 1.97786721851170  Maximum gradient component = 4.38275630568530e-06
# p1:
3.00416521397
# p2:
0.856929171346
# p3:
0.856929162964

The mode of this distribution (meanlog=1.1, meansd=1.12) is exp(mu-sd^2) = 0.8569292. So you can see that the estimate will be off. It's not convening the right information in the prior that the user expects.

That's my interpretation at least.

Add log(Pval) term to lognormal prior

499fe67

e-perl-NOAA requested review from Rick-Methot-NOAA and iantaylor-NOAA January 31, 2024 15:35

Rick-Methot-NOAA reviewed Jan 31, 2024

View reviewed changes

Rick-Methot-NOAA mentioned this pull request Jan 31, 2024

[Feature]: Add full likelihood components for likelihoods, random effects and priors/penalties #553

Open

Rick-Methot-NOAA closed this Jan 31, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add log(Pval) term to lognormal prior #558

Add log(Pval) term to lognormal prior #558

quang-huynh commented Jan 31, 2024

Rick-Methot-NOAA left a comment

Cole-Monnahan-NOAA commented Jan 31, 2024

Rick-Methot-NOAA commented Jan 31, 2024

quang-huynh commented Feb 1, 2024

Rick-Methot-NOAA commented Feb 1, 2024

quang-huynh commented Feb 1, 2024

Cole-Monnahan-NOAA commented Feb 2, 2024 •

edited

Loading

Add log(Pval) term to lognormal prior #558

Add log(Pval) term to lognormal prior #558

Conversation

quang-huynh commented Jan 31, 2024

Concisely describe what has been changed/addressed in the pull request.

What tests have been done?

Where are the relevant files?

What tests/review still need to be done?

Is there an input change for users to Stock Synthesis?

Additional information (optional).

Rick-Methot-NOAA left a comment

Choose a reason for hiding this comment

Cole-Monnahan-NOAA commented Jan 31, 2024

Rick-Methot-NOAA commented Jan 31, 2024

quang-huynh commented Feb 1, 2024

Rick-Methot-NOAA commented Feb 1, 2024

quang-huynh commented Feb 1, 2024

Cole-Monnahan-NOAA commented Feb 2, 2024 • edited Loading

Cole-Monnahan-NOAA commented Feb 2, 2024 •

edited

Loading