
Allowing _lp in the transformed parameters block and jacobian in the model block #1482

Open
spinkney opened this issue Dec 18, 2024 · 8 comments
Labels
feature New feature or request

Comments

@spinkney (Contributor)

PR #979 added jacobian += and the _jacobian function suffix to indicate that a change-of-variables correction is being added for a transformed parameter. The jacobian functionality can only be used in the transformed parameters block. That same PR also introduced a warning that _lp functions will no longer be allowed in the transformed parameters block and will only be supported in the model block. I think the jacobian functionality is useful, but I believe restricting _lp to the model block is a step backwards for Stan.

With #979 merged, which users will first see in CmdStan 2.36, any user-defined function ending in _lp that is used in the transformed parameters block triggers a warning that this will not be allowed in future releases. The initial reasoning in the #979 discussion was that allowing functions to access the target was a bug in stanc2 that we continued to support in stanc3; the merged PR is the "fix". However, there are times where it is useful both to add the Jacobian correction and to add a log density. I don't believe this is widespread, but I encounter it with matrix parameters. For example, below I show the C-vine parameterization of a correlation matrix, using both the jacobian and target keywords. I'm not sure this will compile, because I believe the function would need to end in _jacobian; I have been using target everywhere jacobian now appears, which does compile in previous versions.

  vector lb_ub_lp(vector y, real lb, real ub) {
    target += log(ub - lb) + log_inv_logit(y) + log1m_inv_logit(y);
    return lb + (ub - lb) * inv_logit(y);
  }
  
  real lb_ub_lp(real y, real lb, real ub) {
    target += log(ub - lb) + log_inv_logit(y) + log1m_inv_logit(y);
    return lb + (ub - lb) * inv_logit(y);
  }
  
  matrix corr_constrain_cvine_lp(vector col_one_raw, vector off_raw, real eta) {
    int K = num_elements(col_one_raw) + 1;
    vector[K - 1] z = lb_ub_lp(col_one_raw, -1, 1);
    matrix[K, K] C = identity_matrix(K);
    matrix[K, K] P;
    int cnt = 1;
    C[2:K, 1] = z[1:K - 1];
    C[1, 2:K] = C[2:K, 1]';
    P = C;
    
    target += beta_lpdf(0.5 * (z + 1) | eta + 0.5 * (K - 1), eta + 0.5 * (K - 1));
    jacobian += -0.5 * log1m(P[1, 2:K] .^ 2);
    
    for (i in 2:K - 1) {
      for (j in i + 1:K) {
        P[i, j] = lb_ub_lp(off_raw[cnt], -1, 1);
        cnt += 1;
        real tem = P[i, j];
        jacobian += -0.5 * i * log1m(P[i, j] ^ 2);
        target += beta_lpdf(0.5 * (P[i, j] + 1) | eta + 0.5 * (K - 1), eta + 0.5 * (K - 1));
        int m = i;
        for (k in 1:i - 1) {
          m += -1;
          tem = P[m, i] * P[m, j] + tem * sqrt((1 - P[m, i] ^ 2) * (1 - P[m, j] ^ 2));
        }
        
        C[i, j] = tem;
        C[j, i] = C[i, j];
      }
    }
    return C;
  }

The first thing to note is that I often nest _lp functions: here lb_ub_lp is called within the corr_constrain_cvine_lp function. The next thing is that I'm traversing the off-diagonal part of the matrix and adding a beta prior to a transformed parameter created inside this function. However, it is an intermediate parameter used to calculate the final correlation matrix, C.
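As a side note for anyone reading along, the log-density increment inside lb_ub_lp is exactly the log-Jacobian of the lower/upper-bound transform. That can be checked numerically outside Stan; here is a minimal Python sketch (function names are mine, not Stan's) comparing the closed form against a finite-difference derivative of the forward transform:

```python
import math

def lb_ub(y, lb, ub):
    # forward transform: unconstrained y -> (lb, ub)
    return lb + (ub - lb) / (1.0 + math.exp(-y))

def lb_ub_log_jacobian(y, lb, ub):
    # log(ub - lb) + log_inv_logit(y) + log1m_inv_logit(y)
    return math.log(ub - lb) - math.log1p(math.exp(-y)) - math.log1p(math.exp(y))

y, lb, ub = 0.7, -1.0, 1.0
h = 1e-6
# central finite difference of the forward transform
fd = math.log((lb_ub(y + h, lb, ub) - lb_ub(y - h, lb, ub)) / (2 * h))
exact = lb_ub_log_jacobian(y, lb, ub)
```

The two values agree to finite-difference precision, which is all the _lp increment in lb_ub_lp is doing, element by element.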

If the deprecation goes through, I will need two functions: one for the jacobian that constructs C while also returning a tuple of matrices for both P and C. Then in the model block I'll have to loop through P to increment the log density by the beta_lpdf. I don't even need P, and I definitely don't want to do the double loop again!

The current _lp behavior works great. Let's keep it and, while we're at it, make the language a bit more flexible by allowing users to use the jacobian keyword in the model block.

@spinkney spinkney added the feature New feature or request label Dec 18, 2024
@WardBrian (Member)

I think walking back the deprecation is fine. I'd like to allow target+= directly in transformed parameters if we did, for consistency.

I think your function is interesting, because ideally it would contain both target += statements and jacobian += statements, but right now there is no function (or block) that allows both of those. So you're forced to split it up, or, if we walk back the deprecation, only use target +=.

So, it might also be worth considering if we should have some way of writing a function that allows both.

@spinkney (Contributor, Author)

Having the option to do both is ideal. With a target +=-only option, users wouldn't be able to inspect the Jacobian or turn it off in the function above. Keeping _lp in the function name while also being able to call jacobian += makes the most sense to me. Not sure how difficult that is to support, though.

@WardBrian (Member)

Relatively easy. I've also wondered if _target should be an option, since we've dropped the term lp basically everywhere else above the C++ layer.

@nhuurre (Collaborator)

nhuurre commented Dec 18, 2024

I'd like to allow target+= directly in transformed parameters if we did, for consistency.

So is the idea to just allow both jacobian and target in the transformed parameters block, the model block, and _target (or _lp?) function bodies? At that point we might as well deprecate both _lp and _jacobian functions in favor of using _target functions for everything. Deprecating _jacobian functions one release after they were added would be kind of funny; the design doc did not discuss whether jacobian += should be allowed in _lp functions.

@WardBrian (Member)

At that point we might as well deprecate both _lp and _jacobian functions in favor of using _target functions for everything

Maybe? It still seems useful from some perspective to have lesser-powered versions of these things floating around. I think they also end up code-genned differently than each other due to the extra Jacobian template bool.

@bob-carpenter (Contributor)

We need jacobian += to be separated out if we want to be able to recreate the behavior of our built-in transforms. The Jacobian gets turned on or off depending on whether you are doing an MLE or a MAP of the unconstrained density used for sampling. That's the only difference of which I'm aware in our current usage. As @WardBrian points out, this shows up in a template parameter in the generated C++ code (this comment is for everyone else following along---I'm sure everyone commenting so far knows this!).

@spinkney: In your example, why are you incrementing target and jacobian? Aren't both part of the change-of-variables adjustment? I know part of it's not technically a Jacobian, but think of that as shorthand for "change of variables adjustment" as it's what gets turned off/on during optimization.
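For readers following along, the on/off behavior described above can be illustrated with a toy example outside Stan. A minimal Python sketch, under assumed choices of mine (a Gamma(2, 1) density on x with a log transform x = exp(y), optimized by brute-force grid search), shows that including the log-Jacobian term moves the optimum:

```python
import math

def lp(y, include_jacobian):
    # unnormalized log density of x ~ Gamma(2, 1) at x = exp(y):
    # log p(x) = log(x) - x  (dropping constants)
    x = math.exp(y)
    out = math.log(x) - x
    if include_jacobian:
        out += y  # log |dx/dy| = y for x = exp(y)
    return out

def argmax(f, lo=-5.0, hi=5.0, n=200001):
    # crude grid search, good enough for a 1-D toy example
    best_y, best_v = lo, f(lo)
    for k in range(1, n):
        y = lo + (hi - lo) * k / (n - 1)
        v = f(y)
        if v > best_v:
            best_y, best_v = y, v
    return best_y

mle_x = math.exp(argmax(lambda y: lp(y, False)))  # without Jacobian: x ~= 1
map_x = math.exp(argmax(lambda y: lp(y, True)))   # with Jacobian: x ~= 2
```

Without the Jacobian term the optimum is the mode of the density on the constrained scale (the MLE-style answer); with it, the optimum is the mode on the unconstrained scale, which is exactly what the extra Jacobian template bool toggles.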

@spinkney (Contributor, Author)

@spinkney: In your example, why are you incrementing target and jacobian? Aren't both part of the change-of-variables adjustment? I know part of it's not technically a Jacobian, but think of that as shorthand for "change of variables adjustment" as it's what gets turned off/on during optimization.

The above is the same as putting a uniform prior on a correlation matrix, that is, an LKJ(1). All the math in the LKJ paper shows that these terms simplify to powers of the diagonal of the Cholesky factor. However, I've written the correlation constraint in this form because I may have knowledge about some of the correlations that deviates from the uniform! In that case the form does not simplify, so I need P. Wouldn't you still want that information conveyed in optimization, even though it's not a Jacobian? Then I think one may want to turn off the "jacobian" part but keep the prior part.
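For anyone wanting to see why P matters here: the recursion in the Stan snippet maps C-vine partial correlations to an ordinary correlation matrix, and any partial correlations in (-1, 1) yield a valid (positive-definite, unit-diagonal) result. A small Python sketch of the same recursion (0-based indices, function names are mine):

```python
import math
import random

def cvine_corr(P):
    # Rebuild correlation matrix C from partial correlations P[i][j], i < j,
    # mirroring the recursion in the Stan snippet above (0-based here).
    K = len(P)
    C = [[1.0 if i == j else 0.0 for j in range(K)] for i in range(K)]
    for j in range(1, K):  # first row/column come directly from P
        C[0][j] = C[j][0] = P[0][j]
    for i in range(1, K - 1):
        for j in range(i + 1, K):
            tem = P[i][j]
            for m in range(i - 1, -1, -1):
                tem = (P[m][i] * P[m][j]
                       + tem * math.sqrt((1 - P[m][i] ** 2) * (1 - P[m][j] ** 2)))
            C[i][j] = C[j][i] = tem
    return C

def is_pos_def(A):
    # Cholesky factorization succeeds iff A is positive definite.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0:
                    return False
                L[i][i] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return True

random.seed(1)
K = 4
P = [[0.0] * K for _ in range(K)]
for i in range(K):
    for j in range(i + 1, K):
        P[i][j] = random.uniform(-0.9, 0.9)

C = cvine_corr(P)
```

The P entries are the natural place to put informative, non-uniform priors on individual (partial) correlations, which is why splitting the prior increments away from the Jacobian increments forces the double loop to be repeated.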

@bob-carpenter (Contributor)

I'm not following the details, but the question to ask is what the MLE would be. In the other cases, it's clear we want to drop all the stuff that leads to the distribution being uniform on the unconstrained scale as that monkeys with the underlying density being optimized and pulls it away from the pure likelihood.
