add multiplier keyword to unit vector #3105

spinkney · 2024-09-20T09:05:47Z

In https://discourse.mc-stan.org/t/a-better-unit-vector/26989/30 Seth Axen lays out the reasoning for adding an additional parameter which repels values away from 0.

The current implementation is equivalent to

data {
  int<lower=0> N;
}
parameters {
  vector[N] u_raw;
}
transformed parameters {
  real r = dot_self(u_raw);
  vector[N] u = u_raw / sqrt(r);
}
model {
  target += -0.5 * r;
}

The proposal is to parameterize the unit vector as

data {
  int<lower=0> N;
}
parameters {
  vector[N] u_raw;
}
transformed parameters {
  real r = dot_self(u_raw);
  vector[N] u = u_raw / sqrt(r);
}
model {
 target += 0.5 * (a * log(r) - r);
}

where a >= 0 and is given as a multiplier keyword. As Seth notes in the post, it corresponds to the chi distribution with a + 1 degrees of freedom. The user would see:

unit_vector<multiplier=a>[N] u;

Setting a = 0 is equivalent to what is currently in Stan and will still be the default when the multiplier keyword is not used.

How to select `a`?

From the same thread the issue manifests with smaller dimension sizes of the unit vector. Heuristically, I'm finding that setting a = N^(6/N) where N is the dimension of the unit-vector works well.

         N a
 [1,]    1 1.000000
 [2,]    2 8.000000
 [3,]    3 9.000000
 [4,]    4 8.000000
 [5,]    5 6.898648
 [6,]    6 6.000000
 [7,]    7 5.301146
 [8,]    8 4.756828
 [9,]    9 4.326749
[10,]   10 3.981072

Here's a test model for 2d unit vectors where divergences occur in the current parameterization. When the data size, M, is small there are fewer divergences then when it's larger (M > 50).

data {
  int<lower=0> M;
  vector<lower=-pi(), upper=pi()>[M] y;
  real a;
}
parameters {
  vector[2] u_raw;
  real<lower=0> kappa;
}
transformed parameters {
  real r = dot_self(u_raw);
  vector[2] u = u_raw / sqrt(r);
  real mu = atan2(u[2], u[1]);
}
model {
  target += 0.5 * (a * log(r) - r);
  kappa ~ exponential(1);
  y ~ von_mises(mu, kappa);
}

The data for this can be created as

M <- 100
y <- runif(M, min = -pi/4, max = pi/4)
true_mean <- 0

mod_unit <- cmdstan_model("unit.stan")

mod_unit_out <- mod_unit$sample(
  data = list(M = M,
              y = y,
              a = 2^(6/2),
  seed = 2309423,
  parallel_chains = 4
)

The text was updated successfully, but these errors were encountered:

WardBrian · 2024-09-20T19:33:57Z

If I'm reading this correctly, I think using the keyword multiplier for this would be a mistake. I think that would be interpreted by most users as some sort of scaling on vector itself, such that e.g unit_vector<multiplier=4> would yield a vector that has norm 4

SteveBronder · 2024-09-20T20:38:06Z

From the comment

This repels y from y = 0, and the degree of repulsion can be tuned by the user by increasing a.

unit_vector<repulsion=...>?

From the code, a is added to the jacobian.
target += -r^2/2 + a * log(r);

Since it's not applied to the parameters idt, bias, scale, or multiplier fit right with the rest of the constrain keywords.

spinkney · 2024-09-21T18:12:34Z

I'm ok with it just being something simple like $k$ as in unit_vector<k=...> we note that this is related to the chisquare distribution degrees of freedom. The log-pdf of the chisquare distribution is

$$ \left(\frac{k}{2} - 1 \right) \log(x) - \frac{x}{2} + C $$

What I have above is 0.5 * (a * log(r) - r). Clearly, $r == x$ thus

$$ \begin{aligned} \frac{a}{2} &= \frac{k}{2} - 1 \\ k &= a + 2. \end{aligned} $$

When a = 0 this corresponds to 2 degrees of freedom, ie the sum of the square of two standard normal variates. When k is allowed to be non-integer values then this is a gamma RV with

$$ f(x) = \Gamma \left( \frac{k}{2}, \frac{1}{2} \right) $$

The mean of this is $\mu = k / 4$ and the variance is $\sigma^2 = k / 8$. As $k$ increases both the mean and the variance increase. But this doesn't effect the estimate of the unit_vector (unless sampling just fails because we're at the singularity).

spinkney assigned SteveBronder Sep 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add multiplier keyword to unit vector #3105

add multiplier keyword to unit vector #3105

spinkney commented Sep 20, 2024 •

edited

Loading

WardBrian commented Sep 20, 2024

SteveBronder commented Sep 20, 2024 •

edited

Loading

spinkney commented Sep 21, 2024

add multiplier keyword to unit vector #3105

add multiplier keyword to unit vector #3105

Comments

spinkney commented Sep 20, 2024 • edited Loading

How to select a?

WardBrian commented Sep 20, 2024

SteveBronder commented Sep 20, 2024 • edited Loading

spinkney commented Sep 21, 2024

spinkney commented Sep 20, 2024 •

edited

Loading

How to select `a`?

SteveBronder commented Sep 20, 2024 •

edited

Loading