diff --git a/src/reference-manual/transforms.qmd b/src/reference-manual/transforms.qmd index 85083fadd..9fa0e69b9 100644 --- a/src/reference-manual/transforms.qmd +++ b/src/reference-manual/transforms.qmd @@ -331,6 +331,7 @@ $$ The default value for the offset $\mu$ is $0$ and for the multiplier $\sigma$ is $1$ in case not both are specified. +For a container variable, the affine transform is applied to each element of that variable. ### Affine inverse transform {-} diff --git a/src/reference-manual/types.qmd b/src/reference-manual/types.qmd index 9e23f92e3..df4dd4b22 100644 --- a/src/reference-manual/types.qmd +++ b/src/reference-manual/types.qmd @@ -438,8 +438,8 @@ $1$ and multiplier $2$. real<offset=1, multiplier=2> x; ``` -As an example, we can give `x` a normal distribution with non-centered -parameterization as follows. +As an example, we can give `x` a normal distribution with non-centered parameterization. +In this program, the affine transform is applied to every element of vector `x`. ```stan parameters { @@ -450,7 +450,7 @@ model { } ``` -Recall that the centered parameterization is achieved with the code +Recall the Stan code for the centered parameterization of this model. ```stan parameters { @@ -461,17 +461,57 @@ model { } ``` -or equivalently +Adding the offset and multiplier transform results in the equivalent non-centered parameterization. ```stan parameters { - real x; + real<offset=mu, multiplier=sigma> x; } model { x ~ normal(mu, sigma); } ``` +Sampling is done on the unconstrained parameters. +After applying the affine transform, the unconstrained parameters are standard normal; +thus the above model is equivalent to the hand-coded non-centered parameterization. + +```stan +parameters { + real x_raw; +} +transformed parameters { + real x = mu + x_raw * sigma; +} +model { + x_raw ~ std_normal(); +} +``` + +Use of the affine transform removes the overhead of declaring additional transformed parameters +and directly expresses the hierarchical relationship between parameters.
+ +For a container variable, the affine transform is applied to each element of that variable. +As an example, the non-centered parameterization of Neal's Funnel in the +[Stan User's Guide reparameterization section](https://mc-stan.org/docs/stan-users-guide/reparameterization.html), +$$ +p(y,x) = \textsf{normal}(y \mid 0,3) \times \prod_{n=1}^9 +\textsf{normal}(x_n \mid 0,\exp(y/2)), +$$ +can be written as: + +```stan +parameters { + real y; + vector<multiplier=exp(0.5 * y)>[9] x; +} +model { + y ~ normal(0, 3); + x ~ normal(0, exp(0.5 * y)); +} +``` +where the affine transform is applied to every element of vector `x`. + ### Expressions as bounds and offset/multiplier {-} Bounds (and offset and multiplier) diff --git a/src/stan-users-guide/efficiency-tuning.qmd b/src/stan-users-guide/efficiency-tuning.qmd index af26da697..9af043392 100644 --- a/src/stan-users-guide/efficiency-tuning.qmd +++ b/src/stan-users-guide/efficiency-tuning.qmd @@ -266,8 +266,6 @@ funnel's neck is particularly sharp because of the exponential function applied to $y$. A plot of the log marginal density of $y$ and the first dimension $x_1$ is shown in the following plot. - - The funnel can be implemented directly in Stan as follows. ```stan @@ -295,13 +293,13 @@ inefficient in the body. This can be seen in the following plots. ![](img/funnel-fit.png) -Neal's funnel. (Left) The marginal density of Neal's funnel for the upper-level variable $y$ and one lower-level variable $x_1$ (see the text for the formula). The blue region has log density greater than -8, the yellow region density greater than -16, and the gray background a density less than -16. -(Right) 4000 draws are taken from a run of Stan's sampler with default settings. -Both plots are restricted to the shown window of $x_1$ and $y$ values; -some draws fell outside of the displayed area as would be expected given -the density.
The samples are consistent with the marginal density -$p(y) = \textsf{normal}(y \mid 0,3)$, which has mean 0 and standard -deviation 3. +Neal's funnel. (Left) The marginal density of Neal's funnel for the upper-level variable $y$ and +one lower-level variable $x_1$ (see the text for the formula). The blue region has log density +greater than -8, the yellow region density greater than -16, and the gray background a density less +than -16. (Right) 4000 draws are taken from a run of Stan's sampler with default settings. +Both plots are restricted to the shown window of $x_1$ and $y$ values; some draws fell outside of +the displayed area as would be expected given the density. The samples are consistent with the +marginal density $p(y) = \textsf{normal}(y \mid 0,3)$, which has mean 0 and standard deviation 3. ::: In this particular instance, because the analytic form of the density @@ -314,11 +312,8 @@ parameters { vector[9] x_raw; } transformed parameters { - real y; - vector[9] x; - - y = 3.0 * y_raw; - x = exp(y/2) * x_raw; + real y = 3.0 * y_raw; + vector[9] x = exp(0.5 * y) * x_raw; } model { y_raw ~ std_normal(); // implies y ~ normal(0, 3) @@ -327,14 +322,27 @@ model { ``` In this second model, the parameters `x_raw` and `y_raw` are -sampled as independent standard normals, which is easy for Stan. These -are then transformed into samples from the funnel. In this case, the -same transform may be used to define Monte Carlo samples directly -based on independent standard normal samples; Markov chain Monte Carlo -methods are not necessary. If such a reparameterization were used in -Stan code, it is useful to provide a comment indicating what the -distribution for the parameter implies for the distribution of the -transformed parameter. +sampled as independent standard normals, which is easy for Stan, +and then transformed into samples from the funnel. 
+When this transform is used in Stan code, a comment indicating what the +distribution for the parameter implies for the distribution of the transformed parameter +will improve readability and maintainability. + +As of Stan release v2.19.0, this program can be written using Stan's +[affinely transformed real type](https://mc-stan.org/docs/reference-manual/types.html#affine-transform.section). +The affine transform on the vector `x` is applied to each element of `x`. + +```stan +parameters { + real y; + vector<multiplier=exp(0.5 * y)>[9] x; +} +model { + y ~ normal(0, 3); + x ~ normal(0, exp(0.5 * y)); +} +``` + ### Reparameterizing the Cauchy {-}