Merge pull request #807 from stan-dev/feature/row-stochastic-matrix

row/col stochastic matrix documentation
stan-dev · Dec 10, 2024 · 1782f1a · 1782f1a
2 parents 74557ef + f49798c
commit 1782f1a

Showing 2 changed files with 143 additions and 0 deletions.
diff --git a/src/reference-manual/transforms.qmd b/src/reference-manual/transforms.qmd
@@ -754,6 +754,72 @@ z_k
 .
 $$
 
+## Stochastic Matrix {#stochastic-matrix-transform.section}
+
+The `column_stochastic_matrix[N, M]` and `row_stochastic_matrix[N, M]` types in
+Stan represent an \(N \times M\) matrix in which each column (row) is a unit
+simplex with \(N\) (\(M\)) entries. In other words, each column (row) of the
+matrix is a vector constrained to have non-negative entries that sum to one.
+
+### Definition of a Stochastic Matrix {-}
+
+A column stochastic matrix \(X \in \mathbb{R}^{N \times M}\) is defined such
+that each column is a simplex. For column \(m\) (where \(1 \leq m \leq M\)):
+
+$$
+X_{n, m} \geq 0 \quad \text{for } 1 \leq n \leq N,
+$$
+
+and
+
+$$
+\sum_{n=1}^N X_{n, m} = 1.
+$$
+
+A row stochastic matrix is any matrix whose transpose is a column stochastic matrix
+(i.e., the rows of the matrix are simplexes). For row \(n\) (where \(1 \leq n \leq N\)):
+
+
+$$
+X_{n, m} \geq 0 \quad \text{for } 1 \leq m \leq M,
+$$
+
+and
+
+$$
+\sum_{m=1}^M X_{n, m} = 1.
+$$
+
+This definition ensures that each column (row) of the matrix \(X\) lies on the
+\((N-1)\)-dimensional (\((M-1)\)-dimensional) unit simplex, like a `simplex[N]`
+(`simplex[M]`) variable, but extended across multiple columns (rows).
+
+### Inverse Transform for Stochastic Matrix {-}
+
+For column and row stochastic matrices, the inverse transform is the same
+as for a simplex, but applied to each column (row).
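+
+As a minimal sketch of what "applied to each column" means, the following
+user-defined function builds a column stochastic matrix from an unconstrained
+\((N-1) \times M\) matrix. It assumes the stick-breaking form of the simplex
+inverse transform, which is the form the Jacobian in this section refers to.
+The function name is hypothetical and is not part of Stan's library.
+
+```stan
+functions {
+  // Hypothetical helper, for illustration only.
+  // Maps an unconstrained (N-1) x M matrix y to an N x M matrix whose
+  // columns are simplexes, assuming stick-breaking on each column.
+  matrix col_stochastic_from_unconstrained(matrix y) {
+    int N = rows(y) + 1;
+    int M = cols(y);
+    matrix[N, M] x;
+    for (m in 1:M) {
+      real stick = 1;  // mass of column m not yet assigned
+      for (n in 1:(N - 1)) {
+        // fraction of the remaining stick given to entry n
+        real z = inv_logit(y[n, m] - log(N - n));
+        x[n, m] = stick * z;
+        stick -= x[n, m];
+      }
+      x[N, m] = stick;  // last entry takes whatever mass is left
+    }
+    return x;
+  }
+}
+```
+
+A row stochastic matrix could be obtained the same way by running the inner
+loop along each row instead of each column.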
+
+### Absolute Jacobian Determinant for the Inverse Transform {-}
+
+The Jacobian determinant of the inverse transform for each column \(m\) in
+the matrix is given by the product of the diagonal entries \(J_{n, m}\) of
+the lower-triangular Jacobian matrix. This determinant is calculated as:
+
+$$
+\left| \det J_m \right| = \prod_{n=1}^{N-1} \left( z_{n, m} (1 - z_{n, m}) \left( 1 - \sum_{n'=1}^{n-1} X_{n', m} \right) \right).
+$$
+
+Thus, the overall Jacobian determinant for a `column_stochastic_matrix` is the product of the
+determinants for each column (for a `row_stochastic_matrix`, the product runs over the \(N\) rows instead):
+
+$$
+\left| \det J \right| = \prod_{m=1}^{M} \left| \det J_m \right|.
+$$
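+
+Because the determinant is a product over entries and columns, its logarithm
+is a plain sum. The sketch below accumulates that sum for the column case,
+again assuming the stick-breaking form of the simplex transform. The function
+name is hypothetical; it deliberately avoids Stan's reserved function suffixes.
+
+```stan
+functions {
+  // Hypothetical helper, for illustration only.
+  // Log absolute Jacobian determinant of the column-wise stick-breaking
+  // map from an unconstrained (N-1) x M matrix y to an N x M matrix X.
+  real col_stochastic_log_abs_det(matrix y) {
+    int N = rows(y) + 1;
+    int M = cols(y);
+    real lad = 0;
+    for (m in 1:M) {
+      real stick = 1;  // 1 minus the sum of X[1:(n - 1), m]
+      for (n in 1:(N - 1)) {
+        real z = inv_logit(y[n, m] - log(N - n));
+        // log of z * (1 - z) * (remaining stick) for this entry
+        lad += log(z) + log1m(z) + log(stick);
+        stick -= stick * z;
+      }
+    }
+    return lad;
+  }
+}
+```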
+
+### Transform for Stochastic Matrix {-}
+
+For column and row stochastic matrices, the transform is the same
+as for a simplex, but applied to each column (row).
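+
+For illustration, the sketch below maps a column stochastic matrix back to
+unconstrained values one column at a time. It assumes the stick-breaking
+simplex transform and strictly positive entries; the function name is
+hypothetical and is not part of Stan's library.
+
+```stan
+functions {
+  // Hypothetical helper, for illustration only.
+  // Maps an N x M column stochastic matrix x to an unconstrained
+  // (N-1) x M matrix by inverting stick-breaking on each column.
+  matrix col_stochastic_to_unconstrained(matrix x) {
+    int N = rows(x);
+    int M = cols(x);
+    matrix[N - 1, M] y;
+    for (m in 1:M) {
+      real stick = 1;  // mass of column m not yet accounted for
+      for (n in 1:(N - 1)) {
+        // fraction of the remaining stick taken by entry n
+        real z = x[n, m] / stick;
+        y[n, m] = logit(z) + log(N - n);
+        stick -= x[n, m];
+      }
+    }
+    return y;
+  }
+}
+```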
 
 ## Unit vector {#unit-vector.section}
 

diff --git a/src/reference-manual/types.qmd b/src/reference-manual/types.qmd
@@ -674,6 +674,83 @@ iterations, and in either case, with less dispersed parameter
 initialization or custom initialization if there are informative
 priors for some parameters.
 
+### Stochastic Matrices {-}
+
+A stochastic matrix is a matrix where each column or row is a
+unit simplex, meaning that each column (row) vector has non-negative
+values that sum to 1. The following example is a \(3 \times 4\)
+column-stochastic matrix.
+
+$$
+\begin{bmatrix}
+0.2 & 0.5 & 0.1 & 0.3 \\
+0.3 & 0.3 & 0.6 & 0.4 \\
+0.5 & 0.2 & 0.3 & 0.3
+\end{bmatrix}
+$$
+
+An example of a \(3 \times 4\) row-stochastic matrix is the following.
+
+$$
+\begin{bmatrix}
+0.2 & 0.5 & 0.1 & 0.2 \\
+0.2 & 0.1 & 0.6 & 0.1 \\
+0.5 & 0.2 & 0.2 & 0.1
+\end{bmatrix}
+$$
+
+
+In the examples above, each column (or row) sums to 1, making the matrices valid
+values for the `column_stochastic_matrix` and `row_stochastic_matrix` types, respectively.
+
+Stochastic matrices are often used in models where
+each column (row) represents a probability distribution across a
+set of categories, such as in multiple multinomial distributions,
+factor models, transition matrices in Markov models,
+or compositional data analysis.
+They can also be used in situations where you need multiple simplexes
+of the same dimensionality.
+
+The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared
+with row and column sizes. For instance, a matrix `theta` with
+3 rows and 4 columns, where each
+column is a 3-simplex, is declared like a matrix with 3 rows and 4 columns.
+
+```stan
+column_stochastic_matrix[3, 4] theta;
+```
+
+A matrix `theta` with 3 rows and 4 columns, where each row is a 4-simplex,
+is similarly declared as a matrix with 3 rows and 4 columns.
+
+```stan
+row_stochastic_matrix[3, 4] theta;
+```
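+
+As a sketch of the Markov-model use mentioned above, the following program
+estimates a transition matrix from one observed state sequence. The variable
+names (`K`, `T`, `s`, `Gamma`) are illustrative assumptions, and no prior is
+placed on `Gamma`, so each row is implicitly uniform over its simplex.
+
+```stan
+data {
+  int<lower=1> K;                     // number of states
+  int<lower=2> T;                     // length of the sequence
+  array[T] int<lower=1, upper=K> s;   // observed states
+}
+parameters {
+  // Gamma[i, j] is the probability of moving from state i to state j,
+  // so each row must be a K-simplex.
+  row_stochastic_matrix[K, K] Gamma;
+}
+model {
+  for (t in 2:T) {
+    // A single index on a matrix returns a row vector; transpose it
+    // because categorical expects a (column) vector of probabilities.
+    s[t] ~ categorical(Gamma[s[t - 1]]');
+  }
+}
+```
+
+Declaring `Gamma` as a `column_stochastic_matrix[K, K]` instead would make each
+column a simplex, which corresponds to the transposed convention.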
+
+As with simplexes, `column_stochastic_matrix` and `row_stochastic_matrix`
+variables are subject to validation, ensuring that each column (row)
+satisfies the simplex constraints. This validation accounts for
+floating-point imprecision, with checks performed up to a statically
+specified accuracy threshold \(\epsilon\).
+
+#### Stability Considerations {-}
+
+In high-dimensional settings, `column_stochastic_matrix` and `row_stochastic_matrix`
+types may require careful tuning of the inference
+algorithms. To ensure stability:
+
+- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC),
+smaller step sizes can help maintain stability, especially in high dimensions.
+- **Higher Target Acceptance Rates:** Setting higher target acceptance
+rates can improve the robustness of the sampling process.
+- **Longer Warmup Periods:** Increasing the warmup period gives the sampler
+more iterations to adapt to the posterior before the actual sampling begins.
+- **Tighter Optimization Tolerances:** For optimization-based inference,
+tighter tolerances with more iterations can yield more accurate results.
+- **Custom Initialization:** If prior information about the parameters is
+available, custom initialization or less dispersed initialization can lead
+to more efficient inference.
+
 ### Unit vectors {-}
 
 A unit vector is a vector with a norm of one.  For instance, $[0.5,