Merge pull request #807 from stan-dev/feature/row-stochastic-matrix
row/col stochastic matrix documentation
WardBrian authored Dec 10, 2024
2 parents 74557ef + f49798c commit 1782f1a
Showing 2 changed files with 143 additions and 0 deletions.
66 changes: 66 additions & 0 deletions src/reference-manual/transforms.qmd
@@ -754,6 +754,72 @@ z_k
.
$$

## Stochastic Matrix {#stochastic-matrix-transform.section}

The `column_stochastic_matrix[N, M]` and `row_stochastic_matrix[N, M]` types in
Stan represent an \(N \times M\) matrix where each column is a unit simplex
of dimension \(N\) (respectively, each row is a unit simplex of dimension \(M\)).
In other words, each column (row) of the matrix is a vector
constrained to have non-negative entries that sum to one.

### Definition of a Stochastic Matrix {-}

A column stochastic matrix \(X \in \mathbb{R}^{N \times M}\) is defined such
that each column is a simplex. For column \(m\) (where \(1 \leq m \leq M\)):

$$
X_{n, m} \geq 0 \quad \text{for } 1 \leq n \leq N,
$$

and

$$
\sum_{n=1}^N X_{n, m} = 1.
$$

A row stochastic matrix is any matrix whose transpose is a column stochastic matrix
(i.e., the rows of the matrix are simplexes). For row \(n\) (where \(1 \leq n \leq N\)):

$$
X_{n, m} \geq 0 \quad \text{for } 1 \leq m \leq M,
$$

and

$$
\sum_{m=1}^M X_{n, m} = 1.
$$

This definition ensures that each column of a column stochastic matrix lies on the
\(N-1\) dimensional unit simplex (and each row of a row stochastic matrix lies on the
\(M-1\) dimensional unit simplex), similar to the `simplex[N]` type, but
extended across multiple columns (rows).

### Inverse Transform for Stochastic Matrix {-}

For column and row stochastic matrices, the inverse transform is the same
as for a simplex, but applied independently to each column (row).
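
The column-wise application can be sketched as user-defined Stan functions. This is a minimal illustration, not the built-in implementation: it assumes the simplex inverse transform is the stick-breaking construction from the simplex section, with intermediate variables \(z_n = \operatorname{logit}^{-1}(y_n - \log(N - n))\). The function names `stick_breaking` and `column_stochastic_from_unconstrained` are hypothetical and exist only for this example.

```stan
functions {
  // Assumed stick-breaking map: unconstrained (N-1)-vector -> N-simplex.
  vector stick_breaking(vector y) {
    int N = rows(y) + 1;
    vector[N] x;
    real remaining = 1;  // length of the stick not yet allocated
    for (n in 1:(N - 1)) {
      // intermediate variable z_n in (0, 1)
      real z = inv_logit(y[n] - log(N - n));
      x[n] = remaining * z;
      remaining -= x[n];
    }
    x[N] = remaining;  // last entry takes whatever is left
    return x;
  }

  // Apply the simplex inverse transform to each column of an
  // (N-1) x M unconstrained matrix, giving an N x M matrix
  // whose columns are simplexes.
  matrix column_stochastic_from_unconstrained(matrix Y) {
    int N = rows(Y) + 1;
    int M = cols(Y);
    matrix[N, M] X;
    for (m in 1:M) {
      X[:, m] = stick_breaking(Y[:, m]);
    }
    return X;
  }
}
```

Under these assumptions, an all-zero unconstrained matrix maps to a matrix whose entries all equal \(1/N\). A row stochastic matrix would be handled the same way with rows in place of columns. In a Stan program none of this needs to be written by hand, since declaring a parameter with a stochastic matrix type applies the transform automatically.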

### Absolute Jacobian Determinant for the Inverse Transform {-}

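The Jacobian determinant of the inverse transform for each column \(m\) of
a column stochastic matrix is given by the product of the diagonal entries of
the lower-triangular Jacobian matrix \(J_m\). Writing \(z_{n, m}\) for the
intermediate variable \(z_n\) of the simplex inverse transform applied to
column \(m\), this determinant is calculated as: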

$$
\left| \det J_m \right| = \prod_{n=1}^{N-1} \left( z_{n, m} (1 - z_{n, m}) \left( 1 - \sum_{n'=1}^{n-1} X_{n', m} \right) \right).
$$

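Thus, the overall Jacobian determinant for an entire `column_stochastic_matrix`
is the product of the determinants for each column:

$$
\left| \det J \right| = \prod_{m=1}^{M} \left| \det J_m \right|.
$$

For a `row_stochastic_matrix`, the same result holds with the roles of rows and
columns exchanged: the product runs over the \(N\) rows, each of which is an
\(M\)-simplex.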
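
As a worked instance of the per-column formula, take \(N = 3\). For \(n = 1\) the inner sum is empty, so the last factor equals one, and the determinant for column \(m\) reduces to

$$
\left| \det J_m \right|
= z_{1, m} (1 - z_{1, m}) \cdot z_{2, m} (1 - z_{2, m}) (1 - X_{1, m}).
$$

Because the overall determinant is a product over columns, the log Jacobian is a double sum over columns and over the first \(N - 1\) entries of each column:

$$
\log \left| \det J \right|
= \sum_{m=1}^{M} \sum_{n=1}^{N-1}
\left[ \log z_{n, m} + \log (1 - z_{n, m})
+ \log \left( 1 - \sum_{n'=1}^{n-1} X_{n', m} \right) \right].
$$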

### Transform for Stochastic Matrix {-}

For column and row stochastic matrices, the transform is the same
as for a simplex, but applied independently to each column (row).

## Unit vector {#unit-vector.section}

77 changes: 77 additions & 0 deletions src/reference-manual/types.qmd
Original file line number Diff line number Diff line change
@@ -674,6 +674,83 @@ iterations, and in either case, with less dispersed parameter
initialization or custom initialization if there are informative
priors for some parameters.

### Stochastic Matrices {-}

A stochastic matrix is a matrix where each column or row is a
unit simplex, meaning that each column (row) vector has non-negative
values that sum to 1. The following example is a \(3 \times 4\)
column-stochastic matrix.

$$
\begin{bmatrix}
0.2 & 0.5 & 0.1 & 0.3 \\
0.3 & 0.3 & 0.6 & 0.4 \\
0.5 & 0.2 & 0.3 & 0.3
\end{bmatrix}
$$

An example of a \(3 \times 4\) row-stochastic matrix is the following.

$$
\begin{bmatrix}
0.2 & 0.5 & 0.1 & 0.2 \\
0.2 & 0.1 & 0.6 & 0.1 \\
0.5 & 0.2 & 0.2 & 0.1
\end{bmatrix}
$$


In the examples above, each column (or row) sums to 1, making the matrices
valid `column_stochastic_matrix` and `row_stochastic_matrix` types.

Column-stochastic matrices are often used in models where
each column represents a probability distribution across a
set of categories such as in multiple multinomial distributions,
factor models, transition matrices in Markov models,
or compositional data analysis.
They can also be used in situations where you need multiple simplexes
of the same dimensionality.

The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared
with a row size and a column size, in the same way as an unconstrained matrix.
For instance, a matrix `theta` with 3 rows and 4 columns, where each
column is a 3-simplex, is declared as follows.

```stan
column_stochastic_matrix[3, 4] theta;
```

A matrix `theta` with 3 rows and 4 columns, where each row is a 4-simplex,
is declared with the same sizes but the row-stochastic type.

```stan
row_stochastic_matrix[3, 4] theta;
```
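
To show such a declaration in context, the following sketch fits several multinomial distributions at once, with one simplex of category probabilities per column. It is an illustration under assumed names: the data variables `N`, `M`, and `counts` and the uniform Dirichlet prior are choices made for this example, not part of the type's definition.

```stan
data {
  int<lower=1> N;                    // number of categories
  int<lower=1> M;                    // number of multinomial distributions
  array[M, N] int<lower=0> counts;   // counts[m] holds the N counts for distribution m
}
parameters {
  // each of the M columns is an N-simplex of category probabilities
  column_stochastic_matrix[N, M] theta;
}
model {
  for (m in 1:M) {
    // illustrative uniform prior over each column's simplex
    theta[:, m] ~ dirichlet(rep_vector(1, N));
    counts[m] ~ multinomial(theta[:, m]);
  }
}
```

Indexing with `theta[:, m]` extracts column `m` as a vector, which is the argument type that `dirichlet` and `multinomial` expect. With a `row_stochastic_matrix`, a row `theta[n]` is a row vector and would need to be transposed before being passed to these distributions.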

As with simplexes, `column_stochastic_matrix` and `row_stochastic_matrix`
variables are subject to validation, ensuring that each column (row)
satisfies the simplex constraints. This validation accounts for
floating-point imprecision, with checks performed up to a statically
specified accuracy threshold \(\epsilon\).

#### Stability Considerations {-}

In high-dimensional settings, `column_stochastic_matrix` and `row_stochastic_matrix`
types may require careful tuning of the inference
algorithms. To ensure stability:

- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC),
smaller step sizes can help maintain stability, especially in high dimensions.
- **Higher Target Acceptance Rates:** Setting higher target acceptance
rates can improve the robustness of the sampling process.
- **Longer Warmup Periods:** Increasing the warmup period gives the sampler
more iterations to adapt before the actual sampling begins.
- **Tighter Optimization Tolerances:** For optimization-based inference,
tighter tolerances with more iterations can yield more accurate results.
- **Custom Initialization:** If prior information about the parameters is
available, custom initialization or less dispersed initialization can lead
to more efficient inference.

### Unit vectors {-}

A unit vector is a vector with a norm of one. For instance, $[0.5,