diff --git a/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc b/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc
index 1af6f7a72de88..7be806575e41f 100644
--- a/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc
+++ b/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc
@@ -401,9 +401,14 @@ of the link:sycl_ext_intel_matrix.asciidoc[sycl_ext_intel_matrix]
 
 Besides the `Group` and the `joint_matrix` arguments,
 `joint_matrix_apply` takes a C++ Callable object which is invoked once
-for each element of the matrix. This callable object must be invocable
-with a single parameter of type `T&`. Commonly, applications pass a
-lambda expression.
+for each element of the matrix. There are two cases: (1) one matrix is
+passed, (2) two matrices are passed.
+
+===== Unary Operation
+In this case, `joint_matrix_apply` takes one `joint_matrix`
+argument. The callable object must be invocable with a single
+parameter of type `T&`. Commonly, applications pass a lambda
+expression.
 
 ```c++
 namespace sycl::ext::oneapi::experimental::matrix {
@@ -427,6 +432,39 @@ joint_matrix_apply(sg, C, [=](T &x) {
 });
 ```
 
+===== Binary Operation
+In this case, `joint_matrix_apply` takes two `joint_matrix` arguments:
+`jm0` and `jm1` that have the same `use`, number of rows, number of
+columns, and `layout`. `jm0` and `jm1` can be read-only, write-only,
+or read-write arguments. The callable object must be invocable
+with two parameters `x` and `y` of types `T0&` and `T1&`, where `x` is
+an element from `jm0` and `y` is an element from `jm1`. Moreover, `x`
+and `y` are guaranteed to have identical coordinates in their
+respective matrices. Commonly, applications pass a lambda expression.
+
+```c++
+namespace sycl::ext::oneapi::experimental::matrix {
+
+template <typename Group, typename T0, typename T1, use Use, size_t Rows,
+          size_t Cols, layout Layout, typename F>
+void joint_matrix_apply(Group g,
+                        joint_matrix<Group, T0, Use, Rows, Cols, Layout>& jm0,
+                        joint_matrix<Group, T1, Use, Rows, Cols, Layout>& jm1,
+                        F&& func);
+
+} // namespace sycl::ext::oneapi::experimental::matrix
+```
+
+In the following example, every element `x` of the matrix `C` is
+multiplied by `alpha`. The result is written to the element `y` of
+the matrix `D`.
+
+```c++
+joint_matrix_apply(sg, C, D, [=](const T &x, T &y) {
+  y = x * alpha;
+});
+```
+
 ==== Prefetch
 
 ```c++