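This code demonstrates how to use the cuBLAS hpr2 function to compute a packed Hermitian rank-2 update,
A = alpha * x * y^H + conj(alpha) * y * x^H + A, with the following inputs:
A = | 1.1 + 1.2j | 2.3 + 2.4j |
    | 3.5 + 3.6j | 4.7 + 4.8j |
x = | 5.1 + 6.2j | 7.3 + 8.4j |
y = | 1.1 + 2.2j | 3.3 + 4.4j |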
See documentation for further details.
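The numbers in the sample output can be cross-checked with a small CPU reference. The sketch below is not the sample's source. The names `hpr2_upper_reference` and `hpr2_sample_result` are hypothetical, and three things are assumptions chosen because they reproduce the printed result: upper fill mode, alpha = 1.0 + 1.0j, and 2x2 blocks printed column by column (so the stored order of AP is 1.1 + 1.2j, 3.5 + 3.6j, 2.3 + 2.4j, 4.7 + 4.8j).

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// CPU reference for a packed Hermitian rank-2 update in upper fill mode:
//   A = alpha * x * y^H + conj(alpha) * y * x^H + A
// AP stores the upper triangle column by column, so A(i, j) with i <= j
// lives at AP[i + j * (j + 1) / 2]. Entries beyond n * (n + 1) / 2 are untouched.
void hpr2_upper_reference(int n, cplx alpha, const std::vector<cplx>& x,
                          const std::vector<cplx>& y, std::vector<cplx>& AP) {
    assert(n >= 0);
    assert(x.size() >= static_cast<std::size_t>(n));
    assert(y.size() >= static_cast<std::size_t>(n));
    assert(AP.size() >= static_cast<std::size_t>(n) * (n + 1) / 2);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {
            const cplx update = alpha * x[i] * std::conj(y[j]) +
                                std::conj(alpha) * y[i] * std::conj(x[j]);
            cplx& a = AP[i + j * (j + 1) / 2];
            if (i == j) {
                // A Hermitian diagonal is real: keep only the real part.
                a = cplx(a.real() + update.real(), 0.0);
            } else {
                a += update;
            }
        }
    }
}

// Runs the reference on the inputs listed above.
// alpha = 1.0 + 1.0j is an assumption, not stated in this README.
std::vector<cplx> hpr2_sample_result() {
    std::vector<cplx> AP = {{1.1, 1.2}, {3.5, 3.6}, {2.3, 2.4}, {4.7, 4.8}};
    const std::vector<cplx> x = {{5.1, 6.2}, {7.3, 8.4}};
    const std::vector<cplx> y = {{1.1, 2.2}, {3.3, 4.4}};
    hpr2_upper_reference(2, cplx(1.0, 1.0), x, y, AP);
    return AP;
}
```

Under these assumptions, `hpr2_sample_result()` yields 48.40 + 0.00j, 82.92 + 26.04j and 133.20 + 0.00j in the first three entries, and leaves the fourth entry at 4.70 + 4.80j because it lies outside the packed triangle.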
Supported GPUs: all GPUs supported by the CUDA Toolkit (https://developer.nvidia.com/cuda-gpus)
Supported operating systems:
- Linux
- Windows
Supported CPU architectures:
- x86_64
- ppc64le
- arm64-sbsa
Prerequisites:
- A Linux or Windows system with recent NVIDIA drivers.
- CMake 3.18 or newer.
Build on Linux:
$ mkdir build
$ cd build
$ cmake ..
$ make
Make sure that CMake finds the expected CUDA Toolkit. If it does not, add the argument -DCMAKE_CUDA_COMPILER=/path/to/cuda/bin/nvcc
to the cmake command.
Build on Windows:
$ mkdir build
$ cd build
$ cmake -DCMAKE_GENERATOR_PLATFORM=x64 ..
Then open the cublas_examples.sln project in Visual Studio and build it.
Usage:
$ ./cublas_hpr2_example
Sample output:
AP
1.10 + 1.20j 2.30 + 2.40j
3.50 + 3.60j 4.70 + 4.80j
=====
x
5.10 + 6.20j 7.30 + 8.40j
=====
y
1.10 + 2.20j 3.30 + 4.40j
=====
AP
48.40 + 0.00j 133.20 + 0.00j
82.92 + 26.04j 4.70 + 4.80j
=====
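The final AP block is easier to read once the packed data is expanded into the full Hermitian matrix. The sketch below assumes upper fill mode and column-by-column printing, both inferred from the output above rather than stated in it. The name `unpack_upper_hermitian` is hypothetical and is not part of cuBLAS.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Expands upper packed storage into a full n x n column-major matrix.
// A(i, j) with i <= j is read from AP[i + j * (j + 1) / 2]; the lower
// triangle is filled with the complex conjugate of the upper triangle.
std::vector<cplx> unpack_upper_hermitian(int n, const std::vector<cplx>& AP) {
    assert(n >= 0);
    assert(AP.size() >= static_cast<std::size_t>(n) * (n + 1) / 2);

    std::vector<cplx> full(static_cast<std::size_t>(n) * n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {
            const cplx a = AP[i + j * (j + 1) / 2];
            if (i == j) {
                // The diagonal of a Hermitian matrix is real.
                full[i + j * n] = cplx(a.real(), 0.0);
            } else {
                full[i + j * n] = a;             // upper triangle: A(i, j)
                full[j + i * n] = std::conj(a);  // lower triangle: A(j, i)
            }
        }
    }
    return full;
}
```

With the final AP read column by column as {48.40, 82.92 + 26.04j, 133.20, 4.70 + 4.80j} and n = 2, this gives A(0,0) = 48.40, A(0,1) = 82.92 + 26.04j, A(1,0) = 82.92 - 26.04j and A(1,1) = 133.20. The trailing 4.70 + 4.80j is ignored because it is not part of the packed triangle.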