RVV support qd8-f32-qc8w-gemm/igemm and qd8-f32-qc4w-gemm #7932

ken-unger · 2025-03-03T06:20:39Z

Add RVV production support for qd8-f32-qc8w-gemm/igemm
Add RVV production support for qd8-f32-qc4w-gemm
Add RVV test cases for qs8-qc8w-gemm/igemm missing from previous PR.

ken-unger · 2025-03-03T06:23:23Z

test/qd8-f32-qc8w-gemm-minmax.cc

@@ -277,254 +277,6 @@ std::vector<GemmTestParams> CreateTests1(
  return gemm_tests;
 }

-#if XNN_ENABLE_RISCV_VECTOR && XNN_ARCH_RISCV


We don't need the separate test version here; just declare nr properly in the test case, as below.
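
For context, here is a minimal sketch of how nr could be derived for a vector-length-agnostic "4v" tile. It assumes that "4v" means four vector registers' worth of 32-bit accumulators per row (LMUL=4); the function name `rvv_tile_nr` is hypothetical and is not an XNNPACK API.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of XNNPACK): number of output columns (nr)
// covered by one accumulator row of an "<lmul>v" RVV tile.
//   vlen_bits  - hardware vector length in bits (e.g. 256 on the board above)
//   lmul       - register group multiplier implied by the kernel name (4 for "4v")
//   elem_bytes - size of one accumulator element (4 for int32 or float)
size_t rvv_tile_nr(size_t vlen_bits, size_t lmul, size_t elem_bytes) {
  const size_t vlenb = vlen_bits / 8;  // vector length in bytes
  return lmul * vlenb / elem_bytes;    // elements held by one register group
}
```

Under these assumptions, `rvv_tile_nr(256, 4, sizeof(int32_t))` evaluates to 32, so nr depends on the hardware vector length and has to be computed at run time instead of being hard-coded in the test.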

ken-unger · 2025-03-03T06:24:31Z

test/qs8-qc8w-gemm-minmax-fp32.cc

@@ -3851,3 +3851,50 @@ INSTANTIATE_TEST_SUITE_P(
      return info.param.test_name;
    });

+#if XNN_ARCH_RISCV && XNN_ENABLE_RISCV_VECTOR


I thought I had committed this and the tests below in a previous PR, but clearly I had not. Including them now.

ken-unger · 2025-03-03T06:27:27Z

src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4v-minmax-rvv.c

@@ -0,0 +1,116 @@
+// Auto-generated file. Do not edit!


Support for this was already included within src/qs8-igemm/rvv.c.in in a previous PR. No changes required.

ken-unger · 2025-03-03T06:33:26Z

@dsharlet please review. Thank you.

All test cases pass on bananapif3 (vlen=256).
I've chosen 4x4v for the production version of all kernels, as it wasn't clear from bench_model whether 7x4v (the maximum) was actually better.
We should probably delete 8x4v since it is a little misleading: it thrashes the vector registers and hence has significantly lower performance. 7x4v is the real maximum.
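
The register-pressure argument can be sketched as simple arithmetic. This is an illustration under assumptions, not the kernel's actual allocation: it assumes each accumulator row occupies one LMUL=4 register group and that at least one further vector register is needed for loaded weights. The function names are hypothetical.

```c
#include <stdbool.h>

// RVV defines 32 architectural vector registers (v0..v31).
enum { kRvvVectorRegisters = 32 };

// Registers left over after reserving one group of acc_lmul registers
// for each of the mr accumulator rows (assumed layout).
int spare_vector_registers(int mr, int acc_lmul) {
  return kRvvVectorRegisters - mr * acc_lmul;
}

// True if the tile leaves at least scratch_regs registers free for
// weights and temporaries, so the accumulators need not be spilled.
bool tile_fits(int mr, int acc_lmul, int scratch_regs) {
  return spare_vector_registers(mr, acc_lmul) >= scratch_regs;
}
```

With these assumptions, a 7x4v tile uses 28 registers and leaves 4 free, while an 8x4v tile uses all 32 and leaves none for the weights, which is consistent with the thrashing described above.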

dsharlet · 2025-03-03T20:40:37Z

test/qd8-f32-qc8w-gemm-minmax.cc

@@ -277,254 +277,6 @@ std::vector<GemmTestParams> CreateTests1(
  return gemm_tests;
 }

-#if XNN_ENABLE_RISCV_VECTOR && XNN_ARCH_RISCV


This is generated code, the generator needs to be fixed instead.

BTW, this was missed in a previous review as well. I tried to fix it in #7888, but the newly generated tests are failing. Can you please take a look? You'll have to fix this in order to address this anyways.

dsharlet · 2025-03-03T20:41:48Z

cmake/gen/rvv_microkernels.cmake

+  src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4v-minmax-rvv.c
+  src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x4v-minmax-rvv.c
+  src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x4v-minmax-rvv.c
+  src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x4v-minmax-rvv.c


The .bzl file needs updating as well. This file (and the .bzl files) is generated code; see the file header for details (tools/update-microkernels.py).

RVV support qd8-f32-qc8w-gemm/igemm and qd8-f32-qc4w-gemm

f50be1b

update from tools/update-microkernels

0cc27f9
