support hardware testing (#348)

shanjiang7 · web-flow · commit 526bca382137 · 2025-11-09T12:02:45.000+08:00
diff --git a/README.md b/README.md
@@ -96,6 +96,9 @@ python -m graph_net.plot_violin \
 
 The scripts are designed to process a file structure as `/benchmark_path/category_name/`, and items on x-axis are identified by name of the sub-directories. After executing, several summary plots of result in categories (model tasks, libraries...) will be exported to `$GRAPH_NET_BENCHMARK_PATH`.
 
+### Hardware Regression Testing
+We also provide a two-step workflow that validates compiler correctness and performance against a "golden" reference, which is crucial for hardware-specific testing and regression tracking. Details can be found in this [guide](./docs/hardware_test.md).
+
 ### 🧱 Construction & Contribution Guide
 Want to understand how GraphNet is built or contribute new samples?
 Check out the [Construction Guide](./docs/README_contribute.md) for details on the extraction and validation workflow.
diff --git a/docs/hardware_test.md b/docs/hardware_test.md
@@ -0,0 +1,20 @@
+## Hardware Regression Testing
+### Step 1: Generate Reference Data
+First, run `graph_net.paddle.test_reference_device` in a trusted environment (e.g., a known-good hardware/compiler version) to generate the baseline logs and output files.
+```bash
+python -m graph_net.paddle.test_reference_device \
+    --model-path /path/to/all_models/ \
+    --reference-dir ./golden_reference \
+    --compiler cinn \
+    --device cuda
+# --reference-dir: (Required) Directory where the output .log (performance/config) and .pdout (output tensors) files will be saved.
+# --compiler: Specifies the compiler backend.
+```
+### Step 2: Run Regression Test
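+After changing hardware, run the correctness test script, `graph_net.paddle.test_device_correctness`, with `--reference-dir` pointing at the directory generated in Step 1. The script reads the reference data, re-runs the models with exactly the same configuration, and compares the new results against the "golden" reference.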
+```bash
+python -m graph_net.paddle.test_device_correctness \
+    --reference-dir ./golden_reference \
+    --device cuda
+```
+This script will report any failures (e.g., compilation errors, output mismatches) and print a performance comparison (speedup/slowdown) against the reference log, allowing you to quickly identify regressions.
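
Note (not part of the patch): the comparison described in Step 2 can be pictured with a minimal sketch. It is only an illustration under stated assumptions, not the `graph_net` implementation. The function names (`outputs_match`, `speedup`, `classify`), the tolerance values, and the use of flat lists of floats in place of real `.pdout` tensors are all hypothetical.

```python
import math


def outputs_match(reference, candidate, rtol=1e-5, atol=1e-8):
    """Return True if two flat sequences of floats agree within tolerance.

    Hypothetical helper: the tolerances are illustrative defaults, not
    values taken from graph_net.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rtol, abs_tol=atol)
        for r, c in zip(reference, candidate)
    )


def speedup(reference_seconds, candidate_seconds):
    """Ratio above 1 means the new run is faster than the reference."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return reference_seconds / candidate_seconds


def classify(reference, candidate, reference_seconds, candidate_seconds):
    """Summarize one model: correctness first, then relative performance."""
    if not outputs_match(reference, candidate):
        return "output mismatch"
    ratio = speedup(reference_seconds, candidate_seconds)
    return f"ok, speedup {ratio:.2f}x"
```

Checking correctness before performance mirrors the order in the description above: a faster run with mismatched outputs is still a regression.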