scripts: add HIP->FlyDSL multi-agent port orchestrator by fsx950223 · Pull Request #666 · ROCm/FlyDSL

fsx950223 · 2026-06-08T08:17:53Z

Summary

Adds scripts/port_hip_to_flydsl_agent.py — an Anthropic-API multi-agent
orchestrator that ports a HIP kernel to FlyDSL from a single natural-language
prompt. A parser agent turns the request into structured config; then a loop of
specialized agents runs until the port is accepted or max iterations is hit:

- Analyzer (effort max): reads the HIP source + deps, plans an LLVM-1:1-aligned FlyDSL port; on later iterations diffs FlyDSL↔HIP LLVM IR and the captured kernel trace to refine the plan.
- Implementer (effort xhigh): writes the FlyDSL kernel, then must pass a local COMPILE_ONLY gate (a self-written smoke harness; failures are fed back and retried) before going further.
- Test author: if no test is given, generates a numerical-correctness pytest using the HIP/aiter kernel as the golden (can build the test from a reference harness).
- Evaluator (effort medium): runs accuracy + performance + ATT trace + real device-IR export on GPU, records everything to a per-run performance_<hash>.md, and emits a structured verdict.
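
The control flow described above could be sketched roughly as follows. This is a minimal illustration, not the code in scripts/port_hip_to_flydsl_agent.py: `run_port_loop`, `Verdict`, and every parameter name are hypothetical, the agents are passed in as plain callables, and the test-author step is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical stand-in for the evaluator's structured verdict."""
    accepted: bool
    feedback: str = ""


def run_port_loop(
    analyze: Callable[[str], str],
    implement: Callable[[str], str],
    compile_gate: Callable[[str], Optional[str]],
    evaluate: Callable[[str], Verdict],
    max_iterations: int = 3,
    max_compile_retries: int = 2,
) -> Optional[str]:
    """Return the accepted kernel source, or None if the iteration budget runs out."""
    feedback = ""
    for _ in range(max_iterations):
        plan = analyze(feedback)      # analyzer refines the plan from prior feedback
        kernel = implement(plan)      # implementer writes the kernel
        error = compile_gate(kernel)  # None means the compile-only gate passed
        retries = 0
        while error is not None and retries < max_compile_retries:
            # Feed the compile failure back to the implementer and retry.
            kernel = implement(plan + "\n# compile error:\n" + error)
            error = compile_gate(kernel)
            retries += 1
        if error is not None:
            feedback = error          # gate still failing: hand the error to the analyzer
            continue
        verdict = evaluate(kernel)    # only gated kernels reach the evaluator
        if verdict.accepted:
            return kernel
        feedback = verdict.feedback
    return None
```

Keeping the compile gate inside the iteration means the evaluator is only invoked on a kernel that has already compiled, which is the ordering the summary describes.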

Supports local path / URL / git / GitHub-blob HIP sources (clones with deps),
and prompt-driven remote-GPU execution (ssh/srun/podman) for environments where
the test GPU is remote.
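
One way to distinguish the four HIP source kinds is a small classifier over the source string. The sketch below is an assumption about how such dispatch might look, not the PR's implementation; `classify_hip_source` and the example paths in the usage note are hypothetical.

```python
from urllib.parse import urlparse


def classify_hip_source(spec: str) -> str:
    """Classify a HIP source spec as 'github_blob', 'git', 'url', or 'local'."""
    parsed = urlparse(spec)
    if parsed.scheme in ("http", "https"):
        parts = [p for p in parsed.path.split("/") if p]
        # GitHub blob URLs look like /<owner>/<repo>/blob/<ref>/<path...>
        if parsed.netloc == "github.com" and len(parts) >= 5 and parts[2] == "blob":
            return "github_blob"
        if parsed.path.endswith(".git"):
            return "git"
        return "url"
    # scp-style remotes (git@host:owner/repo.git) have no URL scheme.
    if spec.startswith("git@") or parsed.scheme in ("git", "ssh"):
        return "git"
    return "local"
```

For example, `classify_hip_source("https://github.com/ROCm/aiter/blob/main/csrc/kernel.cu")` returns `"github_blob"`, while `classify_hip_source("./kernels/gemm.hip")` returns `"local"`.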

Validation — `gemm1_a4w4` (aiter MXFP4 MoE GEMM-1)

The tool ported the aiter MXFP4 MoE GEMM-1 kernel to FlyDSL and validated it
bit-exact against the aiter/HIP gemm1 golden (e8m0 scale bytes + packed-fp4
nibbles match exactly), with 1:1 LLVM intrinsic alignment (e.g.
mfma.scale.f32.16x16x128.f8f6f4 449=449, buffer.load.lds 33=33, barriers
344=344). Real KIMI MXFP4 inputs/layouts, measured on MI355 (gfx950) via
rocprofv3 --kernel-trace --stats (per-dispatch End−Start):

| M | FlyDSL | HIP/aiter | ratio (fly/hip) |
|---|--------|-----------|-----------------|
| 16 | 69.0 µs | 87.6 µs | 0.79 (~21% faster) |
| 64 | 150.3 µs | 163.8 µs | 0.92 (~8% faster) |
Notes: both kernels produce identical correct output. The HIP launcher
dispatches a MAX_M-derived grid where most workgroups early-exit, while FlyDSL
dispatches the exact needed grid — so part of the speedup is launching fewer
idle workgroups. FlyDSL uses VGPR=72 vs HIP=64 (same SGPR=112, LDS=32 KB). Only
M=16/64 measured.
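
The intrinsic-count alignment quoted in the validation paragraph (449=449, 33=33) is the kind of check that can be approximated by counting call sites in the two textual LLVM IR dumps. The sketch below assumes plain-text IR and matches on substrings; the function names are hypothetical and this is not necessarily how the evaluator agent performs the comparison.

```python
import re
from collections import Counter


def count_intrinsic_calls(llvm_ir: str, needles: list[str]) -> Counter:
    """Count call-site lines whose text mentions each intrinsic substring."""
    counts = Counter({n: 0 for n in needles})
    for line in llvm_ir.splitlines():
        if not re.search(r"\bcall\b", line):
            continue  # skip declarations and non-call instructions
        for n in needles:
            if n in line:
                counts[n] += 1
    return counts


def misaligned(fly_ir: str, hip_ir: str, needles: list[str]) -> dict[str, tuple[int, int]]:
    """Return {intrinsic: (fly_count, hip_count)} for every count that differs."""
    fly = count_intrinsic_calls(fly_ir, needles)
    hip = count_intrinsic_calls(hip_ir, needles)
    return {n: (fly[n], hip[n]) for n in needles if fly[n] != hip[n]}
```

Calling `misaligned(fly_ir, hip_ir, ["mfma.scale.f32.16x16x128.f8f6f4", "buffer.load.lds"])` returns an empty dict when the counts match 1:1.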

🤖 Generated with Claude Code

Adds port_hip_to_flydsl_agent.py: an Anthropic-API multi-agent loop (analyze -> implement -> test-author -> evaluate) that ports a HIP kernel to FlyDSL, driven by a single natural-language prompt. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds a new Python CLI under scripts/ that orchestrates a multi-agent Anthropic Messages API workflow to port HIP kernels to FlyDSL, including optional test generation and iterative evaluation with artifact capture (IR dumps, perf notes, traces).

Changes:

- Introduces scripts/port_hip_to_flydsl_agent.py, implementing the end-to-end analyzer → implementer → (optional) test-author → evaluator loop with local tool execution.
- Adds HIP source fetching support (local path, URL download, git/GitHub clone) and structured prompt→config parsing via a dedicated “task parser” agent.
- Records iteration artifacts (plan, per-run performance markdown, IR dump directory) and enforces a local COMPILE_ONLY smoke gate before evaluation.


+    test_reference = str(Path(fields["test_reference"]).resolve()) if fields["test_reference"] else ""
+
+    eval_mode = detect_eval_mode(fields["eval_mode"])
+    if fields["ssh_host"] and fields["eval_mode"] == "auto":
+        eval_mode = "gpu"  # a remote GPU is configured; don't fall back to compile_ir
+    print(f"Evaluation mode: {eval_mode}{' (remote GPU)' if fields['ssh_host'] else ''}")
+    if eval_mode == "gpu" and not trace_skill.exists():
+        print(f"WARNING: capture-kernel-trace skill not found at {trace_skill}; "
+              "the evaluator will record trace as unavailable.")
+
+    return Config(
+        hip_source=hip_source,
+        hip_root=hip_root,
+        kernel_name=fields["kernel_name"],
+        output=output,
+        test_file=Path(fields["test_file"]).resolve() if fields["test_file"] else None,
+        repo_root=repo_root,


coderfeli · 2026-06-15T11:16:09Z

We don't need this script?

Copilot AI review requested due to automatic review settings June 8, 2026 08:17

Copilot started reviewing on behalf of fsx950223 June 8, 2026 08:18

Copilot AI reviewed Jun 8, 2026

fsx950223 and others added 7 commits June 8, 2026 08:23

style: format port_hip_to_flydsl_agent.py (autoflake/ruff/black, 120 …

c88d71d

…col) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Potential fix for pull request finding

34d69c3

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

3be67ae

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

bf92b93

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

563314e

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

style: black-format Copilot auto-fixes

9b0ebf9

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix: route extra_context to analyzer/implementer too

480ebd8

General task context now reaches all agents; test-construction stays isolated via the explicit per-agent guards + test_reference field. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fsx950223 wants to merge 8 commits into main from add-port-hip-to-flydsl-agent

3 participants
