
Commit b37c56f

sjarmak and claude committed
fix: align task 135 instruction text with oracle answer
The oracle for ccx-vuln-remed-135 (Chromium sandbox audit) had correct files/symbols but the instruction text referenced wrong class names and directories (BPFBasePolicy→BaselinePolicy, content/browser/→sandbox/policy/linux/). Verified all oracle paths against sg-evals/chromium--2d05e315 mirror.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 81b8f7d commit b37c56f

3 files changed: +3 -3 lines changed

benchmarks/ccb_mcp_security/ccx-vuln-remed-135/instruction.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 ## Your Task

-Audit the security sandbox implementation for Chromium's renderer processes. Find all C++ source files in `chromium/chromium` that implement the sandbox: 1. The file under `sandbox/linux/` that implements `BPFBasePolicy` — the seccomp-BPF base policy for Linux sandboxing. 2. The file under `sandbox/win/src/` that defines `TargetPolicy` — the Windows sandbox policy configuration. 3. The file under `content/browser/` that configures sandbox parameters for renderer process launch (look for `SetRendererSandboxPolicy` or `GetRendererSandboxType`). 4. The file under `sandbox/policy/` that defines `SandboxType` enum and sandbox profile mappings. Report each file path and key class/function.
+Audit the security sandbox implementation for Chromium's renderer processes. Find all C++ source files in `chromium/chromium` that implement the sandbox: 1. The file under `sandbox/linux/seccomp-bpf-helpers/` that implements `BaselinePolicy` — the seccomp-BPF baseline policy for Linux sandboxing. 2. The file under `sandbox/win/src/` that implements `PolicyBase` — the Windows sandbox policy configuration (concrete implementation of `TargetPolicy`). 3. The file under `sandbox/policy/linux/` that implements `RendererProcessPolicy` — the BPF policy specific to renderer process sandboxing (look for `EvaluateSyscall`). 4. The file under `sandbox/policy/` that declares sandbox type utilities and the `SandboxType` enum mapping. Report each file path and key class/function.

 ## Context

benchmarks/ccb_mcp_security/ccx-vuln-remed-135/instruction_mcp.md

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ If MCP search returns no results:

 ## Your Task

-Audit the security sandbox implementation for Chromium's renderer processes. Find all C++ source files in `chromium/chromium` that implement the sandbox: 1. The file under `sandbox/linux/` that implements `BPFBasePolicy` — the seccomp-BPF base policy for Linux sandboxing. 2. The file under `sandbox/win/src/` that defines `TargetPolicy` — the Windows sandbox policy configuration. 3. The file under `content/browser/` that configures sandbox parameters for renderer process launch (look for `SetRendererSandboxPolicy` or `GetRendererSandboxType`). 4. The file under `sandbox/policy/` that defines `SandboxType` enum and sandbox profile mappings. Report each file path and key class/function.
+Audit the security sandbox implementation for Chromium's renderer processes. Find all C++ source files in `chromium/chromium` that implement the sandbox: 1. The file under `sandbox/linux/seccomp-bpf-helpers/` that implements `BaselinePolicy` — the seccomp-BPF baseline policy for Linux sandboxing. 2. The file under `sandbox/win/src/` that implements `PolicyBase` — the Windows sandbox policy configuration (concrete implementation of `TargetPolicy`). 3. The file under `sandbox/policy/linux/` that implements `RendererProcessPolicy` — the BPF policy specific to renderer process sandboxing (look for `EvaluateSyscall`). 4. The file under `sandbox/policy/` that declares sandbox type utilities and the `SandboxType` enum mapping. Report each file path and key class/function.

 ## Context

benchmarks/ccb_mcp_security/ccx-vuln-remed-135/tests/oracle_answer.json

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
 {"repo": "sg-evals/chromium--2d05e315", "path": "sandbox/linux/seccomp-bpf-helpers/baseline_policy.cc", "symbol": "BaselinePolicy"},
 {"repo": "sg-evals/chromium--2d05e315", "path": "sandbox/policy/linux/bpf_renderer_policy_linux.cc", "symbol": "RendererProcessPolicy"}
 ],
-"text": "Audit the security sandbox implementation for Chromium's renderer processes. Find all C++ source files in `chromium/chromium` that implement the sandbox: 1. The file under `sandbox/linux/` that implements `BPFBasePolicy` — the seccomp-BPF base policy for Linux sandboxing. 2. The file under `sandbox/win/src/` that defines `TargetPolicy` — the Windows sandbox policy configuration. 3. The file under `content/browser/` that configures sandbox parameters for renderer process launch (look for `SetRendererSandboxPolicy` or `GetRendererSandboxType`). 4. The file under `sandbox/policy/` that defines `SandboxType` enum and sandbox profile mappings. Report each file path and key class/function.",
+"text": "Audit the security sandbox implementation for Chromium's renderer processes. Find all C++ source files in `chromium/chromium` that implement the sandbox: 1. The file under `sandbox/linux/seccomp-bpf-helpers/` that implements `BaselinePolicy` — the seccomp-BPF baseline policy for Linux sandboxing. 2. The file under `sandbox/win/src/` that implements `PolicyBase` — the Windows sandbox policy configuration (concrete implementation of `TargetPolicy`). 3. The file under `sandbox/policy/linux/` that implements `RendererProcessPolicy` — the BPF policy specific to renderer process sandboxing (look for `EvaluateSyscall`). 4. The file under `sandbox/policy/` that declares sandbox type utilities and the `SandboxType` enum mapping. Report each file path and key class/function.",
 "_metadata": {
 "task_id": "CCX-vuln-remed-135",
 "oracle_source": "github_verification",
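
The kind of drift this commit fixes (oracle entries that the instruction text no longer names) could in principle be caught mechanically. The following is a minimal sketch, not part of the repository: `missing_references` is a hypothetical helper, and it assumes only the `path` and `symbol` keys visible in the diff above, plus the convention that the instruction text wraps directories and symbols in backticks. The two sample texts are abbreviated excerpts of the old and new instruction strings.

```python
from pathlib import PurePosixPath


def missing_references(entries, text):
    """Return (path, reason) pairs for oracle entries that `text` does not name.

    Hypothetical helper: an entry counts as covered when both its symbol and
    the directory containing its file appear in backticks in the text.
    """
    problems = []
    for entry in entries:
        symbol = entry["symbol"]
        directory = str(PurePosixPath(entry["path"]).parent) + "/"
        if f"`{symbol}`" not in text:
            problems.append((entry["path"], f"symbol {symbol} not mentioned"))
        if f"`{directory}`" not in text:
            problems.append((entry["path"], f"directory {directory} not mentioned"))
    return problems


# The two oracle entries shown as context in the diff above.
entries = [
    {"repo": "sg-evals/chromium--2d05e315",
     "path": "sandbox/linux/seccomp-bpf-helpers/baseline_policy.cc",
     "symbol": "BaselinePolicy"},
    {"repo": "sg-evals/chromium--2d05e315",
     "path": "sandbox/policy/linux/bpf_renderer_policy_linux.cc",
     "symbol": "RendererProcessPolicy"},
]

# Abbreviated excerpts of the instruction text before and after this commit.
old_text = ("1. The file under `sandbox/linux/` that implements `BPFBasePolicy`. "
            "3. The file under `content/browser/` that configures sandbox parameters.")
new_text = ("1. The file under `sandbox/linux/seccomp-bpf-helpers/` that implements "
            "`BaselinePolicy`. 3. The file under `sandbox/policy/linux/` that "
            "implements `RendererProcessPolicy`.")

for path, reason in missing_references(entries, old_text):
    print(f"{path}: {reason}")
```

Run against the old excerpt, the check flags both entries; against the new excerpt it returns an empty list. Exact backtick matching is deliberately strict, so a parent directory such as `sandbox/linux/` does not count as naming `sandbox/linux/seccomp-bpf-helpers/`.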
