Commit c861f42

sjarmak and claude committed
feat: curate oracle_answer.json for all 130 new MCP tasks (142-271)
- Populated oracle_answer.json files for all 130 stub tasks across 10 suites: ccb_mcp_incident (142-150), ccb_mcp_domain (151-160), ccb_mcp_security (161-170), ccb_mcp_crossrepo_tracing (171-181), ccb_mcp_compliance (182-194), ccb_mcp_migration (195-207), ccb_mcp_crossorg (208-222), ccb_mcp_org (223-237), ccb_mcp_platform (238-252), ccb_mcp_crossrepo (253-271)
- Synced required_files into task_spec.json via new sync_oracle_files.py (preserves evaluation.checks/search_pattern, unlike hydrate_task_specs.py)
- Added 4 new fixture repo_sets: multi-org-go, nodejs-web-stack, prometheus-monitoring, python-ml-stack
- Added oracle_curation_log.json to .gitignore
- Fixed curate_oracle.py backoff: max 120s, base delay 1.0s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
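The commit message gives only the two backoff parameters (base delay 1.0s, max 120s) and does not show the curate_oracle.py code. As a rough sketch, assuming the retry delay doubles on each attempt and is then capped, the schedule might look like the following. The function name `backoff_delay` and the doubling policy are assumptions made for illustration, not taken from the repository.

```python
def backoff_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 120.0) -> float:
    """Return the wait in seconds before retry number `attempt` (0-based).

    Hypothetical helper: doubles the base delay per attempt and caps the
    result at `max_delay`. Only the two default values come from the commit
    message; the exponential policy itself is an assumption.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Cap the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt, 32)
    return min(max_delay, base_delay * (2 ** exponent))


if __name__ == "__main__":
    # Print the delays for the first ten retries.
    for attempt in range(10):
        print(attempt, backoff_delay(attempt))
```

Under these assumptions the delay reaches the 120s ceiling on the eighth retry (attempt 7) and stays there.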
1 parent 5f831b6 commit c861f42

238 files changed (+5430 -1974 lines changed)

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -43,6 +43,9 @@ harbor-ccb_crossrepo-dataset/
 vendor/DependEval/
 vendor/dependeval_repos/
 
+# Oracle curation logs (auto-generated, not needed for benchmark)
+oracle_curation_log.json
+
 # Credentials
 *.key
 *.pem

benchmarks/ccb_mcp_compliance/ccx-compliance-057/tests/task_spec.json

Lines changed: 22 additions & 13 deletions
@@ -5,10 +5,13 @@
 "category": "F",
 "mcp_suite": "ccb_mcp_compliance",
 "prd": {
-"user_story": "As a developer, I want to: Find ALL files in `grafana/grafana` that form the SSO settings control across 4 layers: 1. Feature Flag Definition — the registry where `ssoSettingsApi` is defined and generated constants. 2. SSO Settings Infrastructure — Service interface, Reloadable interface, SSOSettings data model, and SSOSettingsStore database layer. 3. API and Authentication Wiring — REST API endpoint registration with access control middleware, SocialService provider, and authentication client registration. 4. Access Control and DI Registration — access control evaluators and the ProvideService dependency injection function.",
-"constraints": ["Provide specific file paths and repository names in your answer.", "Write your findings to /workspace/answer.json."],
+"user_story": "As a developer, I want to: Find ALL files in `grafana/grafana` that form the SSO settings control across 4 layers: 1. Feature Flag Definition \u2014 the registry where `ssoSettingsApi` is defined and generated constants. 2. SSO Settings Infrastructure \u2014 Service interface, Reloadable interface, SSOSettings data model, and SSOSettingsStore database layer. 3. API and Authentication Wiring \u2014 REST API endpoint registration with access control middleware, SocialService provider, and authentication client registration. 4. Access Control and DI Registration \u2014 access control evaluators and the ProvideService dependency injection function.",
+"constraints": [
+"Provide specific file paths and repository names in your answer.",
+"Write your findings to /workspace/answer.json."
+],
 "success_definition": "Agent successfully identifies relevant files and symbols across all repos in the grafana-observability fixture.",
-"seed_prompt": "Find ALL files in `grafana/grafana` that form the SSO settings control across 4 layers: 1. Feature Flag Definition the registry where `ssoSettingsApi` is defined and generated constants. 2. SSO Settings Infrastructure Service interface, Reloadable interface, SSOSettings data model, and SSOSettingsStore database layer. 3. API and Authentication Wiring REST API endpoint registration with access control middleware, SocialService provider, and authentication client registration. 4. Access Control and DI Registration access control evaluators and the ProvideService dependency injection function."
+"seed_prompt": "Find ALL files in `grafana/grafana` that form the SSO settings control across 4 layers: 1. Feature Flag Definition \u2014 the registry where `ssoSettingsApi` is defined and generated constants. 2. SSO Settings Infrastructure \u2014 Service interface, Reloadable interface, SSOSettings data model, and SSOSettingsStore database layer. 3. API and Authentication Wiring \u2014 REST API endpoint registration with access control middleware, SocialService provider, and authentication client registration. 4. Access Control and DI Registration \u2014 access control evaluators and the ProvideService dependency injection function."
 },
 "artifacts": {
 "repo_set_id": "grafana-observability",
@@ -20,20 +23,26 @@
 }
 },
 "evaluation": {
-"modes": ["deterministic"],
+"modes": [
+"deterministic"
+],
 "checks": [
-{
-"type": "file_set_match",
-"params": {
-"search_pattern": "",
-"file_filter": ""
-}
-}
-],
+{
+"type": "file_set_match",
+"params": {
+"search_pattern": "SocialService OR ProvideService",
+"file_filter": ""
+}
+}
+],
 "eval_script": "/tests/eval.sh",
 "pass_exit_code": 0
 },
 "logging": {
-"required_metrics": ["oracle_coverage", "time_to_first_oracle_hit_ms", "unique_repos_touched"]
+"required_metrics": [
+"oracle_coverage",
+"time_to_first_oracle_hit_ms",
+"unique_repos_touched"
+]
 }
 }
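The task specs list `oracle_coverage` under `required_metrics`, but the diff does not define how it is computed. One plausible reading, offered here only as an assumption, is the fraction of the oracle's `required_files` that also appear in the agent's answer. The function name and signature below are hypothetical.

```python
from typing import Iterable


def oracle_coverage(required_files: Iterable[str], found_files: Iterable[str]) -> float:
    """Fraction of oracle files that the agent's answer also lists.

    Assumed definition for illustration; the benchmark's real metric may
    normalize paths or weight files differently.
    """
    required = set(required_files)
    if not required:
        # An empty oracle gives no signal; report zero rather than divide by zero.
        return 0.0
    return len(required & set(found_files)) / len(required)
```

Under this reading, an answer that names two of four oracle files plus one unrelated file scores 0.5, since extra files are not penalized.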

benchmarks/ccb_mcp_compliance/ccx-compliance-115/tests/task_spec.json

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@
 {
 "type": "file_set_match",
 "params": {
-"search_pattern": "",
+"search_pattern": "Compliance OR Audit OR Django",
 "file_filter": ""
 }
 }

benchmarks/ccb_mcp_compliance/ccx-compliance-118/tests/task_spec.json

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@
 {
 "type": "file_set_match",
 "params": {
-"search_pattern": "",
+"search_pattern": "RelatedFieldListFilter OR ForeignKey OR ListFilter OR ChangeList",
 "file_filter": ""
 }
 }

benchmarks/ccb_mcp_compliance/ccx-compliance-182/tests/task_spec.json

Lines changed: 32 additions & 11 deletions
@@ -6,33 +6,54 @@
 "mcp_suite": "ccb_mcp_compliance",
 "prd": {
 "user_story": "As a developer, I want to: Audit kubernetes/kubernetes for completeness of RBAC verb handling. Find all Go source files that define the set of allowed verbs (get, list, watch, create, update, patch, delete, deletecollection) and validate that all API handlers enforce these verb constraints.",
-"constraints": ["Provide specific file paths and repository names in your answer.", "Write your findings to /workspace/answer.json."],
+"constraints": [
+"Provide specific file paths and repository names in your answer.",
+"Write your findings to /workspace/answer.json."
+],
 "success_definition": "Agent successfully identifies relevant files and symbols across all repos in the kubernetes-ecosystem fixture.",
 "seed_prompt": "Audit kubernetes/kubernetes for completeness of RBAC verb handling. Find all Go source files that define the set of allowed verbs (get, list, watch, create, update, patch, delete, deletecollection) and validate that all API handlers enforce these verb constraints."
 },
 "artifacts": {
 "repo_set_id": "kubernetes-ecosystem",
 "oracle": {
-"required_files": [],
+"required_files": [
+"sg-evals/kubernetes--v1.32.0/pkg/apis/rbac/v1alpha1/zz_generated.conversion.go",
+"sg-evals/kubernetes--v1.32.0/plugin/pkg/auth/authorizer/rbac/rbac_test.go",
+"sg-evals/kubernetes--v1.32.0/pkg/registry/rbac/clusterrole/policybased/storage_test.go",
+"sg-evals/kubernetes--v1.32.0/test/integration/authutil/authutil.go",
+"sg-evals/kubernetes--v1.32.0/test/e2e/storage/drivers/csi.go",
+"sg-evals/kubernetes--v1.32.0/cmd/kubeadm/app/phases/addons/dns/dns_test.go",
+"sg-evals/api--v0.32.0/rbac/v1/types_swagger_doc_generated.go",
+"sg-evals/api--v0.32.0/rbac/v1beta1/types.go",
+"sg-evals/api--v0.32.0/rbac/v1/generated.proto",
+"sg-evals/api--v0.32.0/rbac/v1beta1/generated.proto",
+"sg-evals/client-go--v0.32.0/applyconfigurations/internal/internal.go"
+],
 "required_symbols": [],
 "required_references": [],
 "dependency_chains": []
 }
 },
 "evaluation": {
-"modes": ["deterministic"],
+"modes": [
+"deterministic"
+],
 "checks": [
-{
-"type": "file_set_match",
-"params": {
-"search_pattern": ""
-}
-}
-],
+{
+"type": "file_set_match",
+"params": {
+"search_pattern": "ClusterRole"
+}
+}
+],
 "eval_script": "/tests/eval.sh",
 "pass_exit_code": 0
 },
 "logging": {
-"required_metrics": ["oracle_coverage", "time_to_first_oracle_hit_ms", "unique_repos_touched"]
+"required_metrics": [
+"oracle_coverage",
+"time_to_first_oracle_hit_ms",
+"unique_repos_touched"
+]
 }
 }
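The commit message says sync_oracle_files.py copies `required_files` into task_spec.json while preserving `evaluation.checks`. The script itself is not shown in this diff, so the following is only a minimal sketch of that idea. The `oracle_answer.json` layout used here (a top-level `files` list whose entries have `repo` and `path` keys) and the function name `sync_required_files` are assumptions for illustration.

```python
import copy


def sync_required_files(task_spec: dict, oracle_answer: dict) -> dict:
    """Return a copy of `task_spec` with oracle.required_files refreshed.

    Hypothetical helper: only artifacts.oracle.required_files is rewritten,
    so evaluation.checks (including search_pattern) is left untouched.
    The shape of `oracle_answer` is an assumption, not the real schema.
    """
    updated = copy.deepcopy(task_spec)  # never mutate the caller's spec
    files = [
        f"{entry['repo']}/{entry['path']}"
        for entry in oracle_answer.get("files", [])
    ]
    oracle = updated.setdefault("artifacts", {}).setdefault("oracle", {})
    oracle["required_files"] = files
    return updated


if __name__ == "__main__":
    spec = {
        "artifacts": {"oracle": {"required_files": []}},
        "evaluation": {
            "checks": [
                {"type": "file_set_match", "params": {"search_pattern": "ClusterRole"}}
            ]
        },
    }
    answer = {"files": [{"repo": "sg-evals/api--v0.32.0", "path": "rbac/v1beta1/types.go"}]}
    print(sync_required_files(spec, answer))
```

Working on a deep copy and touching a single key is what would keep a hand-edited `search_pattern` intact across repeated syncs.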

benchmarks/ccb_mcp_compliance/ccx-compliance-183/tests/task_spec.json

Lines changed: 33 additions & 11 deletions
@@ -6,33 +6,55 @@
 "mcp_suite": "ccb_mcp_compliance",
 "prd": {
 "user_story": "As a developer, I want to: Find all Go source files in kubernetes/kubernetes that enforce the API deprecation policy: the deprecated API version warning injection, the removal version validation, and the minimum supported API version checks in the API server.",
-"constraints": ["Provide specific file paths and repository names in your answer.", "Write your findings to /workspace/answer.json."],
+"constraints": [
+"Provide specific file paths and repository names in your answer.",
+"Write your findings to /workspace/answer.json."
+],
 "success_definition": "Agent successfully identifies relevant files and symbols across all repos in the kubernetes-ecosystem fixture.",
 "seed_prompt": "Find all Go source files in kubernetes/kubernetes that enforce the API deprecation policy: the deprecated API version warning injection, the removal version validation, and the minimum supported API version checks in the API server."
 },
 "artifacts": {
 "repo_set_id": "kubernetes-ecosystem",
 "oracle": {
-"required_files": [],
+"required_files": [
+"sg-evals/kubernetes--v1.32.0/staging/src/k8s.io/apiserver/pkg/endpoints/deprecation/deprecation.go",
+"sg-evals/kubernetes--v1.32.0/staging/src/k8s.io/apiextensions-apiserver/test/integration/deprecation_test.go",
+"sg-evals/kubernetes--v1.32.0/staging/src/k8s.io/apiserver/pkg/registry/rest/update.go",
+"sg-evals/kubernetes--v1.32.0/pkg/registry/core/node/strategy.go",
+"sg-evals/kubernetes--v1.32.0/pkg/registry/core/serviceaccount/strategy.go",
+"sg-evals/kubernetes--v1.32.0/pkg/registry/discovery/endpointslice/strategy.go",
+"sg-evals/kubernetes--v1.32.0/cmd/kubeadm/app/util/config/upgradeconfiguration.go",
+"sg-evals/kubernetes--v1.32.0/cmd/kubeadm/app/cmd/options/generic.go",
+"sg-evals/kubernetes--v1.32.0/pkg/api/pod/warnings.go",
+"sg-evals/kubernetes--v1.32.0/pkg/api/service/warnings.go",
+"sg-evals/kubernetes--v1.32.0/pkg/api/persistentvolumeclaim/util.go",
+"sg-evals/etcd-io-etcd/server/embed/config.go"
+],
 "required_symbols": [],
 "required_references": [],
 "dependency_chains": []
 }
 },
 "evaluation": {
-"modes": ["deterministic"],
+"modes": [
+"deterministic"
+],
 "checks": [
-{
-"type": "file_set_match",
-"params": {
-"search_pattern": ""
-}
-}
-],
+{
+"type": "file_set_match",
+"params": {
+"search_pattern": "Kubernetes OR Deprecation OR Policy"
+}
+}
+],
 "eval_script": "/tests/eval.sh",
 "pass_exit_code": 0
 },
 "logging": {
-"required_metrics": ["oracle_coverage", "time_to_first_oracle_hit_ms", "unique_repos_touched"]
+"required_metrics": [
+"oracle_coverage",
+"time_to_first_oracle_hit_ms",
+"unique_repos_touched"
+]
 }
 }

benchmarks/ccb_mcp_compliance/ccx-compliance-184/tests/task_spec.json

Lines changed: 34 additions & 11 deletions
@@ -6,33 +6,56 @@
 "mcp_suite": "ccb_mcp_compliance",
 "prd": {
 "user_story": "As a developer, I want to: Find all Go source files in kubernetes/kubernetes that validate pod security context fields: the runAsNonRoot enforcement, the privileged container check, the capabilities validation, and the seccompProfile enforcement.",
-"constraints": ["Provide specific file paths and repository names in your answer.", "Write your findings to /workspace/answer.json."],
+"constraints": [
+"Provide specific file paths and repository names in your answer.",
+"Write your findings to /workspace/answer.json."
+],
 "success_definition": "Agent successfully identifies relevant files and symbols across all repos in the kubernetes-ecosystem fixture.",
 "seed_prompt": "Find all Go source files in kubernetes/kubernetes that validate pod security context fields: the runAsNonRoot enforcement, the privileged container check, the capabilities validation, and the seccompProfile enforcement."
 },
 "artifacts": {
 "repo_set_id": "kubernetes-ecosystem",
 "oracle": {
-"required_files": [],
+"required_files": [
+"sg-evals/kubernetes--v1.32.0/pkg/apis/core/validation/validation.go",
+"sg-evals/kubernetes--v1.32.0/pkg/apis/core/validation/validation_test.go",
+"sg-evals/kubernetes--v1.32.0/pkg/apis/core/types.go",
+"sg-evals/kubernetes--v1.32.0/pkg/securitycontext/accessors.go",
+"sg-evals/kubernetes--v1.32.0/pkg/securitycontext/util.go",
+"sg-evals/kubernetes--v1.32.0/pkg/kubelet/kuberuntime/security_context_others.go",
+"sg-evals/kubernetes--v1.32.0/pkg/kubelet/kuberuntime/security_context_windows.go",
+"sg-evals/kubernetes--v1.32.0/pkg/kubelet/kuberuntime/security_context_others_test.go",
+"sg-evals/kubernetes--v1.32.0/pkg/kubelet/kuberuntime/security_context_windows_test.go",
+"sg-evals/kubernetes--v1.32.0/staging/src/k8s.io/pod-security-admission/policy/check_runAsNonRoot.go",
+"sg-evals/kubernetes--v1.32.0/staging/src/k8s.io/pod-security-admission/test/fixtures_runAsNonRoot.go",
+"sg-evals/kubernetes--v1.32.0/staging/src/k8s.io/api/core/v1/types.go",
+"sg-evals/kubernetes--v1.32.0/cmd/kubelet/app/server.go"
+],
 "required_symbols": [],
 "required_references": [],
 "dependency_chains": []
 }
 },
 "evaluation": {
-"modes": ["deterministic"],
+"modes": [
+"deterministic"
+],
 "checks": [
-{
-"type": "file_set_match",
-"params": {
-"search_pattern": ""
-}
-}
-],
+{
+"type": "file_set_match",
+"params": {
+"search_pattern": "Kubernetes OR Security OR Context"
+}
+}
+],
 "eval_script": "/tests/eval.sh",
 "pass_exit_code": 0
 },
 "logging": {
-"required_metrics": ["oracle_coverage", "time_to_first_oracle_hit_ms", "unique_repos_touched"]
+"required_metrics": [
+"oracle_coverage",
+"time_to_first_oracle_hit_ms",
+"unique_repos_touched"
+]
 }
 }

benchmarks/ccb_mcp_compliance/ccx-compliance-185/tests/task_spec.json

Lines changed: 33 additions & 11 deletions
@@ -6,33 +6,55 @@
 "mcp_suite": "ccb_mcp_compliance",
 "prd": {
 "user_story": "As a developer, I want to: Find all Go source files in prometheus/prometheus that enforce Prometheus metric naming conventions: the metric name validation (snake_case, unit suffix), the label name validation, and the help text presence check in the client registration.",
-"constraints": ["Provide specific file paths and repository names in your answer.", "Write your findings to /workspace/answer.json."],
+"constraints": [
+"Provide specific file paths and repository names in your answer.",
+"Write your findings to /workspace/answer.json."
+],
 "success_definition": "Agent successfully identifies relevant files and symbols across all repos in the prometheus-monitoring fixture.",
 "seed_prompt": "Find all Go source files in prometheus/prometheus that enforce Prometheus metric naming conventions: the metric name validation (snake_case, unit suffix), the label name validation, and the help text presence check in the client registration."
 },
 "artifacts": {
 "repo_set_id": "prometheus-monitoring",
 "oracle": {
-"required_files": [],
+"required_files": [
+"prometheus/prometheus/config/config.go",
+"prometheus/prometheus/config/config_test.go",
+"prometheus/prometheus/scrape/scrape.go",
+"prometheus/prometheus/scrape/scrape_test.go",
+"prometheus/prometheus/scrape/manager_test.go",
+"prometheus/prometheus/model/labels/labels_common.go",
+"prometheus/prometheus/model/rulefmt/rulefmt.go",
+"prometheus/prometheus/model/textparse/protobufparse.go",
+"prometheus/prometheus/cmd/prometheus/main.go",
+"prometheus/prometheus/storage/remote/codec.go",
+"prometheus/prometheus/notifier/manager.go",
+"prometheus/prometheus/web/ui/mantine-ui/src/promql/utils.ts"
+],
 "required_symbols": [],
 "required_references": [],
 "dependency_chains": []
 }
 },
 "evaluation": {
-"modes": ["deterministic"],
+"modes": [
+"deterministic"
+],
 "checks": [
-{
-"type": "file_set_match",
-"params": {
-"search_pattern": ""
-}
-}
-],
+{
+"type": "file_set_match",
+"params": {
+"search_pattern": "Prometheus OR Metric OR Naming"
+}
+}
+],
 "eval_script": "/tests/eval.sh",
 "pass_exit_code": 0
 },
 "logging": {
-"required_metrics": ["oracle_coverage", "time_to_first_oracle_hit_ms", "unique_repos_touched"]
+"required_metrics": [
+"oracle_coverage",
+"time_to_first_oracle_hit_ms",
+"unique_repos_touched"
+]
 }
 }
