Commit d41b803

LoCoBench Bot and claude committed
refactor: move configs/ and scripts/ to repo root
Ralph placed these under ralph/ but they belong at the repo root since they're primary project artifacts, not ralph-specific files. Updated README.md paths to match.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent e298e04 commit d41b803
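
The README update described in the commit message amounts to dropping the `ralph/` prefix from two directory paths. A minimal Python sketch of that rewrite follows, assuming only the `ralph/scripts/` and `ralph/configs/` prefixes need to change. The name `strip_ralph_prefix` is hypothetical and not part of the repository, and the commit's other README edits (removing `cd ralph/`, changing `../benchmarks/` to `./benchmarks/`) are not covered.

```python
import re

# Hypothetical helper, not part of the repository: rewrites README-style
# references to the two directories this commit moved to the repo root.
_MOVED_DIRS = ("scripts", "configs")

# Match "ralph/scripts/" or "ralph/configs/" only when "ralph/" is not
# preceded by another path character, so "other_ralph/scripts/" is untouched.
_PATTERN = re.compile(r"(?<![\w./-])ralph/(" + "|".join(_MOVED_DIRS) + r")/")


def strip_ralph_prefix(text: str) -> str:
    """Return text with ralph/scripts/ and ralph/configs/ rewritten to root paths."""
    return _PATTERN.sub(r"\1/", text)


if __name__ == "__main__":
    before = "bash ralph/configs/locobench_3config.sh --baseline-only"
    print(strip_ralph_prefix(before))
```

Paths such as `ralph/.last-branch` are left alone because only the two moved directories are matched.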

21 files changed, 240 additions and 10 deletions

README.md

Lines changed: 7 additions & 9 deletions
@@ -62,11 +62,9 @@ Each benchmark directory contains:

 ## Metrics Extraction Pipeline

-The `ralph/scripts/` directory contains a stdlib-only Python 3.10+ pipeline for extracting deterministic metrics from Harbor run output:
+The `scripts/` directory contains a stdlib-only Python 3.10+ pipeline for extracting deterministic metrics from Harbor run output:

 ```bash
-cd ralph/
-
 # Generate evaluation report from Harbor runs
 python3 scripts/generate_eval_report.py \
     --runs-dir /path/to/runs/official/ \
@@ -75,7 +73,7 @@ python3 scripts/generate_eval_report.py \
 # Generate LLM judge context files
 python3 -m scripts.ccb_metrics.judge_context \
     --runs-dir /path/to/runs/official/ \
-    --benchmarks-dir ../benchmarks/ \
+    --benchmarks-dir ./benchmarks/ \
     --output-dir ./judge_contexts/
 ```

@@ -85,23 +83,23 @@ The report generator produces:
 - `harness_configs.json` — exact harness configuration per run
 - CSV files per table for downstream analysis

-See `python3 ralph/scripts/generate_eval_report.py --help` for all options.
+See `python3 scripts/generate_eval_report.py --help` for all options.

 ---

 ## Running with Harbor

-Each benchmark has a shell runner in `ralph/configs/` that executes all tasks across the 3-config matrix:
+Each benchmark has a shell runner in `configs/` that executes all tasks across the 3-config matrix:

 ```bash
 # Run all 50 LoCoBench tasks across 3 configs
-bash ralph/configs/locobench_3config.sh
+bash configs/locobench_3config.sh

 # Run only the baseline config
-bash ralph/configs/locobench_3config.sh --baseline-only
+bash configs/locobench_3config.sh --baseline-only

 # Run only MCP-Full config
-bash ralph/configs/locobench_3config.sh --full-only
+bash configs/locobench_3config.sh --full-only
 ```

 Available runners: `locobench_3config.sh`, `swebenchpro_3config.sh`, `bigcode_3config.sh`, `k8s_docs_3config.sh`.
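
After a move like this, one way to catch leftover references is to scan the README for the old prefixes. The sketch below assumes the stale prefixes are `ralph/scripts/` and `ralph/configs/`, as seen in the removed lines of this diff. `find_stale_paths` is a hypothetical name, not something in the repository.

```python
from pathlib import Path

# Assumed old prefixes, taken from the removed lines in the README.md diff.
STALE_PREFIXES = ("ralph/scripts/", "ralph/configs/")


def find_stale_paths(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention an old prefix."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(prefix in line for prefix in STALE_PREFIXES):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Only scans when a README.md is present in the working directory.
    readme = Path("README.md")
    if readme.exists():
        for number, line in find_stale_paths(readme.read_text(encoding="utf-8")):
            print(f"README.md:{number}: {line}")
```

An empty result for the updated README would suggest that every path in it now points at the repo-root directories.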

ralph/.last-branch

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+ralph/benchmark-execution-pipeline
