-
Notifications
You must be signed in to change notification settings - Fork 231
Add auto-discovery for regression test datasets with --include-local flag #2312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: edge
Are you sure you want to change the base?
Conversation
2667ae5 to
f692f13
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR modernizes the regression testing infrastructure by implementing auto-discovery for test datasets and adding support for local, git-ignored datasets. The changes enable developers to test with confidential or large conversation data without risking accidental commits, while simplifying the process of adding new test datasets—just drop them in real_data/.local/ and they're automatically discovered.
Key changes:
- Auto-discovery mechanism that scans directories matching pattern
<report_id>-<name>/inreal_data/andreal_data/.local/ - New
--include-localpytest flag to opt-in to testing with local datasets - Refactored
datasets.pywithDatasetInfodataclass and discovery functions replacing hardcoded configuration - Updated download script to default to
.local/directory with--commitflag for public datasets
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| delphi/polismath/regression/datasets.py | Core auto-discovery implementation with DatasetInfo dataclass and directory scanning logic |
| delphi/tests/conftest.py | pytest hooks for --include-local flag, dynamic test parametrization, and dataset summary reporting |
| delphi/tests/test_regression.py | Removed hardcoded parametrization in favor of dynamic discovery via conftest.py |
| delphi/tests/test_datasets.py | Unit tests for directory pattern matching, file checking, and dataset discovery logic |
| delphi/tests/download_real_data.py | New positional arguments (report_id, dataset_name) with --commit flag to control download location |
| delphi/tests/README.md | Updated documentation explaining auto-discovery, local datasets, and new download patterns |
| delphi/pyproject.toml | Added local_dataset marker for pytest |
| delphi/.gitignore | Added real_data/.local/ to git ignore list |
| delphi/polismath/regression/init.py | Updated exports to include new discovery functions and DatasetInfo class |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
All Copilot review comments have been addressed: In commit 30edbff:
In commit 80115f9:
|
ca08d46 to
80115f9
Compare
|
Note: rebased all commits to sign them -- hence why the commits appear after the review. Same content, just added signature. |
…flag - Auto-discover datasets from real_data/ and real_data/.local/ based on directory naming pattern <report_id>-<name>/ - Add --include-local pytest flag to include git-ignored local datasets - Add .local/ to .gitignore for confidential/large datasets - Simplify datasets.py with DatasetInfo dataclass and discovery functions - Add conftest.py with pytest hooks for dynamic test parametrization - Update download_real_data.py to default to .local/ with --commit flag - Add unit tests for dataset discovery in test_datasets.py - Update tests/README.md with new documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Use any() instead of bool(list()) in _check_files for efficiency - Add multiple match validation in find_file - Fix pytest.skip() during collection (use empty parametrize instead) - Add directory context comment to test_regression.py usage - Remove unused list_regression_datasets import - Rename TestDirPattern to TestDirectoryPattern - Improve error message in regression_download.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Warn when local dataset shadows a committed dataset with same name - Add test for include_local=True behavior - Add test for name collision warning 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
80115f9 to
143fe28
Compare
Allow datasets to be valid for regression testing without the Clojure math_blob file. This enables testing when database access is unavailable (e.g., when DATABASE_URL is not set). Changes: - DatasetInfo.is_valid now only requires votes, comments, and golden_snapshot - Added has_clojure_reference property to check if Clojure comparison is possible - Updated documentation to clarify math_blob is optional - Added tests for new behavior 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- test_legacy_clojure_regression.py: Replace hardcoded ["biodiversity", "vw"] with auto-discovery using clojure_dataset fixture. Only includes datasets with has_clojure_reference=True (i.e., have math_blob for Clojure comparison). Respects --include-local flag. - regression_download.py: After download, check for missing golden_snapshot.json and offer to create them interactively. Shows command to create later if user declines. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
143fe28 to
6e0b5d7
Compare
Clojure comparison tests only need votes, comments, and math_blob. They compare against the Clojure output, not the Python golden snapshot. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
7ceb99c to
075797b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (9)
delphi/scripts/regression_download.py:363
- The example references
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filenamepython scripts/regression_download.py.
delphi/scripts/regression_download.py:17 - The usage examples reference
download_real_data.pybut the script file is actually namedregression_download.py. Update the references to use the correct filename to avoid confusion.
delphi/scripts/regression_download.py:31 - The examples reference
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filenamepython scripts/regression_download.py.
delphi/scripts/regression_download.py:355 - The example references
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filenamepython scripts/regression_download.py.
delphi/scripts/regression_download.py:359 - The example references
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filenamepython scripts/regression_download.py.
delphi/scripts/regression_download.py:367 - The example references
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filenamepython scripts/regression_download.py.
delphi/scripts/regression_download.py:20 - The usage example references
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filename.
delphi/scripts/regression_download.py:26 - The usage example references
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filename.
delphi/scripts/regression_download.py:351 - The example references
download_real_data.pybut the script file is actually namedregression_download.py. Update to use the correct filenamepython scripts/regression_download.py.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- _compute_vote_stats: Replace per-row/per-column loops with numpy vectorized operations using boolean masks and axis-based sums. This eliminates O(rows + cols) Python loops. - bench_update_votes.py: Make standalone by accepting CSV path directly instead of depending on datasets package. Add TODO for using datasets package once PR compdemocracy#2312 is merged. Combined with pivot_table optimization, achieves ~10x speedup on bg2050 dataset (1M votes): 18-30s -> 2.5s (~400k votes/sec). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
|
Ready for human review and merge :-) |
- _compute_vote_stats: Replace per-row/per-column loops with numpy vectorized operations using boolean masks and axis-based sums. This eliminates O(rows + cols) Python loops. - bench_update_votes.py: Make standalone by accepting CSV path directly instead of depending on datasets package. Add TODO for using datasets package once PR compdemocracy#2312 is merged. Combined with pivot_table optimization, achieves ~10x speedup on bg2050 dataset (1M votes): 18-30s -> 2.5s (~400k votes/sec). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
|
Suggested Updates here: jucor#3 |
Summary
This PR adds support for testing with local datasets that are not committed to the repository. This enables:
real_data/.local/and it's auto-discovered, no config changes neededChanges
real_data/andreal_data/.local/based on directory naming pattern<report_id>-<name>/--include-localpytest flag to include git-ignored local datasets.local/to.gitignorefor confidential/large datasetsdatasets.pywithDatasetInfodataclass and discovery functionsconftest.pywith pytest hooks for dynamic test parametrizationdownload_real_data.pyto default to.local/with--commitflagtest_datasets.pytests/README.mdwith new documentationUsage
Test plan
pytest delphi/tests/test_datasets.pyto verify dataset discoverypytest delphi/tests/test_regression.pywith committed datasetsreal_data/.local/and verify it's discovered with--include-local.local/directory is properly git-ignored🤖 Generated with Claude Code