SCANPY-248 Auto detect test code#318
Conversation
SummaryThis PR adds automatic test directory detection for Python projects. The feature infers What reviewers should knowCore logic: The Key decision: Configuration integration: In Test coverage: Comprehensive unit tests ( Minor setup:
|
0cb9ba3 to
dd7a4af
Compare
c7dee3a to
4150bcc
Compare
4150bcc to
f0bd90b
Compare
c31219c to
f53f7e5
Compare
1dff852 to
1cf9e30
Compare
1cf9e30 to
9d12b07
Compare
9d12b07 to
ce9f8f0
Compare
170d90e to
e41141b
Compare
|
There was a problem hiding this comment.
Solid implementation overall — the chain-of-responsibility pattern is clean, the test coverage is thorough, and the integration with the existing configuration loader is correct. One real bug needs fixing before merge.
The previously flagged get_project_test_files pagination issue is still open.
| if result is None: | ||
| continue # file absent or no testpaths key — try next source | ||
| if result: | ||
| return {SONAR_TESTS: result} |
There was a problem hiding this comment.
Bug: The three-way return distinguishes None (file absent / no key) from "" (key declared, all paths invalid) from a non-empty string (success), but the caller in load() checks if result: which collapses None and "" into the same branch only on the truthy/falsy axis. The logic actually works today because of the preceding if result is None: continue, but the empty-string sentinel is a fragile contract.
More importantly: _existing_paths silently skips paths that are outside the project root (returns them filtered out), so if all declared paths are outside the root, paths is [] and the function falls into the warning + return "" branch — correct. But if _existing_paths raises an unexpected exception (e.g. PermissionError from is_dir()), the outer try/except in _load_from_pyproject_toml catches it and returns None instead of "", letting the chain fall through. This was flagged before and acknowledged, but the sentinel design amplifies the impact — the caller cannot distinguish between the two cases.
No code change required for the acknowledged filesystem I/O issue, but the return "" sentinel deserves a comment explaining why "" and not None (it's used to stop the chain when testpaths was declared but all paths were invalid).
- Mark as noise
| logging.debug(f"Error reading pyproject.toml for pytest testpaths: {e}") | ||
| return None | ||
| testpaths = toml_dict.get("tool", {}).get("pytest", {}).get("ini_options", {}).get("testpaths") | ||
| if not testpaths: |
There was a problem hiding this comment.
The testpaths value from pyproject.toml can be a string (space-separated) in addition to a list, per pytest docs. The current code handles that via testpaths.split() when it's not a list. However, if testpaths is an integer or another scalar type in the TOML (e.g. testpaths = 1), str(p).strip() will coerce it silently rather than skipping it, and _existing_paths will then look for a directory named "1". This is a minor edge case but worth guarding:
if isinstance(testpaths, list):
raw = [str(p) for p in testpaths if str(p).strip()]
elif isinstance(testpaths, str):
raw = [p for p in testpaths.split() if p]
else:
logging.debug(f"Unexpected type for testpaths in pyproject.toml: {type(testpaths)}")
return None- Mark as noise
| resp.raise_for_status() | ||
| return resp.json().get("task", {}) | ||
|
|
||
| def wait_for_ce_task_by_id(self, task_id: str) -> None: |
There was a problem hiding this comment.
Logic duplication: wait_for_ce_task_by_id polls with MAX_RETRIES * 10 iterations and time.sleep(2) between them. The existing wait_for_analysis_completion method (not shown in the diff but referenced in the fallback path) almost certainly uses a similar polling loop. If the retry budget, sleep interval, or terminal-state set ever needs to change, both loops must be updated. Extract the shared polling logic into a private _poll_until helper, or at minimum define the sleep duration and terminal states as class-level constants so both methods share them.
- Mark as noise
Seppli11
left a comment
There was a problem hiding this comment.
Overall, the cahnge looks good. It's really cool to see that we're finally parsing settings from the pyproject.toml file and co. (Hopefully the python version is next 🤞 )
One thing I'm not sure about is if it makes sense to also have a heuristic based on the folders in the scanner. It seems to create quite a few strange places where, IMO, the outcome isn't that straight forward.
The other, more nitpicky thing, python_project_loader doesn't seem like an appropriate name for this module, given how it is used and how closely tight it usage is to the present of the test property.
Let me know if you want to discuss anything
| # Running python_project_loader unconditionally would emit confusing warnings about | ||
| # pytest config even when the result would be discarded. | ||
| if SONAR_TESTS not in resolved_properties: | ||
| resolved_properties.update(python_project_loader.load(base_dir)) |
There was a problem hiding this comment.
In a more holistic view, applying the properties from the python_project_loader only if sonar.tests is not set doesn't really make sense to me. If we for example extend the python_project_loader to also detect the python version, things become weird.
Maybe it could make sense renaming the module to python_project_test_source_loader or similar (maybe you find a catchier name 😅 )
| return _load_from_ini_file(base_dir, "setup.cfg", _SETUP_CFG_PYTEST_SECTION) | ||
|
|
||
|
|
||
| def _load_from_filesystem(base_dir: pathlib.Path) -> Optional[str]: |
There was a problem hiding this comment.
Does it make sense to add this heuristic here as well, given that sonar-python itself has a more extensive set of heuristics.
There was a problem hiding this comment.
I think the heuristic from sonar-python is a bit of a crutch in the sense that it will only disable rules and still report metrics based on the FileType, because we can't contradict the scanner for the file type classification.
This will reduce noise but ultimately still produce an "inconsistent" analysis.
Having it here makes the analysis more consistent because the rule execution is in sync with the metrics reporting. It's an argument to use pysonar instead of the generic sonar-scanner-cli because here, the manual setup of sonar.tests is truly optional (assuming what is inferred matches the user expectations).
There was a problem hiding this comment.
That is a good point. Maybe it would make sense to only fall back to the heuristic if the user hasn't set anything, rather than we haven't found a valid test folder
| self._wait_for_ce_completion(workdir) | ||
| return process | ||
|
|
||
| def _wait_for_ce_completion(self, workdir: pathlib.Path) -> None: |
There was a problem hiding this comment.
This method does a lot of automatic checks and fallbacks. Is this really necessary for a test?
| resp.raise_for_status() | ||
| return resp.json() | ||
|
|
||
| def get_latest_ce_task(self) -> Optional[dict]: |
There was a problem hiding this comment.
this seems to be never called. Same applies to search_projects and get_project_measures
| testpaths = [] | ||
| """, | ||
| ) | ||
| result = python_project_loader.load(Path(".")) |
There was a problem hiding this comment.
If we keep the heuristic, this could lead to a weird behavior where the pyrpoject.tol loader returned an empty set, while the file heuristic loader found a folder. This would now override what the user set
| self.fs.create_dir("tests") | ||
| # nonexistent/ is not on disk — only tests/ should be returned | ||
| result = python_project_loader.load(Path(".")) | ||
| self.assertEqual(result[SONAR_TESTS], "tests") |
There was a problem hiding this comment.
A similar issue could appear here, where for whatever reason, all the user specified tests are not valid paths. If the heuristic picks them up, we're overriding what the user explicitly told us.
| def test_load_from_pytest_ini(self): | ||
| self.fs.create_file( | ||
| "pytest.ini", | ||
| contents="[pytest]\ntestpaths = tests integration\n", |
There was a problem hiding this comment.
nitpick: multiline strings would be nicer




No description provided.