feat: add package registry and maintainability check (#1400) #1403
RuchitAgrawal wants to merge 7 commits into
Signed-off-by: ruchitagrawal <rragrawal16@gmail.com>
@RuchitAgrawal Thanks for the PR! Could you suggest a few packages that would fail this check? That would help us identify good candidates to include in integration tests.
@RuchitAgrawal Looks like the integration tests are failing. You can search for "case failed" in the log to see which test is failing.
Signed-off-by: ruchitagrawal <rragrawal16@gmail.com>
@behnazh-w Here are a few packages that would fail the check evaluation:
Found the issue and pushed a fix.
Thanks a lot. For the rest, we can add new integration tests, since each test can cover a different scenario or failure reason.
Signed-off-by: ruchitagrawal <rragrawal16@gmail.com>
@behnazh-w Thanks for the feedback! I've added the integration tests as suggested. Here is what was done:
All three new tests follow the same pattern: the policy requires the check to pass, but since the check correctly fails for these packages, we set expect_fail: true. Regarding the CI failures visible on the PR, those build tests were already failing on main before this PR. These changes don't touch any of those test cases.
Signed-off-by: ruchitagrawal <rragrawal16@gmail.com>
Hi @behnazh-w, Out of the 18 failing tests, 1 was The problem was I applied the registry-maintainability policy to all arrow versions using a wildcard, assuming arrow@1.3.0 would pass. Fixed it in the latest commit, by scoping the policy to only |
# Confirm registry presence and retrieve last release date.
try:
    publish_dt: datetime = registry_info.package_registry.find_publish_timestamp(
The “release recency” signal checks the analyzed PURL version’s publish date, not the package’s latest release date. That means an old pinned version of an actively maintained package fails as “unmaintained,” and last_release_date is misleading.
Added a _get_latest_release_timestamp helper that fetches the latest release date of the package. For PyPI, it reuses the cached package-level JSON and calls get_latest_release_upload_time(). For npm, it calls get_latest_version() then queries deps.dev for that version's timestamp. This is now what drives days_since_release and last_release_date, with the specific version's timestamp as a fallback.
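The fallback described in this reply can be sketched as follows. This is a minimal illustration, not the code in the PR: `pick_release_timestamp`, `days_since`, and the two callables are hypothetical stand-ins for the registry lookups, and treating any lookup exception as "no latest timestamp" is an assumption.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def pick_release_timestamp(
    fetch_latest: Callable[[], Optional[datetime]],
    fetch_version: Callable[[], Optional[datetime]],
) -> Optional[datetime]:
    """Prefer the package's latest release date; fall back to the analyzed version's."""
    try:
        latest = fetch_latest()
    except Exception:  # assumption: any lookup failure triggers the fallback
        latest = None
    return latest if latest is not None else fetch_version()


def days_since(release: datetime, now: datetime) -> int:
    """Whole days elapsed between a release timestamp and `now`."""
    return (now - release).days


# An old pinned version of a package that still ships new releases.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
version_dt = datetime(2019, 1, 1, tzinfo=timezone.utc)
latest_dt = datetime(2024, 5, 1, tzinfo=timezone.utc)

chosen = pick_release_timestamp(lambda: latest_dt, lambda: version_dt)
fallback = pick_release_timestamp(lambda: None, lambda: version_dt)
```

With the latest release driving the calculation, the pinned 2019 version is measured against the 2024 release and no longer looks unmaintained; the version's own timestamp is used only when the latest one cannot be obtained.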
    return urllib.parse.urljoin(pkg_registry.registry_url, f"project/{name}/{version}/")

if isinstance(pkg_registry, NPMRegistry):
    return f"https://www.npmjs.com/package/{name}/v/{version}"
npm scoped package URLs omit the namespace. For pkg:npm/@scope/name@1.0.0, the report link becomes /package/name/v/1.0.0 instead of /package/@scope/name/v/1.0.0.
You could instead check the namespace first:
package_name = f"{namespace}/{name}" if namespace else name
return f"https://www.npmjs.com/package/{package_name}/v/{version}"
Fixed by adding namespace as a parameter to _build_registry_url and constructing the npm URL as f"{namespace}/{name}" when a namespace is present, so scoped packages like @scope/name generate the correct link.
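A self-contained sketch of the namespace-aware URL construction discussed in this thread. The function name `build_registry_url`, the string `ecosystem` argument, and the PyPI base URL are assumptions for illustration; the PR's `_build_registry_url` works on registry objects and its exact signature is not shown here.

```python
from typing import Optional
from urllib.parse import urljoin


def build_registry_url(
    ecosystem: str,
    namespace: Optional[str],
    name: str,
    version: str,
    pypi_base: str = "https://pypi.org/",  # assumed value of registry_url
) -> Optional[str]:
    """Return a link to the package version page, or None if unsupported."""
    if ecosystem == "pypi":
        return urljoin(pypi_base, f"project/{name}/{version}/")
    if ecosystem == "npm":
        # Scoped packages carry the scope (e.g. "@scope") as the namespace.
        package_name = f"{namespace}/{name}" if namespace else name
        return f"https://www.npmjs.com/package/{package_name}/v/{version}"
    return None


scoped = build_registry_url("npm", "@scope", "name", "1.0.0")
unscoped = build_registry_url("npm", None, "name", "1.0.0")
pypi = build_registry_url("pypi", None, "arrow", "1.3.0")
```

The scoped call yields a link containing /package/@scope/name/v/1.0.0, which is the form the review comment asks for.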
Signed-off-by: ruchitagrawal <rragrawal16@gmail.com>
@behnazh-w Thanks for reviewing the PR. I have addressed all the comments in the latest commit. Kindly take another look when you get a chance.
Summary
Adds a new check mcn_registry_maintainability_1 that validates whether a package exists on its public registry and is actively maintained.
Description of changes
The check uses three signals when available:
- find_publish_timestamp() to confirm the package exists and check how many days have passed since the last release. Exceeding the threshold fails the check.
- yanked flag for PyPI packages and the deprecated field for npm packages from existing registry JSON responses. A yanked or deprecated package always fails, regardless of release age.
- get_repo_data() to check if the repo is archived and how recently code was pushed. An archived repo always fails.
Results include remediation guidance and links to the registry page and source repository. The inactivity threshold is configurable via defaults.ini under registry_maintainability (default: 365 days).
Related issues
Closes #1400
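The way the three signals from the description combine into a pass/fail result might look roughly like the sketch below. It is an illustration under stated assumptions, not the check's implementation: the `Signals` dataclass, `evaluate`, `load_threshold`, and the option name `max_inactivity_days` are hypothetical (only the section name registry_maintainability and the 365-day default come from the description), and the repository push-recency signal is left out because the description does not say how it affects the outcome.

```python
import configparser
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Signals:
    """Signals gathered for one package; None means the signal was unavailable."""

    days_since_release: Optional[int] = None
    yanked_or_deprecated: Optional[bool] = None
    repo_archived: Optional[bool] = None


def load_threshold(ini_text: str) -> int:
    """Read the inactivity threshold; the option name here is hypothetical."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getint("registry_maintainability", "max_inactivity_days", fallback=365)


def evaluate(signals: Signals, threshold_days: int) -> Tuple[bool, List[str]]:
    """Return (passed, reasons); any failing signal fails the whole check."""
    reasons: List[str] = []
    if signals.yanked_or_deprecated:
        reasons.append("package version is yanked or deprecated")
    if signals.repo_archived:
        reasons.append("source repository is archived")
    if signals.days_since_release is not None and signals.days_since_release > threshold_days:
        reasons.append(f"no release in the last {threshold_days} days")
    return (not reasons, reasons)


threshold = load_threshold("[registry_maintainability]\nmax_inactivity_days = 365\n")
stale = evaluate(Signals(days_since_release=400), threshold)
yanked = evaluate(Signals(days_since_release=10, yanked_or_deprecated=True), threshold)
healthy = evaluate(Signals(days_since_release=10, repo_archived=False), threshold)
```

Collecting every failure reason, rather than returning at the first one, keeps the result useful for the remediation guidance the description mentions.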
Checklist
verified label should appear next to all of your commits on GitHub.