Skip to content

Conversation

@anurag2787
Copy link
Contributor

Resolves #1988

Description

This PR improves how we generate snapshots by making the handling of active entities more consistent. Earlier, different parts of the code used different ways to filter chapters by is_active, and some didn’t properly handle cases where repositories were empty
Now, we’ve standardized the process using the manager patterns active_chapters active_projects active_releases

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 2, 2025

Summary by CodeRabbit

  • Refactor
    • Improved backend release filtering logic with specialized managers
    • Streamlined snapshot processing with refined data retrieval methods

Walkthrough

Introduces a specialized Django manager for filtering active releases and updates snapshot processing to use active managers for chapters, projects, and releases with explicit creation-time boundaries instead of generic is_active filtering.

Changes

Cohort / File(s) Summary
Release Manager Infrastructure
backend/apps/github/models/managers/release.py, backend/apps/github/models/release.py
Adds new ActiveReleaseManager class that filters releases by draft status, pre-release status, and non-empty repository. Integrates managers into the Release model with both default and active-specific accessors.
Snapshot Processing Command
backend/apps/owasp/management/commands/owasp_process_snapshots.py
Refactors entity filtering for chapters, projects, and releases to use active managers (Chapter.active_chapters, Project.active_projects, Release.active_releases) with explicit creation-time boundaries (created_at__gte=start_at, created_at__lte=end_at) instead of get_new_items().filter(is_active=True) pattern.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

  • Manager implementation: Straightforward filtering logic with select_related optimization and clear filtering criteria
  • Command refactoring: Verify that active managers properly replace previous is_active filtering and that timeframe boundaries are correctly applied to all three entity types

Pre-merge checks and finishing touches

✅ Passed checks (4 passed)
Check name Status Explanation
Title Check ✅ Passed The title "Improved Snapshot Generation" is directly related to the main change in this pull request. The changeset adds new manager classes and updates the snapshot processing job to use standardized active entity managers (active_releases, active_chapters, active_projects), which directly represents an improvement to how snapshots are generated. The title is concise and specific enough that a teammate reviewing the git history would understand the primary objective of the change, even if it doesn't detail every aspect of the implementation.
Linked Issues Check ✅ Passed The pull request directly addresses all primary coding requirements from issue #1988. The changes create an ActiveReleaseManager to properly filter releases while ensuring repositories are not empty [#1988], add active_releases manager to the Release model, and update the snapshot processing job to use the standardized active_* managers (active_chapters, active_projects, active_releases) instead of custom is_active filtering [#1988]. The implementation ensures consistent handling of active entities across all entity types as required, and the timeframe filtering (created_at__gte=start_at and created_at__lte=end_at) tightens the processing boundaries appropriately.
Out of Scope Changes Check ✅ Passed All changes in this pull request are directly in scope with the linked issue #1988. The additions of ActiveReleaseManager and the managers to the Release model are specifically needed to implement the required active_releases manager pattern. The modifications to owasp_process_snapshots.py to use active_* managers directly correspond to the stated requirement to update snapshot processing to utilize these managers. No unrelated refactoring, formatting changes, or extraneous features are present in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Nov 2, 2025

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
backend/apps/github/models/managers/release.py (1)

6-23: LGTM! Clean implementation of active release filtering.

The manager correctly filters out drafts, pre-releases, and empty repositories. The use of select_related("repository") is a good performance optimization since we're filtering on repository fields.

Consider enhancing the docstring to be more explicit about the filtering criteria:

 class ActiveReleaseManager(models.Manager):
-    """Active releases manager."""
+    """Manager for active releases.
+    
+    Returns releases that are published (not drafts), stable (not pre-releases),
+    and have a non-empty repository.
+    """
 
     def get_queryset(self):
-        """Get queryset of active releases.
-        
-        Filters out draft and pre-releases, and ensures repository exists.
-        """
+        """Get queryset of active releases.
+        
+        Returns:
+            QuerySet: Releases filtered by:
+                - is_draft=False
+                - is_pre_release=False  
+                - repository__is_empty=False (also excludes null repositories)
+        """

Note: The repository__is_empty=False filter will also exclude releases where repository is NULL. If releases without repositories should be included, you'll need to adjust the filter logic.

backend/apps/owasp/management/commands/owasp_process_snapshots.py (1)

69-85: LGTM! Excellent consistency improvement using active entity managers.

The changes successfully achieve the PR objectives by:

  • Replacing generic is_active filtering with specialized active managers
  • Using consistent patterns across all three entity types (chapters, projects, releases)
  • Maintaining proper time boundaries with created_at__gte and created_at__lte

The approach ensures that:

  1. Chapter.active_chapters filters active chapters
  2. Project.active_projects filters active projects
  3. Release.active_releases filters non-draft, non-prerelease releases from non-empty repositories

All three now apply time boundaries uniformly, eliminating the previous inconsistencies mentioned in issue #1988.

Optional consideration: Releases are filtered by created_at, which is consistent with chapters and projects. However, releases semantically have a published_at timestamp that might be more meaningful for snapshot purposes. The current approach (using created_at) is defensible and maintains consistency, but if you want to capture releases by their publication date instead, you'd need to adjust the filter and handle potential null published_at values.

Given that the goal is consistency and created_at is guaranteed to be non-null, the current implementation is solid.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f86ed08 and f8d2d62.

📒 Files selected for processing (3)
  • backend/apps/github/models/managers/release.py (1 hunks)
  • backend/apps/github/models/release.py (1 hunks)
  • backend/apps/owasp/management/commands/owasp_process_snapshots.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 1967
File: backend/apps/owasp/api/internal/filters/project_health_metrics.py:24-27
Timestamp: 2025-08-04T15:55:08.017Z
Learning: In the OWASP Nest project, the `is_active` filter in `ProjectHealthMetricsFilter` is intentionally designed as a workaround to eliminate inactive projects from the dashboard. It only filters FOR active projects when `value=True`, and returns an empty Q() when `value=False` to avoid showing inactive projects. This is not meant to be a general boolean filter but a specific solution to exclude inactive projects from the project health metrics dashboard.
🧬 Code graph analysis (2)
backend/apps/github/models/release.py (4)
backend/apps/github/models/managers/release.py (1)
  • ActiveReleaseManager (6-23)
backend/apps/github/models/mixins/release.py (1)
  • ReleaseIndexMixin (8-70)
backend/apps/common/models.py (2)
  • BulkSaveModel (10-34)
  • TimestampedModel (37-46)
backend/apps/github/models/common.py (1)
  • NodeModel (76-90)
backend/apps/owasp/management/commands/owasp_process_snapshots.py (5)
backend/apps/owasp/api/internal/nodes/snapshot.py (4)
  • new_chapters (35-37)
  • new_issues (40-42)
  • new_projects (45-47)
  • new_releases (50-52)
backend/apps/owasp/models/chapter.py (1)
  • Chapter (21-217)
backend/apps/owasp/api/internal/queries/snapshot.py (1)
  • snapshot (14-22)
backend/apps/owasp/models/project.py (1)
  • Project (37-384)
backend/apps/github/models/release.py (1)
  • Release (11-139)
🔇 Additional comments (3)
backend/apps/github/models/release.py (2)

7-7: LGTM! Proper import for the new manager.

The import is correctly placed with other model-related imports.


14-16: LGTM! Manager declarations follow Django best practices.

Explicitly defining both the default objects manager and the specialized active_releases manager is the correct approach. This matches the pattern used by Chapter and Project models, ensuring consistency across the codebase.

The order matters in Django—the first manager declared becomes the default manager, so keeping objects first maintains expected ORM behavior.

backend/apps/owasp/management/commands/owasp_process_snapshots.py (1)

73-90: LGTM! Correct to keep Issues and Users using the generic helper.

Issue and User models don't have specialized active managers (and likely don't need them based on their domain semantics), so continuing to use get_new_items for these entities is appropriate. This selective application of active managers demonstrates good judgment—only entities with meaningful "active" states get the specialized filtering.

@anurag2787
Copy link
Contributor Author

@arkid15r for new_issue if we use OpenIssueManager that is predefined but it only returns issues from the last 90 days so So instead, can we directly use this ?

new_issues = Issue.objects.filter(
                created_at__gte=snapshot.start_at,
                created_at__lte=snapshot.end_at,
                state="open",
                repository__project__isnull=False,
            ).select_related("repository")

@anurag2787 anurag2787 marked this pull request as ready for review November 2, 2025 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve snapshot generation

1 participant