detect bots #1409

Closed
wants to merge 1 commit into from

hussam789 (Collaborator) commented Dec 23, 2024

PR Type

Enhancement


Description

  • Added centralized bot detection utility functions in utils.py:
    • is_user_name_a_bot(): Detects bot users based on common username patterns
    • is_pr_description_indicating_bot(): Identifies bot-generated PRs from description content
  • Enhanced bot detection across all git providers (GitHub, GitLab, Bitbucket):
    • Now checks both username patterns and PR descriptions
    • Added support for detecting automated PR creators like Snyk
    • Improved logging for bot detection cases
  • Standardized bot detection approach across all platforms for more consistent behavior
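A minimal sketch of how the two helpers described above could fit together. The username indicators are the ones quoted later in this thread; the description marker and the `looks_automated` helper are assumptions made for illustration and are not taken from the PR diff.

```python
from typing import Optional

# Username indicators as quoted in the review of this PR.
BOT_INDICATORS = ['codium', 'bot_', 'bot-', '_bot', '-bot', 'qodo', 'service', 'github', 'jenkins', 'auto',
                  'cicd', 'validator', 'ci-', 'assistant', 'srv-']

# ASSUMPTION: hypothetical description marker; the PR only says Snyk-style PRs are detected.
DESCRIPTION_MARKERS = ['snyk has created this pr']


def is_user_name_a_bot(name: Optional[str]) -> bool:
    """Substring check of the lowercased username against the indicator list."""
    if not name:
        return False
    lowered = name.lower()
    return any(indicator in lowered for indicator in BOT_INDICATORS)


def is_pr_description_indicating_bot(description: Optional[str]) -> bool:
    """Substring check of the lowercased description against the marker list."""
    if not description:
        return False
    lowered = description.lower()
    return any(marker in lowered for marker in DESCRIPTION_MARKERS)


def looks_automated(user_name: Optional[str], description: Optional[str]) -> bool:
    """Hypothetical helper: the combined check a provider handler could run."""
    return is_user_name_a_bot(user_name) or is_pr_description_indicating_bot(description)
```

Checking the username first keeps the common case cheap; the description is only scanned when the name looks like a regular user.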

Changes walkthrough 📝

Relevant files

Enhancement

utils.py: Add bot detection utility functions
pr_agent/git_providers/utils.py (+19/-0)

  • Added new function is_user_name_a_bot() to detect bot usernames based on common indicators
  • Added new function is_pr_description_indicating_bot() to detect bot-generated PRs based on description patterns

bitbucket_app.py: Improve bot detection in Bitbucket handler
pr_agent/servers/bitbucket_app.py (+12/-1)

  • Enhanced bot detection by checking username and PR description
  • Integrated new bot detection utility functions

github_app.py: Enhance GitHub bot detection capabilities
pr_agent/servers/github_app.py (+16/-6)

  • Enhanced bot detection logic to check usernames and PR descriptions
  • Updated is_bot_user() function to use new utility functions

gitlab_webhook.py: Improve GitLab bot detection mechanism
pr_agent/servers/gitlab_webhook.py (+7/-3)

  • Refactored bot detection to use new utility functions
  • Added PR description check for bot detection

💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🏅 Score: 92
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Bot Detection Coverage

    The bot indicators list may need to be reviewed to ensure comprehensive coverage of bot patterns while avoiding false positives for legitimate usernames

        bot_indicators = ['codium', 'bot_', 'bot-', '_bot', '-bot', 'qodo', "service", "github", "jenkins", "auto",
                          "cicd", "validator", "ci-", "assistant", "srv-"]

    Error Handling

    The bot detection logic could benefit from more specific error handling rather than catching all exceptions generically

        except Exception as e:
            get_logger().error(f"Failed 'is_bot_user' logic: {e}")
        return False
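
One way to act on this observation is to catch only the exception that malformed input is expected to raise, so that unrelated bugs still surface. The sketch below is an assumption-laden stand-in, not the PR's code: the function signature is simplified, the standard `logging` module replaces the project's `get_logger()`, and the username check is reduced to a single indicator.

```python
import logging

logger = logging.getLogger(__name__)  # stand-in for the project's get_logger()


def is_bot_user(sender, sender_type) -> bool:
    """Return True when the sender looks like a bot; tolerate non-string fields."""
    try:
        if sender_type.lower() == "bot":
            return True
        if "bot_" in sender.lower():  # simplified stand-in for the real username check
            return True
    except AttributeError as e:
        # Raised when sender or sender_type is None or another non-string value.
        logger.warning("Malformed sender fields in 'is_bot_user': %s", e)
    return False
```

With this shape, `is_bot_user(None, None)` logs a warning and returns False, while an unexpected error elsewhere in the body would propagate instead of being silently logged.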

    Contributor

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category: Possible issue
    Improve bot name detection accuracy by using word boundary regex patterns instead of simple string containment

    Use word-boundary regex matching instead of simple string containment to reduce
    false positives. For example, 'autobahn' currently matches as a bot because it
    contains 'auto'.

    pr_agent/git_providers/utils.py [111]

    -return any(indicator in name.lower() for indicator in bot_indicators)
    +return any(re.search(rf'\b{re.escape(indicator)}\b', name.lower()) for indicator in bot_indicators)
    Suggestion importance[1-10]: 8

    Why: The current implementation can produce false positives in bot detection. Word boundaries in the regex prevent an indicator such as 'auto' from matching inside a longer word like 'autobahn'.

    Score: 8
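
Before adopting the change, it may help to run both variants side by side. The sketch below uses the indicator list quoted in the reviewer guide and a handful of hypothetical usernames. One detail worth checking: Python's `\b` treats `_` as a word character, so indicators that begin or end with an underscore behave differently under the word-boundary pattern.

```python
import re

# Indicator list as quoted in the reviewer guide for this PR.
bot_indicators = ['codium', 'bot_', 'bot-', '_bot', '-bot', 'qodo', "service", "github", "jenkins", "auto",
                  "cicd", "validator", "ci-", "assistant", "srv-"]


def substring_match(name: str) -> bool:
    """Original behavior: plain substring containment."""
    return any(indicator in name.lower() for indicator in bot_indicators)


def boundary_match(name: str) -> bool:
    """Suggested behavior: each indicator wrapped in word boundaries."""
    return any(re.search(rf'\b{re.escape(indicator)}\b', name.lower()) for indicator in bot_indicators)


# Hypothetical usernames, chosen only to illustrate the differences.
for sample in ["autobahn", "customer-service-jane", "my-bot", "bot_runner", "deploy_bot", "robot"]:
    print(f"{sample!r}: substring={substring_match(sample)} boundary={boundary_match(sample)}")
```

In this run, 'autobahn' stops matching under the word-boundary variant, 'customer-service-jane' still matches under both, and 'bot_runner' and 'deploy_bot' match only under the substring variant. 'robot' matches under neither, since the list contains no bare 'bot' entry.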
    Add proper type checking and validation for input parameters to prevent runtime errors

    Add input sanitization for the description parameter to handle None values and
    non-string inputs safely, preventing potential TypeErrors.

    pr_agent/git_providers/utils.py [114-116]

    -def is_pr_description_indicating_bot(description: str) -> bool:
    -    if not description:
    +def is_pr_description_indicating_bot(description: str | None) -> bool:
    +    if not isinstance(description, str) or not description:
             return False
    Suggestion importance[1-10]: 7

    Why: The suggestion improves code robustness by adding proper type checking for the description parameter, preventing potential runtime errors when handling None or non-string inputs.

    Score: 7
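
To see what the extra `isinstance` check changes in practice, the two guard conditions can be compared directly. This is a small illustration only; the helper names are hypothetical and the rest of the function body is not reproduced here.

```python
def original_guard(description) -> bool:
    """True when the original check (`if not description`) would return early."""
    return not description


def suggested_guard(description) -> bool:
    """True when the suggested check would return early."""
    return not isinstance(description, str) or not description


for value in [None, "", "Fix typo", 123, ["text"], b"bytes"]:
    print(f"{value!r}: original={original_guard(value)} suggested={suggested_guard(value)}")
```

Both guards already return early for None and the empty string. They differ only for truthy non-string values such as 123, a list, or bytes, which the original guard lets through to the string operations that follow.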
    Ensure robust type checking and case-insensitive comparison for API response handling

    Check that sender_type is a string before the case-insensitive comparison so that
    bot detection stays consistent across different GitHub API responses.

    pr_agent/servers/github_app.py [245]

    -if sender_type.lower() == "bot":
    +if isinstance(sender_type, str) and sender_type.lower() == "bot":
    Suggestion importance[1-10]: 7

    Why: The suggestion enhances code reliability by adding type checking before string operations, preventing an AttributeError when an unexpected API response yields a non-string sender_type.

    Score: 7
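
A sketch of the guarded comparison applied to a webhook-style payload. It assumes the payload carries a `sender` object with a `type` field, as GitHub webhook payloads do; the function name `sender_is_bot` is hypothetical and not part of the PR.

```python
def sender_is_bot(payload: dict) -> bool:
    """Return True only when payload['sender']['type'] is a string equal to 'bot' (any case)."""
    sender = payload.get("sender")
    # Guard against a missing or non-dict sender before reading its type.
    sender_type = sender.get("type") if isinstance(sender, dict) else None
    return isinstance(sender_type, str) and sender_type.lower() == "bot"


print(sender_is_bot({"sender": {"type": "Bot"}}))
print(sender_is_bot({"sender": {"type": None}}))
print(sender_is_bot({}))
```

Without the `isinstance` check, the second call would raise an AttributeError on `None.lower()`; with it, malformed payloads simply count as non-bot senders.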
    • Author self-review: I have reviewed the PR code suggestions, and addressed the relevant ones.

    mrT23 closed this Dec 24, 2024