Add environment self-check utility (agent-s_doctor) for pre-run and post-failure diagnostics #145

Vinyjp · 2025-10-05T03:15:53Z

Self-check System Implementation

This PR introduces a new environment self-check utility to help contributors self-check utility (agent-s_doctor) for pre-run and post-failure diagnostics

Key Updates

New self_check.py in the project root
Performs unified checks for Python version, dependencies, log directory permissions, and OS-specific tools/permissions on macOS, Linux, and Windows.
Updated setup.py
Registers the script via py_modules and adds a CLI entry point:
```
entry_points={
    'console_scripts': [
        'agent-s_doctor = self_check:main',
    ],
},
```
After installation, users can simply run:
```
agent-s_doctor
```
to perform an environment health check.
README.md enhancement
Added a “Self-check” section explaining how to use the tool, including:
- --json → outputs machine-readable diagnostics, convenient for CI/CD integration
- --skip-screenshot → skips GUI-related checks for headless or SSH environments

Why This Matters

Many contributors run Agent S in the base environment with unsupported Python versions (outside 3.9–3.12) or without proper permissions/tools, causing runtime errors.
This self-check script detects such issues upfront and provides clear guidance to resolve them.
It also verifies required API keys (OPENAI_API_KEY, SILICONFLOW_KEY, HF_TOKEN) and suggests export commands when missing.

Note:
Currently, I have only tested the self-check functionality on macOS, where it works as expected.
Linux and Windows compatibility has not yet been tested, but the implementation should be cross-platform in principle.
Also, I did nothing with grounding.py, it looked changed just because I forget to sync your change several hours ago, then I made a copy paste form the latest type.
Below is the example of the self_check example, for the reason that I use grounding modal form other cloud company, it shows warn with HF_TOKEN.

Summary by CodeRabbit

New Features
- Introduces a command-line self-check tool to verify environment readiness (Python version, dependencies, API keys, permissions, logs directory, optional screenshot test).
- Supports JSON output for automation and a headless mode to skip screenshots.
- Provides clear pass/fail results, guidance, and CI-friendly exit codes.
Documentation
- Added a “Self-check (Recommended)” section under Grounding Models with usage instructions, including JSON output and headless options.

… safely skip code-agent when no environment/controller is present, and improve debugging transparency

…ost-failure diagnostics

coderabbitai · 2025-10-05T03:16:44Z

Walkthrough

Adds a new self-check CLI utility for environment validation, integrates it into packaging as agent-s_doctor, updates README with usage, and makes a no-op code change in grounding.py.

Changes

Cohort / File(s)	Summary of Changes
Documentation `README.md`	Adds “Self-check (Recommended)” section describing agent-s_doctor, JSON output, and headless usage.
Self-check Utility (CLI) `self_check.py`	New CLI tool to validate Python version, dependencies, logs dir writability, API keys, OS-specific capabilities, and screenshot capture; supports --json and --skip-screenshot; standardized result structures and exit codes.
Packaging/Entry Points `setup.py`	Registers console script `agent-s_doctor=self_check:main`; includes `self_check` in `py_modules`.
Grounding Agent Minor Edit `gui_agents/s3/agents/grounding.py`	Semantically neutral change: `return FAIL` → `return FAIL""` in done method (no functional impact).

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor User
    participant CLI as agent-s_doctor (self_check.main)
    participant Checks as collect_checks()
    participant Sys as OS/Env
    participant Render as Output (text/JSON)

    User->>CLI: Run agent-s_doctor [--json] [--skip-screenshot]
    CLI->>Checks: collect_checks(skip_screenshot)
    rect rgb(240,250,255)
        note over Checks: Core checks
        Checks->>Sys: Python version
        Checks->>Sys: Dependencies (dynamic import)
        Checks->>Sys: Logs directory access
        Checks->>Sys: API keys detection
    end
    rect rgb(245,240,255)
        note over Checks: OS-specific checks
        alt macOS
            Checks->>Sys: Accessibility / Screen Recording
        else Linux
            Checks->>Sys: scrot presence
        else Windows
            Checks->>Sys: pywinauto / pywin32
        end
    end
    opt Screenshot test
        Checks->>Sys: pyautogui screenshot
    end
    Checks-->>CLI: List[CheckResult]
    alt --json
        CLI->>Render: Serialize results as JSON
    else
        CLI->>Render: Print human-readable summary
    end
    CLI-->>User: Exit code 0 if no FAIL, else 1

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A whisker-twitch, a tap-tap key,
I run the doctor cheerfully 🩺🐇
“Keys?” it asks. “Screens?” it probes—
Through logs and libs my warren roves.
Green lights blink—okie-dokie!
Now Agent S can brew its mochi.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title succinctly describes the primary change by stating the addition of an environment self-check utility and even names the CLI entry point, making it clear and specific about the main feature introduced. It avoids unnecessary detail while capturing both the tool’s purpose and its diagnostic scope.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80787ad and 949ddc8.

📒 Files selected for processing (4)

README.md (1 hunks)
gui_agents/s3/agents/grounding.py (1 hunks)
self_check.py (1 hunks)
setup.py (2 hunks)

🧰 Additional context used

🪛 Ruff (0.13.3)

self_check.py

155-155: Do not catch blind exception: Exception

(BLE001)

178-178: Do not catch blind exception: Exception

(BLE001)

202-202: Do not catch blind exception: Exception

(BLE001)

335-335: Do not catch blind exception: Exception

(BLE001)

432-432: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (6)

self_check.py (4)

155-156: Static analysis false positives on exception handling.

Ruff flags these bare except Exception handlers, but they're appropriate for a diagnostic utility where graceful error reporting is preferred over crashing. The handlers capture unexpected import failures, permission check errors, and screenshot failures—all scenarios where we want to surface the issue to the user rather than abort.

Based on learnings

Also applies to: 178-179, 202-203, 335-336, 432-433

254-344: LGTM: macOS permission checks are well-structured.

The macOS permission verification gracefully handles multiple API versions (AXIsProcessTrustedWithOptions vs. AXIsProcessTrusted, CGPreflightScreenCaptureAccess availability). The fallback chains and clear remediation messages make this robust across macOS versions.

347-411: LGTM: Platform-specific dependency checks are appropriate.

Linux checks for scrot and Windows checks for pywinauto/pywin32 are correctly scoped to their respective platforms. The WARN status (rather than FAIL) for missing optional tools is a good user experience choice.

414-438: LGTM: Screenshot test with skip option.

The screenshot capability check is appropriately skippable for headless/CI environments via --skip-screenshot. Capturing the exception and reporting it as a WARN (not FAIL) is the right call since users may intentionally run in headless mode.

setup.py (1)

12-12: LGTM: Clean packaging integration.

The new module is correctly registered via py_modules and the console script entry point agent-s_doctor properly targets self_check:main. This follows setuptools conventions.

Also applies to: 40-40

README.md (1)

137-144: LGTM: Clear and helpful documentation.

The Self-check section provides concise guidance on running agent-s_doctor with examples for JSON output and headless environments. The placement before the Usage section is logical—users can validate their environment before attempting to run the agent.

coderabbitai · 2025-10-05T03:22:44Z

self_check.py

+CORE_DEPENDENCIES: Iterable[DependencySpec] = (
+    DependencySpec(
+        name="pyautogui",
+        import_name="pyautogui",
+        purpose="Required for screenshot capture and input control",
+    ),
+    DependencySpec(
+        name="Pillow",
+        import_name="PIL",
+        package_name="Pillow",
+        purpose="Image processing backend for screenshots",
+    ),
+    DependencySpec(
+        name="numpy",
+        import_name="numpy",
+        purpose="Used across perception and planning modules",
+    ),
+    DependencySpec(
+        name="requests",
+        import_name="requests",
+        purpose="HTTP client for API integrations",
+    ),
+    DependencySpec(
+        name="openai",
+        import_name="openai",
+        purpose="OpenAI-compatible provider client",
+    ),
+    DependencySpec(
+        name="anthropic",
+        import_name="anthropic",
+        optional=True,
+        purpose="Required when using Anthropic models",
+    ),
+    DependencySpec(
+        name="tiktoken",
+        import_name="tiktoken",
+        purpose="Token counting utilities",
+    ),
+    DependencySpec(
+        name="paddleocr",
+        import_name="paddleocr",
+        optional=True,
+        purpose="OCR support for perception tasks",
+    ),
+    DependencySpec(
+        name="paddlepaddle",
+        import_name="paddle",
+        optional=True,
+        purpose="Deep learning runtime used by PaddleOCR",
+    ),
+    DependencySpec(
+        name="pytesseract",
+        import_name="pytesseract",
+        optional=True,
+        purpose="Fallback OCR engine",
+    ),
+    DependencySpec(
+        name="google-genai",
+        import_name="google.genai",
+        package_name="google-genai",
+        optional=True,
+        purpose="Needed for Google Gemini providers",
+    ),
+    DependencySpec(
+        name="selenium",
+        import_name="selenium",
+        optional=True,
+        purpose="Browser automation helpers",
+    ),
+)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify dependency list completeness.

The CORE_DEPENDENCIES list includes major packages (pyautogui, numpy, openai, anthropic, tiktoken), with optional flags for OCR engines and browser automation. Ensure this aligns with the actual requirements in setup.py.

Run this script to compare declared dependencies against setup.py:

🏁 Script executed:

#!/bin/bash # Description: Extract install_requires from setup.py and compare with self_check.py dependencies echo "=== Dependencies in setup.py ===" python3 << 'EOF' import re with open("setup.py") as f: content = f.read() # Extract install_requires list match = re.search(r'install_requires=\[(.*?)\]', content, re.DOTALL) if match: deps = match.group(1) # Parse quoted strings for line in deps.split('\n'): line = line.strip().strip(',').strip('"').strip("'") if line and not line.startswith('#'): # Remove platform conditions pkg = line.split(';')[0].strip() print(pkg) EOF echo "" echo "=== Dependencies in self_check.py CORE_DEPENDENCIES ===" python3 << 'EOF' import re with open("self_check.py") as f: content = f.read() # Find CORE_DEPENDENCIES section match = re.search(r'CORE_DEPENDENCIES.*?=.*?\((.*?)\)', content, re.DOTALL) if match: deps_block = match.group(1) # Extract package_name or name fields for match in re.finditer(r'name="([^"]+)"', deps_block): print(match.group(1)) EOF

Length of output: 302

Update CORE_DEPENDENCIES to match setup.py install_requires
The list omits several packages declared in setup.py: backoff, pandas, fastapi, uvicorn, together, scikit-learn, websockets, pyobjc, toml, pywinauto, pywin32. Include them or document why they’re excluded.

🤖 Prompt for AI Agents

In self_check.py around lines 46 to 115, the CORE_DEPENDENCIES tuple is missing several packages that are declared in setup.py (backoff, pandas, fastapi, uvicorn, together, scikit-learn, websockets, pyobjc, toml, pywinauto, pywin32); update CORE_DEPENDENCIES to either include DependencySpec entries for each of those packages (setting name, import_name and package_name where different, and mark optional=True where appropriate) or add a clear comment/documentation explaining why each omitted package is intentionally excluded (e.g., platform-specific, bundled, or not required at runtime). Ensure package names match setup.py, use import_name matching how the package is imported in code (e.g., pywin32 -> win32api if applicable), and keep optional flags accurate.

coderabbitai · 2025-10-05T03:22:44Z

self_check.py

+    if 3 <= major:
+        if 9 <= minor <= 12 or major > 3:


⚠️ Potential issue | 🔴 Critical

Critical: Fix Python version validation logic.

The version check incorrectly accepts Python 4.x and higher:

if 3 <= major: if 9 <= minor <= 12 or major > 3:

This condition would pass for Python 4.0, 4.1, etc., which don't exist yet and would likely break compatibility. The intent is to restrict to Python 3.9-3.12 only.

Apply this diff:

- if 3 <= major: - if 9 <= minor <= 12 or major > 3: + if major == 3 and 9 <= minor <= 12: return CheckResult( name="Python version", status="PASS",

🤖 Prompt for AI Agents

In self_check.py around lines 131 to 132, the version check incorrectly allows any major >= 3 (e.g., Python 4.x); replace the current condition with one that restricts to Python 3 only and checks minor between 9 and 12: require major == 3 and 9 <= minor <= 12 (i.e., change the nested if to enforce major == 3 and the minor range), so the check only passes for Python 3.9–3.12.

Vinyjp added 5 commits October 4, 2025 16:50

fix: make OSWorldACI env optional, prevent crashes in local CLI runs,…

c8002d9

… safely skip code-agent when no environment/controller is present, and improve debugging transparency

Merge branch 'simular-ai:main' into main

d4116fc

Add environment self-check utility (agent-s_doctor) for pre-run and p…

1c5004c

…ost-failure diagnostics

Merge branch 'develop'

bb4563e

syncing changes from remote repository

949ddc8

coderabbitai bot reviewed Oct 5, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add environment self-check utility (agent-s_doctor) for pre-run and post-failure diagnostics #145

Add environment self-check utility (agent-s_doctor) for pre-run and post-failure diagnostics #145

Uh oh!

Vinyjp commented Oct 5, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 5, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Oct 5, 2025

Uh oh!

coderabbitai bot Oct 5, 2025

Uh oh!

Uh oh!

Add environment self-check utility (agent-s_doctor) for pre-run and post-failure diagnostics #145

Are you sure you want to change the base?

Add environment self-check utility (agent-s_doctor) for pre-run and post-failure diagnostics #145

Uh oh!

Conversation

Vinyjp commented Oct 5, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Self-check System Implementation

Key Updates

Why This Matters

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 5, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 5, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Vinyjp commented Oct 5, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 5, 2025 •

edited

Loading