
Conversation


@adryserage adryserage commented Jan 2, 2026

🎯 Overview

This is an experimental PR for discussion about integrating CrewAI as an orchestration layer on top of Auto-Claude. I'd love to get feedback on the approach before we consider merging.

💡 Why CrewAI Integration?

The Problem

Currently, Auto-Claude excels at executing individual specs through its agent pipeline (Planner → Coder → QA). However, for complex product development workflows, we're missing:

  1. High-level orchestration: No way to coordinate multiple specs as part of a larger feature
  2. Product-level decision making: No agents that think about what to build, only how
  3. Automated prioritization: Manual intervention needed to sequence work
  4. Human escalation: No structured way to notify humans when AI gets stuck

The Solution: CrewAI as Orchestration Layer

┌─────────────────────────────────────────────────────────────┐
│              CREWAI ORCHESTRATION LAYER (NEW)               │
│  Product Crew → Development Crew → QA & Release Crew        │
└─────────────────────────────────────────────────────────────┘
                              ↓
                    CrewAI → Auto-Claude Bridge
                              ↓
┌─────────────────────────────────────────────────────────────┐
│              AUTO-CLAUDE EXECUTION LAYER (EXISTING)         │
│  Spec Orchestrator | Build Orchestrator | Claude SDK        │
└─────────────────────────────────────────────────────────────┘

🎁 What This PR Brings

1. Three Specialized Crews

  • Product Management: transform user requests into actionable specs. Agents: Product Manager, Requirements Analyst, Priority Analyst
  • Development: execute implementation via the Auto-Claude bridge. Agents: Tech Lead, Senior Developer, Code Reviewer
  • QA & Release: validate work and prepare releases. Agents: QA Lead, Security Analyst, Release Manager
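
To make the intended usage concrete, here is a minimal sketch of instantiating and running one of these crews. The create_development_crew factory and its project_dir/spec_dir arguments follow the crew factory code discussed in the reviews below; crew.kickoff() is CrewAI's standard entry point, and the paths are hypothetical.

# Sketch only, not part of this PR's diff; factory arguments are assumptions
from orchestration.crews import create_development_crew

crew = create_development_crew(
    project_dir="/path/to/project",   # hypothetical project path
    spec_dir="specs/001-csv-export",  # hypothetical spec directory
)
result = crew.kickoff()  # run the configured crew via CrewAI's standard entry point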

2. Configurable Agent Models (via UI)

Each agent's model is configurable through the existing Settings UI (see the sketch after this list):

  • Profiles: Balanced, Performance, Economy, Custom
  • Per-agent config: Model (Opus/Sonnet/Haiku) + Thinking Level (High/Medium/Low)
  • Cost optimization: Use Haiku for simple tasks, Opus for critical decisions
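
On the backend, resolving a configured model is expected to look roughly like this; get_agent_model and is_crewai_enabled come from the new orchestration/config.py, and the (model_id, thinking_budget) return shape follows the review discussion further down. Treat it as a sketch, not the exact API.

# Sketch only: resolve the configured model for one agent
from orchestration.config import get_agent_model, is_crewai_enabled

if is_crewai_enabled():
    model_id, thinking_budget = get_agent_model("seniorDeveloper")
    print(f"seniorDeveloper -> {model_id} (thinking budget: {thinking_budget})")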

3. Multi-Channel Notifications

# Notify humans when needed
service.notify_success("Build Complete", "Feature X deployed")
service.notify_error("QA Failed", "3 acceptance criteria not met")

Channels: Console, Slack, Email, Webhook, Linear
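
Wiring the service would look something like the sketch below. The class names are exported from orchestration/notifications in this PR; the constructor arguments (a channel list, a Slack webhook URL) are assumptions for illustration.

# Sketch only: constructor arguments are assumed, class names come from this PR
from orchestration.notifications import ConsoleChannel, NotificationService, SlackChannel

service = NotificationService(
    channels=[
        ConsoleChannel(),  # always-available local output
        SlackChannel(webhook_url="https://hooks.slack.com/services/..."),  # hypothetical webhook
    ]
)
service.notify_success("Build Complete", "Feature X deployed")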

4. Intelligent Escalation

Automatic human escalation is triggered when (see the usage sketch after this list):

  • QA iterations exceed threshold (default: 10)
  • Consecutive failures (default: 3)
  • Security vulnerabilities detected (HIGH/CRITICAL)
  • Idle time exceeded (configurable)
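
A usage sketch of these triggers, assuming EscalationManager can be constructed without arguments; the method names (check_qa_iterations, check_consecutive_failures, get_escalation_history) come from the service module reviewed below, while the argument style is an assumption.

# Sketch only: exercise the escalation checks and inspect the recorded history
from orchestration.notifications import EscalationManager

escalation = EscalationManager()
escalation.check_qa_iterations(11)        # above the default threshold of 10
escalation.check_consecutive_failures(3)  # hits the default threshold of 3
for event in escalation.get_escalation_history():
    print(event)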

5. Workflow State Machine

PENDING → ANALYZING → DEVELOPING → QA_VALIDATION → RELEASE_PREP → COMPLETED
                                        ↓
                                   ESCALATED (if blocked)
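
Driving the workflow end to end would look roughly like this. run_development_workflow and WorkflowStatus come from orchestration/flows/development_workflow.py in this PR; the keyword arguments and the status attribute name are assumptions.

# Sketch only: kick off the workflow and react to the terminal state
from orchestration.flows.development_workflow import WorkflowStatus, run_development_workflow

state = run_development_workflow(
    request="Add CSV export to the reports page",  # hypothetical user request
    project_dir="/path/to/project",                # hypothetical project path
)
if state.status == WorkflowStatus.ESCALATED:
    print("Workflow blocked; waiting on human input")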

🔧 Technical Implementation

Package Structure

auto-claude/orchestration/       # Renamed from crewai/ to avoid SDK conflict
├── config.py                    # Reads settings from UI JSON
├── crews/                       # Three specialized crews
│   ├── product_management.py
│   ├── development.py
│   └── qa_release.py
├── flows/
│   └── development_workflow.py  # Main workflow with Pydantic state
├── bridge/
│   └── auto_claude_bridge.py    # Bridge to existing Auto-Claude
└── notifications/
    └── service.py               # Multi-channel notifications

Key Design Decisions

  1. No modification to core Auto-Claude: CrewAI sits on top and delegates to existing code (see the bridge sketch after this list)
  2. Settings reuse: Uses same UI patterns as existing model selection
  3. Graceful degradation: Works without CrewAI if disabled
  4. SDK isolation: Package renamed to orchestration/ to avoid import conflicts
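
Concretely, "delegates to existing code" means the crews go through the bridge rather than importing core modules directly. AutoClaudeBridge, get_all_tools, run_coder_session, and run_qa_validation appear in the bridge module reviewed below; per that review the constructor takes only project_dir, and the paths here are hypothetical.

# Sketch only: how a crew reaches Auto-Claude through the bridge
from orchestration.bridge import AutoClaudeBridge

bridge = AutoClaudeBridge(project_dir="/path/to/project")  # hypothetical path
tools = bridge.get_all_tools()                             # CrewAI-compatible tool wrappers
report = bridge.run_coder_session(spec_dir="specs/001-csv-export")     # hypothetical spec dir
qa_result = bridge.run_qa_validation(spec_dir="specs/001-csv-export")  # returns a status dict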

📊 Arguments For Implementation

1. Autonomous Product Development

Move from "implement this spec" to "build this feature end-to-end" with AI handling:

  • Requirements breakdown
  • Complexity assessment
  • Implementation sequencing
  • Quality validation
  • Release preparation

2. Cost Efficiency

  • Use cheaper models (Haiku) for routine tasks
  • Reserve expensive models (Opus) for critical decisions
  • Profiles allow quick switching between cost/performance

3. Reduced Human Overhead

  • Smart escalation means humans only intervene when truly needed
  • Notifications keep stakeholders informed without constant monitoring
  • Workflow state provides visibility into progress

4. Better Quality

  • Specialized agents focus on their domain (security, QA, architecture)
  • Multi-layer review before release
  • Structured QA validation with acceptance criteria

5. Enterprise Readiness

  • Slack/Email notifications for team visibility
  • Linear integration for issue tracking
  • Audit trail through workflow state

⚠️ Considerations

Potential Concerns

  1. Complexity: Adds another layer; is the extra autonomy worth it?
  2. Cost: More agents mean more API calls; profiles help mitigate this
  3. Debugging: Multi-agent workflows are harder to debug
  4. Dependencies: Adds the CrewAI SDK as a dependency

Mitigations

  • Feature flag (crewaiEnabled) to opt-in
  • Comprehensive test suite (26 tests)
  • Clear separation from core Auto-Claude
  • Profiles for cost control

✅ Testing

# All tests passing
pytest tests/test_crewai.py -v
# Result: 26 passed, 1 skipped

Tests cover (see also the example sketch after this list):

  • Config loading and model selection
  • Notification service (all channels)
  • Escalation triggers and history
  • Workflow state management
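
For illustration, one more test in the same spirit could assert only the shape of get_agent_model's result so it stays stable across profile changes; this example is not part of the PR's suite.

# Hypothetical additional test, not in tests/test_crewai.py
from orchestration.config import get_agent_model

def test_agent_model_returns_id_and_thinking_budget():
    model_id, thinking_budget = get_agent_model("seniorDeveloper")
    assert isinstance(model_id, str) and model_id
    assert thinking_budget is None or isinstance(thinking_budget, int)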

📝 Next Steps (if approved)

  1. Documentation: Architecture guide and setup instructions
  2. Integration tests: End-to-end workflow testing
  3. UI polish: CrewAI settings tab in Settings dialog
  4. Performance benchmarks: Compare with manual spec creation

🙋 Questions for Discussion

  1. Is the orchestration layer approach the right abstraction?
  2. Should we support other orchestration frameworks (LangGraph, AutoGen)?
  3. What are the right default escalation thresholds?
  4. Should notifications be opt-in or opt-out by default?

Looking forward to your feedback! This is meant to start a discussion about where Auto-Claude could go for more autonomous, enterprise-grade AI development workflows.

Summary by CodeRabbit

  • New Features
    • Added CrewAI multi-agent orchestration with configurable profiles (balanced, performance, economy, custom)
    • Added CrewAI settings panel enabling global control and per-agent model/thinking level configuration
    • Introduced automated development workflows coordinating product management, development, QA, and release phases
    • Added notification and escalation system supporting multiple communication channels



coderabbitai bot commented Jan 2, 2026

Warning

Rate limit exceeded

@adryserage has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 48 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between b7bf54e and 60a567a.

📒 Files selected for processing (20)
  • apps/backend/requirements.txt
  • apps/frontend/src/renderer/components/settings/AppSettings.tsx
  • apps/frontend/src/renderer/components/settings/CrewAISettings.tsx
  • apps/frontend/src/shared/constants/models.ts
  • apps/frontend/src/shared/i18n/locales/en/settings.json
  • apps/frontend/src/shared/i18n/locales/fr/settings.json
  • apps/frontend/src/shared/types/settings.ts
  • auto-claude/orchestration/__init__.py
  • auto-claude/orchestration/bridge/__init__.py
  • auto-claude/orchestration/bridge/auto_claude_bridge.py
  • auto-claude/orchestration/config.py
  • auto-claude/orchestration/crews/__init__.py
  • auto-claude/orchestration/crews/development.py
  • auto-claude/orchestration/crews/product_management.py
  • auto-claude/orchestration/crews/qa_release.py
  • auto-claude/orchestration/flows/__init__.py
  • auto-claude/orchestration/flows/development_workflow.py
  • auto-claude/orchestration/notifications/__init__.py
  • auto-claude/orchestration/notifications/service.py
  • tests/test_crewai.py

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

This PR introduces CrewAI multi-agent orchestration support to Auto-Claude. Backend changes add orchestration infrastructure including configuration management, crew factories for product management, development, and QA/release workflows, workflow state machines, and notification/escalation services. Frontend changes introduce CrewAI settings UI with profile selection and per-agent configuration, alongside supporting types and internationalization.

Changes

Cohort / File(s) / Summary:

  • Backend Dependencies (apps/backend/requirements.txt): Added optional CrewAI and CrewAI-tools packages for multi-agent orchestration.
  • Frontend Settings Navigation & Type (apps/frontend/src/renderer/components/settings/AppSettings.tsx): Extended AppSection type to include 'crewai'; added navigation item with Users icon; integrated CrewAISettings component rendering.
  • Frontend CrewAI Settings Component (apps/frontend/src/renderer/components/settings/CrewAISettings.tsx): New React component providing global CrewAI toggle, profile selection (balanced, performance, economy, custom), per-agent model and thinking-level configuration, change detection, and reset-to-defaults functionality with i18n support.
  • Frontend Configuration Constants (apps/frontend/src/shared/constants/models.ts): Added CrewAI configuration exports: DEFAULT_CREWAI_AGENT_MODELS, CREWAI_PROFILES, CREWAI_AGENT_LABELS, CREWAI_CREWS for UI and orchestration.
  • Frontend Settings Types (apps/frontend/src/shared/types/settings.ts): Introduced CrewAI-related types (CrewAIProfile, CrewAIAgentConfig, CrewAIAgentModelsConfig, CrewAIProfileDefinition) and extended the AppSettings interface with crewaiEnabled, crewaiProfile, and crewaiAgentModels fields.
  • Frontend Internationalization (apps/frontend/src/shared/i18n/locales/en/settings.json, fr/settings.json): Added English and French translations for CrewAI settings UI labels, descriptions, and configuration strings.
  • Backend Orchestration Package Init (auto-claude/orchestration/__init__.py): New package initializer re-exporting configuration, crew factories, workflow, and notification services as public APIs.
  • Backend Orchestration Configuration (auto-claude/orchestration/config.py): New module managing CrewAI settings loading from the UI, profile handling, and agent model resolution with support for balanced/performance/economy/custom profiles and sensible defaults.
  • Backend Orchestration Bridge (auto-claude/orchestration/bridge/__init__.py, auto_claude_bridge.py): New bridge exposing 60+ Auto-Claude capabilities as CrewAI-compatible tools; handles async/sync wrapping, lazy imports, progress callbacks, and spec/development/QA/release operations.
  • Backend Orchestration Crews (auto-claude/orchestration/crews/__init__.py, product_management.py, development.py, qa_release.py): Three crew modules providing agent factories and orchestration: Product Management (analyze/validate/prioritize), Development (tech lead/developer/reviewer), QA & Release (QA lead/security/release manager) with sequential task dependencies.
  • Backend Orchestration Flows (auto-claude/orchestration/flows/__init__.py, development_workflow.py): New workflow orchestrator implementing the full development lifecycle: intake, analysis, routing by task type, development execution, QA validation with iteration limits, escalation paths, and release preparation; includes the WorkflowState, TaskType, and WorkflowStatus enums.
  • Backend Orchestration Notifications (auto-claude/orchestration/notifications/__init__.py, service.py): Multi-channel notification framework with console/Slack/email/webhook/Linear channels; EscalationManager detects QA iteration limits, consecutive failures, security issues, and idle timeouts with history tracking.
  • Tests (tests/test_crewai.py): Comprehensive test suite covering configuration loading, agent models, notification service behavior, escalation workflows, and integration scaffolding.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Frontend as Frontend<br/>(Settings UI)
    participant Orchestration as Orchestration<br/>(Workflows)
    participant ProductMgmt as ProductMgmt<br/>Crew
    participant DevCrew as Development<br/>Crew
    participant QACrew as QA/Release<br/>Crew
    participant Bridge as AutoClaudeBridge<br/>(Tools)
    participant Notifications as Notification<br/>Service

    User->>Frontend: Submit request
    Frontend->>Orchestration: run_development_workflow()
    
    rect rgb(200, 220, 255)
        note right of Orchestration: Analysis Phase
        Orchestration->>ProductMgmt: create_product_management_crew()
        ProductMgmt->>ProductMgmt: analyze request
        ProductMgmt->>ProductMgmt: validate requirements
        ProductMgmt->>ProductMgmt: prioritize tasks
        ProductMgmt-->>Orchestration: prioritized plan + task type
    end
    
    rect rgb(220, 255, 220)
        note right of Orchestration: Development Phase
        Orchestration->>DevCrew: create_development_crew()
        DevCrew->>Bridge: get_tool(), get_all_tools()
        DevCrew->>DevCrew: design architecture
        DevCrew->>Bridge: run_coder_session()
        DevCrew->>DevCrew: implement code
        DevCrew->>Bridge: run_tests()
        DevCrew->>DevCrew: review changes
        DevCrew-->>Orchestration: implementation report
    end
    
    rect rgb(255, 240, 200)
        note right of Orchestration: QA & Release Phase
        Orchestration->>QACrew: create_qa_release_crew()
        QACrew->>Bridge: get_all_tools()
        QACrew->>QACrew: validate QA criteria
        QACrew->>Bridge: run_qa_validation()
        QACrew->>QACrew: security scan
        QACrew->>Bridge: check_security()
        
        alt QA Passes
            QACrew->>QACrew: prepare release
            QACrew-->>Orchestration: release document
        else QA Fails (retry)
            QACrew->>Notifications: notify_warning()
            QACrew-->>Orchestration: retry iteration
        end
    end
    
    rect rgb(255, 200, 200)
        note right of Orchestration: Escalation (if needed)
        Orchestration->>Notifications: check escalation conditions()
        alt Max iterations or failures exceeded
            Notifications->>Notifications: evaluate escalation reason
            Notifications-->>User: escalate_to_human()
        end
    end
    
    Orchestration-->>Frontend: WorkflowState (completed/failed/escalated)
    Frontend-->>User: Display result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

This PR introduces substantial new infrastructure with multiple interacting modules, dense orchestration logic, new UI components with state management, type definitions, and internationalization. Review requires understanding workflow orchestration patterns, crew composition, notification systems, and frontend settings architecture across heterogeneous TypeScript and Python components.

Poem

🐰 A rabbit's ode to CrewAI orchestration...

With crews assembled, workflows flow so neat,
Product → Dev → QA in rhythm sweet,
Profiles balanced, custom agents dance,
Multi-tongue settings give automation a chance!
~🥕 The Code Rabbit

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
  • Title check: ❓ Inconclusive. The title uses emoji and an [RFC] label that obscure the core change; 'CrewAI Orchestration Layer' is accurate but vague about which aspect is most important, and it is unclear whether this is a new feature or a breaking change. Resolution: clarify the title to focus on the primary change without emoji or labels, e.g., 'Add CrewAI orchestration layer for multi-agent workflows' or 'Integrate CrewAI for multi-spec coordination and escalation'.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 90.30%, which is sufficient; the required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist

Summary of Changes

Hello @adryserage, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to Auto-Claude by integrating CrewAI as an orchestration layer. The primary goal is to elevate Auto-Claude's capabilities from executing individual specifications to managing entire product development workflows autonomously. This is achieved by layering a multi-agent system on top of the existing Auto-Claude core, allowing for product-level decision-making, automated task prioritization, and structured human intervention when necessary, without altering Auto-Claude's fundamental operations.

Highlights

  • CrewAI Orchestration Layer: Introduces an experimental integration of CrewAI to provide a high-level orchestration layer on top of the existing Auto-Claude execution engine, enabling multi-agent workflows for complex product development.
  • Specialized Crews: Implements three distinct crews: Product Management (for requirements and prioritization), Development (for implementation via Auto-Claude), and QA & Release (for validation and release preparation).
  • Configurable Agent Models: Allows configuration of each agent's model (Opus/Sonnet/Haiku) and thinking level (High/Medium/Low) through the existing Settings UI, supporting cost optimization and performance tuning.
  • Multi-Channel Notifications & Escalation: Adds multi-channel notification capabilities (Console, Slack, Email, Webhook, Linear) and intelligent escalation mechanisms for scenarios like excessive QA iterations, consecutive failures, or critical security vulnerabilities.
  • Workflow State Machine: Establishes a robust workflow state machine (PENDING → ANALYZING → DEVELOPING → QA_VALIDATION → RELEASE_PREP → COMPLETED, with an ESCALATED state) to manage the development lifecycle.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

"""

from crewai import Agent, Task, Crew, Process
from typing import Optional, List

Check notice

Code scanning / CodeQL

Unused import Note

Import of 'List' is not used.
"""

from crewai import Agent, Task, Crew, Process
from typing import Optional

Check notice

Code scanning / CodeQL

Unused import Note

Import of 'Optional' is not used.
"""

import json
import os

Check notice

Code scanning / CodeQL

Unused import Note test

Import of 'os' is not used.
import json
import os
import sys
import tempfile

Check notice

Code scanning / CodeQL

Unused import Note test

Import of 'tempfile' is not used.
import sys
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

Check notice

Code scanning / CodeQL

Unused import Note test

Import of 'MagicMock' is not used.
/**
* Get human-readable thinking level label
*/
const getThinkingLabel = (thinkingValue: string): string => {

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused variable getThinkingLabel.

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is an impressive and well-thought-out experimental PR for adding a CrewAI orchestration layer. The architecture is clean, with a clear separation of concerns between crews, flows, the bridge, and notification services. The new frontend settings UI is comprehensive and provides great configurability. The notification and escalation system is particularly robust and enterprise-ready.

I've identified a few critical issues that would prevent the code from running, mainly related to a TypeError in agent crew creation and some hardcoded model IDs that bypass the new configuration system. I've also noted some medium-severity issues regarding platform-dependent tools and a potentially unsafe command execution.

Overall, this is a fantastic foundation for multi-agent orchestration. With a few fixes, this will be a powerful enhancement to Auto-Claude.

Configured Crew ready to execute
"""
# Create bridge for Auto-Claude tools
bridge = AutoClaudeBridge(project_dir=project_dir, spec_dir=spec_dir)

critical

There's a TypeError here. The AutoClaudeBridge constructor is being called with a spec_dir keyword argument, but its __init__ method does not accept it. This will cause the program to crash when creating the development crew.

The bridge is designed to be initialized with just the project_dir, and spec_dir is passed to the individual tool methods as needed.

Suggested change
bridge = AutoClaudeBridge(project_dir=project_dir, spec_dir=spec_dir)
bridge = AutoClaudeBridge(project_dir=project_dir)

Configured Crew ready to execute
"""
# Create bridge for Auto-Claude tools
bridge = AutoClaudeBridge(project_dir=project_dir, spec_dir=spec_dir)

critical

Similar to the development crew, there's a TypeError here. The AutoClaudeBridge constructor is being called with a spec_dir keyword argument, which it doesn't accept. This will cause a crash when creating the QA & Release crew. The spec_dir should be removed from this constructor call.

Suggested change
bridge = AutoClaudeBridge(project_dir=project_dir, spec_dir=spec_dir)
bridge = AutoClaudeBridge(project_dir=project_dir)

Comment on lines +280 to +286
success = asyncio.run(
self._run_agent(
project_dir=self.project_dir,
spec_dir=spec_path,
model="claude-sonnet-4-5-20250929",
)
)

high

The model ID is hardcoded here. This bypasses the configurable model settings introduced in this PR, where users can select models per agent. This tool is likely called by the seniorDeveloper agent, so it should use the model configured for that agent. A similar issue exists in run_qa_validation.

Suggested change
success = asyncio.run(
self._run_agent(
project_dir=self.project_dir,
spec_dir=spec_path,
model="claude-sonnet-4-5-20250929",
)
)
model_id, _ = get_agent_model("seniorDeveloper")
success = asyncio.run(
self._run_agent(
project_dir=self.project_dir,
spec_dir=spec_path,
model=model_id,
)
)

Comment on lines +348 to +354
success = asyncio.run(
self._run_qa(
project_dir=self.project_dir,
spec_dir=spec_path,
model="claude-sonnet-4-5-20250929",
)
)

high

The model ID is hardcoded here, just like in run_coder_session. This tool is likely called by the qaLead agent, so it should use the model configured for that agent to respect the user's settings from the UI.

Suggested change
success = asyncio.run(
self._run_qa(
project_dir=self.project_dir,
spec_dir=spec_path,
model="claude-sonnet-4-5-20250929",
)
)
model_id, _ = get_agent_model("qaLead")
success = asyncio.run(
self._run_qa(
project_dir=self.project_dir,
spec_dir=spec_path,
model=model_id,
)
)


try:
result = subprocess.run(
test_command.split(),

medium

The use of test_command.split() is not safe for commands that contain arguments with spaces (e.g., file paths with spaces). This could lead to incorrect command execution or failures. Using shlex.split() is the recommended, safer way to parse shell-like command strings in Python.

You'll also need to add import shlex at the top of the file.

Suggested change
test_command.split(),
shlex.split(test_command),

Comment on lines +425 to +443
def search_codebase(self, query: str) -> str:
"""Search codebase for relevant code."""
try:
result = subprocess.run(
["grep", "-r", "-l", query, "."],
cwd=self.project_dir,
capture_output=True,
text=True,
timeout=30,
)
if result.returncode == 0 and result.stdout:
files = result.stdout.strip().split("\n")[:10]
return f"Found in files: {', '.join(files)}"
return f"No matches found for: {query}"

except subprocess.TimeoutExpired:
return "Search timed out"
except Exception as e:
return f"Search failed: {e}"

medium

The search_codebase tool uses grep, which is a Unix-specific command. This will fail on Windows systems. For better portability, consider using a Python-native solution to search through files. A simple implementation could iterate through files and check for the query string. A similar issue exists in scan_secrets.

Comment on lines +34 to +51
def create_requirements_analyst(verbose: bool = False) -> Agent:
"""Create the Requirements Analyst agent."""
model_id, thinking_budget = get_agent_model("requirementsAnalyst")

return Agent(
role="Requirements Analyst",
goal="Validate requirements against the existing codebase and technical constraints",
backstory="""You are a meticulous Requirements Analyst with strong technical
background. You specialize in analyzing how new requirements fit within
existing system architecture, identifying potential conflicts, dependencies,
and technical challenges. You ensure requirements are technically feasible
and well-integrated with the current codebase.""",
verbose=verbose,
llm=model_id,
max_iter=8,
memory=True,
allow_delegation=False,
)

medium

The RequirementsAnalyst agent's goal is to "Validate requirements against the existing codebase and technical constraints". However, it has not been assigned any tools to inspect the codebase. It seems to rely solely on the codebase_context string passed to the crew.

If this is intentional, the agent's goal and backstory could be clarified to reflect this limitation. If it's meant to be more active, it would need an AutoClaudeBridge instance and tools like get_project_context and search_codebase.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 18

♻️ Duplicate comments (15)
tests/test_crewai.py (1)

8-12: Unused imports flagged by static analysis.

The imports os, tempfile, and MagicMock are not used in this test file.

apps/frontend/src/renderer/components/settings/CrewAISettings.tsx (1)

121-124: Unused helper function getThinkingLabel.

This function is defined but never called in the component.

auto-claude/orchestration/crews/development.py (2)

9-9: Remove unused import.

List is imported but never used in this file.

Proposed fix
-from typing import Optional, List
+from typing import Optional

227-227: TypeError: spec_dir is not a valid parameter for AutoClaudeBridge.__init__.

The AutoClaudeBridge constructor only accepts project_dir. This will cause a runtime crash when creating the development crew.

Proposed fix
-    bridge = AutoClaudeBridge(project_dir=project_dir, spec_dir=spec_dir)
+    bridge = AutoClaudeBridge(project_dir=project_dir)
auto-claude/orchestration/crews/product_management.py (1)

34-51: RequirementsAnalyst lacks tools to inspect codebase.

The agent's goal states "Validate requirements against the existing codebase and technical constraints," but no tools are assigned. The agent relies solely on the codebase_context string passed to the crew.

If this is intentional (the agent works from provided context only), consider clarifying the goal/backstory. Otherwise, consider providing an AutoClaudeBridge instance with tools like get_project_context or search_codebase.

auto-claude/orchestration/crews/qa_release.py (2)

9-9: Remove unused import.

Optional is imported but never used in this file.

Proposed fix
-from typing import Optional
+from typing import TYPE_CHECKING

Or simply remove the import entirely if no type hints need it.


233-233: TypeError: spec_dir is not a valid parameter for AutoClaudeBridge.__init__.

Same issue as in development.py. This will cause a runtime crash.

Proposed fix
-    bridge = AutoClaudeBridge(project_dir=project_dir, spec_dir=spec_dir)
+    bridge = AutoClaudeBridge(project_dir=project_dir)
auto-claude/orchestration/config.py (1)

151-163: Unused variable and inefficient config loading.

The config variable on line 158 is unused (as flagged by static analysis). Additionally, get_agent_model() internally calls get_crewai_config() on each iteration, resulting in N+1 config loads. Consider passing the config to avoid redundant I/O:

Proposed fix
 def get_all_agent_configs() -> dict[str, tuple[str, int | None]]:
     """
     Get model configurations for all CrewAI agents.

     Returns:
         Dictionary mapping agent names to (model_id, thinking_budget) tuples
     """
-    config = get_crewai_config()
+    config = get_crewai_config()
     result = {}

     for agent_name in DEFAULT_CREWAI_AGENT_MODELS:
-        result[agent_name] = get_agent_model(agent_name)
+        agent_config = config["agent_models"].get(
+            agent_name,
+            {"model": "sonnet", "thinkingLevel": "medium"},
+        )
+        model_short = agent_config.get("model", "sonnet")
+        thinking_level = agent_config.get("thinkingLevel", "medium")
+        model_id = MODEL_ID_MAP.get(model_short, MODEL_ID_MAP["sonnet"])
+        thinking_budget = THINKING_BUDGET_MAP.get(thinking_level, THINKING_BUDGET_MAP["medium"])
+        result[agent_name] = (model_id, thinking_budget)

     return result

Alternatively, refactor get_agent_model to accept an optional pre-loaded config parameter.

auto-claude/orchestration/bridge/auto_claude_bridge.py (7)

14-14: Import get_agent_model is needed to fix hardcoded models below.

The static analysis correctly flags get_agent_model and get_crewai_config as unused. However, get_agent_model should be used in run_coder_session (lines 280-286) and run_qa_validation (lines 348-354) to respect user-configured agent models instead of hardcoding them. Once those methods are fixed, this import will be required.


51-75: Add explanatory comments or logging to empty except clauses.

The empty except clauses on lines 58, 66, and 74 silently suppress import errors without explanation. This makes debugging difficult if Auto-Claude modules are missing or misconfigured. Add brief comments explaining why the imports are optional, or consider logging warnings when imports fail.

🔎 Proposed fix to add explanatory comments
             try:
                 from spec.pipeline.orchestrator import SpecOrchestrator

                 self._spec_orchestrator = SpecOrchestrator
             except ImportError:
+                # Optional: spec orchestrator may not be available in minimal installations
                 pass

         if self._run_agent is None:
             try:
                 from core.agent import run_autonomous_agent

                 self._run_agent = run_autonomous_agent
             except ImportError:
+                # Optional: agent runner may not be available in minimal installations
                 pass

         if self._run_qa is None:
             try:
                 from qa.loop import run_qa_validation_loop

                 self._run_qa = run_qa_validation_loop
             except ImportError:
+                # Optional: QA loop may not be available in minimal installations
                 pass

270-296: Use configured agent model instead of hardcoding.

The model ID is hardcoded to "claude-sonnet-4-5-20250929" on line 284, bypassing the per-agent model configuration introduced in this PR's UI settings. This tool is invoked by CrewAI agents (likely the seniorDeveloper agent), so it should respect the user's model choice for that agent.

🔎 Proposed fix to use configured agent model
     def run_coder_session(self, spec_dir: str) -> str:
         """Run a coder session for implementing subtasks."""
         self._lazy_import()

         if self._run_agent is None:
             return "Error: Agent runner not available"

         try:
             spec_path = Path(spec_dir)

+            model_id, _ = get_agent_model("seniorDeveloper")
             success = asyncio.run(
                 self._run_agent(
                     project_dir=self.project_dir,
                     spec_dir=spec_path,
-                    model="claude-sonnet-4-5-20250929",
+                    model=model_id,
                 )
             )

298-329: Use shlex.split() for safe command parsing.

Line 315 uses test_command.split(), which is unsafe for commands containing arguments with spaces (e.g., file paths). This can cause incorrect command execution or failures.

🔎 Proposed fix using shlex.split()

Add the import at the top of the file:

 import asyncio
 import json
+import shlex
 import subprocess
 from pathlib import Path
 from typing import Any, Callable

Then update line 315:

         try:
             result = subprocess.run(
-                test_command.split(),
+                shlex.split(test_command),
                 cwd=self.project_dir,
                 capture_output=True,
                 text=True,
                 timeout=300,
             )

338-371: Use configured agent model instead of hardcoding.

The model ID is hardcoded to "claude-sonnet-4-5-20250929" on line 352, just like in run_coder_session. This tool is likely invoked by the qaLead agent, so it should use the model configured for that agent to respect the user's settings from the UI.

🔎 Proposed fix to use configured agent model
     def run_qa_validation(self, spec_dir: str) -> dict[str, Any]:
         """Run QA validation loop."""
         self._lazy_import()

         if self._run_qa is None:
             return {"status": "error", "success": False, "error": "QA runner not available"}

         try:
             spec_path = Path(spec_dir)

+            model_id, _ = get_agent_model("qaLead")
             success = asyncio.run(
                 self._run_qa(
                     project_dir=self.project_dir,
                     spec_dir=spec_path,
-                    model="claude-sonnet-4-5-20250929",
+                    model=model_id,
                 )
             )

425-443: Replace grep with Python-native solution for cross-platform compatibility.

The search_codebase tool uses the Unix-specific grep command (line 429), which will fail on Windows systems. For better portability, use a Python-native solution that iterates through files and searches for the query string. The same issue exists in scan_secrets (lines 530-570).

🔎 Proposed Python-native implementation
     def search_codebase(self, query: str) -> str:
         """Search codebase for relevant code."""
         try:
-            result = subprocess.run(
-                ["grep", "-r", "-l", query, "."],
-                cwd=self.project_dir,
-                capture_output=True,
-                text=True,
-                timeout=30,
-            )
-            if result.returncode == 0 and result.stdout:
-                files = result.stdout.strip().split("\n")[:10]
-                return f"Found in files: {', '.join(files)}"
-            return f"No matches found for: {query}"
+            matching_files = []
+            for file_path in self.project_dir.rglob("*"):
+                if file_path.is_file() and not any(
+                    part.startswith(".") for part in file_path.parts
+                ):
+                    try:
+                        content = file_path.read_text(encoding="utf-8", errors="ignore")
+                        if query in content:
+                            matching_files.append(str(file_path.relative_to(self.project_dir)))
+                            if len(matching_files) >= 10:
+                                break
+                    except (OSError, UnicodeDecodeError):
+                        continue
+            
+            if matching_files:
+                return f"Found in files: {', '.join(matching_files)}"
+            return f"No matches found for: {query}"

-        except subprocess.TimeoutExpired:
-            return "Search timed out"
         except Exception as e:
             return f"Search failed: {e}"

530-570: Replace grep with Python-native solution and document exception handling.

This method has two issues:

  1. Windows incompatibility: Uses the Unix-specific grep command (line 546), which fails on Windows. Use a Python-native file search instead.
  2. Empty except clause: Line 565 silently suppresses all exceptions without explanation, making debugging difficult.
🔎 Proposed cross-platform implementation
     def scan_secrets(self) -> str:
         """Scan for exposed secrets."""
         # Simple pattern-based secret detection
         patterns = [
             "password",
             "secret",
             "api_key",
             "apikey",
             "token",
             "private_key",
         ]

         findings = []
         for pattern in patterns:
             try:
-                result = subprocess.run(
-                    ["grep", "-r", "-i", "-l", pattern, "."],
-                    cwd=self.project_dir,
-                    capture_output=True,
-                    text=True,
-                    timeout=30,
-                )
-                if result.returncode == 0 and result.stdout:
-                    files = result.stdout.strip().split("\n")
-                    # Filter out common false positives
-                    files = [
-                        f
-                        for f in files
-                        if not any(
-                            x in f
-                            for x in [".env.example", "requirements", "package.json", ".md", ".txt"]
-                        )
-                    ]
-                    if files:
-                        findings.append(f"{pattern}: {len(files)} potential files")
-            except Exception:
+                matching_files = []
+                for file_path in self.project_dir.rglob("*"):
+                    if file_path.is_file() and not any(
+                        part.startswith(".") for part in file_path.parts
+                    ):
+                        rel_path = str(file_path.relative_to(self.project_dir))
+                        # Filter out common false positives
+                        if any(
+                            x in rel_path
+                            for x in [".env.example", "requirements", "package.json", ".md", ".txt"]
+                        ):
+                            continue
+                        
+                        try:
+                            content = file_path.read_text(encoding="utf-8", errors="ignore").lower()
+                            if pattern in content:
+                                matching_files.append(rel_path)
+                        except (OSError, UnicodeDecodeError):
+                            continue
+                
+                if matching_files:
+                    findings.append(f"{pattern}: {len(matching_files)} potential files")
+            except Exception:
+                # Suppress errors for individual pattern searches to continue scanning
                 pass

         if findings:
             return f"Potential secrets found:\n" + "\n".join(findings)
         return "No obvious secrets detected"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3db02c5 and b7bf54e.

📒 Files selected for processing (20)
  • apps/backend/requirements.txt
  • apps/frontend/src/renderer/components/settings/AppSettings.tsx
  • apps/frontend/src/renderer/components/settings/CrewAISettings.tsx
  • apps/frontend/src/shared/constants/models.ts
  • apps/frontend/src/shared/i18n/locales/en/settings.json
  • apps/frontend/src/shared/i18n/locales/fr/settings.json
  • apps/frontend/src/shared/types/settings.ts
  • auto-claude/orchestration/__init__.py
  • auto-claude/orchestration/bridge/__init__.py
  • auto-claude/orchestration/bridge/auto_claude_bridge.py
  • auto-claude/orchestration/config.py
  • auto-claude/orchestration/crews/__init__.py
  • auto-claude/orchestration/crews/development.py
  • auto-claude/orchestration/crews/product_management.py
  • auto-claude/orchestration/crews/qa_release.py
  • auto-claude/orchestration/flows/__init__.py
  • auto-claude/orchestration/flows/development_workflow.py
  • auto-claude/orchestration/notifications/__init__.py
  • auto-claude/orchestration/notifications/service.py
  • tests/test_crewai.py
🧰 Additional context used
📓 Path-based instructions (5)
apps/frontend/src/shared/i18n/locales/**/*.json

📄 CodeRabbit inference engine (CLAUDE.md)

apps/frontend/src/shared/i18n/locales/**/*.json: Store translation strings in namespace-organized JSON files at apps/frontend/src/shared/i18n/locales/{lang}/*.json for each supported language
When implementing new frontend features, add translation keys to all language files (minimum: en/.json and fr/.json)

Files:

  • apps/frontend/src/shared/i18n/locales/fr/settings.json
  • apps/frontend/src/shared/i18n/locales/en/settings.json
apps/frontend/src/**/*.{ts,tsx,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Always use i18n translation keys for all user-facing text in the frontend instead of hardcoded strings

Files:

  • apps/frontend/src/shared/constants/models.ts
  • apps/frontend/src/renderer/components/settings/AppSettings.tsx
  • apps/frontend/src/shared/types/settings.ts
  • apps/frontend/src/renderer/components/settings/CrewAISettings.tsx
apps/frontend/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use useTranslation() hook with namespace prefixes (e.g., 'navigation:items.key') for accessing translation strings in React components

Files:

  • apps/frontend/src/shared/constants/models.ts
  • apps/frontend/src/renderer/components/settings/AppSettings.tsx
  • apps/frontend/src/shared/types/settings.ts
  • apps/frontend/src/renderer/components/settings/CrewAISettings.tsx
apps/frontend/**/*.{ts,tsx}

⚙️ CodeRabbit configuration file

apps/frontend/**/*.{ts,tsx}: Review React patterns and TypeScript type safety.
Check for proper state management and component composition.

Files:

  • apps/frontend/src/shared/constants/models.ts
  • apps/frontend/src/renderer/components/settings/AppSettings.tsx
  • apps/frontend/src/shared/types/settings.ts
  • apps/frontend/src/renderer/components/settings/CrewAISettings.tsx
tests/**

⚙️ CodeRabbit configuration file

tests/**: Ensure tests are comprehensive and follow pytest conventions.
Check for proper mocking and test isolation.

Files:

  • tests/test_crewai.py
🧠 Learnings (4)
📚 Learning: 2025-12-30T16:38:36.314Z
Learnt from: CR
Repo: AndyMik90/Auto-Claude PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T16:38:36.314Z
Learning: Applies to apps/backend/**/*.py : Always use the Claude Agent SDK (`claude-agent-sdk` package) for all AI interactions, never use the Anthropic API directly

Applied to files:

  • apps/backend/requirements.txt
  • auto-claude/orchestration/__init__.py
  • auto-claude/orchestration/bridge/auto_claude_bridge.py
📚 Learning: 2025-12-30T16:38:36.314Z
Learnt from: CR
Repo: AndyMik90/Auto-Claude PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T16:38:36.314Z
Learning: Applies to apps/frontend/src/shared/i18n/locales/**/*.json : When implementing new frontend features, add translation keys to all language files (minimum: en/*.json and fr/*.json)

Applied to files:

  • apps/frontend/src/shared/i18n/locales/fr/settings.json
  • apps/frontend/src/shared/i18n/locales/en/settings.json
📚 Learning: 2025-12-30T16:38:36.314Z
Learnt from: CR
Repo: AndyMik90/Auto-Claude PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-30T16:38:36.314Z
Learning: Applies to apps/backend/core/client.py : Implement agent-specific tool permissions in the Claude SDK client based on agent role (planner, coder, qa_reviewer, qa_fixer)

Applied to files:

  • auto-claude/orchestration/crews/product_management.py
  • auto-claude/orchestration/bridge/auto_claude_bridge.py
📚 Learning: 2025-12-19T15:00:48.233Z
Learnt from: AndyMik90
Repo: AndyMik90/Auto-Claude PR: 41
File: auto-claude/qa/loop.py:126-136
Timestamp: 2025-12-19T15:00:48.233Z
Learning: In auto-claude/qa/loop.py, when creating clients for QA fixer sessions (including human feedback processing), use get_phase_model(spec_dir, "qa", model) instead of hardcoding "sonnet" as the fallback to support dynamic model selection based on profiles.

Applied to files:

  • auto-claude/orchestration/bridge/auto_claude_bridge.py
🧬 Code graph analysis (10)
apps/frontend/src/shared/constants/models.ts (1)
apps/frontend/src/shared/types/settings.ts (2)
  • CrewAIAgentModelsConfig (180-193)
  • CrewAIProfileDefinition (196-201)
auto-claude/orchestration/notifications/__init__.py (1)
auto-claude/orchestration/notifications/service.py (13)
  • NotificationService (438-553)
  • EscalationManager (571-713)
  • NotificationChannel (77-88)
  • ConsoleChannel (91-129)
  • SlackChannel (132-207)
  • EmailChannel (210-294)
  • WebhookChannel (297-330)
  • LinearChannel (333-435)
  • Notification (52-74)
  • NotificationType (32-38)
  • NotificationPriority (24-29)
  • EscalationEvent (557-568)
  • EscalationReason (41-48)
apps/frontend/src/renderer/components/settings/AppSettings.tsx (1)
apps/frontend/src/renderer/components/settings/CrewAISettings.tsx (1)
  • CrewAISettings (49-388)
apps/frontend/src/renderer/components/settings/CrewAISettings.tsx (3)
apps/frontend/src/shared/types/settings.ts (5)
  • CrewAIProfile (171-171)
  • CrewAIAgentModelsConfig (180-193)
  • ModelTypeShort (164-164)
  • ThinkingLevel (161-161)
  • CrewAIAgentConfig (174-177)
apps/frontend/src/shared/constants/models.ts (5)
  • DEFAULT_CREWAI_AGENT_MODELS (162-175)
  • AVAILABLE_MODELS (19-23)
  • THINKING_LEVELS (46-52)
  • CREWAI_AGENT_LABELS (232-281)
  • CREWAI_CREWS (284-303)
.design-system/src/lib/utils.ts (1)
  • cn (4-6)
auto-claude/orchestration/config.py (1)
apps/frontend/src/shared/constants/models.ts (4)
  • MODEL_ID_MAP (26-30)
  • THINKING_BUDGET_MAP (33-39)
  • DEFAULT_CREWAI_AGENT_MODELS (162-175)
  • CREWAI_PROFILES (204-229)
auto-claude/orchestration/crews/qa_release.py (2)
auto-claude/orchestration/config.py (1)
  • get_agent_model (124-148)
auto-claude/orchestration/bridge/auto_claude_bridge.py (1)
  • get_tool (81-131)
auto-claude/orchestration/bridge/__init__.py (1)
auto-claude/orchestration/bridge/auto_claude_bridge.py (1)
  • AutoClaudeBridge (17-637)
tests/test_crewai.py (3)
auto-claude/orchestration/config.py (3)
  • get_crewai_config (78-116)
  • is_crewai_enabled (119-121)
  • get_agent_model (124-148)
auto-claude/orchestration/notifications/service.py (27)
  • NotificationService (438-553)
  • ConsoleChannel (91-129)
  • NotificationType (32-38)
  • notify (489-522)
  • notify_success (524-526)
  • notify_error (537-544)
  • SlackChannel (132-207)
  • is_configured (86-88)
  • is_configured (97-98)
  • is_configured (138-139)
  • is_configured (229-235)
  • is_configured (303-304)
  • is_configured (340-341)
  • EmailChannel (210-294)
  • WebhookChannel (297-330)
  • LinearChannel (333-435)
  • Notification (52-74)
  • NotificationPriority (24-29)
  • to_dict (63-74)
  • EscalationManager (571-713)
  • check_qa_iterations (595-614)
  • EscalationReason (41-48)
  • check_consecutive_failures (616-637)
  • check_security_vulnerabilities (639-668)
  • trigger_manual_escalation (670-685)
  • get_escalation_history (707-709)
  • clear_history (711-713)
auto-claude/orchestration/flows/development_workflow.py (4)
  • WorkflowState (45-84)
  • WorkflowStatus (33-42)
  • TaskType (24-30)
  • run_development_workflow (347-386)
auto-claude/orchestration/crews/product_management.py (1)
auto-claude/orchestration/config.py (1)
  • get_agent_model (124-148)
auto-claude/orchestration/bridge/auto_claude_bridge.py (2)
auto-claude/orchestration/config.py (2)
  • get_agent_model (124-148)
  • get_crewai_config (78-116)
apps/backend/spec/pipeline/orchestrator.py (1)
  • _run_agent (129-161)
🪛 Biome (2.1.2)
apps/frontend/src/renderer/components/settings/CrewAISettings.tsx

[error] 134-146: Provide an explicit type prop for the button element.

The default type of a button is submit, which causes the submission of a form when placed inside a form element. This is likely not the behaviour that you want inside a React application.
Allowed button types are: submit, button or reset

(lint/a11y/useButtonType)

🪛 GitHub Actions: CI
tests/test_crewai.py

[error] 57-57: TestCrewAIConfig.config_loaded_from_settings_file failed: config["enabled"] is False but expected True.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL (javascript-typescript)
  • GitHub Check: CodeQL (python)
🔇 Additional comments (24)
auto-claude/orchestration/notifications/service.py (5)

24-74: LGTM!

The enum definitions and Notification dataclass are well-structured. The use of datetime.now as a default factory and ISO format serialization for timestamps are appropriate.


91-130: LGTM!

The ConsoleChannel implementation is clean and provides good visual feedback with emojis and formatting.


132-208: LGTM!

The SlackChannel implementation correctly handles webhook delivery with appropriate timeout (10s) and error handling.


297-331: LGTM!

The WebhookChannel implementation follows the same robust pattern as SlackChannel with proper timeout and error handling.


420-427: No issue here – the Authorization header format is correct for Linear's personal API key authentication.

Linear's GraphQL API supports two authorization formats: Authorization: Bearer <ACCESS_TOKEN> for OAuth tokens, and Authorization: <API_KEY> for personal API keys. The code correctly uses the personal API key format by setting the header directly to self.api_key without a "Bearer" prefix.

auto-claude/orchestration/bridge/__init__.py (1)

1-10: LGTM!

Clean package initialization following standard Python conventions for public API exports.

auto-claude/orchestration/flows/__init__.py (1)

1-22: LGTM!

Well-structured package initialization with clear documentation and proper public API exports.

apps/frontend/src/shared/i18n/locales/en/settings.json (1)

28-31: French translations are present and properly localized.

The CrewAI settings keys have been translated to French in apps/frontend/src/shared/i18n/locales/fr/settings.json. Both the section labels and the configuration block contain appropriate French translations, meeting the coding guideline requirement to add translation keys to all language files.

apps/backend/requirements.txt (1)

19-21: Versions exist and have no known vulnerabilities.

Both crewai (0.76.0) and crewai-tools (0.12.0) are available on PyPI with no known security vulnerabilities. The >= version constraints will allow newer compatible versions during installation.

apps/frontend/src/renderer/components/settings/AppSettings.tsx (1)

70-83: LGTM! Clean integration of CrewAI settings section.

The new crewai section follows the established pattern for app settings sections. The navigation item uses the Users icon appropriately, and the i18n keys (sections.crewai.title, sections.crewai.description) are accessed via the existing template pattern.

Also applies to: 193-194

apps/frontend/src/shared/i18n/locales/fr/settings.json (1)

28-31: LGTM! French translations for CrewAI settings are complete.

The French localization properly mirrors the expected keys used by CrewAISettings.tsx:

  • sections.crewai.title/description for navigation
  • Root-level crewai.* keys for the settings panel content

The translations are grammatically appropriate and consistent with the existing French localization style.

Also applies to: 261-271

apps/frontend/src/shared/constants/models.ts (2)

156-229: LGTM! Well-structured CrewAI profile configurations.

The profile system is cleanly organized:

  • Balanced: Cost-effective mix using Sonnet/Haiku with appropriate thinking levels
  • Performance: Opus everywhere with high/ultrathink for maximum quality
  • Economy: Haiku-dominant with minimal thinking for cost reduction
  • Custom: Delegates to user-configured crewaiAgentModels from settings

The type safety via CrewAIAgentModelsConfig ensures all 9 agents are configured consistently.


231-303: LGTM! Agent labels and crew groupings support the UI well.

CREWAI_AGENT_LABELS provides human-readable metadata per agent, and CREWAI_CREWS organizes them into accordion-friendly groupings. The as const assertions enable proper type inference for the agent key arrays.

apps/frontend/src/shared/types/settings.ts (1)

166-201: LGTM! Well-designed CrewAI type definitions.

The type structure cleanly separates:

  • CrewAIProfile: Profile preset identifiers
  • CrewAIAgentConfig: Per-agent model/thinking configuration
  • CrewAIAgentModelsConfig: Complete 9-agent configuration mapping
  • CrewAIProfileDefinition: Profile metadata with optional agent configs (null for custom)

The optional fields in AppSettings (crewaiEnabled?, crewaiProfile?, crewaiAgentModels?) ensure backward compatibility and graceful feature flag behavior.

Also applies to: 305-308

auto-claude/orchestration/notifications/__init__.py (1)

1-43: LGTM! Clean package initialization with comprehensive public API.

The __init__.py properly re-exports all notification-related classes and enums:

  • Service classes: NotificationService, EscalationManager
  • Channel implementations: Console, Slack, Email, Webhook, Linear
  • Data classes: Notification, EscalationEvent
  • Enums: NotificationType, NotificationPriority, EscalationReason

The __all__ list is well-organized and matches the test imports in test_crewai.py.

tests/test_crewai.py (1)

38-58: Patch target is correct; suggested fix is incorrect.

The current patch target orchestration.config.Path.home is the correct approach per Python unittest.mock best practices, which dictates patching where an object is used, not where it's defined. Since Path is imported and used in orchestration.config, this is the proper location to patch.

The suggested fix to change to pathlib.Path.home contradicts mocking conventions and could introduce unintended side effects across the codebase. No evidence of test failure was found in the repository history or CI logs.

If the test is actually failing, the issue likely lies elsewhere—such as test isolation problems with os.environ not being properly cleared between tests, or the mock needing to be applied at an earlier point in the import chain. Consider checking:

  • Whether XDG_CONFIG_HOME environment variable is being properly isolated between tests
  • Test execution order and whether earlier tests are affecting the environment state

Likely an incorrect or invalid review comment.

auto-claude/orchestration/config.py (1)

14-27: Constants correctly mirror frontend definitions.

The MODEL_ID_MAP and THINKING_BUDGET_MAP match the frontend constants in apps/frontend/src/shared/constants/models.ts, ensuring consistency between backend and UI. Good alignment.
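
For illustration, maps like these typically take the shape sketched below; the concrete model IDs and token budgets are placeholders, not values copied from the repository:

# Hypothetical shape of the backend constants mirroring models.ts.
MODEL_ID_MAP = {
    "opus": "claude-opus-4",        # placeholder model IDs
    "sonnet": "claude-sonnet-4",
    "haiku": "claude-haiku-3-5",
}

THINKING_BUDGET_MAP = {
    "low": 1024,                    # placeholder thinking-token budgets
    "medium": 4096,
    "high": 16384,
}

def resolve(model_short: str, thinking_level: str) -> tuple[str, int]:
    """Translate a UI (model, thinking) pair into (model_id, thinking_budget)."""
    return MODEL_ID_MAP[model_short], THINKING_BUDGET_MAP[thinking_level]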

auto-claude/orchestration/crews/__init__.py (1)

1-45: LGTM!

Clean package initializer with explicit __all__ exports. The docstring clearly documents the available crews and their purposes.

auto-claude/orchestration/__init__.py (1)

1-69: LGTM!

Well-structured package entry point with clear documentation and explicit exports. The module correctly exposes the CrewAI orchestration layer's public API while maintaining separation from core Auto-Claude functionality.

auto-claude/orchestration/crews/development.py (1)

241-252: Task context wiring looks correct.

The architecture → implementation → review task chain is properly established using task.context assignments and template references like "{{architecture_task.output}}". This ensures correct data flow between sequential tasks.

auto-claude/orchestration/crews/product_management.py (1)

196-209: Task context chaining is correctly implemented.

The sequential task dependencies (analyze_task → validate_task → prioritize_task) are properly wired using task.context assignments. Template references like "{{analyze_task.output}}" will be resolved by CrewAI at runtime.
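
A rough sketch of this chaining pattern, assuming the public crewai Task API; the agent variables and prompt text are illustrative stand-ins for the ones defined in product_management.py:

from crewai import Task

analyze_task = Task(
    description="Analyze the user request: {user_request}",
    expected_output="Structured requirements",
    agent=requirements_analyst,   # assumed agent instance built elsewhere
)
validate_task = Task(
    description="Validate the requirements in {{analyze_task.output}}",
    expected_output="Validated requirements with gaps flagged",
    agent=product_manager,
)
prioritize_task = Task(
    description="Prioritize the validated work items",
    expected_output="Ordered backlog",
    agent=priority_analyst,
)

# Context assignments wire each task to its predecessor's output, as described above.
validate_task.context = [analyze_task]
prioritize_task.context = [validate_task]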

auto-claude/orchestration/flows/development_workflow.py (2)

369-370: Good: Graceful degradation when CrewAI is disabled.

The is_crewai_enabled() check with a clear RuntimeError provides proper guardrails and aligns with the PR's feature flag approach.
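
A minimal sketch of that guardrail, with the exact signature and error message assumed rather than copied from the PR:

from orchestration.config import is_crewai_enabled

def run_development_workflow(user_request: str, project_dir: str, spec_dir: str):
    if not is_crewai_enabled():
        raise RuntimeError(
            "CrewAI is not enabled. Enable it in Settings before running workflows."
        )
    # ... construct the flow and kick off the crews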


45-85: Well-structured workflow state model.

The WorkflowState Pydantic model captures all necessary workflow context including inputs, crew outputs, status tracking, error handling, and audit trail. The escalation thresholds (max_qa_iterations=10, max_consecutive_failures=3) align with the PR objectives.
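
A condensed sketch of what such a state model typically looks like; fields beyond those named in the review (QA iterations, thresholds, error and audit tracking) are assumptions:

from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field

class WorkflowStatus(str, Enum):
    PENDING = "pending"
    ANALYZING = "analyzing"
    DEVELOPING = "developing"
    QA_VALIDATION = "qa_validation"
    RELEASE_PREP = "release_prep"
    ESCALATED = "escalated"
    COMPLETED = "completed"

class WorkflowState(BaseModel):
    user_request: str = ""
    status: WorkflowStatus = WorkflowStatus.PENDING
    qa_iterations: int = 0
    consecutive_failures: int = 0
    max_qa_iterations: int = 10           # escalation threshold from the PR
    max_consecutive_failures: int = 3
    last_error: Optional[str] = None
    history: list[str] = Field(default_factory=list)  # simple audit trail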

auto-claude/orchestration/crews/qa_release.py (1)

262-270: LGTM: Crew configuration is solid.

The crew setup with sequential processing, memory enabled, context window respect, and rate limiting (max_rpm=10) follows best practices for orchestrated LLM workflows.
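
Sketch of the crew wiring this refers to, using the public crewai Crew API; the agent and task variables stand in for the ones defined in qa_release.py:

from crewai import Crew, Process

qa_release_crew = Crew(
    agents=[qa_lead, security_analyst, release_manager],  # assumed agent instances
    tasks=[qa_task, security_task, release_task],         # assumed task instances
    process=Process.sequential,      # run tasks in order, passing context forward
    memory=True,                     # shared memory across tasks
    respect_context_window=True,     # trim history instead of overflowing the model
    max_rpm=10,                      # rate-limit LLM calls
)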

Comment on lines +59 to +85
const handleEnableChange = async (enabled: boolean) => {
  await saveSettings({ crewaiEnabled: enabled });
};

const handleProfileSelect = async (profileId: CrewAIProfile) => {
  await saveSettings({ crewaiProfile: profileId });
};

const handleAgentModelChange = async (agent: AgentKey, model: ModelTypeShort) => {
  const newConfig: CrewAIAgentModelsConfig = {
    ...customAgentModels,
    [agent]: { ...customAgentModels[agent], model }
  };
  await saveSettings({ crewaiAgentModels: newConfig });
};

const handleAgentThinkingChange = async (agent: AgentKey, thinkingLevel: ThinkingLevel) => {
  const newConfig: CrewAIAgentModelsConfig = {
    ...customAgentModels,
    [agent]: { ...customAgentModels[agent], thinkingLevel }
  };
  await saveSettings({ crewaiAgentModels: newConfig });
};

const handleResetToDefaults = async () => {
  await saveSettings({ crewaiAgentModels: DEFAULT_CREWAI_AGENT_MODELS });
};

🧹 Nitpick | 🔵 Trivial

Consider adding error handling for saveSettings calls.

The async handlers (handleEnableChange, handleProfileSelect, handleAgentModelChange, handleAgentThinkingChange, handleResetToDefaults) call saveSettings without try-catch blocks. If saveSettings rejects, the errors will be unhandled.

🔎 Example error handling pattern
const handleEnableChange = async (enabled: boolean) => {
  try {
    await saveSettings({ crewaiEnabled: enabled });
  } catch (error) {
    console.error('Failed to save CrewAI enabled setting:', error);
    // Optionally show a toast notification to the user
  }
};
🤖 Prompt for AI Agents
In apps/frontend/src/renderer/components/settings/CrewAISettings.tsx around
lines 59-85, the async handlers call saveSettings without error handling; wrap
each handler (handleEnableChange, handleProfileSelect, handleAgentModelChange,
handleAgentThinkingChange, handleResetToDefaults) in a try/catch, await
saveSettings inside the try, call process-safe logging (console.error or the app
logger) with a descriptive message in the catch, and optionally surface a
user-facing notification/toast on failure; ensure the handlers still
return/resolve predictably after catching errors.

Comment on lines +134 to +146
<button
  key={profile.id}
  onClick={() => handleProfileSelect(profile.id)}
  disabled={!isEnabled}
  className={cn(
    'relative w-full rounded-lg border p-4 text-left transition-all duration-200',
    'hover:border-primary/50 hover:shadow-sm',
    isSelected
      ? 'border-primary bg-primary/5'
      : 'border-border bg-card',
    !isEnabled && 'opacity-50 cursor-not-allowed'
  )}
>

⚠️ Potential issue | 🟡 Minor

Add explicit type="button" to prevent unintended form submission.

The profile selection button lacks an explicit type attribute. Inside a form context, this would default to type="submit", potentially causing unexpected behavior.

🔎 Proposed fix
       <button
         key={profile.id}
+        type="button"
         onClick={() => handleProfileSelect(profile.id)}
         disabled={!isEnabled}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<button
  key={profile.id}
  onClick={() => handleProfileSelect(profile.id)}
  disabled={!isEnabled}
  className={cn(
    'relative w-full rounded-lg border p-4 text-left transition-all duration-200',
    'hover:border-primary/50 hover:shadow-sm',
    isSelected
      ? 'border-primary bg-primary/5'
      : 'border-border bg-card',
    !isEnabled && 'opacity-50 cursor-not-allowed'
  )}
>
<button
  key={profile.id}
  type="button"
  onClick={() => handleProfileSelect(profile.id)}
  disabled={!isEnabled}
  className={cn(
    'relative w-full rounded-lg border p-4 text-left transition-all duration-200',
    'hover:border-primary/50 hover:shadow-sm',
    isSelected
      ? 'border-primary bg-primary/5'
      : 'border-border bg-card',
    !isEnabled && 'opacity-50 cursor-not-allowed'
  )}
>
🧰 Tools
🪛 Biome (2.1.2)

[error] 134-146: Provide an explicit type prop for the button element.

The default type of a button is submit, which causes the submission of a form when placed inside a form element. This is likely not the behaviour that you want inside a React application.
Allowed button types are: submit, button or reset

(lint/a11y/useButtonType)

🤖 Prompt for AI Agents
In apps/frontend/src/renderer/components/settings/CrewAISettings.tsx around
lines 134 to 146, the profile selection <button> is missing an explicit type,
which inside a form will default to type="submit" and can trigger unintended
form submissions; update the JSX to add type="button" to the button element to
ensure it only triggers the click handler and not form submission.

<div className="grid grid-cols-2 gap-3">
  {/* Model Select */}
  <div className="space-y-1">
    <Label className="text-xs text-muted-foreground">Model</Label>

⚠️ Potential issue | 🟡 Minor

Hardcoded UI labels should use i18n translation keys.

The labels "Model" (line 203) and "Thinking" (line 224) are hardcoded strings. Per coding guidelines, all user-facing text in the frontend should use i18n translation keys.

🔎 Proposed fix
           {/* Model Select */}
           <div className="space-y-1">
-            <Label className="text-xs text-muted-foreground">Model</Label>
+            <Label className="text-xs text-muted-foreground">{t('general.model', 'Model')}</Label>
           {/* Thinking Level Select */}
           <div className="space-y-1">
-            <Label className="text-xs text-muted-foreground">Thinking</Label>
+            <Label className="text-xs text-muted-foreground">{t('agentProfile.thinking', 'Thinking')}</Label>

Also applies to: 224-224

🤖 Prompt for AI Agents
In apps/frontend/src/renderer/components/settings/CrewAISettings.tsx around
lines 203 and 224, the user-facing strings "Model" and "Thinking" are hardcoded;
replace them with i18n translation keys by importing/using the project's i18n
hook or t function (e.g., useTranslation/t) and swap the literal strings for
t('settings.model') and t('settings.thinking') (or the appropriate key names per
the i18n namespace), then add those keys to the translation resource files for
all supported locales.

Comment on lines +270 to +271
<h4 className="font-medium text-sm text-foreground">{crew.name}</h4>
<p className="text-xs text-muted-foreground">{crew.description}</p>

🧹 Nitpick | 🔵 Trivial

Crew names and descriptions from constants are not localized.

crew.name and crew.description are pulled from CREWAI_CREWS constants, which contain English-only strings. For full i18n compliance, consider either:

  1. Adding translation keys like t(`crewai.crews.${crew.id}.name`), or
  2. Documenting that crew names are intentionally kept as English product terminology
🤖 Prompt for AI Agents
In apps/frontend/src/renderer/components/settings/CrewAISettings.tsx around
lines 270-271, crew.name and crew.description are rendered directly from
CREWAI_CREWS (English-only); replace these with localization lookups such as
t(`crewai.crews.${crew.id}.name`) and t(`crewai.crews.${crew.id}.description`)
using the existing i18n hook/context, and add the corresponding keys to the
translation JSONs; alternatively, if these names are intentionally fixed English
product terms, add a clear comment and documentation indicating they are not to
be localized.

Comment on lines +65 to +75
def _get_settings_path() -> Path:
    """Get the path to Auto-Claude UI settings file."""
    # Check platform-specific paths
    if os.name == "nt":  # Windows
        base = Path(os.environ.get("APPDATA", Path.home() / "AppData" / "Roaming"))
        return base / "Auto-Claude" / "settings.json"
    elif os.uname().sysname == "Darwin":  # macOS
        return Path.home() / "Library" / "Application Support" / "Auto-Claude" / "settings.json"
    else:  # Linux and others
        xdg_config = os.environ.get("XDG_CONFIG_HOME", Path.home() / ".config")
        return Path(xdg_config) / "Auto-Claude" / "settings.json"

🧹 Nitpick | 🔵 Trivial

Potential AttributeError on non-Windows, non-POSIX platforms.

os.uname() is only available on Unix-like systems. While the Windows check (os.name == "nt") comes first, on some edge-case platforms (e.g., certain embedded systems or unusual Python builds), os.uname() might still fail.

Consider using sys.platform for safer cross-platform detection:

Proposed fix
+import sys
+
 def _get_settings_path() -> Path:
     """Get the path to Auto-Claude UI settings file."""
     # Check platform-specific paths
     if os.name == "nt":  # Windows
         base = Path(os.environ.get("APPDATA", Path.home() / "AppData" / "Roaming"))
         return base / "Auto-Claude" / "settings.json"
-    elif os.uname().sysname == "Darwin":  # macOS
+    elif sys.platform == "darwin":  # macOS
         return Path.home() / "Library" / "Application Support" / "Auto-Claude" / "settings.json"
     else:  # Linux and others
         xdg_config = os.environ.get("XDG_CONFIG_HOME", Path.home() / ".config")
         return Path(xdg_config) / "Auto-Claude" / "settings.json"

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In auto-claude/orchestration/config.py around lines 65 to 75, the code calls
os.uname() which can raise AttributeError on non-Unix Python builds; replace the
os.uname() check with a robust platform check (e.g., use
sys.platform.startswith("darwin") or platform.system() == "Darwin") and keep the
existing Windows branch (os.name == "nt"); ensure fallback uses
os.environ.get("XDG_CONFIG_HOME", str(Path.home() / ".config")) and return Paths
constructed from those strings so the function never calls os.uname() directly
and safely handles unusual platforms.

Comment on lines +285 to +288
with smtplib.SMTP(self.smtp_host, self.smtp_port) as server:
    server.starttls()
    server.login(self.smtp_user, self.smtp_password)
    server.sendmail(self.from_email, [self.to_email], msg.as_string())

⚠️ Potential issue | 🟠 Major

Add timeout to SMTP connection to prevent indefinite hangs.

The SMTP connection lacks a timeout parameter, which can cause the notification service to hang indefinitely if the SMTP server is unreachable or slow to respond.

🔎 Proposed fix with timeout
-            with smtplib.SMTP(self.smtp_host, self.smtp_port) as server:
+            with smtplib.SMTP(self.smtp_host, self.smtp_port, timeout=10) as server:
                 server.starttls()
                 server.login(self.smtp_user, self.smtp_password)
                 server.sendmail(self.from_email, [self.to_email], msg.as_string())
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
with smtplib.SMTP(self.smtp_host, self.smtp_port) as server:
    server.starttls()
    server.login(self.smtp_user, self.smtp_password)
    server.sendmail(self.from_email, [self.to_email], msg.as_string())
with smtplib.SMTP(self.smtp_host, self.smtp_port, timeout=10) as server:
    server.starttls()
    server.login(self.smtp_user, self.smtp_password)
    server.sendmail(self.from_email, [self.to_email], msg.as_string())
🤖 Prompt for AI Agents
In auto-claude/orchestration/notifications/service.py around lines 285 to 288,
the SMTP connection is created without a timeout which can cause the service to
hang; fix this by passing a configurable timeout to smtplib.SMTP (e.g.
smtplib.SMTP(self.smtp_host, self.smtp_port, timeout=self.smtp_timeout)), add a
new smtp_timeout attribute/default (e.g. 10 seconds) on the service or config,
and ensure any blocking calls inherit that socket timeout so the send path fails
fast and raises on connect/IO errors instead of hanging indefinitely.

Comment on lines +462 to +463
if enable_console:
    self.channels.append(ConsoleChannel(verbose=True))

⚠️ Potential issue | 🟡 Minor

Fix inconsistent verbose parameter handling.

Line 463 always creates ConsoleChannel(verbose=True), ignoring the self.verbose flag passed to NotificationService.__init__. This means the verbose setting has no effect on console output.

🔎 Proposed fix
         if enable_console:
-            self.channels.append(ConsoleChannel(verbose=True))
+            self.channels.append(ConsoleChannel(verbose=self.verbose))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if enable_console:
    self.channels.append(ConsoleChannel(verbose=True))
if enable_console:
    self.channels.append(ConsoleChannel(verbose=self.verbose))
🤖 Prompt for AI Agents
In auto-claude/orchestration/notifications/service.py around lines 462-463, the
ConsoleChannel is always instantiated with verbose=True which ignores the
NotificationService instance's self.verbose setting; update the instantiation to
pass the instance flag (e.g., ConsoleChannel(verbose=self.verbose)) so the
service-level verbose option controls console output, and ensure self.verbose is
set on __init__ (or defaulted) before this line.

Comment on lines +595 to +637
def check_qa_iterations(
    self,
    iterations: int,
    workflow_id: Optional[str] = None,
    spec_name: Optional[str] = None,
) -> Optional[EscalationEvent]:
    """Check if QA iterations exceed threshold."""
    if iterations > self.max_qa_iterations:
        event = EscalationEvent(
            reason=EscalationReason.QA_ITERATIONS_EXCEEDED,
            title=f"QA Iterations Exceeded ({iterations}/{self.max_qa_iterations})",
            description=f"QA validation has run {iterations} times without passing. "
            f"Human review required to resolve persistent issues.",
            workflow_id=workflow_id,
            spec_name=spec_name,
            qa_iterations=iterations,
        )
        self._trigger_escalation(event)
        return event
    return None

def check_consecutive_failures(
    self,
    failure_count: int,
    last_error: Optional[str] = None,
    workflow_id: Optional[str] = None,
    spec_name: Optional[str] = None,
) -> Optional[EscalationEvent]:
    """Check if consecutive failures exceed threshold."""
    if failure_count >= self.max_consecutive_failures:
        event = EscalationEvent(
            reason=EscalationReason.CONSECUTIVE_FAILURES,
            title=f"Consecutive Failures ({failure_count})",
            description=f"The workflow has failed {failure_count} times consecutively. "
            f"Last error: {last_error or 'Unknown'}",
            workflow_id=workflow_id,
            spec_name=spec_name,
            failure_count=failure_count,
            metadata={"last_error": last_error} if last_error else {},
        )
        self._trigger_escalation(event)
        return event
    return None

🧹 Nitpick | 🔵 Trivial

Consider making threshold comparisons consistent.

Lines 602 and 624 use different comparison operators:

  • Line 602: iterations > self.max_qa_iterations (triggers after exceeding)
  • Line 624: failure_count >= self.max_consecutive_failures (triggers at threshold)

This inconsistency means with default values, QA escalates after 11 iterations but failures escalate at exactly 3. If this is intentional (giving one extra QA attempt), consider documenting it. Otherwise, use consistent operators.

🔎 Option 1: Make both use >= for consistency
     def check_qa_iterations(
         self,
         iterations: int,
         workflow_id: Optional[str] = None,
         spec_name: Optional[str] = None,
     ) -> Optional[EscalationEvent]:
         """Check if QA iterations exceed threshold."""
-        if iterations > self.max_qa_iterations:
+        if iterations >= self.max_qa_iterations:
             event = EscalationEvent(
                 reason=EscalationReason.QA_ITERATIONS_EXCEEDED,
                 title=f"QA Iterations Exceeded ({iterations}/{self.max_qa_iterations})",

Option 2: Document the intentional difference in the docstrings

     def check_qa_iterations(
         self,
         iterations: int,
         workflow_id: Optional[str] = None,
         spec_name: Optional[str] = None,
     ) -> Optional[EscalationEvent]:
-        """Check if QA iterations exceed threshold."""
+        """Check if QA iterations exceed threshold (triggers after max_qa_iterations + 1)."""
         if iterations > self.max_qa_iterations:
🤖 Prompt for AI Agents
In auto-claude/orchestration/notifications/service.py around lines 595 to 637,
the QA iterations check uses ">" while the failures check uses ">=", causing
inconsistent escalation timing; change the QA check to use ">=" so both methods
trigger at the configured threshold, and update the check_qa_iterations
docstring to state that escalation occurs when iterations are greater than or
equal to the max threshold.

Comment on lines 79 to 97
def test_get_agent_model_returns_defaults(self, temp_dir):
    """Test get_agent_model returns correct defaults."""
    from orchestration.config import get_agent_model

    with patch('orchestration.config.Path.home', return_value=temp_dir):
        # Product Manager default is sonnet with medium thinking
        model_id, thinking_budget = get_agent_model("productManager")
        assert "sonnet" in model_id.lower()
        assert thinking_budget > 0

        # Tech Lead default is opus with high thinking
        model_id, thinking_budget = get_agent_model("techLead")
        assert "opus" in model_id.lower()
        assert thinking_budget > 5000  # High thinking

        # Priority Analyst default is haiku with low thinking
        model_id, thinking_budget = get_agent_model("priorityAnalyst")
        assert "haiku" in model_id.lower()


🧹 Nitpick | 🔵 Trivial

Test assertions rely on undocumented default configurations.

The tests for get_agent_model assert specific default values (e.g., "sonnet" for productManager, "opus" for techLead, "haiku" for priorityAnalyst) without referencing the actual default configuration constants. If the defaults change, these tests will fail without making clear whether the change was intentional or a regression.

Consider importing and referencing DEFAULT_CREWAI_AGENT_MODELS from the config module to make the expected values explicit:

from orchestration.config import DEFAULT_CREWAI_AGENT_MODELS

# Then assert against actual defaults
assert "sonnet" in model_id.lower()  # DEFAULT_CREWAI_AGENT_MODELS["productManager"]["model"]
🤖 Prompt for AI Agents
tests/test_crewai.py around lines 79 to 97: the test uses hardcoded expected
default model names and thinking thresholds which can become outdated; import
DEFAULT_CREWAI_AGENT_MODELS from orchestration.config and replace the literal
expectations with assertions against that constant (e.g., compare
model_id.lower() contains
DEFAULT_CREWAI_AGENT_MODELS["productManager"]["model"].lower() and assert
thinking_budget equals or meets the configured value from
DEFAULT_CREWAI_AGENT_MODELS for each role), keeping the existing Path.home
patching intact.

Comment on lines +530 to +546
@pytest.mark.skipif(
    "crewai" not in sys.modules,
    reason="CrewAI SDK not installed",
)
class TestCrewAIIntegration:
    """Integration tests that require CrewAI SDK."""

    def test_run_development_workflow_disabled(self, temp_dir):
        """Test that run_development_workflow raises when CrewAI is disabled."""
        from orchestration.flows import run_development_workflow

        with patch('orchestration.config.is_crewai_enabled', return_value=False):
            with pytest.raises(RuntimeError, match="CrewAI is not enabled"):
                run_development_workflow(
                    user_request="Test",
                    project_dir="/test",
                    spec_dir="/spec",

🧹 Nitpick | 🔵 Trivial

Integration test skip condition may not work as intended.

The skip condition "crewai" not in sys.modules only checks if the module is already imported, not if it's available for import. The CrewAI SDK might be installable but not yet imported when this check runs.

🔎 Suggested fix using importlib
+import importlib.util
+
+def is_crewai_available():
+    return importlib.util.find_spec("crewai") is not None
+
 @pytest.mark.skipif(
-    "crewai" not in sys.modules,
+    not is_crewai_available(),
     reason="CrewAI SDK not installed",
 )
 class TestCrewAIIntegration:

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In tests/test_crewai.py around lines 530 to 546, the skipif currently checks
only if "crewai" is in sys.modules which fails to detect whether the CrewAI
package is installable but not yet imported; change the skip condition to detect
importability (for example use importlib.util.find_spec("crewai") is None) or
call pytest.importorskip("crewai") at the top of the test/class so the test is
skipped when the package cannot be imported rather than only when it's already
loaded.

@adryserage
Contributor Author

What this PR is
PR #561 introduces an experimental CrewAI-based orchestration layer on top of Auto-Claude. Conceptually, it shifts Auto-Claude from a single linear execution pipeline (Planner → Coder → QA) to a higher-level workflow coordinator capable of managing complete feature lifecycles. CrewAI does not replace Auto-Claude; it sits above it, structuring how work is planned, delegated, reviewed, and released. The PR is explicitly framed as an RFC/early integration, not a finalized architectural commitment.

What it does (functional behavior)
The PR adds a workflow state machine that governs how a feature moves from idea to release, with explicit states such as pending, in-progress, blocked, escalated, or completed. It defines three distinct “crews”: a Product Management crew that translates user intent into structured requirements and priorities; a Development crew that coordinates implementation by delegating actual coding work to Auto-Claude; and a QA & Release crew that validates output, checks security and quality, and prepares releases. In parallel, it introduces notification and escalation logic so failures, stalls, or risk conditions automatically surface to humans instead of silently looping.

Concrete technical changes
On the backend, a new orchestration module is added, containing CrewAI configuration, crew definitions, workflow logic, a bridge to Auto-Claude, and notification handlers. CrewAI and its tools are added as optional dependencies. On the frontend, new settings screens allow enabling or disabling CrewAI and configuring agent profiles, including model choice and reasoning depth per role. Localization files are updated accordingly. Overall, the PR is non-trivial in size, adding several thousand lines of code and touching both backend orchestration and frontend configuration.

Benefits and upside
The primary benefit is feature-level autonomy: instead of running Auto-Claude repeatedly by hand, you can ask for a feature and let the system coordinate planning, implementation, validation, and release. Human involvement becomes exception-driven via explicit escalation rules rather than continuous supervision. Cost control is improved through per-agent model selection, allowing cheaper models for routine steps and stronger models only where judgment matters. The workflow state machine also provides clearer observability into progress, failures, and bottlenecks, which Auto-Claude alone does not explicitly model.

Context, limits, and tradeoffs
This PR adds a second orchestration layer, which increases architectural complexity and makes debugging more challenging, especially when failures occur across CrewAI and Auto-Claude boundaries. It expands the dependency surface and introduces new runtime and operational considerations. The authors mitigate this by keeping the integration optional, guarded by a feature flag, and designed to degrade gracefully when disabled. In short, this PR is less about immediate productivity gains and more about setting Auto-Claude on a path toward structured, multi-agent, end-to-end delivery—at the cost of higher system complexity.

@adryserage
Contributor Author

During the development phase, work can be explicitly split into frontend and backend streams, each handled by a dedicated agent or sub-crew. Each stream can be assigned a different model profile, allowing you to optimize for speed and cost on frontend tasks while reserving more capable models for backend logic, data integrity, and security-sensitive work. This turns development into a parallelized process rather than a single linear execution.

In practice, the Product or Planning phase would first classify tasks by domain—frontend, backend, or shared—before execution begins. The Development crew then dispatches frontend work (UI components, pages, styling, i18n, client-side logic) to a frontend-specialized agent and backend work (APIs, database migrations, business rules, integrations) to a backend-specialized agent. Both operate independently but within the same overall workflow state, reporting progress back to the orchestrator.

To keep this parallel flow stable, a contract-first step is essential. Before either side starts coding, the system defines the interface between frontend and backend—API schemas, DTOs, or shared types. Both agents then implement against that agreed contract. This avoids the classic failure mode where frontend and backend drift apart while working in parallel and only discover incompatibilities late in the process.
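
As a hedged illustration of that contract-first step, both streams could code against a small shared schema like the sketch below (names are hypothetical; in practice this might be an OpenAPI spec or generated shared types):

from pydantic import BaseModel

class CreateTaskRequest(BaseModel):
    title: str
    description: str
    priority: int = 3

class TaskResponse(BaseModel):
    id: str
    title: str
    status: str

# The backend agent implements endpoints returning TaskResponse, while the frontend
# agent generates its client types from the same schema, so drift between the two
# streams surfaces at the integration gate rather than in production.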

Once both streams complete, an explicit integration step brings the work back together. This step validates that contracts are respected, runs full builds and tests, resolves conflicts, and prepares the combined output for QA. Conceptually, this is where a Tech Lead agent—or a human, if escalation is triggered—acts as the arbiter to ensure architectural coherence and prevent “two models, two designs” from slipping into production.

The added value of this approach is clear: faster throughput through parallelization, tighter cost control through model specialization, and clearer ownership boundaries. The tradeoff is increased coordination complexity, which must be managed through strict planning, contracts, and integration gates. With those safeguards in place, splitting frontend and backend across different models completes the full end-to-end flow envisioned by the PR.

@adryserage
Contributor Author

Discussion in progress on Discord : https://discord.com/channels/1448614759996854284/1456570143978033297

@MikeeBuilds added the labels feature (New feature or request), area/backend (This is backend only), size/XL (Extra large, 1000+ lines), and ❌ MERGE CONFLICTS on Jan 2, 2026
- Create crewai package with config, bridge, crews, flows, notifications
- Implement AutoClaudeBridge exposing Auto-Claude as CrewAI tools
- Add config.py to load agent models from UI settings JSON
- Add crewai>=0.76.0 and crewai-tools>=0.12.0 dependencies

Phase 1 of CrewAI integration plan.
- Add CrewAI types to settings.ts (CrewAIProfile, CrewAIAgentModelsConfig)
- Add CrewAI constants to models.ts (profiles, agent labels, crews)
- Create CrewAISettings.tsx component with profile selection and per-agent config
- Integrate CrewAI section into AppSettings navigation
- Add i18n translations for EN and FR

Phase 2 of CrewAI integration plan.
- Add Product Management Crew (Product Manager, Requirements Analyst, Priority Analyst)
- Add Development Crew (Tech Lead, Senior Developer, Code Reviewer)
- Add QA & Release Crew (QA Lead, Security Analyst, Release Manager)
- Each crew creates agents with dynamic model config from UI settings
- Crews use sequential process with task context chaining

Phase 3 of CrewAI integration plan.
- Add DevelopmentWorkflowFlow with Pydantic state management
- Implement complete lifecycle: intake → analysis → dev → QA → release
- Add routing logic for task types and QA iterations
- Include escalation handling for consecutive failures
- Export flow and helpers from flows/__init__.py
- Add NotificationService with Console, Slack, Email, Webhook, Linear channels
- Implement EscalationManager for human intervention triggers
- Support escalation on QA iterations, consecutive failures, security issues
- Export all notification components from main package
- Add tests for config loading and agent model selection
- Add tests for NotificationService and all channel types
- Add tests for EscalationManager with various escalation scenarios
- Add tests for WorkflowState, TaskType, and WorkflowStatus
- Include integration test placeholder for CrewAI SDK
The local crewai/ folder was shadowing the installed crewai SDK package,
causing circular import errors. Renamed to orchestration/ which:
- Avoids naming conflict with crewai==1.6.1 package
- Better describes the module's purpose (orchestration layer)
- All internal imports use relative paths, so no changes needed

Also updated tests/test_crewai.py to import from orchestration.
@adryserage
Contributor Author

Rebased onto develop branch, resolved AppSettings.tsx conflicts (merged icon imports, ProfileList/CrewAISettings imports, and AppSection type). Ready for review.

@adryserage force-pushed the feature/crewai-integration branch from b7bf54e to f6a2090 on January 4, 2026 at 05:24
…e config tests

The tests were failing because patching Path.home() didn't properly override
the XDG_CONFIG_HOME environment variable check. Patching _get_settings_path
directly ensures the tests work regardless of environment configuration.
@adryserage
Contributor Author

Test Fix Applied

Fixed the failing test_crewai.py tests by patching _get_settings_path instead of Path.home.

Issue: The original tests patched Path.home() but the config module also checks the XDG_CONFIG_HOME environment variable, which bypassed the patch on GitHub Actions runners.

Solution: Directly patch the _get_settings_path function to return the test settings file path, making the tests work regardless of environment configuration.
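
A minimal sketch of that approach (fixture names and the settings key are assumed):

from unittest.mock import patch

def test_config_reads_temp_settings(tmp_path):
    settings_file = tmp_path / "settings.json"
    settings_file.write_text('{"crewaiEnabled": true}')

    # Patch the path helper so the config module reads the temp file,
    # regardless of APPDATA / XDG_CONFIG_HOME on the test runner.
    with patch("orchestration.config._get_settings_path", return_value=settings_file):
        from orchestration.config import is_crewai_enabled
        assert is_crewai_enabled() is True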

Commits:

  • 4569a6b: fix(tests): patch _get_settings_path instead of Path.home for reliable config tests

- Remove unused imports (get_agent_model, get_crewai_config)
- Add spec_dir parameter to AutoClaudeBridge.__init__
- Add explanatory comments to empty except blocks
@adryserage
Contributor Author

CodeQL Issues Fixed

Addressed the code scanning alerts:

  1. Added spec_dir parameter to AutoClaudeBridge.__init__ - The crews were passing spec_dir but it wasn't a valid parameter. Now properly supported.

  2. Removed unused imports - get_agent_model and get_crewai_config were imported but not used in the bridge module.

  3. Added explanatory comments to empty except blocks - The lazy import pattern intentionally catches ImportError to allow fallback behavior.

Commits:

  • 60a567a: fix(crewai): address CodeQL issues in AutoClaudeBridge
