A multi-agent system for automatically generating GenePattern modules.
The generate-module.py script orchestrates multiple AI agents to:
- Research bioinformatics tools using web search and analysis
- Plan module structure, parameters, and architecture
- Generate module artifacts (Dockerfile, wrapper scripts, manifests, etc.)
- Validate each artifact using the Module Toolkit linters
- Create a complete, ready-to-use GenePattern module

- Environment Setup: pip install -r requirements.txt
- Environment Variables: Edit .env with your API keys and preferences
- Required Environment Variables:
  - DEFAULT_LLM_MODEL: LLM model for agents (default: Claude Sonnet 4)
  - BRAVE_API_KEY: For web research (optional but recommended)
  - MAX_ARTIFACT_LOOPS: Max validation retry attempts (default: 5)
  - MODULE_OUTPUT_DIR: Output directory (default: ./generated-modules)
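
As a rough illustration of how these variables and their defaults might be read, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not part of generate-module.py, and the sketch reads the process environment only (loading the .env file itself is left out). The README does not give the exact model identifier behind "Claude Sonnet 4", so no default is filled in for it.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed above."""
    default_llm_model: Optional[str]
    brave_api_key: Optional[str]
    max_artifact_loops: int
    module_output_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Model identifier string is not specified in the README, so no default here.
        default_llm_model=env.get("DEFAULT_LLM_MODEL"),
        # Optional: web research is presumably skipped or limited without it.
        brave_api_key=env.get("BRAVE_API_KEY"),
        max_artifact_loops=int(env.get("MAX_ARTIFACT_LOOPS", "5")),
        module_output_dir=env.get("MODULE_OUTPUT_DIR", "./generated-modules"),
    )


settings = load_settings({"MAX_ARTIFACT_LOOPS": "3"})
print(settings.max_artifact_loops, settings.module_output_dir)
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to test without touching the real environment.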
Run the script and follow the prompts:
python generate-module.py
You'll be prompted for:
- Tool name (required)
- Tool version (optional)
- Primary language (optional)
- Brief description (optional)
- Repository URL (optional)
- Documentation URL (optional)
Tool name: samtools
Tool version: 1.19
Primary language: C
Brief description: Tools for manipulating SAM/BAM files
Repository URL: https://github.com/samtools/samtools
Documentation URL: http://www.htslib.org/doc/samtools.html
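
The prompted fields could be carried through the later phases as one structured record. The sketch below is an assumption about how that might look: `ToolInfo` and `blank_to_none` are hypothetical names, and only the field list (one required, five optional) comes from the prompts above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolInfo:
    """Hypothetical record of the answers collected at the prompts."""
    name: str
    version: Optional[str] = None
    language: Optional[str] = None
    description: Optional[str] = None
    repository_url: Optional[str] = None
    documentation_url: Optional[str] = None

    def __post_init__(self) -> None:
        # Tool name is the only required field.
        self.name = self.name.strip()
        if not self.name:
            raise ValueError("Tool name is required")


def blank_to_none(answer: str) -> Optional[str]:
    """Treat an empty answer to an optional prompt as 'not provided'."""
    answer = answer.strip()
    return answer or None


samtools = ToolInfo(
    name="samtools",
    version=blank_to_none("1.19"),
    language=blank_to_none("C"),
    description=blank_to_none("Tools for manipulating SAM/BAM files"),
    repository_url=blank_to_none("https://github.com/samtools/samtools"),
    documentation_url=blank_to_none("http://www.htslib.org/doc/samtools.html"),
)
```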
- Agent: researcher_agent
  - Purpose: Gather comprehensive information about the tool
  - Actions:
    - Web search for documentation and examples
    - Analyze command-line interface and parameters
    - Identify dependencies and requirements
    - Research common usage patterns
- Agent: planner_agent
  - Purpose: Create implementation plan based on research
  - Actions:
    - Map parameters to GenePattern types
    - Design parameter groupings for UI
    - Plan module architecture and dependencies
    - Define validation and testing strategy
- Agents: Multiple artifact-specific agents
  - Current Artifacts (in generation order):
    - wrapper_agent: Generates wrapper scripts for tool integration
    - manifest_agent: Creates module manifest with metadata and command line
    - paramgroups_agent: Creates parameter groupings for UI organization
    - gpunit_agent: Generates test definitions for automated testing
    - documentation_agent: Generates user documentation
    - dockerfile_agent: Creates Dockerfile
For each artifact:
- Generate content using specialized agent
- Write to module directory
- Validate using appropriate linter tool
- If validation fails, retry up to MAX_ARTIFACT_LOOPS times
- Include feedback from previous attempts in retry prompts
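
The per-artifact steps above amount to a generate, write, validate, retry loop. A minimal sketch under stated assumptions: `generate_with_retries`, `fake_agent`, and `fake_linter` are hypothetical stand-ins, and the real script calls AI agents and the Module Toolkit linters where this sketch uses plain callables.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Tuple


def generate_with_retries(
    generate: Callable[[List[str]], str],           # feedback so far -> artifact content
    validate: Callable[[Path], Tuple[bool, str]],   # artifact path -> (passed, message)
    target: Path,
    max_loops: int = 5,                             # mirrors MAX_ARTIFACT_LOOPS
) -> Tuple[bool, int]:
    """Return (validated, attempts_used)."""
    feedback: List[str] = []
    attempts = 0
    for attempts in range(1, max_loops + 1):
        target.write_text(generate(feedback))       # write to the module directory
        passed, message = validate(target)
        if passed:
            return True, attempts
        feedback.append(message)                    # carried into the next attempt
    return False, attempts


# Stand-ins for an artifact agent and a linter, for illustration only.
def fake_agent(feedback: List[str]) -> str:
    return "FROM python:3.11\n" if feedback else "TODO\n"


def fake_linter(path: Path) -> Tuple[bool, str]:
    return path.read_text().startswith("FROM"), "Dockerfile must start with FROM"


with tempfile.TemporaryDirectory() as tmp:
    result = generate_with_retries(fake_agent, fake_linter, Path(tmp) / "Dockerfile")
print(result)  # (True, 2): first attempt fails, second uses the feedback
```

Accumulating every failure message, not just the latest, is one possible reading of "feedback from previous attempts"; the actual script may pass feedback differently.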
Generated modules are saved to {MODULE_OUTPUT_DIR}/{tool_name}_{timestamp}/:
samtools_20241222_143022/
├── wrapper.py # Execution wrapper script
├── manifest # Module metadata and command line
├── paramgroups.json # UI parameter groups
├── test.yml # GPUnit test definition
├── README.md # User documentation
└── Dockerfile # Container definition
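
The directory name in this example is consistent with a `YYYYMMDD_HHMMSS` timestamp. The sketch below builds such a path and checks which expected files are present. `module_dir`, `missing_artifacts`, and `ARTIFACT_FILES` are hypothetical names; the timestamp format is inferred from the example, and the pairing of artifact names to files is read off the tree above rather than taken from the script.

```python
from datetime import datetime
from pathlib import Path
from typing import List

# Artifact name -> file name, inferred from the example tree (an assumption).
ARTIFACT_FILES = {
    "wrapper": "wrapper.py",
    "manifest": "manifest",
    "paramgroups": "paramgroups.json",
    "gpunit": "test.yml",
    "documentation": "README.md",
    "dockerfile": "Dockerfile",
}


def module_dir(output_dir: str, tool_name: str, now: datetime) -> Path:
    """Build {MODULE_OUTPUT_DIR}/{tool_name}_{timestamp}."""
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(output_dir) / f"{tool_name}_{timestamp}"


def missing_artifacts(module_path: Path) -> List[str]:
    """List artifact names whose file is absent from the module directory."""
    return [
        name
        for name, filename in ARTIFACT_FILES.items()
        if not (module_path / filename).is_file()
    ]


path = module_dir("./generated-modules", "samtools", datetime(2024, 12, 22, 14, 30, 22))
print(path.name)  # samtools_20241222_143022
```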
The script provides real-time status updates:
[14:30:22] INFO: Creating module directory for samtools
[14:30:22] INFO: Created module directory: ./generated-modules/samtools_20241222_143022
[14:30:22] INFO: Starting research on the bioinformatics tool
[14:30:25] INFO: Research phase completed successfully
[14:30:25] INFO: Starting module planning based on research findings
[14:30:28] INFO: Planning phase completed successfully
[14:30:28] INFO: Starting artifact generation
[14:30:28] INFO: Generating dockerfile...
[14:30:31] INFO: Attempt 1/5 for dockerfile
[14:30:34] INFO: Generated Dockerfile (1847 characters)
[14:30:34] INFO: Validating dockerfile...
[14:30:37] INFO: Validation passed for dockerfile
[14:30:37] INFO: Successfully generated and validated dockerfile
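
Status lines in this shape can be produced with a small formatter. This is a sketch of the format only, not the script's actual logging code; `format_status` is a hypothetical name.

```python
from datetime import datetime


def format_status(level: str, message: str, now: datetime) -> str:
    """Format one status line as [HH:MM:SS] LEVEL: message."""
    return f"[{now.strftime('%H:%M:%S')}] {level}: {message}"


line = format_status(
    "INFO",
    "Creating module directory for samtools",
    datetime(2024, 12, 22, 14, 30, 22),
)
print(line)  # [14:30:22] INFO: Creating module directory for samtools
```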
After completion, you'll receive a comprehensive report:
============================================================
Module Generation Report
============================================================
Tool Name: samtools
Module Directory: ./generated-modules/samtools_20241222_143022
Research Complete: ✓
Planning Complete: ✓
Artifact Status:
wrapper:
Generated: ✓
Validated: ✓
Attempts: 1
manifest:
Generated: ✓
Validated: ✓
Attempts: 1
paramgroups:
Generated: ✓
Validated: ✓
Attempts: 1
gpunit:
Generated: ✓
Validated: ✓
Attempts: 1
documentation:
Generated: ✓
Validated: ✓
Attempts: 1
dockerfile:
Generated: ✓
Validated: ✓
Attempts: 1
Parameters Identified: 23
- input_file: File (Required)
- output_format: Choice (Optional)
- quality_threshold: Integer (Optional)
- threads: Integer (Optional)
- memory_limit: Text (Optional)
... and 18 more parameters
============================================================
🎉 MODULE GENERATION SUCCESSFUL!
Your GenePattern module is ready in: ./generated-modules/samtools_20241222_143022
============================================================
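
The per-artifact section of the report suggests a small status record per artifact. A minimal sketch, assuming hypothetical names (`ArtifactStatus`, `render_artifact_status`, `all_succeeded`); the "✗" mark for failures is an assumption, since the example shows only successes, and the original indentation of the report is not reproduced.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ArtifactStatus:
    """Hypothetical per-artifact record behind the report."""
    generated: bool = False
    validated: bool = False
    attempts: int = 0


def mark(ok: bool) -> str:
    # "✗" is an assumed failure mark; the README only shows "✓".
    return "✓" if ok else "✗"


def render_artifact_status(statuses: Dict[str, ArtifactStatus]) -> List[str]:
    """Render the 'Artifact Status' section as a list of lines."""
    lines = ["Artifact Status:"]
    for name, status in statuses.items():
        lines.append(f"{name}:")
        lines.append(f"Generated: {mark(status.generated)}")
        lines.append(f"Validated: {mark(status.validated)}")
        lines.append(f"Attempts: {status.attempts}")
    return lines


def all_succeeded(statuses: Dict[str, ArtifactStatus]) -> bool:
    """True only if every artifact was both generated and validated."""
    return all(s.generated and s.validated for s in statuses.values())


statuses = {
    "wrapper": ArtifactStatus(generated=True, validated=True, attempts=1),
    "dockerfile": ArtifactStatus(generated=True, validated=False, attempts=5),
}
report_lines = render_artifact_status(statuses)
print("\n".join(report_lines))
```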
The script follows Pydantic AI best practices for multi-agent systems:
- Agent Specialization: Each agent focuses on a single area of expertise
- Structured Communication: Agents pass structured data between phases
- Error Handling: Failed validations are retried up to MAX_ARTIFACT_LOOPS times, with feedback from earlier attempts
- Validation Integration: Built-in validation using MCP server tools
- Status Tracking: Real-time status updates during the run and a final report covering each artifact