swf-fastmon-agent is a fast monitoring service for the ePIC streaming workflow testbed.
This agent monitors STF (Super Time Frame) files, samples TF (Time Frame) subsets, and distributes metadata via Server-Sent Events (SSE) streaming, enabling real-time remote monitoring of ePIC data acquisition processes.
The fast monitoring agent is designed as part of the SWF testbed ecosystem and integrates with:
- swf-testbed: Infrastructure orchestration, process management, and Docker services
- swf-monitor: PostgreSQL database and Django web interface for persistent monitoring data
- swf-common-lib: Shared utilities and BaseAgent framework for messaging
- swf-data-agent: Sends stf_ready messages when STF files are available for fast monitoring
The agent operates as a managed service within swf-testbed, automatically configured and monitored through the central CLI. It extends the BaseAgent class from swf-common-lib for consistent messaging and logging across the ecosystem.
- Complete SWF testbed ecosystem installed (swf-testbed, swf-monitor, swf-common-lib as siblings)
- Python 3.9+ virtual environment
- All infrastructure services managed by swf-testbed
- Install the testbed ecosystem (from the swf-testbed directory):
cd $SWF_PARENT_DIR/swf-testbed
source install.sh # Installs all components including swf-fastmon-agent
- Configure agent environment:
cd $SWF_PARENT_DIR/swf-fastmon-agent
cp .env.example .env
# Edit .env with your values (most defaults work for local development)
- Start the complete testbed (infrastructure + agents):
cd $SWF_PARENT_DIR/swf-testbed
swf-testbed start # Starts Docker services + all agents including fastmon
- Check agent status:
swf-testbed status # Shows all services and agents
Note: All infrastructure services (PostgreSQL, ActiveMQ, Redis) are managed by swf-testbed via Docker Compose. Do not attempt to run them separately.
For development, you can run the agent manually outside of supervisord:
cd $SWF_PARENT_DIR/swf-fastmon-agent
source $SWF_PARENT_DIR/swf-testbed/.venv/bin/activate
# Message-driven mode (production mode - waits for stf_ready messages)
python -m swf_fastmon_agent.main
# Continuous mode (development/testing - scans directories)
export FASTMON_MODE=continuous
python -m swf_fastmon_agent.main
The fast monitoring agent is configured through environment variables. Copy .env.example to .env and update with your actual values.
- SWF_MONITOR_URL: HTTPS URL for authenticated API calls
- SWF_MONITOR_HTTP_URL: HTTP URL for REST logging (optional)
- SWF_API_TOKEN: Authentication token for swf-monitor API access
- ACTIVEMQ_HOST, ACTIVEMQ_PORT: ActiveMQ broker connection
- ACTIVEMQ_USER, ACTIVEMQ_PASSWORD: ActiveMQ credentials
- MQ_USER, MQ_PASSWD: Message queue credentials (for mq_comms module)
- MQ_CAFILE: SSL certificate path for secure connections
- FASTMON_MODE: Operation mode, either message (default, message-driven) or continuous (polling)
- FASTMON_SELECTION_FRACTION: Fraction of STF files to sample (0.0-1.0, default: 0.1)
- FASTMON_TF_FILES_PER_STF: Number of TF files per STF (default: 7)
- SWF_LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
See .env.example for a complete list of configuration options.
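The sampling variables above can be read and validated in a few lines. The following is a minimal sketch, not the agent's actual loader: the `FastmonConfig` class and `load_config` function are hypothetical names, and the `INFO` fallback for SWF_LOG_LEVEL is an assumption (the list above does not state a default for it).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FastmonConfig:
    """Hypothetical container for the settings documented above."""
    mode: str
    selection_fraction: float
    tf_files_per_stf: int
    log_level: str


def load_config(env=None):
    """Read fastmon settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Documented default: message-driven mode.
    mode = env.get("FASTMON_MODE", "message")
    if mode not in ("message", "continuous"):
        raise ValueError(f"FASTMON_MODE must be 'message' or 'continuous', got {mode!r}")

    # Documented default: 0.1, valid range 0.0-1.0.
    fraction = float(env.get("FASTMON_SELECTION_FRACTION", "0.1"))
    if not 0.0 <= fraction <= 1.0:
        raise ValueError(f"FASTMON_SELECTION_FRACTION must be in [0.0, 1.0], got {fraction}")

    # Documented default: 7 TF files per STF.
    tf_per_stf = int(env.get("FASTMON_TF_FILES_PER_STF", "7"))
    if tf_per_stf < 1:
        raise ValueError(f"FASTMON_TF_FILES_PER_STF must be >= 1, got {tf_per_stf}")

    # Assumed fallback; the README does not specify a default log level.
    log_level = env.get("SWF_LOG_LEVEL", "INFO").upper()

    return FastmonConfig(mode, fraction, tf_per_stf, log_level)
```

Passing a plain dict instead of relying on `os.environ` keeps the function easy to exercise in tests, for example `load_config({"FASTMON_MODE": "continuous"})`.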
The agent extends BaseAgent from swf-common-lib, providing:
- Message-Driven Processing: Receives stf_ready messages from data agent via ActiveMQ
- TF Sampling: Simulates TF subsamples from STF files based on configuration
- REST API Integration: Records TF metadata via swf-monitor /api/fastmon-files/ endpoint
- Workflow Tracking: Tracks processing stages via /api/workflow-stages/ for visibility
- Enhanced Heartbeats: Reports workflow metadata (active/completed tasks) to monitor
- Automatic Registration: Auto-registers as ActiveMQ subscriber with monitor
- REST Logging: Centralized logging via swf-monitor REST API
- Connection Resilience: Automatic reconnection on MQ disconnection
- Dual Operation Modes:
  - Message-driven mode (default): Responds to stf_ready messages from data agent
  - Continuous mode: Periodically scans directories (for development/testing)
- Sequential Agent IDs: Uses persistent state API for unique agent naming
- Environment Auto-Setup: Loads virtual environment and ~/.env variables on startup
- Standard Message Types: Follows workflow message type conventions
- MQ Client ID: Supports durable subscriptions with unique client IDs
- Status Reporting: Reports agent status, errors, and performance metrics
- Real-time Display: Receives and displays TF file notifications in terminal via SSE streaming
- Statistics Tracking: Monitors per-run TF counts and data volume
- Graceful Shutdown: Handles Ctrl+C with summary statistics
- Authentication: Uses API tokens for secure SSE stream access
- STF File Detection: Agent monitors directories for new STF files or receives data_ready messages
- TF Simulation: Generates TF subsamples from STF files based on configuration parameters
- Database Recording: Records TF metadata in swf-monitor database via REST API
- SSE Message Broadcasting: Agent sends TF file notifications to swf-monitor's /api/messages/ endpoint
- Real-time Streaming: swf-monitor broadcasts messages via SSE to connected clients at /api/messages/stream/
- Client Display: Client receives SSE stream and displays formatted TF information in real-time
- Historical Access: All data accessible via swf-monitor Django web application
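The STF selection and TF simulation steps in the flow above can be sketched as follows. This is an illustration under stated assumptions, not the agent's real code: the function names, the TF file naming scheme, and the record fields (`stf_file`, `tf_filename`, `sequence`) are hypothetical.

```python
import random


def select_stf(fraction, rng):
    """Return True if this STF should be sampled (probability = fraction)."""
    # rng.random() is in [0.0, 1.0), so fraction=1.0 always selects
    # and fraction=0.0 never does.
    return rng.random() < fraction


def simulate_tf_records(stf_filename, stf_file_id, tf_files_per_stf=7):
    """Build illustrative metadata records for the TF subsamples of one STF."""
    # Drop the extension, if any, to derive TF names (assumed naming scheme).
    stem = stf_filename.rsplit(".", 1)[0]
    return [
        {
            "stf_file": stf_file_id,  # ID of the parent STF record, not its filename
            "tf_filename": f"{stem}_tf_{i:03d}.tf",
            "sequence": i,
        }
        for i in range(1, tf_files_per_stf + 1)
    ]
```

In this sketch, each record would then be posted to swf-monitor and announced to SSE clients, one request per TF file.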
- Simplified Architecture: No ActiveMQ dependency for clients - only HTTP access required
- Better Scalability: SSE handles many concurrent read-only client connections efficiently
- Enhanced Security: API token-based authentication with fine-grained access control
- Web Integration Ready: Easy to add web-based dashboards that consume the same SSE stream
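Because clients only need HTTP, consuming the stream comes down to parsing SSE text lines. Below is a minimal sketch of that parsing plus client-side filtering, using only the standard library. The payload keys `msg_type` and `sender` are assumptions made for illustration; check the actual swf-monitor message schema before relying on them.

```python
import json


def parse_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates an event; dispatch only if it carried data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # strip the single optional leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def wanted(payload, message_types=None, agents=None):
    """Apply optional message-type and agent filters (assumed payload keys)."""
    if message_types and payload.get("msg_type") not in message_types:
        return False
    if agents and payload.get("sender") not in agents:
        return False
    return True
```

A client loop would feed decoded response lines into `parse_sse_events`, decode each data string with `json.loads`, and display only the payloads for which `wanted` returns True.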
# Start the SSE client with default settings
python -m swf_fastmon_client.main start
# Connect to a specific monitor URL
python -m swf_fastmon_client.main start --monitor-url https://my-monitor.domain.com
# Filter by specific message types
python -m swf_fastmon_client.main start --message-types tf_file_registered,fastmon_status
# Filter by specific agents
python -m swf_fastmon_client.main start --agents swf-fastmon-agent
# Combined filtering
python -m swf_fastmon_client.main start --message-types tf_file_registered --agents swf-fastmon-agent
# Set up environment variables (recommended)
export SWF_MONITOR_URL="http://localhost:8002"
export SWF_API_TOKEN="your_api_token_here"
# Then start with defaults
python -m swf_fastmon_client.main start
# Check client configuration
python -m swf_fastmon_client.main status
# Show version information
python -m swf_fastmon_client.main version
All development and testing should use the swf-testbed framework:
# Start testbed services (PostgreSQL, ActiveMQ, Redis via Docker)
cd $SWF_PARENT_DIR/swf-testbed
swf-testbed start
# Check system status
swf-testbed status
# View logs
swf-testbed logs # All agent logs
tail -f logs/swf-fastmon-agent.log # Fast monitor agent only
# Stop services
swf-testbed stop
# Run complete testbed test suite (includes all agents)
cd $SWF_PARENT_DIR/swf-testbed
./run_all_tests.sh
# Run fast monitoring agent tests specifically
cd $SWF_PARENT_DIR/swf-fastmon-agent
source $SWF_PARENT_DIR/swf-testbed/.venv/bin/activate
python -m pytest src/swf_fastmon_agent/tests/
This agent is part of the ePIC streaming workflow testbed ecosystem and follows strict integration guidelines:
- Never run infrastructure services independently; always use swf-testbed start
- Use swf-testbed CLI for all service management (start, stop, status, logs)
- Follow BaseAgent patterns from swf-common-lib for consistency
- Coordinate changes across repositories using infrastructure branches
See CLAUDE.md for detailed development guidelines and project-specific conventions.
Symptoms: Connection refused, "Cannot connect to database/ActiveMQ" errors
Solutions:
- Verify testbed services are running: swf-testbed status
- Start services if needed: swf-testbed start
- Check Docker containers: docker ps | grep swf
- Review service logs: docker-compose logs postgres activemq redis
Symptoms: Agent process fails to start or crashes immediately
Solutions:
- Check agent logs at $SWF_PARENT_DIR/swf-testbed/logs/
- Verify .env configuration exists and has required values
- Ensure virtual environment is activated
- Check supervisord status: swf-testbed status
Symptoms: API timeout errors, "Cannot connect to swf-monitor" errors
Solutions:
- Verify swf-monitor is running: swf-testbed status
- Check database connection: docker exec swf-postgres pg_isready
- Verify API token is set: echo $SWF_API_TOKEN
- Test API endpoint: curl -H "Authorization: Token $SWF_API_TOKEN" $SWF_MONITOR_URL/api/runs/
- Review monitor logs for 400/500 errors
Symptoms: "This field may not be null" or "stf_file does not exist" errors
Solutions:
- Verify STF file is registered before creating TF files
- Check that file_id (UUID) is being passed, not filename
- Ensure message data includes file_id field from STF record
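A quick guard can catch the filename-instead-of-UUID mistake before the request is sent. The sketch below is illustrative only: `build_tf_payload` and the payload field names are hypothetical, and the real swf-monitor schema may differ.

```python
import uuid


def is_uuid(value):
    """Return True if value parses as a UUID."""
    try:
        uuid.UUID(str(value))
    except ValueError:
        return False
    return True


def build_tf_payload(message, tf_filename):
    """Build an illustrative TF registration payload from an incoming message."""
    file_id = message.get("file_id")
    if file_id is None:
        raise ValueError("message has no file_id; is the STF file registered?")
    if not is_uuid(file_id):
        raise ValueError(f"file_id must be the STF record UUID, got {file_id!r}")
    # Reference the parent STF by its UUID, never by its filename.
    return {"stf_file": str(file_id), "tf_filename": tf_filename}
```

Failing early with a descriptive message is usually easier to debug than a 400 response from the API.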
Symptoms: "Auth failed" or "SSE endpoint not available" errors
Solutions:
- Verify swf-monitor SSE endpoints are enabled
- Check SWF_API_TOKEN is set and valid
- Ensure SWF_MONITOR_URL is correct (typically http://localhost:8002)
- Test SSE endpoint: curl -H "Authorization: Token $SWF_API_TOKEN" http://localhost:8002/api/messages/stream/
Symptoms: Client connects but receives no messages
Solutions:
- Check if agent is processing files: tail -f logs/swf-fastmon-agent.log
- Verify message type filters match (tf_file_registered)
- Check agent is sending notifications (look for "Sent TF file notification" in logs)
# Minimal local development configuration
export SWF_MONITOR_URL="http://localhost:8002"
export SWF_MONITOR_HTTP_URL="http://localhost:8002"
export SWF_API_TOKEN="your_token_here"