What's Changed
- Added Slurm dependency example
- Added unit tests for vec-inf client and missing unit tests for vec-inf API
- Fixed the multi-node launch GPU placement group issue: the Slurm script needs the `--exclusive` option, and the compilation config needs to stay at 0
- Set environment variables in the generated Slurm script instead of in the helper to ensure reusability
- Replaced `python3.10 -m vllm.entrypoints.openai.api_server` with `vllm serve` to support custom chat template usage
- Added additional launch options: `--exclude` for excluding certain nodes, `--node-list` for targeting a specific list of nodes, and `--bind` for binding additional directories
- Added remaining vLLM engine arg short-long name mappings for robustness
- Added notes in documentation to capture some gotchas and added vLLM version info
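
To make the Slurm-related changes above more concrete, here is a minimal Python sketch of how such command lines could be assembled. It is not the vec-inf implementation: the helper names `build_sbatch_args` and `build_serve_command`, the script name, the job ID, the node name, and the paths are all hypothetical. It also assumes that the `--node-list` launch option corresponds to Slurm's `--nodelist` flag and that `--exclusive` is passed on the `sbatch` command line rather than as an `#SBATCH` directive. The `--bind` option is left out because the notes do not say how it is applied.

```python
from __future__ import annotations

import shlex


def build_sbatch_args(
    script: str,
    depends_on: str | None = None,
    exclusive: bool = False,
    exclude: str | None = None,
    nodelist: str | None = None,
) -> list[str]:
    """Assemble an sbatch command line (hypothetical helper, not vec-inf code)."""
    args = ["sbatch"]
    if depends_on:
        # Start this job only after the given job has completed successfully.
        args.append(f"--dependency=afterok:{depends_on}")
    if exclusive:
        # Request whole nodes, as the multi-node placement fix requires.
        args.append("--exclusive")
    if exclude:
        # Comma-separated node names that Slurm should avoid.
        args.append(f"--exclude={exclude}")
    if nodelist:
        # Assumption: vec-inf's --node-list maps onto sbatch's --nodelist.
        args.append(f"--nodelist={nodelist}")
    args.append(script)
    return args


def build_serve_command(model: str, chat_template: str | None = None) -> list[str]:
    """Assemble a `vllm serve` command, optionally with a custom chat template."""
    cmd = ["vllm", "serve", model]
    if chat_template:
        cmd += ["--chat-template", chat_template]
    return cmd


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    print(shlex.join(build_sbatch_args(
        "launch.slurm", depends_on="12345", exclusive=True, exclude="gpu001",
    )))
    print(shlex.join(build_serve_command(
        "/models/example", chat_template="template.jinja",
    )))
```

Building the commands as argument lists rather than formatted strings keeps quoting correct if they are later passed to `subprocess.run`.
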
Tests Added
- `tests/vec-inf/client/test_api.py`:
  - `shutdown_model()`
  - `wait_until_ready()`
- `tests/vec-inf/client/test_helper.py`:
  - `ModelRegistry`
  - `PerformanceMetricsCollector`
  - `ModelStatusMonitor`
  - `ModelLauncher`