A Prefect server for orchestrating machine learning training and inference runs. Supports YOLO (training and inference), SegGPT (inference), ONNX models (inference), and the IFCB flow metric (training). After setting up the system with Docker, users can run and monitor workflows from the browser-based Prefect UI.
Copy the example environment file and fill in your values:
```bash
cp .env.example .env
```

Edit `.env` with your specific values:
- `POSTGRES_USERNAME`: Your PostgreSQL username
- `POSTGRES_PASSWORD`: Your PostgreSQL password
- `EXTERNAL_HOST_NAME`: External hostname of your machine
- `PROVENANCE_STORE_URL`: URL for the provenance store
- `MEDIASTORE_URL`: URL for your media store
- `MEDIASTORE_TOKEN`: Authentication token for the media store
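A filled-in `.env` might look like the following (all values are illustrative):

```bash
POSTGRES_USERNAME=prefect
POSTGRES_PASSWORD=change-me
EXTERNAL_HOST_NAME=ml-server.example.org
PROVENANCE_STORE_URL=https://provenance.example.org
MEDIASTORE_URL=https://media.example.org
MEDIASTORE_TOKEN=your-token-here
```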
Use Docker Compose to start the PostgreSQL container:
```bash
docker compose up -d postgres
```
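You can confirm the container is up before moving on:

```bash
docker compose ps postgres
```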
Create a virtual environment and install dependencies:
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r src/requirements.txt
```
Load environment variables and configure Prefect:
```bash
# Load environment variables
source .env

# Set Prefect configuration
prefect config set PREFECT_SERVER_API_HOST="$EXTERNAL_HOST_NAME"
prefect config set PREFECT_API_DATABASE_CONNECTION_URL="postgresql+asyncpg://$POSTGRES_USERNAME:$POSTGRES_PASSWORD@localhost:5432/prefect"

# Start Prefect server
prefect server start
```
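If the server fails to start, you can verify that both settings were applied with:

```bash
prefect config view
```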
In separate terminal windows, deploy the workflows you want to use:
For ONNX Inference:
```bash
source .venv/bin/activate
source .env
python src/flows/onnx_inference.py
```
For YOLO Inference:
```bash
source .venv/bin/activate
source .env
python src/flows/yolo_inference.py
```
For IFCB Flow Metric Training:
```bash
source .venv/bin/activate
source .env
python src/flows/ifcb_training.py
```
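Each script blocks its terminal while it serves the flow to the Prefect server, which is why they run in separate windows. A minimal sketch of the pattern (illustrative names only; the real flows live in `src/flows/` and may register their deployments differently):

```python
from prefect import flow


@flow(name="onnx-inference")  # hypothetical flow name
def onnx_inference(model: str, input_dir: str, output_dir: str) -> None:
    """Run ONNX inference (body omitted in this sketch)."""
    ...


if __name__ == "__main__":
    # serve() registers the flow as a deployment and polls for runs
    onnx_inference.serve(name="onnx-inference")
```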
Navigate to the Prefect UI in your browser at `http://{EXTERNAL_HOST_NAME}:4200`.
The ONNX inference workflow requires the following parameters in the Prefect UI:
`ONNXInferenceParams`:

- `model`: Path to the ONNX model file
- `input_dir`: Directory containing input data
- `output_dir`: Directory where results will be saved
- `batch` (optional): Batch size for inference
- `classes` (optional): Specific classes to process
- `outfile` (optional): Custom output filename
- `force_notorch` (optional): Force non-PyTorch backend
- `cuda_visible_devices`: GPU devices to use (default: "0,1,2,3")
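Runs can also be triggered from the command line instead of the UI. Assuming a deployment named as in the sketch above (parameter names must match the deployment's actual schema):

```bash
prefect deployment run 'onnx-inference/onnx-inference' \
  --param model=/models/detector.onnx \
  --param input_dir=/data/input \
  --param output_dir=/data/output
```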
The YOLO inference workflow requires two parameter sets:
`YOLOInferenceParams`:

- `data_dir`: Directory containing input images/videos
- `output_dir`: Directory where results will be saved
- `model_weights_path`: Path to YOLO model weights (.pt file)
- `device`: Compute device for inference (e.g., "0" for GPU 0, "cpu")
- `agnostic_nms`: Class-agnostic Non-Maximum Suppression (default: true)
- `iou`: IoU threshold for NMS to eliminate overlapping boxes (default: 0.5)
- `conf`: Minimum confidence threshold for detections (default: 0.1)
- `imgsz`: Image size for inference (default: 1280)
- `batch`: Batch size for processing multiple inputs (default: 16)
- `half`: Half-precision (FP16) inference for speed (default: false)
- `max_det`: Maximum detections allowed per image (default: 300)
- `vid_stride`: Frame stride for video processing (default: 1)
- `stream_buffer`: Queue frames vs. drop old frames (default: false)
- `visualize`: Visualize model features during inference (default: false)
- `augment`: Test-time augmentation for improved robustness (default: false)
- `classes` (optional): Filter predictions to specific class IDs
- `retina_masks`: High-resolution segmentation masks (default: false)
- `embed` (optional): Extract feature vectors from specified layers
- `name` (optional): Name for the prediction run subdirectory
- `verbose`: Display detailed inference logs (default: true)
`YOLOVisualizationParams`:

- `show`: Display annotated images/videos in a window (default: false)
- `save`: Save annotated images/videos to file (default: false)
- `save_frames`: Save individual video frames as images (default: false)
- `save_txt`: Save detection results in text format (default: false)
- `save_conf`: Include confidence scores in saved text files (default: false)
- `save_crop`: Save cropped images of detections (default: false)
- `show_labels`: Display labels for each detection (default: true)
- `show_conf`: Display confidence scores alongside labels (default: true)
- `show_boxes`: Draw bounding boxes around detected objects (default: true)
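These names mirror the Ultralytics predict arguments, so the flow presumably forwards them to a call along these lines (a sketch with placeholder paths, not the actual flow code):

```python
from ultralytics import YOLO

# Placeholder values standing in for the UI parameters above
model = YOLO("/models/best.pt")      # model_weights_path
results = model.predict(
    source="/data/images",           # data_dir
    device="0",
    conf=0.1,                        # defaults from YOLOInferenceParams
    iou=0.5,
    imgsz=1280,
    agnostic_nms=True,
    max_det=300,
    save=False,                      # fields from YOLOVisualizationParams
    save_txt=False,
    show_labels=True,
)
```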
The IFCB flow metric training workflow requires the following parameters in the Prefect UI:
`IFCBTrainingParams`:

- `data_dir`: Directory containing IFCB point cloud data
- `output_dir`: Directory where the trained model will be saved
- `id_file` (optional): File containing the list of IDs to load (one PID per line)
- `n_jobs`: Number of parallel jobs for the load/extraction phase (-1 uses all CPUs; default: -1)
- `contamination`: Expected fraction of anomalous distributions (default: 0.1)
- `aspect_ratio`: Camera frame aspect ratio (width/height; default: 1.36)
- `chunk_size`: Number of PIDs to process in each chunk (default: 100)
- `model_filename`: Filename for the trained model (default: "classifier.pkl")
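The `.pkl` extension suggests the trained model is serialized with pickle, in which case it can be loaded back in Python after a successful run (the path below uses the defaults listed above; adjust to your configured `output_dir`):

```python
import pickle

# Load the trained flow-metric model from the configured output directory
with open("output_dir/classifier.pkl", "rb") as f:
    model = pickle.load(f)

print(type(model))
```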
For YOLO training, the data directory should contain a `dataset.yaml` file:
```yaml
path: /data          # dataset root dir
train: images/train  # train images (relative to 'path')
val: images/val      # val images (relative to 'path')
test: images/test    # test images (optional)

names:
  0: person
  1: bicycle
  2: car
```
Ensure your directory structure matches:
```
data_dir/
├── dataset.yaml
├── images/
│   ├── train/
│   ├── val/
│   └── test/ (optional)
└── labels/
    ├── train/
    ├── val/
    └── test/ (optional)
```
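Each file under `labels/` pairs with the image of the same name and follows the standard YOLO annotation format: one object per line, a class ID followed by normalized center coordinates and box dimensions. For example (illustrative values):

```
0 0.481 0.634 0.122 0.250
2 0.207 0.312 0.090 0.115
```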