This repository provides a KNIME extension for fine-tuning Vision Transformer (ViT) models and predicting with them. The nodes are implemented entirely in Python with PyTorch and Hugging Face Transformers and can be used in your workflows in the KNIME Analytics Platform.
The extension can be installed from the KNIME Hub by dragging and dropping it into the KNIME Analytics Platform, or like any other KNIME extension via the KNIME Extension Manager.
Here is an example of a workflow that uses the extension.
ViT Classification Learner Node
- Train transformer models on image classification tasks.
- Supports ViT, Swin Transformer, and Pyramid Transformer architectures.
- Accepts training and validation image sets in PNG format.
- Configurable epochs, batch size, learning rate, and model type.
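The learner settings listed above map naturally onto a standard Hugging Face fine-tuning setup. The following is a minimal sketch of that idea, not the extension's actual code: `LearnerConfig`, `build_label_maps`, `build_model`, `fine_tune`, and the checkpoint names in `MODEL_CHECKPOINTS` are illustrative assumptions, and `train_loader` is assumed to yield `(pixel_values, label_ids)` tensor batches.

```python
from dataclasses import dataclass

# Example Hub checkpoints per model type (illustrative, not the extension's defaults).
MODEL_CHECKPOINTS = {
    "vit": "google/vit-base-patch16-224-in21k",
    "swin": "microsoft/swin-tiny-patch4-window7-224",
}


@dataclass
class LearnerConfig:
    """Hypothetical container for the settings exposed in the node dialog."""
    model_type: str = "vit"
    epochs: int = 3
    batch_size: int = 16
    learning_rate: float = 5e-5

    def __post_init__(self):
        if self.model_type not in MODEL_CHECKPOINTS:
            raise ValueError(f"unknown model type: {self.model_type!r}")
        if self.epochs < 1 or self.batch_size < 1 or self.learning_rate <= 0:
            raise ValueError("epochs, batch_size and learning_rate must be positive")


def build_label_maps(labels):
    """Map each distinct label string to an integer id, and back."""
    classes = sorted(set(labels))
    label2id = {name: i for i, name in enumerate(classes)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def build_model(config, label2id, id2label):
    """Load a pretrained backbone with a fresh classification head."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForImageClassification

    return AutoModelForImageClassification.from_pretrained(
        MODEL_CHECKPOINTS[config.model_type],
        num_labels=len(label2id),
        label2id=label2id,
        id2label=id2label,
        ignore_mismatched_sizes=True,  # the head is re-initialised for the new classes
    )


def fine_tune(model, train_loader, config):
    """Plain PyTorch training loop over (pixel_values, label_ids) batches."""
    import torch

    optimizer = torch.optim.AdamW(model.parameters(), lr=config.learning_rate)
    model.train()
    for _ in range(config.epochs):
        for pixel_values, label_ids in train_loader:
            optimizer.zero_grad()
            # Hugging Face classification models return the loss when labels are passed.
            loss = model(pixel_values=pixel_values, labels=label_ids).loss
            loss.backward()
            optimizer.step()
    return model
```

Storing `id2label` in the model configuration is what allows a later prediction step to turn class indices back into the original label strings.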
ViT Classification Predictor Node
- Predict labels and class probabilities on new image data.
- Automatically decodes predictions back to the original label strings.
- Customizable output column names and probability formatting.
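To illustrate what decoding a prediction involves, the sketch below turns one row of raw model logits into a label string plus per-class probability columns. It uses only the standard library; the function names, the default column names, and the `decimals` parameter are assumptions made for this example and may differ from the node's actual options.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(
    logits,
    id2label,
    prediction_column="Prediction",
    probability_prefix="P (",
    probability_suffix=")",
    decimals=4,
):
    """Build one output row: the predicted label and one probability per class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    row = {prediction_column: id2label[best]}
    for idx, prob in enumerate(probs):
        column = f"{probability_prefix}{id2label[idx]}{probability_suffix}"
        row[column] = round(prob, decimals)
    return row


if __name__ == "__main__":
    # Hypothetical two-class example: index 0 is "cat", index 1 is "dog".
    print(decode_prediction([2.0, 0.0], {0: "cat", 1: "dog"}))
```

With a Hugging Face model, the `id2label` mapping is typically available as `model.config.id2label`, provided it was set when the model was fine-tuned.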