Code | HuggingFace Model |
---|---|
MoH-ViT | 🤗 MoH-ViT-B-75, MoH-ViT-B-50, MoH-ViT-S-80, MoH-ViT-S-75 |
MoH-DiT | 😊 MoH-DiT-90 |
MoH-LLaMA3-8B | 😊 MoH-LLaMA3-8B |
pip install -r requirements.txt
Download and extract ImageNet train and val images from http://image-net.org/.
The directory structure is the standard layout for torchvision's `datasets.ImageFolder`, with the training and validation data expected in the `train` and `val` folders, respectively:
/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
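
Before launching a long run, it can help to confirm that the dataset folder matches the layout above. The following is a minimal sketch using only the Python standard library; the helper name `check_imagefolder_layout` is hypothetical and not part of this repository, and it only checks folder names, not image contents.

```python
from pathlib import Path


def check_imagefolder_layout(root):
    """Return a list of problems found under an ImageFolder-style root."""
    root = Path(root)
    problems = []
    classes = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing split folder: {split}")
            continue
        # Each immediate subdirectory of a split is treated as one class.
        classes[split] = {p.name for p in split_dir.iterdir() if p.is_dir()}
        if not classes[split]:
            problems.append(f"no class folders in: {split}")
    # Both splits should expose the same set of class folder names.
    if len(classes) == 2 and classes["train"] != classes["val"]:
        diff = sorted(classes["train"] ^ classes["val"])
        problems.append(f"class folders differ between splits: {diff}")
    return problems
```

An empty list means both `train` and `val` exist and contain the same class folders; any other result lists what to fix.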
To evaluate a pre-trained MoH-ViT on the ImageNet-1K val set using 8 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=2024 \
--use_env main.py \
--config ./configs/${MODEL_TYPE}.py \
--data-path ${ImageNet-1K_PATH} \
--resume ./checkpoints/${MODEL_TYPE}.pth \
--eval
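
With `--use_env`, `torch.distributed.launch` passes the local rank to each process through the `LOCAL_RANK` environment variable rather than a `--local_rank` argument, alongside `RANK` and `WORLD_SIZE`. The sketch below shows one way a script could read these values. The helper `read_dist_env` is hypothetical, and how `main.py` actually consumes them is not shown here.

```python
import os


def read_dist_env(env=None):
    """Read the rank settings a launcher exports as environment variables."""
    env = os.environ if env is None else env
    # Fall back to single-process defaults when the variables are absent.
    return {
        "rank": int(env.get("RANK", 0)),
        "local_rank": int(env.get("LOCAL_RANK", 0)),
        "world_size": int(env.get("WORLD_SIZE", 1)),
    }
```

When the script is run directly without a launcher, the defaults describe a single process with rank 0.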
To train MoH-ViT on ImageNet-1K using 8 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=2024 \
--use_env main.py \
--config ./configs/${MODEL_TYPE}.py \
--data-path ${ImageNet-1K_PATH} \
--batch-size 128 \
--output_dir results/${MODEL_TYPE} \
--num_workers 32
Alternatively, run the provided script for the model you want to train:
bash moh_transnext_base_75.sh ${ImageNet-1K_PATH}
bash moh_transnext_base_50.sh ${ImageNet-1K_PATH}
bash moh_transnext_small_80.sh ${ImageNet-1K_PATH}
bash moh_transnext_small_75.sh ${ImageNet-1K_PATH}
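
The training command above passes `--batch-size 128` and launches 8 processes. Assuming the flag is a per-process (per-GPU) batch size, which is common in distributed training scripts but not confirmed here, the global batch size can be estimated as follows. The function name and the `grad_accum_steps` parameter are illustrative only.

```python
def global_batch_size(per_process_batch, num_processes, grad_accum_steps=1):
    """Samples contributing to one optimizer step across all processes."""
    return per_process_batch * num_processes * grad_accum_steps


print(global_batch_size(128, 8))  # 1024
```

If you train on fewer GPUs, this number shrinks proportionally unless the per-process batch size is raised, so the learning rate may need adjusting.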