Slow training #137
Comments
Bump. I'm facing the same issue: training time is around 5 days on 3× RTX 3090s with batch size 192 and a dataset of ~400k images. The number of workers is set to 8.
2024-09-28 05:43:37 | INFO | damo.apis.detector_trainer:368 - epoch: 1/40, iter: 300/1483, mem: 7203Mb, iter_time: 7.779s, model_time: 7.283s, total_loss: 2.5, loss_cls: 0.2, loss_bbox: 1.6, loss_dfl: 0.7, lr: 4.878e-05, size: (640, 640), ETA: 5 days, 10:22:08
As a result, I looked at the caching method in YOLOX and tried to apply it in DAMO-YOLO. The ETA dropped from 2 days 18 hours to 8 hours, but I'm not sure whether this affected model performance; I'm running an evaluation experiment to check. It would be great if the authors added dataset caching.
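For reference, the idea behind that kind of caching is roughly the following. This is a minimal sketch, not code from YOLOX or DAMO-YOLO; the class and parameter names are illustrative only.

```python
import cv2
from torch.utils.data import Dataset


class RamCachedDataset(Dataset):
    """Minimal sketch of YOLOX-style RAM caching: decode every image once
    up front and serve the cached arrays afterwards, so later epochs skip
    disk I/O and JPEG decoding entirely."""

    def __init__(self, img_paths, cache_in_ram=True):
        self.img_paths = img_paths
        self.cache = [None] * len(img_paths)
        if cache_in_ram:
            for i, path in enumerate(self.img_paths):
                # A 640x640 BGR uint8 image is ~1.2 MB, so estimate the total
                # memory footprint before enabling this for very large datasets.
                self.cache[i] = cv2.imread(path)

    def __len__(self):
        return len(self.img_paths)

    def __getitem__(self, idx):
        img = self.cache[idx]
        if img is None:  # fall back to disk if caching is disabled
            img = cv2.imread(self.img_paths[idx])
        return img
```

If I remember correctly, YOLOX caches images after resizing them to the training resolution, which keeps the memory footprint bounded; for a ~400k-image dataset, caching at full resolution can easily exhaust RAM, so resizing before caching (or caching to fast local disk) is the safer variant.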
Just too slow... why?
Same question: training is much slower than with similar models.
My cfg:
Device
Before Asking
I have read the README carefully.
I want to train my custom dataset, and I have read the tutorial for finetuning on custom data carefully and organized my dataset correctly.
I have pulled the latest code of the main branch and run it again, and the problem still exists.
Search before asking
Question
DAMO-YOLO is very similar to YOLOX, but training on my custom dataset with DAMO-YOLO takes about 3x longer than with YOLOX. Why?
Additional
No response