
Video inference requires more than 24GB GPU memory? #26

Open
KangYuan1233 opened this issue Feb 18, 2025 · 4 comments

Comments

@KangYuan1233

I am trying to run video inference on an NVIDIA RTX 4090 with the Sa2VA-1B config, but I get a CUDA OutOfMemoryError.

@KangYuan1233 (Author)

```
Traceback (most recent call last):
  File "/mnt/mnt_yr/Image_algorithm_group/userdata/wrz/Sa2VA-main/demo/demo.py", line 81, in <module>
    result = model.predict_forward(
  File "/root/.cache/huggingface/modules/transformers_modules/modeling_sa2va_chat.py", line 735, in predict_forward
    pred_masks = self.grounding_encoder.language_embd_inference(sam_states, [seg_hidden_states] * num_frames)
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 342, in language_embd_inference
    _, _, out_mask_logits = self.sam2_model.add_language_embd(
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 3792, in add_language_embd
    current_out, pred_mask_gpu = self._run_single_frame_inference(
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 3511, in _run_single_frame_inference
    ) = self._get_image_feature(inference_state, frame_idx, batch_size)
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 3464, in _get_image_feature
    backbone_out = self.forward_image(image)
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 2720, in forward_image
    backbone_out = self.image_encoder(img_batch)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 715, in forward
    features, pos = self.neck(self.trunk(sample))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 1165, in forward
    x = blk(x)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 1025, in forward
    x, pad_hw = window_partition(x, window_size)
  File "/root/.cache/huggingface/modules/transformers_modules/sam2.py", line 839, in window_partition
    x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU
```
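The failure on a tiny 20 MiB request suggests the card is already nearly full when it happens, not that one giant allocation is to blame. As a rough illustration of why, activation memory for a video batch grows linearly with the number of frames pushed through the image encoder at once; the shapes below are hypothetical, not Sa2VA's actual ones:

```python
def tensor_mib(shape, dtype_bytes=2):
    """Memory of a dense tensor in MiB (dtype_bytes=2 for fp16/bf16)."""
    n = 1
    for dim in shape:
        n *= dim
    return n * dtype_bytes / 2**20

# Hypothetical encoder activation of shape (frames, channels, height, width).
# Doubling the frame count doubles the allocation, so a long clip can
# exhaust a 24 GB card even though each individual tensor is small.
per_frame = tensor_mib((1, 256, 256, 256))   # one frame
batch = tensor_mib((32, 256, 256, 256))      # 32 frames at once
print(per_frame, batch)  # 32.0 1024.0
```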

@HarborYuan (Collaborator)

Hi @KangYuan1233 ,

This is a bit strange, could you please share the script you are using to run the code?

@lxtGH (Collaborator) commented Feb 25, 2025

@HarborYuan Have we tried this on a 24 GB card?

@lxtGH (Collaborator) commented Feb 25, 2025

@KangYuan1233 I guess you should use at least a 40 GB card for video inference.
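If a bigger card is not available, one generic workaround is to run inference on a few frames at a time rather than passing the whole clip in one call, which caps peak activation memory at the chunk size. This is only a sketch of the splitting step; whether Sa2VA's `predict_forward` can be called on frame subsets without losing tracking state is an assumption that would need checking against the repo:

```python
def chunk_frames(frames, chunk_size):
    """Split a frame list into consecutive chunks of at most chunk_size frames."""
    return [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]

# e.g. a 10-frame clip in chunks of 4: only 4 frames' activations are live at once
print(chunk_frames(list(range(10)), 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```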
