🐛 Describe the bug
Minimal reproducible code
import time
import torch
from torchcodec.decoders import VideoDecoder
device = "cpu"
# device = "cuda"
video_path = "NASAs_Most_Scientifically_Complex_Space_Observatory_Requires_Precision-MP4_small.mp4"
decoder = VideoDecoder(
    video_path,
    device=device,
    # dimension_order="NHWC",
    # seek_mode="approximate",
    num_ffmpeg_threads=8,
)
index_list = [106, 125, 127, 130, 132, 144, 146, 171, 180, 181, 189, 194, 195, 199, 203, 204, 204, 214, 227, 242, 259, 263, 266, 296, 303, 314, 320, 323, 325, 328, 333, 338, 338, 350, 370, 373, 381, 384, 384, 384, 396, 400, 444, 448, 452, 463, 467, 470, 473, 479, 487, 489, 489, 529, 532, 532, 559, 564, 570, 617, 649, 658, 658, 665, 674, 691, 703, 704, 716, 718, 733, 743, 750, 754, 765, 777, 786, 792, 814, 818, 818, 821, 833, 847, 854, 858, 859, 877, 881, 891, 925, 926, 928, 949, 954, 967, 975, 982, 987, 990]
data = decoder.get_frames_at(indices=index_list)
loop_number = 20
start_time = time.time()
for i in range(loop_number):
    data = decoder.get_frames_at(indices=index_list)
print(data.data.shape)
end_time = time.time()
# print(data.data)
# print(data.data.shape)
print(f"spend time:{(end_time - start_time) / loop_number}")
Output (average seconds per get_frames_at call):
cpu spend time: 0.9169630885124207
gpu spend time: 3.559278666973114
Switching device between "cpu" and "cuda" shows that CUDA is about 3.9x slower than CPU for the same 100 frame indices (roughly 3.56 s versus 0.92 s per get_frames_at call). I don't understand why.
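To make the two numbers easier to compare, the measurement could be tightened a little. CUDA work is generally queued asynchronously, and the first calls on a GPU often pay one-time setup costs, so a benchmark usually wants untimed warm-up calls and an explicit synchronization before the clock is read. The sketch below is only an illustration: `benchmark` is a hypothetical helper (not part of torchcodec), and the workload in the demo is a stand-in. This would not by itself explain the slowdown; it only rules out measurement artifacts.

```python
import time
from typing import Callable, Optional


def benchmark(
    fn: Callable[[], object],
    warmup: int = 1,
    iters: int = 20,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the mean wall-clock seconds per call of fn.

    benchmark is a hypothetical helper for this issue, not a torchcodec API.
    sync, if given, is called before the timer starts and again before it
    stops, so that pending asynchronous work is included in the measurement.
    """
    # Untimed warm-up calls absorb one-time initialization costs.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / iters


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU or a video file.
    mean_s = benchmark(lambda: sum(range(10_000)), warmup=2, iters=5)
    print(f"mean seconds per call: {mean_s:.6f}")
```

With the repro above, the call would look like benchmark(lambda: decoder.get_frames_at(indices=index_list), sync=torch.cuda.synchronize) for the CUDA run, and the same call without sync for the CPU run.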
Versions
As mentioned above