Closed
Description
❓ General Questions
What is the proper way to utilize multiple GPUs? When I generate the config, compile, and load the MLCEngine with multiple tensor shards, it still errors out whenever the model is larger than a single GPU's memory. In addition, `nvidia-smi` shows only one GPU actually being utilized.
e.g. this was run with 4 tensor shards
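For reference, the flow I followed was roughly the one below (a sketch only: the model path, quantization, and output paths are placeholders, and the flag names assume the standard `mlc_llm` CLI):

```shell
# 1. Generate the chat config with tensor parallelism enabled.
#    --tensor-parallel-shards 4 should write "tensor_parallel_shards": 4
#    into the resulting mlc-chat-config.json.
mlc_llm gen_config ./dist/models/MY-MODEL \
  --quantization q4f16_1 \
  --conv-template llama-2 \
  --tensor-parallel-shards 4 \
  -o ./dist/MY-MODEL-q4f16_1-MLC

# 2. Compile the model library for CUDA.
mlc_llm compile ./dist/MY-MODEL-q4f16_1-MLC/mlc-chat-config.json \
  --device cuda \
  -o ./dist/libs/MY-MODEL-q4f16_1-cuda.so
```

Even after this, loading the engine fails with an out-of-memory error when the model exceeds a single GPU, which is what I'd expect tensor parallelism to avoid.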