
[Question] Proper way to use multiple GPUs #2562

Open
0xLienid opened this issue Jun 10, 2024 · 10 comments

Labels
question Question about the usage

Comments

@0xLienid

❓ General Questions

What is the proper way to actually utilize multiple GPUs? When I generate the config, compile, and load the MLCEngine with multiple tensor shards, it still errors out if the model size is larger than a single GPU's memory. Also, if I check nvidia-smi, only one GPU is really being utilized.

e.g. this was run with 4 tensor shards:

[screenshot: nvidia-smi output]

0xLienid added the question label on Jun 10, 2024
@MasterJH5574
Collaborator

Hi @0xLienid, thanks for the question. There are a couple of ways to get things right:

  1. Run mlc_llm gen_config with --tensor-parallel-shards 4 and then run mlc_llm compile directly.
  2. Run mlc_llm compile with --overrides "tensor_parallel_shards=4".

If you follow either of the two ways above, you don't need to specify tensor_parallel_shards again when constructing the MLCEngine.
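
For example, here is a minimal sketch of way 1 end to end. The paths, quantization, and conv template below are illustrative placeholders rather than values from this issue; only the --tensor-parallel-shards flag and the MLCEngine construction are the point.

# Shell steps for way 1 (illustrative paths and quantization):
#   mlc_llm gen_config <path-to-model> --quantization q4f16_1 --conv-template LM \
#       --tensor-parallel-shards 4 -o ./dist/model-MLC
#   mlc_llm compile ./dist/model-MLC/mlc-chat-config.json -o ./dist/model-MLC/model.so

from mlc_llm import MLCEngine

# No tensor_parallel_shards needed here; the shard count is baked into the
# generated config and the compiled model library.
engine = MLCEngine(
    model="./dist/model-MLC",
    model_lib="./dist/model-MLC/model.so",
)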

It might be more helpful for us to triage the issue you encountered if you don't mind sharing the log printed when running mlc_llm compile or your Python script.

@0xLienid
Author

Will rerun for the logs in a bit, but these are the config generation and compilation calls that led to this GPU usage. For both, parallel_shards is set to 4. When the model is loaded, it also says it's using the multi-GPU loader.

from mlc_llm.interface.gen_config import gen_config

...

# Generate the model config with tensor parallelism enabled
# (parallel_shards is set to 4).
gen_config(
    config=config,
    model=model,
    quantization=quantization_obj,
    conv_template="LM",
    context_window_size=None,
    sliding_window_size=None,
    prefill_chunk_size=None,
    attention_sink_size=None,
    tensor_parallel_shards=parallel_shards,
    max_batch_size=1,
    output=quantization_dir,
)

from mlc_llm.interface.compile import compile as compile_mlc

...

# Compile the model library, overriding tensor_parallel_shards
# to the same value.
compile_mlc(
    config=config_file_compile,
    quantization=quantization_obj,
    model_type=model,
    target=target,
    opt=OptimizationFlags.from_str("O2"),
    build_func=build_func,
    system_lib_prefix="auto",
    output=SAVE_DIR / model_name / quantization / "compilation.so",
    overrides=ModelConfigOverride(
        context_window_size=None,
        sliding_window_size=None,
        prefill_chunk_size=None,
        attention_sink_size=None,
        max_batch_size=1,
        tensor_parallel_shards=parallel_shards,
    ),
    debug_dump=None,
)

@MasterJH5574
Collaborator

Just want to share some more pointers that may be helpful: in the compile log, the model metadata is printed out:

...
[2024-06-10 11:27:27] INFO compile.py:145: Exporting the model to TVM Unity compiler
[2024-06-10 11:27:33] INFO compile.py:151: Running optimizations using TVM Unity
[2024-06-10 11:27:33] INFO compile.py:171: Registering metadata: {'model_type': 'qwen2',
'quantization': 'q4f16_1', 'context_window_size': 32768, 'sliding_window_size': -1,
'attention_sink_size': -1, 'prefill_chunk_size': 2048, 'tensor_parallel_shards': 8,     <<<<<<<
'kv_state_kind': 'kv_cache', 'max_batch_size': 80}
[2024-06-10 11:27:36] INFO pipeline.py:52: Running TVM Relax graph-level optimizations
...

The expectation is to see 4 here (the example log above happens to show 8).
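
As a quick sanity check before compiling, you can also read the shard count back from the mlc-chat-config.json that gen_config writes into its output directory; the path below is a placeholder for that directory.

import json
from pathlib import Path

# gen_config writes mlc-chat-config.json into the directory passed as output;
# replace the path below with your actual output directory.
cfg = json.loads(Path("./dist/model-MLC/mlc-chat-config.json").read_text())
print(cfg["tensor_parallel_shards"])  # expect 4 for a 4-way sharded build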

@MasterJH5574
Collaborator

Another thing: if your local MLC was installed before Jun 7, you may need to upgrade to the latest nightly, as we fixed some related logic in #2533.

@0xLienid
Author

I will double check, but this should have been built from source as of yesterday.

@MasterJH5574
Collaborator

MasterJH5574 commented Jun 10, 2024

> when the model is loaded it also says it's using the multi gpu loader.

If it says this and prints the following log:

[xx:xx:xx] /workspace/mlc-llm/cpp/loader/multi_gpu_loader.cc:140: [Worker #0] Loading model to device: cuda:0
[xx:xx:xx] /workspace/mlc-llm/cpp/loader/multi_gpu_loader.cc:140: [Worker #1] Loading model to device: cuda:1
[xx:xx:xx] /workspace/mlc-llm/cpp/loader/multi_gpu_loader.cc:140: [Worker #2] Loading model to device: cuda:2
[xx:xx:xx] /workspace/mlc-llm/cpp/loader/multi_gpu_loader.cc:140: [Worker #3] Loading model to device: cuda:3

then there is nothing wrong with gen_config and compile, and we might want to check the model size instead. Some logs from when the model is being loaded would be much appreciated.

@MasterJH5574
Collaborator

I will double check, but this should have been built from source as of yesterday

Got it, then it should be fine as #2533 is already included.

@0xLienid
Author

> when the model is loaded it also says it's using the multi gpu loader.
>
> If it says this and prints the following log
>
> [xx:xx:xx] /workspace/mlc-llm/cpp/loader/multi_gpu_loader.cc:140: [Worker #0] Loading model to device: cuda:0
> [xx:xx:xx] /workspace/mlc-llm/cpp/loader/multi_gpu_loader.cc:140: [Worker #1] Loading model to device: cuda:1

Yes, it says this. The model is ~70 GB and I have 4 A100s, so it should fit comfortably when sharded across them. For now I've worked around this by increasing the GPU memory share so that it all fits within one GPU, but obviously that's less than ideal.
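
For reference, the back-of-the-envelope math behind "should fit comfortably" (the per-GPU capacity is my assumption, since A100s come in 40 GB and 80 GB variants):

# Weights are split roughly evenly across tensor-parallel shards;
# KV cache and activations come on top of this.
total_weight_gb = 70.0   # approximate model size mentioned above
shards = 4               # one shard per A100
print(total_weight_gb / shards)  # ~17.5 GB of weights per GPU, well under 40/80 GB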

OK, once my evals are done running I'll rerun the config and compile to get logs.

@0xLienid
Author

@MasterJH5574 this is the log:

[screenshot: compile log]

@MasterJH5574
Collaborator

Thanks for sharing! It looks pretty normal actually. What does the log look like when loading parameters? How far does the progress bar get?
