
Support PyTorch set_per_process_memory_fraction #39

Open
1 of 3 tasks
kcm opened this issue Dec 29, 2022 · 1 comment
Labels
enhancement · good first issue

Comments

kcm (Contributor) commented Dec 29, 2022

Summary

PyTorch allows limiting the amount of GPU memory a process may allocate. This is useful, for example, when a GPU is shared across processes.

set_per_process_memory_fraction(fraction, device=None): Set the memory fraction for a process. The fraction limits the caching allocator to a portion of the memory on a CUDA device. The allowed amount equals the total visible memory multiplied by the fraction. If a process tries to allocate more than the allowed amount, the allocator raises an out-of-memory error.
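
For reference, a minimal sketch of how the existing PyTorch call is used today; the 0.5 fraction and device index 0 are illustrative values only, not part of this proposal:

```python
import torch

if torch.cuda.is_available():
    # Cap this process at 50% of the total visible memory on CUDA device 0.
    # Allocations beyond the cap raise a CUDA out-of-memory error.
    torch.cuda.set_per_process_memory_fraction(0.5, device=0)
```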

Proposal

This setting takes a fraction in [0, 1] and an optional device. Use an environment variable alongside ENABLE_CUDA, named CUDA_MEMORY_FRACTION, whose value is in the range 0.0-1.0 and is passed as fraction. Additionally, if set, check and prefer CUDA_MEMORY_FRACTION_... variable(s), where the value has the same format and the ... suffix is passed as device for each variable found.
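
A hypothetical sketch of the proposed environment-variable handling. The variable names follow the proposal; the helper name apply_memory_fraction_from_env and the parsing details are illustrative, not a committed design:

```python
import os

import torch

PREFIX = "CUDA_MEMORY_FRACTION"


def apply_memory_fraction_from_env() -> None:
    """Apply CUDA_MEMORY_FRACTION and CUDA_MEMORY_FRACTION_<device> settings, if any."""
    if not torch.cuda.is_available():
        return
    # Global fraction, applied to the default CUDA device.
    value = os.getenv(PREFIX)
    if value is not None:
        torch.cuda.set_per_process_memory_fraction(float(value))
    # Per-device overrides, e.g. CUDA_MEMORY_FRACTION_0=0.5 limits device 0.
    # Applied last so they take precedence over the global value.
    for name, value in os.environ.items():
        if name.startswith(PREFIX + "_"):
            device = int(name[len(PREFIX) + 1:])
            torch.cuda.set_per_process_memory_fraction(float(value), device=device)
```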

Questions

kcm added the enhancement and good first issue labels on Dec 29, 2022
kcm (Contributor, Author) commented Dec 29, 2022

One use case is AWS vGPU support, so that multiple consumers of the vGPU device(s) don't assume they have exclusive use of the full resource.

gauthamchandra added a commit to uhadmin/weaviate-helm that referenced this issue Dec 29, 2022
Due to the lack of support for GPU virtualization by Weaviate (see
weaviate/t2v-transformers-models#39
for details), we need to ensure that each pod that uses the GPU gets the
full GPU without sharing it.

The way to ensure this is to schedule each GPU-enabled pod to its own
dedicated node using Kubernetes's anti-affinity feature.

This change addresses that, making it easy to run clusters with GPU support.
kcm added a commit that referenced this issue Jan 17, 2023
gauthamchandra added a commit to uhadmin/weaviate-helm that referenced this issue Apr 3, 2023