Support PyTorch set_per_process_memory_fraction
#39
kcm added the enhancement (New feature or request) and good first issue (Good for newcomers) labels on Dec 29, 2022
One use case is AWS vGPU support, so that multiple consumers of the vGPU device(s) don't assume they have exclusive rights to the full resource.
gauthamchandra added a commit to uhadmin/weaviate-helm that referenced this issue on Dec 29, 2022:

Due to the lack of support for GPU virtualization by Weaviate (see weaviate/t2v-transformers-models#39 for details), we need to ensure that each pod that uses the GPU gets the full GPU without sharing it. The way to ensure this is to schedule each GPU-enabled pod on its own dedicated node using Kubernetes's anti-affinity feature. This change does exactly that, making it easy to run clusters with GPU support.
gauthamchandra added a commit to uhadmin/weaviate-helm that referenced this issue on Apr 3, 2023:

Due to the lack of support for GPU virtualization by Weaviate (see weaviate/t2v-transformers-models#39 for details), we need to ensure that each pod that uses the GPU gets the full GPU without sharing it. The way to ensure this is to schedule each GPU-enabled pod on its own dedicated node using Kubernetes's anti-affinity feature. This change does exactly that, making it easy to run clusters with GPU support.
Summary
PyTorch allows limiting the GPU memory a process may allocate via `torch.cuda.set_per_process_memory_fraction`. This is useful, for example, when a GPU resource is shared.
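For reference, a minimal illustration of the underlying PyTorch call (the 0.5 fraction and device index 0 below are arbitrary example values, not part of this proposal):

```python
import torch

# Cap this process at ~50% of GPU 0's total memory; allocations beyond
# the cap raise an out-of-memory error instead of consuming the whole device.
if torch.cuda.is_available():
    torch.cuda.set_per_process_memory_fraction(0.5, device=0)
```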
Proposal
This setting takes a fraction in [0, 1] and an optional device. Use an environment variable alongside `ENABLE_CUDA` of the form `CUDA_MEMORY_FRACTION`, where the value is 0.0-1.0 and is passed to `fraction`. Additionally, if set, check and prefer `CUDA_MEMORY_FRACTION_...` variable(s), where the value has the same format and the `...` is passed to `device` for each variable found. A rough sketch of this is shown below.
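A minimal sketch of how the proposed variables could be read at startup; the helper name `apply_cuda_memory_fraction` and the assumption that the `...` suffix is an integer device index are illustrative, not a settled design:

```python
import os

import torch


def apply_cuda_memory_fraction() -> None:
    """Apply CUDA_MEMORY_FRACTION / CUDA_MEMORY_FRACTION_<device> if set."""
    if not torch.cuda.is_available():
        return

    # Per-device variables, e.g. CUDA_MEMORY_FRACTION_0=0.5, are preferred.
    # The suffix is assumed here to be an integer device index.
    per_device = {
        key[len("CUDA_MEMORY_FRACTION_"):]: value
        for key, value in os.environ.items()
        if key.startswith("CUDA_MEMORY_FRACTION_")
    }
    if per_device:
        for device, fraction in per_device.items():
            torch.cuda.set_per_process_memory_fraction(float(fraction), device=int(device))
        return

    # Otherwise fall back to a single fraction applied to the default device.
    fraction = os.environ.get("CUDA_MEMORY_FRACTION")
    if fraction is not None:
        torch.cuda.set_per_process_memory_fraction(float(fraction))
```

With this, `CUDA_MEMORY_FRACTION_0=0.5` would cap the process at half of GPU 0's memory, while `CUDA_MEMORY_FRACTION=0.5` alone would apply to the default device.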
Questions

`CUDA_MEMORY_FRACTION` / `CUDA_MEMORY_FRACTION_...`?