Hello, I successfully ran the profiler tool on my classification model to profile the maximum memory usage, because I want to run different CNNs on the same GPU. But I'm really baffled by the results of the profiler. Let me explain.
I have an NVIDIA RTX 3090 with 24 GB of memory, so for my small CNN I set a 512 MiB memory limit in my code, before any other GPU use, with this call: `tf.config.set_logical_device_configuration(gpus[0], [tf.config.LogicalDeviceConfiguration(memory_limit=512)])`
It seems to work, judging by the TensorFlow log:
2022-01-19 16:24:13.615890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with **512 MB memory:** -> device: 0, name: GeForce RTX 3090, pci bus id: 0000:2d:00.0, compute capability: 8.6
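For context, here is the full setup around that call (a minimal sketch; the `gpus` list comes from `tf.config.list_physical_devices`, and the configuration has to be applied before TensorFlow initializes the GPU, otherwise it is rejected):

```python
import tensorflow as tf

# The limit must be set before any op runs on the GPU; once the device is
# initialized, TensorFlow refuses to change the logical device configuration.
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=512)])  # in MB
    logical_gpus = tf.config.list_logical_devices("GPU")
    print(len(gpus), "physical GPU(s),", len(logical_gpus), "logical GPU(s)")
```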
nvidia-smi shows that 419 MiB of GPU memory are used.
Then I run inference on the classification model with batch size = 1,
and TensorBoard shows that the model uses about 100 MiB,
so theoretically I could have set a smaller memory limit (under 512). But the real memory use reported by nvidia-smi is 1869 MiB!
Finally, if I want a tool that tells me the real memory consumption of a model, how should I use the TensorBoard profiler? Is the TensorBoard result actually useless?
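For reference, this is roughly how I capture the trace and read TensorFlow's own allocator statistics (a minimal sketch with a tiny stand-in model; `logdir` is just the directory TensorBoard points at). As far as I can tell, both the profiler's memory view and `get_memory_info` only see memory that goes through TensorFlow's allocator, not the CUDA context or the cuDNN/cuBLAS kernels:

```python
import tensorflow as tf

# Tiny stand-in for the real classification CNN.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

tf.profiler.experimental.start("logdir")    # trace viewable in TensorBoard
model(tf.random.normal([1, 224, 224, 3]))   # one inference, batch size = 1
tf.profiler.experimental.stop()

# TensorFlow's own view of its GPU allocator, in MiB.
info = tf.config.experimental.get_memory_info("GPU:0")
print("current: %.0f MiB" % (info["current"] / 2**20))
print("peak:    %.0f MiB" % (info["peak"] / 2**20))
```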
I made further investigations. Actually the call works, but the documentation is not clear enough. In my test I set memory_limit=200 and watched the NVIDIA-reported memory at each stage:
A) When I call `import tensorflow` => NVIDIA memory allocated is 423 MiB
B) When I call the code with the memory limit => NVIDIA memory allocated is 423 + 200 = 623 MiB
C) When the first inference is called, TensorFlow adds a C part of 938 MiB (+ 423 + 200) => total = 1561 MiB
So I understand that A + C is a constant overhead needed by TensorFlow, and memory_limit only affects the B part. I tested this on many different models.
A + C presumably depends on the driver or the GPU hardware.
So now it's clear to me. But the documentation could mention this, because for a small model of about 100 MiB I need about 1.5 GB of GPU memory, which is confusing.
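To make the A/B/C breakdown reproducible, here is a minimal measurement sketch (it assumes `pynvml` is installed, e.g. via the `nvidia-ml-py` package; it simply reads the driver-reported used memory after each stage, so the exact numbers will differ with the driver, CUDA/cuDNN versions and GPU model):

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

def used_mib(label):
    # Driver-level view of the device, the same figure nvidia-smi reports.
    used = pynvml.nvmlDeviceGetMemoryInfo(handle).used / 2**20
    print(f"{label}: {used:.0f} MiB used")

used_mib("baseline")

# A) import TensorFlow
import tensorflow as tf
used_mib("after import tensorflow")

# B) apply the memory limit and force the logical device to be created
gpus = tf.config.list_physical_devices("GPU")
tf.config.set_logical_device_configuration(
    gpus[0], [tf.config.LogicalDeviceConfiguration(memory_limit=200)])
tf.config.list_logical_devices("GPU")  # triggers device initialization
used_mib("after device creation (memory_limit=200)")

# C) first real computation: the cuDNN/cuBLAS kernels get loaded here
x = tf.random.normal([1, 224, 224, 3])
w = tf.random.normal([3, 3, 3, 16])
tf.nn.conv2d(x, w, strides=1, padding="SAME")
used_mib("after the first convolution")
```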