I tried running the IntelAI DLRM model with int8 precision using the default int8_configure.json. Could someone clarify whether quantization happens each time the inference_performance.sh script is triggered, or whether the int8 weights are stored after the first run and reused for later runs?
Currently, the run takes around 10 hours to complete on a 64-core machine. Please let me know if any additional info is required from my end.
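If the script does requantize on every invocation, one workaround is to cache the quantized weights after the first run and reload them afterwards. Below is a minimal, hypothetical sketch of that pattern using PyTorch's post-training dynamic quantization; `TinyModel` and the file path are illustrative stand-ins, not the actual DLRM code or its int8_configure.json flow.

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the real model; the actual DLRM is far larger.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        return self.fc(x)

model = TinyModel().eval()

# First run: quantize the Linear layers to int8 (post-training, dynamic)
# and save the quantized state dict so later runs can skip this step.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
torch.save(qmodel.state_dict(), "model_int8.pt")

# Later runs: rebuild the quantized module skeleton (cheap), then load the
# cached int8 weights instead of requantizing from scratch.
qmodel2 = torch.quantization.quantize_dynamic(
    TinyModel().eval(), {nn.Linear}, dtype=torch.qint8
)
qmodel2.load_state_dict(torch.load("model_int8.pt"))

# Both models should now produce identical outputs.
x = torch.randn(1, 8)
print(torch.allclose(qmodel(x), qmodel2(x)))
```

Whether this is applicable depends on how the IntelAI scripts produce and consume the int8 model, so please treat it as a sketch of the general save-once/reuse idea rather than a drop-in fix.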