Benchmark on a10g and h100 #168
Is a double-precision card faster than a single-precision card? Will the A100 be faster than the L40 for GPU4PySCF? If I want to purchase a new GPU, should I consider an A100 or V100 over an L40, or even the RTX 4090/5090?
Currently, GPU4PySCF generally runs faster on the A100 than on the L40. For certain algorithms, such as density fitting, the A100 can be 10x faster than the L40. But consumer-grade GPUs (RTX 4090/5090) are much cheaper.
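The gap largely comes down to FP64 throughput: data-center cards like the A100 and V100 dedicate silicon to double precision, while the L40 and RTX-class cards run FP64 at roughly 1/64 of their FP32 rate. A rough comparison using NVIDIA's published peak numbers (approximate, for illustration only; exact figures vary by clock and card variant):

```python
# Approximate peak throughput in TFLOPS from NVIDIA's published spec
# sheets (illustrative only; exact numbers vary by clock and variant).
PEAK_TFLOPS = {
    #           FP32   FP64
    "A100":    (19.5,  9.7),
    "V100":    (15.7,  7.8),
    "L40":     (90.5,  1.4),
    "RTX4090": (82.6,  1.3),
}

def fp64_ratio(card: str) -> float:
    """FP64 throughput as a fraction of FP32 for the given card."""
    fp32, fp64 = PEAK_TFLOPS[card]
    return fp64 / fp32

for card in PEAK_TFLOPS:
    fp32, fp64 = PEAK_TFLOPS[card]
    print(f"{card:8s} FP64 = {fp64:5.1f} TFLOPS, FP64/FP32 = {fp64_ratio(card):.3f}")
```

Since quantum chemistry integrals are typically accumulated in double precision, the roughly 7x raw FP64 advantage of the A100 over the L40 (9.7 vs 1.4 TFLOPS) is consistent with the up-to-10x density-fitting speedups reported above.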
Can GPU4PySCF run on more than one GPU?
@aris1978 Yes, you can run it on a multi-GPU system. This feature is still experimental, and we are still working on improving its performance.
In a multi-GPU system, is the VRAM additive? (For example, can two GPUs with 16 GB of VRAM each run jobs that would require 32 GB of VRAM on a single GPU?)
@aris1978 Yes, the large intermediate variables are distributed across the GPUs. This feature is still experimental; please let us know if you run into any issues.
Amazing work. Thanks for the benchmarks on the A100 and V100. Has anyone tried the A10G or H100?