Limit GPU memory usage? #75
GPU memory is mostly allocated via CuPy. If you want the GPU to have room for other work, you can set a memory limit through CuPy: https://docs.cupy.dev/en/stable/user_guide/memory.html#limiting-gpu-memory-usage Although the GINT* kernels do not allocate global memory explicitly, they allocate a lot of local memory for high angular momenta, and that local memory is ultimately backed by global memory. So for high angular momenta you may still hit the 'out of memory' issue.
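A minimal sketch of the CuPy limit mentioned above (the 4 GiB figure is illustrative, not a recommendation). The environment variable must be set before CuPy is first imported; it also accepts a percentage string such as "50%":

```python
import os

# Cap CuPy's default memory pool at 4 GiB (illustrative value).
# Must be set before CuPy is first imported in the process.
os.environ["CUPY_GPU_MEMORY_LIMIT"] = str(4 * 1024**3)

# Equivalently, after importing CuPy, the limit can be set in code:
# import cupy
# cupy.get_default_memory_pool().set_limit(size=4 * 1024**3)

print(os.environ["CUPY_GPU_MEMORY_LIMIT"])  # -> 4294967296
```

Note that, as the comment above explains, this caps only pool allocations; local-memory spills from the GINT* kernels are managed by the CUDA driver and are not governed by this limit.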
Thank you for the explanation.
Hello, I am reopening this issue. I have found that the errors stop if I turn on CUDA_MPS and limit the number of active threads with this command: My understanding is that this reduces the amount of local/shared memory in use at once, which stops the errors at the expense of runtime. Is it possible to make a similar modification at runtime, or at compile time, in the code? Perhaps these values? gpu4pyscf/gpu4pyscf/lib/gint/gint.h Lines 77 to 81 in 6474b41
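For reference, a hypothetical sketch of the MPS-based workaround described above (the 50% figure is an assumption, not a value from the thread). The variable must be exported before the MPS control daemon starts:

```shell
# Cap every MPS client at 50% of the device's threads (illustrative value).
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=50
# Then start the MPS control daemon (requires an NVIDIA driver; shown
# commented out here since it is system-specific):
# nvidia-cuda-mps-control -d
echo "$CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"
```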
This is a good suggestion. If some threads are turned off, there is no need to allocate local memory for them. We can take it as one of the possible solutions.
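A back-of-envelope sketch of why this helps (the numbers are hypothetical and not taken from gint.h): CUDA reserves local memory per resident thread, and spills are backed by global memory, so halving the active threads roughly halves the spilled footprint:

```python
# Hypothetical model: per-device local-memory reservation scales with
# the number of resident threads times the per-thread spill size.
def local_mem_bytes(resident_threads: int, bytes_per_thread: int) -> int:
    return resident_threads * bytes_per_thread

full = local_mem_bytes(2048, 4096)  # 2048 threads x 4 KiB = 8 MiB
half = local_mem_bytes(1024, 4096)  # half the threads -> 4 MiB
print(full, half)  # -> 8388608 4194304
```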
Hi, first of all, I'm absolutely blown away by the performance of GPU4PySCF, thank you for this amazing tool! I have a beginner question about an issue I encountered. I'm running a torsional scan similar to the provided example, and it generally works well for several iterations. However, at some point I get the following error:

CUDA Error of GINTint2e_jk_kernel: out of memory

This happens on our cluster with an A100 40GB GPU. Since my molecule isn't very large (24 atoms) and it runs fine for multiple iterations before failing, I'm a bit confused. Is there a way to free up memory between iterations to prevent this issue? Full code:

```python
import time
import pyscf
from pyscf import lib
from pyscf.geomopt.geometric_solver import optimize
from gpu4pyscf.dft import rks

atom = '''
C 0.724002 1.135021 -0.907355
O -0.356123 0.965447 -0.024473
C -0.744599 -0.386333 0.152087
C 0.396187 -1.157444 0.792032
O 0.011790 -2.507030 0.899684
C 1.644622 -1.020028 -0.053519
C 1.948759 0.441464 -0.321131
N 3.069963 0.492310 -1.261050
O 2.695457 -1.654767 0.636744
C -1.987816 -0.375699 1.005567
O -3.055730 0.286385 0.366128
O 0.929643 2.485897 -1.104082
H -0.977695 -0.828245 -0.823532
H 0.596763 -0.736811 1.783422
H 1.468667 -1.522896 -1.009604
H 2.212953 0.934306 0.618699
H 0.481425 0.707737 -1.884177
H 1.156487 2.903388 -0.265770
H 3.435645 -1.785156 0.038141
H 0.756639 -3.006876 1.245376
H -1.757549 0.093793 1.965767
H -2.306214 -1.397909 1.189633
H -2.790237 1.193944 0.194295
N 3.757659 1.504438 -1.211762
N 4.455331 2.386355 -1.246447
'''

xc = 'B3LYP'
bas = '6-311++G(2d,2p)'
scf_tol = 1e-10
max_scf_cycles = 200
screen_tol = 1e-14
grids_level = 3

mol = pyscf.M(atom=atom, basis=bas, max_memory=120000)
mol.verbose = 1

mf_GPU = rks.RKS(mol, xc=xc).density_fit()
mf_GPU.grids.level = grids_level
mf_GPU.conv_tol = scf_tol
mf_GPU.max_cycle = max_scf_cycles
mf_GPU.screen_tol = screen_tol

gradients = []
start_time = time.time()

# Content of geometric_scan.txt:
# $scan
# dihedral 1 7 8 24 90 -240 20
mol_eq = optimize(
    mf_GPU,
    maxsteps=500000000,
    constraints='geometric_scan.txt',  # atom index is 1-based in this file
)
print("Optimized coordinate:")
print(mol_eq.atom_coords())
print(time.time() - start_time)
```
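On freeing memory between iterations: a hedged sketch, assuming (as noted earlier in the thread) that CuPy's memory pools hold most of the device memory. It returns cached device and pinned memory to the system and could be called between optimization steps; `release_gpu_cache` is a hypothetical helper name, not part of GPU4PySCF:

```python
def release_gpu_cache():
    """Return CuPy's cached device and pinned memory to the system.

    Sketch only: this helps when pool caching is the cause of the
    out-of-memory error, and it is a no-op if CuPy is not installed.
    """
    try:
        import cupy
    except ImportError:
        return
    cupy.get_default_memory_pool().free_all_blocks()
    cupy.get_default_pinned_memory_pool().free_all_blocks()
```

Note this would not help if the failure comes from the kernels' local-memory reservation rather than from pool allocations.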
@Tillsten Thank you for the feedback! The geometry optimization converged in 10 iterations on my side, taking about 80 seconds on a V100-32GB. I was using the constraints commented in your script; I assume you were using the same. Most GPU memory is released between optimization iterations. As shown in the figure above, GPU memory usage is almost constant in the first few iterations. However, it blew up at 14:13:30, probably due to a failure of the optimization. Can you share the GeomeTRIC log?
I attached a log from a run.
Hello,
When running on a GPU that might be doing other work at the same time, I sometimes see out-of-memory errors:
CUDA Error of GINTint2e_jk_kernel: out of memory
Is it possible to specify a hard limit on the amount of memory used by these kernels?