Bug: BigLlama-3.1-681B-Instruct requires llama_model_max_nodes to return a higher value #8950
Labels
bug-unconfirmed
medium severity
What happened?
This issue is a reappearance of issue #8615, which was fixed in PR #8622. I recommend reading both for more background on this problem.
For Meta-Llama-3.1-405B-Instruct, the default llama_model_max_nodes value of 8192 turned out to still be enough. For its self-merge, available at https://huggingface.co/mlabonne/BigLlama-3.1-681B-Instruct (GGUFs at https://huggingface.co/mradermacher/BigLlama-3.1-681B-Instruct-GGUF/tree/main), it unfortunately is not.
For this issue to be fixed, the commented-out logic in llama.cpp/src/llama.cpp (lines 3571 to 3573 at commit 3071c0a), which checks model.hparams.n_layer > 200, needs to be enabled for this model to work. Maybe a good approach would be to have 0-200 layers return 8192, >200 return 16384, and >400 return 32768. To play around with this model, I made the llama_model_max_nodes function always return 16384, which fixed the issue.
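A minimal sketch of that tiered approach, assuming the existing llama_model_max_nodes signature in src/llama.cpp (the 200/400 thresholds are just the values proposed above, not tuned limits):

```cpp
// Sketch only: scale the graph node budget with the model's layer count.
// The 200/400 cutoffs are the values suggested in this report, not
// measured limits.
static size_t llama_model_max_nodes(const llama_model & model) {
    if (model.hparams.n_layer > 400) {
        return 32768;
    }
    if (model.hparams.n_layer > 200) {
        return 16384; // enough for BigLlama-3.1-681B-Instruct in my testing
    }
    return 8192; // current default; sufficient up to Meta-Llama-3.1-405B
}
```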
Name and Version
version: 3557 (3071c0a)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output