[Bugfix] Fix value unpack error of simple connector for KVCache transfer. #11058

ShangmingCai · 2024-12-10T08:28:17Z

Currently, send_kv_caches_and_hidden_states reads num_heads and head_size from kv_cache[0].shape, which raises a ValueError when the KVCache is compacted. Since model_executable is already a function input, we can read these values directly from the model config and avoid the error. The values are also the same for every layer, so we now compute them once for all layers instead of once per layer.

For example:
when kv_cache[0].shape == torch.Size([2162, 16, 40, 128]), the unpacking line works normally;
when kv_cache[0].shape == torch.Size([2162, 81920]), it raises ValueError.
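The two cases above can be reproduced without vLLM or torch, since torch.Size is a tuple subclass and unpacks like a plain tuple. This is only a minimal sketch: unpack_heads is a hypothetical helper that mirrors the failing line, and the shapes are the ones quoted in the example.

```python
def unpack_heads(shape):
    """Mirror the original unpacking, which assumes a 4-D KV cache shape."""
    _, _, num_heads, head_size = shape
    return num_heads, head_size


full_shape = (2162, 16, 40, 128)   # 4-D layout: unpacking succeeds
compact_shape = (2162, 81920)      # compacted 2-D layout: unpacking fails

ok_result = unpack_heads(full_shape)

try:
    unpack_heads(compact_shape)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok_result)
print(error_message)
```

Note that 16 * 40 * 128 == 81920, so the compacted shape holds the same number of elements per row; only the dimensions needed for unpacking are missing.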

ERROR 12-03 14:31:48 engine.py:135] ValueError('Error in model execution (input dumped to /tmp/err_execute_model_input_20241203-143148.pkl): not enough values to unpack (expected 4, got 2)')
ERROR 12-03 14:31:48 engine.py:135] Traceback (most recent call last):
ERROR 12-03 14:31:48 engine.py:135] File "/mnt/data_disk101/data_disk/lwq/LLM_INFER/split_platform/opensource/vllm-kuntai-disagg-refactor_1202/vllm/worker/model_runner_base.py", line 116, in _wrapper
ERROR 12-03 14:31:48 engine.py:135] return func(*args, **kwargs)
ERROR 12-03 14:31:48 engine.py:135] File "/mnt/data_disk101/data_disk/lwq/LLM_INFER/split_platform/opensource/vllm-kuntai-disagg-refactor_1202/vllm/worker/model_runner.py", line 1718, in execute_model
ERROR 12-03 14:31:48 engine.py:135] get_kv_transfer_group().send_kv_caches_and_hidden_states(
ERROR 12-03 14:31:48 engine.py:135] File "/mnt/data_disk101/data_disk/lwq/LLM_INFER/split_platform/opensource/vllm-kuntai-disagg-refactor_1202/vllm/distributed/kv_transfer/kv_transfer_agent.py", line 60, in send_kv_caches_and_hidden_states
ERROR 12-03 14:31:48 engine.py:135] self.connector.send_kv_caches_and_hidden_states(
ERROR 12-03 14:31:48 engine.py:135] File "/mnt/data_disk101/data_disk/lwq/LLM_INFER/split_platform/opensource/vllm-kuntai-disagg-refactor_1202/vllm/distributed/kv_transfer/kv_connector/simple_connector.py", line 134, in send_kv_caches_and_hidden_states
ERROR 12-03 14:31:48 engine.py:135] _, _, num_heads, head_size = kv_cache[0].shape
ERROR 12-03 14:31:48 engine.py:135] ValueError: not enough values to unpack (expected 4, got 2)
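The idea of the fix, reading the head dimensions from the model config once instead of from each layer's cache shape, could be sketched roughly as follows. This is not the PR's diff: heads_from_config is a hypothetical helper, the attribute names (num_key_value_heads, hidden_size, num_attention_heads) assume a Hugging Face style config, and the config values are made up to match the shapes in the example. The sketch also ignores tensor parallelism, which the follow-up PR #12074 referenced later in this thread deals with.

```python
from types import SimpleNamespace


def heads_from_config(config):
    """Derive (num_heads, head_size) from a config, independent of cache layout.

    Assumes Hugging Face style attribute names; these are illustrative only.
    """
    num_heads = config.num_key_value_heads
    head_size = config.hidden_size // config.num_attention_heads
    return num_heads, head_size


# Hypothetical config values chosen so the result matches the example shapes.
config = SimpleNamespace(
    hidden_size=5120,
    num_attention_heads=40,
    num_key_value_heads=40,
)

# Computed once, before looping over layers.
num_heads, head_size = heads_from_config(config)

# Both cache layouts from the example give the same answer, because the
# cache shape is never unpacked.
layer_shapes = [(2162, 16, 40, 128), (2162, 81920)]
results = [(num_heads, head_size) for _ in layer_shapes]
print(results)
```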

CC list: @KuntaiDu @youkaichao

Signed-off-by: ShangmingCai <[email protected]>

github-actions · 2024-12-10T08:28:32Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀


KuntaiDu

LGTM. Thanks for the contribution!


Fix value unpack error of simple connector.

f0b13d4

Signed-off-by: ShangmingCai <[email protected]>

retrigger ci

ecfb7f0

Signed-off-by: ShangmingCai <[email protected]>

youkaichao requested a review from KuntaiDu December 10, 2024 19:30

Fix typo.

b8fc9aa

Signed-off-by: ShangmingCai <[email protected]>

KuntaiDu approved these changes Dec 12, 2024


KuntaiDu enabled auto-merge (squash) December 12, 2024 19:56

github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Dec 12, 2024

KuntaiDu merged commit db6c264 into vllm-project:main Dec 12, 2024
68 checks passed

sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024

[Bugfix] Fix value unpack error of simple connector for KVCache trans…

4f0bdeb

…fer. (vllm-project#11058) Signed-off-by: ShangmingCai <[email protected]>

BKitor pushed a commit to BKitor/vllm that referenced this pull request Dec 30, 2024

[Bugfix] Fix value unpack error of simple connector for KVCache trans…

65ae010

…fer. (vllm-project#11058) Signed-off-by: ShangmingCai <[email protected]>

ShangmingCai mentioned this pull request Jan 15, 2025

[Bugfix] Fix num_heads value for simple connector when tp enabled #12074

Merged
