
Qwen2-1.5B does not work: only outputs many “!” #589

Closed
DouHappy opened this issue Jul 4, 2024 · 4 comments

Comments

@DouHappy

DouHappy commented Jul 4, 2024

My environment:
sglang 0.1.17
torch 2.3.0
CUDA 11.8

My problem:
sglang works well with qwen1.5-4B and qwen1.5-0.5B, but it does not work with qwen2-1.5B: the output is all "!".

My test script:

# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 2>&1 | tee sglang.log
# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 --attention-reduce-in-fp32 2>&1 | tee sglang.log
# neither command works

import sglang as sgl

@sgl.function
def sglang_build_qwen_prompt(s, question):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen(
        name='output',
        temperature=0,
        max_tokens=100,
    ))


sgl.set_default_backend(sgl.RuntimeEndpoint(base_url='http://127.0.0.1:5000'))

output = sglang_build_qwen_prompt.run(
    question="Hello, Please Introduce yourself."
)
print(output.messages())

My script outputs:

[{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Hello, Please Introduce yourself.'}, {'role': 'assistant', 'content': '!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'}]
@merrymercy
Contributor

merrymercy commented Jul 9, 2024

Can you check the prompt template by printing s.text()? There is also a chat template fix PR for QWen (#530)

@DouHappy
Author

Can you check the prompt template by printing s.text()? There is also a chat template fix PR for QWen (#530)

Sorry for the late reply. The template seems normal.

# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 2>&1 | tee sglang.log
# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 --attention-reduce-in-fp32 2>&1 | tee sglang.log
# neither command works

import sglang as sgl

@sgl.function
def sglang_build_qwen_prompt(s, question):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question)
    # s += "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    # s += f"<|im_start|>user\n{question}<|im_end|>\n"
    # s += "<|im_start|>assistant\n"
    s += sgl.assistant(sgl.gen(
        name='output',
        temperature=0,
        max_tokens=40,
    ))
    print(f"s.text():\n{s.text()}")


sgl.set_default_backend(sgl.RuntimeEndpoint(base_url='http://127.0.0.1:5000'))

output = sglang_build_qwen_prompt.run(
    question="Hello, Please Introduce yourself."
)
print(output.messages())

I got this output:

s.text():
SYSTEM:You are a helpful assistant.
USER:Hello, Please Introduce yourself.
ASSISTANT:!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

[{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Hello, Please Introduce yourself.'}, {'role': 'assistant', 'content': '!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'}]

I also tried adding the template manually. Is the template correct?

# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 2>&1 | tee sglang.log
# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 --attention-reduce-in-fp32 2>&1 | tee sglang.log
# neither command works

import sglang as sgl

@sgl.function
def sglang_build_qwen_prompt(s, question):
    # s += sgl.system("You are a helpful assistant.")
    # s += sgl.user(question)
    s += "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    s += f"<|im_start|>user\n{question}<|im_end|>\n"
    s += "<|im_start|>assistant\n"
    s += sgl.gen(
        name='output',
        temperature=0,
        max_tokens=40,
    )
    print(f"s.text():\n{s.text()}")


sgl.set_default_backend(sgl.RuntimeEndpoint(base_url='http://127.0.0.1:5000'))

output = sglang_build_qwen_prompt.run(
    question="Hello, Please Introduce yourself."
)
# print(output.messages())

Output:

s.text():
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello, Please Introduce yourself.<|im_end|>
<|im_start|>assistant
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
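For reference, the ChatML layout the script above builds by hand can be sketched as a small helper. This is a minimal sketch for comparing against `s.text()`; `build_chatml_prompt` is a hypothetical helper, not part of sglang:

```python
def build_chatml_prompt(messages):
    # Each message becomes: <|im_start|>{role}\n{content}<|im_end|>\n
    # followed by an open assistant turn for generation.
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, Please Introduce yourself."},
])
print(prompt)
```

The printed prompt matches the `s.text()` output above, so the template itself looks right; the "!" output points to a problem elsewhere (e.g. the model loading path fixed in the referenced PR), not the prompt format.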

@zhyncs
Member

zhyncs commented Jul 18, 2024

ref #630

@DouHappy
Author

Thanks a lot. I will close this issue.
