
Qwen2-1.5B does not work: only outputs many “!” #589

Closed
DouHappy opened this issue Jul 4, 2024 · 4 comments

Comments

@DouHappy

DouHappy commented Jul 4, 2024

My environment:
sglang 0.1.17
torch 2.3.0
CUDA 11.8

My problem:
sglang works well with qwen1.5-4B and qwen1.5-0.5B, but it does not work with qwen2-1.5B: the output is all "!".

My test script:

# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 2>&1 | tee sglang.log
# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 --attention-reduce-in-fp32 2>&1 | tee sglang.log
# neither command works

import sglang as sgl

@sgl.function
def sglang_build_qwen_prompt(s, question):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen(
        name='output',
        temperature=0,
        max_tokens=100,
    ))


sgl.set_default_backend(sgl.RuntimeEndpoint(base_url='http://127.0.0.1:5000'))

output = sglang_build_qwen_prompt.run(
    question="Hello, Please Introduce yourself."
)
print(output.messages())

My script outputs:

[{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Hello, Please Introduce yourself.'}, {'role': 'assistant', 'content': '!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'}]
@merrymercy
Contributor

merrymercy commented Jul 9, 2024

Can you check the prompt template by printing s.text()? There is also a chat template fix PR for QWen (#530)

@DouHappy
Author

Can you check the prompt template by printing s.text()? There is also a chat template fix PR for QWen (#530)

Sorry for the late reply. The template seems normal.

# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 2>&1 | tee sglang.log
# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 --attention-reduce-in-fp32 2>&1 | tee sglang.log
# neither command works

import sglang as sgl

@sgl.function
def sglang_build_qwen_prompt(s, question):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question)
    # s += "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    # s += f"<|im_start|>user\n{question}<|im_end|>\n"
    # s += "<|im_start|>assistant\n"
    s += sgl.assistant(sgl.gen(
        name='output',
        temperature=0,
        max_tokens=40,
    ))
    print(f"s.text():\n{s.text()}")


sgl.set_default_backend(sgl.RuntimeEndpoint(base_url='http://127.0.0.1:5000'))

output = sglang_build_qwen_prompt.run(
    question="Hello, Please Introduce yourself."
)
print(output.messages())

I got this output:

s.text():
SYSTEM:You are a helpful assistant.
USER:Hello, Please Introduce yourself.
ASSISTANT:!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

[{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Hello, Please Introduce yourself.'}, {'role': 'assistant', 'content': '!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'}]

I also tried adding the template manually. Is the template correct?

# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 2>&1 | tee sglang.log
# python -m sglang.launch_server --model-path /data/images/llms/huggingface/hub/models--Qwen--Qwen2-1.5B-Instruct/snapshots/ba1cf1846d7df0a0591d6c00649f57e798519da8 --port 5000 --trust-remote-code --tp-size 1 --attention-reduce-in-fp32 2>&1 | tee sglang.log
# neither command works

import sglang as sgl

@sgl.function
def sglang_build_qwen_prompt(s, question):
    # s += sgl.system("You are a helpful assistant.")
    # s += sgl.user(question)
    s += "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    s += f"<|im_start|>user\n{question}<|im_end|>\n"
    s += "<|im_start|>assistant\n"
    s += sgl.gen(
        name='output',
        temperature=0,
        max_tokens=40,
    )
    print(f"s.text():\n{s.text()}")


sgl.set_default_backend(sgl.RuntimeEndpoint(base_url='http://127.0.0.1:5000'))

output = sglang_build_qwen_prompt.run(
    question="Hello, Please Introduce yourself."
)
# print(output.messages())

Output:

s.text():
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello, Please Introduce yourself.<|im_end|>
<|im_start|>assistant
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
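For reference, the ChatML layout the script above builds by hand can be sketched as a small helper. This is a minimal sketch for comparing against `s.text()`; `build_chatml_prompt` is a hypothetical helper, not part of sglang:

```python
def build_chatml_prompt(messages):
    # Each message becomes: <|im_start|>{role}\n{content}<|im_end|>\n
    # followed by an open assistant turn for generation.
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, Please Introduce yourself."},
])
print(prompt)
```

The printed prompt matches the `s.text()` output above, so the template itself looks right; the "!" output points to a problem elsewhere (e.g. the model loading path fixed in the referenced PR), not the prompt format.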

@zhyncs
Member

zhyncs commented Jul 18, 2024

ref #630

@DouHappy
Author

Thanks a lot. I will close this issue.
