This repository has been archived by the owner on Oct 9, 2024. It is now read-only.

The generated results are different when using greedy search during generation #65

Open
FrostML opened this issue Mar 14, 2023 · 4 comments


@FrostML

FrostML commented Mar 14, 2023

Thank you very much for your work. I ran into a problem when running BLOOM-176B on 8*A100.

I followed the README.md and executed the following command. Specifically, I set do_sample = true and top_k = 1, which I thought was equivalent to greedy search:

python -m inference_server.cli --model_name bigscience/bloom --model_class AutoModelForCausalLM --dtype bf16 --deployment_framework hf_accelerate --generate_kwargs '{"min_length": 100, "max_new_tokens": 100, "do_sample": true, "top_k": 1}'

However, with the same inputs, the generated outputs differed across several runs. This happened only occasionally.

Do you have any clues or ideas about this?

My env info:

CUDA 11.7
nccl 2.14.3

accelerate 0.17.1
Flask 2.2.3
Flask-API 3.0.post1
gunicorn 20.1.0
pydantic 1.10.6
huggingface-hub 0.13.2
@mayank31398
Collaborator

Hi, do_sample = true and top_k = 1 should be fine, but the correct way to do greedy search is to just set do_sample = False.
This is weird. I don't think this is a bug in the code in this repository.
But I will try to give it a shot.
Can you try with just do_sample = False?
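
For reference, a minimal sketch of how the greedy variant of the command could be assembled. It reuses only the flags from the original command; dropping top_k once sampling is disabled is an assumption about what "just do_sample = False" means here, and the script only prints the command rather than launching the server.

```python
import json
import shlex

# Same kwargs as in the original command, but with sampling disabled.
# Removing "top_k" is an assumption: it is only meaningful when sampling.
generate_kwargs = {"min_length": 100, "max_new_tokens": 100, "do_sample": False}

command = [
    "python", "-m", "inference_server.cli",
    "--model_name", "bigscience/bloom",
    "--model_class", "AutoModelForCausalLM",
    "--dtype", "bf16",
    "--deployment_framework", "hf_accelerate",
    # json.dumps turns Python's False into JSON's lowercase false.
    "--generate_kwargs", json.dumps(generate_kwargs),
]

# shlex.join quotes the JSON argument so it can be pasted into a shell.
print(shlex.join(command))
```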

@FrostML
Author

FrostML commented Mar 20, 2023

Hi @mayank31398 Sorry for the late reply.
It was OK with do_sample=False. The results were all the same.
But I still can't figure out why sampling doesn't work as expected. Do you know who or which repo I can turn to for help?

@richarddwang

Refer to https://huggingface.co/blog/how-to-generate. Sampling is designed to incorporate randomness into picking the next word.

@FrostML
Author

FrostML commented Mar 22, 2023

But k is 1, so there shouldn't be any randomness. @richarddwang
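
One possible explanation, not confirmed anywhere in this thread, is ties: if the top-k filter is threshold-based (it keeps every logit greater than or equal to the k-th largest value), then two tokens with exactly equal logits both survive k = 1, and sampling picks between them at random. Exact ties are more plausible in a low-precision dtype such as bf16. The sketch below is a toy, pure-Python illustration of that assumption; `top_k_filter`, `sample`, and `greedy` are hypothetical helpers, not functions from this repository or from any library.

```python
import math
import random


def top_k_filter(logits, k):
    """Mask every logit strictly below the k-th largest value.

    Assumption: the filter is threshold-based, so ties at the threshold survive.
    """
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]


def sample(logits, rng):
    """Draw one index from softmax(logits); masked entries get weight 0."""
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def greedy(logits):
    """Return the index of the largest logit (first one on ties)."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [1.5, 3.25, 3.25, 0.0]  # made-up values; indices 1 and 2 tie
filtered = top_k_filter(logits, k=1)

rng = random.Random(0)
picks = {sample(filtered, rng) for _ in range(200)}

print("kept after top_k=1:", [i for i, x in enumerate(filtered) if x > -math.inf])
print("indices seen when sampling:", sorted(picks))
print("greedy pick:", greedy(logits))
```

In this toy case greedy always returns the same index, while sampling with k = 1 alternates between the two tied tokens. That would be consistent with the outputs differing only occasionally, but it remains a hypothesis.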
