From cf00e89fa97e961d21dda451a5b6a4d5f36ddc83 Mon Sep 17 00:00:00 2001
From: youkaichao
Date: Sun, 1 Dec 2024 00:04:01 -0800
Subject: [PATCH 1/2] polish doc

Signed-off-by: youkaichao
---
 docs/source/models/supported_models.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index f571b8bf6735e..1f2dc3d42e873 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -701,6 +701,9 @@ At vLLM, we are committed to facilitating the integration and support of third-p
 
 2. **Best-Effort Consistency**: While we aim to maintain a level of consistency between the models implemented in vLLM and other frameworks like transformers, complete alignment is not always feasible. Factors like acceleration techniques and the use of low-precision computations can introduce discrepancies. Our commitment is to ensure that the implemented models are functional and produce sensible results.
 
+.. tip::
+    When you compare the output of :code:`model.generate` from HuggingFace transformers with the output of :code:`llm.generate` from vLLM, please note that the former will `read the model's generation config file (i.e., generation_config.json) `__ and apply the default parameters for the generation, while the latter will only use the parameters passed to the function. Please make sure all the sampling parameters are the same when comparing the outputs.
+
 3. **Issue Resolution and Model Updates**: Users are encouraged to report any bugs or issues they encounter with third-party models. Proposed fixes should be submitted via PRs, with a clear explanation of the problem and the rationale behind the proposed solution. If a fix for one model impacts another, we rely on the community to highlight and address these cross-model dependencies. Note: for bugfix PRs, it is good etiquette to inform the original author to seek their feedback.
 
 4. **Monitoring and Updates**: Users interested in specific models should monitor the commit history for those models (e.g., by tracking changes in the main/vllm/model_executor/models directory). This proactive approach helps users stay informed about updates and changes that may affect the models they use.

From 94970a609b459b1e64f7f8fadbf72e55bcf55c6c Mon Sep 17 00:00:00 2001
From: youkaichao
Date: Sun, 1 Dec 2024 00:24:10 -0800
Subject: [PATCH 2/2] update link

Signed-off-by: youkaichao
---
 docs/source/models/supported_models.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index 1f2dc3d42e873..9f3b6f59068e2 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -702,7 +702,7 @@ At vLLM, we are committed to facilitating the integration and support of third-p
 2. **Best-Effort Consistency**: While we aim to maintain a level of consistency between the models implemented in vLLM and other frameworks like transformers, complete alignment is not always feasible. Factors like acceleration techniques and the use of low-precision computations can introduce discrepancies. Our commitment is to ensure that the implemented models are functional and produce sensible results.
 
 .. tip::
-    When you compare the output of :code:`model.generate` from HuggingFace transformers with the output of :code:`llm.generate` from vLLM, please note that the former will `read the model's generation config file (i.e., generation_config.json) `__ and apply the default parameters for the generation, while the latter will only use the parameters passed to the function. Please make sure all the sampling parameters are the same when comparing the outputs.
+    When comparing the output of :code:`model.generate` from HuggingFace Transformers with the output of :code:`llm.generate` from vLLM, note that the former reads the model's generation config file (i.e., `generation_config.json `__) and applies the default parameters for generation, while the latter only uses the parameters passed to the function. Ensure all sampling parameters are identical when comparing outputs.
 
 3. **Issue Resolution and Model Updates**: Users are encouraged to report any bugs or issues they encounter with third-party models. Proposed fixes should be submitted via PRs, with a clear explanation of the problem and the rationale behind the proposed solution. If a fix for one model impacts another, we rely on the community to highlight and address these cross-model dependencies. Note: for bugfix PRs, it is good etiquette to inform the original author to seek their feedback.
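
Reviewer note on the tip above: one way to keep the sampling parameters identical is to derive the vLLM arguments from the same ``generation_config.json`` that Transformers reads. The following is a minimal sketch, not part of this patch. The helper names ``load_generation_config`` and ``sampling_kwargs_from_generation_config`` and the ``HF_TO_VLLM`` table are hypothetical, and the table covers only a handful of common keys. It assumes that a missing or false ``do_sample`` means greedy decoding on the Transformers side, which corresponds to ``temperature=0`` in vLLM, and it ignores beam search.

```python
import json
from pathlib import Path

# Hypothetical, partial mapping from generation_config.json keys to the
# keyword names used by vLLM's SamplingParams. Extend as needed.
HF_TO_VLLM = {
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
    "repetition_penalty": "repetition_penalty",
    "max_new_tokens": "max_tokens",
}


def load_generation_config(model_dir):
    """Read generation_config.json from a local model directory."""
    path = Path(model_dir) / "generation_config.json"
    return json.loads(path.read_text())


def sampling_kwargs_from_generation_config(config):
    """Translate a generation config dict into SamplingParams keyword arguments."""
    kwargs = {
        vllm_name: config[hf_name]
        for hf_name, vllm_name in HF_TO_VLLM.items()
        if config.get(hf_name) is not None
    }
    if not config.get("do_sample", False):
        # Assumption: without do_sample, Transformers decodes greedily and
        # ignores temperature/top_p/top_k, so request greedy decoding in vLLM.
        kwargs["temperature"] = 0.0
        kwargs.pop("top_p", None)
        kwargs.pop("top_k", None)
    return kwargs
```

The resulting dictionary could then be passed as ``SamplingParams(**kwargs)`` to :code:`llm.generate`, so both frameworks are driven by the same values rather than by their respective defaults.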