When I run the model with vLLM, the results differ slightly from what we get with HuggingFace batched inference on the same data. Could this be because the softmax is computed differently, given that in vLLM we don't have access to all the logits?
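For context on why I suspect this: renormalizing a softmax over only a subset of the logits (e.g. a top-k slice) does yield different probabilities than a softmax over the full set. A minimal NumPy sketch of that effect (illustrative only, not a claim about vLLM's actual internals):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

# hypothetical logits for a tiny 5-token vocabulary
logits = np.array([3.0, 2.0, 1.0, 0.5, -1.0])

full = softmax(logits)       # normalized over the full vocabulary
topk = softmax(logits[:2])   # normalized over only the top-2 logits

# the same token's probability changes depending on what we normalize over
print(full[0])  # smaller: mass is shared with all 5 tokens
print(topk[0])  # larger: mass is shared with only 2 tokens
```

So if one side computes probabilities over all logits and the other renormalizes over a truncated set, the numbers would not match even for identical raw logits.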