
Setup at least one e2e test #77

Closed
Tracked by #73
ahg-g opened this issue Dec 7, 2024 · 14 comments · Fixed by #217

@ahg-g
Contributor

ahg-g commented Dec 7, 2024

To verify the end-to-end flow and CRD setup.

@ahg-g
Contributor Author

ahg-g commented Dec 16, 2024

/assign @liu-cong

@danehans
Contributor

danehans commented Jan 6, 2025

@liu-cong any status updates?

@liu-cong
Contributor

liu-cong commented Jan 6, 2025

/assign @kaushikmitr

Example: https://github.com/kubernetes-sigs/jobset/blob/main/test/e2e/e2e_test.go
The ext proc image: us-central1-docker.pkg.dev/k8s-staging-images/llm-instance-gateway/epp:main

@k8s-ci-robot
Contributor

@liu-cong: GitHub didn't allow me to assign the following users: kaushikmitr.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @kaushikmitr

Example: https://github.com/kubernetes-sigs/jobset/blob/main/test/e2e/e2e_test.go
The ext proc image: us-central1-docker.pkg.dev/k8s-staging-images/llm-instance-gateway/epp:main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@liu-cong liu-cong removed their assignment Jan 6, 2025
@liu-cong
Contributor

liu-cong commented Jan 6, 2025

@liu-cong any status updates?

@kaushikmitr is picking this up

@danehans
Contributor

danehans commented Jan 7, 2025

/assign @danehans
/unassign @kaushikmitr

@danehans
Contributor

A Hugging Face Hub token is required to pull the model used for e2e testing. How should this be handled when running e2e in CI, e.g. by creating a Hugging Face account dedicated to CI?

@ahg-g
Contributor Author

ahg-g commented Jan 15, 2025

If we can create one for CI, that would be great. Perhaps we can ask on the sig-testing channel how to handle such accounts?

@danehans
Contributor

xref kubernetes/k8s.io#7698 for creating an HF account.

@danehans
Contributor

@ahg-g @liu-cong regarding this issue, Mistral will not work without training Mistral-compatible LoRA weights. See this gist for details.

@ahg-g
Contributor Author

ahg-g commented Jan 27, 2025

Can we use a dummy LoRA? Check this in vllm: https://github.com/vllm-project/vllm/blob/28e0750847ded93158a66efdcbc869d87463b38f/vllm/lora/lora.py#L75

However, I am not sure whether there is a way to configure vllm to actually set one up.

@liu-cong
Contributor

Mistral will not work without training Mistral-compatible LoRA weights

I think we just need to find the right adapter compatible with Mistral, right? BTW, you can find compatible adapters on the HF page, like this

[screenshot: Hugging Face model page listing compatible LoRA adapters]

Also, the Qwen model doesn't require an HF login, and it has adapters as well: https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/discussions

@danehans
Contributor

I think we just need to find the right adapter compatible with Mistral, right?

The model architecture also needs to be supported by vLLM. Qwen appears to be supported, but I have not verified; have you?

@danehans danehans mentioned this issue Jan 30, 2025
@liu-cong
Contributor

The model architecture also needs to be supported by vLLM. Qwen appears to be supported, but I have not verified; have you?

I didn't try it, but Qwen appears in the vLLM getting-started guide, so I'm fairly sure it's supported.

@danehans danehans added this to the v0.1.0-rc.1 milestone Jan 30, 2025