Preparing LLM Benchmark Table (LangTest) #946
Comments
@ArshaanNazir please add yourself as an assignee to the task.
@ArshaanNazir any updates?
We are working on it. Here is the link to the tracking sheet: https://johnsnowlabs-my.sharepoint.com/:x:/p/rakshit/ETX1Z44PipFOqm8Ue8Av3_UBycHH_9oK-oJJUpQfc_n54w?e=exe0Ja
@ArshaanNazir did we publish any benchmark (LLM and embeddings) on the LangTest website?
We have created the streamlit apps for both of the benchmark tables. We are finalising their design and will update them on the website by the end of this week.
@ArshaanNazir @vkocaman We have created a new folder for the langtest demos: https://github.com/JohnSnowLabs/streamlit-demo-apps/tree/master/langtest Do you need anything else?
I am not sure if we are going ahead with the streamlit apps now. @dcecchini can you confirm?
Hi @Cabir40 @ArshaanNazir @muhammetsnts @JustHeroo, we started creating the streamlit apps for the leaderboards, but @vkocaman suggested asking the design team to build them with web tools that look better. They are preparing them; you can check a draft at this link. In the meantime, we are reviewing the information the pages should contain, as we need to make sure that the leaderboards show all the relevant information (adding more filters, improving the visualization, creating more data with benchmark results, etc.).
I understand why you would want a more attractive web app. I was hoping for a streamlit app -- simply because I am looking for an LLM leaderboard in a box that I could deploy to enterprise clients. |
No description provided.