API-based evaluation support (humanevalpack_openai.py is too old) #234

s-natsubori · 2024-05-10T05:32:11Z

I am trying to evaluate an API-based model.
(A local model running on the vLLM engine, which provides an OpenAI-compatible API.)

However, tasks/humanevalpack_openai.py has not been kept up to date.
(postprocess_generation is not applied, the save format is incompatible, etc.)
So I can't pass the generation results to the evaluation step.
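
For reference, here is a rough sketch of the kind of flow I have in mind, not code from this repository. It assumes the server exposes the standard OpenAI-style `/v1/completions` endpoint (vLLM does), and that the evaluation step reads a JSON list of lists of strings (one inner list per problem, one string per sample); that file layout is my assumption and should be checked against the harness. The names `complete`, `postprocess`, `generate_all`, `save_generations`, the URL, the model name, and the stop words are all hypothetical placeholders.

```python
import json
import urllib.request


def complete(prompt, base_url="http://localhost:8000/v1", model="my-model", max_tokens=512):
    """Send one prompt to an OpenAI-compatible /completions endpoint and return the text.

    base_url and model are placeholders; adjust them to the running server.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens, "temperature": 0.2}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


def postprocess(prompt, completion, stop_words=("\ndef ", "\nclass ", "\nif __name__", "\nprint(")):
    """Hypothetical stand-in for a task's postprocess_generation.

    Cuts the completion at the earliest stop word and prepends the prompt.
    """
    cut = len(completion)
    for stop in stop_words:
        idx = completion.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return prompt + completion[:cut]


def generate_all(prompts, n_samples=1, complete_fn=complete):
    """Return a list of lists: n_samples postprocessed generations per prompt."""
    return [[postprocess(p, complete_fn(p)) for _ in range(n_samples)] for p in prompts]


def save_generations(generations, path):
    """Write generations as JSON (assumed layout: list of lists of strings)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(generations, f)
```

With something like this, `save_generations(generate_all(prompts), "generations.json")` would produce a file that could then be handed to an evaluation-only run, provided the layout assumption above holds.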

Do you have any plans to update it in the future?

If API-based evaluation were supported for all benchmarks, more models could be evaluated easily.
