Prerequisites
I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
llama-server already accepts API calls with logprobs=1, but it would be very useful to also support echo=True, as was available for older OpenAI completion models such as davinci-002.
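For illustration, here is a minimal sketch of what such a request might look like against llama-server's OpenAI-compatible completions endpoint. The echo parameter is the proposed addition and is not honored by the server today; the host, port, and prompt are placeholders:

```python
import requests

# Sketch of the proposed request, assuming a llama-server instance
# listening on localhost:8080 with its OpenAI-compatible API enabled.
# "echo" is the parameter this issue asks for; currently llama-server
# ignores it, so this only illustrates the desired behavior.
resp = requests.post(
    "http://localhost:8080/v1/completions",
    json={
        "prompt": "The quick brown fox",
        "max_tokens": 8,
        "logprobs": 1,
        "echo": True,  # proposed: also return logprobs for the prompt tokens
    },
)
print(resp.json())
```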
Motivation
This would allow for a number of interesting possibilities, such as inferring the likelihood of a prompt given a completion, as done in this project (see the sketch after this section).
OpenAI deprecated the echo option because it's too useful :) It would be great to have it back in llama.cpp.
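To make the use case concrete, a hedged sketch of how echoed logprobs could be turned into a prompt log-likelihood, assuming a legacy-OpenAI-style response where, with echo=True, choices[0].logprobs.token_logprobs begins with the prompt tokens and the first entry is null (no context to condition on):

```python
def prompt_log_likelihood(response: dict, n_prompt_tokens: int) -> float:
    """Sum per-token logprobs over the echoed prompt tokens.

    Assumes a davinci-002-style completions response: with echo=True,
    choices[0].logprobs.token_logprobs starts with the prompt tokens,
    and the very first entry is None because it has no preceding context.
    """
    token_logprobs = response["choices"][0]["logprobs"]["token_logprobs"]
    prompt_lps = token_logprobs[:n_prompt_tokens]
    return sum(lp for lp in prompt_lps if lp is not None)
```

Comparing this quantity across candidate prompts for a fixed completion is one way to score P(prompt | completion)-style hypotheses.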
Possible Implementation
No response
This would be similar to supporting --all-logits from llama-perplexity, right? It would be very useful in the server, allowing us to use the server for benchmarking as well.