WIP draft of generate() outside of chat.answer() #432
base: main
Conversation
…e to ingest messages, if appropriate.
This PR needs to coordinate with #438: wherever the final pattern lands there, I need to integrate the ingest points into `generate()` and `answer()` (`_stream()`). This is also modulo how we expect to pass the prompt string or messages through the pre-process stage. I'll make this a point of conversation in our meeting on Monday.

@pmeier - I'd already built out most of the assistant examples, so I just did them all in this pass, but we don't have to discuss them all in the first round; let's just pick `_anthropic` and `_openai` and see specifically how those are looking. In the interest of helping this merge quickly, I'll also pull exl2 out into another PR that is just about integrating it as an assistant on its own.

@pmeier - it seems like at this point it makes more sense to rebase the changes onto the latest corpus-dev, unless that's causing trouble somewhere. If that works, I'll make that PR and then close this one, pointing to the new version.
@dillonroach I pushed 8e120c6. The two main changes I made are

- Fix the return types of `.generate()` for all assistants. It now returns an async iterator of the full data the API returns. Since this function is aimed as a general building block for users, we shouldn't assume they only want the text. However, we can make this assumption for `.answer()`. Meaning, this function calls `.generate()` and extracts the text.
- The `_render_prompt` and corresponding functions treated incoming messages as dictionaries. `ragna.core.Message` is a proper object and thus key access will not work. Furthermore, `message.role` is the `ragna.core.MessageRole` enum and cannot be used directly. Instead, you need to use `message.role.value` if you want to have the string.

With my commit, CI should be green. Note that I base my review on the latest commit, i.e. including my cleanup. Thus, please make sure to review what I have pushed, because the comments otherwise might make no sense.
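For illustration, here is a minimal, self-contained sketch of the pattern described in the two points above. The `Message`/`MessageRole` stand-ins, `ExampleAssistant`, and the payload shape are hypothetical simplifications, not the actual `ragna.core` classes or API responses:

```python
# Hypothetical sketch: .generate() yields the full payload, .answer() extracts text,
# and message.role.value is used to get the role string from the enum.
import asyncio
import enum
from dataclasses import dataclass
from typing import Any, AsyncIterator


class MessageRole(enum.Enum):
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"


@dataclass
class Message:
    content: str
    role: MessageRole


class ExampleAssistant:
    async def generate(self, messages: list[Message]) -> AsyncIterator[dict[str, Any]]:
        # Yield the full data the (fake) API returns, not just the text,
        # so users building on .generate() can access any field they need.
        for message in messages:
            # Key access like message["role"] would fail: Message is a proper object,
            # and message.role is an enum, so .value is needed for the string.
            yield {"role": message.role.value, "delta": {"text": message.content.upper()}}

    async def answer(self, messages: list[Message]) -> AsyncIterator[str]:
        # .answer() can assume the caller only wants the text: it calls
        # .generate() and extracts the text from each chunk.
        async for chunk in self.generate(messages):
            yield chunk["delta"]["text"]


async def main() -> None:
    messages = [Message(content="Hello", role=MessageRole.USER)]
    async for text in ExampleAssistant().answer(messages):
        print(text)


asyncio.run(main())
```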
"preamble_override": self._make_preamble(), | ||
"message": prompt, | ||
"preamble_override": system_prompt, | ||
"message": self._render_prompt(prompt), |
While the message indeed can only be a single string here, the endpoint has a `chat_history` parameter, and that one takes the previous messages, similar to all other assistants.
I would let `_render_prompt` return a tuple of the chat history and the current user message, e.g. `chat_history, message = self._render_prompt(prompt)`.
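A minimal sketch of what that could look like, assuming the prompt arrives as a list of message objects; `Message`/`MessageRole` are hypothetical stand-ins, and the exact role and field names the endpoint expects may differ:

```python
# Hypothetical sketch: split the incoming messages into the chat history and
# the current user message, returned as a tuple.
import enum
from dataclasses import dataclass


class MessageRole(enum.Enum):
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"


@dataclass
class Message:
    content: str
    role: MessageRole


def _render_prompt(messages: list[Message]) -> tuple[list[dict[str, str]], str]:
    # Everything except the last user message becomes the chat history;
    # the last user message is sent separately as the "message" parameter.
    *history, current = [m for m in messages if m.role is not MessageRole.SYSTEM]
    chat_history = [{"role": m.role.value, "message": m.content} for m in history]
    return chat_history, current.content


messages = [
    Message("You are helpful.", MessageRole.SYSTEM),
    Message("Hi!", MessageRole.USER),
    Message("Hello, how can I help?", MessageRole.ASSISTANT),
    Message("What is Ragna?", MessageRole.USER),
]
chat_history, message = _render_prompt(messages)
print(chat_history)
print(message)
```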
Conflicted on this one - it seems like this would fit specifically with the pre-process pass; otherwise it puts this one specific assistant ahead of the others in terms of capabilities. It certainly doesn't hurt anything, so I'm happy to do it here, but we might also want to see what a pre-process stage looks like for all assistants and implement it in one go.
Takes the format from `chat.answer` and duplicates it, without sources, as `chat.generate`, which then gets implemented in each `Assistant` as well. I got a bit turned around with the async aspect in the `stream=False` case, so there is room for cleanup. The exl2 assistant is currently instantiating a number of variables it should likely be taking from `Chat` instead (e.g. `stream: bool`).
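A minimal sketch of one way the `stream=False` case could reuse the same async iterator that the streaming path consumes; the names below are hypothetical stand-ins, not the actual `Chat`/`Assistant` API:

```python
# Hypothetical sketch: stream=True hands the async iterator to the caller,
# stream=False drains it and returns the joined text.
import asyncio
from typing import AsyncIterator


async def generate(prompt: str) -> AsyncIterator[str]:
    # Stand-in for an assistant call that yields chunks as they arrive.
    for chunk in (prompt, " ...", " done"):
        yield chunk


async def chat_generate(prompt: str, stream: bool = False):
    chunks = generate(prompt)
    if stream:
        # Streaming: return the async iterator directly.
        return chunks
    # Non-streaming: drain the iterator and return the full text.
    return "".join([chunk async for chunk in chunks])


print(asyncio.run(chat_generate("hello", stream=False)))
```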