Open-source LLM from HuggingFace Agent Support for json output parsing #130
base: main
Conversation
I am very sorry for the delayed review, and thank you for making this improvement.
I did a pretty bad job on this one :)
Once the comment is addressed, I would be more than happy to approve it.
    return response

    @staticmethod
    def _fix_message(clean_text: str) -> UserMessage:
Can you make this a helper function inside of _attempt_fix_and_generate?
@@ -55,30 +54,94 @@ def format_output(self) -> Dict[str, Any]:

    class AgentOutputParser(BaseModel):
I think there is a way to simplify the logic inside this function: first try a JSON load; if it runs into an exception, fix and generate, format the result into a message for load_json_output, decrease the retry count, and call load_json_output again; if there is no exception, just return.
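The control flow suggested above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: `fix_and_generate` stands in for the PR's fix-and-generate step, and the function signature is an assumption.

```python
import json
from typing import Any, Callable, Dict


def load_json_output(
    text: str,
    fix_and_generate: Callable[[str], str],
    max_retry: int = 2,
) -> Dict[str, Any]:
    try:
        # First, try to parse the model output directly.
        return json.loads(text)
    except json.JSONDecodeError:
        if max_retry <= 0:
            # Retries exhausted: the model cannot produce parseable JSON.
            raise ValueError("Model failed to produce JSON-parseable output")
        # Ask the model to repair its own output, then retry with one
        # fewer attempt remaining.
        fixed = fix_and_generate(text)
        return load_json_output(fixed, fix_and_generate, max_retry - 1)
```

On success the recursion returns immediately; on failure it only recurses while retries remain, so the depth is bounded by `max_retry`.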
No worries at all! I will work on your comments, thank you so much.
Hi,

I added support for HuggingFace text-generation models, which was failing because of the use of ChatOpenAI (an OpenAI API key error occurs) when ConversationalAgent applies JSON output parsing through ConvoJSONOutputParser, which uses load_json_output. Instead of calling ChatOpenAI in the except branch of load_json_output, I now pass in whatever LLM is given, either ChatOpenAI or HuggingFaceTextGenerationModel; passing the provided llm through makes it more general.

I also added max_retry to set the number of retries the recursive function makes to get output that is JSON parseable. _fix_message needs to be improved to get better responses from models each time it is called; I added it as a TODO along with a comment there.

After the max_retry count is reached, if we still couldn't get a JSON-parseable response, I raise a ValueError to inform the user that the selected open-source model is not good enough, or is not able to generate JSON-parseable output.

I checked this modified code with ChatOpenAI and it works with no issues. However, there is still a lot to do to improve and set boundaries for open-source model use. This may mean one needs to specify and allow only certain open-source models. Also, some model optimizations could be applied.