Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Gradio]Base agent gradio for multimodal #146

Merged
merged 2 commits into from
Dec 13, 2023

Conversation

Southpika
Copy link
Contributor

1.添加gradio的多模态输入输出
2.添加文件整体一览
#TODO
输出暂时只支持image

Copy link
Collaborator

@sijunhe sijunhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bobholamovic 对于gradio比较熟悉,麻烦review下内容。
gradio的代码后续应该会更多,更复杂,但是gradio其实并没agent class的核心功能。结构上,我建议把所有gradio相关的代码从Agent这个class里面剥离出去,专门写成一个叫GradioMixin的类。然后Agent的继承关系写成 Agent(BaseAgent, GradioMixin)

Copy link
Collaborator

@sijunhe sijunhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

先合入一波

@sijunhe sijunhe merged commit a05b2ec into PaddlePaddle:develop Dec 13, 2023
@Bobholamovic
Copy link
Member

@Bobholamovic 对于gradio比较熟悉,麻烦review下内容。 gradio的代码后续应该会更多,更复杂,但是gradio其实并没agent class的核心功能。结构上,我建议把所有gradio相关的代码从Agent这个class里面剥离出去,专门写成一个叫GradioMixin的类。然后Agent的继承关系写成 Agent(BaseAgent, GradioMixin)

同意,应该剥离出去



def get_file_type(file_name: str) -> str:
if "." not in file_name:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

文件后缀相关操作建议使用os.pathpathlib

@@ -107,6 +109,7 @@ def launch_gradio_demo(self, **launch_kwargs: Any):
) from None

raw_messages = []
self.use_file: List[File] = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use_file容易让人以为是开关或者函数,建议用files或者 files_to_use。此外,这个变量看起来似乎并不需要设置为Agent的属性?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

主要为当次上传的文件,之后改为私有属性,使用为Agent的属性主要考虑为保证其全局性,上传完的文件传入模型之后之后不会再次传入

)

with gr.Accordion("Files", open=False):
file_lis = self._file_manager._file_registry.list_files()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

访问公共属性file_registry而不是内部属性

history = history + [((single_file.name,), None)]
size = len(file)

output_lis = self._file_manager._file_registry.list_files()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

避免使用其它对象的下划线开头的内部属性


if (
response.files
and response.files[-1].type == "output"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里考虑了简化的情形:最后一个被工具使用的输出文件。可以检查其它复杂一些、我们现在暂时不处理的情形,例如工具存在多个输出文件等,如果发现可能被隐式忽略的情况,可通过gr.Warning给出警告。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

因为主要考虑我们的对话也只提取了此run的第一个和最后一个对话,只提取最后一轮对话中的文件主要考虑和展示界面的一致性,是否需要将中间文件进行提醒呢,这种情况我理解应该希望用户通过下面的files进行查看?

if output_file_id in response.text:
output_result = response.text
output_result = output_result.replace(
output_file_id, IMAGE_HTML.format(BASE64_ENCODED=base64_encoded)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里好像硬编码成图片了,是不是至少判断/推断下文件类型,如果不是图片的话给出警告并做其它处理?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个地方可能传错版本了 这里会使用引入在common的函数判断是否为图片

skyan pushed a commit to skyan/ERNIE-Bot-SDK that referenced this pull request Dec 26, 2023
* gradio for multimodal

* fix small bugs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants