
[Bug]: Async Initiate chat autogen workflow interrupts more for human input feedback vs. Sync initiate chat agents conversation flow. #3575

Open
b4codify opened this issue Sep 26, 2024 · 1 comment
Labels
0.2 Issues which were filed before re-arch to 0.4

Comments

@b4codify

Describe the bug

The async a_initiate_chat agent conversation interrupts for human feedback more often than the sync initiate_chat workflow: with the same codebase, the sync flow asks for human feedback only once.
Moreover, in the async flow the generated code is not executed after the first empty (no-feedback) response at the console prompt; the prompt is shown again, and only after the user skips a second time is the generated code executed. Refer to the attached console log and video recording.

This behavior of asking for human feedback multiple times becomes very chaotic when more agents are involved, and it completely breaks the application logic, especially because the async a_initiate_chat() workflow repeats the same feedback prompt.

Here is the output of both the Sync and Async programs for reference:
SYNC output:

--------------------------------------------------------------------------------
coder (to user_proxy):

To get the date today, we can use the datetime module in Python. Here's a simple script:

'''python
# date_script.py
import datetime

def get_date():
    today = datetime.date.today()
    return today

print(get_date())
'''

Please execute this script to get the date today.

--------------------------------------------------------------------------------
Replying as user_proxy. Provide feedback to coder. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: 

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK (inferred language is python)...
user_proxy (to coder):

exitcode: 0 (execution succeeded)
Code output: 2024-09-26


--------------------------------------------------------------------------------
coder (to user_proxy):

TERMINATE

ASYNC output:

--------------------------------------------------------------------------------
coder (to user_proxy):

To get the date today, we can use the datetime module in Python. Here's a simple script:

'''python
# date_script.py
import datetime

def get_date():
    today = datetime.date.today()
    return today

print(get_date())
'''

Please execute this script to get the date today.

--------------------------------------------------------------------------------
Replying as user_proxy. Provide feedback to coder. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: 

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...
Replying as user_proxy. Provide feedback to coder. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: 

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK (inferred language is python)...
user_proxy (to coder):

exitcode: 0 (execution succeeded)
Code output: 2024-09-26


--------------------------------------------------------------------------------

Steps to reproduce

Refer to the Sync and Async code below.
SYNC Code

import os
from autogen import ConversableAgent
from autogen.cache import Cache
from autogen.coding import DockerCommandLineCodeExecutor

from dotenv import load_dotenv
from model_config import ModelConfig

load_dotenv()
llm_env_config = ModelConfig.from_env()
os.makedirs("agent_code", exist_ok=True)

# Litellm config
litellm_config_list = [
    {
        "model": llm_env_config.model_name,
        "api_key": llm_env_config.api_key,
        "base_url": llm_env_config.api_url,
        "temperature": llm_env_config.temperature,
        "cache_seed": None,
        "price": [0, 0],
    },
]

config_list = {
    "config_list": litellm_config_list,
    "temperature": llm_env_config.temperature,
    "cache_seed": None,
}


def initialize_agents():

    code_cmd_executor = DockerCommandLineCodeExecutor(
        image="agent-runtime-1:latest",
        work_dir="agent_code",
        timeout=60,
        container_name="autogen_container1",
        auto_remove=False,
        stop_container=False,
    )

    coder = ConversableAgent(
        name="coder",
        llm_config=config_list,
        max_consecutive_auto_reply=3,
        system_message="You are an expert in python."
        "You suggest a feasible plan for finishing a complex task by decomposing it into 3-5 sub-tasks. "
        "Make sure that your suggested plan can only be implemented in code or bash commands."
        "You solve tasks using your coding skills."
        "You write python code (in a python coding block) for the user to execute."
        "You can use the available tools call available to you to complete a task."
        "Do not ask 'user_proxy' to carry out or modify any task."
        "Do not assume anything else and generate code only based on the given information."
        "Make sure to install the libraries and dependencies you need using pip and bash commands before using them in the code."
        "Solve the tasks step by step if you need to."
        "When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest."
        "The user can't modify your code. So do not suggest incomplete code which requires users to modify."
        "Don't use a code block if it's not intended to be executed by the user."
        "Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user."
        "Finally, inspect and review the feedback or execution results from user_proxy."
        "If the execution is wrong, analyze the error and suggest a fix."
        "If the plan is not good, suggest a better plan. "
        "If code got executed successfully then work on the next pending sub-tasks."
        "When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible."
        "IMPORTANT: "
        "DO NOT invent tools call if not registered with agent."
        "Do not show appreciation in your responses, say only what is necessary."
        "'Thank you' or 'You're welcome' are said in the conversation, then say TERMINATE to indicate the conversation is finished and this is your last message."
        "Wait for the user to execute your code and then you can reply with the word 'TERMINATE' if execution completes the pending task"
        "Return 'TERMINATE' when all the tasks are completed or if nothing else to be done."
        "DO NOT REPEAT YOURSELF."
        "DO NOT hallucinate.",
        description="Write the python code or bash commands to be executed by the user_proxy to complete a given task.",
        code_execution_config=False,
        human_input_mode="NEVER",
    )

    user_proxy = ConversableAgent(
        name="user_proxy",
        description="Execute the code or bash commands provided by the coder and reports the results back to coder.",
        human_input_mode="ALWAYS",
        max_consecutive_auto_reply=3,
        is_termination_msg=lambda msg: msg.get("content") is not None
        and "TERMINATE" in msg["content"],
        code_execution_config={
            "last_n_messages": "auto",
            "executor": code_cmd_executor,
        },
    )

    task = "What's the date today?"

    # Use Cache.disk to cache LLM responses. Change cache_seed for different responses.
    with Cache.disk(cache_seed=40) as cache:
        chat_results = user_proxy.initiate_chat(
            recipient=coder,
            message=task,
            cache=cache,
            summary_method="last_msg",
        )
        # return the chat summary
        return chat_results.summary


def cleanup_docker_container():
    import docker

    client = docker.from_env()
    try:
        container = client.containers.get("autogen_container1")
        container.stop()
        container.remove()
        print("\n###### Docker container cleaned up successfully.")
    except docker.errors.NotFound:
        print("\n###### No container to clean up.")
    except Exception as e:
        print(f"\n****** Error cleaning up container: {str(e)}")


def main():
    res = initialize_agents()
    if res:
        print(f"\n Agents response: {res}")
        print("\n\n####### Agents workflow completed ########\n\n")

    # Add cleanup code
    cleanup_docker_container()


if __name__ == "__main__":
    main()

ASYNC Code

import asyncio
import os
from autogen import ConversableAgent
from autogen.cache import Cache
from autogen.coding import DockerCommandLineCodeExecutor

from dotenv import load_dotenv
from model_config import ModelConfig

load_dotenv()
llm_env_config = ModelConfig.from_env()
os.makedirs("agent_code", exist_ok=True)

# Litellm config
litellm_config_list = [
    {
        "model": llm_env_config.model_name,
        "api_key": llm_env_config.api_key,
        "base_url": llm_env_config.api_url,
        "temperature": llm_env_config.temperature,
        "cache_seed": None,
        "price": [0, 0],
    },
]

config_list = {
    "config_list": litellm_config_list,
    "temperature": llm_env_config.temperature,
    "cache_seed": None,
}


async def initialize_agents():

    code_cmd_executor = DockerCommandLineCodeExecutor(
        image="agent-runtime-1:latest",
        work_dir="agent_code",
        timeout=60,
        container_name="autogen_container2",
        auto_remove=False,
        stop_container=False,
    )

    coder = ConversableAgent(
        name="coder",
        llm_config=config_list,
        max_consecutive_auto_reply=3,
        system_message="You are an expert in python."
        "You suggest a feasible plan for finishing a complex task by decomposing it into 3-5 sub-tasks. "
        "Make sure that your suggested plan can only be implemented in code or bash commands."
        "You solve tasks using your coding skills."
        "You write python code (in a python coding block) for the user to execute."
        "You can use the available tools call available to you to complete a task."
        "Do not ask 'user_proxy' to carry out or modify any task."
        "Do not assume anything else and generate code only based on the given information."
        "Make sure to install the libraries and dependencies you need using pip and bash commands before using them in the code."
        "Solve the tasks step by step if you need to."
        "When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest."
        "The user can't modify your code. So do not suggest incomplete code which requires users to modify."
        "Don't use a code block if it's not intended to be executed by the user."
        "Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user."
        "Finally, inspect and review the feedback or execution results from user_proxy."
        "If the execution is wrong, analyze the error and suggest a fix."
        "If the plan is not good, suggest a better plan. "
        "If code got executed successfully then work on the next pending sub-tasks."
        "When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible."
        "IMPORTANT: "
        "DO NOT invent tools call if not registered with agent."
        "Do not show appreciation in your responses, say only what is necessary."
        "'Thank you' or 'You're welcome' are said in the conversation, then say TERMINATE to indicate the conversation is finished and this is your last message."
        "Wait for the user to execute your code and then you can reply with the word 'TERMINATE' if execution completes the pending task"
        "Return 'TERMINATE' when all the tasks are completed or if nothing else to be done."
        "DO NOT REPEAT YOURSELF."
        "DO NOT hallucinate.",
        description="Write the python code or bash commands to be executed by the user_proxy to complete a given task.",
        code_execution_config=False,
        human_input_mode="NEVER",
    )

    user_proxy = ConversableAgent(
        name="user_proxy",
        description="Execute the code or bash commands provided by the coder and reports the results back to coder.",
        human_input_mode="ALWAYS",
        max_consecutive_auto_reply=3,
        is_termination_msg=lambda msg: msg.get("content") is not None
        and "TERMINATE" in msg["content"],
        code_execution_config={
            "last_n_messages": "auto",
            "executor": code_cmd_executor,
        },
    )

    task = "What's the date today?"

    # Use Cache.disk to cache LLM responses. Change cache_seed for different responses.
    with Cache.disk(cache_seed=40) as cache:
        chat_results = await user_proxy.a_initiate_chat(
            recipient=coder,
            message=task,
            cache=cache,
            summary_method="last_msg",
        )
        # return the chat summary
        return chat_results.summary


def cleanup_docker_container():
    import docker

    client = docker.from_env()
    try:
        container = client.containers.get("autogen_container2")
        container.stop()
        container.remove()
        print("\n###### Docker container cleaned up successfully.")
    except docker.errors.NotFound:
        print("\n###### No container to clean up.")
    except Exception as e:
        print(f"\n****** Error cleaning up container: {str(e)}")


async def main():
    res = await initialize_agents()
    if res:
        print(f"\n Agents response: {res}")
        print("\n\n####### Agents workflow completed ########\n\n")

    # Add cleanup code
    cleanup_docker_container()


if __name__ == "__main__":
    asyncio.run(main())

Model Used

No response

Expected Behavior

  • In the case of an a_initiate_chat() call, the async flow should behave exactly the same as the sync code.
  • It should execute the generated code after the first (empty) human feedback when a_initiate_chat() is used.

Screenshots and logs

Autogen_Sync_Async_behavior

Additional Information

No response

@b4codify b4codify added the bug label Sep 26, 2024
@rysweet rysweet added 0.2 Issues which were filed before re-arch to 0.4 needs-triage labels Oct 2, 2024
@ekzhu
Collaborator

ekzhu commented Oct 13, 2024

@b4codify Thanks for the issue. This is indeed a bug in the a_generate_reply method, which runs all registered reply functions whether they are async or sync:

for reply_func_tuple in self._reply_func_list:
    reply_func = reply_func_tuple["reply_func"]
    if "exclude" in kwargs and reply_func in kwargs["exclude"]:
        continue
    if self._match_trigger(reply_func_tuple["trigger"], sender):
        if inspect.iscoroutinefunction(reply_func):
            final, reply = await reply_func(
                self, messages=messages, sender=sender, config=reply_func_tuple["config"]
            )
        else:
            final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
        if final:
            return reply

To fix this bug, we need to add a new flag to the register_reply method to allow ignoring sync reply functions in async chat, much like the existing flag ignore_async_in_sync_chat:

ignore_async_in_sync_chat: bool = False,

And then in the constructor, set this flag to true when registering the a_check_termination_and_human_reply function.

self.register_reply(
    [Agent, None], ConversableAgent.a_check_termination_and_human_reply, ignore_async_in_sync_chat=True
)
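
For reference, here is a minimal sketch of one possible shape of the fix. The doubled prompt happens because, in an async chat, the loop above runs both the sync check_termination_and_human_reply and the async a_check_termination_and_human_reply. The SketchAgent class and the ignore_sync_in_async_chat flag name below are hypothetical illustrations, not AutoGen's actual API; trigger matching and other details are omitted:

import inspect


class SketchAgent:
    """Hypothetical stand-in for ConversableAgent, for illustration only."""

    def __init__(self):
        self._reply_func_list = []

    def register_reply(self, trigger, reply_func, *, ignore_sync_in_async_chat=False):
        # Hypothetical flag mirroring the existing ignore_async_in_sync_chat.
        self._reply_func_list.insert(
            0,
            {
                "trigger": trigger,
                "reply_func": reply_func,
                "ignore_sync_in_async_chat": ignore_sync_in_async_chat,
            },
        )

    async def a_generate_reply(self, messages, sender):
        for reply_func_tuple in self._reply_func_list:
            reply_func = reply_func_tuple["reply_func"]
            # Proposed behavior: skip sync reply functions that opted out of
            # async chats, so only the async human-reply function prompts
            # for input.
            if not inspect.iscoroutinefunction(reply_func) and reply_func_tuple["ignore_sync_in_async_chat"]:
                continue
            if inspect.iscoroutinefunction(reply_func):
                final, reply = await reply_func(self, messages=messages, sender=sender)
            else:
                final, reply = reply_func(self, messages=messages, sender=sender)
            if final:
                return reply
        return None

Under this sketch, the constructor would register the sync check_termination_and_human_reply with ignore_sync_in_async_chat=True, mirroring how a_check_termination_and_human_reply is already excluded from sync chats.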

The team is currently focusing on the release of v0.4 preview. Would you like to submit a PR for this fix?

Thank you!

@fniedtner fniedtner removed the bug label Oct 24, 2024