Skip to content

Conversation

@ulya-tkch
Copy link
Contributor

This PR creates a new way to call pydantic agent with cleanlab through an AgentWrapper. Simply run:

hatch run python src/airline_agent/agent.py --kb-path data/kb.json --vector-db-path data/vector-db --validation-mode agent

The functionality is the same to using --validation-mode cleanlab_log_tools

Key Info

  • Implementation plan: link
  • Priority:

What changed?

What do you want the reviewer(s) to focus on?


Checklist

  • Did you link the GitHub issue?
  • Did you follow deployment steps or bump the version if needed?
  • Did you add/update tests?
  • What QA did you do?
    • Tested...

Copy link
Member

@anishathalye anishathalye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sharing some early feedback. I can do a more thorough review on the next iteration, once the high-level feedback is addressed.

project = get_cleanlab_project()
if validation_mode == "agent":
agent = cast(
Agent,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In mypy, cast is used to assume a typing relationship. It's useful for working around limitations of the type checker. It is up to the programmer to ensure this is correct (e.g., you can cast(str, 3) and this is ok according to mypy).

In this case, I believe this is an incorrect cast: CleanlabAgent is not an Agent. I think we should be using the AbstractAgent type in most places (and CleanlabAgent is indeed an AbstractAgent, by the class hierarchy CleanlabAgent -> Wrapper -> AbstractAgent).

event_stream_handler: Any = None,
) -> AgentRunResult[RunOutputDataT]: ...

async def run(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you want to override async def iter rather than this method. See the implementation of AbstractAgent in Pydantic AI: it uses iter(). Other methods like AbstractAgent.to_ag_ui() also produce something that uses .iter(), so if you don't override that, we'll be missing the Cleanlab integration there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really struggling with how to get the "final output" to stick here when overriding iter(). It seems that a new AgentRunResult is created at the end of the run and no matter what final nodes I return I cannot get the agent to return the cleanlab replacement string.

logger = logging.getLogger(__name__)

# Constants
CONTEXT_RETRIEVAL_TOOLS = ["search", "get_article", "list_directory"] # Default common tool names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than hard-coding things like this as constants, could you make them arguments of the CleanlabAgent constructor, so the CleanlabAgent can be a generic way to add Cleanlab to any Pydantic AI agent? That'll get us close to being able to release this as a generic integration for Pydantic AI, e.g., as a submodule in the cleanlab-codex library.

@anishathalye anishathalye requested a review from axl1313 October 20, 2025 17:13
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new CleanlabAgent wrapper class that integrates Cleanlab validation directly into pydantic-ai agents, providing an alternative to the existing post-hoc validation approach. The wrapper intercepts agent execution at the iteration level to apply validation seamlessly.

Key changes:

  • New CleanlabAgent wrapper class that extends WrapperAgent and overrides the iter() method to inject Cleanlab validation
  • Updated CLI to support --validation-mode agent option
  • Pinned pydantic-ai version to 1.0.17

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/airline_agent/cleanlab_utils/cleanlab_agent.py New file implementing CleanlabAgent wrapper with validation logic
src/airline_agent/cleanlab_utils/conversion_utils.py Added instruction extraction from ModelRequest messages
src/airline_agent/agent.py Integrated CleanlabAgent wrapper for "agent" validation mode
pyproject.toml Pinned pydantic-ai to version 1.0.17
README.md Updated documentation to include new "agent" validation mode

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +442 to +443
if content:
tool_name_for_result = tool_call_to_name[tool_msg["tool_call_id"]]
Copy link

Copilot AI Oct 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable tool_name_for_result is only assigned when content is truthy, but it's used outside that conditional block on line 446. This will cause an UnboundLocalError if content is empty. Move the assignment outside the conditional or restructure the logic to ensure tool_name_for_result is always defined before use.

Suggested change
if content:
tool_name_for_result = tool_call_to_name[tool_msg["tool_call_id"]]
tool_name_for_result = tool_call_to_name[tool_msg["tool_call_id"]]

Copilot uses AI. Check for mistakes.
Comment on lines +684 to +750
handled_final_result = False
original_result = None

class CleanlabAgentRun(AgentRun[AgentDepsT, Any]):
def __init__(self, wrapped_run: AgentRun[Any, Any], cleanlab_agent: CleanlabAgent[Any, Any]) -> None:
super().__init__(wrapped_run._graph_run) # noqa: SLF001
self._wrapped = wrapped_run
self._cleanlab_agent = cleanlab_agent
self._modified_result: AgentRunResult[Any] | None = None

@property
def result(self) -> AgentRunResult[Any] | None:
"""Override result property to return modified result if available."""
if self._modified_result is not None:
return self._modified_result
return self._wrapped.result

async def __anext__(self) -> Any:
"""Override async iteration to intercept End nodes."""
nonlocal handled_final_result, original_result

node = await self._wrapped.__anext__()

# If this is an End node and we haven't handled the final result yet
if isinstance(node, End) and not handled_final_result:
handled_final_result = True
original_result = self._wrapped.result

if original_result:
current_history = list(message_history) if message_history else []

updated_history, final_response_str = (
self._cleanlab_agent._run_cleanlab_validation_logging_tools( # noqa: SLF001
project=self._cleanlab_agent.cleanlab_project,
query=user_query,
result=original_result,
message_history=current_history,
tools=self._cleanlab_agent.openai_tools,
thread_id=self._cleanlab_agent.thread_id,
)
)

graph_run = getattr(self._wrapped, "_graph_run", None)
if graph_run and hasattr(graph_run, "state"):
graph_run.state.message_history = updated_history
logger.info(
"[cleanlab] Updated agent run's internal message history: %d messages",
len(updated_history),
)

if final_response_str != original_result.output:
# Create new result with modified output, preserving all original metadata
self._modified_result = AgentRunResult(
output=final_response_str,
_output_tool_name=original_result._output_tool_name, # noqa: SLF001
_state=original_result._state, # noqa: SLF001
_new_message_index=original_result._new_message_index, # noqa: SLF001
_traceparent_value=original_result._traceparent_value, # noqa: SLF001
)
logger.info("[cleanlab] Updated final response string")

return node

def __aiter__(self) -> CleanlabAgentRun:
return self

yield CleanlabAgentRun(agent_run, self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify this code a bit and get rid of the need for the CleanlabAgentRun wrapper if we iterate through the nodes in the agent graph within this overridden implementation of iter() and modify the output on the node directly (the node.data.output is used as the AgentRunResult output in the base AgentRun implementation) rather than overriding the result property on the AgentRun. Not sure if there's a reason for wanting to preserve the original output and creating a new modified result object though?

I can send my version of the code if helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants