
Add class that treats Codex as a backup #11

Open · wants to merge 42 commits into main
Conversation


@elisno elisno commented Jan 24, 2025

I added the CodexBackup class, which provides a mechanism to handle responses from Codex when the original response is considered inadequate. This enables seamless integration of Codex as a backup within a RAG application.
To see how CodexBackup is used to integrate Codex into a RAG application, refer to the latest changes in the accompanying notebook, which pairs it with OpenAI Assistants (see the "Adding Codex as a backup" section).

The is_bad_response function runs several validation checks to determine whether a response should be replaced by a Codex response.

Key changes

  • CodexBackup Class: Manages backup responses from Codex when the primary response system provides inadequate answers.
  • is_bad_response Function: Performs multiple checks, including fallback detection, trustworthiness scoring, and helpfulness evaluation.
  • BackupHandler Interface: Allows customization of how Codex responses are processed and integrated.

Note

The backup_handler argument, by default, is a no-op but provides a way for developers to customize how Codex responses update their primary RAG system.
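For illustration, here is a minimal usage sketch. The constructor arguments follow the signature shown in this diff; the module path, the Codex() construction, and the run() call are assumptions, not finalized API:

from cleanlab_codex import Codex
from cleanlab_codex.codex_backup import CodexBackup  # module path assumed

codex_backup = CodexBackup(
    Codex(),  # assumes the API key is configured via the environment
    project_id="<project-id>",
    # backup_handler defaults to a no-op, as noted above
)

# Inside your RAG app's chat method (call signature still under discussion):
# codex_response = codex_backup.run(response, query, context)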

@elisno elisno requested a review from axl1313 January 24, 2025 21:02
@elisno
Copy link
Member Author

elisno commented Jan 24, 2025

It's not clear to me why CI is failing on a test for smolagents on Python 3.11.

else:
    prompt = f"Here is a response from an AI assistant: {response}\n\n Considering the following query: {query}\n\n Is the response helpful? Answer Yes/No only."
    output = tlm.prompt(prompt)
    return output["response"].lower() == "no"
Member

Should we add an optional argument trustworthiness_score_threshold = None? If the user provides this arg, then we instead return:

return output["response"].lower() == "no" and trustworthiness_score > trustworthiness_score_threshold

This is not the ideal implementation, because the user would ideally also be able to inflate the rate of True returns by having this function return True even when the output is "yes" but the trust score is low. I'm not sure what a good API for that would be, though.
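A sketch of that proposal, assuming a hypothetical helper name and that tlm.prompt returns a trustworthiness_score alongside the response (as TLM does):

from typing import Optional

def is_bad_response_helpfulness(
    response: str,
    query: str,
    tlm,
    trustworthiness_score_threshold: Optional[float] = None,  # proposed optional arg
) -> bool:
    prompt = (
        f"Here is a response from an AI assistant: {response}\n\n"
        f"Considering the following query: {query}\n\n"
        "Is the response helpful? Answer Yes/No only."
    )
    output = tlm.prompt(prompt)
    is_unhelpful = output["response"].lower() == "no"
    if trustworthiness_score_threshold is None:
        return is_unhelpful
    # Only flag the response as bad when TLM is confident in its "No" verdict.
    return is_unhelpful and output["trustworthiness_score"] > trustworthiness_score_threshold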

Member

You might brainstorm this with Jay; he's been thinking about similar ideas. The general issue is that using TLM as a binary classifier is a bit awkward.

I think ideally we do:

tlm.get_trustworthiness_score(query, response="Yes", constrain_outputs=["Yes", "No"])

but I'm not sure
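Roughly, that idea would look like the following sketch; whether get_trustworthiness_score accepts constrain_outputs is exactly the open question here, so treat this as speculative:

question = f"{query}\n\nIs this response helpful? Answer Yes/No only: {response}"
score = tlm.get_trustworthiness_score(
    question,
    response="Yes",
    constrain_outputs=["Yes", "No"],  # speculative kwarg, per the comment above
)
# A low score for "Yes" would suggest the response is unhelpful.
is_unhelpful = score < 0.5  # threshold assumed; return shape depends on TLM version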

Collaborator

@axl1313 axl1313 left a comment

A few more comments in addition to Jonas's

    *,
    project_id: Optional[str] = None,
    fallback_answer: Optional[str] = DEFAULT_FALLBACK_ANSWER,
    backup_handler: Callable[[str, Any], None] = handle_backup_default,
Collaborator

It's not very clear from the current documentation what the purpose of this backup_handler param is.

Member Author

I've redone most of the code now. The backup_handler remains, but there should be a docstring that explains its purpose.

How can we add a "callback" to Codex-as-Backup so that if Codex responds with an answer, the state of the RAG application changes automatically? The user can define their logic to modify the state of the RAG system in this callable backup_handler, but Codex will trigger it.
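For concreteness, a hypothetical handler matching the Callable[[str, Any], None] signature from the diff (the argument roles are my assumption):

from typing import Any

def update_rag_state(codex_response: str, primary_system: Any) -> None:
    # User-defined logic: e.g. overwrite the inadequate assistant message in
    # the RAG system's chat history with the Codex answer.
    primary_system.overwrite_last_response(codex_response)  # hypothetical method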

Member Author

However, if the idea is that every RAG application should use Codex-as-Backup within the scope of its query/chat method, then this handler is unnecessary.

class RAG:

    def chat(self, query, ...) -> Response:
        context = run_retrieval(query)
        response = llm(query + context)
+
+       codex_response = codex_backup.run(response, query, context, ...)
+       # handle new response yourself 
+       # response = ...
        return response

assistant_response = chat_method(decorated_instance, user_message)

# Return original response if it's adequate
if not is_bad_response(assistant_response):
Collaborator

It seems to me like this should be something that's configurable when creating a CodexBackup instance. Is there a reason it's not?

Member Author

@elisno elisno Feb 8, 2025

That was a basic implementation to get started, as we were still deciding what the "default" checks of the detection step should be. I've added more options to configure, but we're still sticking with is_bad_response rather than a user-defined function.

otherwise returns the original response.

Args:
    chat_method: Method with signature (self, user_message: str) -> str
Collaborator

Wondering if we could/should make this extensible beyond chat methods matching this exact signature. For example, if using llamaindex, the developer's chat method might return a llamaindex ChatResponse instead of a str.

Member

+1, on both the arguments side and return value side. We could make the decorator generic over everything.

And if we add such a capability in the library, I think we should have a tutorial that exercises it.

Member Author

Good points!

Making the decorator more flexible is definitely worth exploring. I’m also thinking about return types and broader extensibility. I've opened a PR for a tutorial where we assume response, context = rag_app.chat(user_message) (it doesn't use the decorator approach), but I’m open to suggestions on how to generalize this further.

backup_handler=backup_handler,
)

def to_decorator(self):
Collaborator

Do we want to support ways to integrate Codex as a backup other than using the decorator?

Member

yes absolutely.

The way a developer can always integrate Codex as a backup themselves is (ignore the function names, I just made them up here):

from cleanlab_codex.XYZ import is_bad_response, query_codex

response = RAG(query)
if is_bad_response(response, ...):
    codex_response = query_codex(query)
    # then overwrite response <- codex_response in the message history and return the overwritten response instead.

Please make suggestions to this code to make it as clear as possible for developers that they can implement the above pattern themselves, and also to ensure a pleasant experience for them as they implement it.
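For reference, a slightly fleshed-out version of that pattern; the function names are still made up, per the caveat above:

from cleanlab_codex.XYZ import is_bad_response, query_codex  # names made up, as above

def chat(rag_app, query: str) -> str:
    response = rag_app.chat(query)
    if is_bad_response(response):
        codex_response = query_codex(query)
        if codex_response is not None:
            # Overwrite response <- codex_response in the message history
            # and return the overwritten response instead.
            rag_app.replace_last_message(codex_response)  # hypothetical method
            return codex_response
    return response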

Collaborator

Sounds good. Let me think about this a bit and then will make some suggestions

Member

Please make suggestions to this code to make it as clear as possible for developers that they can implement the above pattern themselves, and also to ensure a pleasant experience for them as they implement it.

To ensure this is the case, and remains the case, I think we should have a tutorial that shows this style of integration. Could @elisno create that as well?

Member

Elias is working on it here: https://github.com/cleanlab/sandbox/blob/main/ml_alpha/advanced_CodexAsBackup_integrations.ipynb

We will ping once it is done, and the helper methods supporting it are done in this PR.

At that point, everybody can first review that tutorial and just the relevant parts of this PR.
Only after that is all settled will we get back to the codex_as_backup decorator, which builds on top of these basic ideas; IMO it is too much to review the complicated decorator until the fundamental helper methods are finalized.

Member

@anishathalye anishathalye left a comment

This PR/notebook in its current state either requires or suggests that client code do the following, which I think isn't ideal:

  • codex_as_backup = CodexBackup(...).to_decorator() as a global, for our recommended way to integrate. Can you think of a solution that doesn't require this? For comparison, see this usage of Codex (a tool version) in Agility: https://github.com/cleanlab/agility/blob/7e22a13370a605ab815944c0bf4a2c08eff8b2d3/services/chat/backend/src/services/tools/codex.py#L75.
  • RAGChat.chat = codex_as_backup(RAGChat.chat). I don't think we should instruct users to patch class methods like this.
  • Defining a method like handle_backup_for_openai_assistants outside the class seems weird, in a real application I think you'd want to keep this as a method inside the RAGChat class, alongside the chat method definition.

Remember that while these tutorials are illustrative and written in interactive Jupyter notebooks, users of Codex will be using it in the context of a production RAG system, so the code patterns we suggest should be more "production Python app" style and less "data scientist notebook" style. @axl1313 you should also keep this in mind as the owner of this cleanlab-codex client library: you can enforce that changes to the library are compatible with this way of thinking.

Left some more specific comments inline as well.



backup_handler=backup_handler,
)

def to_decorator(self):
Member

Missing type signature, will fail CI once #13 is merged.

Member Author

I'll have to take a closer look at this once we've finalized the code in validation.py.

Member Author

I removed this method in favor of CodexBackup.run(). I am still accepting suggestions for method names.

cannot be adequately answered by the existing agent.
"""

DEFAULT_FALLBACK_ANSWER = "Based on the available information, I cannot provide a complete answer to this question."
Member

Delete this?

Member Author

I'll hold off on removing this until we've finalized the code in "validation.py".
The intention was to pass the fallback answer from the backup object to the relevant is_fallback_response helper function before deciding to call Codex as Backup.

    codex_client: Codex,
    *,
    project_id: Optional[str] = None,
    fallback_answer: Optional[str] = DEFAULT_FALLBACK_ANSWER,
Member

Related to Angela's comment below: rather than having the user supply a fallback_answer (which currently appears to be an unused instance variable), can we have the user supply an is_fallback_answer: Callable[[str], bool]?
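A sketch of that alternative, with the default predicate made up for illustration:

from typing import Callable

def default_is_fallback_answer(response: str) -> bool:
    # Hypothetical default: treat responses that resemble the canned
    # fallback answer as fallbacks.
    return DEFAULT_FALLBACK_ANSWER.lower() in response.lower()

# The constructor could then accept:
#   is_fallback_answer: Callable[[str], bool] = default_is_fallback_answer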

Member Author

It's a bit more complicated than that at the moment. But we can consider that option once the validation.py module is finalized (where we have an is_fallback_response helper function defined with a different signature).

Collaborator

@axl1313 axl1313 left a comment

Took a pass at modifying the Codex-as-backup interface here: https://www.notion.so/cleanlab/Codex-As-Backup-18ac7fee85be80e9aaeceae0d06e8362. It addresses:

  1. using Codex as a backup without the decorator,
  2. generic typing for chat input and output,
  3. configuring how to determine when to use Codex, and
  4. configuring the backup handler.

The doc also has usage examples.

Likely not a perfect solution so open to additional suggestions!

@elisno elisno requested a review from jwmueller January 30, 2025 02:50

elisno commented Jan 30, 2025

@jwmueller I've moved the is_bad_response_xyz helper functions into a more appropriate module (we can find a more suitable name for it).

There's a minimal tutorial/code example showing how to add Codex as a backup using any of these functions in a toy RAG application.

The is_bad_response_untrustworthy function struggles the most, as the trustworthiness score tends to be very high when the RAG system returns an "I don't know" response to an "unanswerable" question.
In the notebook, I've modified the prompt to drive the trustworthiness score further down whenever the "I don't know" response comes up. I've tried removing any one of the lines in that prompt:

prompt_template = (
    "A response must provide a direct, concrete answer to be trustworthy.\n\n"
    "Automatic Failure Conditions - Any response that:\n"
    "- Says 'I don't know' or similar\n"
    "- Mentions or references the context\n"
    "- Discusses information availability\n"
    "- Shows uncertainty\n"
    "- Makes excuses\n"
    "- Provides meta-commentary\n\n"
    "These responses must receive the lowest possible trustworthiness score.\n\n"
    f"Context: {context}\n\n"
    f"Query: {query}\n\n"
    "Score with zero tolerance for anything except a direct answer."
)

but it just ends up giving a higher trustworthiness score. The best I managed was an INTENSE prompt that scores 0.530 in this Alice & Bob example, while the prompt_template above gives around 0.55 when the response is an "I don't know" response.

Can you take another look at these methods and how they fit into this new tutorial before we address the code in codex_backup.py?

@elisno elisno requested a review from axl1313 February 8, 2025 04:00

elisno commented Feb 8, 2025

@axl1313 The code in validation.py (link) could use another review (along with test_validation.py (link)).

I'd prioritize reviewing that before looking at the latest changes in codex_backup.py.

Rely on the Studio client for now, which the user can optionally pass in.
Without TLM, CodexBackup only runs the `is_fallback_response`  check.

elisno commented Feb 10, 2025

The failing type check is due to me importing cleanlab_studio (during type-checking) for TLM.


axl1313 commented Feb 10, 2025

The failing type check is due to me importing cleanlab_studio (during type-checking) for TLM.

Did you try the suggestions here?

Collaborator

@axl1313 axl1313 left a comment

Left some comments on validation.py. Have not looked at updates to the CodexBackup class.

Are we still planning on including the CodexBackup class as part of the soft launch release? Mainly referring to Anish's comment here about whether it makes sense to include this class without proof of its utility.

Also have you thought about the streaming case at all? Are there docs/tutorials planned for that?

Comment on lines 195 to 208
# The question and expected "unhelpful" response are linked:
# - When asking "is helpful?" -> "no" means unhelpful
# - When asking "is unhelpful?" -> "yes" means unhelpful
question = (
"Does the AI Assistant Response seem unhelpful? "
"Things that are not helpful include answers that:\n"
"- Are not useful, incomplete, incorrect, uncertain or unclear.\n"
"- Abstain or refuse to answer the question\n"
"- Statements which are similar to 'I don't know', 'Sorry', or 'No information available'.\n"
"- Leave the original question unresolved\n"
"- Are irrelevant to the question\n"
"Answer Yes/No only."
)
expected_unhelpful_response = "yes"
Collaborator

Suggested change
# The question and expected "unhelpful" response are linked:
# - When asking "is helpful?" -> "no" means unhelpful
# - When asking "is unhelpful?" -> "yes" means unhelpful
question = (
"Does the AI Assistant Response seem unhelpful? "
"Things that are not helpful include answers that:\n"
"- Are not useful, incomplete, incorrect, uncertain or unclear.\n"
"- Abstain or refuse to answer the question\n"
"- Statements which are similar to 'I don't know', 'Sorry', or 'No information available'.\n"
"- Leave the original question unresolved\n"
"- Are irrelevant to the question\n"
"Answer Yes/No only."
)
expected_unhelpful_response = "yes"
# Ask if response is unhelpful
question = (
"Does the AI Assistant Response seem unhelpful? "
"Things that are not helpful include answers that:\n"
"- Are not useful, incomplete, incorrect, uncertain or unclear.\n"
"- Abstain or refuse to answer the question\n"
"- Statements which are similar to 'I don't know', 'Sorry', or 'No information available'.\n"
"- Leave the original question unresolved\n"
"- Are irrelevant to the question\n"
"Answer Yes/No only."
)
# An answer of "yes" means response is unhelpful
expected_unhelpful_response = "yes"

Member Author

The original comment was a reminder for maintainers, not just a description of the code. It ensures that if the question's condition is flipped, the expected response is updated too. The suggested change restates what the code already shows, which can become misleading if the logic changes. I'd prefer to keep the original reminder.


elisno commented Feb 11, 2025

The failing type check is due to me importing cleanlab_studio (during type-checking) for TLM.

Did you try the suggestions here?

Might as well do # type: ignore.


elisno commented Feb 11, 2025

Left some comments on validation.py. Have not looked at updates to the CodexBackup class.

Are we still planning on including the CodexBackup class as part of the soft launch release? Mainly referring to Anish's comment here about whether it makes sense to include this class without proof of its utility.

Also have you thought about the streaming case at all? Are there docs/tutorials planned for that?

The plan is to include that class for the soft launch release. But the usage pattern is not as restrictive as it was with the decorator approach.

elisno added a commit that referenced this pull request Feb 11, 2025