@tillwf tillwf commented Jan 5, 2026

Closes MLOB-5067

@tillwf tillwf requested a review from a team as a code owner January 5, 2026 08:39
@tillwf tillwf force-pushed the till.wohlfarth/MLOB-5067/Simplify_gc_access branch from 44971e0 to aa2a7eb Compare January 5, 2026 08:41
| Evaluated on LLM spans | Evaluated using LLM | Checks whether the agent resolved the user’s intent by analyzing full session spans. Runs only on sessions marked as completed. |

##### How to Use
<div class="alert alert-info">Goal completeness is only available for OpenAI and Azure OpenAI.</div>

Suggested change
<div class="alert alert-info">Goal completeness is only available for OpenAI and Azure OpenAI.</div>
<div class="alert alert-info">Goal Completeness is only available for OpenAI and Azure OpenAI.</div>


The evaluation requires sending a span with a specific tag when the session ends. This signal allows the evaluation to identify session boundaries and trigger the completeness assessment:

For optimal evaluation accuracy and cost control, it is preferable to send a tag when the session is finished and configure the evaluation to run only on session with this tag. The evaluation returns a detailed breakdown including resolved intentions, unresolved intentions, and reasoning for the assessment. A session is considered incomplete if more than 50% of identified intentions remain unresolved.

Suggested change
For optimal evaluation accuracy and cost control, it is preferable to send a tag when the session is finished and configure the evaluation to run only on session with this tag. The evaluation returns a detailed breakdown including resolved intentions, unresolved intentions, and reasoning for the assessment. A session is considered incomplete if more than 50% of identified intentions remain unresolved.
For optimal evaluation accuracy and cost control, it is preferable to send a tag when the session is finished and configure the evaluation to run only on sessions with this tag. The evaluation returns a detailed breakdown including resolved intentions, unresolved intentions, and reasoning for the assessment. A session is considered incomplete if more than 50% of identified intentions remain unresolved.
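
To make the threshold in this paragraph concrete, here is a minimal sketch of the "more than 50% unresolved" rule. The function name and the list-based inputs are illustrative assumptions, not the evaluation's actual implementation or output schema. How a session with no identified intentions is handled is also an assumption.

```python
from typing import Sequence


def is_session_incomplete(resolved: Sequence[str], unresolved: Sequence[str]) -> bool:
    """Return True when more than 50% of identified intentions are unresolved.

    Hypothetical helper: the real evaluation is LLM-based and its internals
    are not described in the docs under review.
    """
    total = len(resolved) + len(unresolved)
    if total == 0:
        # Assumption: with no identified intentions, the session is not flagged.
        return False
    # Strictly greater than 50%, so an even split counts as complete.
    return len(unresolved) / total > 0.5
```

Under this reading, a session with one resolved and one unresolved intention is not marked incomplete, because exactly 50% does not exceed the threshold.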


1. Go to the **Goal Completeness** settings
2. Configure the evaluation data:
- Select **spans** as the data type since Goal Completeness runs on LLM spans which contains the full session history.

Suggested change
- Select **spans** as the data type since Goal Completeness runs on LLM spans which contains the full session history.
- Select **spans** as the data type since Goal Completeness runs on LLM spans which contain the full session history.
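
As an illustration of restricting the evaluation to sessions carrying the end-of-session tag, the sketch below filters span records by a tag. The tag value `session_status:completed` and the dictionary shape of a span are hypothetical, chosen only for this example. In practice the filter is configured in the evaluation settings, not in code.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical tag; the docs do not name the actual tag to send.
SESSION_END_TAG = "session_status:completed"


def spans_to_evaluate(
    spans: Iterable[Dict[str, Any]],
    required_tag: str = SESSION_END_TAG,
) -> List[Dict[str, Any]]:
    """Keep only spans whose tag list contains the end-of-session tag."""
    return [span for span in spans if required_tag in span.get("tags", [])]


example_spans = [
    {"span_id": "a1", "tags": ["env:prod"]},
    {"span_id": "b2", "tags": ["env:prod", SESSION_END_TAG]},
    {"span_id": "c3"},  # no tags at all
]
selected = spans_to_evaluate(example_spans)
```

Only the span tagged at session end is selected here, which mirrors the cost-control argument: untagged, in-progress sessions are never evaluated.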

3 participants