 1) Use "whether the user's final goal is achieved" as the ONLY criterion, not whether all steps were executed.
-2) If there is clear evidence that the final deliverable/final outcome has been produced and is usable, set finished=true.
-   - For file deliverables, you MUST verify the file exists by checking that an appropriate OSS link is present in session_files.
+2) Consider the expected deliverable type based on user_request:
+   - If the user asked for a file/output artifact (e.g., PDF/DOCX/ZIP/code project), you MUST verify the file exists by checking
+   that an appropriate OSS link is present in session_files; otherwise finished=false.
+   - If the user asked for "in-chat content" (e.g., search + summarize + tutorial), you should judge completion by whether the final
+   requested content is already present/produced in history_steps outputs (e.g., the assistant/tool produced a complete tutorial/summary).
 3) If any critical step failed, is missing, is still running, or the outputs are insufficient to prove goal completion, set finished=false.
-4) If the information in history_steps and session_files is insufficient to confirm completion (e.g., no final output, only partial logs,
-   or expected output file link is not present in session_files),
+4) If the information in history_steps and session_files is insufficient to confirm completion (e.g., no final summary/tutorial text,
+   only partial logs; or a required output file link is not present in session_files),
    you MUST return finished=false and explain what information is missing in reason.
 5) If there are contradictions in history_steps, prefer the later entries. If you still cannot decide, return finished=false
    and explain the contradiction in reason.
 6) Do NOT assume results that are not explicitly supported by history_steps or session_files. Judge only from verifiable evidence.
+
 # Output Format (very important)
 You must output ONLY ONE JSON object that strictly matches this schema:
 {{
 "finished": true|false,
-"reason": "A brief, specific explanation in English that cites key evidence from history_steps and/or session_files (e.g., a tool_name status/output or the presence/absence of an OSS link). If not finished, state the critical blocking reason(s) or missing info."
+"reason": "A brief, specific explanation in English that cites key evidence from history_steps and/or session_files (e.g., a tool_name status/output; or the presence/absence of an OSS link when a file is required). If not finished, state the critical blocking reason(s) or missing info."
 }}
+
 # Output Constraints
 - Output ONLY valid JSON (no Markdown, no code fences, no extra commentary).
 - reason must be an English string and should reference concrete evidence from history_steps and/or session_files.
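
The output rules in this prompt can be checked mechanically on the consumer side. The following is a minimal sketch, not part of this commit: `parse_judge_output` is a hypothetical helper name, and it assumes the judge's reply arrives as a raw string. It checks only what the prompt states (a single JSON object, a boolean `finished`, a non-empty string `reason`) and does not try to verify that `reason` is written in English.

```python
import json


def parse_judge_output(raw: str) -> dict:
    """Hypothetical helper: validate a reply against the {finished, reason} schema.

    Raises ValueError when the reply deviates from the schema in the prompt.
    """
    try:
        # json.loads rejects Markdown code fences and trailing commentary.
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not a single valid JSON object: {exc}") from exc
    if not isinstance(obj, dict):
        raise ValueError("top-level value must be a JSON object")
    if set(obj) != {"finished", "reason"}:
        raise ValueError(f"unexpected keys: {sorted(obj)}")
    if not isinstance(obj["finished"], bool):
        raise ValueError("finished must be true or false")
    if not isinstance(obj["reason"], str) or not obj["reason"].strip():
        raise ValueError("reason must be a non-empty string")
    return obj


# Example reply (invented for illustration only).
reply = '{"finished": false, "reason": "No OSS link for the requested PDF in session_files."}'
print(parse_judge_output(reply)["finished"])  # False
```

Treating any `ValueError` as "not finished" would be consistent with rule 4's default of returning finished=false when evidence is insufficient, though that policy is an assumption about the caller rather than something this diff specifies.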