Issues related to planning and decision-making. #245

@AlphaForgeX

Background

System: Windows 11
Application: Microsoft PC Manager
Code: Main branch of UFO. https://github.com/microsoft/UFO/tree/main
LLM: gpt-4o - Azure OpenAI (AOAI)

Task Instructions

1. Start the MSPCManager desktop application.
2. Click the 'Storage' button in UI#1.
3. Click the 'Scan' button in UI#2.
4. Wait for the 'Deep Cleanup Scan' to finish in UI#3.
5. Click the 'Proceed' button in UI#4.
6. Wait for the 'Deep Cleanup' to finish in UI#5.
7. Click the 'Done' button in UI#6 to complete the whole process.

Expected Process

1. Start the MSPCManager desktop application.
Image
2. Click the 'Storage' button in UI#1.
Image
3. Click the 'Scan' button in UI#2.
Image
4. Wait for the 'Deep Cleanup Scan' to finish in UI#3.
Image
5. Click the 'Proceed' button in UI#4.
Image
6. Wait for the 'Deep Cleanup' to finish in UI#5.
Image
7. Click the 'Done' button in UI#6 to complete the whole process.
Image
8. Finally, the application returns to the 'Storage management' page.
Image

Real Process

1. Start the MSPCManager desktop application.
Image
2. Click the 'Storage' button in UI#1.
Image
3. Click the 'Scan' button in UI#2.
Image
4. Wait for the 'Deep Cleanup Scan' to finish in UI#3.
Image
5. Click the 'Proceed' button in UI#4.
Image
6. Wait for the 'Deep Cleanup' to finish in UI#5.
Image
7. Click the 'Done' button in UI#6 to complete the whole process.
Image
8. The application returns to the 'Storage management' page.
Image
9. Instead of stopping, the agent then re-executed Steps 3 through 8 over and over. It was stuck in an infinite loop and stopped only after exceeding the maximum number of execution steps. The final report also stated that Steps 3 through 8 had not been executed successfully, even though they had completed as expected on the first pass.
Image
Image
Image
Image
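
To make the looping behaviour in Step 9 concrete, here is a minimal sketch of how a repeated run of actions could be spotted in an action history. It is not taken from the UFO code base: `find_repeated_cycle` and the step labels are hypothetical names, and the history is just a plain list of strings standing in for whatever the agent actually logs.

```python
from typing import Hashable, Optional, Sequence


def find_repeated_cycle(
    history: Sequence[Hashable], min_len: int = 2, min_repeats: int = 2
) -> Optional[int]:
    """Return the length of the shortest cycle the history ends with, or None.

    A cycle of length k is reported when the last k * min_repeats entries
    are the same k-entry block repeated min_repeats times in a row.
    """
    n = len(history)
    for k in range(min_len, n // min_repeats + 1):
        block = list(history[n - k:])
        # Compare the final block with each of the blocks preceding it.
        if all(
            list(history[n - (r + 1) * k : n - r * k]) == block
            for r in range(1, min_repeats)
        ):
            return k
    return None


# Hypothetical labels for Steps 1-8 of the process described above.
steps = [
    "start_app", "click_storage", "click_scan", "wait_scan",
    "click_proceed", "wait_cleanup", "click_done", "back_to_storage",
]

first_pass = steps                # Steps 1-8, the expected run
looping = steps + steps[2:]       # Steps 1-8, then Steps 3-8 again

print(find_repeated_cycle(first_pass))  # None: no repetition yet
print(find_repeated_cycle(looping))     # 6: Steps 3-8 were repeated
```

A check like this only flags the symptom. Whether the agent should stop, re-plan, or ask for confirmation once a cycle is detected is a separate design decision.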

Report section on the cause of the error

Image Image

Logs

evaluation.log
request.log
response.log
