Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix constant error observations appearing in the logs #5352

Closed
wants to merge 3 commits into from

Conversation

tofarr
Copy link
Collaborator

@tofarr tofarr commented Dec 1, 2024

Now checking that .gitignore exists. before trying to filter based on its contents.

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

I see the following constantly in the logs:

|21:13:47 - openhands:DEBUG: action_execution_server.py:166 - Running action:
    |FileReadAction(path='.gitignore', start=0, end=-1, thought='', action='read', security_risk=None)
    ...
    |21:13:47 - openhands:DEBUG: action_execution_server.py:168 - Action output:
    |**ErrorObservation**
    |File not found: /workspace/.gitignore. Your current working directory is /workspace.
    ...

The reason is that every time we call /list-files we try to filter based on the contents of .gitignore - so we get an error message every time we try to list files if there is no .gitignore.

This change checks that the .gitignore exists before trying to read it.


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:6e252fe-nikolaik   --name openhands-app-6e252fe   docker.all-hands.dev/all-hands-ai/openhands:6e252fe

@tofarr tofarr marked this pull request as ready for review December 1, 2024 21:20
@enyst
Copy link
Collaborator

enyst commented Dec 2, 2024

For some reason, one of the image builds got stuck for over 4h. I restarted them.

async def filter_for_gitignore(file_list, base_path):
gitignore_path = os.path.join(base_path, '.gitignore')
async def has_gitignore() -> bool:
file_list = await call_sync_from_async(runtime.list_files, '')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is a bit heavy, to do twice, for every click. (first at line 68)

Could we avoid it somehow? Or it doesn't matter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree we shouldn't do a list_files first--we should just try to access the file, and use error handling for FileNotFoundError

If that's causing logspam let's remove the logs upstream

try:
read_action = FileReadAction(gitignore_path)
read_action = FileReadAction('.gitignore')
observation = await call_sync_from_async(runtime.run_action, read_action)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought, if there is no addition to the stream on this execution path (and it shouldn't be), could we just get/read the ErrorObs and treat it here directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this approach, the logs would still contain:

|**ErrorObservation**
    |File not found: /workspace/.gitignore. Your current working directory is /workspace

The only ways around this that I can see are:

  1. Live with this oddity.
  2. Do not filter files using .gitignore
  3. Check that the .gitignore exists. (The approach I took)
  4. Update the FileReadAction with an optional flag to permit missing files and not yield an error observation.
  5. Update the runtime to allow access to the FileStore and use this directly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if wrong, but isn't there also
6. Do not log stuff in the FileReadAction, but in its client code?

Or that sounds really bad? I don't think they are many... but I could be wrong.

Copy link
Collaborator

@enyst enyst Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this one isn't doing it, it's the run_action, and that is really useful...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we use runtime.read directly instead of run_action

Copy link
Collaborator

@enyst enyst Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it also does run_action? We could write one that doesn't, I guess, unless I'm missing something.

OK, but it's late and sorry, it seems I no longer understand this problem in the first place: weren't these all in the sandbox now, both the results of list_files and .gitignore, so why do we have runtime.list_files return all files?

Copy link
Collaborator

@enyst enyst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this, the log spam is annoying and everyone will be better off when it's gone.

We have a good discussion in the comments, warts and confusions and all. I think hashing out the potential issues in comments makes a PR better, and I would love to see the final form here.

So I'll put a Request changes just to make sure... that it's not merged too fast. 😅

(I noticed you merge PRs a little too fast quite often, before comments are addressed or even acknowledged. Sorry, but I think we can do this better, because sometimes that can break things and some of us really don't like to break things in main here.)

@tofarr tofarr closed this Dec 5, 2024
@tofarr
Copy link
Collaborator Author

tofarr commented Dec 5, 2024

I'm gonna punt on this one now as there does seem to be concerns about the additional overhead of a file existence check. Perhaps a more thorough refactor will allow a better solution

@tofarr tofarr deleted the fix-constant-error-observations branch December 6, 2024 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants