[BUG/Enhancement] OCR Workflow Requires Excessive Clicks and Warnings in 2.0.0-rc1 #762

powyncify · 2025-02-20T19:49:04Z

Environment

GitHub v2-dev branch

Description

The workflow for attaching and processing images with OCR in version 2.0.0-rc1 has become significantly more cumbersome, requiring multiple clicks and confirmations. This negatively impacts usability, especially for users who primarily rely on OCR for image attachments.

Steps to Reproduce:

1. Attach an image file to a message.
2. Click on the attached image thumbnail.
3. Select the "Add Text (OCR)" option from the popup menu.
   - Warning 1: A warning appears: "Warning: May not be compatible with the current model. Please try another format." (This warning appears even if the model is compatible with OCR.)
4. Wait for the OCR processing to complete.
5. Click the "Chat" (or send) button.
   - Warning 2: A second warning appears: "Attachment Compatibility Notice. Some attached files may not be fully compatible with the current AI model. This could affect processing. Would you like to review or proceed?"

This results in a total of 4-5 clicks and two warnings per image before the OCR'd text is sent to the model.

Expected Behavior:

In previous versions (or as a desired enhancement), attaching an image should ideally:

- Optionally, automatically perform OCR on the image without requiring manual selection (configurable via a setting).
- Minimize or eliminate unnecessary warnings if the selected model supports OCR and the image format is compatible.
- Streamline the process to require fewer clicks.

Enhancement Request:

Introduce a setting (e.g., "OCR all attached images by default") to allow users to bypass the manual selection of the "Add Text (OCR)" option. When enabled, any attached image would automatically be processed with OCR.
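
To make the request concrete, here is a minimal sketch of how such a setting could drive the initial choice for a newly attached image. Every name in it (`ImageConversion`, `AttachmentSettings`, `pickImageConversion`) is hypothetical and not taken from the Big-AGI codebase; the three modes only loosely mirror the image-only, "Add Text (OCR)" and "Only Text (OCR)" choices discussed in this issue.

```typescript
// Hypothetical sketch: none of these names exist in Big-AGI as far as this issue shows.

/** What to produce from a newly attached image. */
type ImageConversion = 'image' | 'image-and-ocr-text' | 'ocr-text-only';

interface AttachmentSettings {
  /** Assumed user preference; 'image' keeps today's behavior (no automatic OCR). */
  defaultImageConversion: ImageConversion;
}

const DEFAULT_SETTINGS: AttachmentSettings = { defaultImageConversion: 'image' };

/**
 * Returns the conversion to pre-select for an attachment,
 * or null when the attachment is not an image and the setting does not apply.
 */
function pickImageConversion(
  mimeType: string,
  settings: AttachmentSettings = DEFAULT_SETTINGS,
): ImageConversion | null {
  if (!mimeType.startsWith('image/')) return null;
  return settings.defaultImageConversion;
}
```

With the setting left at its default, nothing changes for users who upload photos; a user who only attaches images for OCR would switch it once and skip the popup menu afterwards.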

Additional Notes:

The two warnings ("May not be compatible..." and "Attachment Compatibility Notice") appear to be overly cautious and often unnecessary, especially when the user knows the model and image format are compatible. Consider suppressing these warnings, or making them less intrusive, when OCR is explicitly enabled or when compatibility can be confidently determined.
Our use case exclusively involves attaching images for OCR processing. We never upload images that are not intended for OCR. Therefore, a default OCR option would significantly improve our workflow.
The warnings are shown even if the models are compatible.
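
One possible way to only warn when compatibility cannot be confirmed is sketched below. It assumes the app can describe each outgoing fragment as text or image and knows whether the selected model accepts each kind; `ModelCapabilities`, `fragmentCompatibility` and `shouldWarnBeforeSend` are illustrative names, not the project's actual API.

```typescript
// Hypothetical sketch of the "warn only when unsure or incompatible" idea.

type Compatibility = 'compatible' | 'incompatible' | 'unknown';
type FragmentKind = 'text' | 'image';

interface ModelCapabilities {
  acceptsText: boolean;
  acceptsImages: boolean;
}

/** Classifies one fragment; missing capability data is reported as 'unknown'. */
function fragmentCompatibility(
  kind: FragmentKind,
  caps: ModelCapabilities | undefined,
): Compatibility {
  if (!caps) return 'unknown';
  const accepted = kind === 'text' ? caps.acceptsText : caps.acceptsImages;
  return accepted ? 'compatible' : 'incompatible';
}

/** Warn only if at least one fragment is not confidently compatible. */
function shouldWarnBeforeSend(
  kinds: FragmentKind[],
  caps?: ModelCapabilities,
): boolean {
  return kinds.some((kind) => fragmentCompatibility(kind, caps) !== 'compatible');
}
```

Under these assumptions, an image that has been converted to OCR text would be sent as a `'text'` fragment, so a text-only model would not trigger the notice.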

Device and browser

Big-AGI deployed on Vercel, accessed via Edge on Windows 11

Screenshots and more

Willingness to Contribute

🙋‍♂️ Yes, I would like to contribute a fix.

enricoros · 2025-02-21T20:46:49Z

@powyncify this makes sense. Default behavior for attachments should be a thing.
Do you want the Image at all, or just the OCR of it? I.e. should the option be to pre-check "Add Text" to the images or "Only Text (OCR)"?

powyncify · 2025-02-21T21:54:43Z

That's a great idea, @enricoros! Yes, "Only Text (OCR)" makes sense, but only if it is easy to implement. Otherwise, just pre-checking "Add Text" for the images will do fine.

Keep up the good work!

enricoros · 2025-02-21T23:46:05Z

Thanks @powyncify, if you deploy your own source this can be done easily, probably with just a single-line change.

I can't make it the default for everyone, as even people uploading photos would get some weird OCR symbols out of them.

Seems like having an option would be the way to go (note that I try to minimize options in favor of auto-detection or UX behaviors).

powyncify added the type: bug Something isn't working label Feb 20, 2025
