The workflow for attaching and processing images with OCR in version 2.0.0-rc1 has become significantly more cumbersome, requiring multiple clicks and confirmations. This negatively impacts usability, especially for users who primarily rely on OCR for image attachments.
Steps to Reproduce:
1. Attach an image file to a message.
2. Click on the attached image thumbnail.
3. Select the "Add Text (OCR)" option from the popup menu.
   - Warning 1: A warning appears: "Warning: May not be compatible with the current model. Please try another format." (This warning appears even if the model is compatible with OCR.)
4. Wait for the OCR processing to complete.
5. Click the "Chat" (or send) button.
   - Warning 2: A second warning appears: "Attachment Compatibility Notice. Some attached files may not be fully compatible with the current AI model. This could affect processing. Would you like to review or proceed?"
This results in a total of 4-5 clicks and two warnings per image before the OCR'd text is sent to the model.
Expected Behavior:
In previous versions (or as a desired enhancement), attaching an image should ideally:
- Optionally, automatically perform OCR on the image without requiring manual selection (configurable via a setting).
- Minimize or eliminate unnecessary warnings if the selected model supports OCR and the image format is compatible.
- Streamline the process to require fewer clicks.
Enhancement Request:
Introduce a setting (e.g., "OCR all attached images by default") to allow users to bypass the manual selection of the "Add Text (OCR)" option. When enabled, any attached image would automatically be processed with OCR.
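To make the request concrete, here is a minimal sketch of how such a setting could drive what happens when a file is attached. All names here (`AttachmentSettings`, `ocrImagesByDefault`, `initialActionFor`) are hypothetical and not taken from the Big-AGI codebase; the real attachment pipeline may be structured quite differently.

```typescript
// Hypothetical sketch: none of these names come from the Big-AGI codebase.
type AttachmentKind = 'image' | 'text' | 'other';

interface Attachment {
  name: string;
  kind: AttachmentKind;
}

interface AttachmentSettings {
  // Assumed setting, mirroring "OCR all attached images by default".
  ocrImagesByDefault: boolean;
}

type InitialAction = 'run-ocr' | 'await-user-choice';

// Decide what to do with an attachment at the moment it is added.
// Only images are affected; every other file keeps the manual flow.
function initialActionFor(
  attachment: Attachment,
  settings: AttachmentSettings,
): InitialAction {
  if (attachment.kind === 'image' && settings.ocrImagesByDefault)
    return 'run-ocr';
  return 'await-user-choice';
}
```

With the setting off, the behavior stays as it is today, so users who do not rely on OCR would see no change.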
Additional Notes:
The two warnings ("May not be compatible..." and "Attachment Compatibility Notice") appear to be overly cautious and often unnecessary, especially when the user knows the model and image format are compatible. Consider suppressing these warnings, or making them less intrusive, when OCR is explicitly enabled or when compatibility can be confidently determined.
Our use case exclusively involves attaching images for OCR processing. We never upload images that are not intended for OCR. Therefore, a default OCR option would significantly improve our workflow.
The warnings are shown even if the models are compatible.
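One possible reading of "when compatibility can be confidently determined" is sketched below. It assumes that OCR output is plain text that any chat model accepts, so a warning is only warranted when a raw image would be sent to a model without image input. The types and the `acceptsImages` flag are hypothetical, not Big-AGI's actual model metadata.

```typescript
// Hypothetical sketch: types and flags are illustrative assumptions.
type OutgoingPart = 'text' | 'image';

interface ModelCapabilities {
  // Assumed flag: whether the model accepts image input.
  acceptsImages: boolean;
}

// Warn only if at least one part that will actually be sent is unsupported.
// Text parts (including OCR results) are assumed to always be supported.
function needsCompatibilityWarning(
  parts: OutgoingPart[],
  model: ModelCapabilities,
): boolean {
  return parts.some(part => part === 'image' && !model.acceptsImages);
}
```

Under this rule, a message carrying only OCR'd text would never trigger either warning, regardless of the model.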
Device and browser
Big-AGI deployed on Vercel, accessed via Edge on Windows 11
Screenshots and more
Willingness to Contribute
🙋‍♂️ Yes, I would like to contribute a fix.
@powyncify this makes sense. Default behavior for attachments should be a thing.
Do you want the Image at all, or just the OCR of it? I.e. should the option be to pre-check "Add Text" to the images or "Only Text (OCR)"?
That's a great idea, @enricoros! Yes, "Only Text (OCR)" makes sense, but only if it is easy to implement. Otherwise, just pre-checking "Add Text" on the images will do fine.
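The two options discussed above, plus the current manual behavior, could be modeled as a single three-way default. This is only a sketch: the mode names and the `partsToSend` helper are hypothetical and chosen for illustration.

```typescript
// Hypothetical sketch of the default-behavior options discussed in this thread.
type ImageDefaultMode =
  | 'manual'     // current behavior: the user picks OCR from the popup menu
  | 'add-text'   // pre-check "Add Text": send the image plus its OCR text
  | 'only-text'; // "Only Text (OCR)": send the OCR text and drop the image

type MessagePart = 'image' | 'ocr-text';

// Which parts of an attached image end up in the message by default.
function partsToSend(mode: ImageDefaultMode): MessagePart[] {
  switch (mode) {
    case 'manual':
      return ['image'];
    case 'add-text':
      return ['image', 'ocr-text'];
    case 'only-text':
      return ['ocr-text'];
  }
}
```

For the OCR-only use case described in the report, `'only-text'` is the closest fit, with `'add-text'` as the simpler fallback.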
Environment
GitHub v2-dev branch