LLMOCR uses a local LLM to read text from images.
You can also change the instruction so the LLM uses the image in whatever way you prompt.
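Under the hood, the GUI talks to a KoboldCpp instance running on localhost. As a rough sketch of what a custom instruction looks like at the API level, the snippet below builds a request body with a base64-encoded image and a free-form prompt. The `/api/v1/generate` endpoint and the `images` field are assumptions based on KoboldCpp's multimodal generate API; the parameter values are illustrative, not the project's actual settings.

```python
import base64
import json
import urllib.request


def build_ocr_payload(image_path, instruction="Extract all text from this image."):
    """Build a KoboldCpp-style generate request with a base64 image.

    The 'images' field follows KoboldCpp's multimodal generate API
    (assumed here); the instruction string is free-form, which is how
    you change what the LLM does with the image.
    """
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "prompt": instruction,
        "images": [image_b64],
        "max_length": 512,
        "temperature": 0.1,  # low temperature keeps OCR output close to deterministic
    }


def run_ocr(image_path, url="http://localhost:5001/api/v1/generate"):
    """POST the payload to a local KoboldCpp instance and return the generated text."""
    data = json.dumps(build_ocr_payload(image_path)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["results"][0]["text"]
```

For example, `run_ocr("receipt.png")` would return the model's transcription once KoboldCpp is running locally with a vision-capable model loaded.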
- Local Processing: All processing is done locally on your machine.
- User-Friendly GUI: Includes a simple GUI. All AI functionality is provided by KoboldCpp, a single executable.
- GPU Acceleration: Uses Apple Metal, NVIDIA CUDA, or AMD (Vulkan) hardware, if available, to greatly speed up inference.
- Cross-Platform: Supports Windows, macOS ARM, and Linux.
- Python 3.8 or higher
- KoboldCpp
- Clone the repository or download the ZIP file and extract it.
- Install Python for Windows.
- Download `KoboldCPP.exe` and place it in the LLMOCR folder. If the file has a different name, rename it to `KoboldCPP.exe`.
- If you want the script to download a model and run it with KoboldCpp for you, open `llm_ocr.bat`.
- If you want to load your own model with KoboldCpp, open `llm_ocr_no_kobold.bat`.
-
- Clone the repository or download and extract the ZIP file.
- Install Python 3.8 or higher if it is not already installed.
- Create a new Python environment and install the packages from `requirements.txt`.
- Run KoboldCpp with the flag `--config llm-ocr.kcppt`.
- Wait until the model weights finish downloading and the terminal window says `Please connect to custom endpoint at http://localhost:5001`.
- Run `llm-ocr-gui.py` using Python.
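If you script these steps, the wait for the "Please connect to custom endpoint" message can be replaced by polling the endpoint itself. The helper below is a minimal sketch, assuming KoboldCpp answers plain HTTP requests on port 5001 once the model is loaded; the function names and retry parameters are illustrative, not part of the project.

```python
import time
import urllib.error
import urllib.request


def endpoint_ready(url="http://localhost:5001", timeout=2.0):
    """Return True if the KoboldCpp endpoint answers an HTTP request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: the server is not up yet.
        return False


def wait_for_endpoint(url="http://localhost:5001", retries=60, delay=5.0):
    """Poll until KoboldCpp has finished loading the model weights."""
    for _ in range(retries):
        if endpoint_ready(url):
            return True
        time.sleep(delay)
    return False
```

A launcher script could call `wait_for_endpoint()` after starting KoboldCpp and only then start `llm-ocr-gui.py`.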
This project is licensed under the MIT License - see the LICENSE file for details.