This project enables users to interactively chat with invoice documents and extract structured, formatted data from them. Leveraging advanced Natural Language Processing (NLP) and document parsing techniques, it provides an intuitive interface for querying and retrieving invoice details efficiently.
- Chat-Based Interface: Communicate with the system using natural language to ask questions about invoice documents.
- Automatic Invoice Parsing: Upload invoice files (PDF, image, etc.) and automatically extract key data fields such as invoice number, date, total amount, vendor details, line items, and more.
- Structured Data Output: Receive results in a structured and formatted manner (e.g., JSON, tables) suitable for further processing or integration.
- Multi-Format Support: Supports various invoice formats and layouts, including scanned images and digital PDFs.
- Contextual Understanding: Handles follow-up questions and context, enabling conversational extraction (e.g., "What’s the due date on the last invoice?").
- Export Options: Export extracted data for use in spreadsheets, databases, or accounting software.
- Flexible Deployment: Can be integrated as a web application, chatbot, or API service.
- Streamlit Demo App: Try out the functionality in your browser without setup using our hosted Streamlit app: Invoice Extract AI Streamlit Demo
- Python 3.8+
- (List any additional dependencies or tools required)
- Clone the repository:
git clone https://github.com/akshaykumarbedre/Chat-with-invoice-formated-data-extraction.git cd Chat-with-invoice-formated-data-extraction - Install dependencies:
pip install -r requirements.txt
- Start the application:
python app.py
- Open your browser and navigate to the provided local address.
- Upload an invoice document and start chatting to extract information.
If you want to see the app in action without local installation, use the hosted Streamlit version:
👉 Streamlit Invoice Extract AI Demo
No setup required—just upload your invoice and start chatting!
- Python (Flask)
- OCR (Multimodel llm )
- Framework Libraries (langchain)
- Frontend: (Streamlit)
Contributions are welcome! Please open an issue or submit a pull request for new features, bug fixes, or suggestions.
This project is licensed under the MIT License. See LICENSE for details.
- Open-source NLP and OCR libraries
- Inspiration from community-driven document extraction projects
Created by akshaykumarbedre




