AI-Powered Legal Contract Analysis System
A comprehensive agentic AI system for automated legal contract analysis, risk assessment, and compliance checking, built with LangChain, RAG (Retrieval-Augmented Generation), local Llama models, and LangGraph.
- Contract Type Detection: Automatically identifies contract types (Service Agreement, Employment Contract, NDA, etc.)
- Entity Extraction: Extracts key parties, dates, financial terms, and important clauses
- Risk Assessment: Identifies potential legal, financial, and operational risks
- Compliance Checking: Verifies contracts against legal and regulatory requirements
- Recommendation Engine: Provides actionable suggestions for contract improvement
- Multi-format Support: Processes PDF, TXT, and DOCX files
- Real-time Processing: Live status updates during analysis
- Historical Analysis: Track and compare multiple contract analyses
- Risk Dashboard: Visual analytics for risk trends and patterns
- RAG-Enhanced Analysis: Leverages legal knowledge base for informed decisions
- LangChain: Core AI pipeline and document processing
- LangGraph: Workflow orchestration and state management
- Ollama: Local LLM deployment (Llama 3.1 8B)
- ChromaDB: Vector database for RAG implementation
- FastAPI: Backend API service
- Streamlit: Interactive web interface
- PyPDF: PDF document processing
- Plotly: Data visualization and analytics
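Before any RAG lookup can happen, contract text must be split into overlapping chunks and embedded into ChromaDB. LangChain ships text splitters for this (e.g. `RecursiveCharacterTextSplitter`); the underlying idea, as a minimal sketch (the chunk sizes here are illustrative, not the project's actual settings):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    """Split text into overlapping chunks so clause context survives the cut."""
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance less than a full chunk to overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

The overlap matters for contracts in particular: a clause split cleanly at a chunk boundary would otherwise lose the context needed to retrieve it.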
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Streamlit UI   │───▶│     FastAPI     │───▶│    LangGraph    │
│   (Frontend)    │    │    (Backend)    │    │   (Workflow)    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   File Upload   │    │    Contract     │    │     Entity      │
│  & Processing   │    │  Analysis API   │    │   Extraction    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                                             │
         │                                             ▼
         │                            ┌─────────────────┐
         │                            │      Risk       │
         │                            │   Assessment    │
         │                            └─────────────────┘
         │                                             │
         │                                             ▼
         │                            ┌─────────────────┐
         │                            │   Compliance    │
         │                            │      Check      │
         │                            └─────────────────┘
         │                                             │
         ▼                                             ▼
┌─────────────────┐                   ┌─────────────────┐
│    ChromaDB     │                   │   Ollama LLM    │
│   (Vector DB)   │                   │     (Local)     │
└─────────────────┘                   └─────────────────┘
```
- Python 3.8+
- Ollama installed locally
- Git
- Clone the repository (Conda setup)
git clone https://github.com/payal211/ContractAgent-Pro.git
cd ContractAgent-Pro
- Create conda environment
conda create -n contract-analyzer python=3.9
conda activate contract-analyzer
- Install dependencies
pip install -r requirements.txt
- Clone the repository (virtualenv setup)
git clone https://github.com/payal211/ContractAgent-Pro.git
cd ContractAgent-Pro
- Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Install and setup Ollama
# Install Ollama (visit https://ollama.ai for installation instructions)
# Pull the required model
ollama pull llama3.1:8b
- Initialize the knowledge base
python main.py # This will create the legal knowledge base
- Start the FastAPI backend
uvicorn api:app --reload --host 0.0.0.0 --port 8000
or
python api.py
- Launch the Streamlit UI
streamlit run st_app.py
- Access the application
- Open your browser to http://localhost:8501
- The API docs are available at http://localhost:8000/docs
- Paste contract text directly into the interface
- Supports any plain text contract content
- Ideal for quick analysis and testing
- Upload PDF, TXT, or DOCX files
- Automatic text extraction and processing
- Supports multi-page documents
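Multi-format support comes down to dispatching on the file extension. A hedged sketch of that dispatch (the real extractor uses PyPDF for PDFs and a DOCX library, both reduced to comments here; the function name is illustrative):

```python
from pathlib import Path

def extract_text(path: str) -> str:
    """Route a file to a text extractor based on its extension (sketch)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".txt":
        return Path(path).read_text(encoding="utf-8")
    if suffix == ".pdf":
        # Real system: pypdf's PdfReader(path).pages + page.extract_text()
        raise NotImplementedError("PDF extraction requires pypdf")
    if suffix == ".docx":
        raise NotImplementedError("DOCX extraction requires a DOCX library")
    raise ValueError(f"Unsupported file type: {suffix}")
```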
- Document Processing: Upload or paste contract text
- Entity Extraction: System identifies key parties, terms, and clauses
- Risk Assessment: Analyzes potential legal and business risks
- Compliance Check: Verifies against regulatory requirements
- Recommendations: Generates actionable improvement suggestions
- Risk Dashboard: Visual analytics for risk patterns
- Analysis History: Track and compare multiple analyses
- Compliance Reports: Detailed compliance scoring
- System Metrics: Performance and usage statistics
# .env file
OLLAMA_MODEL=llama3.1:8b
CHROMA_DB_PATH=./chroma_db
KNOWLEDGE_BASE_PATH=./legal_knowledge_base
API_HOST=0.0.0.0
API_PORT=8000
LOG_LEVEL=INFO
# In main.py
OLLAMA_MODEL = "llama3.1:8b" # Change to your preferred model
EMBEDDING_MODEL = "nomic-embed-text"
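The `.env` values above can be picked up with `os.getenv` fallbacks; a sketch (the project may instead load the file via python-dotenv first, and the helper name here is illustrative):

```python
import os

def load_config() -> dict:
    """Collect runtime settings from the environment, using README defaults."""
    return {
        "ollama_model": os.getenv("OLLAMA_MODEL", "llama3.1:8b"),
        "chroma_db_path": os.getenv("CHROMA_DB_PATH", "./chroma_db"),
        "knowledge_base_path": os.getenv("KNOWLEDGE_BASE_PATH", "./legal_knowledge_base"),
        "api_host": os.getenv("API_HOST", "0.0.0.0"),
        "api_port": int(os.getenv("API_PORT", "8000")),
        "log_level": os.getenv("LOG_LEVEL", "INFO"),
    }
```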
```json
{
  "contract_type": "Service Agreement",
  "key_parties": ["ABC Consulting LLC", "XYZ Corporation"],
  "financial_terms": {
    "amounts": ["$10,000 per month"],
    "payment_terms": "Net 30"
  },
  "risks": [
    "Payment terms not clearly specified",
    "Termination clauses may be unclear"
  ],
  "recommendations": [
    "Clearly define payment terms including amounts, due dates, and late payment penalties",
    "Add liability limitation clauses to cap potential damages"
  ],
  "compliance_issues": [],
  "analysis_confidence": 0.85
}
```
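This output maps naturally onto a dataclass like the `ContractAnalysis` result structure mentioned under the core classes. A sketch whose field names follow the sample output above (the actual class definition in `main.py` may differ):

```python
from dataclasses import dataclass, field

@dataclass
class ContractAnalysis:
    """Result container mirroring the sample JSON output (illustrative)."""
    contract_type: str = "Unknown"
    key_parties: list = field(default_factory=list)
    financial_terms: dict = field(default_factory=dict)
    risks: list = field(default_factory=list)
    recommendations: list = field(default_factory=list)
    compliance_issues: list = field(default_factory=list)
    analysis_confidence: float = 0.0
```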
- High Risk: Critical issues requiring immediate attention
- Medium Risk: Important concerns to address
- Low Risk: Minor issues for consideration
- Compliance Issues: Regulatory and legal compliance concerns
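One simple way to bucket a numeric risk score into these levels (the thresholds here are illustrative, not the system's calibrated values):

```python
def risk_level(score: float) -> str:
    """Map a 0-1 risk score to a severity bucket."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score >= 0.7:
        return "High Risk"
    if score >= 0.4:
        return "Medium Risk"
    return "Low Risk"
```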
```
legal-contract-analyzer/
├── main.py                 # Core analyzer logic
├── api.py                  # FastAPI backend
├── st_app.py               # Streamlit frontend
├── requirements.txt        # Python dependencies
├── README.md               # This file
├── legal_knowledge_base/   # Legal documents and rules
├── chroma_db/              # Vector database storage
└── tests/                  # Test files
```
- LegalContractAnalyzer: Main analysis engine
- ContractAnalyzerState: LangGraph state management
- ContractAnalysis: Data structure for results
- ContractAnalyzerUI: Streamlit UI controller
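`ContractAnalyzerState` is the state object threaded through the LangGraph workflow; its shape can be sketched as a `TypedDict` (the field names here are illustrative, not the actual definition in `main.py`):

```python
from typing import TypedDict

class ContractAnalyzerState(TypedDict, total=False):
    """Illustrative shape of the state passed between workflow nodes."""
    contract_text: str
    contract_type: str
    entities: dict
    risks: list
    compliance_issues: list
    recommendations: list
```

`total=False` lets early nodes create a partial state that later nodes (risk assessment, compliance check) progressively fill in.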
- CPU: 8+ cores for optimal performance
- RAM: 16GB+ (8GB minimum)
- Storage: SSD recommended for vector database
- GPU: Optional, but speeds up LLM inference
- Use GPU acceleration for large-scale processing
- Implement caching for repeated analyses
- Consider distributed processing for high-volume scenarios
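Caching repeated analyses can be as simple as keying results by a hash of the contract text. A sketch, where `analyze` stands in for the expensive LLM pipeline (the helper name and cache layout are illustrative):

```python
import hashlib

_cache = {}

def analyze_cached(contract_text: str, analyze) -> dict:
    """Return the stored result when identical contract text was seen before."""
    key = hashlib.sha256(contract_text.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = analyze(contract_text)  # only runs on a cache miss
    return _cache[key]
```

Hashing the full text (rather than a filename) means re-uploading the same contract under a different name still hits the cache.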
- Local Processing: All analysis happens locally
- No External APIs: Contracts never leave your environment
- Secure Storage: Vector database encryption available
- Audit Trail: Complete analysis history tracking
- Regularly update the legal knowledge base
- Monitor system logs for security events
- Implement access controls for sensitive contracts
- Regular backup of analysis results