Description
Do you need to file an issue?
- I have searched the existing issues and this feature is not already filed.
- My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.
Is your feature request related to a problem? Please describe.
Summary
I would like to propose adding ElasticSearch vector store support to GraphRAG to enhance enterprise adoption and infrastructure compatibility. This feature would allow organizations with existing ElasticSearch deployments to leverage GraphRAG without requiring additional vector database infrastructure.
I've opened a PR with the ElasticSearch support I implemented and would greatly appreciate a review. I tested the full indexing workflow using the CLI and confirmed that all four search methods (local, global, basic, and drift) work correctly.
Related PR: #2002
Motivation
ElasticSearch is one of the most widely adopted vector search solutions in the enterprise RAG ecosystem. Many organizations have already invested significantly in ElasticSearch infrastructure for their search and analytics needs. Currently, GraphRAG supports LanceDB and other vector stores, but lacks ElasticSearch integration, which creates a barrier for enterprise adoption.
Key Benefits:
- Infrastructure Reuse: Organizations can leverage existing ElasticSearch clusters
- Enterprise Readiness: ElasticSearch offers enterprise-grade features like security, monitoring, and scaling
- Cost Efficiency: Reduces infrastructure complexity and operational overhead
- Industry Standard: Many RAG implementations in production already use ElasticSearch as their vector backend
Proposed Solution
I propose implementing an ElasticSearchVectorStore
class that follows the existing BaseVectorStore
interface pattern, similar to the current LanceDB implementation. This approach would ensure:
- Full compatibility with existing GraphRAG workflows
- Seamless integration with the current
VectorStoreFactory
pattern - Consistent API surface across all vector store implementations
- No breaking changes to existing functionality
Technical Approach:
- Implement the `BaseVectorStore` interface with an ElasticSearch backend
- Support KNN vector search using ElasticSearch's native capabilities
- Dynamic vector dimension detection (following LanceDB's flexibility)
- Bulk document operations for efficient indexing
- Proper error handling and connection management
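To make the approach concrete, here is a minimal sketch of what such a store could look like, assuming the elasticsearch-py 8.x client. The method names (`connect`, `load_documents`, `similarity_search_by_vector`) are illustrative stand-ins and would need to be aligned with GraphRAG's actual `BaseVectorStore` interface in the PR:

```python
try:
    from elasticsearch import Elasticsearch, helpers
except ImportError:  # elasticsearch-py is an optional dependency in this sketch
    Elasticsearch = helpers = None


def build_index_mapping(dims: int) -> dict:
    """ES 8.x dense_vector mapping; dims are detected dynamically at load time."""
    return {
        "properties": {
            "id": {"type": "keyword"},
            "text": {"type": "text"},
            "vector": {
                "type": "dense_vector",
                "dims": dims,
                "index": True,
                "similarity": "cosine",
            },
        }
    }


def build_bulk_actions(index_name: str, documents: list[dict]):
    """Convert documents into actions consumable by elasticsearch.helpers.bulk."""
    for doc in documents:
        yield {
            "_index": index_name,
            "_id": doc["id"],
            "_source": {"id": doc["id"], "text": doc["text"], "vector": doc["vector"]},
        }


class ElasticSearchVectorStore:
    """Illustrative store; mirrors the LanceDB implementation's shape loosely."""

    def __init__(self, collection_name: str, url: str = "http://localhost:9200"):
        self.collection_name = collection_name
        self.url = url
        self.client = None

    def connect(self, **kwargs) -> None:
        self.client = Elasticsearch(self.url, **kwargs)

    def load_documents(self, documents: list[dict]) -> None:
        # Dynamic dimension detection: infer dims from the first embedding.
        dims = len(documents[0]["vector"])
        if not self.client.indices.exists(index=self.collection_name):
            self.client.indices.create(
                index=self.collection_name, mappings=build_index_mapping(dims)
            )
        helpers.bulk(self.client, build_bulk_actions(self.collection_name, documents))

    def similarity_search_by_vector(self, query_vector: list[float], k: int = 10):
        # Native approximate KNN search via the ES 8.x `knn` search option.
        response = self.client.search(
            index=self.collection_name,
            knn={
                "field": "vector",
                "query_vector": query_vector,
                "k": k,
                "num_candidates": max(50, k * 10),
            },
        )
        return [hit["_source"] for hit in response["hits"]["hits"]]
```

The mapping and bulk-action builders are kept as standalone functions so they can be unit-tested without a running cluster; only `connect`, `load_documents`, and the search call touch the network.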
Configuration Example:
vector_store:
default_vector_store:
type: elasticsearch
url: "http://localhost:9200"
index_name: "graphrag_vectors"
# Additional ElasticSearch-specific configurations
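For illustration, the settings fragment above could be mapped onto the store's constructor like this. The key names (`type`, `url`, `index_name`) follow the example; the target keyword arguments are assumptions about the hypothetical store class, not GraphRAG's actual config loader:

```python
# Assumed shape of the parsed settings fragment shown above.
config = {
    "vector_store": {
        "default_vector_store": {
            "type": "elasticsearch",
            "url": "http://localhost:9200",
            "index_name": "graphrag_vectors",
        }
    }
}


def vector_store_kwargs(config: dict) -> dict:
    """Pick out constructor arguments for a hypothetical ElasticSearch store."""
    store = config["vector_store"]["default_vector_store"]
    if store["type"] != "elasticsearch":
        raise ValueError(f"unsupported vector store type: {store['type']}")
    return {"url": store["url"], "collection_name": store["index_name"]}


print(vector_store_kwargs(config))
# {'url': 'http://localhost:9200', 'collection_name': 'graphrag_vectors'}
```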
Implementation Considerations
- Compatibility: The implementation would follow the exact same patterns as the existing LanceDB vector store
- Testing: Comprehensive integration tests to ensure feature parity
- Documentation: Clear setup and configuration guidelines
- Dependencies: Minimal additional dependencies (elasticsearch-py client)