Commit 320f843

fix chroma embeddings when an already seen file is uploaded
1 parent c5c0bc4 commit 320f843

1 file changed: +7 -2 lines changed

document_qa/document_qa_engine.py

Lines changed: 7 additions & 2 deletions
@@ -205,11 +205,16 @@ def create_memory_embeddings(self, pdf_path, doc_id=None, chunk_size=500, perc_o
         if doc_id:
             hash = doc_id
         else:
-
             hash = metadata[0]['hash']

         if hash not in self.embeddings_dict.keys():
-            self.embeddings_dict[hash] = Chroma.from_texts(texts, embedding=self.embedding_function, metadatas=metadata, collection_name=hash)
+            self.embeddings_dict[hash] = Chroma.from_texts(texts, embedding=self.embedding_function, metadatas=metadata,
+                                                           collection_name=hash)
+        else:
+            self.embeddings_dict[hash].delete(ids=self.embeddings_dict[hash].get()['ids'])
+            self.embeddings_dict[hash] = Chroma.from_texts(texts, embedding=self.embedding_function, metadatas=metadata,
+                                                           collection_name=hash)
+

         self.embeddings_root_path = None

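
The new else branch empties the existing collection before rebuilding it, so a re-uploaded file replaces its earlier chunks instead of being skipped or stacked on top of them. The pattern can be sketched without Chroma installed. Everything below is an assumption made for illustration: `FakeStore`, `_COLLECTIONS`, and `upsert_embeddings` are hypothetical names, and the stand-in only mimics the `from_texts`, `get`, and `delete` calls seen in the diff, on the assumption that stores created with the same collection name share their contents.

```python
import uuid

# Hypothetical stand-in for a named vector-store collection. This is NOT the
# Chroma API; it only mirrors the three calls that appear in the diff.
_COLLECTIONS = {}  # collection name -> {document id: text}


class FakeStore:
    def __init__(self, collection_name):
        # Assumption: stores opened with the same name share one collection.
        self._docs = _COLLECTIONS.setdefault(collection_name, {})

    @classmethod
    def from_texts(cls, texts, collection_name):
        store = cls(collection_name)
        for text in texts:
            store._docs[str(uuid.uuid4())] = text
        return store

    def get(self):
        return {"ids": list(self._docs), "documents": list(self._docs.values())}

    def delete(self, ids):
        for doc_id in ids:
            self._docs.pop(doc_id, None)


def upsert_embeddings(embeddings_dict, key, texts):
    """Rebuild the collection stored under `key`, clearing stale entries first."""
    if key in embeddings_dict:
        existing = embeddings_dict[key]
        stale_ids = existing.get()["ids"]
        if stale_ids:  # skip the call when there is nothing to delete
            existing.delete(ids=stale_ids)
    embeddings_dict[key] = FakeStore.from_texts(texts, collection_name=key)
    return embeddings_dict[key]


if __name__ == "__main__":
    cache = {}
    upsert_embeddings(cache, "abc123", ["chunk one", "chunk two"])
    upsert_embeddings(cache, "abc123", ["chunk three"])  # same file seen again
    print(cache["abc123"].get()["documents"])
```

Folding both branches into one helper is a design choice of this sketch, not of the commit, which keeps the two `Chroma.from_texts` calls separate.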
